C16orf86

Identifiers
Aliases: C16orf86, chromosome 16 open reading frame 86
External IDs: MGI: 1918296; HomoloGene: 19274; GeneCards: C16orf86

Orthologs
Species: Human; Mouse
RefSeq (mRNA): NM_001012984 (human); NM_027655, NM_001357256 (mouse)
RefSeq (protein): NP_001013002 (human); NP_081931, NP_001344185 (mouse)
Location (UCSC): Chr 16: 67.67 – 67.67 Mb (human); Chr 8: 106.43 – 106.44 Mb (mouse)
PubMed search: [3] (human); [4] (mouse)

Uncharacterized protein C16orf86 is a protein in humans that is encoded by the C16orf86 gene.[5] It is predicted to consist mostly of alpha helices and is expressed most highly in the testes, with expression also found in other tissues such as the kidney, colon, brain, fat, spleen, and liver.[5] The function of C16orf86 is not well understood. Based on the DNA microarray data described below, it could be a nuclear transcription factor that regulates the G0/G1 phase of the cell cycle in tissues such as the kidney, brain, and skeletal muscle.

Function

The function of the C16orf86 protein is still not well understood. However, based on the DNA microarray data and the post-translational modification data below, the protein could be a nuclear transcription factor that regulates the G0/G1 phase of the cell cycle in tissues such as the kidney, brain, and skeletal muscle.

Localization

Tissue

C16orf86 is highly expressed in the testes and is also expressed in the kidney, colon, brain, fat, spleen, and liver.[5]

C16orf86 microarray data were obtained from NCBI UniGene by following the links to GEO Profiles for C16orf86.[6] The data below show C16orf86 expression patterns related to cell cycle regulation in kidney cells, glioblastoma cells, and skeletal myoblasts.

The DNA microarray experiment below compared MIF-deficient cells with control cells using cDNA.[7][8] The results showed that the cytoplasmic protein MIF promotes cell proliferation and cell cycle progression in kidney cells such as HEK293.[7][8][9] When MIF is inhibited, p53 blocks progression through the G1/S transition. Inhibition of E2F and AP1, together with activation of p53, leads to cell cycle arrest at the G0/G1 phase in MIF-deficient cells. These transcription factors have binding sites in the C16orf86 promoter. E2F and AP1 are important for cell cycle progression; when MIF is depleted, both are inhibited and p53 takes over. C16orf86 could therefore be important in cell cycle progression in the kidney, where it is expressed.

This DNA microarray experiment compared MIF-deficient cells with control cells using cDNA.[7][8] C16orf86 expression was higher in MIF-depleted kidney cells than in control kidney cells.[7][8][9]

The DNA microarray experiment below used purified T98G glioblastoma cells that were cycling, arrested in G0, or released into S phase for 10 to 16 hours.[10][11] The researchers examined how pRB, p107, and p130 repress E2F target genes, and how the p130 complex interacts with DP, RB-like, and E2F proteins as part of the DREAM complex during cell cycle arrest.[12] The results showed that E2F4, together with p130 and other transcription factors, mediates repression of the cell cycle during the transition from G1 to G0. When cells are activated and enter S phase, E2F1/2/3 bind together with other transcription factors to activate cell cycle transcription.[12] C16orf86 could be important in cell cycle progression in the brain because binding sites for E2F4 and E2F1/2/3 are located in its promoter sequence.

This DNA microarray experiment used purified T98G glioblastoma cells that were cycling, arrested in G0, or released into S phase for 10 to 16 hours.[10][11] The results showed a change in C16orf86 expression in glioblastoma cells in G0/G1.[10][11][12]

The DNA microarray experiment below used Infinium HumanMethylation450 BeadChip arrays, together with GWAS, to determine the DNA methylation profiles of skeletal myoblasts at day 3, day 8, and day 15.[13][14] These profiles were used to study myogenic cell differentiation.[15] The results showed that methylation patterns do affect myogenic cell differentiation. One transcription factor examined in the experiment, MYF6, has a binding site in the C16orf86 promoter.[15] MYF6 is expected to be down-regulated during muscle cell differentiation.[15] This can be seen in the data: after the stimulus is first introduced, expression never reaches its peak level. C16orf86 could therefore be involved in muscle cell differentiation in skeletal myoblasts.[15]

This DNA microarray experiment used Infinium HumanMethylation450 BeadChip arrays with GWAS to determine the DNA methylation profiles of skeletal myoblasts at day 3, day 8, and day 15.[13][14] The results showed a change in C16orf86 expression in myoblasts at day 3, day 8, and day 15 in response to the stimulus.[13][14]

Subcellular

The C16orf86 protein is predicted to localize mainly to the nucleus, and also to the cytoplasm, mitochondria, and endoplasmic reticulum. These results were obtained with PSORT II, a protein localization tool available through Expasy.[16] The human sequence was submitted to the tool, and the results were compared with those for its distant orthologs in the Weddell seal and red fox.[17][18]

Gene

Location

C16orf86 (Chromosome 16 Open Reading Frame 86) is a gene found on the long arm of chromosome 16 at position q22.11.[5] Its genomic sequence starts at base pair 67,667,030 and ends at base pair 67,668,590.[19] The gene lies on the positive strand and is read in the forward direction.[5]
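As a quick check on the coordinates above, the length of the genomic span follows directly from the start and end positions. The sketch below assumes the coordinates are 1-based and inclusive, which the source does not state; the function and constant names are illustrative only.

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for C16orf86 on chromosome 16.
C16ORF86_START = 67_667_030
C16ORF86_END = 67_668_590

print(span_length(C16ORF86_START, C16ORF86_END))  # 1561
```

Under a 0-based, half-open convention the same coordinates would give 1,560 bp instead, so the convention matters when comparing against other browsers.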

C16orf86 is part of the ENKD1 region.[5] This region contains 3 genes, including the gene for the ENKD1 protein and its isoforms ENKD1 isoform X1 and ENKD1 isoform X2.[5][20] Other genes located near C16orf86 are GFOD2 to the right, and ACD and PARD6A to the left.[5]

Exons and introns

C16orf86 has a total of 4 exons in its coding sequence.[5][19] The first exon boundary falls between amino acids 34 and 35, between the bases G and T. The second exon boundary falls between amino acids 111 and 112, between the bases A and G. The third exon boundary is located between amino acids 184 and 186, between the bases C and G.[19]

C16orf86 has a total of 3 introns separating these exons.[19]

Length of coding gene

The C16orf86 protein is 317 amino acids long. Translation starts at methionine 1 and continues to amino acid 317, which is followed by a stop codon.[19][21]

Isoforms

There are 2 isoforms of C16orf86: uncharacterized protein C16orf86 isoform X1 and uncharacterized protein C16orf86 isoform X2.[5]

Isoform X1 is 332 amino acids long and has 2 exons and 1 intron.[22][23]

Isoform X2 is 326 amino acids long and has 4 exons and 3 introns.[24][25]

Gene regulation

Promoter

There are three different predicted promoter sequences for C16orf86. They were identified using the Genomatix tool Gene2Promoter.[26] Each human promoter sequence was compared with the promoters of distant C16orf86 orthologs in a Clustal Omega multiple sequence alignment.[27] Promoter GXP_107609 matched the ortholog sequences more closely than promoters GXP_7544221 and GXP_6033384.[26]

Transcription factor binding sites

Transcription factor binding sites in the C16orf86 promoter (GXP_107609) were found using the Genomatix tool Gene2Promoter and its binding site analysis option.[26] Binding sites were chosen based on a high matrix score and a high number of occurrences within the promoter.[26] The transcription factors with sites in the conserved regions of the promoter sequence (GXP_107609) were MYF3, MYF4, E2F, and CCCTC-binding factor.[26] These transcription factors are all involved in cell cycle regulation.

Transcript level regulation

5'UTR region

A multiple sequence alignment (MSA) of the C16orf86 5'UTRs of orangutans, gorillas, chimpanzees, macaques, and humans was made with Clustal Omega.[27] The results of the MSA were compared with predicted structures of the 5'UTR, which were created using the bioinformatics tool mfold.[28] The region that stood out in the 5'UTR when compared with the MSA was base pairs 105 to 113. This region could form a stem-loop with a particular function or a role in protein interactions.

3'UTR region

A multiple sequence alignment (MSA) of the C16orf86 3'UTRs of orangutans, gorillas, chimpanzees, macaques, and humans was made with Clustal Omega.[27] The results of the MSA were compared with predicted structures of the 3'UTR, which were created using the bioinformatics tool mfold.[28] The region that stood out in the 3'UTR when compared with the MSA was base pairs 1294 to 1300. This region could form a stem-loop with a particular function or a role in protein interactions.

Structure

C16orf86 has a predicted molecular weight of 33.5 kilodaltons and an isoelectric point (pI) of 5.30.[29]

The C16orf86 protein sequence is rich in proline and glutamate, with a total of 39 prolines (P) and 39 glutamates (E).[30] In addition, C16orf86 is low in asparagine (N), threonine (T), isoleucine (I), and phenylalanine (F), with 3 asparagines, 9 threonines, 2 isoleucines, and 1 phenylalanine.[30] The high glutamate content makes the protein acidic, consistent with its low pI.
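To put the residue counts above in proportion, they can be expressed as percentages of the protein length. The following is a minimal sketch that assumes the counts refer to the 317-residue human protein described elsewhere in this article; the function name is illustrative.

```python
PROTEIN_LENGTH = 317  # residues; assumed to be the sequence the counts refer to

# Residue counts quoted in the text.
RESIDUE_COUNTS = {"P": 39, "E": 39, "N": 3, "T": 9, "I": 2, "F": 1}


def percent_composition(counts: dict, length: int) -> dict:
    """Convert raw residue counts to percentages of the full sequence length."""
    return {aa: round(100 * n / length, 1) for aa, n in counts.items()}


for aa, pct in percent_composition(RESIDUE_COUNTS, PROTEIN_LENGTH).items():
    print(f"{aa}: {pct}%")
```

With these numbers, proline and glutamate each make up roughly 12% of the sequence, while phenylalanine is well under 1%.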

C16orf86 contains a Domain of Unknown Function (DUF4691) from amino acid 1 to 184 and a nuclear localization signal at amino acids 105–109.[31][32] The accompanying figure was created using the Expasy PROSITE tool.[33]

In the figure, the domain of unknown function spans amino acids 1 to 184 and contains the nuclear localization signal at amino acids 105 to 109.[31][32] The red circles represent post-translational modifications, and the grey areas mark the positions of the start codon and stop codon. Uncharacterized protein C16orf86 is 317 amino acids long, and the figure is drawn at a scale of 2.[33]

The C16orf86 protein has a nuclear localization signal from amino acid 105 to 109 with the sequence PKRKP, read in the forward direction.[16] This pattern is conserved between humans and distant orthologs such as the red fox and Weddell seal.[16]
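Locating a short motif such as PKRKP in a protein sequence and reporting 1-based residue positions can be sketched as follows. The sequence used here is a toy placeholder built so that the motif falls at positions 105 to 109; it is not the real C16orf86 sequence, and the function name is illustrative.

```python
def find_motif(sequence: str, motif: str) -> list:
    """Return 1-based inclusive (start, end) positions of every occurrence of motif."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        # Continue one residue later so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return hits


# Toy sequence (NOT the real protein): 104 placeholder alanines, the motif, then a tail.
toy_sequence = "A" * 104 + "PKRKP" + "A" * 20

print(find_motif(toy_sequence, "PKRKP"))  # [(105, 109)]
```

Running the same function on ortholog sequences would be one simple way to check whether the motif is present in each species, although it says nothing about whether the signal is functional.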

Secondary

Tertiary structure of C16orf86. The labeled amino acids are proline 105, serine 142, leucine 183, leucine 244, glycine 242, and threonine 258. Proline 105 (green region) lies in the nuclear localization signal. Serine 142 (yellow region) is a phosphorylation site for CDC2, ATM, CKII, and DNAPK. Threonine 258 (yellow region) is a phosphorylation site for PKA and PKC. Glycine 242, leucine 183, and leucine 244 lie in nuclear export signals, which are labeled orange.

Overall, C16orf86 is predicted to contain more alpha helices than beta sheets. Phyre2 was used to predict the locations of alpha helices and beta strands. Alpha helices are predicted with high confidence at amino acids 187–199, 231–244, 265–270, and 294–307.[34] A beta strand is predicted with high confidence at amino acids 96–97.[34]
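The residue ranges above can be totaled to compare helix and strand coverage. This is a minimal sketch that counts only the high-confidence segments listed in the text and assumes a protein length of 317 residues; lower-confidence predictions are not included, so the totals are lower bounds.

```python
PROTEIN_LENGTH = 317  # residues, as stated for the human protein

# High-confidence segments quoted in the text (1-based, inclusive).
HELICES = [(187, 199), (231, 244), (265, 270), (294, 307)]
STRANDS = [(96, 97)]


def residues_covered(segments: list) -> int:
    """Total number of residues in a list of non-overlapping inclusive ranges."""
    return sum(end - start + 1 for start, end in segments)


helix_residues = residues_covered(HELICES)
strand_residues = residues_covered(STRANDS)

print(helix_residues, strand_residues)                    # 47 2
print(round(100 * helix_residues / PROTEIN_LENGTH, 1))    # 14.8
```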

Tertiary

The PDB files for the tertiary structure of C16orf86 were taken from Phyre2 and I-TASSER.[34][35] The PDB files were loaded into the EzMol bioinformatics tool to render the tertiary structure.[36] In the resulting figure, amino acids are labeled at sites involved in phosphorylation, nuclear localization signaling, and nuclear export signaling.

Post-translational modifications

C16orf86 post-translational modifications were predicted using protein modification tools from Expasy.[32] The most notable sites in this protein are its nuclear export signals (leucine-rich regions), nuclear localization signals, and phosphorylation sites. The nuclear localization and export signals allow the protein to be localized to the cell's nucleus. In addition, the protein sequence has phosphorylation sites for CDK5, GSK3, P38MAPK, PKA, PKC, CDC2, ATM, CKII, and DNAPK, kinases that each play a role in cell cycle regulation. A conceptual translation of C16orf86 showing the rest of the post-translational modifications is given below.

This shows a conceptual translation of C16orf86 mRNA labeled with post-translational modifications.

Evolution

The orthologs are sorted by increasing date of divergence and then by sequence similarity.

Genus and species | Common name | Taxonomic group | Date of divergence (MYA) | Accession number | Sequence length (aa) | Sequence identity to human | Sequence similarity to human
Homo sapiens | Human | Primates | 0.00 | NP_001013002.2 | 317 | 100.00% | 100.00%
Pongo abelii | Sumatran orangutan | Primates | 15.20 | XP_002826596.1 | 318 | 95.00% | 96.00%
Rhinopithecus bieti | Black snub-nosed monkey | Primates | 28.10 | XP_017707751.1 | 314 | 92.00% | 94.00%
Otolemur garnettii | Northern greater galago | Primates | 73.00 | XP_003799435.1 | 319 | 74.00% | 79.00%
Ochotona princeps | American pika | Lagomorpha | 88.00 | XP_004584223. | 417 | 60.00% | 65.00%
Cricetulus griseus | Chinese hamster | Rodentia | 88.00 | XP_007647376.1 | 324 | 64.35% | 71.00%
Castor canadensis | American beaver | Rodentia | 88.00 | XP_020026748.1 | 328 | 67.00% | 73.00%
Sorex araneus | Common shrew | Soricomorpha | 94.00 | XP_004600963.1 | 320 | 63.87% | 70.00%
Rousettus aegyptiacus | Egyptian fruit bat | Chiroptera | 94.00 | XP_016019485.1 | 339 | 64.81% | 71.00%
Leptonychotes weddellii | Weddell seal | Carnivora | 94.00 | XP_006749032.1 | 324 | 67.68% | 72.00%
Vulpes vulpes | Red fox | Carnivora | 94.00 | XP_025867300.1 | 325 | 70.46% | 70.00%
Ovis aries | Sheep | Artiodactyla | 94.00 | XP_027833899.1 | 329 | 70.61% | 76.00%
Elephantulus edwardii | Cape elephant shrew | Macroscelidea | 102.00 | XP_006878955.1 | 298 | 58.12% | 61.00%
Vombatus ursinus | Common wombat | Marsupials | 160.00 | XP_027703451.1 | 281 | 52.00% | 61.00%
Aptenodytes forsteri | Emperor penguin | Birds | 320.00 | XP_009289088.1 | 262 | 37.00% | 42.00%
Pogona vitticeps | Central bearded dragon | Reptiles | 320.00 | XP_020667121.1 | 266 | 40.00% | 52.00%
Notechis scutatus | Tiger snake | Reptiles | 320.00 | XP_026531742.1 | 266 | 42.00% | 50.00%
Python bivittatus | Burmese python | Reptiles | 320.00 | XP_025026382.1 | 267 | 44.00% | 54.00%
Latimeria chalumnae | West Indian Ocean coelacanth | Fish | 414.00 | XP_014342026.1 | 275 | 40.00% | 48.00%
Rhincodon typus | Whale shark | Fish | 465.00 | XP_020387814.1 | 242 | 29.00% | 44.00%
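The relationship between divergence date and sequence identity in the table can be explored with a few lines of code. The sketch below uses a subset of rows copied from the table and computes a crude average loss of identity per 100 million years between the first and last rows. This simple end-to-end measure is an illustrative assumption, not the method used to build the table, and the function names are hypothetical.

```python
# (species, divergence in MYA, percent identity to human); subset of the table rows.
ORTHOLOGS = [
    ("Homo sapiens", 0.0, 100.0),
    ("Pongo abelii", 15.2, 95.0),
    ("Otolemur garnettii", 73.0, 74.0),
    ("Ovis aries", 94.0, 70.61),
    ("Vombatus ursinus", 160.0, 52.0),
    ("Python bivittatus", 320.0, 44.0),
    ("Rhincodon typus", 465.0, 29.0),
]


def is_sorted_by_divergence(rows: list) -> bool:
    """Check that rows are ordered by non-decreasing divergence date."""
    dates = [mya for _, mya, _ in rows]
    return dates == sorted(dates)


def identity_drop_per_100_my(rows: list) -> float:
    """Average loss in percent identity per 100 million years, first row to last row."""
    (_, t0, id0), (_, t1, id1) = rows[0], rows[-1]
    return round((id0 - id1) / (t1 - t0) * 100, 2)


print(is_sorted_by_divergence(ORTHOLOGS))   # True
print(identity_drop_per_100_my(ORTHOLOGS))  # 15.27
```

The drop in identity is not uniform over time (for example, the reptile rows are more similar to human than the bird row at the same divergence date), so a single average rate is only a rough summary.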

Paralogs

Searches with NCBI BLAST and BLAT found no paralogous sequences similar to C16orf86, indicating that C16orf86 does not have any paralogs. The searches returned only isoforms of C16orf86 and no full-length paralog sequences.

Orthologs

Multiple sequence alignment of strict orthologs for C16orf86 in the chart above.

C16orf86 orthologs include those of dogs, cows, rats, mice, and chimpanzees.[37][38]

Ortholog space: C16orf86 orthologs are found only in placental mammals. No orthologs are found in other mammal groups, birds, reptiles, invertebrates, fungi, plants, protists, or archaea. The most distant orthologs within the placental mammals are in Macroscelidea, which diverged from humans about 102 million years ago.[39]

Homologs

The most distant homologs, which share only partial sequence similarity with C16orf86, are found in marsupials, reptiles, and fish. The most distant homolog of C16orf86 was found in the whale shark, which diverged from humans 465 million years ago.[39]

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000159761 - Ensembl, May 2017
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000013158 - Ensembl, May 2017
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ a b c d e f g h i j "C16orf86 chromosome 16 open reading frame 86 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-02-10.
  6. ^ "GEO Profile Links for UniGene (Select 2139102) - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-02.
  7. ^ a b c d "GDS3626 / ILMN_1697800". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  8. ^ a b c d "62756576 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  9. ^ a b Liu L, Ji C, Chen J, Li Y, Fu X, Xie Y, Gu S, Mao Y (June 2008). "A global genomic view of MIF knockdown-mediated cell cycle arrest". Cell Cycle. 7 (11): 1678–92. doi:10.4161/cc.7.11.6011. PMID 18469521.
  10. ^ a b c "81993008 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  11. ^ a b c "GDS3364 / 231153_at". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  12. ^ a b c Smith MJ, Simco BA, Warren CO (December 1975). "Comparative effects of antimycin A on isolated mitochondria of channel catfish (Ictalurus punctatus) and rainbow trout (Salmo gairdneri)". Comparative Biochemistry and Physiology C. 52 (2): 113–7. doi:10.1016/0306-4492(75)90024-6. PMID 3364.
  13. ^ a b c "129260808 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  14. ^ a b c "GDS5632 / 231153_at". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  15. ^ a b c d Pollow K, Lübbert H, Pollow B (January 1976). "On the mitochondrial 17beta-hydroxysteroid dehydrogenase from human endometrium and endometrial carcinoma: characterization and intramitochondrial distribution". Journal of Steroid Biochemistry. 7 (1): 45–50. doi:10.1016/0022-4731(76)90163-1. PMID 5632.
  16. ^ a b c "Welcome to psort.org!!". www.psort.org. Retrieved 2019-05-05.
  17. ^ "PREDICTED: uncharacterized protein C16orf86 homolog [Leptonychotes wed - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  18. ^ "uncharacterized protein C16orf86 homolog [Vulpes vulpes] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  19. ^ a b c d e "User Sequence vs Genomic". genome.ucsc.edu. Retrieved 2019-04-30.
  20. ^ "ENKD1 enkurin domain containing 1 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-04-22.
  21. ^ "uncharacterized protein C16orf86 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-05-05.
  22. ^ "uncharacterized protein C16orf86 isoform X1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-04-30.
  23. ^ "User Sequence vs Genomic". genome.ucsc.edu. Retrieved 2019-04-30.
  24. ^ "uncharacterized protein C16orf86 isoform X2 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-04-30.
  25. ^ "User Sequence vs Genomic". genome.ucsc.edu. Retrieved 2019-04-30.
  26. ^ a b c d e "Genomatix: Login Page". www.genomatix.de. Retrieved 2019-05-02.
  27. ^ a b c "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-02.
  28. ^ a b "RNA Folding Form | mfold.rit.albany.edu". unafold.rna.albany.edu. Retrieved 2019-05-03.
  29. ^ "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2019-04-30.[permanent dead link]
  30. ^ a b c "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2019-05-05.
  31. ^ a b "uncharacterized protein C16orf86 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-04-22.
  32. ^ a b c "ExPASy: SIB Bioinformatics Resource Portal - Categories". www.expasy.org. Retrieved 2019-05-02.
  33. ^ a b "ExPASy - PROSITE". prosite.expasy.org. Retrieved 2019-05-03.
  34. ^ a b c "Phyre 2 Results for Undefined". www.sbg.bio.ic.ac.uk. Archived from the original on 2019-05-02. Retrieved 2019-05-02.
  35. ^ "I-TASSER results". zhanglab.ccmb.med.umich.edu. Archived from the original on 2019-05-02. Retrieved 2019-05-02.
  36. ^ "EzMol - Molecular display wizard". www.sbg.bio.ic.ac.uk. Retrieved 2019-05-05.
  37. ^ "C16orf86 Gene - GeneCards | CP086 Protein | CP086 Antibody". www.genecards.org. Retrieved 2019-02-10.
  38. ^ "HomoloGene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-02-10.
  39. ^ a b "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2019-04-22.