C16orf90

C16orf90, or chromosome 16 open reading frame 90, is a gene that produces the uncharacterized protein C16orf90 in Homo sapiens.^[1] The protein has four predicted alpha-helix domains^[2]^[3]^[4]^[5]; it is expressed most strongly in the testes^[6]^[7] and at low levels throughout the rest of the body.^[8] Although the function of C16orf90 is not yet well understood, expression data from microarrays^[9] and predicted post-translational modifications^[10]^[11] suggest involvement in the biological stress response and apoptosis.

Gene

C16orf90, or chromosome 16 open reading frame 90, has no aliases^[12] and spans 3,169 nucleotides (positions 3,493,484 to 3,496,652) on the short arm of chromosome 16.^[7] It is located at 16p13.3 on the reverse strand.^[1] The gene has 3 exons, and the mRNA is 972 nucleotides long. The C16orf90 protein is 182 amino acids in length.^[13]
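The stated gene length follows from the coordinates if both end positions are counted. A minimal sketch of that arithmetic, assuming the coordinates are 1-based and inclusive (the function name is illustrative, not taken from any genome browser API):

```python
def inclusive_span(start: int, end: int) -> int:
    """Length of a genomic interval whose start and end are both included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for C16orf90 on chromosome 16.
GENE_START = 3_493_484
GENE_END = 3_496_652

gene_length = inclusive_span(GENE_START, GENE_END)
print(gene_length)  # 3169
```

Subtracting the coordinates without adding 1 would give 3,168, which is why the inclusive convention matters when checking reported spans.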

Exons

C16orf90 contains 3 exons and 2 introns. The exon boundaries fall between amino acids 30 and 31 and between amino acids 147 and 148.^[14] The first exon is poorly conserved, but exons 2 and 3 are highly conserved.^[15]
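The exon boundaries above can be used to look up which exon encodes a given residue. The following is a small sketch under the assumption that residues 1-30 belong to exon 1, 31-147 to exon 2, and 148 onward to exon 3; the helper name is hypothetical.

```python
from bisect import bisect_left

# Last residue encoded by exon 1 and by exon 2, taken from the boundaries in the text.
EXON_LAST_RESIDUES = [30, 147]


def exon_for_residue(position: int) -> int:
    """Return the exon number (1-3) that encodes the given 1-based residue."""
    if position < 1:
        raise ValueError("residue positions are 1-based")
    # bisect_left counts how many exon ends lie strictly before this position.
    return bisect_left(EXON_LAST_RESIDUES, position) + 1


for residue in (30, 31, 147, 148):
    print(residue, "-> exon", exon_for_residue(residue))
```

A lookup like this makes it easy to ask, for instance, whether a predicted modification site lies in the poorly conserved first exon or in the conserved exons 2 and 3.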

C16orf90 marked on chromosome 16 at 16p13.3

Protein

C16orf90 has a predicted molecular weight of 21 kDa and an alkaline isoelectric point of 9.2.^[16] It is predicted to be a soluble protein.^[17]
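A molecular weight of this kind is typically estimated by summing average residue masses and adding one water molecule for the free termini. The sketch below illustrates that calculation; it is not the ExPASy implementation, and the example peptide is a placeholder rather than the C16orf90 sequence, which is not reproduced here.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight(sequence: str) -> float:
    """Average molecular weight (Da) of an unmodified peptide."""
    seq = sequence.upper()
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not seq:
        return 0.0
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in seq) + WATER_MASS


# Placeholder peptide for illustration only (not C16orf90).
example_peptide = "MKRSPLK"
print(f"{molecular_weight(example_peptide):.2f} Da")
```

As a rough plausibility check, an average residue mass of about 110 Da puts a protein of roughly 180 to 200 residues near 20 to 22 kDa, in line with the 21 kDa reported above.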

Transcripts

There are 3 isoforms of C16orf90.^[1] They are uncharacterized protein C16orf90 isoform a (197 aa), encoded by all 3 exons; uncharacterized protein C16orf90 isoform b (175 aa), encoded by the 2nd and 3rd exons; and uncharacterized protein C16orf90 isoform c (95 aa), which corresponds to the last 95 amino acids of C16orf90.^[1]

Expression

C16orf90 has relatively high expression in the testes^[6]^[7] and very low expression (0.213) in all other tissues^[8] of healthy humans.^[9] Under stress conditions, C16orf90 appears to be upregulated in expression profiles from NCBI GEO.^[9]

Subcellular location

The C16orf90 protein is most likely localized to the nucleus^[18] and is not predicted to be a transmembrane protein.^[19] These predictions are supported by the results for the homologous mouse and dolphin C16orf90 proteins.

Structure

Secondary structure

mRNA

The mRNA secondary structure predicted by RNAfold consists of stem-loops and hairpin turns, most of which were predicted with medium to high probability. Only two regions had a low probability for the predicted structure.^[20]

Protein

The C16orf90 protein contains 4 predicted alpha helices^[4] and no beta sheets, with coiled-coils likely connecting the helices.^[2]^[3] These helices are approximately equally spaced across the protein.^[5] A nuclear localization signal^[21] and four alpha-helix domains^[22] were also identified; the latter help determine the secondary structure of C16orf90.

C16orf90 gene schematic^[23]

Tertiary structure

C16orf90's tertiary structure includes linear^[5] alpha-helices separated by a disordered or coiled-coil region.^[24]

Regulation

Promoter

Using the Genomatix^[25] tool Gene2Promoter, 4 possible promoter sequences were found for C16orf90. Promoter set 3, GXP_644807, was selected as the promoter on the reverse strand because it contained the most CAGE tags, aligned with the 5' end of the gene, and carried the correct GeneID.

Protein level regulation

A nuclear localization signal (NLS) at the C-terminus of the protein, spanning residues 173-197, supports the predicted nuclear localization.^[26]^[18]

Post-translational modifications

Phosphorylation is predicted at many amino acids in C16orf90.^[10] The red markers on the protein schematic indicate likely phosphorylation sites. NetPhos, a phosphorylation site predictor, returned many sites, including amino acids 16, 34, 56, 63, 67, 86, 130, 144, 147, 148, 150, 151, 152, 153, 165, 167, 174, 177, 189, and 191.^[10]

CTCF (CCCTC-binding factor) is an 11-zinc finger transcription factor that generally represses transcription.^[12] One CTCF binding site is indicated for the C16orf90 gene,^[27] and its effects could contribute to C16orf90's low expression levels.

O-GlcNAc modification can block phosphorylation of the same residue. C16orf90 has two serine residues, at positions 34 and 144, that are potential O-GlcNAc sites.^[11] Because O-GlcNAc competes with phosphorylation at such sites, it might keep C16orf90 inactive until the protein is needed under severe conditions, at which point the protein could be activated.
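The competition described here only applies to residues that are predicted targets of both modifications. A short sketch, using only the positions listed in the text, finds that overlap as a set intersection (the variable names are illustrative):

```python
# Predicted phosphorylation sites listed in the text (NetPhos).
PREDICTED_PHOSPHO_SITES = {
    16, 34, 56, 63, 67, 86, 130, 144, 147, 148,
    150, 151, 152, 153, 165, 167, 174, 177, 189, 191,
}

# Predicted O-GlcNAc sites listed in the text (YinOYang).
PREDICTED_OGLCNAC_SITES = {34, 144}

# Residues where the two modifications could compete.
shared_sites = sorted(PREDICTED_PHOSPHO_SITES & PREDICTED_OGLCNAC_SITES)
print(shared_sites)  # [34, 144]
```

Both predicted O-GlcNAc positions also appear in the phosphorylation list, which is consistent with the proposed competition between the two modifications.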

NetGlycate^[28] (a glycation prediction tool) predicted glycation at 2 lysine residues, amino acids 70 (score 0.709) and 158 (score 0.595). Glycation adds sugars to lysine residues post-translationally and can affect protein folding or stability.^[29]

There is a predicted cleavage site between R172 and K173 of the C16orf90 protein.^[21]^[30] This is also where the nuclear localization signal begins, suggesting that the NLS may be cleaved off, possibly to remove the protein from the nucleus or when the protein requires degradation.

Homology and evolution

C16orf90 orthologs have a relatively high mutation rate when compared with fibrinopeptides, hemoglobin, and cytochrome c.^[31]

The orthologs in the table below are sorted by increasing date of divergence and then by sequence similarity. C16orf90 is limited to mammals but is found in monotremes and marsupials, indicating that the gene arose around 180 million years ago.^[32]

Genus	Species	Common name	Taxonomic group	Date of divergence (MYA)	Accession number	Sequence length (aa)	Sequence identity to human	Sequence similarity to human
Homo	sapiens	Human	Primates	0	XP_024306160.1	197	100.00%	100.00%
Gorilla	gorilla	Gorilla	Primates	8.6	XP_004057139	185	82.00%	82.50%
Mus	musculus	Mouse	Rodentia	89	NP_082760.2	171	63.50%	66.50%
Bison	bison	Bison	Even-toed ungulate	94	XP_010838682	186	65.50%	69.00%
Zalophus	californianus	Sea lion	Carnivora	94	XP_027973424.1	185	65.20%	69.10%
Canis	lupus familiaris	Dog	Carnivora	94	XP_003434913.2	214	64.50%	69.60%
Equus	caballus	Horse	Odd-toed ungulate	94	XP_001502184.1	183	63.70%	67.60%
Sorex	araneus	Common shrew	Soricomorpha	94	XP_004600963.1	320	63.87%	70.00%
Acinonyx	jubatus	Cheetah	Carnivora	94	XP_026899211	225	61.90%	67.30%
Pteropus	vampyrus	Large flying fox	Chiroptera	94	XP_023376984.1	224	61.80%	64.50%
Lagenorhynchus	obliquidens	Pacific white-sided dolphin	Artiodactyla	94	XP_026974160	192	54.30%	58.00%
Dasypus	novemcinctus	Nine-banded armadillo	Cingulata	102	XP_004474400.1	185	61.30%	67.20%
Orycteropus	afer	Aardvark	Tubulidentata	102	XP_007937762.1	185	59.80%	65.70%
Monodelphis	domestica	Gray short-tailed opossum	Marsupial	160	XP_001363889.1	187	53.80%	60.50%
Phascolarctos	cinereus	Koala	Marsupial	160	XP_020851162.1	187	53.10%	60.20%
Ornithorhynchus	anatinus	Platypus	Monotreme	180	XP_016082126.2	216	33.90%	40.40%
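Sequence identity can be converted into an approximate number of substitutions per site before it is compared with divergence time. The sketch below uses a simple Poisson correction, d = -ln(identity), on a few rows of the table; the source does not state which correction, if any, was used for the rate comparison, so this choice is an assumption.

```python
import math

# (common name, divergence in MYA, percent identity to human), copied from the table.
ORTHOLOGS = [
    ("Gorilla", 8.6, 82.00),
    ("Mouse", 89.0, 63.50),
    ("Koala", 160.0, 53.10),
    ("Platypus", 180.0, 33.90),
]


def poisson_distance(identity_percent: float) -> float:
    """Poisson-corrected substitutions per site from percent identity."""
    fraction = identity_percent / 100.0
    if not 0.0 < fraction <= 1.0:
        raise ValueError("identity must be greater than 0% and at most 100%")
    return -math.log(fraction)


for name, mya, identity in ORTHOLOGS:
    print(f"{name:10s} {mya:6.1f} MYA  d = {poisson_distance(identity):.3f}")
```

Plotting these corrected distances against divergence time, alongside the same quantity for reference proteins, is one way to visualize how quickly a protein accumulates changes.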

Clinical significance

In research, the sequence has been identified as containing a possible pathogenic recessive variant (K53N), among 31 others, for various intellectual disabilities.^[33] The protein is suspected to be an adaptor or cofactor that binds to other molecules. In that case, a non-conservative substitution could alter this binding and potentially contribute to intellectual disability, inguinal hernia, frontal upsweep of hair, macrotia, high palate, hypertonia, hyperreflexia, abnormality of the cerebrum, or vitamin D deficiency.^[33]

References

[1] "NCBI Nucleotide". National Center for Biotechnology Information. 2019-12-06. Retrieved 2020-02-05.
[2] Chou, Peter; Fasman, Gerald D. (1989), "Chou-Fasman Prediction of the Secondary Structure of Proteins", Prediction of Protein Structure and the Principles of Protein Conformation, Springer US, pp. 391–416, doi:10.1007/978-1-4613-1571-1_9, ISBN 978-1-4612-8860-2
[3] Prabi-Gerland. "GOR4 secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2020-05-01.
[4] "SOPMA secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2020-05-01.
[5] "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2020-05-01.
[6] "C16orf90 protein expression summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2020-05-01.
[7] "C16orf90 [Homo sapiens] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[8] "AceView: Gene:C16orf90". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[9] "C16orf90 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[10] "NetPhos 3.1: Phosphorylation Site Predictor". www.cbs.dtu.dk. Retrieved 2020-05-01.
[11] "YinOYang 1.2: O-ß-GlcNAc Site Prediction". www.cbs.dtu.dk. Retrieved 2020-05-01.
[12] "C16orf90 Gene - GeneCards | CP090 Protein | CP090 Antibody". www.genecards.org. Retrieved 2020-05-01.
[13] "C16orf90 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
[14] "Gene: C16orf90 - Summary - Homo sapiens - Ensembl genome browser". useast.ensembl.org. Retrieved 2020-05-01.
[15] "Clustal Omega". Multiple Sequence Alignment. EMBL-EBI. Retrieved 2020-02-17.
[16] "Compute pI/Mw Tool". ExPASy. Bioinformatics Resource Portal. Retrieved 2020-04-14.
[17] "SOSUI: Solubility Prediction". harrier.nagahama-i-bio.ac.jp. Retrieved 2020-05-01.
[18] "PSORT II k-NN Prediction". psort.hgc.jp. Retrieved 2020-05-01.
[19] "TMHMM 2.0: Transmembrane Domain Prediction". www.cbs.dtu.dk. Retrieved 2020-05-01.
[20] "RNAfold". Vienna RNA Web Services. Universität Wien. Retrieved 2020-04-26.
[21] ELM. "Eukaryotic Linear Motif resource". Eukaryotic Linear Motif resource.
[22] "PRABI". SOPMA secondary structure prediction.
[23] "PROSITE". prosite.expasy.org. Retrieved 2020-05-01.
[24] "PHYRE2". Protein Homology/analogy recognition engine.
[25] "Genomatix: Retrieve and analyze promoters". www.genomatix.de. Retrieved 2020-05-01.
[26] "ELM - Eukaryotic Linear Motif resource". elm.eu.org. Retrieved 2020-05-01.
[27] "Gene: C16orf90 - Summary - Homo sapiens - Ensembl genome browser". uswest.ensembl.org. Retrieved 2020-05-01.
[28] "NetGlycate 1.0: Glycation Site Prediction". www.cbs.dtu.dk. Retrieved 2020-05-01.
[29] Bansode, Sneha; Bashtanova, Uliana; Li, Rui; Clark, Jonathan; Müller, Karin H.; Puszkarska, Anna; Goldberga, Ieva; Chetwood, Holly H.; Reid, David G.; Colwell, Lucy J.; Skepper, Jeremy N. (2020-02-25). "Glycation changes molecular organization and charge distribution in type I collagen fibrils". Scientific Reports. 10 (1): 3397. Bibcode:2020NatSR..10.3397B. doi:10.1038/s41598-020-60250-9. ISSN 2045-2322. PMC 7042214. PMID 32099005.
[30] "ProP 1.0 Protopeptide Cleavage Site Predictor". www.cbs.dtu.dk. Retrieved 2020-05-01.
[31] "C16orf90 Orthologs". NCBI gene.
[32] "TimeTree the Divergence of Life". TimeTree. Retrieved 2020-02-23.
[33] Anazi, S. (2017). "Clinical genomics expands the morbid genome of intellectual disability and offers a high diagnostic yield". Molecular Psychiatry. 22 (4). Nature: 615–624. doi:10.1038/mp.2016.113. PMID 27431290. S2CID 29175951. Retrieved 2020-04-26.
