Protein chemical shift re-referencing

Protein chemical shift re-referencing is a post-assignment process of adjusting the assigned NMR chemical shifts to match the standards recommended by IUPAC and the BMRB for protein chemical shift referencing. In NMR, chemical shifts are normally referenced to an internal standard that is dissolved in the NMR sample. These internal standards include tetramethylsilane (TMS), 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) and trimethylsilyl propionate (TSP). For protein NMR spectroscopy the recommended standard is DSS, which, unlike TSP, is insensitive to pH variations. Furthermore, the DSS 1H signal may be used to indirectly reference 13C and 15N shifts using a simple ratio calculation.[1] Unfortunately, many biomolecular NMR spectroscopy labs use non-standard methods for determining the 1H, 13C or 15N “zero-point” chemical shift position. This lack of standardization makes it difficult to compare chemical shifts for the same protein between different laboratories. It also makes it difficult to use chemical shifts to properly identify or assign secondary structures, or to improve 3D protein structures via chemical shift refinement. Chemical shift re-referencing offers a means to correct these referencing errors and to standardize the reporting of protein chemical shifts across laboratories.
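The ratio calculation used for indirect referencing can be illustrated with a short sketch. It assumes the frequency ratios (Ξ) relative to the DSS 1H signal tabulated in reference [1]; the function names and the spectrometer frequency are hypothetical and chosen only for illustration.

```python
# Frequency ratios (Xi) relative to the DSS methyl 1H signal, as tabulated
# in reference [1]. The 15N ratio places 0 ppm on the liquid ammonia scale.
XI = {
    "1H": 1.000000000,
    "13C": 0.251449530,
    "15N": 0.101329118,
}


def zero_ppm_frequency(dss_1h_hz: float, nucleus: str) -> float:
    """Absolute frequency (Hz) that defines 0 ppm for the given nucleus."""
    return dss_1h_hz * XI[nucleus]


def to_ppm(observed_hz: float, zero_hz: float) -> float:
    """Convert an absolute frequency to a chemical shift in ppm."""
    return (observed_hz - zero_hz) / zero_hz * 1e6


# Hypothetical absolute frequency of the DSS 1H signal on a "600 MHz" magnet.
dss_1h = 600_130_000.0
c13_zero = zero_ppm_frequency(dss_1h, "13C")
n15_zero = zero_ppm_frequency(dss_1h, "15N")
```

Only the 1H frequency of DSS has to be measured; the 13C and 15N zero points follow from multiplication, which is why no separate carbon or nitrogen reference compound is needed in the sample.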

Importance of NMR chemical shift re-referencing in biomolecular NMR

Incorrect chemical shift referencing is a particularly acute problem in biomolecular NMR.[1] It has been estimated that up to 20% of 13C and up to 35% of 15N shift assignments are improperly referenced.[2][3][4] Because the structural and dynamic information contained within chemical shifts is often quite subtle, protein chemical shifts must be properly referenced for these subtle differences to be detected. Fundamentally, the problem with chemical shift referencing arises because chemical shifts are relative frequency measurements rather than absolute frequency measurements. Because of the historic problems with chemical shift referencing, chemical shifts are perhaps the most precisely measurable but the least accurately measured parameters in all of NMR spectroscopy.[5][3]
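The relative nature of the measurement can be made explicit with the standard definition of the chemical shift. The symbols below (ν for frequencies, ε for the referencing error in ppm) are notation introduced here for illustration and do not come from the cited sources.

```latex
\delta = \frac{\nu_{\text{sample}} - \nu_{\text{ref}}}{\nu_{\text{ref}}} \times 10^{6}\ \text{ppm}

% If the reference position is misplaced by \varepsilon ppm, so that
% \nu'_{\text{ref}} = \nu_{\text{ref}}\,(1 + \varepsilon \times 10^{-6}),
% every shift of that nucleus changes, to first order, by the same amount:
\delta' = \frac{\delta - \varepsilon}{1 + \varepsilon \times 10^{-6}} \approx \delta - \varepsilon
```

A referencing error therefore appears as a nearly constant offset across all shifts of a given nucleus, which is why it can be estimated and subtracted after the assignments have been made.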

Programs for protein chemical shift re-referencing

Because of the magnitude and severity of the problems with chemical shift referencing in biomolecular NMR, a number of computer programs have been developed to help mitigate the problem (see Table 1 for a summary). The first program to comprehensively tackle chemical shift mis-referencing in biomolecular NMR was SHIFTCOR.[2]

Table 1. Summary and comparison of different chemical shift re-referencing and mis-assignment detection programs.[5]

Program [Reference] | Detects or performs shift re-referencing | Detects gross assignment errors | Detects subtle assignment errors | Distinguishes assignment errors from referencing errors | Requires 3D structure
CheckShift [6][7] | Yes | No | No | No | No
AVS [8] | No | Yes | No | No | No
LACS [4][9] | Yes | Sometimes | No | No | No
PSSI [10] | Yes | No | No | No | No
SHIFTCOR [2] | Yes | Yes | Sometimes | Yes | Yes
PANAV [11] | Yes | Yes | Yes | Yes | No

SHIFTCOR: A structure-based chemical shift correction program

SHIFTCOR is an automated protein chemical shift correction program that uses statistical methods to compare an input set of experimentally measured chemical shifts with NMR chemical shifts predicted from the 3D structure of the protein, and to correct the experimental shifts accordingly. SHIFTCOR uses several simple statistical approaches and pre-determined cut-off values to identify and correct potential referencing, assignment and typographical errors. It identifies potential chemical shift referencing problems by computing the difference between the average value of each set of observed backbone (1Hα, 13Cα, 13Cβ, 13CO, 15N and 1HN) shifts and the average of the corresponding predicted chemical shifts. The difference between these two averages yields a nucleus-specific chemical shift offset or reference correction (i.e. one for 1H, one for 13C and one for 15N). To ensure that extreme outliers do not unduly bias these average offset values, the average of the observed shifts is calculated only after potential mis-assignments or typographical errors have been excluded.[2]
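The averaging step described above can be sketched as follows. This is a minimal illustration and not SHIFTCOR's actual implementation: the function name, the use of the median to flag outliers and the cut-off value are all assumptions made for the example.

```python
from statistics import mean, median


def reference_offset(observed, predicted, cutoff_ppm):
    """Estimate a reference offset for one nucleus type.

    observed, predicted: shifts (ppm) for the same atoms, in the same order.
    cutoff_ppm: hypothetical pre-determined cut-off; differences further than
        this from the median difference are treated as possible
        mis-assignments or typos and left out of the average.
    Returns (offset, number_of_excluded_shifts).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [o - p for o, p in zip(observed, predicted)]
    centre = median(diffs)
    kept = [d for d in diffs if abs(d - centre) <= cutoff_ppm]
    return mean(kept), len(diffs) - len(kept)


# Made-up 13C-alpha shifts: a uniform +2.5 ppm referencing error plus one
# grossly wrong entry (the fourth value).
predicted_ca = [58.1, 55.3, 62.0, 52.7, 56.4]
observed_ca = [60.6, 57.8, 64.5, 67.7, 58.9]

offset, n_excluded = reference_offset(observed_ca, predicted_ca, cutoff_ppm=5.0)
corrected_ca = [o - offset for o in observed_ca]
```

With the outlier excluded, the estimated offset stays close to the uniform error, and subtracting it from every observed shift gives the re-referenced values.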

SHIFTCOR output

SHIFTCOR generates and reports chemical shift offsets or differences for each nucleus. The results contain the chemical shift analyses (including lists of potential mis-assignments, the estimated referencing errors, the estimated error in the calculated reference offset (95% confidence interval), the applied or suggested reference offset, correlation coefficients, RMSD values) and the corrected BMRB formatted chemical shift file (see Figure 1 for details).[2]

SHIFTCOR uses the chemical shift calculation program SHIFTX [12] to predict 1Hα, 13Cα and 15N shifts from the 3D structure coordinates of the protein being analyzed. By comparing the predicted shifts to the observed shifts, SHIFTCOR is able to accurately identify chemical shift reference offsets as well as potential mis-assignments. A key limitation of the SHIFTCOR approach is that it requires the 3D structure of the target protein to be available in order to assess the chemical shift reference offsets. Given that chemical shift assignments are typically made before the structure is determined, it was soon realized that structure-independent approaches needed to be developed.[5]

Structure-independent chemical shift correction programs

Several methods have been developed that make use of the estimated (via 1H or 13C shifts) or predicted (via sequence) secondary structure content of the protein being analyzed. These programs include PSSI,[10] CheckShift,[6][7] LACS,[4][9] and PANAV.[11] Both PANAV and CheckShift are also available as web servers.

The PSSI and PANAV programs use the secondary structure determined by 1H shifts (which are almost never mis-referenced) to adjust the target protein’s 13C and 15N shifts to match the 1H-derived secondary structure. LACS uses the difference between secondary 13Cα and 13Cβ shifts plotted against secondary 13Cα shifts or secondary 13Cβ shifts to determine reference offsets. A more recent version of LACS has been adapted to identify 15N chemical shift mis-referencing.[4] This new version of LACS exploits the well-known relationship between secondary 15N shifts and the secondary 13Cα and 13Cβ shifts of the preceding residue.[3] In contrast to LACS and PANAV/PSSI, CheckShift uses secondary structure predicted by high-performance secondary structure prediction programs such as PSIPRED [13] to iteratively adjust 13C and 15N chemical shifts so that their secondary shifts match the predicted secondary structure. These programs have all been shown to accurately identify mis-referenced protein chemical shifts deposited in the BMRB and to re-reference them properly.[7][11] Note that both LACS and CheckShift are programmed to always predict the same offset for 13Cα and 13Cβ shifts, whereas PSSI and PANAV do not make this assumption. As a general rule, PANAV and PSSI exhibit a smaller spread (or standard deviation) in calculated reference offsets, indicating that these programs are slightly more precise than either LACS or CheckShift. Neither LACS nor CheckShift is able to handle proteins that have extremely large (above 40 ppm) reference offsets, whereas PANAV and PSSI seem to be able to deal with these kinds of anomalous proteins.[11]
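The idea behind the LACS-style analysis can be sketched in a few lines. The sketch relies on the assumption, noted above for LACS, that 13Cα and 13Cβ share one offset: the difference between the two secondary shifts is then unaffected by the referencing error, and the intercept of a line fitted through the points estimates the offset. The single ordinary least-squares fit and the function name are simplifications introduced here; the published method is more elaborate.

```python
def lacs_style_offset(sec_ca, sec_cb):
    """Estimate a common 13C reference offset from secondary shifts.

    sec_ca, sec_cb: per-residue secondary shifts (observed minus random-coil
        value, in ppm) for 13C-alpha and 13C-beta.
    Fits sec_ca = slope * (sec_ca - sec_cb) + intercept by ordinary least
    squares and returns the intercept as the offset estimate.
    """
    if len(sec_ca) != len(sec_cb) or len(sec_ca) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    x = [a - b for a, b in zip(sec_ca, sec_cb)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(sec_ca) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("secondary shift differences show no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, sec_ca))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Made-up, idealised secondary shifts with a +1.8 ppm error on both nuclei.
true_ca = [2.0, 1.0, -2.0, -1.0, 0.5]
true_cb = [-2.0, -1.0, 2.0, 1.0, -0.5]
error = 1.8
offset = lacs_style_offset([v + error for v in true_ca],
                           [v + error for v in true_cb])
```

For correctly referenced data the fitted line passes through the origin, so a non-zero intercept is read as the amount by which the 13C shifts need to be corrected.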

In a recent study,[11] a chemical shift re-referencing program (PANAV) was run on a total of 2421 BMRB entries that had a sufficient proportion (>80%) of assigned chemical shifts to perform a robust chemical shift reference correction. A total of 243 entries were found with 13Cα shifts offset by more than 1.0 ppm, 238 entries with 13Cβ shifts offset by more than 1.0 ppm, 200 entries with 13C’ shifts offset by more than 1.0 ppm and 137 entries with 15N shifts offset by more than 1.5 ppm. According to this study, 19.7% of the entries in the BMRB appear to be mis-referenced. Evidently, chemical shift referencing continues to be a significant and as yet unresolved problem for the biomolecular NMR community.[5][11]

References

  1. Wishart, DS; Bigam CG; Yao J; Abildgaard F; et al. (1995). "1H, 13C and 15N chemical shift referencing in biomolecular NMR". Journal of Biomolecular NMR. 6 (2): 135–40. doi:10.1007/bf00211777. PMID 8589602. S2CID 12534280.
  2. Zhang, H; Neal, S. & Wishart, D.S. (Mar 2003). "RefDB: A database of uniformly referenced protein chemical shifts". J. Biomol. NMR. 25 (3): 173–195. doi:10.1023/A:1022836027055. PMID 12652131. S2CID 12786364.
  3. Wishart, DS; Case DA (2001). Use of chemical shifts in macromolecular structure determination. Methods in Enzymology. Vol. 338. pp. 3–34. doi:10.1016/s0076-6879(02)38214-4. ISBN 9780121822392. PMID 11460554.
  4. Wang, L; Markley JL (2009). "Empirical correlation between protein backbone 15N and 13C secondary chemical shifts and its application to nitrogen chemical shift re-referencing". Journal of Biomolecular NMR. 44 (2): 95–99. doi:10.1007/s10858-009-9324-0. PMC 2782637. PMID 19436955.
  5. Wishart, DS (Feb 2011). "Interpreting protein chemical shift data". Progress in Nuclear Magnetic Resonance Spectroscopy. 58 (1–2): 62–87. doi:10.1016/j.pnmrs.2010.07.004. PMID 21241884.
  6. Ginzinger, SW; Gerick F; Coles M; Heun V (2007). "CheckShift: automatic correction of inconsistent chemical shift referencing". Journal of Biomolecular NMR. 39 (3): 223–227. doi:10.1007/s10858-007-9191-5. PMID 17899394. S2CID 42088330.
  7. Ginzinger, SW; Skocibusić M; Heun V (2009). "CheckShift improved: fast chemical shift reference correction with high accuracy". Journal of Biomolecular NMR. 44 (4): 207–211. doi:10.1007/s10858-009-9330-2. PMID 19575298. S2CID 43883142.
  8. Moseley, NH; Sahota G; Montelione TG (Jul 2004). "Assignment validation software suite for the evaluation and presentation of the protein resonance assignment data". Journal of Biomolecular NMR. 28 (4): 341–355. doi:10.1023/B:JNMR.0000015420.44364.06. PMID 14872126. S2CID 14483199.
  9. Wang, L; Eghbalnia HR; Bahrami A; Markley JL (May 2005). "Linear analysis of carbon-13 chemical shift differences and its application to the detection and correction of errors in referencing and spin system identifications". Journal of Biomolecular NMR. 32 (1): 13–22. doi:10.1007/s10858-005-1717-0. PMID 16041479. S2CID 33690078.
  10. Wang, Y; Wishart DS (2005). "A simple method to adjust inconsistently referenced 13C and 15N chemical shift assignments of proteins". Journal of Biomolecular NMR. 31 (2): 143–148. doi:10.1007/s10858-004-7441-3. PMID 15772753. S2CID 21209202.
  11. Wang, B; Wang Y (2010). "A probabilistic approach for validating protein NMR chemical shift assignments". Journal of Biomolecular NMR. 47 (2): 85–99. doi:10.1007/s10858-010-9407-y. PMID 20446018. S2CID 22564072.
  12. Neal, S; Nip AM; Zhang H; Wishart DS (Jul 2003). "Rapid and accurate calculation of protein 1H 13C and 15N chemical shifts". Journal of Biomolecular NMR. 26 (3): 215–240. doi:10.1023/A:1023812930288. PMID 12766419. S2CID 29425090.
  13. McGuffin, LJ; Bryson K; Jones DT (2000). "The PSIPRED protein structure prediction server". Bioinformatics. 16 (4): 404–405. doi:10.1093/bioinformatics/16.4.404. PMID 10869041.
