Publications

CysDB: A Human Cysteine Database based on Experimental Quantitative Chemoproteomics

Abstract

Cysteine chemoproteomics studies provide proteome-wide portraits of the ligandability or potential ‘druggability’ of thousands of cysteine residues. Consequently, these studies are enabling resources for closing the druggability gap, namely achieving pharmacological manipulation of ~99% of the human proteome that remains untargeted by FDA-approved small molecules. Recent interactive dataset repositories, such as OxiMouse and SLCABPP, have enabled users to interface more readily with cysteine chemoproteomics studies [1,2]. However, these databases remain limited to single studies and therefore do not provide a mechanism to perform cross-study analyses. Here we report CysDB as a curated community-wide repository of cysteine chemoproteomics data that incorporates high coverage data derived from nine studies generated by the Backus, Cravatt, Gygi, Wang, and Yang research groups. CysDB is a SQL relational database that is publicly available at https://backuslab.shinyapps.io/cysdb/ and features chemoproteomic measures of identification, hyperreactivity, and ligandability for 62,888 cysteines (24% of all cysteines in the human proteome). The CysDB web application also includes annotations of functionality (UniProtKB/Swiss-Prot, Pfam, Panther), known druggability (FDA-approved targets, DrugBank, ChEMBL), disease-relevance and genetic variation (ClinVar, Cancer Gene Census, Online Mendelian Inheritance in Man), and structural features (Protein Data Bank). Showcasing the utility of CysDB, here we report the discovery and enrichment of ligandable cysteines in undruggable classes of proteins, the observation that a subset of cysteines showed marked preference for specific classes of electrophiles (chloroacetamide vs acrylamide), and that ligandable cysteines are present in numerous undrugged disease-relevant proteins. Most importantly, we have designed CysDB for the incorporation of new datasets and features to support the continued growth of the druggable cysteineome.
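The kind of cross-study question a relational layout makes possible can be sketched as follows. This is a minimal illustration only: the table name `identified`, its columns, and the toy rows are hypothetical and are not the CysDB schema or data, which the abstract does not describe.

```python
import sqlite3


def cysteines_by_study_count(rows):
    """Count, for each (protein, residue) pair, how many distinct studies saw it.

    rows: iterable of (study, protein, residue) tuples.
    The single-table layout used here is an assumption for illustration,
    not the actual CysDB schema.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE identified (study TEXT, protein TEXT, residue INTEGER)"
    )
    con.executemany("INSERT INTO identified VALUES (?, ?, ?)", rows)
    cur = con.execute(
        "SELECT protein, residue, COUNT(DISTINCT study) "
        "FROM identified GROUP BY protein, residue "
        "ORDER BY protein, residue"
    )
    result = {(protein, residue): n for protein, residue, n in cur}
    con.close()
    return result


# Toy rows with placeholder study and protein names (not real data).
toy_rows = [
    ("study_A", "PROT1", 10),
    ("study_B", "PROT1", 10),
    ("study_A", "PROT1", 25),
    ("study_C", "PROT2", 7),
    ("study_B", "PROT2", 7),
    ("study_C", "PROT2", 7),  # repeated observation within one study
]
counts = cysteines_by_study_count(toy_rows)
```

Counting distinct studies rather than rows keeps a residue that one study reports several times from looking like independent cross-study support.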

Novel variants in KAT6B spectrum of disorders expand our knowledge of clinical manifestations and molecular mechanisms

Abstract

The phenotypic variability associated with pathogenic variants in Lysine Acetyltransferase 6B (KAT6B, a.k.a. MORF, MYST4) results in several interrelated syndromes including Say-Barber-Biesecker-Young-Simpson Syndrome and Genitopatellar Syndrome. Here we present 20 new cases representing 10 novel KAT6B variants. These patients exhibit a range of clinical phenotypes including intellectual disability, mobility and language difficulties, craniofacial dysmorphology, and skeletal anomalies. Given the range of features previously described for KAT6B-related syndromes, we have identified additional phenotypes including concern for keratoconus, sensitivity to light or noise, recurring infections, and fractures in greater numbers than previously reported. We surveyed clinicians to qualitatively assess the ways families engage with genetic counselors upon diagnosis. We found that 56% (10/18) of individuals receive diagnoses before the age of 2 years (median age = 1.96 years), making it challenging to address future complications with limited accessible information and vast phenotypic severity. We used CRISPR to introduce truncating variants into the KAT6B gene in model cell lines and performed chromatin accessibility and transcriptome sequencing to identify key dysregulated pathways. This study expands the clinical spectrum and addresses the challenges to management and genetic counseling for patients with KAT6B-related disorders.

From chemoproteomic-detected amino acids to genomic coordinates: insights into precise multi-omic data integration

Abstract

The integration of proteomic, transcriptomic, and genetic variant annotation data will improve our understanding of genotype–phenotype associations. Due, in part, to challenges associated with accurate inter-database mapping, such multi-omic studies have not extended to chemoproteomics, a method that measures the intrinsic reactivity and potential “druggability” of nucleophilic amino acid side chains. Here, we evaluated mapping approaches to match chemoproteomic-detected cysteine and lysine residues with their genetic coordinates. Our analysis revealed that database update cycles and reliance on stable identifiers can lead to pervasive misidentification of labeled residues. Enabled by this examination of mapping strategies, we then integrated our chemoproteomics data with computational methods for predicting genetic variant pathogenicity, which revealed that codons of highly reactive cysteines are enriched for genetic variants that are predicted to be more deleterious and allowed us to identify and functionally characterize a new damaging residue in the cysteine protease caspase-8. Our study provides a roadmap for more precise inter-database mapping and points to untapped opportunities to improve the predictive power of pathogenicity scores and to advance prioritization of putative druggable sites. 
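The core arithmetic behind residue-to-genome mapping can be sketched as below. This is a simplified illustration, not the pipeline evaluated in the paper: the function name, the 1-based inclusive interval convention, and the example coordinates are all assumptions, and it presumes the coding exons come from the same transcript version as the protein sequence, which is exactly the assumption the abstract warns can fail when databases update.

```python
def codon_genomic_positions(cds_exons, strand, residue):
    """Return the three genomic positions encoding a given residue.

    cds_exons: list of (start, end) genomic intervals, 1-based inclusive,
               covering only the coding sequence (hypothetical input format).
    strand:    "+" or "-".
    residue:   1-based position of the amino acid in the protein.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")

    # Walk the coding bases in transcription order.
    coding = []
    for start, end in sorted(cds_exons, reverse=(strand == "-")):
        if strand == "+":
            coding.extend(range(start, end + 1))
        else:
            coding.extend(range(end, start - 1, -1))

    first = 3 * (residue - 1)  # 0-based index of the codon's first base
    if residue < 1 or first + 3 > len(coding):
        raise ValueError("residue lies outside the coding sequence")
    return coding[first:first + 3]


# Made-up coordinates: two coding exons, so one codon spans the splice junction.
exons = [(100, 104), (200, 206)]
plus_codon = codon_genomic_positions(exons, "+", 2)
minus_codon = codon_genomic_positions(exons, "-", 3)
```

A codon that spans an exon junction maps to non-contiguous genomic positions, which is one reason a residue cannot simply be converted with a single offset from the gene start.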

See Web Applications for the companion Shiny app.

SP3-FAIMS Chemoproteomics for High-Coverage Profiling of the Human Cysteinome

Abstract

Chemoproteomics has enabled the rapid and proteome-wide discovery of functional, redox-sensitive, and ligandable cysteine residues. Despite widespread adoption and considerable advances in both sample-preparation workflows and MS instrumentation, chemoproteomics experiments still typically only identify a small fraction of all cysteines encoded by the human genome. Here, we develop an optimized sample-preparation workflow that combines enhanced peptide labeling with single-pot, solid-phase-enhanced sample-preparation (SP3) to improve the recovery of biotinylated peptides, even from small sample sizes. By combining this improved workflow with on-line high-field asymmetric waveform ion mobility spectrometry (FAIMS) separation of labeled peptides, we achieve unprecedented coverage of >14000 unique cysteines in a single-shot 70 min experiment. Showcasing the wide utility of the SP3-FAIMS chemoproteomic method, we find that it is also compatible with competitive small-molecule screening by isotopic tandem orthogonal proteolysis–activity-based protein profiling (isoTOP-ABPP). In aggregate, our analysis of 18 samples from seven cell lines identified 34225 unique cysteines using only ∼28 h of instrument time. The comprehensive spectral library and improved coverage provided by the SP3-FAIMS chemoproteomics method will provide the technical foundation for future studies aimed at deciphering the functions and druggability of the human cysteineome.

Sex differences in obesity, lipid metabolism, and inflammation— A role for the sex chromosomes?

Highlights
