News Archive
This week we are highlighting the three finest examples of proteomics data made public in 2011. As we
did last year, we are naming the best data in three categories.
to "http://www.antibodypedia.com". Any old links to the ".org" domain will no longer function properly. The GPM interface has been updated and any
users of GPM-XE should perform a software update to convert to the new domain name.
Data set of the week: (2011/12/19)
Virus-induced dilated cardiomyopathy is characterized by increased levels of fibrotic extracellular matrix proteins and reduced amounts of energy-producing enzymes. Overall rating: ![]() ![]() This data is a good example of what can be done using 2D-SDS PAGE DIGE methods when coupled with
high resolution mass spectrometry-based protein identifications. The analysis showed a small number of proteins per
spot, with good clustering of predicted molecular masses (from the protein sequence) in each sample spot. There
was very significant contamination of all of the samples with common adventitious proteins (H. sapiens KRT1, KRT2, KRT9 and KRT10;
B. taurus α- & κ-casein; and S. scrofa trypsin). The high levels of these proteins made some of
the data analysis a bit tricky: the porcine trypsin in particular contained one peptide that was consistently identified as being from
mouse Try10 while it clearly was from the porcine reagent instead. It would be helpful to the entire field if more effort
were put into preventing the contamination of polyacrylamide gels.
Thanks to our users and the general community's commitment to making their data openly available, GPMDB has
grown in a peculiar way: the number of peptide identifications in the system has nearly doubled each year. This
doubling (technically "exponential growth") has had the rather happy consequence of keeping the full
data set surprisingly up-to-date. The pie chart below shows the fraction of peptide identifications in the current
database (410,648,190 total) as a function of the calendar year in which the identifications were added.
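To see why annual doubling keeps the data set recent, consider an idealized model in which the total grows by exactly a fixed factor every year. The sketch below gives the share of the current total contributed by each past year. The function name and the exact factor of 2 are illustrative assumptions, not measured GPMDB values.

```python
def fraction_added(years_back, growth=2.0):
    """Share of today's total that was added `years_back` years ago.

    years_back = 0 is the most recent year. Assumes the total grows by
    exactly `growth` each year, an idealization of "nearly doubled".
    """
    return (1.0 - 1.0 / growth) / growth ** years_back


if __name__ == "__main__":
    for k in range(5):
        print(f"{k} year(s) back: {fraction_added(k):.3f}")
```

Under that assumption half of all identifications always come from the most recent year, and three quarters from the last two.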
Data set of the week: (2011/12/12)
Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Overall rating: ![]() ![]() The five analyses presented here are a good example of the type of MS/MS
identification work that is necessary when setting up a solid SRM/MRM assay for quantitation. There
are several good replicates to establish reproducibility and the MS/MS spectra were generated
on the same type of instrument used to perform the quantitative analysis. The group also paid careful attention
to the chromatography used, which is an under-appreciated necessity for this type of quantitation.
Data set of the week: (2011/12/05)
Phosphoproteomic analysis of Salmonella-infected cells identifies key kinase regulators and SopB-dependent host phosphorylation events. Overall rating: ![]() ![]() The results derived from this data really show the state-of-the-art when using
an Orbitrap with CID and SILAC quantitation to follow the changes in phosphorylation patterns that
occur during a biological event (in this case Salmonella infection in human cells). All aspects of
the measurement (sample preparation, phosphopeptide enrichment, HPLC and mass spectrometry) were performed
with excellent attention to detail and quality. Anyone interested in developing new ways of handling quantitative
proteomics data while simultaneously following a post-translational modification should use these
experiments as a model system for testing their methods.
Data set of the week: (2011/11/27)
A pipeline that integrates the discovery and verification of plasma protein biomarkers reveals candidate markers for cardiovascular disease. Overall rating: ![]() ![]() This data represents the maturing of proteomics measurements into a clinical tool. The experiments
were performed using state-of-the-art techniques and allow the in-depth profiling of the proteins present in
clinically-derived plasma samples for the differential diagnosis of cardiovascular events. Pairing
good, solid experimental technique in the plasma measurements with SRM/MRM methods for more
routine monitoring is probably the pattern many clinically-oriented studies will follow for the next few years.
Data set of the week: (2011/11/20)
Systematic and quantitative assessment of the ubiquitin-modified proteome. Overall rating: ![]() ![]() The experiments that generated this data used affinity purification to select
peptides that had been modified by ubiquitination. The antibody used recognized the unusual addition of Gly-Gly
to the sidechain of lysine, which only occurs in tryptic peptides generated from ubiquitinated proteins. There
have been many studies that used this modification (+114 Da) to identify ubiquitination sites, but these particular
experiments have the largest (and most broadly distributed) set of identified modified lysines in human
proteins currently available. The use of the proteasome inhibitor bortezomib created significantly higher concentrations of
these modified peptides in the cell culture, allowing the antibody pull-down method to be much more effective
than it would have been in untreated cells.
Data set of the week: (2011/11/14)
Comparative phosphoproteome profiling reveals a function of the STN8 kinase in fine-tuning of cyclic electron flow (CEF). Overall rating: ![]() ![]() These results contain some of the best plant phosphorylation information available. The experiments
were very well planned and the analysis was done carefully. Many of the phospho-domains were previously undocumented
and the data was analyzed in a reasonable manner for the resulting manuscript.
Data set of the week: (2011/11/07)
A protein epitope signature Tag (PrEST) library allows SILAC-based absolute quantification and multiplexed determination of protein copy numbers in cell lines. Overall rating: ![]() ![]() The data provided by these experiments is a tremendous resource for anyone interested in
proteomics search engine development, testing or statistical analysis. The first 107
LC/MS/MS runs were generated using individual SILAC-labelled PrEST peptides. There are effectively no contaminants, making these
spectra excellent examples to use for determining algorithm sensitivity and noise rejection. The remaining sets were large, high quality measurements of
mixtures of either normal PrESTs and SILAC heavy HeLa proteins or
SILAC heavy PrESTs and normal HeLa proteins. The multiple
replicates and well-characterized samples make these runs perfect for determining statistical error rates and
comparing predictions from theoretical distributions to laboratory data.
Data set of the week: (2011/10/30)
Proteome-wide mapping of the Drosophila acetylome demonstrates a high degree of conservation of lysine acetylation. Overall rating: ![]() ![]() The MS/MS data generated for this paper was first-rate, using Higher-energy Collisional Dissociation
(HCD) and high accuracy fragment ion mass measurement to produce a large set of excellent Drosophila melanogaster
peptide identifications. This sort of data would normally receive a better rating than a single étoile. However, for some reason the investigators
chose to use urea as part of their experimental sample workup, leading to an observable amount of lysine carbamylation in
their proteins. The presence of these carbamylations (Lys +43 Da) makes unambiguously determining acetylation (Lys +42 Da)
much more difficult than would have been necessary if a urea-free sample workup protocol had been utilized.
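One way to see the difficulty is to compute the two mass shifts from their elemental compositions. The sketch below is an illustration rather than the reviewers' own analysis: the helper names are hypothetical, and the comparison with the 13C isotope spacing is offered as one plausible source of ambiguity.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.9949146196}

ACETYL = {"C": 2, "H": 2, "O": 1}            # net addition on acetylation
CARBAMYL = {"C": 1, "H": 1, "N": 1, "O": 1}  # net addition on carbamylation
C13_SPACING = 1.0033548                      # 13C minus 12C mass difference (Da)


def mono_mass(composition):
    """Sum the monoisotopic masses for an element -> count mapping."""
    return sum(MONO[element] * count for element, count in composition.items())


def gap_ppm(peptide_mass):
    """Gap (ppm) between a carbamylated peptide and the first 13C
    isotope peak of the same peptide carrying an acetyl group."""
    delta = mono_mass(CARBAMYL) - mono_mass(ACETYL)
    return abs(C13_SPACING - delta) / peptide_mass * 1e6


if __name__ == "__main__":
    print(f"acetyl   {mono_mass(ACETYL):.6f} Da")
    print(f"carbamyl {mono_mass(CARBAMYL):.6f} Da")
    print(f"gap at 2000 Da: {gap_ppm(2000.0):.2f} ppm")
```

The two modifications differ by a little under 1 Da, which is within roughly 0.008 Da of the 13C isotope spacing, so telling them apart depends on accurate mass measurement.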
Data set of the week: (2011/10/23)
A phospho-proteomic screen identifies substrates of the checkpoint kinase Chk1. Overall rating: ![]() ![]() Anyone interested in targeted phosphopeptide analysis should look at this
data carefully. The methods used here generated identifications that were > 99% phosphopeptides, for
the very specific proteins of interest in the cell-cycle checkpoint kinase Chk1 system. Every aspect of
the measurements was done well, while collecting a very small number of spectra compared to other techniques.
Even though there are relatively few spectra, there were a surprising number that were either unique
or the best obtained for that particular sequence.
The results showed that there was a significant difference in the rate of processing spectra, depending
on the processor used. Predictably, the newest processors aimed at the gaming market (AMD Phenom X6 and the Intel i7-2600)
performed the best. The i7-2600 was clearly the winner, processing 1 spectrum every 600 microseconds. The following table
gives a few more details on the processors used.
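For scale, the quoted per-spectrum time converts to a throughput as follows. This is a trivial sketch; the function names and the 100,000-spectrum example are illustrative only.

```python
def spectra_per_second(microseconds_per_spectrum):
    """Convert a per-spectrum processing time into spectra per second."""
    return 1_000_000 / microseconds_per_spectrum


def seconds_for(n_spectra, microseconds_per_spectrum):
    """Total processing time in seconds for n_spectra at the given rate."""
    return n_spectra * microseconds_per_spectrum / 1_000_000


if __name__ == "__main__":
    print(round(spectra_per_second(600)))   # spectra per second at 600 microseconds each
    print(seconds_for(100_000, 600))        # seconds for a hypothetical 100,000 spectra
```

At 600 microseconds per spectrum this is about 1,667 spectra per second, so a hypothetical run of 100,000 spectra would take about a minute.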
Data set of the week: (2011/10/16)
Global network analysis of drug tolerance, mode of action and virulence in methicillin-resistant S. aureus. Overall rating: ![]() ![]() The data collected here was for a focussed study which was
well suited to analysis using a QQ-TOF style instrument and isobaric tags for relative and absolute quantitation.
Using the results the authors were able to draw some conclusions about changes in the concentrations
of the most abundant proteins in S. aureus, caused by their specific experimental conditions. The
protein concentration limit of detection was significantly higher than might be expected for
a survey-style proteomics study but in this case it was the perturbations in metabolic proteins
that were the desired measurement, rather than a thorough catalogue of all proteins present.
Data set of the week: (2011/10/09)
DNA affects the composition of lipoplex protein corona: A proteomics approach. Overall rating: ![]() ![]() This data was a nice demonstration of the use of protein isolation
methods to generate a much-reduced set of proteins (compared to blood plasma) associated with
a very specific biomedically-relevant stimulus. The identifications were sound and the
overall experimental setup produced a good set of appropriate peptides for the
proteins found in this study, all of which are well-known plasma proteins.
A hardware failure has shut down the GPM's FTP site for the next few days,
until we can get replacement equipment and put it on-line.
Data set of the week: (2011/09/18)
Shotgun proteomic analysis of the unicellular alga Ostreococcus tauri. Overall rating: ![]() ![]() This paper does an excellent job of characterizing the proteome of a very unusual
eukaryote, Ostreococcus tauri.
Discovered in 1994, it is still the smallest known eukaryote in size — at 0.8 microns in diameter, 1000 O. tauri
cells would fit in a HeLa cell, with plenty of room left over. This data set thoroughly examines the proteome
of the organism, which has significant sequence divergence from the model eukaryotes commonly used in proteomics experiments. Any group interested in
the molecular evolution of phosphorylation signalling should find their phosphopeptide isolations instructive.
This data holds the modern record for the sheer volume of tryptic peptide sequences that had never been observed before these spectra became publicly available.
The methods used here should serve as a guide for anyone interested in characterizing the proteome of a novel, single-celled eukaryote.
Data set of the week: (2011/09/11)
Quantitative phospho-proteomics to investigate the Polo-like kinase 1-dependent phospho-proteome. Overall rating: ![]() ![]() What separated this study from other surveys of HeLa cell phosphopeptides was the
use of a SILAC approach that has significant benefits. Rather than relying on
metabolic incorporation of heavy amino acids, this study used light and heavy methyl groups, added to
the acidic groups of the cleaved peptides (Glu, Asp and C-terminus). This treatment
blocked all of the acidic groups in these peptides, except for the phosphorylated Ser, Thr and Tyr residues.
Because of this protocol, the
IMAC enrichment produced an unusually pure set of phosphopeptides that were not dominated by peptides
containing additional acidic side chains, as is often the case with IMAC experiments. It also
generated particularly simple, accurate peptide quantitation.
Data set of the week: (2011/09/04)
Proteomic analysis of outer membrane vesicles derived from Pseudomonas aeruginosa. Overall rating: ![]() ![]() The data reported here gives a first look at the outer membrane proteins
of this important pathogenic species. The proteins discovered and the techniques used provide
an excellent comparison with the proteins found for the related species, Pseudomonas syringae, in
a previously featured data set. The results would have been more
broadly applicable at the peptide level if the chromatography had been better, but the proteins
identified were based on very good ion-trap spectra and the data analysis used
in the manuscript was appropriate.
The collaborative process of formulating the provocative questions
should engage the NCI’s scientific community in serious debate and energize the NCI’s many constituencies
(advocacy groups, health professionals, Members of Congress, and others) about the prospects for improving
the welfare of cancer patients through research. These other constituencies are encouraged to take part in
the "Provocative Questions" enterprise through discussions and activities ...
Data set of the week: (2011/08/29)
A tissue-specific atlas of mouse protein phosphorylation and expression. Overall rating: ![]() ![]() The data gives a general survey of the most abundant phosphopeptides that
were found in nine different mouse tissue samples. The phosphopeptide enrichment was lower than
in other, more specific studies and the chromatography was somewhat less consistently performed than
has become best-practice in the field. The study did, however, provide many good observations of phosphorylation
sites in proteins that are not well-represented in cell culture studies.
Data set of the week: (2011/08/21)
Quantitative phosphoproteomics identifies substrates and functional modules of Aurora and Polo-like kinase activities in mitotic cells. Overall rating: ![]() ![]() This paper provides a good survey of the phosphopeptides present in HeLa cells and
should be viewed as a model for further study of quantitative phosphoproteomics in cell culture. The
experimental analysis used CID fragmentation and it demonstrates very clearly that it is not
necessary (or desirable) to use ETD when looking for sensitive, reproducible phosphopeptide
quantitation. The data analysis in the paper has some flaws, but the conclusions were reasonable and within
the limitations of the analytical approach that was used.
FOA 1: Technology Development: MS-based protein ID and quantitation. (Years 1-5)
Goals include:
FOA 2: Technology Development: Non-MS-based protein ID and quantitation.
Goals include:
Data set of the week: (2011/08/14)
Proteome profiling of wild type and lumican-deficient mouse corneas. Overall rating: ![]() ![]() These experiments truly answered the question: "What proteins are present in
mouse corneas?" It contains excellent observations of many not-so-common collagens, keratins and a variety of other
proteins associated with intermediate filaments, such as desmoplakin, periplakin, envoplakin and
uroplakin. The original data analysis presented in the paper was very deeply flawed: it should not
be considered reliable. The data itself, though, was an excellent example of the benefits of using an
Orbitrap-LTQ hybrid instrument with a sensitive HCD collision cell.
Data set of the week: (2011/08/08)
Proteomic analysis of microvesicles derived from human colorectal cancer ascites. Overall rating: ![]() ![]() The experiments performed here provide about as much information as can be obtained
from a clinically obtained sample (in this case ascites from human colorectal cancer patients) using gel band analysis and
an LTQ mass spectrometer. The identifications were good quality and they provide a good template for the proteins
to be expected in the micro-vesicular fraction of this class of clinical isolates. The results were
relatively free of artifacts and comparison of the three isolates provides an interesting example of the
variability that can be expected from real samples related only by their method of isolation.
For anyone interested, these three result sets can be used to compare the utility of
a purely web-based system (GPMDB) with a local client computer app (PRIDE's new PRIDE Inspector utility). To use
PRIDE Inspector, click on the "PRIDE" link for any of the three data sets and then click on the
red "PRIDE Inspector" link on the resulting page. You will need to have Java installed on your computer
(this will not work on most smart phones or iPad tablets).
The 5th issue of the EuPA bulletin has been released. This month it contains a message from the president
and the latest EuPA news, information from the Italian and Turkish proteomics societies, meeting reports,
plant proteomics initiative reports,
information from the Journal of Proteomics, and much other information from the proteomics world.
Data set of the week: (2011/07/31)
Global profiling of proteolysis during rupture of Plasmodium falciparum from the host erythrocyte. Overall rating: ![]() ![]() This study generated a large number of gel bands from a critical point in the life cycle
of the protozoan parasite Plasmodium
falciparum in the context of its normal home for the part of its life cycle as the causative
agent of malaria, the human erythrocyte. The results provide insights into the organism's metabolism as
it exists as a schizont containing multiple merozoites (inside an erythrocyte) and the subsequent rupturing of
the infected erythrocyte. The data provides an excellent example of the bioinformatics challenges associated with
the analysis of multi-proteome samples, even when they are nicely isolated into gel bands and the
proteomes have little sequence overlap.
Data set of the week: (2011/07/24)
in vivo versus in vitro protein abundance analysis of Shigella dysenteriae type 1 reveals changes in the expression of proteins involved in virulence, stress and energy metabolism. Overall rating: ![]() ![]() These experiments provided the most comprehensive collection of peptide identifications
for the important pathogenic enterobacteria species Shigella dysenteriae,
a close relative of the common Escherichia coli. Type 1 S. dysenteriae causes a severe form of dysentery
referred to as shigellosis. The experiments reported here use whole cell lysates to try to understand protein
abundances using label-free methods. The proteins found showed significant cleavage at non-tryptic sites (up to 10% of identified peptides), probably
caused by endogenous proteases in the lysate itself rather than simple chymotryptic activity in the cleavage reagent used.
The peptide identifications also revealed extensive deamidation of both Q and N residues.
D4.1 - ProteomeXchange repository data flow definition, and D4.2 - ProteomeXchange metadata format definition.
D4.1 describes the overall vision of the central role of PRIDE in archiving and maintaining the tables of
identifications produced for publications in addition to their established role of generating new XML formats to set these tables in context.
D4.2 describes the first of these new XMLs — ProteomeXchangeDataset. This new XML will be used to describe data
submissions to PRIDE (in a very similar way to the existing PRIDE submission XML), but with new field
names and some new fields for additional ontology information. As well, there will be provision for an overall accession number to be
generated by the new EBI entity ProteomeCentral, which has a tentative launch date of Dec. 31, 2012. Links to
files coded in this new XML will be made available via another XML, the RDF Site Summary (RSS).
RSS feeds are commonly used by information providers to list updates to a web site. If you are unfamiliar with RSS
feeds, try the existing
feeds for PRIDE, Tranche and
GPMDB's Protein-of-the-day to see
what sort of information they can make available.
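For readers who have not handled a feed programmatically, the sketch below lists the entries in an RSS document using only the Python standard library. The sample feed is invented for illustration and assumes a plain RSS 2.0 layout without namespaces. Real PRIDE, Tranche or GPMDB feeds may be structured differently, for example as namespaced RSS 1.0/RDF.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, used only to make the example self-contained.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example data set feed</title>
    <item>
      <title>Data set A</title>
      <link>http://example.org/datasets/A.xml</link>
    </item>
    <item>
      <title>Data set B</title>
      <link>http://example.org/datasets/B.xml</link>
    </item>
  </channel>
</rss>"""


def list_items(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


if __name__ == "__main__":
    for title, link in list_items(SAMPLE_FEED):
        print(title, link)
```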
GPMDB adopts the Human Genome Variation Society conventions for amino acid polymorphisms (2011/07/19)
We will maintain the use of the RefSNP to track the origins of snAPs, but to serve our wider needs for a protein splice
specific method of tracking sAPs in general, we have adopted the
Human Genome Variation Society
nomenclature recommendations for protein
sAPs. This system is fairly simple and it is readily mapped onto any set of protein accession numbers that a
user might like to use. For example, the snAP corresponding to the SNP "rs30855079" can now be accessed using
the HGVS-style nomenclature:
ENSMUSP00000107760:p.I541V, or
ENSMUSP00000107760:p.Ile541Val
where "ENSMUSP00000107760" is the accession number for the protein (mouse Pzp) and "I541V" is the original residue (I), its position in the protein (residue #541) and the mutated residue (V). If the identity of either residue is unknown, either "X" or "Xxx" may be substituted as a wild-card place holder. A specific snAP in this format can be accessed either by entering that value into the GPMDB SNAP interface or directly as a URL using the convention:
http://gpmdb.thegpm.org/protein/snap/ENSMUSP00000107760:p.I541V
The accession number can be any that have been used by the GPM, such as yeast "Y" ORF numbers. NCBI gi numbers and SwissProt accessions require their normalized formats "gi|...|" and "sp|...|", respectively.
Data set of the week: (2011/07/17)
Glycoprotein capture and quantitative phosphoproteomics indicate coordinated regulation of cell migration upon lysophosphatidic acid stimulation. Overall rating: ![]() ![]() These experiments demonstrate the value of using a multiple-step affinity purification
strategy to investigate molecules of interest. Here the authors use a combination of lectins to capture glycoproteins and
titanium oxide to capture highly acidic peptides. These peptides allowed them to investigate cell surface protein responses to lysophosphatidic acid
treatment. The set of peptides captured were quite different from a typical metal-oxide pulldown experiment,
as the intracellular proteins with large numbers of high occupancy phospho-domains that tend to dominate the results
were mainly absent (such as the usual suspects SRRM2, P53BP1, TRIM28, MAP1A, NPM, et fratres eorum). These high abundance phosphoproteins
do not have the necessary glycosylation to have been pulled-down in the first step and therefore they were almost completely removed. This simple
purification procedure allowed the reliable detection and quantitation of relatively low occupancy phospho-domains, such as those in WNK1,
PTPRK and DTX3L.
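Returning to the HGVS-style identifiers adopted in the 2011/07/19 item above, the following sketch shows one way such a string could be split into its parts and turned into a GPMDB SNAP URL. The function names and the regular expression are illustrative assumptions rather than GPMDB code. Only the identifier format and the URL convention come from the text.

```python
import re

# Three-letter to one-letter residue codes; "Xxx"/"X" is the wild card.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xxx": "X",
}

RESIDUE = r"[A-Z][a-z]{2}|[A-Z]"
PATTERN = re.compile(
    rf"^(?P<acc>[^:]+):p\.(?P<ref>{RESIDUE})(?P<pos>\d+)(?P<alt>{RESIDUE})$"
)


def parse_snap(identifier):
    """Split 'ACCESSION:p.I541V' (or the three-letter form) into
    (accession, original residue, position, mutated residue)."""
    match = PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not an HGVS-style protein variant: {identifier!r}")

    def one_letter(code):
        # Three-letter codes are looked up; one-letter codes pass through.
        return THREE_TO_ONE[code] if len(code) == 3 else code

    return (match["acc"], one_letter(match["ref"]),
            int(match["pos"]), one_letter(match["alt"]))


def snap_url(identifier):
    """Build the access URL following the convention quoted in the text."""
    return "http://gpmdb.thegpm.org/protein/snap/" + identifier


if __name__ == "__main__":
    print(parse_snap("ENSMUSP00000107760:p.Ile541Val"))
    print(snap_url("ENSMUSP00000107760:p.I541V"))
```

Both the one-letter and three-letter spellings parse to the same tuple, which makes it straightforward to compare identifiers written in either style.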
Starting today, July 5th 2011, researchers in all EU member states and associated countries can submit a project proposal via the online application system of PRIME-XS. European researchers can request access to proteomics techniques at the six access facilities of PRIME-XS via an online application. Researchers can choose a preferential access facility where the project should be carried out and propose the proteomics technology they would like to use. All project proposals will be peer reviewed by independent reviewers. If the application is positively evaluated, the researcher is allowed to perform the experiment at the access facility. The users can get practical support with final sample preparation and staff of PRIME-XS will perform the proteomics data acquisition. Users will be able to visit the access facility, gain experience on sample preparation, sample analysis and data handling and analysis.
PRIDE is currently undergoing unplanned but necessary database maintenance and normal service should resume by Wednesday, July 13. This means that no new submissions are going to be processed until that time and users are encouraged not to create new user accounts as there might be some disruptions during this time. Thank you for your understanding.
Data set of the week: (2011/07/10)
A high-quality catalog of the Drosophila melanogaster proteome. Overall rating: ![]() ![]() The work was one of the best of the once popular attempts to create a full-body proteome atlas of
an organism. In this case a model organism of historical interest, the fruit fly, was used and a large number
of Thermo LTQ and LCQ Classic runs were recorded. While an achievement at the
time (only 5 years ago), the relatively small number of identifications obtained per run and the very small amount of
quantitative information available make this study seem a little dated. However, it still provides quite
a bit of insight about the most abundant proteins present in D. melanogaster and a general overview of those proteins' relative
concentration in a variety of organs and developmental stages, such as
larvae,
pupa membranes,
adult heads,
adult membranes,
adult membranes, and
adult brains.
Data set of the week: (2011/07/04)
A cost-benefit analysis of multidimensional fractionation of affinity purification-mass spectrometry samples. Overall rating: ![]() ![]() These experiments were performed to provide a systematic evaluation of the use
of several common sample preparation/separation techniques for the analysis of the type of affinity purified samples
commonly used to determine protein-protein interaction partners. In this type of experiment the total number of proteins
identified has to be carefully balanced against the background level proteins present due to non-specific protein interactions.
The authors do a careful job of applying common methods and studying the results provides a number of interesting
case studies that can be used in both planning experiments and teaching practitioners (even experienced ones) about the
intricacies of this important class of samples.
Data set of the week: (2011/06/27)
Accurate quantification of more than 4000 mouse tissue proteins reveals minimal proteome changes during aging. Overall rating: ![]() ![]() This study is a large, multiple tissue examination of the effects of aging on
the proteome of M. musculus. The results give a very good survey of the distributions of proteins
that can be studied by whole mouse SILAC in a set of tissues: heart, kidney, cerebellum, frontal cortex, and hippocampus. The
interesting finding of the study was that there was little quantitative change in the proteins found:
aging seems to be a more subtle effect than can be accounted for by gross changes in a tissue's proteome composition.
A new link has been added to the main model display in GPM to allow users to generate
their own GPMDB-SQLite database for any GPM result online. Simply click the "sqlite" link on
a model page you are interested in (the link position is illustrated
below) and you will be taken to a page that will track the generation of the associated
".gpmdb" database file. It takes some time to create the new database, so please be patient.
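Once a ".gpmdb" file has been downloaded it can be opened with any SQLite client. The sketch below uses Python's standard sqlite3 module to list the tables in such a file. The file name is a placeholder and nothing is assumed about the GPMDB-SQLite schema itself, which should be inspected before writing queries.

```python
import sqlite3


def list_tables(path):
    """Return the names of all tables in the SQLite file at `path`."""
    connection = sqlite3.connect(path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # "example.gpmdb" is a hypothetical file name for a downloaded database.
    print(list_tables("example.gpmdb"))
```

Note that sqlite3.connect creates an empty file if the path does not exist, in which case the function returns an empty list.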
Data set of the week: (2011/06/19)
Large scale phosphoproteome profiles comprehensive features of mouse embryonic stem cells. Overall rating: ![]() ![]() When the authors referred to their study as "Large scale", they were not kidding.
The data made available rather thoroughly captures the proteins and peptides that can be observed
using current technology from whole cell lysates of mouse embryonic stem cells. The identifications were
very high quality and the chromatography was consistent. The only small flaw was the trypsin used: it
cleaved bonds between K-P, R-P and H-X more frequently than one might hope in a study of this sort. It is not
uncommon that trypsin will cleave these non-canonical sites, but the frequency of this type of cleavage in this study
was unusually high.
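For reference, the conventional trypsin rule (cleave after K or R unless the next residue is P) can be sketched as below. This is a generic illustration, not the method used in the study or by GPM. The function names and the example sequence are made up.

```python
import re


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [piece for piece in pieces if piece]


def is_canonical_site(n_side, c_side):
    """True if cleavage between residues n_side and c_side follows the rule.

    K-P and R-P bonds, and bonds after any residue other than K or R
    (such as H-X), are reported as non-canonical.
    """
    return n_side in ("K", "R") and c_side != "P"


if __name__ == "__main__":
    print(tryptic_digest("AAKPGGRTTKLL"))
    print(is_canonical_site("K", "P"), is_canonical_site("H", "A"))
```

Counting how many identified peptide termini fail `is_canonical_site` is one simple way to quantify the kind of non-canonical cleavage frequency discussed above.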
This is possibly the first use of a protein sequence to generate music. It was
developed by the SMART (Science Meets ART) collective,
and in their words: [to] use music to describe the complexity of biomolecules (nuclear acids, DNA and RNA, proteins etc) unifying one more the linkage between Science and Art.
Data set of the week: (2011/06/13)
A comprehensive map of the human urinary proteome. Overall rating: ![]() ![]() If you have any interest in developing a diagnostic test that uses human urine, you should
take a good close look at the data in this study. The investigators used the most up-to-date techniques (Orbitrap-Velos using HCD)
and one important type of protein fractionation (lectin pull-down). The results give quite a clear picture
of the major and minor proteins present in urine and it provides a nice map to the peptides and modifications
that can be expected from this important class of clinical samples.
Data set of the week: (2011/06/06)
Proteomics analysis of the cardiac myofilament subproteome reveals dynamic alterations in phosphatase subunit distribution. Overall rating: ![]() ![]() This study provides some interesting insights into the protein composition of rat
cardiac myocytes, both in control and treated cases. The data clearly supports the conclusions in the
paper and it also provides many of the best observations of the cardiac muscle proteins associated with
these cells. There has been significantly less attention to rat proteomics than to mouse or human, so
quality data sets such as this one significantly improve what is known about this important model species.
Data set of the week: (2011/05/30)
Novel In Situ Collection of Tumor Interstitial Fluid from a Head and Neck Squamous Carcinoma Reveals a Unique Proteome with Diagnostic Potential. Overall rating: ![]() ![]() These results give an excellent insight into the proteins that can be
expected in interstitial fluid, a clinically important fluid that has not been studied extensively by proteomics
methods. The composition of the fluid was most similar to blood plasma and plasma-derived fluids, e.g. saliva,
urine or cerebrospinal fluid. Anyone planning to do an experiment involving interstitial fluid should
examine these results carefully.
Data set of the week: (2011/05/24)
Proteomic analysis reveals a virtually complete set of proteins for translation and energy generation in elementary bodies of the amoeba symbiont Protochlamydia amoebophila. Overall rating: ![]() ![]() The results presented in this paper constituted the first proteomics information available
about an amoeboid obligate symbiont of the Acanthamoeba spp.
These common amoebae are only rarely pathogenic; however, studying their symbiont's metabolism may provide
insight into the molecular basis of the eukaryote/prokaryote endosymbiotic relationships that seem to be very common in nature. The recent availability of
the symbiont's genome made the use of proteomics techniques possible. The combination of methods used in this
study was a little unusual, but it resulted in a good survey of the proteins in the organism, adding
1447 P. amoebophila proteins
to GPMDB.
... Given the presence of about 30% undisclosed proteins out of 20,300 protein gene products,
a systematic global effort is necessary to achieve this goal with respect to protein abundance,
distribution, subcellular localization, interaction with other biomolecules, and functions at
specific time points. As a general experimental strategy, HPP groups employ the three working
pillars for HPP: mass spectrometry, antibody capture, and bioinformatics tools and knowledge base.
The HPP participants will take advantage of the output and cross-analyses from the ongoing HUPO initiatives
and a chromosome-based protein mapping strategy, termed C-HPP with many national teams currently engaged ...
Data set of the week: (2011/05/15)
Multi-omics approach to study the growth efficiency and amino acid metabolism in Lactococcus lactis at various specific growth rates. Overall rating: ![]() ![]() This study was an outstanding example of the application of proteomics methods carefully
and methodically to a problem in biotechnology. All of the aspects of the investigation — experimental design, sample preparation,
chromatography and mass spectrometry — were well thought out and executed with a consistent attention
to detail and quality. The experiments reported in the paper go well beyond simply performing proteomics experiments by the use of other 'omics approaches,
significantly increasing the value of the proteomics results. The information generated by this study has greatly expanded general knowledge of the proteome of
Lactococcus lactis, one of the most important bacteria in the food processing industry. It
also provides a good basis for understanding aspects of this organism's metabolism.
![]()
HUPO and HUPO Industry Advisory Board (IAB) are pleased to announce that the nomination period for
the new “HUPO Science Technology Award” is now open.
The technical award should be presented at the HUPO Annual World Congress to an individual whose
contributions drove a proteomic based technological product or procedure to commercial success.
The industrial based individual should be a key player in the commercialization (either R&D or
marketing) of a proteomics based technology (but does not necessarily have to be the original
inventor). Although academic settings often provide initial design of a new technology or technique, this
award is intended to pay recognition to the industrial partnership that developed a proteomic
based tool or application into a format that allows the advancement of the whole scientific community.
![]() Data set of the week: (2011/05/08)
Large-scale label-free quantitative proteomics of the pea aphid-Buchnera symbiosis. Overall rating: ![]() ![]() These experiments explore the proteomics of the relationship between the pea aphid, Acyrthosiphon pisum,
and its endosymbiont bacterium Buchnera aphidicola.
Buchnera bacteria are obligate endosymbionts in aphids, having lost the metabolic pathways necessary to be free living organisms.
The recent availability of the genomes of both the aphid and the bacterium makes it possible to do a thorough job of examining
the proteins present from both genomes in the intact organism. The results clearly demonstrate that any investigation of insect proteomics should be very
mindful of selecting an appropriate mixture of proteomes when analyzing raw data. This data set should also be
revisited when the genomes of other secondary endosymbionts of the pea aphid become known, such as
Hamiltonella defensa, Regiella insecticola, and Serratia symbiotica.
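As an illustration of what "an appropriate mixture of proteomes" can mean in practice, here is a minimal sketch that merges several FASTA-formatted proteomes into one search database and tags every entry with its source. The function names, the labels "host" and "symbiont", and the toy sequences are hypothetical and are not taken from the study; this is one possible approach, not the method the authors used.

```python
from typing import Dict, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_proteomes(proteomes: Dict[str, str]) -> str:
    """Combine several proteomes into one FASTA text.

    Each header is prefixed with a source label so that an identified
    protein can be traced back to the genome it came from.
    """
    out = []
    for label, text in proteomes.items():
        for header, seq in parse_fasta(text):
            out.append(f">{label}|{header}\n{seq}")
    return "\n".join(out) + "\n"


# Toy sequences (hypothetical, for illustration only).
host_fasta = ">p1 example host protein\nMKT\nAAA\n"
symbiont_fasta = ">b1\nMSS\n"

merged = merge_proteomes({"host": host_fasta, "symbiont": symbiont_fasta})
print(merged)
```

Keeping the source label in the header makes it straightforward to add further proteomes later, for example when the genomes of secondary endosymbionts become available.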
![]() From the HUPO web site: A workshop took place in Busan (Korea) on March 30, 2011 for the creation of the HPP consortium.
A short summary of the discussion is provided, followed by the recommendations and decisions forwarded
to the HUPO Executive committee that validated these decisions on April 5, 2011.
Data set of the week: (2011/05/01)
Large-scale Arabidopsis phosphoproteome profiling reveals novel chloroplast kinase substrates and phosphorylation networks. Overall rating: ![]() ![]() This study was a very successful application of the prefractionation techniques that
have been developed to enrich phosphopeptides. The detailed examination of plant phosphoproteomics has been well behind fungal (yeast)
and animal (human/mouse) studies, but this series of experiments shows conclusively that the same methods
can be used to great effect. The data was of sufficient quality to allow the identification of more than 2,000 phosphopeptides per run. The identifications
show the enrichment of acidic residues characteristic of metal oxide enrichment schemes.
The displayed information for proteins sourced from the US NCBI has been augmented by the
addition of Conserved
Domains Database (CDD) information to the display
(from the example
GPM64300013159):
![]()
The domain information is displayed immediately below the protein's text description
line. Each domain is linked back to CDD for additional information and an excerpt of the domain's description
is also displayed. A more detailed version of this information is available for each protein by clicking
on the "protein" link and reading the NCBI information sheet at the bottom of the page. If there are multiple examples
of a specific domain in a protein, the CDD link is followed by the number of times that domain is repeated. The CDD
information will be displayed for all proteins with "gi"-type accession numbers.
![]() The results of this study demonstrated the importance of examining specific tissues
in an organism, even one with as few differentiated organ systems as C. elegans. Even though C. elegans
is well represented in GPMDB (> 1,000,000 protein ids), this study contains many top ranking identifications
for specific proteins, almost certainly because of the relatively high concentration of those proteins in the oocyte. The
data itself was taken in a very consistent manner, with each gel band having good correlation between the detected
gene product molecular masses. With 6,691 total protein ids, this rather modest study provides a very
comprehensive view of the C. elegans oocyte proteome.
Data set of the week: (2011/04/17)
Identification of outer membrane proteins from an Antarctic bacterium Pseudomonas syringae Lz4W. Overall rating: ![]() ![]() This study demonstrates how to gain significant insights into prokaryotic cell organization
using proteomics techniques, once you have a good genome sequence for a closely related species (or two). The species
under study here was a plant pathogen — Pseudomonas syringae —
that has the singular ability to elevate the freezing point of water. This paper focuses on a cryophilic
strain of the bacterium in an attempt to understand how it can function effectively in a rather extreme environment.
The authors do a good job of using a proteomics strategy to acquire useful information about the organism's biology.
Data set of the week: (2011/04/10)
Improved Peptide Identification by Targeted Fragmentation Using CID, HCD and ETD on an LTQ-Orbitrap Velos. Overall rating: ![]() ![]() These results were produced by a well thought-out study to determine
the validity of various claims that have been made about the efficacy of the three most
popular fragmentation modalities for MS/MS-based proteomics: CID, ETD and HCD. Each
of these mechanisms was given a good workout and a fair, side-by-side comparison was
made without apparent bias. If you are choosing among these
methods for an upcoming experiment, it would be well worth your while to look at this
comparative study to assist you in making up your own mind.
![]() Several countries have already signed up, including Australia, Canada, China, Japan, Russia,
South Korea, Sweden, Switzerland and the USA, and it is under active consideration elsewhere,
e.g. in France and Germany. There may be some major scientific advantages in participation but, equally,
there may be opportunity costs.
Additionally, gene-, protein- and disease-centric strategies for the HPP have been proposed
but their relative merits need to be considered.
Data set of the week: (2011/04/03)
A proteogenomic analysis of Anopheles gambiae using high-resolution Fourier transform mass spectrometry. Overall rating: ![]() ![]() These experiments were a tour de force of how to study
whole organism proteomics in insects. The organism was dissected and important
organ systems were studied in detail. Even though the A. gambiae genome
has been available since 2002, this study was the first thorough examination of
the distribution of proteins in this important mosquito (it is the insect vector
of malaria). Technically, it uses cutting edge mass spectrometry-based identification
methods. The measured fragment ion mass accuracy was < 5 ppm for most of the individual
runs, allowing for high confidence peptide identifications (≤ 0.05% FPR).
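For readers less familiar with the units, mass accuracy in parts per million (ppm) is the relative difference between an observed and a theoretical m/z value, scaled by one million. The short sketch below shows that arithmetic; the m/z values and function names are hypothetical and chosen only for illustration, and the 5 ppm default simply mirrors the figure quoted above.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Return the signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the absolute mass error is smaller than tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm


# Hypothetical fragment ion: theoretical m/z 500.000, observed m/z 500.002.
error = ppm_error(500.002, 500.0)
print(f"mass error: {error:.2f} ppm")
print("within 5 ppm:", within_tolerance(500.002, 500.0))
```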
![]()
ProteomeXchange informal meeting, where the project stakeholders would also be
invited to attend and talk. The idea is to discuss open issues such as the
expected data workflow or exchange formats.
- Friday April 15th: ProteomeXchange formal kickoff meeting.
For members of the consortium only. The idea is also, at least for some of the
stakeholders, to have some more focused meetings in the late afternoon-evening.
- Saturday April 16th: ProteomeXchange stakeholders meeting.
The idea is that it would be possible to fly back on the early evening.
![]()
We just decided that the website will remain open until 4 April. It will not be advertised
specifically other than the text which is currently
on the website
where we will change the date from 20 March to 4 April.
On 18 April we will open a system for submission for late breaking abstracts (posters only)
until probably July. This new opening will be announced by HUPO, EuPA and SPS to
their members in a mailing.
![]() Each of the individual experiments was derived from a set of 10 control and 10 tumour-bearing
Her2/Neu mice. These mice have been a popular model system for cancer research because of their
tendency to generate metastatic breast tumours. The results give a good profile of the proteins
detectable in M. musculus plasma under normal control conditions using standard methods and
the Thermo-Finnigan LTQ as the main detection platform. Each TRANCHE entry was entitled using
a mnemonic, for example:
"MARS_Sample_Pool_2_Normal_mzXML". This abbreviated form describes the protein handling (MARS depletion), the specific replicate (Sample_Pool_2) and the animal pool (Normal). The use of this type of mnemonic has become a widespread (but regrettable) practice in the proteomics community for describing information deposited in repositories.
Data set of the week: (2011/03/20)
An integrated workflow for charting the human interaction proteome: insights into the PP2A system. ![]() These results clearly demonstrated the merits of using highly specific affinity purification experiments
when trying to thoroughly study the proteins associated with a specific pathway or particle.
The data was of good quality, although the ion source did not perform uniformly in the low-organic
phase portion of the liquid chromatography runs. For example, contrast the retention time vs pI plot for GPM33000032760 (good)
with GPM33000032731 (not as good).
This commonly seen experimental artifact probably had little effect on the
biological conclusions drawn from the results. However, if the same data was used
to draw inferences about which peptides were appropriate candidates for quantitation methods,
this ion source inconsistency would lead to a bias against early eluting peptides.
![]()
The current trends suggest that this type of platform is becoming an integral tool for accessing information
by biomedical researchers. GPM will be increasing its efforts to make interfaces that provide as much information as possible
in a form that is compatible with the requirements of these devices.
![]()
![]()
Data set of the week: (2011/03/13)
Primary tumor xenografts of human lung adeno and squamous cell carcinoma express distinct proteomic signatures. ![]() The results give the proteins present in each of 10 human tumours grafted into SCID mice, with
three replicates per tumour. The analysis required the simultaneous use of both the mouse and human
proteomes, resulting in protein lists composed of a mixture of the two types of proteins. The human
proteins show the proteins that would normally be expected in human tumour tissue, as well as a normal
complement of mouse blood proteins. In addition to the blood proteins, there was also clear evidence for
a set of murine extracellular matrix proteins. The presence of these proteins strongly suggests that
the host was able to begin infiltrating the tumour with ECM, even without a normal immune response to the xenograft material.
![]() ![]()
The purpose of these ontology terms is to aid the identification of data sets of interest at a later
time. By standardizing the terminology associated with data in GPMDB, the process of retrieving
useful information associated with a particular biological/analytical context becomes easier.
Data set of the week: (2011/03/06)
The ubiquitin-proteasome system is a key component of the SUMO-2/3 cycle. ![]() The data in this study resulted from a series of pull-down experiments with SILAC quantitation using
HeLa cells. The results contained an unusually large number of identifications for rare proteins, as well as
an over-representation of identifications that rated in the top percentile of all id's for particular proteins.
Analysis of the protein sequence motifs present showed that the RNA recognition motif, RNP-1
had been highly enriched by this particular pull-down strategy.
The underlying peptide id's were top quality with a very low number of false positives in the reported
sequence assignments.
![]() The results of this workshop will define
the character of Canada's contribution to the Human Proteome Project, e.g.,
the chromosome chosen (probably C6 or C21), the technologies to be employed, as
well as the estimated cost and the number of groups required for this cross-country
collaborative effort.
![]() ![]() The purpose of this research was to compare the proteomes of six human cell lines and determine
which candidate proteins were present in all six. The authors postulated that this set of proteins is a "central"
proteome: those proteins required by all human cells. While this concept will be debated for some time,
this study provides excellent insight into the proteins present in
these 6 cell lines under controlled conditions. The data divided up by cell lines are as follows:
![]() As a result of Peptidome's closing, some of the links associated with
results in GPMDB from Peptidome-sourced spectra would have become non-functional. To
ensure continuity of information, we have set up an alternate site for the experiment
and project information that would normally be obtained from Peptidome. All of the
links in GPMDB have been updated to point to this new resource.
To use this alternate annotation resource, a simple link can be used. For example, experiment
PSM1250 or project PSE132 can be accessed by the respective links:
This study contains two
experiments. The data was imported from Peptidome and was published by
Ettwig KF, Butler MK, Le Paslier D, Pelletier E, Mangenot S, Kuypers MM, Schreiber F, Dutilh BE, Zedelius J,
de Beer D, Gloerich J, Wessels HJ, van Alen T, Luesken F, Wu ML, van de Pas-Schoonen KT, Op den Camp HJ,
Janssen-Megens EM, Francoijs KJ, Stunnenberg H, Weissenbach J, Jetten MS, and Strous M in
Nature. 2010 464:543-8 (PubMed).
The data was generated using lysed cells from an unusual anaerobic bacterium, referred to by
NCBI as "NC10 bacterium 'Dutch sediment'". The sample itself was obtained from mud dug out of a
ditch in Holland. Compared to the well-controlled studies done with lab strains of bacteria or
cell lines, the researchers in this case dealt with generating identifiable proteins from real field samples. The
genome of the dominant species (Candidatus Methylomirabilis oxyfera) was available and the data could be
interpreted in light of an unusual feature of the organism's methane oxidation metabolism.
The European Proteomics Association has just published its 4th
informational bulletin
(get it here). It has a nice summary of the status of various projects in
Europe. Congratulations to Jean-Charles Sanchez and György Marko-Varga for their elections to
be EuPA Vice-President and President, respectively.
From the Peptidome website:
Due to budgetary constraints NCBI will be discontinuing the Peptidome Repository.
Over the next few weeks, we will phase out the online browser, query, and display
interfaces.
All existing data and metadata files will continue to be made available from our ftp
server ftp://ftp.ncbi.nih.gov/pub/peptidome/ indefinitely. Those files are named
according to their Peptidome accession number, allowing cited data to still be
identified and downloaded. Furthermore, we will endeavor to deposit all
Peptidome data in a different public mass spectrometry repository;
information about this effort will follow soon.
For those datasets that have been accessioned, but have not yet been made public, submitters have the option of withdrawing the data now and moving it to another repository. If we retain the data, it will move to the Peptidome FTP site on the date at which it is currently designated to go public.
Data set of the week: (2011/02/13)
Identification of cell wall and cytoplasmic proteins of Aspergillus fumigatus. This study contains one summary
of LC/MS/MS runs. The data sets were obtained from a whole organism extract using a
Thermo-Finnigan LTQ mass spectrometer. The results have not been published, but were made available
through Peptidome, sample PSM1346.
Aspergillus fumigatus is a commonly occurring environmental saprophytic fungus. It can
become clinically important in individuals with suppressed immune systems. The MS/MS data was
typical of LTQ-based analysis, but the results obtained from the data were a bit of a puzzle. The
original analysis (in Peptidome) only reported identifications for 2,223 spectra, whereas
a fairly straightforward analysis in our hands yielded approximately 20,000 identifications. While
the parameters used in the Peptidome analysis were not optimized (particularly the parent
ion mass tolerance and the list of variable modifications), repeated examination and re-analysis
in our hands was unable to resolve this significant difference. The data annotation stored in GPMDB
was performed twice: once with the CADRE
protein sequences alone and again with CADRE + RefSeq sequences for the same fungus strain. Because the
original MASCOT analysis was made available on Peptidome's FTP site, it was possible to
annotate each spectrum in the GPMDB analysis with those results for comparison (these appear as
comments on each of the spectrum display pages).
The CNPN is promoting a Canadian Human Proteomics Project (CHPP), which will be
developed during a Toronto-based Workshop (February 22, 2011) and a
Vancouver-based Workshop (date to be announced). CNPN invites you to participate
and provide feedback on the first draft of the CHPP Position Paper.
Further details on the Toronto Workshop can be found at www.cnpn.ca,
including an agenda outlining presentations and speakers.
Breakout sessions will allow the community to address critical components of CHPP
and develop strategies for integration into a White Paper. The White Paper
will be presented to the scientific community and funding agencies at the
CNPN Annual Symposium, May 8-11th, in Banff, Alberta.
This study contains 118
LC/MS/MS runs. The data sets were a combination of gel band and multidimensional chromatography
separations. The mass spectrometry appears to have been performed using HCD fragmentation
with an Orbitrap-LTQ hybrid instrument.
This data has not yet been published.
The results obtained from this data serve as a primer on what can be obtained
from the proteomics analysis of Leishmania major,
a trypanosomatid protozoan that causes leishmaniasis.
The data was generated from the two dominant life stages of the organism: the amastigote stage
that is adopted in the mammalian host; and the promastigote stage, adopted in the insect vector. The
combination of protein-level and peptide-level separation as well as the very high accuracy fragment
ion mass measurements make for a very broad coverage of proteins and peptides. Anyone interested
in the proteomics of L. major should study these results thoroughly before planning their
own experiments.
The daily incremental update of GPMDB has brought the total number of spectra
assigned to peptide sequences up to 253,866,646. For the last 6 years the number
of assigned spectra available has doubled year-over-year and it would appear that
this trend is continuing. Thanks to all of our search site users as well as all of
the laboratories that have made their data available through other sites, such as
TRANCHE, PRIDE and Peptidome.
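The consequence of year-over-year doubling can be checked with a little arithmetic. The sketch below back-projects the current total under the simplifying assumption of an exact factor of two per year; the function names are hypothetical and the projected values are estimates from that assumption, not historical GPMDB counts.

```python
def back_project(current_total, years_back, factor=2.0):
    """Estimate earlier totals, assuming growth by `factor` each year.

    Element k of the result is the estimated total k years ago.
    """
    return [current_total / factor ** k for k in range(years_back + 1)]


def fraction_added_in_last_year(factor=2.0):
    """Share of the current total added during the most recent year."""
    return 1.0 - 1.0 / factor


CURRENT_TOTAL = 253_866_646  # assigned spectra, from the text above

for k, estimate in enumerate(back_project(CURRENT_TOTAL, 6)):
    print(f"{k} year(s) ago: ~{estimate:,.0f} assigned spectra")

print(f"added in the last year: {fraction_added_in_last_year():.0%}")
```

Under exact doubling, half of all assigned spectra at any moment are less than a year old, which is why the collection as a whole stays current.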
Data set of the week: (2011/01/30)
The steady-state repertoire of human SCF Ubiquitin ligase complexes does not require ongoing Nedd8 conjugation. This study contains 41
LC/MS/MS runs.
This data was published in
Lee JE, Sweredoski MJ, Graham RL, Kolawa NJ, Smith GT, Hess S, and Deshaies RJ.,
Mol Cell Proteomics. 2010 Dec 17 (PubMed).
These interesting experiments were performed to explore the details of the current
model of how intracellular protein degradation is organized and regulated. The
experiments used SILAC and non-SILAC quantitation methods and experimental techniques that
did a good job of pulling out the relevant cellular machinery. The results contained
the most detailed observations yet of some of the important proteins in the
ubiquitin-mediated protein degradation pathway, such as CAND1, CUL1, and the COPS subunits.
This study contains 19
LC/MS/MS runs.
This data has not been published, but was made available by Mastrobuoni, G, et al.,
through Tranche,
along with a few experimental details.
This data was very high quality, using isoelectric focussing to separate peptides in a similar
manner to the use of SCX in MudPIT. The organism studied was Schmidtea mediterranea, which
is a free-living planarian (flatworm) with an exceptional ability to regenerate when injured.
While there is a genome project underway for this organism, the proteome sequence has not been
made available. As an alternative, RNA sequence information was used, based on the
current version of Unigene. The results
show how well data can be analyzed with assembled transcriptional sequences only, which may remain
the best alternative for many species of zoological or botanical interest
for some years to come.
This study contains 6
LC/MS/MS runs, generated from HPLC experiments.
This data has not been published, but was made available by Taejoon Kwon, et al
on the Marcotte Lab web site's data section, under the heading Data_12 (see the
experimental description link for details).
This study provides a good view of the proteome of an important pathogen, Pseudomonas aeruginosa.
P. aeruginosa is a common free-living bacterium that can rapidly colonize human tissue if it has been
damaged or if there is a defect in the immune system. The results represent two biological replicates of cultured
cells and provide a good starting point for any study of proteins produced by this organism.
Data set of the week: (2011/01/02)
The leukocyte nuclear envelope proteome varies with cell activation and contains novel transmembrane proteins that affect genome architecture. This study contains 8
summary results, generated from multidimensional chromatography experiments.
The manuscript describing this work was published by
Korfali N, Wilkie GS, Swanson SK, Srsen V, Batrakou DG, Fairley EA, Malik P, Zuleger N, Goncharevich A, de Las Heras J, Kelly DA, Kerr AR, Florens L, and Schirmer EC,
Mol Cell Proteomics 2010 Dec;9:2571-85
(PubMed).
The results of this study provide a good survey contrasting the proteins present in R. norvegicus
and H. sapiens microsomes. The GO displays for the individual experiments demonstrate the quality of the
preparation methods used, showing very significant enrichment of endoplasmic reticulum, Golgi apparatus, integral membrane,
mitochondrion and other membrane associated subcellular structures.
Copyright © 2011, The Global Proteome Machine Organization
|