Proteomics, The Snapshots of Proteins
INTRODUCTION
Imagine peering into the microscopic world of proteins, the molecular workhorses that drive the processes vital to life. Proteomics, the study of all the proteins produced by an organism, provides a powerful framework for understanding how our cells function, adapt, and interact. Proteins are essential to nearly every biological activity, from growth and repair to immune defense and disease progression. By analyzing the vast and complex network of proteins, researchers are uncovering insights that could revolutionize medicine, health, and drug development. In this blog post, we’ll explore the captivating field of proteomics and how it’s reshaping our understanding of biology, health, and disease.
The term “proteomics” was coined in 1995 and is defined as the large-scale characterization of the entire protein complement of a cell line, tissue, or organism.
HISTORY
· The first protein studies that can be called proteomics began in 1975 with the introduction of the two-dimensional gel by O’Farrell, Klose, and Scheele, who began mapping proteins from Escherichia coli, mouse, and guinea pig, respectively. Although many proteins could be separated and visualized in these experiments, they could not be identified.
· 2-D protein electrophoresis was later applied to the goal of cataloging all human proteins.
· The first major technology to emerge for the identification of proteins was protein sequencing by Edman degradation.
· A major breakthrough was the development of microsequencing techniques for electroblotted proteins. These techniques were used to identify proteins from 2-D gels and to create the first 2-D databases.
· One of the most important developments in protein identification has been mass spectrometry (MS), which has greatly improved sensitivity and accuracy: it can detect proteins in the femtomolar range and is suited to high-throughput operation.
METHODOLOGY OF PROTEOMICS
1. Genome sequencing and annotation of the genome.
· Without genome sequence databases, proteins cannot be identified on a large scale.
· Haemophilus influenzae (a gram-negative coccobacillus) was the first free-living organism whose genome was completely sequenced, in 1995.
· Gene annotation is the process of identifying all the genes in a genome, along with their exons and introns.
2. Protein expression studies.
· mRNA expression can be analyzed by methods such as Serial Analysis of Gene Expression (SAGE) and DNA microarray technology.
· Expression studies cover transcription, post-transcriptional modifications such as splicing and editing, translation, post-translational modification, etc.
· It is estimated that up to 200 different types of post-translational protein modification exist.
· Proteins can also be regulated by proteolysis (protein degradation) and compartmentalization (spatial organization of proteins in different regions inside the cell).
3. Functional analysis of proteins.
· The functions of many proteins can only be inferred by examination of their 3-D structure.
4. Protein-protein interaction studies.
· Understanding protein-protein interactions is of fundamental importance in biology.
· Cell growth, programmed cell death, and the decision to proceed through the cell cycle are all regulated by signal transduction through protein complexes.
· Proteomics aims to develop a complete 3-D map of all protein interactions in the cell.
A typical proteomics experiment (such as protein expression profiling) can be broken down into the following categories: (i) The separation and isolation of proteins from a cell line, tissue, or organism; (ii) The acquisition of protein structural information for the purposes of protein identification and characterization; and (iii) Database utilization.
(i) Separation and Isolation of Proteins
The predominant technology for protein separation and isolation is polyacrylamide gel electrophoresis (PAGE).
The main types of protein PAGE are:
1. SDS-PAGE (Sodium Dodecyl Sulfate-PAGE): Separates proteins based on size (molecular weight) in the presence of SDS, which denatures proteins and coats them with a negative charge.
2. Native PAGE: Separates proteins based on their native charge and size, without denaturing agents, preserving protein structure and function.
3. Gradient PAGE: Uses a gradient of acrylamide concentrations to separate proteins over a wide range of sizes.
4. 1D-PAGE (One-Dimensional PAGE): Separates proteins on the basis of size.
5. 2D-PAGE (Two-Dimensional PAGE): Combines two separation techniques:
- IEF (Isoelectric Focusing): separates proteins by charge (isoelectric point)
- SDS-PAGE: separates proteins by size (molecular weight)
6. Urea-PAGE: Uses urea to denature proteins and separate them based on size, often used for membrane proteins.
7. Gel-Free PAGE: Uses microfluidic devices or capillary electrophoresis to separate proteins without a gel matrix.
8. CN-PAGE (Clear Native PAGE): Separates proteins in their native state, preserving protein complexes and interactions.
9. BN-PAGE (Blue Native PAGE): Separates proteins in their native state, using Coomassie Blue dye to give protein complexes a negative charge so that they migrate mainly according to size.
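Because SDS-PAGE separates proteins by size, a band's molecular weight is commonly estimated from a standard curve: over the useful range of a gel, log10(molecular weight) is roughly linear in relative mobility. The sketch below fits that line to a marker lane and reads off an unknown band. The function names and the marker values are hypothetical, chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def estimate_mw(rf, markers):
    """Estimate molecular weight (Da) from relative mobility.

    markers: list of (relative_mobility, molecular_weight_in_Da) pairs
    taken from the marker lane of the same gel.
    """
    slope, intercept = fit_line(
        [marker_rf for marker_rf, _ in markers],
        [math.log10(mw) for _, mw in markers],
    )
    return 10 ** (slope * rf + intercept)


# Hypothetical marker lane: (relative mobility, molecular weight in Da).
MARKERS = [(0.15, 116000), (0.32, 66000), (0.48, 45000), (0.62, 31000), (0.80, 21000)]

# Estimated size of an unknown band that migrated to relative mobility 0.40.
print(round(estimate_mw(0.40, MARKERS)))
```

The estimate is only as good as the linear range of the gel; bands that run outside the marker range should not be extrapolated.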
Ø Generally in proteomics, 1DE is used because it is simple and reproducible and, with the proteins solubilized in SDS, solubility is rarely a problem. If the sample is complex, like a crude cell lysate, then 2DE is performed.
Ø Although 2DE helps in the separation, it has a number of limitations, including:
· It is time-consuming, taking up to 2 days.
· Only a single sample can be analysed at a time.
· It is limited in the number and types of proteins it can resolve.
· Many large or hydrophobic proteins will not enter the gel during the first dimension.
· Proteins of extreme acidity or basicity (proteins with pIs below pH 3 and above pH 10) are not well represented.
· Low-copy proteins cannot be detected when a cell lysate is analysed.
Ø To overcome these issues, one method is to convert the entire protein mixture to peptides by digesting it with trypsin, and then to purify those peptides by methods such as capillary electrophoresis, liquid chromatography, cation exchange chromatography, or reverse phase chromatography.
Ø These techniques let the researcher bypass 2DE, but they also have limitations:
· They are time-consuming.
· They require computing power to deconvolute the data obtained.
Ø One of the most exciting techniques to emerge as an alternative to protein electrophoresis is that of Isotope-Coded Affinity Tags (ICAT). This method allows the quantitative protein profiling between different samples without the use of electrophoresis.
(ii) Acquisition of protein structural information
1. Sequencing
Edman sequencing
It is one of the earliest methods used for protein identification.
Also called microsequencing, it uses Edman chemistry to obtain N-terminal amino acid sequences.
Limitation: Proteins modified at the N-terminus cannot be sequenced.
To overcome this, another method is used:
Mixed peptide sequencing
The process of mixed-peptide sequencing involves separation of a complex protein mixture by polyacrylamide gel electrophoresis (1-D or 2-D) and then transfer of the proteins to an inert membrane by electroblotting. The proteins of interest are visualized on the membrane surface, excised, and fragmented chemically at methionine (by CNBr) or tryptophan (by skatole) into several large peptide fragments. On average, three to five peptide fragments are generated, consistent with the frequency of occurrence of methionine and tryptophan in most proteins. The membrane piece is placed directly into an automated Edman sequencer without further manipulation, and all the fragments are sequenced simultaneously.
Ø After sequencing, the mixed-sequence data are fed into the FASTF or TFASTF algorithms, which sort and match the data against protein (FASTF) and DNA (TFASTF) databases to unambiguously identify the protein.
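To see why the data are "mixed", it can help to simulate the chemistry on a toy sequence. The sketch below cleaves a sequence after every methionine (the CNBr rule) and lists which residues would be released together at each Edman cycle. It is a simplified illustration, not the FASTF algorithm: the function names are hypothetical, tryptophan cleavage is left out, and the fact that CNBr converts the C-terminal methionine of each fragment to homoserine is ignored.

```python
import re


def cnbr_fragments(sequence):
    """Split a protein sequence on the C-terminal side of every methionine."""
    return [frag for frag in re.split(r"(?<=M)", sequence) if frag]


def mixed_edman_cycles(fragments, n_cycles):
    """Return, for each Edman cycle, the sorted residues released across all fragments."""
    cycles = []
    for i in range(n_cycles):
        # Every fragment that is still long enough contributes its i-th residue.
        cycles.append(sorted({frag[i] for frag in fragments if len(frag) > i}))
    return cycles


# Toy sequence, made up for illustration.
fragments = cnbr_fragments("GASMKLVMTPE")
for cycle_number, residues in enumerate(mixed_edman_cycles(fragments, 3), start=1):
    print(cycle_number, residues)
```

Each cycle yields a set of residues rather than a single one, and the order of residues within a cycle carries no information. A search program therefore has to find the database protein whose fragments are consistent with all of the sets at once.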
2. Mass spectrometry
Mass spectrometry gives researchers information such as peptide mass, amino acid sequence, and the type and location of protein modifications.
It involves three steps: sample preparation, ionization, and mass analysis.
Sample Preparation
In most proteomics experiments, a protein is first resolved from a mixture on a 1- or 2-D polyacrylamide gel.
Because extracting the whole protein from the gel is inefficient, the protein is digested with a protease while still in the gel (in-gel digestion) and its constituent peptides are then extracted.
The extracted peptides, which carry gel contaminants, are purified by reverse phase chromatography, with ZipTips (Millipore), with Poros R2 perfusion material, or by high performance liquid chromatography (HPLC).
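The protease most often used for this digestion is trypsin, and its cleavage pattern is regular enough to predict in software. The sketch below applies the commonly used simplified rule (cut after lysine or arginine unless the next residue is proline) to a toy sequence. The function name and the `missed_cleavages` parameter are illustrative choices, and real digests deviate from this rule.

```python
import re


def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from a simplified in-silico trypsin digest.

    Cleaves after K or R, except when the next residue is P.
    With missed_cleavages > 0, peptides spanning up to that many
    uncut sites are also returned.
    """
    # Zero-width split: positions preceded by K/R and not followed by P.
    fragments = [frag for frag in re.split(r"(?<=[KR])(?!P)", sequence) if frag]
    peptides = []
    last = len(fragments) - 1
    for i in range(len(fragments)):
        for j in range(i, min(i + missed_cleavages, last) + 1):
            peptides.append("".join(fragments[i : j + 1]))
    return peptides


# Toy sequence, made up for illustration.
print(tryptic_digest("MAGKRPSTRLLKEF"))
print(tryptic_digest("MAGKRPSTRLLKEF", missed_cleavages=1))
```

Allowing one or two missed cleavages is a common choice when the predicted peptides are later compared with measured ones, since digestion is rarely complete.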
Sample Ionization
For biological samples to be analyzed by MS, the molecules must be charged and dry. This is accomplished by converting them to desolvated ions. The two most common methods for this are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). In both methods, peptides are converted to ions by the addition or loss of one or more protons.
Mass Analysis
This is accomplished by the mass analyzers in a mass spectrometer, which resolve the molecular ions in a vacuum on the basis of their mass-to-charge ratio (m/z).
Different types of mass analyzers include quadrupole, time-of-flight (TOF), and ion trap analyzers.
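Since peptides are ionized by gaining protons and the analyzer reports mass-to-charge ratio, the same peptide appears at a different m/z for each charge state. A minimal sketch of that arithmetic follows, using standard monoisotopic residue masses. The function names are illustrative, and modifications are not considered.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free N- and C-termini
PROTON = 1.007276  # mass of each charge-carrying proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[residue] for residue in sequence) + WATER


def mz(sequence, charge):
    """m/z of the peptide carrying `charge` extra protons (positive mode)."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


# The same toy peptide at three charge states.
for z in (1, 2, 3):
    print(z, round(mz("LLKEF", z), 4))
```

MALDI spectra are dominated by singly charged ions, while ESI typically produces several charge states of the same peptide, which is why the charge has to be known before a mass can be read from a peak.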
(iii) Database Utilization for Identification of Protein
Databases allow protein structural information harvested from Edman sequencing or MS to be used for protein identification.
The goal of database searching is to be able to quickly and accurately identify large numbers of proteins. The success of database searching depends on the quality of the data obtained in the mass spectrometer, the quality of the database searched, and the method used to search the database.
Methods of Protein Identification
1. Peptide mass fingerprinting database searching
In this method, the masses of peptides obtained from the proteolytic digestion of an unknown protein are compared to the predicted masses of peptides from the theoretical digestion of proteins in a database.
Limitations
Identification by mass fingerprinting can be hindered when:
· A protein is extensively modified post-translationally, or
· A protein is present in a complex mixture with several other proteins.
To help address these limitations, researchers use bioinformatics tools such as ProFound, which enables protein identification in simple protein mixtures; the ExPASy server, which provides a variety of tools for proteomics and programs for protein identification; and PepSea, PeptIdent/MultiIdent, MS-Fit, MOWSE, and many more.
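The comparison at the heart of peptide mass fingerprinting can be sketched in a few lines: count how many measured peptide masses fall within a tolerance of each protein's predicted masses, then rank the proteins. This is a deliberately naive match count, not the scoring used by ProFound, MOWSE, or any other named tool, and the protein names and masses below are hypothetical.

```python
def count_matches(observed, theoretical, tol_ppm=50.0):
    """Count observed masses within tol_ppm of any theoretical peptide mass."""
    matches = 0
    for obs in observed:
        if any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical):
            matches += 1
    return matches


def rank_proteins(observed, database, tol_ppm=50.0):
    """Rank database proteins by number of matched peptide masses (best first)."""
    scores = {
        name: count_matches(observed, masses, tol_ppm)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical predicted peptide masses (Da) for two database proteins.
DATABASE = {
    "protein_A": [842.51, 1045.56, 1479.80, 2211.10],
    "protein_B": [927.49, 1045.56, 1638.86],
}
# Hypothetical masses measured from the digest of an unknown protein.
OBSERVED = [842.50, 1045.57, 1479.81, 2211.12]

print(rank_proteins(OBSERVED, DATABASE))
```

The example also shows where the limitations come from: a modified peptide shifts in mass and stops matching, and peptides from a second protein in the sample add matches to the wrong entry.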
2. Amino acid sequence database searching
· The most specific type of database searching for protein identification uses peptide amino acid sequence. One method that uses this information is peptide mass tag searching.
· In this method, a partial amino acid sequence (the sequence tag) is obtained by interpreting the MS/MS spectrum. This information is combined with the mass of the peptide and the masses of the regions of the peptide on either side of the sequence tag, where the sequence is not known.
· Peptide mass tag searching is a more specific tool for protein identification than peptide mass fingerprinting.
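The extra specificity of a mass tag comes from requiring three things at once: the tag sequence, the mass in front of it, and the mass behind it. A minimal sketch follows, assuming the two flanking masses are given as sums of residue masses (real search tools work from fragment-ion m/z values) and using a hypothetical function name.

```python
# Monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_mass(sequence):
    """Sum of residue masses (no terminal water added)."""
    return sum(MONO[residue] for residue in sequence)


def matches_tag(peptide, n_mass, tag, c_mass, tol=0.02):
    """True if `peptide` contains `tag` flanked by the given N- and C-side masses."""
    start = peptide.find(tag)
    while start != -1:
        prefix = peptide[:start]
        suffix = peptide[start + len(tag):]
        if (abs(residue_mass(prefix) - n_mass) <= tol
                and abs(residue_mass(suffix) - c_mass) <= tol):
            return True
        start = peptide.find(tag, start + 1)
    return False


# Toy search: the tag "ST" with 128.0586 Da before it and 227.1634 Da after it.
candidates = ["MAGK", "RPSTR", "GASTVK"]
print([p for p in candidates if matches_tag(p, 128.0586, "ST", 227.1634)])
```

Both "RPSTR" and "GASTVK" contain the tag, but only one has the right flanking masses, which is the sense in which a tag search is more specific than matching whole-peptide masses alone.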
3. De novo peptide sequence information
· In this approach, de novo sequence data are obtained from peptides by MS/MS, and all the peptide sequences are then used to search appropriate databases.
· Multiple peptide sequences can be used for protein identification by searching databases with the FASTS program.
· The single biggest advantage of this method is the capability of searching peptide sequence information across both DNA and protein databases.
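The basic idea behind reading a sequence de novo from an MS/MS spectrum is that neighbouring fragment ions of the same series differ by the mass of exactly one residue. The sketch below assumes an idealized, complete ladder of singly charged b-ions, which real spectra rarely provide, and the function names are hypothetical.

```python
# Monoisotopic residue masses in daltons (L is listed before I on purpose).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276


def b_ions_for(sequence):
    """Idealized singly charged b-ion m/z values for a peptide (test data helper)."""
    ions, running = [], PROTON
    for residue in sequence:
        running += MONO[residue]
        ions.append(running)
    return ions


def read_b_ion_ladder(b_ions, tol=0.01):
    """Infer a sequence from the gaps between consecutive b-ions."""
    ladder = [PROTON] + sorted(b_ions)
    sequence = []
    for lower, upper in zip(ladder, ladder[1:]):
        gap = upper - lower
        hits = [aa for aa, mass in MONO.items() if abs(mass - gap) <= tol]
        if not hits:
            sequence.append("?")            # gap matches no single residue
        elif len(hits) == 1:
            sequence.append(hits[0])
        else:
            sequence.append("[" + "/".join(hits) + "]")  # ambiguous by mass
    return "".join(sequence)


print(read_b_ion_ladder(b_ions_for("GASLK")))
```

Leucine and isoleucine have identical masses, so the reading reports them as an ambiguous position. Database search programs that accept peptide sequences have to tolerate this kind of uncertainty.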
4. Uninterpreted MS/MS data searching
· In this approach, MS/MS spectra are searched against databases directly, without first being interpreted.
· Searches against un-annotated or un-translated DNA databases with un-interpreted MS/MS data are likely to suffer from the same pitfalls associated with mass fingerprinting. In particular, polymorphisms, sequencing errors, and conservative substitutions will probably contribute to failure to accurately identify a protein.
· Overcoming these shortcomings calls for un-interpreted MS/MS search algorithms that are error tolerant.
· Programs that search un-interpreted MS/MS data include Mascot, SONAR, and SEQUEST.
Figure 22: Strategies for protein identification.
TYPES OF PROTEOMICS AND THEIR APPLICATIONS IN BIOLOGY
1. Protein Expression Profiling
· Disease Mechanism
· Signal Transduction
· Medical Microbiology
2. Post-Translational Modifications
· Glycosylation
· Phosphorylation
· Proteolysis
3. Protein-Protein Interactions
· Yeast two-hybrid
· Co-precipitation
· Phage Display
4. Structural Proteomics
· Organelle Composition
· Sub-proteome Isolation
· Protein Complexes
5. Functional Proteomics
· Yeast genomics
· Affinity Purified Protein Complexes
· Mouse Knockouts
6. Proteome Mining
· Drug Discovery
· Target Identification/Validation
· Differential Display
SIGNIFICANCE OF PROTEOMICS
Many types of information cannot be obtained from
the study of genes alone. For example, proteins, not genes, are responsible for
the phenotypes of cells. It is impossible to elucidate mechanisms of disease,
aging, and effects of the environment solely by studying the genome. Only
through the study of proteins can protein modifications be characterized and
the targets of drugs identified.
Cho, W.C.S. (2016). Proteomics technologies and challenges. Clinical and Translational Medicine, 5(1), 1-10. doi:10.1186/s40169-016-0107-6





