

Authors: André Murad, Gustavo Souza, Jerusa Garcia & Elibio Rech


Identification of recombinant protein expressed in a total soluble protein (TSP) plant extract by mass spectrometry is desirable and necessary to accelerate further processing steps. The protocol consists of an initial TSP sample preparation and trypsin digestion, followed by preliminary characterisation of recombinant proteins expressed in TSP samples of transgenic soybean seeds using nanoUPLC-MS^E. As little as 50 µg of TSP can be effectively analysed. Experimental data for the TSP extraction and sample preparation are discussed. The whole procedure takes up to 3 days.


The production of recombinant protein is an important step in several academic, industrial and pharmaceutical processes. Several heterologous protein expression systems are available, including bacterial (1), mammalian cell-culture (2) and plant (3, 4) systems. Although these comprise the main production systems, the search for novel methods to increase protein yield, facilitate manipulation and reduce cost continues. Seeds are a vital alternative for recombinant protein production for several reasons: they can undergo long-term storage at ambient temperatures (5, 6), they can provide an appropriate biochemical environment for protein stability through the creation of specialised storage compartments (6, 7), they are not contaminated by human or animal pathogens (8), they do not undergo non-enzymatic hydrolysis or protease degradation owing to their desiccation characteristics (5, 8) and they do not carry the phenolic substances that are present in tobacco leaves, which is important for downstream processing (3, 8). We recently produced several soybean transgenic plants expressing important pharmaceutical molecules, such as proinsulin (6), human growth hormone (hGH) (9) and human coagulation factor IX (hFIX) (10), showing the viability of this system. On the other hand, producing these transgenic lines is extremely time-consuming (11): it requires at least 150 days to obtain the first seeds and another 3 years for a homozygous line. At this early stage, little material is available for recombinant protein purification; as a result, the detection, quantification and characterisation of recombinant molecules rely mainly on the manipulation of total soluble protein (TSP), a complex mixture in which the protein of interest is present at low abundance. Thus, we need a method that detects, qualifies and quantifies recombinant proteins in TSP using less than ¼ of a single seed mass (50 mg).

Typically, a recombinant protein is identified by western blot analysis (12) and quantified by enzyme-linked immunosorbent assays (ELISAs) (13). These methods are widely used because they are simple and relatively fast, but they lack sensitivity when small amounts of antigen are used, cannot be applied when no antibody is available, and offer no way to rule out a false positive or to verify the quality, amino acid sequence or post-translational modifications of the recombinant protein. Two-dimensional electrophoresis (2-DE) has been developed for proteomics (14, 15), and because of its association with mass spectrometry, it has become a primary tool for the identification and characterisation of complex plant mixtures (15, 16). 2-DE can also be used for quantification and protein mapping of tissues (17), comparative proteomics (18, 19) and identification of post-translational modifications (20), but it requires a minimum sample amount, cannot detect molecules in low abundance, needs spot manipulations for good identification (15), relies mainly on peptide mass fingerprinting (PMF) (21, 22), and has difficulty in analysing proteins with similar mass and pI because they appear as a single spot. The combination of gel and liquid chromatography mass spectrometry (LC-MS) methods may result in better identification of proteins in complex samples (23, 24), overcoming the problems of 2-DE. Liquid chromatography (LC) improves the otherwise low detection and resolution of complex mixtures on mass spectrometers (MS) (25). Furthermore, the analysis of trypsin-digested peptides from complex samples, commonly known as “system samples”, is key in the detection of low abundance proteins, but this technique has limitations in terms of analyte dilution and the minimum amounts of complex protein mixtures needed to guarantee a good dynamic range and detection of low abundance proteins (15, 25-28).

Nano-scale liquid chromatography with 2D separations, such as strong cation exchange (SCX) followed by reverse-phase (RP) chromatography or 2D RPxRP using two pH values and acetonitrile pulses, combined with mass spectrometry with data-independent acquisition (nanoLC-MS^E), has several benefits for proteome analysis. Among these benefits are detection and linear sequence structural information at the femtomole level (29, 30), small surface areas and minimal dead volumes, which minimise analyte losses due to surface adsorption, and low flow rates that reduce analyte dilution. Thereby, analytes of low abundance can be separated with a high recovery rate when associated with a high dynamic range and a prevailing MS detection system (31). Recently, the nanoLC-MS method was used for the detection of differences in expression of soybean plasma membrane proteins under osmotic stress (32), the identification of stress regulation in tomatoes induced by iron deficiency (33) and the detection of neuropeptides secreted in Cancer borealis (34), demonstrating the capability and potential of this method. Moreover, nanoLC-MS^E is an important tool in post-translational characterisation of proteins, such as the identification of N-terminal peptide modifications in the chloroplast proteome (35), the analysis of human protein oxidations leading to functional reduction/annulation (36), and the characterisation of the phosphorylation pattern of several phosphatase splice variants expressed in a human cell line (37, 38). Finally, quantification is also possible with the nanoLC-MS technique using labelling methods, such as 18O peptide labelling (39) and the iTRAQ™ method (40), relative quantification methods, such as the use of stochastic measurements between mass and intensity deviations for each ion detected (41), or absolute quantification based on a constant ion current acquired with low (MS) and high (MS/MS) energies in the mass spectrometer, called MS^E (42-45).

We describe herein (Fig. 1) an easy-to-handle, label-free nanoUPLC-MS^E method with absolute quantification and small sample usage for the detection, quantification and characterisation of low abundance recombinant proteins expressed in soybean seeds, specifically the immunogenic tumour NY-ESO-1 antigen (cancer testis antigen 1, CTAG) (46). CTAG is a protein product of the human X chromosome with 180 amino acid residues (Fig. 2), mass 18 kDa, a glycine-rich N-terminal region and an extremely hydrophobic C-terminal region that is so insoluble it can be confused with a transmembrane domain (46, 47) and is therefore a challenge in the identification and characterisation of TSP extracts, as in our case. The expression pattern analysis by RT-PCR for CTAG has confirmed that expression is restricted to testis and is not present in other normal tissue, but is found in several types of cancer, including bladder, breast and lung cancer (48). The recombinant CTAG produced in Escherichia coli (E. coli) was the first to be evaluated in the clinical setting and ranks among the most promising trials published so far with CTAG because of the broad immunological and favourable clinical results (46, 49); thus, the use of CTAG as a vaccine is viable only if coupled with a low cost, scalable recombinant protein production system. Additionally, the nanoUPLC-MS^E used in this procedure has particularities that enhance recombinant protein characterisation with high selectivity and specificity. The nanoUPLC-MS^E is composed of a non-split, direct pump infusion, nanoscale liquid chromatography system (nanoACQUITY® UPLC, Waters, Milford, MA) and related columns and accessories. These include the use of columns packed with smaller particle sizes (<2 μm) (50) and the use of columns with a smaller internal diameter (I.D. <100 μm) (51). Another development to couple RP with a different separation mechanism is the method of 2D chromatography. 
This can be accomplished by exploiting ion exchange between the peptides or proteins and the stationary and mobile phases, e.g., by increasing or decreasing chaotropic “salting plugs” or the pH. For the last 10 years, this technique has typically been implemented with a strong cation exchange (SCX) column and “salting pulses” of, e.g., ammonium formate at different concentrations.

Advances in this technology may allow the exploration of new frontiers in separation science to avoid ion suppression through orthogonal separation and to increase peak capacity (52). These chromatography systems coupled with a high-end mass spectrometry instrument allow minimal amounts of system samples to be injected and detected with high selectivity and specificity. To achieve such high standards, every stage of this experimental workflow, from sample preparation to acquisition and processing, must be controlled to avoid contamination and other sources of error, as described in detail in this protocol.


Chemicals and solvents

  1. Sterile deionised water with a conductivity of less than 1.3 µS/cm, total organic carbon (TOC) less than 2 ppb, and a semiconductor equivalent specification of 0.055 µS/cm (18.2 MΩ.cm) at point-of-use at 25 °C
  2. Petroleum Ether, 30-75 °C, BAKER ANALYZED Reagent (J.T. Baker, cat. no. 9274-03)
  3. Tris base (2-Amino-2-(hydroxymethyl)-1,3-propanediol) – (Fisher Scientific Ltd, cat. no. BP152-5)
  4. KCl (Sigma-Aldrich, cat. no. P9541)
  5. DL-Dithiothreitol (threo-1,4-dimercapto-2,3-butanediol) for molecular biology, ≥98% (DTT, Sigma-Aldrich, cat. no. D9779)
  6. Phenylmethanesulfonyl fluoride ≥98.5% (PMSF, Sigma-Aldrich, cat. no. P7626)
  7. Sodium dodecyl sulphate for molecular biology, ≥98.5% (SDS, Sigma-Aldrich, cat. no. L4390)
  8. Acetone CHROMASOLV® Plus, for HPLC, ≥99.9% (Sigma-Aldrich, cat. no. 650501)
  9. NH4HCO3 ReagentPlus®, ≥99.0% (Sigma-Aldrich, cat. no. A6141)
  10. RapiGest™ SF (Waters, cat. no. 186001861) (53)
  11. Iodoacetamide BioUltra (Sigma-Aldrich, cat. no. I1149)
  12. Trifluoroacetic acid spectrophotometric grade, ≥99% (TFA, Sigma-Aldrich, cat. no. 302031)
  13. Acetonitrile LC-MS CHROMASOLV®, ≥99.9% (Fluka, cat. no. 34967)
  14. Formic acid puriss. p.a., for mass spectroscopy, ~98% (T) (FA, Fluka, cat. no. 94318)
  15. nanoACQUITY™ UPLC™ trap column Symmetry C18 5 μm, 180 µm x 20 mm (Waters, cat. no. 186003514)
  16. nanoACQUITY™ UPLC™ analytical column of 100 μm x 100 mm, 1.7 μm BEH130 C18 (Waters, cat. no. 186003546).

Enzyme and standards

  1. Trypsin (Promega, cat. no. V511A)
  2. MassPREP Protein Digestion Standard Alcohol Dehydrogenase (MPDS ADH - Waters, cat. no. 186002328)
  3. [Glu1]-Fibrinopeptide B human (GFP – Sigma-Aldrich, cat. no. F3261)

Kits

  1. Quant-iT™ Protein Assay Kit, 500 Assays, 0.25-5 µg for use with the Qubit™ fluorometer (Invitrogen, cat. no. Q33212)

Buffers and Solutions

  1. Extraction buffer (see REAGENT SETUP)
  2. 50 mM NH4HCO3 (see REAGENT SETUP)
  3. Digestion solution (see REAGENT SETUP)
  4. Alkylation solution (see REAGENT SETUP)
  5. Reduction solution (see REAGENT SETUP)
  6. Hydrolysation solution (see REAGENT SETUP)
  7. Sample solution for nanoUPLC-MS^E analysis (see REAGENT SETUP)
  8. MPDS ADH solution (see REAGENT SETUP)
  9. Surfactant solution (see REAGENT SETUP)
  10. Mobile phase A (see REAGENT SETUP)
  11. Mobile phase B (see REAGENT SETUP)
  12. GFP solution (see REAGENT SETUP)
  13. Cold Acetone (Store acetone at -20 °C)


REAGENT SETUP

  1. Extraction buffer (20 mM Tris-HCl, pH 8.3, 1.5 mM KCl, 10 mM DTT, 1 mM PMSF, 0.1 % w/v SDS) For 1 litre, dissolve 2.42 g of Tris base, 0.11 g of KCl, 1.54 g of DTT, 0.174 g of PMSF and 1 g of SDS in 800 mL of deionised water. Adjust the pH to 8.3 with HCl and add water to make up a final volume of 1 litre. Store at -20 °C for up to 6 months.
  2. 50 mM NH4HCO3 For 1 litre, dissolve 3.95 g of NH4HCO3 in 800 mL of deionised water and add water to make up a final volume of 1 litre. Filter through a 0.22 µm filter and store at room temperature (20–24 °C) for up to 6 months.
  3. Digestion solution Add 400 μL of 50 mM NH4HCO3 to one 20 μg vial of Promega Trypsin. Make aliquots of 10 µL and store at -80 °C for up to 6 months.
  4. Alkylation solution (300 mM iodoacetamide) For 1 mL, dissolve 55 mg of iodoacetamide in 500 µL of deionised water. Add water to 1 mL. Store at -80 °C for up to 6 months.
  5. Reduction solution (100 mM DTT) For 1 mL, dissolve 15 mg of DTT in 500 µL of deionised water. Add water to 1 mL. Store at -80 °C for up to 6 months.
  6. Hydrolysation solution (5 % V/V TFA) For 10 mL, add 0.5 mL of TFA to 9.5 mL of deionised water. Store at room temperature (20–24 °C) for up to 6 months.
  7. Sample solution for nanoUPLC-MS^E analysis (3 % V/V acetonitrile, 0.1 % V/V FA) For 10 mL, add 0.3 mL of acetonitrile and 0.01 mL of FA to 9.69 mL of deionised water. Store at room temperature (20–24 °C) for up to 6 months.
  8. MPDS ADH solution Add 1 mL of the sample solution for nanoUPLC-MS^E analysis to one vial of MPDS ADH. Make aliquots of 10 µL and store at -80 °C for up to 6 months.
  9. Surfactant solution (0.2 % w/v) Add 0.5 mL of water to one vial of 1 mg of RapiGest™ SF. Store at 4 °C for up to 3 months.
  10. Mobile phase A (0.1 % V/V FA) For 1 litre, add 1 mL of FA to 999 mL of deionised water. Store at room temperature (20–24 °C) for up to 3 months.
  11. Mobile phase B (0.1 % V/V FA in acetonitrile) For 1 litre, add 1 mL of FA to 999 mL of acetonitrile. Store at room temperature (20–24 °C) for up to 1 year.
  12. GFP solution (200 fmol.µL-1) Stock solution: Add 2000 µL of acetonitrile/water 2.5/7.5 with 0.1 % FA to the GFP vial to give a solution of 32 pmol.µL-1. Store in the freezer. Working solution: Take 625 µL of the stock solution and fill to 100 mL with acetonitrile/water 2.5/7.5 with 0.1 % FA, giving a solution of 200 fmol.µL-1. Use within 3 months.
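
The weighed amounts in the reagent setup follow directly from molarity, volume and molar mass, and the GFP working solution follows from C1·V1 = C2·V2. The short Python sketch below recomputes them as a cross-check. The molar masses are standard rounded values that are not given in the protocol, and the function names are illustrative only.

```python
# Molar masses in g/mol: standard rounded values, NOT taken from the protocol.
MOLAR_MASS = {
    "Tris base": 121.14,
    "KCl": 74.55,
    "DTT": 154.25,
    "PMSF": 174.19,
    "NH4HCO3": 79.06,
    "iodoacetamide": 184.96,
}


def grams_needed(compound, conc_mM, volume_mL):
    """Mass in grams for volume_mL of a conc_mM solution of compound."""
    moles = (conc_mM / 1000.0) * (volume_mL / 1000.0)
    return moles * MOLAR_MASS[compound]


def diluted_conc(stock_conc, stock_volume, final_volume):
    """C1 * V1 = C2 * V2; both volumes must be in the same unit."""
    return stock_conc * stock_volume / final_volume


# (compound, concentration in mM, volume in mL) as listed in the reagent setup
RECIPES = [
    ("Tris base", 20, 1000),
    ("KCl", 1.5, 1000),
    ("DTT", 10, 1000),
    ("PMSF", 1, 1000),
    ("NH4HCO3", 50, 1000),
    ("iodoacetamide", 300, 1),
    ("DTT", 100, 1),
]

for compound, conc, vol in RECIPES:
    grams = grams_needed(compound, conc, vol)
    print(f"{compound:14s} {conc:6.1f} mM in {vol:5d} mL -> {grams:.4f} g")

# GFP working solution: 625 uL of a 32 pmol/uL (32000 fmol/uL) stock
# made up to 100 mL (100000 uL).
gfp_fmol_per_uL = diluted_conc(32_000, 625, 100_000)
print(f"GFP working solution: {gfp_fmol_per_uL:.0f} fmol/uL")
```

The GFP dilution evaluates to 200 fmol.µL-1, which agrees with the concentration given in the heading of the GFP solution entry.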


Equipment

  1. Coffee grinder (Krups, model no. F203)
  2. Refrigerated centrifuge (Eppendorf, model 5810R)
  3. Analytical balance (Mettler Toledo, cat. no. XP105D)
  4. 2 mL microtubes (Axygen, cat. no. MCT-200-C)
  5. 1.5 mL microtube (Axygen, cat. no. MCT-150-C)
  6. Vortex (Scientific industries, model G560E)
  7. Dry bath (Fisher Scientific, cat. no. 11-718-2)
  8. Waters Total Recovery vial (Waters, cat. no. 186000384c)
  9. nanoACQUITY™ UPLC™ system (Waters, Milford, MA, USA)
  10. NanoLockSpray™ – nanoESI source (Waters, Manchester, UK)
  11. Synapt HDMS™ mass spectrometer (Waters, Manchester, UK)


Total soluble protein extraction from recombinant CTAG soybean seeds. TIMING 1-2 h for one sample

  • 1| Using a coffee grinder, grind the soybean seeds into a fine powder. Using an analytical balance, weigh out 100 mg of powder and store the remaining powder in a vacuum bag at -80 °C for up to 1 year.
  • 2| Place the weighed sample into a 2 mL capped centrifuge tube. Add 1 mL of petroleum ether and slowly vortex the sample for 15 min. Discard the supernatant and repeat the step twice (2X). Troubleshooting: Decant the solution gently to avoid losing powder.
  • 3| Allow the petroleum ether to evaporate for 10 min. Add 1 mL of the extraction buffer and slowly vortex the sample at room temperature for 10 min.
  • 4| Centrifuge the sample for 5 min at 5000 rpm at 4 °C. Transfer the supernatant to a new centrifuge tube. Pause point: At this step, the sample can be stored at -20 °C for one week.

Protein concentration. TIMING 1-2 h

  • 5| For each 200 μL of sample, add 800 μL of cold acetone to the centrifuge tube. Vortex thoroughly and keep at -20 °C for 1 h, vortexing every 15 min.
  • 6| Centrifuge the sample for 10 min at 13000 rpm. Discard the supernatant and allow the pellet to dry at room temperature for 30 min. Critical Step Do not overdry the pellet or it may become unstable and partially insoluble.
  • 7| Carefully dissolve the pellet with 500 μL of 50 mM NH4HCO3. Quantify it using the Quant-iT™ Protein Assay Kit (Invitrogen) and dilute it with 50 mM NH4HCO3 to a 1 μg.μL-1 concentration. At this point, the sample can be stored at -20 °C for one week. Critical Step For quantification purposes, the fluorometer must be calibrated so that the protein concentration is measured accurately.
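
The dilution to 1 μg.μL-1 in step 7 is a simple C1·V1 = C2·V2 calculation. The sketch below is one way to work out how much 50 mM NH4HCO3 to add. The function name and the measured concentration in the example (2.4 μg.μL-1) are hypothetical and serve only as an illustration.

```python
def buffer_to_add_uL(measured_ug_per_uL, sample_volume_uL, target_ug_per_uL=1.0):
    """Volume of buffer (uL) needed to bring a sample to the target concentration."""
    if measured_ug_per_uL < target_ug_per_uL:
        raise ValueError("sample is already more dilute than the target")
    final_volume_uL = measured_ug_per_uL * sample_volume_uL / target_ug_per_uL
    return final_volume_uL - sample_volume_uL


# Hypothetical reading: 2.4 ug/uL measured in the 500 uL resuspended pellet.
extra = buffer_to_add_uL(2.4, 500.0)
print(f"Add {extra:.0f} uL of 50 mM NH4HCO3")
```

If the fluorometer reading is below the target, the function raises an error rather than returning a negative volume, since a sample cannot be concentrated by dilution.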

Sample preparation for nanoUPLC-MS^E acquisition. TIMING 2 d

  • 8| Place 50 μL of the 1 μg.μL-1 sample in a capped microcentrifuge tube.
  • 9| Add 10 μL of 50 mM NH4HCO3.
  • 10| Add 25 μL of the surfactant solution and vortex. Critical step The surfactant solution must be applied only if the sample is in the ammonium bicarbonate buffer at an alkaline pH. At an acidic pH, the surfactant will be degraded and digestion will be less efficient, resulting in more missed cleavages and larger peptide fragments. ?Troubleshooting
  • 11| Place the tube in a dry bath set at 80 °C. Heat for 15 min. Critical step: Ensure the dry bath is set to the correct temperature before heating the sample.
  • 12| Remove the tube from the dry bath. Perform a short spin; then add 2.5 μL of the reduction solution and vortex slightly.
  • 13| Place the tube in a dry bath set at 60 °C and heat for 30 minutes. Critical step: Ensure the dry bath is set to the correct temperature before heating the sample.
  • 14| Remove from the dry bath, allow the tube to cool to room temperature and then centrifuge it. Add 2.5 μL of the alkylation solution and vortex slightly.
  • 15| Place the sample in the dark at room temperature and allow 30 minutes of reaction time.
  • 16| Add 10 μL of the digestion solution and vortex slightly. Digest the sample at 37°C in a dry bath overnight. This produces a 1:100 wt:wt ratio of enzyme:protein.
  • 17| Following digestion, to precipitate the surfactant, add 10 μL of hydrolysation solution and vortex. Then centrifuge the samples at 14000 rpm at 6 °C for 30 minutes. Transfer the supernatant to a Waters Total Recovery vial. Critical step The surfactant must be fully precipitated to ensure proper dissolution of the protein prior to injection in the chromatograph and to avoid contamination during MS^E acquisition. Ensure the centrifugation step is well controlled to avoid the injection of precipitation residues into the nanoUPLC system. ?Troubleshooting
  • 18| Add 5 μL of the MPDS ADH solution and then add 85 μL of the sample solution for nanoUPLC-MS^E analysis. The final volume is 200 μL, the final protein concentration is 250 ng.μL-1 and the final ADH concentration is 25 fmol.μL-1. Store at -80 °C for up to 6 months. Critical step: Accurate pipetting of these solutions is crucial for good protein quantification by PLGS, because quantification relies on the counts/fmol stoichiometric ratio between the summed ion intensity and the concentration of a standard protein (manual response factor). For the best quantification, use a manual response factor rather than the nominal concentration of the internal standard protein.
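
The volumes added in steps 8 to 18 can be tallied to confirm the stated final volume, protein concentration, ADH concentration and enzyme:protein ratio. The sketch below does this bookkeeping. The ADH stock concentration of 1 pmol.μL-1 is an assumption inferred from the stated final concentration of 25 fmol.μL-1, not a value given in the protocol.

```python
# Volumes (uL) added to the tube in steps 8-18, in order of addition.
ADDITIONS_UL = [
    ("protein sample, 1 ug/uL", 50.0),
    ("50 mM NH4HCO3", 10.0),
    ("surfactant solution", 25.0),
    ("reduction solution", 2.5),
    ("alkylation solution", 2.5),
    ("digestion solution", 10.0),
    ("hydrolysation solution", 10.0),
    ("MPDS ADH solution", 5.0),
    ("sample solution", 85.0),
]

final_volume_uL = sum(volume for _, volume in ADDITIONS_UL)

# Protein: 50 uL at 1 ug/uL = 50 ug, expressed in ng/uL of the final volume.
protein_ug = 50.0 * 1.0
protein_ng_per_uL = protein_ug * 1000.0 / final_volume_uL

# Trypsin: one 20 ug vial in 400 uL, of which 10 uL is used.
trypsin_ug = 20.0 / 400.0 * 10.0
protein_per_enzyme = protein_ug / trypsin_ug

# ADH: ASSUMED stock of 1 pmol/uL (1000 fmol/uL); 5 uL is added.
ADH_STOCK_FMOL_PER_UL = 1000.0
adh_fmol_per_uL = ADH_STOCK_FMOL_PER_UL * 5.0 / final_volume_uL

print(f"final volume   : {final_volume_uL:.0f} uL")
print(f"protein        : {protein_ng_per_uL:.0f} ng/uL")
print(f"enzyme:protein : 1:{protein_per_enzyme:.0f} (wt:wt)")
print(f"ADH            : {adh_fmol_per_uL:.0f} fmol/uL")
```

Under these assumptions the tally reproduces the figures stated in steps 16 and 18 (200 μL, 250 ng.μL-1, 1:100 and 25 fmol.μL-1).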

NanoUPLC-MS^E acquisition. TIMING 1 d

  • 19| The nanoACQUITY™ UPLC™ system was configured as follows: the samples were initially transferred with an aqueous 0.1% formic acid solution to the trap column at a flow rate of 15 μL.min-1 for 1 min with a 5 μL loop.
    • CRITICAL STEP: To acquire data with the system, some considerations must be made upon installation and engineering the setup. The initial instrument setup is critical. For this purpose and for system qualification, 1 μg of the E. coli digestion standard was acquired during installation. The E. coli sample was spiked with rabbit phosphorylase B for a final concentration of 40 fmol.μL-1 on the column. The expected dynamic range was measured and the specifications were applied to reach a minimum of 2-3 orders of magnitude for the Synapt HDMS first generation mass spectrometer. After system qualification completion, the samples were left running in the MS^E positive mode with a nano-electrospray source.
  • 20| The peptides were separated with a gradient of 5–40 % mobile phase B over 90 min at a flow rate of 600 nL.min-1, followed by a 10 min rinse with 85% of mobile phase B.
  • 21| The column was re-equilibrated at the initial conditions for 10 min. The column temperature was maintained at 35 °C. The lock mass was delivered from the auxiliary pump of the nanoACQUITY system at a constant flow rate of 150 nL.min-1, using the 200 fmol.μL-1 GFP solution (Sigma-Aldrich, USA), to the reference sprayer of the mass spectrometer NanoLockSpray™ source. ?Troubleshooting: The column diameter is critical to achieve the best resolving power and increase the peak capacity. For optimum loading for 75 μm inner diameter columns, consider using 250 to 500 ng of protein digest and 200 to 400 nL.min-1; for 100 μm columns, use 440 to 880 ng of digest and 400 to 600 nL.min-1; for 150 μm columns, use 1 to 2 μg of digest and 800 nL.min-1 to 1.2 μL.min-1; and for 300 μm columns, use 4 to 8 μg and 4 to 5 μL.min-1 with an analytical ESI source. If the analysis is with a common 2D SCX or 2D with dilution, the amount of sample injected can be multiplied by the fraction number to keep the column capacity at a maximum.
  • 22| All samples were analysed in triplicate using a Synapt HDMS™ first generation mass spectrometer. For all measurements, the mass spectrometer operated in the “V-mode” of analysis with a typical resolving power of at least 10000 full-width half-maximum (FWHM) and a sampling rate of 10 to 20 points across the chromatography peak to provide good quantification and peak representation into the chromatogram.
  • 23| All analyses were performed using the positive nano-electrospray ion mode (nanoESI+).
  • 24| The time-of-flight analyser of the mass spectrometer was externally calibrated with GFP b+ and y+ ions from m/z 50 to 1990, with the data lock mass corrected post acquisition using the GFP monoisotopic precursor ion of [M + 2H]2+ = 785.8426.
  • 25| The reference sprayer was sampled with a frequency of 30 s.
  • 26| The nanoUPLC-MS^E data were collected in an alternating low energy and elevated energy mode of acquisition. The continuum spectra acquisition time in each mode was 1.5 s of scan time with at least 10 points per peak on the chromatogram.
  • 27| In the low energy MS mode, the data were collected at a constant collision energy of 3 eV.
  • 28| In the elevated energy MS mode, the collision energy was increased from 12 to 45 eV during each 1.5 s spectrum.
  • 29| The radiofrequency applied to the quadrupole mass analyser was adjusted such that ions from m/z 50 to 2000 were efficiently transmitted.
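
The chromatographic segments in steps 19 to 21 fix the instrument time per injection, which helps when planning the acquisition day. The sketch below adds them up for three samples run in triplicate. It assumes no overhead between injections (autosampler handling and file writing are ignored), so it gives a lower bound rather than a measured run time.

```python
# Duration of each chromatographic segment in minutes (steps 19-21).
SEGMENTS_MIN = [
    ("trapping at 15 uL/min", 1.0),
    ("gradient, 5-40 % mobile phase B", 90.0),
    ("rinse, 85 % mobile phase B", 10.0),
    ("re-equilibration", 10.0),
]


def minutes_per_injection(segments):
    """Total instrument time for one injection."""
    return sum(duration for _, duration in segments)


def total_hours(n_samples, replicates, segments):
    """Lower bound on acquisition time; overhead between runs is ignored."""
    return n_samples * replicates * minutes_per_injection(segments) / 60.0


per_injection = minutes_per_injection(SEGMENTS_MIN)
hours = total_hours(n_samples=3, replicates=3, segments=SEGMENTS_MIN)
print(f"{per_injection:.0f} min per injection, about {hours:.1f} h for 9 injections")
```

At 111 min per injection, nine injections need a little under 17 h, which is in line with the 1 d timing given for this section.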

Data Processing and Protein Identification. TIMING 1 d

  • 30| The MS data obtained from the nanoUPLC-MS^E were processed and searched using ProteinLynx Global Server (PLGS) version 2.4, configured as follows. Sequences from Glycine max were downloaded from UniProt (54). In PLGS, a new databank named “GLYCINE” was created, and the file containing the amino acid sequences was appended. The protein identifications were obtained with the embedded ion accounting algorithm of the software by searching this database, to which the MassPREP™ Protein Digestion Standards (MPDS) were appended as UniProtKB/Swiss-Prot sequences (Phosphorylase – P00489 – PHS2_RABIT, Bovine Hemoglobin – P02070 – HBB_BOVIN, ADH – P00330 – ADH1_YEAST, BSA – P02769 – ALBU_BOVIN) together with the CTAG protein (P78358). CRITICAL STEP: The database must be correctly loaded into PLGS. The identifications and quantitative data packaging were generated using dedicated algorithms (42, 55) and by searching against a species-specific database (56). Refer to the software manual for how to load sequences with the databank administration tool. ?Troubleshooting
  • 31| In PLGS, a new workflow was created for Electrospray-MS^E analysis by setting the data bank to “GLYCINE” and setting the peptide and fragment tolerance to automatic. The minimum fragment ion matches per peptide was set to 3. The minimum fragment ion matches per protein was set to 7. The minimum peptide matches per protein was set to 1. The maximum protein mass was set to 600 kDa. Trypsin was chosen as the primary digest reagent, allowing 1 missed cleavage. Carbamidomethyl-C and the oxidation of M were set as fixed and variable modifications, respectively. The N-linked and O-linked options were set as variable glycosylation modifications, the calibration protein was set to P00330 (corresponding to the ADH sequence in the database) and the calibration protein concentration was set to 25 fmol.μL-1.
    • CRITICAL STEP: These configurations will determine the protein identification processes and may vary from sample to sample. Changes in specificity and selectivity can vary because the minimum fragment ion matches per peptide was set to 3 and can be as low as 1; the minimum fragment ion matches per protein was set to 7 and can be as low as 5; and the minimum peptide matches per protein was set to 1. The maximum protein mass was set to 600 kDa; if the EST database was used, this can be increased to at least 1000 kDa. For standard concentration assignments, it is preferable to use the manual response to keep the counts/fmol ratio within a minimum coefficient of variation (CV).
  • 32| In PLGS, a new data preparation was created for Electrospray-MS^E analysis by setting the chromatographic peak width and MS TOF resolution to automatic mode. The lock mass for charge 2 was set to m/z 785.8426 (corresponding to the GFP mass), and the lock mass window was set to ±0.25 Da. The low and elevated energy thresholds were set to 250.0 and 100.0 counts, respectively. The retention time windows were set to automatic, and 1500 counts were applied to the intensity threshold.
    • CRITICAL STEP: Ensure the m/z value of GFP and the charge state set are correctly assigned to avoid error in the PLGS processing. Check the instrument calibration prior to analysis. If the interval window is more than 0.4 Da for GFP, calibrate the instrument. ?Troubleshooting
  • 33| In PLGS, open a new project. Add 3 new original samples, named SOYCTAG L3, SOYCTAG L37, and SoyCN, which correspond to lineages 3 and 37 of the recombinant CTAG soybean and to the non-transgenic soybean, respectively; these are the samples to be analysed and compared. If more samples need to be compared, add more original sample tags.
  • 34| In PLGS, add a new microtitre plate named CTAG. For each sample, add the original raw data from the acquisition, the data preparation file and the workflow file to a vial position. After the files are combined, raw data processing is possible. Tables 2 and 3 indicate a typical result.
    • CRITICAL STEP: Ion detection, clustering, and normalisation were performed in PLGS with the Expression^E software licence installed (Waters, Manchester, UK). The intensity measurements are typically adjusted, i.e., deisotoped and charge state-reduced EMRTs that replicate throughout the complete experiment are used for analysis at the EMRT cluster level. The components are typically clustered together with a 10 ppm mass precision and a 0.25-min time tolerance, or a value sufficient to achieve at least 15 points per peak. The alignment of elevated energy ions with low energy precursor peptide ions is conducted with an approximate precision of 0.05 min. To analyse the protein identification and quantification level, the observed intensity measurements are normalised to the intensity measurement of the identified peptides of the digested internal standard, as described elsewhere (56).
  • 35| For expression analysis, add a new “expression analysis” in PLGS, placing the samples created in step 33 into separate groups. In the quantification analysis, use the normalisation in proteins, selecting ADH protein in the table. The results are shown in Fig. 4.
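
Because the databank in step 30 is assembled from several sources (the Glycine max proteome, the digestion standards and the CTAG sequence), it can be worth checking the combined FASTA file before loading it into PLGS. The sketch below is a minimal, generic FASTA check. It does not reproduce the format rules PLGS itself applies, and the example sequences are short placeholders rather than real protein sequences.

```python
# One-letter amino acid codes plus common ambiguity symbols.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_records(records):
    """Return a list of human-readable problems; empty if none were found."""
    problems, seen = [], set()
    for header, seq in records:
        name = header.split()[0] if header.split() else ""
        if not name:
            problems.append("record with an empty header")
        elif name in seen:
            problems.append(f"duplicate identifier: {name}")
        seen.add(name)
        if not seq:
            problems.append(f"{name or '?'}: empty sequence")
        bad = sorted(set(seq) - VALID_RESIDUES)
        if bad:
            problems.append(f"{name or '?'}: unexpected characters {bad}")
    return problems


# Placeholder entries, NOT real sequences.
example = """>sp|P00330|ADH1_YEAST placeholder
MSIPETQKGV
IFYESHGK
>sp|P78358|CTG1B_HUMAN placeholder
MQAEGRGTGG
"""
records = parse_fasta(example)
print(len(records), "records;", check_records(records) or "no problems found")
```

Running the check on the combined file before import may catch duplicated accessions or stray characters that would otherwise surface as a databank processing failure.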

Troubleshooting advice can be found in Table 3.


  • Problem: A contaminant producing a repetitive cluster of singly charged ions is encountered during chromatography.

    • Recommendations: Use only high quality pipette tips and tubing. Poor quality plastics release quantities of compounds into the sample that will affect chromatography and MS analysis.
  • Problem: Poor peptide profile

    • Recommendations: Digest a new sample with a recently prepared high-quality trypsin. Check the pH of the sample before adding surfactant; it must be alkaline.
  • Problem: After MS^E acquisition, PLGS processing stops with the message “failed to process raw data” or yields insufficient data.

    • Recommendations: This indicates a problem in the MS acquisition. Check the ionisation source, changing or cleaning the probe and cone if necessary; check also that the GFP solution is adequately delivered by the lock mass sprayer. Inspect the raw data.
  • Problem: High pressure during chromatography stops the acquisition.

    • Recommendations: The column or capillary has clogged. Replace the column and capillary and ensure that the sample is digested and correctly centrifuged.
  • Problem: PLGS does not quantify the sample.

    • Recommendations: Ensure that ADH was added to the sample and that the information was given in the workflow process of PLGS.
  • Problem: Contamination appears during chromatography

    • Recommendations: Check all solutions. Use only MS- and HPLC-grade reagents and deionised water with total organic carbon less than 4 ppb to avoid contamination.
  • Problem: PLGS does not process the database

    • Recommendations: Introducing the database into PLGS requires that all sequences are in FASTA format and that their header strings follow the same character pattern.
  • Problem: Low reproducibility due to column saturation

    • Recommendations: Keep the total protein mass loaded onto the column within the range for its inner diameter: 250 ng to 500 ng for 75 μm, 440 ng to 880 ng for 100 μm, 1 μg to 2 μg for 150 μm and 4 μg to 8 μg for 300 μm.
  • Problem: PLGS does not show results

    • Recommendations: If no result is displayed, check the log files and the LockMass m/z window for errors. Check the data preparation file for errors in the LockMass values.
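
The loading recommendations by column inner diameter can be kept as a small lookup table and checked before each injection. The sketch below encodes the ranges given in the troubleshooting note of step 21. The function name and the example injection (2 μL of a 250 ng.μL-1 digest) are illustrative only.

```python
# Recommended digest load (ng) and flow rate (nL/min) by column inner
# diameter (um), taken from the troubleshooting note in step 21.
COLUMN_LIMITS = {
    75: {"load_ng": (250, 500), "flow_nL_min": (200, 400)},
    100: {"load_ng": (440, 880), "flow_nL_min": (400, 600)},
    150: {"load_ng": (1000, 2000), "flow_nL_min": (800, 1200)},
    300: {"load_ng": (4000, 8000), "flow_nL_min": (4000, 5000)},
}


def check_load(column_id_um, injected_uL, digest_ng_per_uL):
    """Classify an injection as 'under', 'ok' or 'over' the recommended load."""
    low, high = COLUMN_LIMITS[column_id_um]["load_ng"]
    load_ng = injected_uL * digest_ng_per_uL
    if load_ng < low:
        return "under"
    if load_ng > high:
        return "over"
    return "ok"


# Illustrative injection: 2 uL of a 250 ng/uL digest on the 100 um column.
print(check_load(100, 2.0, 250.0))
```

A result of "over" suggests reducing the injection volume or diluting the digest before it reaches the column.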

Anticipated Results

This is an easy-to-follow protocol to determine if a target recombinant protein has been expressed in any expression system, especially in a situation where a small sample must be used or no antibody is available to run blotting detection methods. We successfully detected the human growth hormone and coagulation factor IX proteins expressed in transgenic soybean lines (9, 10) and present the preliminary results on the CTAG recombinant molecule expressed in the same system. Two lineages, SOYCTAG L3 and SOYCTAG L37, and a SOYBEAN Negative from the BR-16 cultivar were used as samples in this protocol. The amino acid sequence of CTAG can be observed in Fig. 2. Fig. 1 shows a diagram of the workflow. The sample preparation from TSP to the nanoUPLC procedure is critical for a successful identification. The use of high purity water and reagents is recommended due to the sensitivity of the technique. The low peptide dilution provided by nanoUPLC permits each compound to enter the mass spectrometer almost individually, allowing the production of MS and MS/MS spectra from almost every peptide in the sample. When nanoACQUITY is associated with MS^E acquisitions (43), the ion current is continuous and both MS and MS/MS are acquired in parallel, so the chromatography peaks are better defined as more points per peak are obtained, and there is high reproducibility between different injections, usually in the full loop method with 2 μL or 5 μL sample injection loading. Fig. 3 shows the resulting nanoUPLC chromatogram, the MS^E spectra from the [M + 2H]2+ = 857.87 CTAG fragment and the respective spectra processed by PLGS. Orthogonal separations (57) with SCX columns (58, 59), or recent technologies using a first-dimension linear gradient with fractions at different pH levels and high-resolution separations in both the first and second dimensions (52), are justified by the complexity of the chromatogram in this particular sample (Fig. 2A). 
To improve separation, this nanoUPLC system can be used with 2D RPxRP nanocolumns with small particle sizes (1.7 μm for BEH or 1.8 μm for HSS T3 capillary column technologies). This setup allows a high-resolution first-dimension separation using organic mobile-phase pulse fractions with 20 mM ammonium formate at pH 10 on a 300 μm x 50 mm XBridge™ BEH 130 Å C18 5 μm column (Waters, Milford, MA), followed by a second-dimension separation on a trap column and a 75 μm x 100 mm analytical column at a low pH of 2.6. Even so, five peptides from CTAG (Table 2, Fig. 2) were detected with high selectivity and specificity. These peptides showed no trace of post-translational modification, but the possibility cannot be ruled out because another six CTAG peptides were not detected (Fig. 2). Additionally, a proteomic profile can be processed with absolute quantitative values for each protein (Table 1). In this example, the CTAG recombinant protein was detected and quantified in nanograms based on the stoichiometric ion intensity values of at least three prototypic peptides of ADH and of the identified protein. The relation between the total detected protein and the amount of a specific protein can then be used to calculate the percentage of the expressed protein relative to TSP. The percentage of each detected protein is given in Table 1. CTAG has an expression value of 0.1%, which is low compared with that of transgenic soybean seeds expressing hGH (2.9%) (9) but similar to that of factor IX (0.2%) (10). Other soybean proteins, such as β-conglycinin and glycinin, show the values expected for storage proteins of soybean seeds (60). With this protocol, it is also possible to check for changes in protein expression by comparing two or more samples. Fig. 4 shows pairwise comparisons among the protein expression lists of SOYCTAG L3, SOYCTAG L37 and the SOYBEAN Negative.
It is possible to compare the expression level of the two transgenic lines and choose the one with higher recombinant protein production, in this case SOYCTAG L37. This technique, together with the IdentityE and ExpressionE software in PLGS (Waters, UK), can also be used to check for up- and down-regulation of native proteins, providing information on the side effects of transgene introduction at the proteomic level.
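
The quantification arithmetic described above can be illustrated with a minimal sketch. It assumes the top-three-peptide relationship of reference 56: the summed intensity of the three most intense peptides of a protein is compared with that of a known amount of the ADH internal standard. The result is then converted to nanograms and expressed as a percentage of the total detected protein. The function names and all numeric values below are hypothetical and are not taken from PLGS or from Table 1; PLGS performs its own calculation, which may differ in detail.

```python
# Illustrative sketch only: function names and all numbers are hypothetical,
# not PLGS output and not values from this protocol.

def top3_sum(intensities):
    """Sum the three highest peptide ion intensities of one protein."""
    if len(intensities) < 3:
        raise ValueError("at least three peptide intensities are required")
    return sum(sorted(intensities, reverse=True)[:3])


def amount_fmol(protein_intensities, standard_intensities, standard_fmol):
    """Scale the protein's top-3 signal to that of the internal standard."""
    ratio = top3_sum(protein_intensities) / top3_sum(standard_intensities)
    return standard_fmol * ratio


def fmol_to_ng(fmol, molecular_weight_da):
    """Convert a molar amount to mass (fmol x g/mol = fg; 1 fg = 1e-6 ng)."""
    return fmol * molecular_weight_da * 1e-6


def percent_of_total(target_ng, all_detected_ng):
    """Share of one protein in the summed mass of all detected proteins."""
    return 100.0 * target_ng / sum(all_detected_ng)


# Hypothetical example: a standard loaded at 100 fmol and a 20 kDa target.
standard = [100.0, 60.0, 40.0]
target = [10.0, 50.0, 30.0, 20.0]
target_fmol = amount_fmol(target, standard, standard_fmol=100.0)
target_ng = fmol_to_ng(target_fmol, molecular_weight_da=20000.0)
share = percent_of_total(target_ng, [target_ng, 499.0, 500.0])
print(f"{target_fmol:.1f} fmol, {target_ng:.3f} ng, {share:.2f}% of detected protein")
```

The same percentage calculation, applied per sample, gives values that can be compared between lines such as SOYCTAG L3 and SOYCTAG L37.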


  1. Swartz, J.R. Advances in Escherichia coli production of therapeutic proteins. Curr. Opin. Biotechnol. 12, 195-201 (2001).
  2. Chu, L. & Robinson, D.K. Industrial choices for protein production by large-scale cell culture. Curr. Opin. Biotechnol. 12, 180-187 (2001).
  3. Tremblay, R., Wang, D., Jevnikar, A.M. & Ma, S. Tobacco, a highly efficient green bioreactor for production of therapeutic proteins. Biotechnol. Adv. 28, 214-221 (2010).
  4. Daniell, H., Singh, N.D., Mason, H. & Streatfield, S.J. Plant-made vaccine antigens and biopharmaceuticals. Trends Plant Sci. 14, 669-679 (2009).
  5. Boothe, J. et al. Seed-based expression systems for plant molecular farming. Plant Biotechnol. J. 8, 588–606 (2010).
  6. Cunha, N.B.d. et al. Correct targeting of proinsulin in protein storage vacuoles of transgenic soybean seeds. Genet. Mol. Res. 9, 1163-1170 (2010).
  7. Jolliffe, N.A., Craddock, C.P. & Frigerio, L. Pathways for protein transport to seed storage vacuoles. Biochem. Soc. Trans. 33, 1016-1018 (2005).
  8. Ma, J.K.-C., Drake, P.M.W. & Christou, P. The production of recombinant pharmaceutical proteins in plants. Nat. Rev. Genet. 4, 794-805 (2003).
  9. Cunha, N.B. et al. Expression of functional recombinant human growth hormone in transgenic soybean seeds. Transgenic Res. (2010).
  10. Cunha, N.B. et al. Accumulation of functional recombinant human coagulation factor IX in transgenic soybean seeds. Transgenic Res. (2010).
  11. Rech, E.L., Vianna, G.R. & Aragão, F.J.L. High-efficiency transformation by biolistics of soybean, common bean and cotton transgenic plants. Nat. Protoc. 3, 410-418 (2008).
  12. Blas, A.L.D. & Cherwinski, H.M. Detection of antigens on nitrocellulose paper immunoblots with monoclonal antibodies. Anal. Biochem. 133, 214-219 (1983).
  13. Perlmann, P. & Engvall, E. Enzyme-linked immunosorbent assay (ELISA). Quantitative assay of immunoglobulin G. Immunochemistry 8, 871-874 (1971).
  14. O’Farrell, P.H. High resolution two-dimensional electrophoresis of proteins. J. Biol. Chem. 250, 4007-4021 (1975).
  15. Shevchenko, A., Tomas, H., Havlis, J., Olsen, J.V. & Mann, M. In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 1, 2856-2860 (2006).
  16. Weiss, W. & Görg, A. Two-dimensional electrophoresis for plant proteomics. Methods Mol. Biol. 355, 121-143 (2007).
  17. Blackstock, W.P. & Weir, M.P. Proteomics: quantitative and physical mapping of cellular proteins. Trends Biotechnol. 17, 121-127 (1999).
  18. Murad, A.M. et al. Screening of entomopathogenic Metarhizium anisopliae isolates and proteomic analysis of secretion synthesized in response to cowpea weevil (Callosobruchus maculatus) exoskeleton. Comp. Biochem. Physiol., C 142, 365-370 (2006).
  19. Murad, A.M. et al. Proteomic analysis of Metarhizium anisopliae secretion in the presence of the insect pest Callosobruchus maculatus. Microbiology 154, 3766–3774 (2008).
  20. Halligan, B.D. ProMoST: A tool for calculating the pI and molecular mass of phosphorylated and modified proteins on 2 dimensional gels. Methods Mol. Biol. 527, 283-298 (2009).
  21. Henzel, W.J. et al. Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases. Proc. Natl. Acad. Sci. U. S. A. 90, 5011-5015 (1993).
  22. Wilson, N., Simpson, R. & Cooper-Liddell, C. Introductory glycosylation analysis using SDS-PAGE and peptide mass fingerprinting. Methods Mol. Biol. 534, 205-212 (2009).
  23. Gevaert, K. et al. Exploring proteomes and analyzing protein processing by mass spectrometric identification of sorted N-terminal peptides. Nat. Biotechnol. 21, 566-569 (2003).
  24. Hunter, A.P. & Games, D.E. Chromatographic and mass spectrometric methods for the identification of phosphorylation sites in phosphoproteins. Rapid Commun. Mass Spectrom. 8, 559-570 (1994).
  25. Wilkins, J.A., Xiang, R. & Horváth, C. Selective enrichment of low-abundance peptides in complex mixtures by elution-modified displacement chromatography and their identification by electrospray ionization mass spectrometry. Anal. Chem. 74, 3933-3941 (2002).
  26. Husson, S.J. et al. Comparative peptidomics of Caenorhabditis elegans versus C. briggsae by LC–MALDI-TOF MS. Peptides 30, 449-457 (2009).
  27. Guerrier, L. & Boschetti, E. Protocol for the purification of proteins from biological extracts for identification by mass spectrometry. Nat. Protoc. 2, 832-837 (2007).
  28. Guerrier, L., Righetti, P.G. & Boschetti, E. Reduction of dynamic protein concentration range of biological extracts for the discovery of low-abundance proteins by means of hexapeptide ligand library. Nat. Protoc. 3, 883-890 (2008).
  29. Deterding, L.J., Moseley, M.A., Tomer, K.B. & Jorgenson, J.W. Nanoscale separations combined with tandem mass spectrometry. J. Chromatogr. A 554, 73-82 (1991).
  30. Shen, Y. et al. High-efficiency nanoscale liquid chromatography coupled on-line with mass spectrometry using nanoelectrospray ionization for proteomics. Anal. Chem. 74, 4235-4249 (2002).
  31. Mirgorodskaya, E., Braeuer, C., Fucini, P., Lehrach, H. & Gobom, J. Nanoflow liquid chromatography coupled to matrix-assisted laser desorption/ionization mass spectrometry: Sample preparation, data analysis, and application to the analysis of complex peptide mixtures. Proteomics 5, 399–408 (2005).
  32. Nouri, M.-Z. & Komatsu, S. Comparative analysis of soybean plasma membrane proteins under osmotic stress using gel-based and LC MS/MS-based proteomics approaches. Proteomics 10, 1930-1945 (2010).
  33. Brumbarova, T., Matros, A., Mock, H.-P. & Bauer, P. A proteomic study showing differential regulation of stress, redox regulation and peroxidase proteins by iron supply and the transcription factor FER. Plant J. 54, 321-334 (2008).
  34. Behrens, H.L., Chen, R. & Li, L. Combining microdialysis, NanoLC-MS, and MALDI-TOF/TOF to detect neuropeptides secreted in the crab, Cancer borealis. Anal. Chem. 80, 6949–6958 (2008).
  35. Zybailov, B. et al. Sorting signals, N-terminal modifications and abundance of the chloroplast proteome. PLoS ONE 3, e1994 (2008).
  36. Barnes, S. et al. High-resolution mass spectrometry analysis of protein oxidations and resultant loss of function. Biochem. Soc. Trans. 36, 1037-1044 (2008).
  37. Bouché, J.-P. et al. NanoLC-MS/MS analysis provides new insights into the phosphorylation pattern of Cdc25B in vivo: full overlap with sites of phosphorylation by Chk1 and Cdk1/cycB kinases in vitro. J. Proteome Res. 7, 1264-1273 (2008).
  38. Unwin, R.D., Griffiths, J.R. & Whetton, A.D. A sensitive mass spectrometric method for hypothesis-driven detection of peptide post-translational modifications: multiple reaction monitoring-initiated detection and sequencing (MIDAS). Nat. Protoc. 4, 870-877 (2009).
  39. Mori, M. et al. Production of 18O-single labeled peptide fragments during trypsin digestion of proteins for quantitative proteomics using nanoLC−ESI−MS/MS. J. Proteome Res. 9, 3741–3749 (2010).
  40. Yang, Y. et al. A comparison of nLC-ESI-MS/MS and nLC-MALDI-MS/MS for GeLC-based protein identification and iTRAQ-based shotgun quantitative proteomics. J. Biomol. Tech. 18, 226-237 (2007).
  41. Levin, Y. et al. Real-time evaluation of experimental variation in large-scale LC–MS/MS-based quantitative proteomics of complex samples. J. Chromatogr. B 877, 1299-1305 (2009).
  42. Li, G.-Z. et al. Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures. Proteomics 9, 1696–1719 (2009).
  43. Geromanos, S.J. et al. The detection, correlation, and comparison of peptide precursor and product ions from data independent LC-MS with data dependant LC-MS/MS. Proteomics 9, 1683–1695 (2009).
  44. Xu, D. et al. Novel MMP-9 Substrates in Cancer Cells Revealed by a Label-free Quantitative Proteomics Approach. Mol. Cell Proteomics 7, 2215-2228 (2008).
  45. Cheng, F.-y., Blackburn, K., Lin, Y.-m., Goshe, M.B. & Williamson, J.D. Absolute protein quantification by LC/MSE for global analysis of salicylic acid-induced plant protein secretion responses. J. Proteome Res. 8, 82–93 (2009).
  46. Gnjatic, S. et al. NY-ESO-1: Review of an Immunogenic Tumor Antigen. Adv. Cancer Res. 95, 1-30 (2006).
  47. Chen, Y. et al. A testicular antigen aberrantly expressed in human cancers detected by autologous antibody screening. Proc. Natl. Acad. Sci. U. S. A. 94, 1914-1918 (1997).
  48. Kurashige, T. et al. NY-ESO-1 expression and immunogenicity associated with transitional cell carcinoma: correlation with tumor grade. Cancer Res. 61, 4671-4674 (2001).
  49. Murphy, R. et al. Recombinant NY-ESO-1 cancer antigen: production and purification under cGMP conditions. Prep. Biochem. Biotechnol. 35, 119-134 (2005).
  50. Liu, H. et al. Effects of column length, particle size, gradient length and flow rate on peak capacity of nano-scale liquid chromatography for peptide separations. J. Chromatogr. A 1147, 30-36 (2007).
  51. Liu, H., Finch, J.W., Luongo, J.A., Li, G.-Z. & Gebler, J.C. Development of an online two-dimensional nano-scale liquid chromatography/mass spectrometry method for improved chromatographic performance and hydrophobic peptide recovery. J. Chromatogr. A 1135, 43-51 (2006).
  52. Gilar, M., Olivova, P., Daly, A.E. & Gebler, J.C. Two-dimensional separation of peptides using RP-RP-HPLC system with different pH in first and second separation dimensions. J. Sep. Sci. 28, 1694–1703 (2005).
  53. Yu, Y.-Q., Gilar, M., Lee, P.J., Bouvier, E.S.P. & Gebler, J.C. Enzyme-friendly, mass spectrometry-compatible surfactant for in-solution enzymatic digestion of proteins. Anal. Chem. 75, 6023-6028 (2003).
  54. Consortium, T.U. The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res. 38, D142-D148 (2010).
  55. Silva, J.C. et al. Quantitative proteomic analysis by accurate mass retention time pairs. Anal. Chem. 77, 2187-2200 (2005).
  56. Silva, J.C., Gorenstein, M.V., Li, G.-Z., Vissers, J.P.C. & Geromanos, S.J. Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol. Cell Proteomics 5, 144-156 (2005).
  57. Gilar, M., Olivova, P., Daly, A.E. & Gebler, J.C. Orthogonality of separation in two-dimensional liquid chromatography. Anal. Chem. 77, 6426–6434 (2005).
  58. Millea, K.M. et al. Evaluation of multidimensional (ion-exchange/reversed-phase) protein separations using linear and step gradients in the first dimension. J. Chromatogr. A 1079, 287-298 (2005).
  59. Gilar, M. et al. Comparison of 1-D and 2-D LC MS/MS methods for proteomic analysis of human serum. Electrophoresis 30, 1157–1167 (2009).
  60. Li, C. & Zhang, Y.-M. Molecular evolution of glycinin and β-conglycinin gene families in soybean (Glycine max L. Merr.). Heredity doi:10.1038/hdy.2010.97 (2010).


We are grateful to G. Ritter at the Ludwig Cancer Research Institute (New York Branch) for providing genes and antibodies. We acknowledge support from C. Bloch at the Mass Spectrometry Laboratory-EMBRAPA. We acknowledge discussions with G. Ritter and C. Bloch and thank J. Taquita for technical help. This work was supported by the Brazilian Agricultural Research Corporation, the National Council for Scientific and Technological Development and Fundacao de Apoio a Pesquisa-DF.



Author information

André Murad & Elibio Rech, Embrapa Genetic Resources and Biotechnology, Laboratory of Gene Transfer, Parque Estação Biológica, PqEB, Av. W5 Norte, Brasília, DF, 70770-917, Brazil

Gustavo Souza, Waters Corporation, MS Applications Research and Development Laboratory, Alameda Tocantins, 125, 27th floor, West Side, Alphaville, São Paulo, SP, 06455-020, Brazil.

Jerusa Garcia, Alfenas Federal University, Institute of Exact Sciences, Alfenas, MG, 37170-000, Brazil

Correspondence to: Elibio Rech

Source: Protocol Exchange (2011) doi:10.1038/protex.2011.216. Originally published online 3 March 2011.
