Chapter One
PLANT METABOLOMICS IN A NUTSHELL: POTENTIAL AND FUTURE CHALLENGES
Robert D. Hall
Plant Research International, Wageningen University and Research Centre (Wageningen-UR), PO Box 16, 6700 AA Wageningen, The Netherlands
Centre for BioSystems Genomics, PO Box 98, 6700 AB Wageningen, The Netherlands
Netherlands Metabolomics Centre, Einsteinweg 55, 2333 CC Leiden, The Netherlands
Abstract: In just 10 years, plant metabolomics has been transformed from a purely theoretical concept into a highly valued and widely exploited technology. Moving on from the many and wide-ranging hopes enthused upon in a multitude of early reviews, metabolomics for plant research has already proved itself, despite the technology still experiencing certain limitations. We have a long road to travel before we reach our desired destination: a very happy place where large-scale, (semi-)automated, unbiased multiplex analyses of plant materials, leading to exhaustive lists of named metabolites, become possible. This biochemical Holy Grail will, by definition, never be reached. Nevertheless, continued advances in hardware, software and biostatistics are enabling us to generate ever more detailed insights into the chemical composition of plants and how this is influenced by genetic and environmental perturbation. There continues to be a major driving force behind further developments, spurred on by myriad existing and potentially new applications in both applied and fundamental plant science. It is no longer so much a case of 'hope springs eternal' but rather 'who dares wins'!
Keywords: applied metabolomics; fundamental research; crops; technology development; systems biology
1.1 The history and the goals of plant metabolomics
For thousands of years, man has understood the importance, value and even dangers of the chemicals that are to be found in plant organs, be they fresh, dried, fermented or processed in many other ways. Learning which plants are nutritious, poisonous, curative, fibrous, etc., has been of fundamental importance to the survival and evolution of humans and has shaped mankind as it is today. However, recognition is far from understanding, and while even the ancient Greeks and Romans had significant knowledge of the extraction and application of plant materials for a myriad of purposes, it is only in modern times that we have had the technologies available to stably extract, separate and identify some of the individual components in plants that form the basis of their exploitable properties, be they nutritional, medicinal or otherwise. Developments in analytical chemistry, related to aspects of both separation and detection, have been central to establishing our capacity to decipher the biochemical composition of what are usually highly complex tissues and extracts. Recent developments have dramatically improved our analytical potential, and the emergence of metabolomics as an analytical concept has initiated a paradigm shift in the way we think about and approach metabolic analyses and, indeed, biological experiments in general.
Metabolomics, as a follow-on from transcriptomics and proteomics, was a term coined at the end of the 1990s (Oliver et al., 1998) (Table 1.1). The concept entailed the analysis of the metabolite composition of biological materials and was intended to be fully complementary to the other, potentially unbiased or non-targeted '-omics' approaches. Restrictions were set: metabolomics was specifically focused on the smaller metabolites, while the larger organic polymers were excluded. Concentrating on these smaller molecules enabled the focus to be directed to global sets of compounds involved both in primary metabolism and in so-called secondary metabolism, the latter involving metabolites that may or may not (yet) have a proven role in the daily functioning of a plant cell or tissue but that might be of great importance in the long-term survival of the individual plant or whole species. Low molecular weight metabolites are the (end) products of a huge network of metabolic pathways and represent the activities of cell regulatory processes (Fiehn et al., 2011). As such, they advertise the response of biological systems to a variety of genetic and environmental perturbations (Fiehn, 2002). Rapidly detecting and monitoring such global sets of metabolites enables us to assess changes in the distribution and concentration of a broad range of potentially biochemically unrelated compounds, and such strategies therefore permit the detection of perturbations at multiple levels of organization: from cell to whole organ.
The primary goal of any plant metabolomics approach is therefore to gain a helicopter view of metabolism at a specific point in time, in a chosen tissue, obtained under either control or experimental conditions. Extending this to time-resolved samples, taken at appropriate intervals, can then also introduce a degree of dynamics to the system. However, even when employing a number of extraction, separation and detection conditions (see below), the view gained will never be truly holistic, as some element of bias will always be involved. This results from a failure to extract or detect certain compounds that are perhaps unstable or that have chemical or physico-chemical properties unsuited to the methodologies chosen. Nevertheless, educated choices as to the best approaches to use, together with optimization of data collection and mining strategies, have greatly enhanced our capacity to expand our biochemical knowledge of plant materials.
Initially, concepts such as 'targeted' and 'untargeted' analyses were used, where the latter essentially referred to a true metabolomics approach while the former was closer to well-established analytical chemistry methods. These are still useful descriptors, although the delineation is blurring somewhat as the technologies develop. Lipidomics, for example, is a prominent and established form of metabolomics for 'untargeted' analyses but has been specifically optimized for the huge group of lipids and lipid-related compounds such as the sterols, phospholipids, glycerides, waxes, etc. (Lessire et al., 2009). Other sub-themes might also be expected to be developed in the future for other major groups of plant compounds such as the terpenoids, alkaloids or phenolics, each of which is already known to contain many thousands of different chemical structures, often with identical accurate masses and elemental formulae.
Other early terms regularly used were metabolic 'fingerprinting' and metabolic 'profiling'. The former was generally taken to refer to the use of machine output as a potentially recognizable chemical pattern, specific to an individual sample. These unique fingerprints were usually the starting point for comparative metabolomics, where the researcher wished to compare up to several hundred extracts in order to quickly assess the degree of variation and, often, to select the most divergent samples or genotypes for more detailed study (Hall, 2006; Saito et al., 2006). There are multiple and diverse examples of such approaches being used in the plant field for a wide range of applications (Hall et al., 2005, 2008; Saito et al., 2006; Schijlen et al., 2008). Specific software tools have also been developed to speed up and semi-automate this process and to optimize the output. In contrast, metabolic profiling is a term that has been employed to refer to a deeper form of analysis where one proceeds to complete metabolite structural elucidation. Here too, as the technology progresses, the boundaries are becoming vague. For example, several labs have already developed extensive in-house databases, mainly for primary plant metabolite analyses, applicable to their own specific instrumentation. This enables them to identify unambiguously up to 150 polar small molecules in a non-targeted approach (Fernie, 2007). In such cases, metabolic fingerprints are becoming more and more annotated and are moving towards extensive, true metabolite profiles. Metabolite identification, particularly of so-called secondary plant metabolites, remains, however, a significant challenge, or indeed a major bottleneck. This aspect requires extensive, unified multi-disciplinary approaches crossing many different laboratory boundaries (see below).
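The comparative use of fingerprints described above can be illustrated with a minimal sketch. It assumes each fingerprint is a list of signal intensities aligned across samples, normalizes each to unit total intensity, and picks the sample with the largest mean Euclidean distance to the others. The sample names, intensities and the choice of distance measure are all hypothetical; they are not taken from the chapter or from any specific software tool.

```python
import math

def normalize(fingerprint):
    """Scale a fingerprint to unit total intensity so samples are comparable."""
    total = sum(fingerprint)
    return [x / total for x in fingerprint]

def euclidean(a, b):
    """Euclidean distance between two equally long intensity vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_divergent(fingerprints):
    """Return the sample name with the largest mean distance to all other samples."""
    names = list(fingerprints)
    normed = {n: normalize(fingerprints[n]) for n in names}

    def mean_distance(n):
        others = [m for m in names if m != n]
        return sum(euclidean(normed[n], normed[m]) for m in others) / len(others)

    return max(names, key=mean_distance)

# Hypothetical intensities for four aligned signals in three extracts
fingerprints = {
    "genotype_A": [100, 50, 25, 25],
    "genotype_B": [110, 45, 20, 25],
    "genotype_C": [20, 30, 100, 50],
}
print(most_divergent(fingerprints))
```

In practice, several hundred extracts and thousands of signals would be compared, typically with multivariate statistics rather than a single distance measure.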
1.2 The technologies
Perhaps one of the most formidable tasks facing any inexperienced researcher approaching metabolomics for the first time is to become familiar with the technologies available. These are not only extensive but are also in a continual state of modification and improvement (Saito et al., 2006; Weckwerth, 2007). User-friendliness is in no way enhanced by the extensive use of abbreviations for the different approaches (see Table 1.2). Most metabolomics experiments involve combinations of separation and detection technologies that can be used in serial or even in parallel combinations, often referred to as 'hyphenated approaches'. Simpler forms, such as LC-MS or GC-MS, are commonplace, but extreme examples, such as HPLC-PDA-SPE-NMR-ESI-(ToF)MS (Moco & Vervoort, 2011), can be daunting. In this case, one separation technique (HPLC) is followed by UV/Vis spectral detection (PDA), after which the sample is split and one fraction proceeds to an electrospray ionization (ESI) unit before entering an accurate mass Time of Flight MS ((ToF)MS). Simultaneously, the other fraction is adsorbed and concentrated using a Solid Phase Extraction (SPE) unit so that sufficient quantities of individual compound peaks can be collected for detection and identification using Nuclear Magnetic Resonance (NMR). Basically, the hyphenated code just refers to the analytical workflow used for that particular analysis. Many reviews have been written on the separation and detection technologies available for plant metabolomics, so we include here only a nutshell-type introduction. For more detailed information the reader is referred to seminal volumes such as Saito et al. (2006), Weckwerth (2007) and Hardy and Hall (2011). Compact summaries have also been provided by Browne et al. (2011) and Beale and Sussman (2011).
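As a small illustration of how a hyphenated code can be read, the sketch below splits such a code on its hyphens and expands each abbreviation. The expansions follow the descriptions given in the text; the `STAGES` dictionary and the `expand_workflow` function are hypothetical names introduced only for this example. Note that the order of the parts does not by itself show that the sample is split into two parallel branches (SPE-NMR and ESI-(ToF)MS).

```python
# Expansions follow the chapter text; the dictionary itself is illustrative.
STAGES = {
    "HPLC": "high-pressure liquid chromatography (separation)",
    "PDA": "photodiode array, UV/Vis spectral detection",
    "SPE": "solid phase extraction (concentration of individual peaks)",
    "NMR": "nuclear magnetic resonance (detection and identification)",
    "ESI": "electrospray ionization",
    "(ToF)MS": "accurate mass time-of-flight mass spectrometry",
}

def expand_workflow(code):
    """Split a hyphenated code and expand each part; unlisted parts are labelled 'unknown'."""
    return [(part, STAGES.get(part, "unknown")) for part in code.split("-")]

for abbreviation, meaning in expand_workflow("HPLC-PDA-SPE-NMR-ESI-(ToF)MS"):
    print(f"{abbreviation:8s} {meaning}")
```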
1.2.1 Extraction, separation and detection
In contrast to the other '-omics' approaches, which are focused on DNA sequencing (genomics), gene expression analysis (transcriptomics) and proteins (proteomics), the major challenge for complementary, untargeted metabolomics approaches is related to the huge diversity of molecules present, their different physico-chemical properties and their dynamic range (Makkar et al., 2007; Cevallos-Cevallos et al., 2009; Fernie & Keurentjes, 2011). This chemical complexity within just the smaller molecules, involving both the basic structures and their different combinations of functional groups such as hydroxyls, carboxyls, amines, etc., has been well documented (Saito et al., 2006; Fernie & Keurentjes, 2011). This complexity is both the primary reason for our desire to develop metabolomics approaches in the first place and the primary reason why we are unlikely ever to succeed in obtaining a truly holistic overview of the complete metabolite profile of a plant (Hall, 2006). Dynamic range is a particular challenge, as compounds with high biological relevance, relating for example to their bioactivity or physiological importance, can be present in plant tissues at concentrations differing by many orders of magnitude, from molar to nanomolar or maybe even lower.
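To make the scale of this dynamic range concrete, the following minimal sketch computes the number of orders of magnitude between two concentrations. The function name is hypothetical; the molar and nanomolar endpoints are the ones mentioned in the text.

```python
import math

def orders_of_magnitude(high_conc_molar, low_conc_molar):
    """Number of orders of magnitude separating two concentrations (both in mol/L)."""
    return math.log10(high_conc_molar) - math.log10(low_conc_molar)

# From molar (1 M) down to nanomolar (1e-9 M)
print(round(orders_of_magnitude(1.0, 1e-9), 1))  # 9.0
```

A span of nine orders of magnitude means that a single analytical method would need to quantify one compound in the presence of another that is a billion times more abundant.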
If a compound is not extracted from a sample it can, of course, never be subsequently detected. The choice and optimization of sample preparation and extraction procedures are therefore critical (Hall, 2006; Weckwerth, 2007; Hardy & Hall, 2011). Polar/semipolar extraction procedures, often based on hot water or alcohol/water mixtures, as well as lipophilic extraction methods (often using chloroform), are widely used, with the choice depending on the main groups of compounds of interest. Such protocols can also be used sequentially. Methanol-water-chloroform mixtures are also popular as they allow extraction of a range of both hydrophilic and hydrophobic compounds in a single method. For volatile components, organic solvents or Solid Phase Extraction approaches can be employed (Tikunov et al., 2007; Verhoeven et al., 2011). In all cases, the efficiency and balance of compounds moving from the biological sample into the extract determine the quality of the extract and thus how representative it is of the original sample. Inevitably, an element of bias is already introduced at this stage, as few compounds will be extracted to 100%. This will later be reflected in the analytes ultimately detected and measured.
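The effect of incomplete extraction on how representative an extract is can be shown with a short worked example. All numbers here are hypothetical: two compounds are assumed to be present in equal amounts in the tissue but to be recovered with different efficiencies.

```python
def extracted_amounts(true_amounts, recoveries):
    """Amount of each compound reaching the extract, given its fractional recovery."""
    return {name: true_amounts[name] * recoveries[name] for name in true_amounts}

# Hypothetical values: equal true amounts, unequal extraction recovery
true_amounts = {"compound_X": 100.0, "compound_Y": 100.0}
recoveries = {"compound_X": 0.9, "compound_Y": 0.5}

extract = extracted_amounts(true_amounts, recoveries)
ratio = extract["compound_X"] / extract["compound_Y"]
print(round(ratio, 2))
```

Although the two compounds are equally abundant in the tissue, the extract suggests a ratio of 1.8 to 1, which is the kind of bias that carries through to the analytes finally measured.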
Chromatographic separation using either gas (GC) or liquid (LC) phases is very common and is widely applied for different groups of compounds. GC can be used for naturally volatile compounds at temperatures up to 250°C but is also used for heat-stable molecules that can be structurally modified through a chemical derivatization process to make them volatile. GC-MS of (semi)polar primary metabolite extracts is one of the most widely used metabolomics approaches currently employed (Fiehn et al., 2011). LC methods, often involving high-pressure (HPLC) or ultra-performance (UPLC) systems, are particularly popular with plant scientists (Verhoeven et al., 2006). Protocols can be developed that are highly suited to many of the (semi)polar secondary metabolites in which plants can be particularly rich. A final separation option, which is not yet widely applied but which is becoming increasingly popular in certain labs, is Capillary Electrophoresis (CE) (Soga, 2007). CE-MS in certain circumstances can have particular advantages relating to sensitivity, rapidity and resolving power (Timischl et al., 2008). For a full review of current CE developments, please see the special issue of Electrophoresis, 2009, volume 30, issue 10.
For metabolite detection after separation, there are basically two key players: NMR and MS. Each has its own advantages and disadvantages (see Table 1.3). NMR requires relatively minor sample preparation, is nondestructive and is inherently quantitative. It is also not restricted to specific compound groups and has the potential to give unambiguous information for metabolite identification. Its greatest current drawback is its lower sensitivity compared to MS-based approaches and its requirement for relatively large samples. However, recent improvements are making this less of an issue. MS has a wide dynamic range and high sensitivity but does require molecules to be ionized (charged) in order for them to enter the instrument. MS also requires extensive sample preparation and extraction. Furthermore, the highly complex extracts typical of plants can, despite good chromatography, still be prone to significant ion suppression/matrix effects that can interfere with or mask molecule detection. This, together with variable ionization efficiencies, makes MS-based quantification more difficult and totally dependent on available reference standards. Recent developments in instruments with improved mass accuracy, such as the Orbitrap™ and FT-ICR-MS, have created additional interest due to their greater applicability regarding empirical formula calculations and metabolite identification.
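Why improved mass accuracy helps empirical formula calculation can be sketched as follows. The example computes monoisotopic masses for three candidate elemental formulae of nominal mass 180 and keeps those within a given tolerance, in parts per million (ppm), of a measured mass. The measured value, the candidate list and the function names are hypothetical, and for simplicity the calculation uses neutral masses and ignores the ionization step (adduct and electron mass).

```python
# Monoisotopic masses (u) of the most abundant isotope of each element
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

def monoisotopic_mass(composition):
    """Sum of isotope masses for a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())

def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

def candidates_within(measured, formulas, tolerance_ppm):
    """Formulae whose theoretical mass lies within the ppm tolerance of the measurement."""
    return [name for name, composition in formulas.items()
            if abs(ppm_error(measured, monoisotopic_mass(composition))) <= tolerance_ppm]

formulas = {
    "C6H12O6": {"C": 6, "H": 12, "O": 6},
    "C7H8N4O2": {"C": 7, "H": 8, "N": 4, "O": 2},
    "C9H8O4": {"C": 9, "H": 8, "O": 4},
}

measured_neutral_mass = 180.0634  # hypothetical measurement
print(candidates_within(measured_neutral_mass, formulas, 10))  # ['C6H12O6', 'C7H8N4O2']
print(candidates_within(measured_neutral_mass, formulas, 2))   # ['C6H12O6']
```

Tightening the tolerance from 10 ppm to 2 ppm removes one of the two remaining candidates. Even a unique elemental formula does not identify a metabolite, however, since isomers share the same formula and accurate mass.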
Excerpted from Annual Plant Reviews, Biology of Plant Metabolomics. Copyright © 2011 by Blackwell Publishing Ltd. Excerpted by permission of John Wiley & Sons. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.