Comprehensive Cow Milk Metabolite Database Now Online

    Written by: Lauren Milligan Newmark, Ph.D. | Issue # 89 | 2019

    • A team of Canadian researchers combined several laboratory metabolic profiling techniques with computer text-mining software to compile the most comprehensive list of cow milk metabolites to date.
    • The Milk Composition Database is a publicly available, online database containing over 2,000 different metabolite entries, of which nearly half are lipid molecules.
    • The creation of a centralized, open access database promotes collaborative research on, and consumer awareness of, cow milk metabolites.

    Some recipes are meant to be top secret: Colonel Sanders’ fried chicken, Big Mac’s special sauce, your great aunt Ingrid’s sherry cake. But the ingredients in cow milk shouldn’t be private and confidential. The advent of targeted metabolomics approaches, which characterize large numbers of small molecules in milk, offers the opportunity to produce a detailed and comprehensive picture of cow milk’s chemical composition. And yet, many studies employing these new techniques have not publicly reported their findings, or have reported the components they found but not their concentrations [1]. Rather than having scientists continue to reinvent the analytical milk wheel, a team of Canadian researchers has just published a “centralized, comprehensive, and electronically accessible database” of all detectable metabolites in cow milk [1]. We may never know what Colonel Sanders uses to season his fried chicken batter, but the (detectable) chemicals that make up cow milk, all 2,355 of them, are now on the record.

    Filling the Database

    Wouldn’t it be amazing to be able to simply pour a glass of milk into a machine and moments later have that machine spit out a piece of paper listing all of milk’s ingredients (and their relative concentrations)? Technically, such a magical machine does exist, but it only measures total macronutrients (e.g., total protein, total fat), components that are already pretty well established for cow milk. To determine the specific types of amino acids that make up that total protein or specific fatty acids that contribute to the total lipids requires much more complicated analytical techniques.

    One of the goals of this major research undertaking was establishing which specific techniques would yield the most detailed results. The team used four different metabolic profiling methods: nuclear magnetic resonance (NMR) spectroscopy; liquid chromatography high-resolution mass spectrometry (LC–HRMS); liquid chromatography tandem mass spectrometry (LC–MS/MS); and inductively coupled plasma mass spectrometry (ICP–MS). Rather than lose sensitivity by trying to make one method work for all milk metabolites, the team maximized the amount of data they were able to retrieve by optimizing each method for a specific set of metabolites. For example, NMR is good for water-soluble compounds that are abundant in a biological sample, whereas ICP–MS was used because it targets metal ions [1]. All told, LC–HRMS was determined to be the preferred analytical method because of its broad coverage (bonus points for not needing a large sample, being relatively inexpensive, and being largely automated!) [1].

    In addition to generating their own data, the team utilized computer-aided text mining to add in metabolites that other researchers had previously identified [1]. Utilizing software programs developed for the Human Metabolome Project, the team searched for publications that had particular key words of interest grouped together (for example, pulling up abstracts for all papers that have milk, dairy, bovine, concentration, and identification). This approach identified nearly 150 papers, abstracts, or books with relevant information. Then came the tedious task of taking all of the information from these articles on milk metabolites and entering it into the database; this required not only the time of manually inputting the information but several rounds of double-checking the details by experts in biochemistry, physiology, and animal science [1].
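
    The key word grouping described above can be pictured with a small Python sketch. It is only an illustration of the idea, not the software the team actually used: the matching rule (whole-word matches, all key words required by default), the function names, and the sample abstracts are all assumptions made for this example.

```python
import re

# Key words taken from the example in the text; the matching rule is an assumption.
KEYWORDS = ("milk", "dairy", "bovine", "concentration", "identification")


def matching_keywords(abstract, keywords=KEYWORDS):
    """Return the set of key words that occur as whole words in the abstract."""
    words = set(re.findall(r"[a-z]+", abstract.lower()))
    return {k for k in keywords if k in words}


def select_abstracts(abstracts, keywords=KEYWORDS, min_hits=None):
    """Keep abstracts containing at least min_hits key words (default: all of them)."""
    needed = len(keywords) if min_hits is None else min_hits
    return [a for a in abstracts if len(matching_keywords(a, keywords)) >= needed]


# Hypothetical abstracts, invented for illustration only.
abstracts = [
    "Identification and concentration of metabolites in bovine milk from dairy farms.",
    "Concentration of minerals in goat cheese.",
    "Bovine dairy herds and milk yield in winter.",
]

selected = select_abstracts(abstracts)
print(len(selected), "abstract(s) contain every key word")
```

    Loosening the filter (for example `select_abstracts(abstracts, min_hits=3)`) pulls in more candidates at the cost of more manual checking, which is the trade-off the curation step then has to absorb.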

    Combining the data from the four different analytical methods with the results of the computer text mining yielded 2,355 unique metabolic structures. Triglycerides made up almost half of the 2,355 metabolites, some of which were identified for the first time. The dominant ingredients were carbohydrates (like lactose), inorganic ions (like calcium and potassium), organic acids (like citrate), and amine-containing compounds (like creatinine and choline). Consumers may be more interested in the least abundant compounds, however. These include vitamin D3 and vitamin D2 (hence the vitamin D fortification program) and antimicrobial agents such as tetracycline and penicillin G. The latter are exogenous, present in milk if a dairy cow was given antibiotics for infection and the milk withdrawal period was insufficient. Their concentrations are very low (fractions of a micromole), but nevertheless, they are counted alongside the other metabolites as part of cow milk’s chemical composition.
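
    How overlapping results from several methods collapse into a list of unique entries can be sketched in a few lines of Python. The metabolite names come from the examples in this article, but which method reported which compound is invented here for illustration, and matching by plain name is a simplifying assumption (a real database would match on chemical structure or identifier).

```python
# Hypothetical per-method results; the assignments are illustrative, not the study's data.
detected = {
    "NMR": {"lactose", "citrate", "choline", "creatinine"},
    "LC-HRMS": {"choline", "creatinine"},
    "ICP-MS": {"calcium", "potassium"},
    "text mining": {"lactose", "calcium", "tetracycline"},
}


def merge_sources(sources):
    """Map each unique metabolite to the sorted list of sources that reported it."""
    merged = {}
    for source, names in sources.items():
        for name in names:
            merged.setdefault(name, []).append(source)
    return {name: sorted(found_by) for name, found_by in merged.items()}


merged = merge_sources(detected)
total_reports = sum(len(names) for names in detected.values())
print(f"{total_reports} reports collapse to {len(merged)} unique entries")
```

    Keeping the list of sources next to each entry mirrors the way the database points back to the original reference when a compound came from the literature rather than from the team's own measurements.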

    Accessing the Database

    Dying to see the complete list of metabolites that Foroutan et al. [1] detected? You are in luck! The Milk Composition Database (MCDB) is available online for dairy researchers, dairy farmers, nutritionists, milk drinkers, and all other interested parties. The website has a complete list of all detected chemicals, including their structures and reference spectra. If the structures were identified via computer text mining, rather than detected and quantified directly by Foroutan et al. [1], the listing provides the reference so you can go straight to the source to learn more.

    The database started with 2,355 detectable unique metabolic structures (representing 972 metabolite species), of which 1,285 structures (168 metabolite species), or roughly 60% of reported data, were identified by Foroutan et al. for the first time [1]. The hope is that this number will continue to grow as analytical techniques improve. Indeed, Foroutan et al. already have plans to get back to the lab to employ a handful of different experimental methods with the hopes of adding rows to their already gigantic spreadsheet [1].
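
    For readers who like to check proportions, the counts quoted above can be turned into percentages with a short Python sketch. The variable names are made up for this example. By raw structure count the newly reported share works out closer to 55%, so the "roughly 60%" figure presumably rests on a different basis; that is an assumption here, not something the source spells out.

```python
# Counts quoted in the paragraph above.
total_structures = 2355
total_species = 972
new_structures = 1285
new_species = 168


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


structure_share = share(new_structures, total_structures)
species_share = share(new_species, total_species)

print(f"Newly reported structures: {structure_share}% of all structures")
print(f"Newly reported species: {species_share}% of all species")
```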

    It should be emphasized that despite this impressively high number of identified structures, the MCDB is not by any means an exhaustive list of all of the components that are in cow’s milk, just those that qualify as metabolites. For example, the most abundant proteins in cow’s milk are not in this database, which would have been more appropriately named the “Bovine Milk Metabolite Database.”

    Nevertheless, centralized and publicly available databases like the MCDB are scientific research at its best, offering the chance to collaborate across laboratories, disciplines, and international borders. As milk research moves away from quantification of macronutrients to more detailed and comprehensive methods of analysis, and as the research questions become more focused and targeted, collaborative science is the only way forward.


    1. Foroutan A, Guo AC, Vazquez-Fresno R, Lipfert M, Zhang L, Zheng J, Badran H, Budinski Z, Mandal R, Ametaj BN, Wishart DS. 2019. Chemical composition of commercial cow’s milk. Journal of Agricultural and Food Chemistry 67: 4897-4914.