Predicting Performance in Dairy Cows of the Future

    Written by: Peter Williamson, Ph.D. | Issue # 39 | 2015

    • Modern genomic tools can be used to predict how much milk dairy cows will produce.
    • Scientists are working hard on maximizing the accuracy of these predictions while minimizing the cost.
    • This study used markers that identify large sections of DNA to simplify analysis.
    • The approach improved prediction accuracy.
    • Soon these methods will be applied to predict milk production in the herds of the future.

    Selective breeding of dairy cows is a major part of modern dairy farming. Farmers can select the bulls they want to use to produce animals for their herd. One bull may sire thousands of daughter cows via highly developed systems for artificial insemination. The availability of stored semen from bulls that have been shown to produce cows with excellent production and health traits has been the backbone of gains in efficiency and production on dairy farms for several decades. There has been a continuous effort to build on the methods and procedures that contribute to selective breeding, most recently with the advent of genomic tools.

    Genomic tools are based on detailed information arising from the Bovine Genome Sequencing Project. Since its first release, the bovine genome assembly has fuelled an enormous increase in activity to find the regions in the bovine genome where there are differences between individuals (referred to as genetic polymorphisms) that contribute to the capacity of cows to produce milk. Genome technologies have advanced to a stage where many individuals have had their entire genome sequenced, but when comparing cattle, the polymorphisms are a tiny percentage of the whole. Scientists are able to focus on just these parts to do calculations that predict how individual cows, and future generations of cows, will perform in milk production.

    The most common forms of genetic variation are called SNPs (single nucleotide polymorphisms). As the name suggests, these are single “letter” changes in the DNA sequence. Some are found within genes, but the majority are spread throughout the DNA sequence between the gene coding regions. A SNP that itself alters a trait is referred to as a causal SNP. However, because the genome is inherited in blocks, or haplotypes, knowing the genotypes at SNPs that lie close to the causal SNP allows scientists to do calculations based on the assumption that these nearby SNPs track the cause. This assumption is how the science has progressed to date, and the major effort has gone into increasing the number of SNPs analyzed, from 50,000 SNPs to more recent tools that can identify over 770,000 SNPs.
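
    The idea that a nearby SNP "tracks" a causal SNP can be made concrete with a small calculation. The sketch below is illustrative only: the genotypes are invented, each animal is coded by how many copies (0, 1 or 2) of one allele it carries, and the squared correlation between two SNPs is used as a simple stand-in for how well one predicts the other. The function name r_squared and the variable names are assumptions made for this example, not part of any genomic software.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A SNP with no variation carries no information about another SNP
        return 0.0
    return cov * cov / (var_x * var_y)

# Hypothetical genotypes for eight animals (invented for illustration)
causal      = [0, 1, 2, 1, 0, 2, 1, 0]  # the SNP that actually alters the trait
near_marker = [0, 1, 2, 1, 0, 2, 0, 0]  # close by: differs from the causal SNP in one animal
far_marker  = [1, 0, 1, 2, 1, 1, 0, 2]  # far away: inherited largely independently

ld_near = r_squared(causal, near_marker)
ld_far = r_squared(causal, far_marker)

print(f"near marker r^2 = {ld_near:.2f}")
print(f"far marker  r^2 = {ld_far:.2f}")
```

    In this toy data the nearby marker has a much higher squared correlation with the causal SNP than the distant one, which is why a dense panel of markers can stand in for causal variants that have not been observed directly.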

    Armed with this additional level of detail, geneticists have been evaluating how it can be used most effectively in genomic prediction. Surprisingly, the additional data does not translate into an equivalent increase in prediction accuracy. Mogens Lund and his colleagues from Aarhus University have been studying this issue in Nordic cattle and, for comparison, in French Holsteins with Didier Boichard from Institut National de la Recherche Agronomique (INRA). Their recent publications explored whether prediction could be improved either by using haplotype-based methods [1] or by including information from sequence data analysis [2].

    The first study was based on the knowledge that, because it represents a block of DNA sequence, a haplotype contains more information than an individual SNP and is therefore more likely to result in improved accuracy. The study began by defining haplotypes based on all the information drawn from 770,000 SNPs in each individual from a large number of cattle. When the researchers used the haplotype approach for their genomic prediction calculations, they found a small but significant (3.1%) gain in accuracy for prediction of milk protein yield. In addition to the increase in accuracy, the method also reduces the number of SNPs required to capture the information, and would therefore translate into a reduced cost if the method were adopted.
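
    One common way to turn a block of SNPs into a single haplotype variable is to treat each distinct combination of alleles across the block as one "haplotype allele" and count how many copies each animal carries. The sketch below assumes the block boundaries and the phase (which alleles sit together on the same chromosome) are already known; the animals, the three-SNP block and the function name haplotype_design are invented for illustration, and this is not necessarily the exact coding used in [1].

```python
from collections import Counter

# Hypothetical phased data: each animal carries two haplotypes over one
# three-SNP block, written as strings of 0/1 alleles (invented for illustration)
phased = {
    "cow_a": ("010", "010"),
    "cow_b": ("010", "110"),
    "cow_c": ("110", "001"),
    "cow_d": ("001", "001"),
}

def haplotype_design(phased_block):
    """Count copies (0, 1 or 2) of each distinct haplotype allele per animal."""
    # Every distinct allele combination seen in the block becomes one column
    alleles = sorted({hap for pair in phased_block.values() for hap in pair})
    rows = {}
    for animal, pair in phased_block.items():
        counts = Counter(pair)
        rows[animal] = [counts[allele] for allele in alleles]
    return alleles, rows

alleles, design = haplotype_design(phased)

print("haplotype alleles:", alleles)
for animal, row in design.items():
    print(animal, row)
```

    Each animal's row sums to two because every animal carries exactly two haplotypes per block. The prediction model then works with these block-level counts rather than with each SNP separately.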

    The second study used data from previous analyses that identified some key DNA sequence information for dairy traits. This allowed the researchers to add SNPs from regions that had a known link to dairy traits. Again, the researchers were mindful of keeping the complexity of any direct measurements on cattle DNA to a minimum, with a view to producing a cost-effective method for potential industry applications. They looked at the capacity of this new set of markers to improve predictions of 17 dairy traits, ranging from body shape to milk production. The analysis showed an improvement in accuracy of up to 4% for production traits when the calculation put more emphasis on the added SNPs from the sequence data.
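
    Putting "more emphasis" on one group of SNPs can be sketched as follows. The example builds a simple measure of genomic similarity between animals from each SNP group separately and then blends the two with a weight. This is a deliberately simplified illustration and not the model used in [2]: the genotypes are invented, the similarity measure omits the allele-frequency scaling used in practice, and the names relationship, blend and weight_qtl are assumptions made for this example.

```python
def relationship(genotypes):
    """Simplified genomic similarity matrix from column-centred 0/1/2 genotypes."""
    n_animals = len(genotypes)
    n_markers = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n_animals for j in range(n_markers)]
    centred = [[row[j] - means[j] for j in range(n_markers)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(row_i, row_k)) / n_markers for row_k in centred]
        for row_i in centred
    ]

def blend(g_chip, g_qtl, weight_qtl):
    """Weighted average of two similarity matrices; weight_qtl is between 0 and 1."""
    return [
        [(1 - weight_qtl) * c + weight_qtl * q for c, q in zip(row_c, row_q)]
        for row_c, row_q in zip(g_chip, g_qtl)
    ]

# Hypothetical genotypes for three animals (rows); columns are SNPs
chip_snps = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 0]]  # standard marker panel
qtl_snps = [[2, 0], [1, 1], [0, 2]]                     # added SNPs from sequence data

g_chip = relationship(chip_snps)
g_qtl = relationship(qtl_snps)

# Hypothetical weight: the two added SNPs get 30% of the emphasis,
# far more than their share of the six markers in total
g_combined = blend(g_chip, g_qtl, weight_qtl=0.3)

for row in g_combined:
    print([round(value, 3) for value in row])
```

    Raising weight_qtl makes animals that share alleles at the added SNPs look more alike to the prediction model, which is one simple way of letting known trait-linked regions count for more than their number alone would suggest.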

    These studies are indicative of what is emerging in the field of dairy genetics globally: drawing together highly detailed data on dairy cattle genomes, reducing the complexity of that information to minimize the difficulty of calculations and the cost of potential applications, and gradually improving the accuracy of predictions. With such a flurry of research activity and palpable excitement about what will result, we are very close to having accurate predictions of the best genetic composition of future generations of dairy cows.


    1. Cuyabano BC, Su G, Lund MS (2014) Genomic prediction of genetic merit using LD-based haplotypes in the Nordic Holstein population. BMC Genomics 15: 1171.

    2. Brøndum RF, Su G, Janss L, Sahana G, Guldbrandtsen B, Boichard D, Lund MS (2015) Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction. Journal of Dairy Science 98: 4107-4116.