Iowa State University
The Pre-Genomic Era
The Pan-American Evaluation covers four countries: the US, Canada, Uruguay and Argentina. The Agricultural Business Research Institute (ABRI) collects the pedigree and trait data. The Animal Genetics & Breeding Unit then uses BREEDPLAN to estimate EPDs.
To put the number of records submitted to the AHA in context, we can compare it with other breeds. The 11 breed associations in International Genetic Solutions added 340,000 new animals in each of 2012 and 2013. AGI adds about 300,000 per year. Herefords add a little less than 100,000 animals per year.
The Early-Genomic Era
Cattle have 30 pairs of chromosomes. There are about 100 million base pairs per chromosome and about 2.6 billion base pairs in the entire DNA (genome).
Most errors in chromosome replication are fixed, but some slip through and are passed down through generations.
The EPD of a bull is a sum of the average gene effects he carries. Until the genomic era, we progeny tested a bull to see how his progeny performed, and from that inferred the gene effects he carries. In genomic prediction we try to use DNA information to estimate the gene effects the bull carries. A good bull has more of the favorable gene effects and fewer unfavorable gene effects. A bad bull has fewer favorable gene effects and more unfavorable gene effects. But the good bull still carries unfavorable gene effects and can produce bad progeny, depending on the chromosomes and genes that each progeny inherits.
We tend to think of EPDs as associated with an animal. But in genomics we associate an EPD (breeding value) with a segment of DNA to predict the genetic merit of the bull.
In current genomic prediction, we don't test the actual DNA variants causal or responsible for the variation in the trait. We use DNA variants spread throughout the genome. There are relationships between DNA variants on SNP chips and the actual causal variants. We call this relationship linkage disequilibrium. So, even though we are not testing the causal variant, we can use the linkage disequilibrium to predict the causal variant inherited based on the DNA variants in the same chromosome neighborhood.
Using the DNA variants on the SNP chip, Garrick's group came up with a prediction equation. Each SNP gets an effect in this model. The animal's DNA is then tested at GeneSeek, the prediction equation is applied, and a molecular breeding value is calculated. This molecular breeding value is then combined with pedigree and trait data to produce a genomic-enhanced EPD. The American Hereford Association now has over 20,000 animals genotyped. In the last month, 2,000 Hereford animals were genotyped.
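As a rough illustration of how a prediction equation turns a genotype into a molecular breeding value, here is a minimal Python sketch. It assumes a simple additive model in which each SNP is coded as the number of copies of one allele (0, 1 or 2) and multiplied by an estimated effect. The function name, the coding and the effect values are hypothetical and are not taken from the presentation; the actual model and the later step of combining the result with pedigree and trait data are not shown.

```python
from typing import Sequence


def molecular_breeding_value(genotypes: Sequence[int],
                             snp_effects: Sequence[float]) -> float:
    """Sum of (allele count x estimated SNP effect) over all SNPs.

    Assumption: genotypes are coded 0, 1 or 2 copies of the B allele,
    and snp_effects holds one estimated effect per SNP.
    """
    if len(genotypes) != len(snp_effects):
        raise ValueError("one effect is needed per genotyped SNP")
    return sum(g * a for g, a in zip(genotypes, snp_effects))


# Hypothetical effects for five SNPs (trait units per copy of the B allele).
effects = [0.5, -0.25, 1.0, 0.0, -0.75]
# Hypothetical genotype of one bull at the same five SNPs.
bull = [2, 0, 1, 1, 2]

print(molecular_breeding_value(bull, effects))  # 0.5
```

A real SNP chip has tens of thousands of markers, but the calculation has the same shape: one effect per SNP, summed over the animal's genotype.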
Usually only a very small portion of a contemporary group is genotyped (often one animal or none per contemporary group). When a larger portion of the contemporary group is genotyped, many of the issues (statistical difficulties) with genomic prediction go away.
Make sure an animal has a registration number before doing DNA testing. Make sure the registration number is correct when submitting samples.
If the genotyped sex does not match the recorded sex, Garrick's group doesn't use these genotypes.
Also, we can't use duplicate genotypes. If an animal has had multiple DNA tests run, a failed test is removed, matching genotypes are merged, and if the genotypes don't match, both sets of genotypes are thrown out.
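The duplicate-handling rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: genotypes are lists of allele counts with None for a missing call, a "failed" test is taken to mean a low call rate, and the thresholds `min_call_rate` and `max_mismatch_rate` are hypothetical values, not ones given in the presentation.

```python
from typing import List, Optional

Genotype = List[Optional[int]]  # allele count per SNP; None marks a missing call


def call_rate(g: Genotype) -> float:
    """Fraction of SNPs with a non-missing call."""
    return sum(c is not None for c in g) / len(g) if g else 0.0


def mismatch_rate(a: Genotype, b: Genotype) -> float:
    """Fraction of SNPs called in both tests where the calls disagree."""
    both = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not both:
        return 1.0  # nothing to compare, so treat as not matching
    return sum(x != y for x, y in both) / len(both)


def resolve_duplicates(tests: List[Genotype],
                       min_call_rate: float = 0.9,       # hypothetical threshold
                       max_mismatch_rate: float = 0.01   # hypothetical threshold
                       ) -> Optional[Genotype]:
    """Return one usable genotype for the animal, or None if none can be used."""
    # Rule 1: remove failed tests.
    passed = [g for g in tests if call_rate(g) >= min_call_rate]
    if not passed:
        return None
    merged = list(passed[0])
    for other in passed[1:]:
        # Rule 3: if the genotypes don't match, throw all of them out.
        if mismatch_rate(merged, other) > max_mismatch_rate:
            return None
        # Rule 2: merge matching genotypes, filling in missing calls.
        merged = [x if x is not None else y for x, y in zip(merged, other)]
    return merged
```

For example, two matching tests where one has a missing call are merged into a single complete genotype, while two tests that disagree at many SNPs return None.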
Garrick's group can also do breed verification.
We can also do parentage verification. If the dam has an AA genotype at a DNA variant, then the calf cannot have a BB genotype at that DNA variant (the calf can be AB or AA). Breeds should begin considering testing all animals in the herd. This would allow us to identify unknown parentage, not just verify the reported parentage.
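The exclusion rule described above (an AA parent cannot have a BB calf, and vice versa) can be sketched in Python. This is a minimal sketch, assuming genotypes are given as the strings "AA", "AB" and "BB" in the same SNP order for every animal. The function names and the `max_exclusions` tolerance for genotyping errors are hypothetical, not part of the presentation.

```python
from typing import Dict, List


def count_exclusions(parent: List[str], calf: List[str]) -> int:
    """Count SNPs where parent and calf are opposite homozygotes (AA vs BB)."""
    if len(parent) != len(calf):
        raise ValueError("parent and calf must be genotyped at the same SNPs")
    return sum(1 for p, c in zip(parent, calf) if {p, c} == {"AA", "BB"})


def parentage_consistent(parent: List[str], calf: List[str],
                         max_exclusions: int = 0) -> bool:
    """True if the reported parent is not excluded by the genotypes.

    max_exclusions is a hypothetical allowance for genotyping errors.
    """
    return count_exclusions(parent, calf) <= max_exclusions


def find_candidate_parents(calf: List[str],
                           herd: Dict[str, List[str]],
                           max_exclusions: int = 0) -> List[str]:
    """With the whole herd genotyped, list every animal not excluded as a parent."""
    return [name for name, g in herd.items()
            if parentage_consistent(g, calf, max_exclusions)]


dam = ["AA", "AB", "BB", "AA"]
calf = ["AB", "BB", "AB", "AA"]
print(parentage_consistent(dam, calf))  # True
```

The last function reflects the point about testing all animals in the herd: once every potential parent is genotyped, the same exclusion check can be run against each of them to look for unknown parents, not only to confirm the reported one.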
The first SNP test widely used in cattle was the Illumina 50K SNP chip (the earlier Affymetrix chip was not widely adopted). In the hope of using genomic predictions across breeds, a 700K SNP chip was created.
GeneSeek then created 25K and 30K chips (the GGP LD chips).
GeneSeek also created a 70K chip and later a 150K GGP uHD chip.
Initially, when genomic selection was being implemented, reduced marker panels were created. Pfizer (now Zoetis) recognized they could get a bulk discount on the 50K chip and never released a low-density product.
Next step forward
Garrick gave a list of developments he is working on, namely:
- Single-step analyses (no blending or interims).
- Actual rather than Approximated Accuracies.
- More regular runs.
- Readily incorporate new traits.
Bruce Golden and Garrick are creating a new evaluation system called the BOLT CUDA evaluation system. At this point in the presentation, Garrick took off his ISU hat and put on his Theta Solutions hat (private company). The difference in this evaluation is the use of graphics cards (GPUs) instead of CPUs. Water cooling systems are also used in GPU systems.
Theta Solutions is currently using computers with 4 graphics cards.
Using traditional computing, an analysis would take an hour and a half. Parallel computing gets this down to 25 minutes. With graphics cards, it can be done in 1 to 2 minutes.
With 4.5 million animals, genetic predictions for Herefords can be solved in less than 12 minutes. It takes another hour to calculate the real accuracies.
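To make the size of these gains concrete, the run times quoted above can be turned into speedup factors with a few lines of Python. The only inputs are the figures from the talk (an hour and a half, 25 minutes, and 1 to 2 minutes); the function name is just for illustration.

```python
def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """How many times faster the new run is than the baseline run."""
    return baseline_minutes / new_minutes


traditional = 90.0  # an hour and a half, in minutes

print(speedup(traditional, 25.0))  # parallel computing: 3.6
print(speedup(traditional, 2.0))   # graphics cards, slower end: 45.0
print(speedup(traditional, 1.0))   # graphics cards, faster end: 90.0
```

So parallel computing is a bit under 4 times faster than the traditional run, while the graphics cards are roughly 45 to 90 times faster.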
Conclusions
Garrick gave five take-home points for producers:
- Genomic analyses are developing rapidly
- New statistical models
- New marker panels
- More animals genotyped
- New computing resources