The IGS Implementation of BOLT

June 14, 2016

Bruce Golden
Theta Solutions

Over the last 50 years, the statistical methods used to calculate genetic predictions (EPDs) for livestock have evolved. What drove this evolution? Knowledge of statistical models? New methods? Data? Enabling computer technology? Golden believes the driver of better models has been a desire to increase the accuracy of prediction.

In the past, Golden and Garrick wrote grants to fund the development of genetic prediction software. That avenue appears to have dried up, so they started a company, Theta Solutions, to fund the development of genetic prediction software. The latest genetic prediction runs contained 46,000 animals with genomic data.

Theta Solutions uses graphics processing units (GPUs), originally built for video gaming, to get high-performance computing at a relatively low cost. The BOLT software focuses on custom turnkey analyses: once the system is set up, all one needs to do is feed it data.

Using non-GPU computing, Golden can solve 51 million equations in 1649 seconds. The fastest GPU implementation took 78 seconds.
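
For a sense of scale, the two timings above can be turned into a speedup factor. This is a minimal Python sketch that uses only the numbers quoted in the talk; the variable names are illustrative.

```python
# Timings quoted in the talk: seconds needed to solve 51 million equations.
n_equations = 51_000_000
cpu_seconds = 1649   # non-GPU computing
gpu_seconds = 78     # fastest GPU implementation

# Ratio of the two wall-clock times.
speedup = cpu_seconds / gpu_seconds
print(f"GPU speedup: about {speedup:.1f}x")

# Throughput in equations solved per second on each platform.
print(f"CPU: {n_equations / cpu_seconds:,.0f} equations per second")
print(f"GPU: {n_equations / gpu_seconds:,.0f} equations per second")
```

The GPU run works out to roughly 21 times faster than the non-GPU run.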

Why do we use a Bayesian sampler for solving mixed models?

No accuracy approximation bias
Can get PE (prediction error) covariance
Can apply marker selection methods
Can include prior information

With traditional methods, each sample took 23 seconds; the new implementation can draw a sample in 2 seconds. (Gibbs sampling is like turning a statistical crank over and over to solve very complex equations; each sample is one turn of the crank.) They also parallelized the sampling, further speeding up the process. This parallel processing is like working cattle with hundreds of chutes rather than a single chute.
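
The "crank" analogy can be made concrete with a toy example. The sketch below is not BOLT's implementation. It is a generic single-site Gibbs sampler for a small linear system C b = r, with the residual variance fixed at 1 as a simplifying assumption. The function name `gibbs_solve` and the 2x2 system are hypothetical, chosen only for illustration.

```python
import random

def gibbs_solve(C, r, n_samples=20000, burn_in=500, sigma2=1.0, seed=1):
    """Toy single-site Gibbs sampler for the linear system C b = r.

    Each pass over all unknowns is one "turn of the crank". The mean of the
    retained samples estimates the solution, and their variance estimates
    the prediction error variance of each unknown.
    """
    rng = random.Random(seed)
    n = len(r)
    b = [0.0] * n
    total = [0.0] * n
    total_sq = [0.0] * n
    for it in range(n_samples):
        for i in range(n):
            # Conditional mean of b[i] given the current values of the others.
            adjusted = r[i] - sum(C[i][j] * b[j] for j in range(n) if j != i)
            mean = adjusted / C[i][i]
            sd = (sigma2 / C[i][i]) ** 0.5
            b[i] = rng.gauss(mean, sd)
        if it >= burn_in:  # discard early samples taken before convergence
            for i in range(n):
                total[i] += b[i]
                total_sq[i] += b[i] * b[i]
    kept = n_samples - burn_in
    means = [t / kept for t in total]
    variances = [sq / kept - m * m for sq, m in zip(total_sq, means)]
    return means, variances

# Hypothetical 2x2 system; the exact solution is b = [1/11, 7/11].
C = [[4.0, 1.0], [1.0, 3.0]]
r = [1.0, 2.0]
means, pev = gibbs_solve(C, r)
print("posterior means:", means)
print("prediction error variances:", pev)
```

Because the spread of the samples is observed directly, the prediction error variance comes from the sampler itself rather than from an approximation, which is the idea behind the "no accuracy approximation bias" point above.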

There are three ways to combine genomics with traditional EPDs:

blending Genomic BLUP (combine pedigree prediction with genomic prediction, two separate analyses)
single-step Genomic BLUP (combine pedigree relationships and genomic relationships, one analysis)
hybrid model (single step with marker effects)

Single-step genomic models outperform traditional EPDs. However, the hybrid model outperforms both, especially for unproven animals. The purpose of the hybrid model is to squeeze more information out of the data.

The team is currently analyzing a data set with 6 million pedigree records, 4.8 million birth weight records, 1.9 million post-weaning gain records, and 46,402 genotyped animals, using 44,414 SNP markers.

Hybrid models allow:

Marker selection models
Multiple components, e.g., maternal effects
Multiple traits, with different markers for different traits
Extra polygenic effects
MSRP approach (identifying SNPs with effects across traits and breeds)

IGS analysis enhancements and refinements

Superior marker effects model
Superior accuracy computation
New stayability approach
New breed effects model
Carcass traits solved together with birth weight
New method for external EPDs

Decker's Take Home Message

The use of a hybrid model is simply improving methods for computing EPDs. These new predictions will be looked at and scrutinized by many sets of eyes. The breed associations and their partners know how important accuracy and reliability are.

You may understand very little about this post. There are no bogeymen or tricks with this method. It is simply a better way to estimate accurate EPDs from data.
