Tuesday, October 3, 2017

Reverend Bayes and Cattle Breeding

Reverend Thomas Bayes, via Wikimedia Commons
You may be asking yourself, who is Reverend Bayes and what does he have to do with cattle? The answer to this question will clear up a major misconception in cattle genetics.

Reverend Bayes was an 18th-century Presbyterian minister who was also trained in logic. Because of Bayes’ work on probability, an approach to statistics called Bayesian statistics is named after him. In Bayesian statistics, we start with a prior belief (prior probability). As more information and data are gathered, we update this prior belief. We call the updated belief a posterior belief. We continue this process as we collect additional data. Further, a key tenet of Bayesian statistics is evaluating the methods (i.e., models) used in our analysis. Statisticians and scientists did not frequently use this system of statistics in the early 20th century. But, with increased computing power, Bayesian statistics has become very popular in the 21st century.

White Bull, by Lutz Koch (CC BY-NC-ND 2.0)
Cattle genetic prediction is very much Bayesian. We start with a prior belief. For genetic predictions (e.g., EPDs), this prior is the parental average. The parental average is half of the sire’s genetic prediction plus half of the dam’s genetic prediction. This prior prediction is based not only on quantitative genetics theory but on biology as well. If we mated the same sire and dam hundreds of times to produce hundreds of progeny, the average breeding value of those progeny would be the parental average.
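To make the arithmetic concrete, here is a minimal sketch of the parental average. The function name and the EPD values are made up for illustration; they do not come from any breed association's evaluation.

```python
def parent_average(sire_epd: float, dam_epd: float) -> float:
    """Prior prediction for a calf: half the sire's prediction plus half the dam's."""
    return 0.5 * sire_epd + 0.5 * dam_epd


# Hypothetical weaning weight EPDs, in pounds
calf_prior = parent_average(sire_epd=60.0, dam_epd=40.0)
print(calf_prior)  # 50.0
```

Before the calf has any data of its own, this number is the best prediction we have for it.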

When we receive new data on an animal, an important step happens. We update our prediction! Every calf receives a random sample of its sire’s and dam’s genetics. As we collect data on the animal, whether its own performance, the performance of its progeny, or genomic test results, we gain the information needed to sort out this random sample of genetic effects the animal received from its parents. The amount of variation associated with this random shuffle of genes between generations is quite large. In typical situations, the genetic variance among full siblings (same sire and dam) is equal to half of the genetic variance in the entire population. This random sample of genes is unknown. It takes data to identify how an individual animal’s breeding value differs from its parental average. Thus, each update based on new data improves the prediction, moving it closer to the animal’s unknown true breeding value. We continue this updating process until we reach enough certainty that we call this animal a “proven” parent.
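The updating idea can be sketched with a textbook normal-normal Bayesian update. This is a deliberately simplified illustration, not the model any breed association actually runs. It assumes the parental average is known exactly, that each record is an independent noisy measurement of the animal's breeding value, and that the variances (all hypothetical numbers) are known. The names `Belief` and `update` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    mean: float      # current predicted breeding value
    variance: float  # remaining uncertainty around that prediction


def update(prior: Belief, record: float, noise_variance: float) -> Belief:
    """Combine a prior belief with one noisy record (normal-normal update)."""
    # Weight on the new record: large when the prior is uncertain,
    # small when the record itself is noisy.
    weight = prior.variance / (prior.variance + noise_variance)
    mean = prior.mean + weight * (record - prior.mean)
    variance = (1.0 - weight) * prior.variance
    return Belief(mean, variance)


# Hypothetical values for illustration only
additive_variance = 400.0  # genetic variance in the population
noise_variance = 600.0     # non-genetic variation in a single record
parental_average = 50.0

# The prior is centered on the parental average. Its uncertainty is the
# random sample of genes, assumed here to be half the genetic variance.
belief = Belief(mean=parental_average, variance=0.5 * additive_variance)

for record in [80.0, 70.0, 75.0]:
    belief = update(belief, record, noise_variance)
    print(f"prediction = {belief.mean:.2f}, uncertainty = {belief.variance:.2f}")
```

Each pass through the loop moves the prediction away from the parental average and toward what the data say, and the uncertainty shrinks every time. That shrinking uncertainty is the sketch's version of an animal becoming "proven."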

Predictions change not only when the data are updated but also, occasionally, when we improve the statistical models used to estimate them. In the early 2000s, many dairy breeds added fertility traits to their economic indexes (breeding objectives). Prior to this, the genetic trend for fertility had been negative and the fertility of dairy herds was decreasing. After this change to the indexes, genetic merit for fertility increased and dairy herds became more fertile.

Several beef breed associations have recently switched or are in the process of switching from multi-step genomic prediction to single-step genomic prediction. Concurrent with this switch, several other changes will be made to the statistical models. For example, on July 7, 2017, Angus Genetics Inc. (AGI) updated the evaluation of carcass traits for the American Angus Association. The carcass data producers turned in tended to be either actual carcass data from low-performing bulls that became steers or ultrasound data from the very best animals. This was not a random sample. It was a biased sample from a selected set of animals, which were either the very worst or the very best. In other words, there was selection bias. But, by fitting weaning weights as a correlated trait in the analysis of carcass data, AGI removed this selection bias. This improved the genetic predictions.

As Nate Silver and Philip Tetlock have both written, the best predictions update as they receive new data. Yet, when a recently purchased animal quickly loses value as data and models are updated, this causes anxiety and alarm for many cattle breeders. When confronted with these situations, livestock breeders need to remind themselves of two facts.

  1. When they made the purchase, they used the best available predictions. 
  2. The new answer will serve them better in the long run. 

Imagine if predictions were not updated, or if we stuck our heads in the sand and ignored updated predictions. In this scenario, our customers would become frustrated because what we say in our marketing would not match real-world performance. We can either swallow the bitter pill now and end up with happy, confident customers, or we can ignore the truth and end up with unhappy, distrusting customers.

Cattle breeders can use this updating process to their advantage by collecting and reporting data. Collect all the data that you can afford based on financial, time, and labor resources. Make sure the data you report are accurate (clean data). Do not guess weights or use birth weight tapes. Report actual weights recorded on a scale! Turn in complete data. Record and report data on every calf born on your farm. Do not pick and choose which data you report; report all of it. Otherwise, you are simply biasing the predictions.
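A small simulation shows why picking and choosing which records to report biases the picture. Everything here is hypothetical: the weights are randomly generated, and the "report only the heaviest half" rule is just one example of selective reporting.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical weaning weights (lb) for 200 calves from one sire
all_weights = [random.gauss(550.0, 50.0) for _ in range(200)]

# Selective reporting: only the heaviest half of the calves gets turned in
reported = sorted(all_weights)[100:]

full_mean = statistics.mean(all_weights)
reported_mean = statistics.mean(reported)
print(f"all calves: {full_mean:.1f} lb, reported calves only: {reported_mean:.1f} lb")
```

The average of the reported calves is higher than the average of the whole calf crop, so any prediction built only on the reported records would start from a distorted picture of this sire's progeny.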

Updated predictions are valuable. Although updates may be uncomfortable in the short term, they make genetic predictions more accurate and more precise. Better predictions improve the rate of genetic progress and advance the sustainability, including profitability, of our cattle enterprises. By using updated predictions, we separate the signal from the noise and reap the benefit of modern statistics.

Written for the Fall 2017 MWI Veterinary Update.