Journal Club

05/21/2020 Hosted by Tianjing

Zhou, Xiang, and Matthew Stephens. “Genome-wide efficient mixed-model analysis for association studies.” Nature genetics 44.7 (2012): 821-824.

Jiayi: The goal of this paper is to speed up the exact computation of standard test statistics for GWAS. The authors propose a new method called GEMMA, which is about n times faster than the widely used method EMMA, where n is the number of individuals included in the GWAS analysis. More specifically, in EMMA each marker requires an eigendecomposition with computational complexity O(n^3). In GEMMA, the authors replace this per-marker eigendecomposition with an inexpensive matrix-vector multiplication, which costs O(n^2), followed by a few recursions involving only scalar multiplications.
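
To make the complexity argument concrete, here is a minimal Python sketch of the underlying idea: decompose the relatedness matrix once, then handle each marker with a single matrix-vector product. It is not GEMMA itself. The variance ratio `delta` is held fixed here (GEMMA re-optimizes it for every marker), covariates and the intercept are omitted, and the function names and simulated inputs are made up for illustration.

```python
import numpy as np

def rotate_once(K, y):
    """One-time O(n^3) step: eigendecompose the relatedness matrix K = U diag(d) U^T."""
    d, U = np.linalg.eigh(K)
    return d, U, U.T @ y

def marker_effect(x, d, U, y_rot, delta):
    """Per-marker step: GLS effect estimate under covariance proportional to delta*K + I."""
    x_rot = U.T @ x                  # O(n^2) matrix-vector product
    w = 1.0 / (delta * d + 1.0)      # covariance is diagonal after rotation
    return np.sum(w * x_rot * y_rot) / np.sum(w * x_rot * x_rot)

def marker_effect_naive(x, K, y, delta):
    """Reference computation that inverts the full n x n covariance (O(n^3) per marker)."""
    V_inv = np.linalg.inv(delta * K + np.eye(len(y)))
    return (x @ V_inv @ y) / (x @ V_inv @ x)

# Hypothetical simulated data, only to check that both routes agree.
rng = np.random.default_rng(0)
n = 50
G = rng.standard_normal((n, 200))
K = G @ G.T / 200                    # toy relatedness matrix
y = rng.standard_normal(n)           # toy phenotype
x = rng.standard_normal(n)           # toy marker
d, U, y_rot = rotate_once(K, y)
fast = marker_effect(x, d, U, y_rot, delta=0.5)
slow = marker_effect_naive(x, K, y, delta=0.5)
```

The two estimates agree up to floating-point error, but only the naive version repeats an O(n^3) operation for every marker.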

Zhou, Xiang, and Matthew Stephens. “Efficient multivariate linear mixed model algorithms for genome-wide association studies.” Nature methods vol. 11,4 (2014): 407-9.

Zigui: The goal of this paper is to provide an efficient approach for fitting the multivariate linear mixed model to large-scale data. The authors provide efficient versions of the Expectation-Maximization (EM) and Newton-Raphson (NR) algorithms, so the likelihood function can be maximized much faster than with other algorithms.

Jiang, J., Cole, J. B., Freebern, E., Da, Y., VanRaden, P. M., & Ma, L. (2019). Functional annotation and Bayesian fine-mapping reveals candidate genes for important agronomic traits in Holstein bulls. Communications Biology, 2(1).

Debbie: In this paper, the authors used highly reliable genotype and phenotype data from the 1000 Bull Genomes Project (27,000 bulls) to perform both single-trait and multi-trait GWAS. First, significant regions were identified from the GWAS Manhattan plots. Then, Bayesian fine-mapping was applied to find QTLs within those regions. Functional annotation information (from the SnpEff software) was used to inform the variant priors, which substantially changed the posterior probabilities of causality. The results showed that markers with moderate effects are more likely to be QTLs, and the authors also found that markers annotated as lying within conserved DNA regions are more likely to be QTLs.

Gorjanc G, Cleveland MA, Houston RD, Hickey JM. Potential of genotyping-by-sequencing for genomic selection in livestock populations. Genet Sel Evol. 2015;47:12.

Keyu: The aim of this paper is to evaluate the potential of genotyping-by-sequencing (GBS) data, compared to genotype data, for genomic selection in livestock populations. To compare the two data types, the authors used ridge regression to predict EBVs and evaluated the accuracy and bias of those EBVs. The results showed that low-coverage GBS data with a larger number of markers gives higher prediction accuracy than genotype data.
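
The evaluation described above can be sketched in Python as follows. This is an illustrative sketch, not the paper's pipeline. It assumes a common convention in which accuracy is the correlation between true and estimated breeding values and bias is the slope of the regression of true on estimated values. The simulated genotypes, effect sizes, and the penalty `lam` are all made up.

```python
import numpy as np

def ridge_marker_effects(M, y, lam):
    """Ridge regression estimates of marker effects: (M'M + lam*I)^-1 M'y."""
    p = M.shape[1]
    return np.linalg.solve(M.T @ M + lam * np.eye(p), M.T @ y)

def accuracy_and_bias(true_bv, ebv):
    """Accuracy = cor(true BV, EBV); bias = slope of true BV regressed on EBV."""
    accuracy = np.corrcoef(true_bv, ebv)[0, 1]
    slope = np.cov(true_bv, ebv)[0, 1] / np.var(ebv, ddof=1)  # 1.0 means unbiased
    return accuracy, slope

# Hypothetical simulated population (all settings are assumptions).
rng = np.random.default_rng(1)
n_train, n_test, p = 300, 100, 50
M = rng.binomial(2, 0.3, size=(n_train + n_test, p)).astype(float)
M -= M.mean(axis=0)                       # center genotype codes
effects = rng.normal(0.0, 0.2, p)         # true marker effects
true_bv = M @ effects                     # true breeding values
y = true_bv + rng.normal(0.0, 1.0, n_train + n_test)

# Train on one set of animals, predict EBVs for the rest.
beta_hat = ridge_marker_effects(M[:n_train], y[:n_train], lam=10.0)
ebv = M[n_train:] @ beta_hat
acc, bias = accuracy_and_bias(true_bv[n_train:], ebv)
```

To compare two genotyping strategies with this sketch, build `M` once per data type and compare the resulting `acc` and `bias` on the same validation animals.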

Erbe M, Hayes B J, Matukumalli L K, et al. Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels. Journal of dairy science, 2012, 95(7): 4114-4129.

Tianjing: In this paper, the authors proposed a Bayesian Alphabet method named BayesR. The prior assumes that marker effects are independent and identically distributed, following a mixture distribution with one class of markers with no effect and three further classes of small, medium, and large effects. Thus, BayesR can be regarded as an extension of BayesCπ with a more flexible mixture prior for the marker effects.
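
The four-class mixture prior described above can be sketched in Python by drawing marker effects from it. This only illustrates the prior, not the BayesR sampler. The mixing proportions and class variances below are illustrative assumptions, and the function name is hypothetical.

```python
import numpy as np

def sample_mixture_effects(n_markers, proportions, variances, rng):
    """Draw i.i.d. marker effects from a mixture of zero-mean normals.

    A class with variance 0 yields effects that are exactly zero.
    """
    classes = rng.choice(len(proportions), size=n_markers, p=proportions)
    sd = np.sqrt(np.asarray(variances, dtype=float))[classes]
    effects = rng.normal(0.0, 1.0, n_markers) * sd
    return classes, effects

rng = np.random.default_rng(2)
# Assumed values: class 0 = no effect, classes 1-3 = small, medium, large.
proportions = [0.95, 0.03, 0.015, 0.005]
variances = [0.0, 1e-4, 1e-3, 1e-2]
classes, effects = sample_mixture_effects(10_000, proportions, variances, rng)
```

Passing only two classes, one with variance 0 and one with a positive variance, gives a BayesCπ-style prior, which is the sense in which the four-class prior is more flexible.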