---
title: 'GWAS 6 Practicals'
author: "Matti Pirinen, University of Helsinki"
date: "6-Feb-2019"
output:
  html_document: default
urlcolor: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
set.seed(19)
```

Let's study the effect of covariates in logistic regression.

```{r}
source("https://www.mv.helsinki.fi/home/mjxpirin/log_regression_covariate_functions.R")
```

The sourced file provides the function

`binary.covariate(K, freq.G, or.G, freq.X, or.X, ncases, ncontrols, population.controls = FALSE)`

INPUT

* `K`, the (target) prevalence of the disease
* `freq.G`, frequency of the risk allele in the general population
* `or.G`, odds-ratio for each copy of the risk allele
* `freq.X`, frequency of the binary risk factor (exposure)
* `or.X`, odds-ratio of the risk factor
* `ncases`, number of cases in the case-control sample
* `ncontrols`, number of controls in the case-control sample
* `population.controls`, if `TRUE` then controls have general population frequencies, otherwise controls have proper control frequencies.

Model M with the covariate is always more powerful in population samples.

* How does the power difference depend on K in population samples? Try K = 0.001 and K = 0.2.
* What if the risk factor is less frequent (freq.X = 0.1) vs. more frequent (freq.X = 0.8) in an ascertained case-control study?

```{r}
K = 0.001      # disease prevalence
freq.G = 0.3   # risk allele frequency
or.G = 1.2     # odds-ratio per copy of the risk allele
freq.X = 0.1   # frequency of the binary risk factor
or.X = 10      # odds-ratio of the risk factor
N = 100000     # total sample size
ncases = K*N          # cases in proportion K, as in a population sample
ncontrols = (1-K)*N
binary.covariate(K, freq.G, or.G, freq.X, or.X, ncases, ncontrols, population.controls = FALSE)
```
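
The first question can also be explored with a small simulation that does not rely on the sourced function. The chunk below is a minimal sketch, not the implementation of `binary.covariate` (whose internals are not shown here). It assumes that G follows Hardy-Weinberg proportions, that X is independent of G, and that disease status follows a logistic model with additive effects on the log-odds scale. The helper name `sim.z` and its output labels are hypothetical and were chosen only for this illustration. For each K it draws one population sample and reports the z-score of G from a model without X and from a model with X.

```{r}
# Illustrative sketch only: sim.z() is a hypothetical helper, not part of the sourced file.
set.seed(19)
sim.z <- function(K, freq.G = 0.3, or.G = 1.2, freq.X = 0.1, or.X = 10, N = 100000) {
  G <- rbinom(N, size = 2, prob = freq.G)  # risk allele count, HWE assumed
  X <- rbinom(N, size = 1, prob = freq.X)  # binary exposure, assumed independent of G
  lin <- log(or.G) * G + log(or.X) * X     # log-odds contribution of G and X
  # Choose the intercept so that the average disease risk in the sample equals K
  b0 <- uniroot(function(b) mean(plogis(b + lin)) - K, interval = c(-20, 20))$root
  Y <- rbinom(N, size = 1, prob = plogis(b0 + lin))  # disease status
  # z-score of G without and with the covariate X in the model
  z.G <- summary(glm(Y ~ G, family = "binomial"))$coefficients["G", "z value"]
  z.M <- summary(glm(Y ~ G + X, family = "binomial"))$coefficients["G", "z value"]
  c(K = K, prevalence = mean(Y), z.without.X = z.G, z.with.X = z.M)
}
res <- rbind(sim.z(K = 0.2), sim.z(K = 0.001))
print(round(res, 3))
```

A single replicate is noisy, especially for K = 0.001 where only about 100 cases are expected, so repeating the simulation several times (or increasing `N`) gives a clearer picture of how the two z-scores compare.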