---
title: 'GWAS 6 Practicals'
author: "Matti Pirinen, University of Helsinki"
date: "6-Feb-2019"
output:
  html_document: default
urlcolor: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
set.seed(19)
```


Let's study the effect of covariates in logistic regression.

```{r}
source("https://www.mv.helsinki.fi/home/mjxpirin/log_regression_covariate_functions.R")
```
    binary.covariate(K, freq.G, or.G, freq.X, or.X, ncases, ncontrols, population.controls = FALSE)
    #INPUT
    #K, the (target) prevalence of the disease
    #freq.G, frequency of the risk allele in the general population
    #or.G, odds ratio for each copy of the risk allele
    #freq.X, frequency of the binary risk factor (exposure)
    #or.X, odds ratio of the risk factor
    #ncases, number of cases in the case-control sample
    #ncontrols, number of controls in the case-control sample
    #population.controls, if TRUE then controls have general population frequencies,
    #                     otherwise controls have the frequencies of disease-free individuals.

Model M with the covariate is always more powerful than the model without it in population samples.
How does the power difference depend on K in population samples?
Try K = 0.001 and K = 0.2.
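
To make the population-sample comparison concrete without relying on the sourced script, here is a minimal simulation sketch. It assumes a logistic model with additive effects of G and X on the log-odds scale, G in Hardy-Weinberg equilibrium, and G independent of X. The helper `sim.population.power` and its arguments `n`, `n.rep` and `alpha` are hypothetical names introduced for illustration; this is not the implementation behind `binary.covariate`.

```{r}
# Hypothetical helper (not part of the sourced script): estimates the power to
# detect G in a population sample, with and without adjusting for X.
sim.population.power <- function(K, freq.G = 0.3, or.G = 1.2, freq.X = 0.1, or.X = 10,
                                 n = 5000, n.rep = 200, alpha = 0.05) {
  b.G <- log(or.G)
  b.X <- log(or.X)
  # Find the intercept mu that gives population prevalence K
  # (assumes HWE for G and independence of G and X).
  grid <- expand.grid(g = 0:2, x = 0:1)
  w <- dbinom(grid$g, 2, freq.G) * dbinom(grid$x, 1, freq.X)
  prev.diff <- function(mu) sum(w * plogis(mu + b.G * grid$g + b.X * grid$x)) - K
  mu <- uniroot(prev.diff, interval = c(-30, 30), tol = 1e-10)$root

  pvals <- matrix(NA_real_, nrow = n.rep, ncol = 2,
                  dimnames = list(NULL, c("G.only", "G.plus.X")))
  for (i in seq_len(n.rep)) {
    # Population sample: case status is drawn from the model, not fixed by design.
    d <- data.frame(G = rbinom(n, 2, freq.G), X = rbinom(n, 1, freq.X))
    d$Y <- rbinom(n, 1, plogis(mu + b.G * d$G + b.X * d$X))
    fit0 <- glm(Y ~ G, data = d, family = binomial)      # without the covariate
    fit1 <- glm(Y ~ G + X, data = d, family = binomial)  # with the covariate
    pvals[i, "G.only"] <- summary(fit0)$coefficients["G", "Pr(>|z|)"]
    pvals[i, "G.plus.X"] <- summary(fit1)$coefficients["G", "Pr(>|z|)"]
  }
  # Proportion of replicates in which G is significant at level alpha.
  colMeans(pvals < alpha, na.rm = TRUE)
}

sim.population.power(K = 0.2)
```

For K = 0.001 a sample of 5000 contains only a handful of cases, so a much larger `n` (and probably a smaller `n.rep` to keep the run time reasonable) would be needed before the two columns can be compared meaningfully.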

What if the risk factor is less frequent (freq.X = 0.1) vs. more frequent (freq.X = 0.8)
in an ascertained case-control study?

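```{r}
K = 0.001        # target disease prevalence
freq.G = 0.3     # risk allele frequency
or.G = 1.2       # odds ratio per copy of the risk allele
freq.X = 0.1     # frequency of the binary risk factor
or.X = 10        # odds ratio of the risk factor
N = 100000       # total sample size
ncases = K*N     # case and control counts in population proportions
ncontrols = (1-K)*N
binary.covariate(K, freq.G, or.G, freq.X, or.X, 
                 ncases, ncontrols, population.controls = FALSE)
```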
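
When thinking about the ascertained case-control question, it can help to see how strongly the exposure is enriched among cases for a given freq.X. The sketch below applies Bayes' rule under a simplified model that ignores the genotype and contains only the binary exposure X. The helper `exposure.freqs` is a hypothetical name introduced here and is not part of the sourced script.

```{r}
# Hypothetical helper: frequency of the exposure X among cases and among
# disease-free controls, assuming a logistic model with X as the only predictor.
exposure.freqs <- function(K, freq.X, or.X) {
  # Intercept mu chosen so that the population prevalence equals K.
  prev.diff <- function(mu) {
    freq.X * plogis(mu + log(or.X)) + (1 - freq.X) * plogis(mu) - K
  }
  mu <- uniroot(prev.diff, interval = c(-30, 30), tol = 1e-12)$root
  p1 <- plogis(mu + log(or.X))  # disease risk in the exposed
  # Bayes' rule: P(X = 1 | case) and P(X = 1 | control).
  c(cases = freq.X * p1 / K,
    controls = freq.X * (1 - p1) / (1 - K))
}

# Columns correspond to freq.X = 0.1 and freq.X = 0.8.
sapply(c(0.1, 0.8), exposure.freqs, K = 0.001, or.X = 10)
```

The larger the gap between the case and control rows, the more the exposure is associated with case status in an ascertained sample, which is worth keeping in mind when interpreting the output of `binary.covariate`.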