diff --git a/02_activities/assignments/assignment_1.html b/02_activities/assignments/assignment_1.html
new file mode 100644
index 0000000..3cc271c
--- /dev/null
+++ b/02_activities/assignments/assignment_1.html
@@ -0,0 +1,928 @@
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
+
+<meta charset="utf-8">
+<meta name="generator" content="quarto-1.8.27">
+
+<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
+
+
+<title>Assignment #1</title>
+<style>
+code{white-space: pre-wrap;}
+span.smallcaps{font-variant: small-caps;}
+div.columns{display: flex; gap: min(4vw, 1.5em);}
+div.column{flex: auto; overflow-x: auto;}
+div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
+ul.task-list{list-style: none;}
+ul.task-list li input[type="checkbox"] {
+  width: 0.8em;
+  margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */ 
+  vertical-align: middle;
+}
+/* CSS for syntax highlighting */
+html { -webkit-text-size-adjust: 100%; }
+pre > code.sourceCode { white-space: pre; position: relative; }
+pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
+pre > code.sourceCode > span:empty { height: 1.2em; }
+.sourceCode { overflow: visible; }
+code.sourceCode > span { color: inherit; text-decoration: inherit; }
+div.sourceCode { margin: 1em 0; }
+pre.sourceCode { margin: 0; }
+@media screen {
+div.sourceCode { overflow: auto; }
+}
+@media print {
+pre > code.sourceCode { white-space: pre-wrap; }
+pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
+}
+pre.numberSource code
+  { counter-reset: source-line 0; }
+pre.numberSource code > span
+  { position: relative; left: -4em; counter-increment: source-line; }
+pre.numberSource code > span > a:first-child::before
+  { content: counter(source-line);
+    position: relative; left: -1em; text-align: right; vertical-align: baseline;
+    border: none; display: inline-block;
+    -webkit-touch-callout: none; -webkit-user-select: none;
+    -khtml-user-select: none; -moz-user-select: none;
+    -ms-user-select: none; user-select: none;
+    padding: 0 4px; width: 4em;
+  }
+pre.numberSource { margin-left: 3em;  padding-left: 4px; }
+div.sourceCode
+  {   }
+@media screen {
+pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
+}
+</style>
+
+
+<script src="assignment_1_files/libs/clipboard/clipboard.min.js"></script>
+<script src="assignment_1_files/libs/quarto-html/quarto.js" type="module"></script>
+<script src="assignment_1_files/libs/quarto-html/tabsets/tabsets.js" type="module"></script>
+<script src="assignment_1_files/libs/quarto-html/axe/axe-check.js" type="module"></script>
+<script src="assignment_1_files/libs/quarto-html/popper.min.js"></script>
+<script src="assignment_1_files/libs/quarto-html/tippy.umd.min.js"></script>
+<script src="assignment_1_files/libs/quarto-html/anchor.min.js"></script>
+<link href="assignment_1_files/libs/quarto-html/tippy.css" rel="stylesheet">
+<link href="assignment_1_files/libs/quarto-html/quarto-syntax-highlighting-ed96de9b727972fe78a7b5d16c58bf87.css" rel="stylesheet" id="quarto-text-highlighting-styles">
+<script src="assignment_1_files/libs/bootstrap/bootstrap.min.js"></script>
+<link href="assignment_1_files/libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
+<link href="assignment_1_files/libs/bootstrap/bootstrap-d6a003b94517c951b2d65075d42fb01b.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="light">
+
+
+</head>
+
+<body class="fullcontent quarto-light">
+
+<div id="quarto-content" class="page-columns page-rows-contents page-layout-article">
+
+<main class="content" id="quarto-document-content">
+
+<header id="title-block-header" class="quarto-title-block default">
+<div class="quarto-title">
+<h1 class="title">Assignment #1</h1>
+</div>
+
+
+
+<div class="quarto-title-meta">
+
+    
+  
+    
+  </div>
+  
+
+
+</header>
+
+
+<section id="assignment-1" class="level2">
+<h2 class="anchored" data-anchor-id="assignment-1">Assignment 1</h2>
+<p>You only need to write lines of code for each question. When answering questions that ask you to identify or interpret something, the length of your response doesn’t matter. For example, if the answer is just ‘yes,’ ‘no,’ or a number, you can just give that answer without adding anything else.</p>
+<p>We will go through comparable code and concepts in the live learning session. If you run into trouble, start by using the <code>help()</code> function in R to get information about the datasets and functions in question. The internet is also a great resource when coding (though note that no outside searches are required by the assignment!). If you do incorporate code from the internet, please cite the source within your code (providing a URL is sufficient).</p>
+<p>Please bring questions that you cannot work out on your own to office hours or work periods, or share them with your peers on Slack. We will work through the issue with you.</p>
+<p>You will need to install PLINK and run the analyses. Please follow the OS-specific setup guide in <a href="../../SETUP.md"><code>SETUP.md</code></a>. PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.</p>
+<section id="question-1-data-inspection" class="level4">
+<h4 class="anchored" data-anchor-id="question-1-data-inspection">Question 1: Data inspection</h4>
+<p>Before fitting any models, it is essential to understand the data. Use R or bash code to answer the following questions about the <code>gwa.qc.A1.fam</code>, <code>gwa.qc.A1.bim</code>, and <code>gwa.qc.A1.bed</code> files, available at the following Google Drive link: <a href="https://drive.google.com/drive/folders/11meVqGCY5yAyI1fh-fAlMEXQt0VmRGuz?usp=drive_link" class="uri">https://drive.google.com/drive/folders/11meVqGCY5yAyI1fh-fAlMEXQt0VmRGuz?usp=drive_link</a>. Please download all three files from this link and place them in <code>02_activities/data/</code>.</p>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Load the packages needed for this assignment</span></span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(data.table)</span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggplot2)</span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(seqminer)</span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(HardyWeinberg)</span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(dplyr)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+</div>
+<ol type="i">
+<li>Read the .fam file. How many samples does the dataset contain?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">wc</span> <span class="at">-l</span> ../data/gwa.qc.A1.fam</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>    4000 ../data/gwa.qc.A1.fam</code></pre>
+</div>
+</div>
+<p>The .fam file has one line per sample, so the dataset contains 4000 samples.</p>
+<ol start="2" type="i">
+<li>What is the ‘variable type’ of the response variable (i.e., continuous or binary)?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="fu">head</span> ../data/gwa.qc.A1.fam</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>0   A2001   0   0   1   -0.694438129641973
+1   A2002   0   0   1   1.85384536141856
+2   A2003   0   0   1   2.08263677761584
+3   A2004   0   0   1   2.73871473943968
+4   A2005   0   0   1   1.34114035564636
+5   A2006   0   0   1   0.416778586749647
+6   A2007   0   0   1   2.38297123290054
+7   A2008   0   0   1   1.51429928826958
+8   A2009   0   0   1   0.718686390529039
+9   A2010   0   0   1   2.08904136245205</code></pre>
+</div>
+</div>
+<p>The response variable (the phenotype, stored in the sixth column of the .fam file) is continuous.</p>
+<ol start="3" type="i">
+<li>Read the .bim file. How many SNPs does the dataset contain?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="fu">wc</span> <span class="at">-l</span> ../data/gwa.qc.A1.bim         </span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>  101083 ../data/gwa.qc.A1.bim</code></pre>
+</div>
+</div>
+<p>The .bim file has one line per variant, so the dataset contains 101083 SNPs.</p>
+</section>
+<section id="question-2-allele-frequency-estimation" class="level4">
+<h4 class="anchored" data-anchor-id="question-2-allele-frequency-estimation">Question 2: Allele Frequency Estimation</h4>
+<ol type="i">
+<li>Load the genotype matrix for SNPs rs1861, rs3813199, rs3128342, and rs11804831 using additive coding. What are the allele frequencies (AFs) for these four SNPs?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="co"># Create SNP list</span></span>
+<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a><span class="bu">printf</span> <span class="st">"%s\n"</span> rs1861 rs3813199 rs3128342 rs11804831 <span class="op">&gt;</span> ../data/snplist_A1.txt</span>
+<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a><span class="fu">cat</span> ../data/snplist_A1.txt</span>
+<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a><span class="co"># Subset the 4 SNPs from the PLINK dataset</span></span>
+<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a><span class="ex">plink2</span> <span class="at">--bfile</span> ../data/gwa.qc.A1 <span class="at">--extract</span> ../data/snplist_A1.txt <span class="at">--make-bed</span> <span class="at">--out</span> ../data/gwa_A1_subset</span>
+<span id="cb8-8"><a href="#cb8-8" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb8-9"><a href="#cb8-9" aria-hidden="true" tabindex="-1"></a><span class="co"># Additive coding on the subsetted SNPs</span></span>
+<span id="cb8-10"><a href="#cb8-10" aria-hidden="true" tabindex="-1"></a><span class="ex">plink2</span> <span class="at">--bfile</span> ../data/gwa_A1_subset <span class="at">--export</span> A <span class="at">--out</span> ../data/gwa_A1_subset_additive</span>
+<span id="cb8-11"><a href="#cb8-11" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb8-12"><a href="#cb8-12" aria-hidden="true" tabindex="-1"></a><span class="co"># Calculate allele frequencies for the 4-SNP subset</span></span>
+<span id="cb8-13"><a href="#cb8-13" aria-hidden="true" tabindex="-1"></a><span class="ex">plink2</span> <span class="at">--bfile</span> ../data/gwa_A1_subset <span class="at">--freq</span> <span class="at">--out</span> ../data/gwa_A1_subset_freq</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+</div>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Load additive-coded genotype matrix</span></span>
+<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a>geno <span class="ot">&lt;-</span> <span class="fu">fread</span>(<span class="st">"../data/gwa_A1_subset_additive.raw"</span>)</span>
+<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a><span class="fu">head</span>(geno)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>     FID    IID   PAT   MAT   SEX PHENOTYPE rs3813199_G rs11804831_T
+   &lt;int&gt; &lt;char&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt;     &lt;num&gt;       &lt;int&gt;        &lt;int&gt;
+1:     0  A2001     0     0     1 -0.694438           2            2
+2:     1  A2002     0     0     1  1.853850           2            2
+3:     2  A2003     0     0     1  2.082640           2            1
+4:     3  A2004     0     0     1  2.738710           2            2
+5:     4  A2005     0     0     1  1.341140           2            1
+6:     5  A2006     0     0     1  0.416779           2            1
+   rs3128342_C rs1861_C
+         &lt;int&gt;    &lt;int&gt;
+1:           2       NA
+2:           2        2
+3:           2        2
+4:           1        2
+5:           1        2
+6:           2       NA</code></pre>
+</div>
+</div>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Read and display allele frequencies of four SNPs</span></span>
+<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a>freq <span class="ot">&lt;-</span> <span class="fu">fread</span>(<span class="st">"../data/gwa_A1_subset_freq.afreq"</span>)</span>
+<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a>af_table <span class="ot">&lt;-</span> freq[, .(<span class="at">SNP =</span> ID, <span class="at">AF =</span> ALT_FREQS)]</span>
+<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a>af_table</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>          SNP        AF
+       &lt;char&gt;     &lt;num&gt;
+1:  rs3813199 0.0569126
+2: rs11804831 0.1543410
+3:  rs3128342 0.3051210
+4:     rs1861 0.0539859</code></pre>
+</div>
+</div>
+<p>The alternate-allele frequencies (AF, taken from the ALT_FREQS column) of the four SNPs are as follows: rs1861 = 0.0539859, rs3813199 = 0.0569126, rs3128342 = 0.3051210, rs11804831 = 0.1543410.</p>
+<ol start="2" type="i">
+<li>What are the minor allele frequencies (MAFs) for these four SNPs?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>maf_table <span class="ot">&lt;-</span> freq[, .(<span class="at">SNP =</span> ID, <span class="at">AF =</span> ALT_FREQS, <span class="at">MAF =</span> <span class="fu">pmin</span>(ALT_FREQS, <span class="dv">1</span> <span class="sc">-</span> ALT_FREQS))]</span>
+<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a>maf_table</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>          SNP        AF       MAF
+       &lt;char&gt;     &lt;num&gt;     &lt;num&gt;
+1:  rs3813199 0.0569126 0.0569126
+2: rs11804831 0.1543410 0.1543410
+3:  rs3128342 0.3051210 0.3051210
+4:     rs1861 0.0539859 0.0539859</code></pre>
+</div>
+</div>
+<p>All four ALT_FREQS values in the PLINK frequency output are below 0.5, so the ALT allele is already the minor allele at each SNP. The MAFs therefore equal the ALT_FREQS values: rs1861 = 0.0539859, rs3813199 = 0.0569126, rs3128342 = 0.3051210, rs11804831 = 0.1543410.</p>
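+<p>As a cross-check on the MAF values, the frequency of the allele counted in an additive (0/1/2) genotype column is the column mean divided by two, and the MAF is the smaller of that frequency and its complement. The sketch below is illustrative only: it rebuilds the <code>rs1861_C</code> column from the genotype counts shown in the recoding check in Question 4 (15, 398, and 3551 samples with 0, 1, and 2 copies of C, plus 36 missing) rather than reading the .raw file, and the variable names are hypothetical.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Rebuild an additive-coded vector from tabulated genotype counts
+# (assumption: counts copied from the Question 4 recoding table)
+g &lt;- c(rep(0, 15), rep(1, 398), rep(2, 3551), rep(NA, 36))
+
+# Frequency of the counted allele (C): mean dosage divided by 2, ignoring missing calls
+af_counted &lt;- mean(g, na.rm = TRUE) / 2
+
+# Minor allele frequency: the smaller of the two allele frequencies
+maf &lt;- min(af_counted, 1 - af_counted)
+
+round(c(af_counted = af_counted, maf = maf), 7)</code></pre></div>
+</div>
+<p>Under these counts the counted allele C is the major allele, and the resulting MAF should agree with the ALT_FREQS value that PLINK reports for rs1861.</p>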
+</section>
+<section id="question-3-hardyweinberg-equilibrium-hwe-test" class="level4">
+<h4 class="anchored" data-anchor-id="question-3-hardyweinberg-equilibrium-hwe-test">Question 3: Hardy–Weinberg Equilibrium (HWE) Test</h4>
+<ol type="i">
+<li>Conduct the Hardy–Weinberg Equilibrium (HWE) test for all SNPs in the .bim file. Then, load the file containing the HWE p-value results and display the first few rows of the resulting data frame.</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="ex">plink2</span> <span class="at">--bfile</span> ../data/gwa.qc.A1 <span class="at">--hardy</span> <span class="at">--out</span> ../data/gwa_qc_A1_hwe</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+</div>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a>hwe <span class="ot">&lt;-</span> <span class="fu">fread</span>(<span class="st">"../data/gwa_qc_A1_hwe.hardy"</span>)</span>
+<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a><span class="fu">head</span>(hwe)  </span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>   #CHROM         ID     A1     AX HOM_A1_CT HET_A1_CT TWO_AX_CT O(HET_A1)
+    &lt;int&gt;     &lt;char&gt; &lt;char&gt; &lt;char&gt;     &lt;int&gt;     &lt;int&gt;     &lt;int&gt;     &lt;num&gt;
+1:      1  rs3737728      G      A      1713      1841       428  0.462330
+2:      1  rs1320565      C      T      3368       589        19  0.148139
+3:      1  rs3813199      G      A      3531       428        12  0.107781
+4:      1 rs11804831      T      C      2820      1061        81  0.267794
+5:      1  rs3766178      T      C      2391      1378       214  0.345970
+6:      1  rs3128342      C      A      1927      1655       382  0.417508
+   E(HET_A1)         P
+       &lt;num&gt;     &lt;num&gt;
+1:  0.447932 0.0437892
+2:  0.145262 0.2734290
+3:  0.107347 1.0000000
+4:  0.261040 0.1133540
+5:  0.350629 0.4158770
+6:  0.424044 0.3302730</code></pre>
+</div>
+</div>
+<ol start="2" type="i">
+<li>What are the HWE p-values for SNPs rs1861, rs3813199, rs3128342, and rs11804831?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Create a subset of the four SNPs</span></span>
+<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a>snps_interest <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"rs1861"</span>, <span class="st">"rs3813199"</span>, <span class="st">"rs3128342"</span>, <span class="st">"rs11804831"</span>)</span>
+<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a>hwe_subset <span class="ot">&lt;-</span> hwe[ID <span class="sc">%in%</span> snps_interest, .(<span class="at">SNP =</span> ID, <span class="at">HWE_P =</span> P)]</span>
+<span id="cb18-4"><a href="#cb18-4" aria-hidden="true" tabindex="-1"></a>hwe_subset</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>          SNP    HWE_P
+       &lt;char&gt;    &lt;num&gt;
+1:  rs3813199 1.000000
+2: rs11804831 0.113354
+3:  rs3128342 0.330273
+4:     rs1861 0.274719</code></pre>
+</div>
+</div>
+<p>The HWE p-values for the four SNPs are as follows: rs1861 = 0.274719, rs3813199 = 1.000000, rs3128342 = 0.330273, rs11804831 = 0.113354.</p>
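+<p>The heterozygosity columns of the .hardy output can be reproduced by hand, which may help when interpreting the test. The following is a minimal sketch with hypothetical variable names, using the rs3813199 genotype counts from the table in part (i) (3531, 428, and 12): the observed heterozygosity is the heterozygote fraction, and the expected heterozygosity under HWE is 2pq.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Genotype counts for rs3813199, copied from the HOM_A1_CT, HET_A1_CT and TWO_AX_CT columns
+hom_a1 &lt;- 3531
+het    &lt;- 428
+hom_ax &lt;- 12
+n      &lt;- hom_a1 + het + hom_ax
+
+# Allele frequencies: p_a1 for A1 (G), q_ax for the other allele (A)
+p_a1 &lt;- (2 * hom_a1 + het) / (2 * n)
+q_ax &lt;- 1 - p_a1
+
+obs_het &lt;- het / n           # compare with O(HET_A1)
+exp_het &lt;- 2 * p_a1 * q_ax   # compare with E(HET_A1)
+
+round(c(q_ax = q_ax, obs_het = obs_het, exp_het = exp_het), 6)</code></pre></div>
+</div>
+<p>The observed and expected values are nearly identical, which is consistent with the HWE p-value of 1 reported for this SNP.</p>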
+</section>
+<section id="question-4-genetic-association-test" class="level4">
+<h4 class="anchored" data-anchor-id="question-4-genetic-association-test">Question 4: Genetic Association Test</h4>
+<ol type="i">
+<li>Conduct a linear regression to test the association between SNP rs1861 and the phenotype. What is the p-value?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Linear regression model</span></span>
+<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a>geno <span class="ot">&lt;-</span> <span class="fu">fread</span>(<span class="st">"../data/gwa_A1_subset_additive.raw"</span>)</span>
+<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a>model <span class="ot">&lt;-</span> <span class="fu">lm</span>(PHENOTYPE <span class="sc">~</span> rs1861_C, <span class="at">data =</span> geno)</span>
+<span id="cb20-4"><a href="#cb20-4" aria-hidden="true" tabindex="-1"></a><span class="fu">summary</span>(model)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>
+Call:
+lm(formula = PHENOTYPE ~ rs1861_C, data = geno)
+
+Residuals:
+    Min      1Q  Median      3Q     Max 
+-3.5439 -0.6850  0.0021  0.6993  3.3268 
+
+Coefficients:
+            Estimate Std. Error t value Pr(&gt;|t|)    
+(Intercept)  0.05238    0.09486   0.552    0.581    
+rs1861_C     0.97382    0.04943  19.703   &lt;2e-16 ***
+---
+Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+
+Residual standard error: 1.003 on 3962 degrees of freedom
+  (36 observations deleted due to missingness)
+Multiple R-squared:  0.08924,   Adjusted R-squared:  0.08901 
+F-statistic: 388.2 on 1 and 3962 DF,  p-value: &lt; 2.2e-16</code></pre>
+</div>
+</div>
+<p>The p-value for the association between SNP rs1861 and the phenotype is &lt;2e-16.</p>
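+<p>The <code>summary()</code> printout floors very small p-values at &lt;2e-16. If the exact value is needed, it can be read from the coefficient matrix instead. The sketch below runs on simulated stand-in data (a hypothetical genotype vector with the same genotype counts as rs1861 and a phenotype generated with an assumed slope of 0.97), not on the assignment dataset, so its numbers will not match the output above.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r">set.seed(1)
+
+# Simulated stand-in data (assumption): genotype counts mimic rs1861,
+# phenotype has slope 0.97 plus unit-variance noise
+g &lt;- rep(c(0, 1, 2), times = c(15, 398, 3551))
+y &lt;- 0.05 + 0.97 * g + rnorm(length(g))
+
+fit &lt;- lm(y ~ g)
+
+# Coefficient matrix with columns Estimate, Std. Error, t value, Pr(&gt;|t|)
+coefs   &lt;- coef(summary(fit))
+beta    &lt;- coefs["g", "Estimate"]
+p_exact &lt;- coefs["g", "Pr(&gt;|t|)"]
+
+c(beta = beta, p_exact = p_exact)</code></pre></div>
+</div>
+<p>Applied to the fitted object above, <code>coef(summary(model))["rs1861_C", "Pr(&gt;|t|)"]</code> would return the unrounded p-value for the assignment data.</p>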
+<ol start="2" type="i">
+<li>How would you interpret the beta coefficient from this regression?</li>
+</ol>
+<p>The regression coefficient for rs1861_C is 0.97382 (p &lt; 2e-16): each additional copy of the C allele is associated with an average increase of approximately 0.97 units in the phenotype. The very small p-value provides strong evidence of an association between this SNP and the phenotype in this sample.</p>
+<ol start="3" type="i">
+<li>Plot the scatterplot of phenotype versus the genotype of SNP rs1861. Add the regression line to the plot.</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb22"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Load the genotype data</span></span>
+<span id="cb22-2"><a href="#cb22-2" aria-hidden="true" tabindex="-1"></a>geno <span class="ot">&lt;-</span> <span class="fu">fread</span>(<span class="st">"../data/gwa_A1_subset_additive.raw"</span>)</span>
+<span id="cb22-3"><a href="#cb22-3" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb22-4"><a href="#cb22-4" aria-hidden="true" tabindex="-1"></a><span class="co"># Remove missing values before plotting</span></span>
+<span id="cb22-5"><a href="#cb22-5" aria-hidden="true" tabindex="-1"></a>plot_data_add <span class="ot">&lt;-</span> geno[<span class="sc">!</span><span class="fu">is.na</span>(rs1861_C) <span class="sc">&amp;</span> <span class="sc">!</span><span class="fu">is.na</span>(PHENOTYPE), ]</span>
+<span id="cb22-6"><a href="#cb22-6" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb22-7"><a href="#cb22-7" aria-hidden="true" tabindex="-1"></a><span class="co"># Create scatterplot with regression line assuming outcome is phenotype and predictor is genotype of SNP rs1861</span></span>
+<span id="cb22-8"><a href="#cb22-8" aria-hidden="true" tabindex="-1"></a>p <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(plot_data_add, <span class="fu">aes</span>(<span class="at">x =</span> <span class="fu">factor</span>(rs1861_C), <span class="at">y =</span> PHENOTYPE)) <span class="sc">+</span></span>
+<span id="cb22-9"><a href="#cb22-9" aria-hidden="true" tabindex="-1"></a>  <span class="fu">geom_point</span>(<span class="at">color =</span> <span class="st">"blue"</span>, <span class="at">alpha =</span> <span class="fl">0.5</span>) <span class="sc">+</span></span>
+<span id="cb22-10"><a href="#cb22-10" aria-hidden="true" tabindex="-1"></a>  <span class="fu">geom_smooth</span>(<span class="fu">aes</span>(<span class="at">group =</span> <span class="dv">1</span>), <span class="at">method =</span> <span class="st">"lm"</span>, <span class="at">color =</span> <span class="st">"red"</span>) <span class="sc">+</span></span>
+<span id="cb22-11"><a href="#cb22-11" aria-hidden="true" tabindex="-1"></a>  <span class="fu">labs</span>(</span>
+<span id="cb22-12"><a href="#cb22-12" aria-hidden="true" tabindex="-1"></a>    <span class="at">title =</span> <span class="st">"Scatterplot showing the association between SNP rs1861 and phenotype"</span>,</span>
+<span id="cb22-13"><a href="#cb22-13" aria-hidden="true" tabindex="-1"></a>    <span class="at">x =</span> <span class="st">"Genotype (number of C alleles)"</span>,</span>
+<span id="cb22-14"><a href="#cb22-14" aria-hidden="true" tabindex="-1"></a>    <span class="at">y =</span> <span class="st">"Phenotype"</span></span>
+<span id="cb22-15"><a href="#cb22-15" aria-hidden="true" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb22-16"><a href="#cb22-16" aria-hidden="true" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span>
+<span id="cb22-17"><a href="#cb22-17" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb22-18"><a href="#cb22-18" aria-hidden="true" tabindex="-1"></a>p</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="assignment_1_files/figure-html/unnamed-chunk-13-1.png" class="img-fluid figure-img" width="672"></p>
+</figure>
+</div>
+</div>
+</div>
+<ol start="4" type="i">
+<li>Convert the genotype coding for rs1861 to recessive coding.</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Load genotype data</span></span>
+<span id="cb23-2"><a href="#cb23-2" aria-hidden="true" tabindex="-1"></a>geno <span class="ot">&lt;-</span> <span class="fu">fread</span>(<span class="st">"../data/gwa_A1_subset_additive.raw"</span>)</span>
+<span id="cb23-3"><a href="#cb23-3" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb23-4"><a href="#cb23-4" aria-hidden="true" tabindex="-1"></a><span class="co"># Convert coding for rs1861 to recessive coding</span></span>
+<span id="cb23-5"><a href="#cb23-5" aria-hidden="true" tabindex="-1"></a>geno_recessive <span class="ot">&lt;-</span> geno</span>
+<span id="cb23-6"><a href="#cb23-6" aria-hidden="true" tabindex="-1"></a>geno_recessive<span class="sc">$</span>rs1861_recessive <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(geno_recessive<span class="sc">$</span>rs1861_C <span class="sc">==</span> <span class="dv">2</span>, <span class="dv">1</span>, <span class="dv">0</span>)</span>
+<span id="cb23-7"><a href="#cb23-7" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb23-8"><a href="#cb23-8" aria-hidden="true" tabindex="-1"></a><span class="co"># Check recoding</span></span>
+<span id="cb23-9"><a href="#cb23-9" aria-hidden="true" tabindex="-1"></a><span class="fu">table</span>(geno_recessive<span class="sc">$</span>rs1861_C, geno_recessive<span class="sc">$</span>rs1861_recessive, <span class="at">useNA =</span> <span class="st">"ifany"</span>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>      
+          0    1 &lt;NA&gt;
+  0      15    0    0
+  1     398    0    0
+  2       0 3551    0
+  &lt;NA&gt;    0    0   36</code></pre>
+</div>
+</div>
+<ol start="5" type="i">
+<li>Conduct a linear regression to test the association between the recessive-coded rs1861 and the phenotype. What is the p-value?</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Linear regression model using recessive coding</span></span>
+<span id="cb25-2"><a href="#cb25-2" aria-hidden="true" tabindex="-1"></a>model_recessive <span class="ot">&lt;-</span> <span class="fu">lm</span>(PHENOTYPE <span class="sc">~</span> rs1861_recessive, <span class="at">data =</span> geno_recessive)</span>
+<span id="cb25-3"><a href="#cb25-3" aria-hidden="true" tabindex="-1"></a><span class="fu">summary</span>(model_recessive)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>
+Call:
+lm(formula = PHENOTYPE ~ rs1861_recessive, data = geno_recessive)
+
+Residuals:
+    Min      1Q  Median      3Q     Max 
+-3.5437 -0.6892  0.0015  0.7016  3.3270 
+
+Coefficients:
+                 Estimate Std. Error t value Pr(&gt;|t|)    
+(Intercept)       0.99231    0.04945   20.07   &lt;2e-16 ***
+rs1861_recessive  1.00754    0.05224   19.29   &lt;2e-16 ***
+---
+Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+
+Residual standard error: 1.005 on 3962 degrees of freedom
+  (36 observations deleted due to missingness)
+Multiple R-squared:  0.08582,   Adjusted R-squared:  0.08559 
+F-statistic:   372 on 1 and 3962 DF,  p-value: &lt; 2.2e-16</code></pre>
+</div>
+</div>
+<p>The p-value for the association between SNP rs1861 and the phenotype using recessive coding is &lt;2e-16.</p>
+<ol start="6" type="i">
+<li>Plot the scatterplot of phenotype versus the recessive-coded genotype of rs1861. Add the regression line to the plot.</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb27"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb27-1"><a href="#cb27-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Create scatterplot with regression line assuming outcome is phenotype and predictor is genotype of SNP rs1861 using recessive coding</span></span>
+<span id="cb27-2"><a href="#cb27-2" aria-hidden="true" tabindex="-1"></a>plot_data_rec <span class="ot">&lt;-</span> geno_recessive[<span class="sc">!</span><span class="fu">is.na</span>(rs1861_recessive) <span class="sc">&amp;</span> <span class="sc">!</span><span class="fu">is.na</span>(PHENOTYPE), ]</span>
+<span id="cb27-3"><a href="#cb27-3" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb27-4"><a href="#cb27-4" aria-hidden="true" tabindex="-1"></a>p_rec <span class="ot">&lt;-</span> <span class="fu">ggplot</span>(plot_data_rec,</span>
+<span id="cb27-5"><a href="#cb27-5" aria-hidden="true" tabindex="-1"></a>                <span class="fu">aes</span>(<span class="at">x =</span> <span class="fu">factor</span>(rs1861_recessive), <span class="at">y =</span> PHENOTYPE)) <span class="sc">+</span></span>
+<span id="cb27-6"><a href="#cb27-6" aria-hidden="true" tabindex="-1"></a>  <span class="fu">geom_point</span>(<span class="at">color =</span> <span class="st">"blue"</span>, <span class="at">alpha =</span> <span class="fl">0.5</span>) <span class="sc">+</span></span>
+<span id="cb27-7"><a href="#cb27-7" aria-hidden="true" tabindex="-1"></a>  <span class="fu">geom_smooth</span>(<span class="fu">aes</span>(<span class="at">group =</span> <span class="dv">1</span>), <span class="at">method =</span> <span class="st">"lm"</span>, <span class="at">color =</span> <span class="st">"red"</span>) <span class="sc">+</span></span>
+<span id="cb27-8"><a href="#cb27-8" aria-hidden="true" tabindex="-1"></a>  <span class="fu">labs</span>(</span>
+<span id="cb27-9"><a href="#cb27-9" aria-hidden="true" tabindex="-1"></a>    <span class="at">title =</span> <span class="st">"Scatterplot showing the association between recessive-coded rs1861 </span><span class="sc">\n</span><span class="st">and phenotype"</span>,</span>
+<span id="cb27-10"><a href="#cb27-10" aria-hidden="true" tabindex="-1"></a>    <span class="at">x =</span> <span class="st">"Recessive genotype (1 = two copies of allele)"</span>,</span>
+<span id="cb27-11"><a href="#cb27-11" aria-hidden="true" tabindex="-1"></a>    <span class="at">y =</span> <span class="st">"Phenotype"</span></span>
+<span id="cb27-12"><a href="#cb27-12" aria-hidden="true" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb27-13"><a href="#cb27-13" aria-hidden="true" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span>
+<span id="cb27-14"><a href="#cb27-14" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb27-15"><a href="#cb27-15" aria-hidden="true" tabindex="-1"></a>p_rec</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="assignment_1_files/figure-html/unnamed-chunk-16-1.png" class="img-fluid figure-img" width="672"></p>
+</figure>
+</div>
+</div>
+</div>
+<ol start="7" type="i">
+<li>Which model fits better? Justify your answer.</li>
+</ol>
+<div class="cell">
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb28"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1" aria-hidden="true" tabindex="-1"></a><span class="fu">AIC</span>(model)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] 11276.91</code></pre>
+</div>
+<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb30"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb30-1"><a href="#cb30-1" aria-hidden="true" tabindex="-1"></a><span class="fu">AIC</span>(model_recessive)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] 11291.74</code></pre>
+</div>
+</div>
+<p>The additive model fits better than the recessive model. In the linear regression analyses, the additive model had a higher R-squared value compared to the recessive model (0.08924 vs.&nbsp;0.08582). Furthermore, the AIC of the additive model (11276.91) is lower than the AIC of the recessive model (11291.74), indicating that the additive coding provides a better overall model fit. This suggests that the effect of rs1861 on the phenotype is better captured by an allele-dose (additive) effect rather than a recessive effect.</p>
+</section>
+<section id="criteria" class="level3">
+<h3 class="anchored" data-anchor-id="criteria">Criteria</h3>
+<table class="caption-top table">
+<colgroup>
+<col style="width: 33%">
+<col style="width: 33%">
+<col style="width: 33%">
+</colgroup>
+<thead>
+<tr class="header">
+<th>Criteria</th>
+<th>Complete</th>
+<th>Incomplete</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td><strong>Data Inspection</strong></td>
+<td>Correct sample/SNP counts and variable type identified.</td>
+<td>Missing or incorrect counts or variable type.</td>
+</tr>
+<tr class="even">
+<td><strong>Allele Frequency Estimation</strong></td>
+<td>Correct allele and minor allele frequencies computed.</td>
+<td>Frequencies missing or wrong.</td>
+</tr>
+<tr class="odd">
+<td><strong>Hardy–Weinberg Equilibrium Test</strong></td>
+<td>Correct PLINK command and p-value extraction in R.</td>
+<td>PLINK command or extraction incorrect/missing.</td>
+</tr>
+<tr class="even">
+<td><strong>Genetic Association Test</strong></td>
+<td>Correct regressions, plots, coding, and interpretation.</td>
+<td>Regression, plots, or interpretation missing/incomplete.</td>
+</tr>
+</tbody>
+</table>
+</section>
+<section id="submission-information" class="level3">
+<h3 class="anchored" data-anchor-id="submission-information">Submission Information</h3>
+<p>📌 Please review our <a href="https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md">Assignment Submission Guide</a> for detailed instructions on how to format, branch, and submit your work. Following these guidelines is crucial for your submissions to be evaluated correctly.</p>
+<section id="note" class="level4">
+<h4 class="anchored" data-anchor-id="note">Note:</h4>
+<p>If you like, you may collaborate with others in the cohort. If you choose to do so, please indicate whom you worked with in your pull request by tagging their GitHub username. Separate submissions are required.</p>
+<hr>
+</section>
+<section id="submission-parameters" class="level4">
+<h4 class="anchored" data-anchor-id="submission-parameters">Submission Parameters</h4>
+<ul>
+<li><p>Submission Due Date: <code>11:59 PM – 16/03/2026</code></p></li>
+<li><p>Branch name for your repo should be: <code>assignment-1</code></p></li>
+<li><p>What to submit for this assignment:</p>
+<ul>
+<li>Populate this Quarto document (<code>assignment_1.qmd</code>).</li>
+<li>Render the document with Quarto: <code>quarto render assignment_1.qmd</code>.</li>
+<li>Submit both <code>assignment_1.qmd</code> and the rendered HTML file <code>assignment_1.html</code> in your pull request.</li>
+</ul></li>
+<li><p>What the pull request link should look like for this assignment: <code>https://github.com/&lt;your_github_username&gt;/gen_data/pull/&lt;pr_id&gt;</code></p>
+<ul>
+<li>Open a private window in your browser. Copy and paste the link to your pull request into the address bar. Make sure you can see your pull request properly. This helps the technical facilitator and learning support team review your submission easily.</li>
+</ul></li>
+</ul>
+<hr>
+<p>Checklist:</p>
+<ul>
+<li>Created a branch with the correct naming convention.</li>
+<li>Ensured that the repository is public.</li>
+<li>Reviewed the PR description guidelines and adhered to them.</li>
+<li>Verified that the link is accessible in a private browser window.</li>
+<li>Confirmed that both <code>assignment_1.qmd</code> and <code>assignment_1.html</code> are included in the pull request.</li>
+</ul>
+<p>If you encounter any difficulties or have questions, please don’t hesitate to reach out to our team via our Slack help channel. Our technical facilitators and learning support team are here to help you navigate any challenges.</p>
+</section>
+</section>
+</section>
+
+</main>
+<!-- /main column -->
+<script id="quarto-html-after-body" type="application/javascript">
+  window.document.addEventListener("DOMContentLoaded", function (event) {
+    const icon = "";
+    const anchorJS = new window.AnchorJS();
+    anchorJS.options = {
+      placement: 'right',
+      icon: icon
+    };
+    anchorJS.add('.anchored');
+    const isCodeAnnotation = (el) => {
+      for (const clz of el.classList) {
+        if (clz.startsWith('code-annotation-')) {                     
+          return true;
+        }
+      }
+      return false;
+    }
+    const onCopySuccess = function(e) {
+      // button target
+      const button = e.trigger;
+      // don't keep focus
+      button.blur();
+      // flash "checked"
+      button.classList.add('code-copy-button-checked');
+      var currentTitle = button.getAttribute("title");
+      button.setAttribute("title", "Copied!");
+      let tooltip;
+      if (window.bootstrap) {
+        button.setAttribute("data-bs-toggle", "tooltip");
+        button.setAttribute("data-bs-placement", "left");
+        button.setAttribute("data-bs-title", "Copied!");
+        tooltip = new bootstrap.Tooltip(button, 
+          { trigger: "manual", 
+            customClass: "code-copy-button-tooltip",
+            offset: [0, -8]});
+        tooltip.show();    
+      }
+      setTimeout(function() {
+        if (tooltip) {
+          tooltip.hide();
+          button.removeAttribute("data-bs-title");
+          button.removeAttribute("data-bs-toggle");
+          button.removeAttribute("data-bs-placement");
+        }
+        button.setAttribute("title", currentTitle);
+        button.classList.remove('code-copy-button-checked');
+      }, 1000);
+      // clear code selection
+      e.clearSelection();
+    }
+    const getTextToCopy = function(trigger) {
+      const outerScaffold = trigger.parentElement.cloneNode(true);
+      const codeEl = outerScaffold.querySelector('code');
+      for (const childEl of codeEl.children) {
+        if (isCodeAnnotation(childEl)) {
+          childEl.remove();
+        }
+      }
+      return codeEl.innerText;
+    }
+    const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
+      text: getTextToCopy
+    });
+    clipboard.on('success', onCopySuccess);
+    if (window.document.getElementById('quarto-embedded-source-code-modal')) {
+      const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
+        text: getTextToCopy,
+        container: window.document.getElementById('quarto-embedded-source-code-modal')
+      });
+      clipboardModal.on('success', onCopySuccess);
+    }
+      var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
+      var mailtoRegex = new RegExp(/^mailto:/);
+        var filterRegex = new RegExp('/' + window.location.host + '/');
+      var isInternal = (href) => {
+          return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
+      }
+      // Inspect non-navigation links and adorn them if external
+     var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
+      for (var i=0; i<links.length; i++) {
+        const link = links[i];
+        if (!isInternal(link.href)) {
+          // undo the damage that might have been done by quarto-nav.js in the case of
+          // links that we want to consider external
+          if (link.dataset.originalHref !== undefined) {
+            link.href = link.dataset.originalHref;
+          }
+        }
+      }
+    function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      const config = {
+        allowHTML: true,
+        maxWidth: 500,
+        delay: 100,
+        arrow: false,
+        appendTo: function(el) {
+            return el.parentElement;
+        },
+        interactive: true,
+        interactiveBorder: 10,
+        theme: 'quarto',
+        placement: 'bottom-start',
+      };
+      if (contentFn) {
+        config.content = contentFn;
+      }
+      if (onTriggerFn) {
+        config.onTrigger = onTriggerFn;
+      }
+      if (onUntriggerFn) {
+        config.onUntrigger = onUntriggerFn;
+      }
+      window.tippy(el, config); 
+    }
+    const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
+    for (var i=0; i<noterefs.length; i++) {
+      const ref = noterefs[i];
+      tippyHover(ref, function() {
+        // use id or data attribute instead here
+        let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
+        try { href = new URL(href).hash; } catch {}
+        const id = href.replace(/^#\/?/, "");
+        const note = window.document.getElementById(id);
+        if (note) {
+          return note.innerHTML;
+        } else {
+          return "";
+        }
+      });
+    }
+    const xrefs = window.document.querySelectorAll('a.quarto-xref');
+    const processXRef = (id, note) => {
+      // Strip column container classes
+      const stripColumnClz = (el) => {
+        el.classList.remove("page-full", "page-columns");
+        if (el.children) {
+          for (const child of el.children) {
+            stripColumnClz(child);
+          }
+        }
+      }
+      stripColumnClz(note)
+      if (id === null || id.startsWith('sec-')) {
+        // Special case sections, only their first couple elements
+        const container = document.createElement("div");
+        if (note.children && note.children.length > 2) {
+          container.appendChild(note.children[0].cloneNode(true));
+          for (let i = 1; i < note.children.length; i++) {
+            const child = note.children[i];
+            if (child.tagName === "P" && child.innerText === "") {
+              continue;
+            } else {
+              container.appendChild(child.cloneNode(true));
+              break;
+            }
+          }
+          if (window.Quarto?.typesetMath) {
+            window.Quarto.typesetMath(container);
+          }
+          return container.innerHTML
+        } else {
+          if (window.Quarto?.typesetMath) {
+            window.Quarto.typesetMath(note);
+          }
+          return note.innerHTML;
+        }
+      } else {
+        // Remove any anchor links if they are present
+        const anchorLink = note.querySelector('a.anchorjs-link');
+        if (anchorLink) {
+          anchorLink.remove();
+        }
+        if (window.Quarto?.typesetMath) {
+          window.Quarto.typesetMath(note);
+        }
+        if (note.classList.contains("callout")) {
+          return note.outerHTML;
+        } else {
+          return note.innerHTML;
+        }
+      }
+    }
+    for (var i=0; i<xrefs.length; i++) {
+      const xref = xrefs[i];
+      tippyHover(xref, undefined, function(instance) {
+        instance.disable();
+        let url = xref.getAttribute('href');
+        let hash = undefined; 
+        if (url.startsWith('#')) {
+          hash = url;
+        } else {
+          try { hash = new URL(url).hash; } catch {}
+        }
+        if (hash) {
+          const id = hash.replace(/^#\/?/, "");
+          const note = window.document.getElementById(id);
+          if (note !== null) {
+            try {
+              const html = processXRef(id, note.cloneNode(true));
+              instance.setContent(html);
+            } finally {
+              instance.enable();
+              instance.show();
+            }
+          } else {
+            // See if we can fetch this
+            fetch(url.split('#')[0])
+            .then(res => res.text())
+            .then(html => {
+              const parser = new DOMParser();
+              const htmlDoc = parser.parseFromString(html, "text/html");
+              const note = htmlDoc.getElementById(id);
+              if (note !== null) {
+                const html = processXRef(id, note);
+                instance.setContent(html);
+              } 
+            }).finally(() => {
+              instance.enable();
+              instance.show();
+            });
+          }
+        } else {
+          // See if we can fetch a full url (with no hash to target)
+          // This is a special case and we should probably do some content thinning / targeting
+          fetch(url)
+          .then(res => res.text())
+          .then(html => {
+            const parser = new DOMParser();
+            const htmlDoc = parser.parseFromString(html, "text/html");
+            const note = htmlDoc.querySelector('main.content');
+            if (note !== null) {
+              // This should only happen for chapter cross references
+              // (since there is no id in the URL)
+              // remove the first header
+              if (note.children.length > 0 && note.children[0].tagName === "HEADER") {
+                note.children[0].remove();
+              }
+              const html = processXRef(null, note);
+              instance.setContent(html);
+            } 
+          }).finally(() => {
+            instance.enable();
+            instance.show();
+          });
+        }
+      }, function(instance) {
+      });
+    }
+        let selectedAnnoteEl;
+        const selectorForAnnotation = ( cell, annotation) => {
+          let cellAttr = 'data-code-cell="' + cell + '"';
+          let lineAttr = 'data-code-annotation="' +  annotation + '"';
+          const selector = 'span[' + cellAttr + '][' + lineAttr + ']';
+          return selector;
+        }
+        const selectCodeLines = (annoteEl) => {
+          const doc = window.document;
+          const targetCell = annoteEl.getAttribute("data-target-cell");
+          const targetAnnotation = annoteEl.getAttribute("data-target-annotation");
+          const annoteSpan = window.document.querySelector(selectorForAnnotation(targetCell, targetAnnotation));
+          const lines = annoteSpan.getAttribute("data-code-lines").split(",");
+          const lineIds = lines.map((line) => {
+            return targetCell + "-" + line;
+          })
+          let top = null;
+          let height = null;
+          let parent = null;
+          if (lineIds.length > 0) {
+              //compute the position of the single el (top and bottom and make a div)
+              const el = window.document.getElementById(lineIds[0]);
+              top = el.offsetTop;
+              height = el.offsetHeight;
+              parent = el.parentElement.parentElement;
+            if (lineIds.length > 1) {
+              const lastEl = window.document.getElementById(lineIds[lineIds.length - 1]);
+              const bottom = lastEl.offsetTop + lastEl.offsetHeight;
+              height = bottom - top;
+            }
+            if (top !== null && height !== null && parent !== null) {
+              // cook up a div (if necessary) and position it 
+              let div = window.document.getElementById("code-annotation-line-highlight");
+              if (div === null) {
+                div = window.document.createElement("div");
+                div.setAttribute("id", "code-annotation-line-highlight");
+                div.style.position = 'absolute';
+                parent.appendChild(div);
+              }
+              div.style.top = top - 2 + "px";
+              div.style.height = height + 4 + "px";
+              div.style.left = 0;
+              let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter");
+              if (gutterDiv === null) {
+                gutterDiv = window.document.createElement("div");
+                gutterDiv.setAttribute("id", "code-annotation-line-highlight-gutter");
+                gutterDiv.style.position = 'absolute';
+                const codeCell = window.document.getElementById(targetCell);
+                const gutter = codeCell.querySelector('.code-annotation-gutter');
+                gutter.appendChild(gutterDiv);
+              }
+              gutterDiv.style.top = top - 2 + "px";
+              gutterDiv.style.height = height + 4 + "px";
+            }
+            selectedAnnoteEl = annoteEl;
+          }
+        };
+        const unselectCodeLines = () => {
+          const elementsIds = ["code-annotation-line-highlight", "code-annotation-line-highlight-gutter"];
+          elementsIds.forEach((elId) => {
+            const div = window.document.getElementById(elId);
+            if (div) {
+              div.remove();
+            }
+          });
+          selectedAnnoteEl = undefined;
+        };
+          // Handle positioning of the toggle
+      window.addEventListener(
+        "resize",
+        throttle(() => {
+          elRect = undefined;
+          if (selectedAnnoteEl) {
+            selectCodeLines(selectedAnnoteEl);
+          }
+        }, 10)
+      );
+      function throttle(fn, ms) {
+      let throttle = false;
+      let timer;
+        return (...args) => {
+          if(!throttle) { // first call gets through
+              fn.apply(this, args);
+              throttle = true;
+          } else { // all the others get throttled
+              if(timer) clearTimeout(timer); // cancel #2
+              timer = setTimeout(() => {
+                fn.apply(this, args);
+                timer = throttle = false;
+              }, ms);
+          }
+        };
+      }
+        // Attach click handler to the DT
+        const annoteDls = window.document.querySelectorAll('dt[data-target-cell]');
+        for (const annoteDlNode of annoteDls) {
+          annoteDlNode.addEventListener('click', (event) => {
+            const clickedEl = event.target;
+            if (clickedEl !== selectedAnnoteEl) {
+              unselectCodeLines();
+              const activeEl = window.document.querySelector('dt[data-target-cell].code-annotation-active');
+              if (activeEl) {
+                activeEl.classList.remove('code-annotation-active');
+              }
+              selectCodeLines(clickedEl);
+              clickedEl.classList.add('code-annotation-active');
+            } else {
+              // Unselect the line
+              unselectCodeLines();
+              clickedEl.classList.remove('code-annotation-active');
+            }
+          });
+        }
+    const findCites = (el) => {
+      const parentEl = el.parentElement;
+      if (parentEl) {
+        const cites = parentEl.dataset.cites;
+        if (cites) {
+          return {
+            el,
+            cites: cites.split(' ')
+          };
+        } else {
+          return findCites(el.parentElement)
+        }
+      } else {
+        return undefined;
+      }
+    };
+    var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
+    for (var i=0; i<bibliorefs.length; i++) {
+      const ref = bibliorefs[i];
+      const citeInfo = findCites(ref);
+      if (citeInfo) {
+        tippyHover(citeInfo.el, function() {
+          var popup = window.document.createElement('div');
+          citeInfo.cites.forEach(function(cite) {
+            var citeDiv = window.document.createElement('div');
+            citeDiv.classList.add('hanging-indent');
+            citeDiv.classList.add('csl-entry');
+            var biblioDiv = window.document.getElementById('ref-' + cite);
+            if (biblioDiv) {
+              citeDiv.innerHTML = biblioDiv.innerHTML;
+            }
+            popup.appendChild(citeDiv);
+          });
+          return popup.innerHTML;
+        });
+      }
+    }
+  });
+  </script>
+</div> <!-- /content -->
+
+
+
+
+</body></html>
\ No newline at end of file
diff --git a/02_activities/assignments/assignment_1.qmd b/02_activities/assignments/assignment_1.qmd
index 550af3d..f0bdd0b 100644
--- a/02_activities/assignments/assignment_1.qmd
+++ b/02_activities/assignments/assignment_1.qmd
@@ -17,96 +17,200 @@ You will need to install PLINK and run the analyses. Please follow the OS-specif
 
 Before fitting any models, it is essential to understand the data. Use R or bash code to answer the following questions about the `gwa.qc.A1.fam`, `gwa.qc.A1.bim`, and `gwa.qc.A1.bed` files, available at the following Google Drive link: <https://drive.google.com/drive/folders/11meVqGCY5yAyI1fh-fAlMEXQt0VmRGuz?usp=drive_link>. Please download all three files from this link and place them in `02_activities/data/`.
 
-(i) Read the .fam file. How many samples does the dataset contain?
+```{r, message=FALSE, warning=FALSE, results='hide'}
+# Load the packages needed for this assignment
+library(data.table)
+library(ggplot2)
+library(seqminer)
+library(HardyWeinberg)
+library(dplyr)
+```
 
-```         
-# Your answer here...
+(i) Read the .fam file. How many samples does the dataset contain?
+     
+```{bash}
+wc -l ../data/gwa.qc.A1.fam
 ```
 
+The `.fam` file has 4000 lines, one per sample, so the dataset contains 4000 samples.
+
 (ii) What is the 'variable type' of the response variable (i.e.Continuous or binary)?
 
-```         
-# Your answer here...
+```{bash}
+head ../data/gwa.qc.A1.fam
 ```
 
+The response variable (the phenotype) is continuous.
+
 (iii) Read the .bim file. How many SNPs does the dataset contain?
 
-```         
-# Your answer here...
+```{bash}
+wc -l ../data/gwa.qc.A1.bim         
 ```
 
+The `.bim` file has 101083 lines, one per variant, so the dataset contains 101083 SNPs.
+
 #### Question 2: Allele Frequency Estimation
 
 (i) Load the genotype matrix for SNPs rs1861, rs3813199, rs3128342, and rs11804831 using additive coding. What are the allele frequencies (AFs) for these four SNPs?
 
-```         
-# Your code here...
+```{bash,results='hide',message=FALSE, warning=FALSE}
+
+# Create SNP list
+printf "%s\n" rs1861 rs3813199 rs3128342 rs11804831 > ../data/snplist_A1.txt
+cat ../data/snplist_A1.txt
+
+# Subset the 4 SNPs from the PLINK dataset
+plink2 --bfile ../data/gwa.qc.A1 --extract ../data/snplist_A1.txt --make-bed --out ../data/gwa_A1_subset
+
+# Additive coding on the subsetted SNPs
+plink2 --bfile ../data/gwa_A1_subset --export A --out ../data/gwa_A1_subset_additive
+
+# Calculate allele frequencies for the 4-SNP subset
+plink2 --bfile ../data/gwa_A1_subset --freq --out ../data/gwa_A1_subset_freq
 ```
 
+```{r,message=FALSE, warning=FALSE}
+# Load additive-coded genotype matrix
+geno <- fread("../data/gwa_A1_subset_additive.raw")
+head(geno)
+```
+
+```{r,message=FALSE, warning=FALSE}
+# Read and display allele frequencies of four SNPs
+freq <- fread("../data/gwa_A1_subset_freq.afreq")
+af_table <- freq[, .(SNP = ID, AF = ALT_FREQS)]
+af_table
+```
+
+The allele frequencies (AFs) of the four SNPs, as reported in the `ALT_FREQS` column, are: rs1861 = 0.0539859, rs3813199 = 0.0569126, rs3128342 = 0.3051210, and rs11804831 = 0.1543410.
+
 (ii) What are the minor allele frequencies (MAFs) for these four SNPs?
 
-```         
-# Your code here...
+```{r,message=FALSE, warning=FALSE}
+maf_table <- freq[, .(SNP = ID, AF = ALT_FREQS, MAF = pmin(ALT_FREQS, 1 - ALT_FREQS))]
+maf_table
 ```
 
+All four allele frequencies in the `ALT_FREQS` column of the PLINK frequency output are below 0.5, so the ALT allele is already the minor allele for each SNP. The minor allele frequencies therefore equal the `ALT_FREQS` values.
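+
+As a worked instance of the `pmin()` step, for a biallelic SNP with ALT allele frequency $p_{\mathrm{ALT}}$ the minor allele frequency is the smaller of the two allele frequencies. Plugging in the rs1861 value reported by PLINK:
+
+$$
+\mathrm{MAF} = \min\left(p_{\mathrm{ALT}},\ 1 - p_{\mathrm{ALT}}\right) = \min(0.0539859,\ 0.9460141) = 0.0539859
+$$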
+
 #### Question 3: Hardy–Weinberg Equilibrium (HWE) Test
 
 (i) Conduct the Hardy–Weinberg Equilibrium (HWE) test for all SNPs in the .bim file. Then, load the file containing the HWE p-value results and display the first few rows of the resulting data frame.
 
-```         
-# Your code here...
+```{bash,results='hide',message=FALSE, warning=FALSE}
+plink2 --bfile ../data/gwa.qc.A1 --hardy --out ../data/gwa_qc_A1_hwe
+```
+
+```{r, message=FALSE, warning=FALSE}
+hwe <- fread("../data/gwa_qc_A1_hwe.hardy")
+head(hwe)  
 ```
 
 (ii) What are the HWE p-values for SNPs rs1861, rs3813199, rs3128342, and rs11804831?
 
-```         
-# Your code here...
+```{r, message=FALSE, warning=FALSE}     
+# Create a subset of the four SNPs
+snps_interest <- c("rs1861", "rs3813199", "rs3128342", "rs11804831")
+hwe_subset <- hwe[ID %in% snps_interest, .(SNP = ID, HWE_P = P)]
+hwe_subset
 ```
 
+The HWE p-values for the four SNPs are: rs1861 = 0.274719, rs3813199 = 1.000000, rs3128342 = 0.330273, and rs11804831 = 0.113354.
+
 #### Question 4: Genetic Association Test
 
 (i) Conduct a linear regression to test the association between SNP rs1861 and the phenotype. What is the p-value?
 
-```         
-# Your code here...
+```{r, message=FALSE, warning=FALSE}         
+# Linear regression model
+geno <- fread("../data/gwa_A1_subset_additive.raw")
+model <- lm(PHENOTYPE ~ rs1861_C, data = geno)
+summary(model)
 ```
 
+The p-value for the association between SNP rs1861 and the phenotype is <2e-16.
+
 (ii) How would you interpret the beta coefficient from this regression?
 
-```         
-# Your answer here...
-```
+The regression coefficient for rs1861 is 0.97382 (p <2e-16), indicating that each additional copy of the C allele is associated with an increase of approximately 0.97 units of the phenotype on average. The significant p-value suggests evidence of an association between this SNP and the phenotype in our sample.
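+
+Written out, the fitted additive model has the form below, where $G$ is the number of C alleles (0, 1, or 2) and $\hat\beta_0$ is the estimated mean phenotype for samples with zero C alleles (its value appears in the `summary(model)` output and is not repeated here):
+
+$$
+\widehat{\mathrm{PHENOTYPE}} = \hat\beta_0 + \hat\beta_1 G, \qquad \hat\beta_1 = 0.97382
+$$
+
+Under this model the predicted phenotype difference between carriers of two copies and carriers of zero copies is $2 \times 0.97382 = 1.94764$ units, because the additive coding forces each extra copy to have the same effect.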
 
 (iii) Plot the scatterplot of phenotype versus the genotype of SNP rs1861. Add the regression line to the plot.
 
-```         
-# Your code here...
+```{r, message=FALSE, warning=FALSE}          
+# Load the genotype data
+geno <- fread("../data/gwa_A1_subset_additive.raw")
+
+# Remove missing values before plotting
+plot_data_add <- geno[!is.na(rs1861_C) & !is.na(PHENOTYPE), ]
+
+# Create scatterplot with regression line assuming outcome is phenotype and predictor is genotype of SNP rs1861
+p <- ggplot(plot_data_add, aes(x = factor(rs1861_C), y = PHENOTYPE)) +
+  geom_point(color = "blue", alpha = 0.5) +
+  geom_smooth(aes(group = 1), method = "lm", color = "red") +
+  labs(
+    title = "Scatterplot showing the association between SNP rs1861 and phenotype",
+    x = "Genotype (number of C alleles)",
+    y = "Phenotype"
+  ) +
+  theme_minimal()
+
+p
 ```
 
 (iv) Convert the genotype coding for rs1861 to recessive coding.
 
-```         
-# Your code here...
+```{r, message=FALSE, warning=FALSE}            
+# Load genotype data
+geno <- fread("../data/gwa_A1_subset_additive.raw")
+
+# Convert coding for rs1861 to recessive coding
+geno_recessive <- geno
+geno_recessive$rs1861_recessive <- ifelse(geno_recessive$rs1861_C == 2, 1, 0)
+
+# Check recoding
+table(geno_recessive$rs1861_C, geno_recessive$rs1861_recessive, useNA = "ifany")
 ```
 
 (v) Conduct a linear regression to test the association between the recessive-coded rs1861 and the phenotype. What is the p-value?
 
-```         
-# Your code here...
+```{r, message=FALSE, warning=FALSE}         
+# Linear regression model using recessive coding
+model_recessive <- lm(PHENOTYPE ~ rs1861_recessive, data = geno_recessive)
+summary(model_recessive)
 ```
 
+The p-value for the association between SNP rs1861 and the phenotype using recessive coding is <2e-16.
+
 (vi) Plot the scatterplot of phenotype versus the recessive-coded genotype of rs1861. Add the regression line to the plot.
 
-```         
-# Your code here...
+```{r, message=FALSE, warning=FALSE}         
+# Create scatterplot with regression line assuming outcome is phenotype and predictor is genotype of SNP rs1861 using recessive coding
+plot_data_rec <- geno_recessive[!is.na(rs1861_recessive) & !is.na(PHENOTYPE), ]
+
+p_rec <- ggplot(plot_data_rec,
+                aes(x = factor(rs1861_recessive), y = PHENOTYPE)) +
+  geom_point(color = "blue", alpha = 0.5) +
+  geom_smooth(aes(group = 1), method = "lm", color = "red") +
+  labs(
+    title = "Scatterplot showing the association between recessive-coded rs1861 \nand phenotype",
+    x = "Recessive genotype (1 = two copies of allele)",
+    y = "Phenotype"
+  ) +
+  theme_minimal()
+
+p_rec
 ```
 
 (vii) Which model fits better? Justify your answer.
 
-```         
-# Your answer here...
+```{r, message=FALSE, warning=FALSE}
+AIC(model)
+AIC(model_recessive)
 ```
 
+The additive model fits better than the recessive model. In the linear regression analyses, the additive model had a higher R-squared value compared to the recessive model (0.08924 vs. 0.08582). Furthermore, the AIC of the additive model (11276.91) is lower than the AIC of the recessive model (11291.74), indicating that the additive coding provides a better overall model fit. This suggests that the effect of rs1861 on the phenotype is better captured by an allele-dose (additive) effect rather than a recessive effect.
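+
+The AIC comparison can be made explicit. R's `AIC()` computes $-2\ln\hat{L} + 2k$, where $\hat{L}$ is the maximised likelihood and $k$ is the number of estimated parameters. Assuming both models are fitted to the same rows (which should hold here because `rs1861_recessive` is derived from `rs1861_C`, so both predictors are missing for the same samples), each model has $k = 3$ (intercept, slope, residual variance) and the penalty terms cancel:
+
+$$
+\Delta\mathrm{AIC} = \mathrm{AIC}_{\mathrm{recessive}} - \mathrm{AIC}_{\mathrm{additive}} = 11291.74 - 11276.91 = 14.83 = 2\left(\ln\hat{L}_{\mathrm{additive}} - \ln\hat{L}_{\mathrm{recessive}}\right)
+$$
+
+The additive model's log-likelihood is therefore higher by about 7.4, which is consistent with its higher R-squared.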
+
 ### Criteria
 
 | Criteria | Complete | Incomplete |

| Criteria | Complete | Incomplete |
|---|---|---|
| **Data Inspection** | Correct sample/SNP counts and variable type identified. | Missing or incorrect counts or variable type. |
| **Allele Frequency Estimation** | Correct allele and minor allele frequencies computed. | Frequencies missing or wrong. |
| **Hardy–Weinberg Equilibrium Test** | Correct PLINK command and p-value extraction in R. | PLINK command or extraction incorrect/missing. |
| **Genetic Association Test** | Correct regressions, plots, coding, and interpretation. | Regression, plots, or interpretation missing/incomplete. |