|
90 | 90 | {class}`.SweepGenicSelection`
|
91 | 91 | : Selective sweep at a linked locus
|
92 | 92 |
|
| 93 | +{class}`.NeutralFixation` |
| 94 | +: a neutral fixation at a linked locus |
93 | 95 | ---
|
94 | 96 |
|
95 | 97 |
|
@@ -2274,6 +2276,122 @@ multiple models. See the {ref}`sec_logging` section for more
|
2274 | 2276 | details.
|
2275 | 2277 | :::
|
2276 | 2278 |
|
| 2279 | +(sec_ancestry_models_neutral_fixations)= |
| 2280 | + |
| 2281 | +### Neutral Fixations |
| 2282 | + |
| 2283 | +The same sort of structured coalescent that we used above to model |
| 2284 | +selective sweeps can also be used to model the coalescent conditional |
| 2285 | +on the fixation of a selectively neutral mutation, what we call in |
| 2286 | +msprime the {class}`.NeutralFixation` model. |
| 2287 | + |
| 2288 | +As the {class}`.SweepGenicSelection` model, |
| 2289 | +the population is split between two classes of selective backgrounds, |
| 2290 | +those linked to the derived neutral allele, call it {math}`B`, and those not, |
| 2291 | +{math}`b`. As with the {class}`.SweepGenicSelection` model, |
| 2292 | +the user supplies a final allele frequency and a starting |
| 2293 | +allele frequency, between which msprime simulates a stochastic |
| 2294 | +drift trajectory according to a conditional diffusion model that |
| 2295 | +conditions on eventual loss of the allele (looking backwards in time). |
| 2296 | + |
| 2297 | +Beyond the start and end frequencies of the sweep trajectory, the user |
| 2298 | +must specify the position along |
| 2299 | +the chromosome where the focal allele occurs. |
| 2300 | +All other parameters can be set as usual. |
| 2301 | + |
| 2302 | +:::{warning} |
| 2303 | +The implementation of the structured coalescent is quite general, but there are |
| 2304 | +some limitations in the current implementation of the sweeps model (e.g., no |
| 2305 | +change of population size during a sweep). Please let us know if there are |
| 2306 | +related features you would like to see (or if you would be interested in |
| 2307 | +helping to create this functionality!) |
| 2308 | +::: |
| 2309 | +#### An example of a neutral fixation simulation |
| 2310 | + |
| 2311 | +First we set up some basic parameters, and define the sweep model: |
| 2312 | + |
| 2313 | +```{code-cell} |
| 2314 | +Ne = 1e3 |
| 2315 | +L = 1e6 # Length of simulated region |
| 2316 | +num_reps = 100 |
| 2317 | +
|
| 2318 | +# define hard sweep model |
| 2319 | +nf_model = msprime.NeutralFixation( |
| 2320 | + position=L / 2, # middle of chrom |
| 2321 | + start_frequency=1.0 / (2 * Ne), |
| 2322 | + end_frequency=1.0 - (1.0 / (2 * Ne)), |
| 2323 | + dt=1e-6, |
| 2324 | +) |
| 2325 | +``` |
| 2326 | +The start and end frequencies here specify that we will look at |
| 2327 | +the complete sojourn of the neutral fixation, from frequency |
| 2328 | +1/2N to frequency 1 looking forward in time. |
| 2329 | + |
| 2330 | +Next we set up the replicate simulations. As described |
| 2331 | +in the {ref}`sec_ancestry_models_specifying_multiple` section, |
| 2332 | +the ``model`` parameter is a list when we want to run multiple |
| 2333 | +ancestry models. As in the hard sweep example above, |
| 2334 | +we here have a neutral fixation that occurs |
| 2335 | +in the immediate past, and is then followed by the |
| 2336 | +standard coalescent for the rest of time: |
| 2337 | + |
| 2338 | +```{code-cell} |
| 2339 | +reps = msprime.sim_ancestry( |
| 2340 | + 5, |
| 2341 | + model=[nf_model, msprime.StandardCoalescent()], |
| 2342 | + population_size=Ne, |
| 2343 | + recombination_rate=1e-7, |
| 2344 | + sequence_length=L, |
| 2345 | + num_replicates=num_reps, |
| 2346 | +) |
| 2347 | +``` |
| 2348 | + |
| 2349 | +:::{note} |
| 2350 | +Because the {class}`.NeutralFixation` model has a random |
| 2351 | +{ref}`duration<sec_ancestry_models_specifying_duration>` we do |
| 2352 | +not set this value in the model. Please see the |
| 2353 | +{ref}`sec_ancestry_models_specifying_completion` section for |
| 2354 | +more discussion on this important point. |
| 2355 | +::: |
| 2356 | + |
| 2357 | +Once we've set up the replicate simulations we can compute the |
| 2358 | +windows for plotting, run the actual simulations |
| 2359 | +(see the {ref}`sec_randomness_replication` section for more details) |
| 2360 | +and compute the |
| 2361 | +{meth}`pairwise diversity<tskit.TreeSequence.diversity>`. |
| 2362 | + |
| 2363 | +```{code-cell} |
| 2364 | +wins = np.linspace(0, L, 21) |
| 2365 | +mids = (wins[1:] + wins[:-1]) / 2 |
| 2366 | +diversity = np.zeros((num_reps, mids.shape[0])) |
| 2367 | +for j, ts in enumerate(reps): |
| 2368 | + diversity[j] = ts.diversity(windows=wins, mode="branch") |
| 2369 | +``` |
| 2370 | + |
| 2371 | +Finally, we can plot the observed mean diversity across the replicates |
| 2372 | +and compare it to the neutral expectation: |
| 2373 | + |
| 2374 | +```{code-cell} |
| 2375 | +from matplotlib import pyplot as plt |
| 2376 | +
|
| 2377 | +plt.plot(mids, diversity.mean(axis=0), label="Simulations") |
| 2378 | +plt.axhline(4 * Ne, linestyle=":", label=r'Neutral expectation') |
| 2379 | +plt.ylabel(r'Branch $\pi$'); |
| 2380 | +plt.xlabel('Position (bp)') |
| 2381 | +plt.legend(); |
| 2382 | +``` |
| 2383 | + |
| 2384 | +:::{note} |
| 2385 | +We use the "branch" measure of pairwise diversity which is |
| 2386 | +defined in terms of the trees rather than nucleotide sequences. See |
| 2387 | +[Ralph et al. 2020](https://doi.org/10.1534/genetics.120.303253) |
| 2388 | +for more information. |
| 2389 | +::: |
| 2390 | + |
| 2391 | +As we can see, the neutral fixation has reduced variation in the region |
| 2392 | +most very closely linked to the focal allele but not in nearly as |
| 2393 | +dramatic a fashion as the hard sweeps we simulated above. |
| 2394 | + |
2277 | 2395 | (sec_ancestry_missing_data)=
|
2278 | 2396 |
|
2279 | 2397 | ## Missing data
|
|
0 commit comments