This repository has been archived by the owner on Aug 24, 2024. It is now read-only.
cjtaylor1990/illustris_mw_kdtree
The following is my original README that I wrote for these scripts in 2015. The code was all written in Python 2.7 and references readsnapHDF5.py and readsubfHDF5.py, which were produced by the Illustris team independently of myself (accessed in 2015). The scripts also require MPFIT, which can be found here: https://code.google.com/archive/p/astrolibpy/downloads. The results of this work were presented in Taylor et al. (2016), MNRAS 461, 3483 (arXiv link: https://arxiv.org/abs/1510.06409). I will not be hosting the data necessary to run these scripts, nor the Illustris team's scripts, but both can be accessed via http://www.illustris-project.org/data/

In this project, I used both the particle data and the FOF halo data from the Illustris and Illustris-Dark simulations to estimate the mass of the Milky Way (~1x10^12 solar masses), using measured properties (e.g. stellar mass, mass within 50 kpc, etc.) and a statistical weighting scheme built on them. The biggest advance was in the kdTree codes (kdTree_10-300kpc.py, kdTree_10-300kpc_dark.py), where I directly calculate the mass distribution of every FOF halo in the Illustris catalogue using the k-D Tree search algorithm from SciPy. Previous work used a naive search across massive, unordered arrays, which proved incredibly inefficient. By using the k-D Tree search and only loading subsets of particles at a time, I decreased runtime and memory usage by a factor of 100-1000. I have included a table (speed_tests.png) that describes the speed tests for Illustris-3 and Illustris-2.

Finally, as this is a historical record, I am not going to update the coding style. My coding style has changed since I wrote this and I have largely transitioned to Python 3, but I want this to be an indicator of where I was in 2015.
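The k-D Tree approach described above can be sketched as follows. This is a minimal illustration, not the original kdTree_10-300kpc.py (and it is written in modern Python 3 rather than the Python 2.7 of these scripts): build a cKDTree over the particle positions, make one `query_ball_point` call at the maximal search radius per halo centre, then bin the returned particles into a cumulative M(<r) profile for r = 10-300 kpc. The function and variable names here are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def mass_within_radii(positions, masses, centre, radii_kpc):
    """Cumulative mass M(<r) around `centre` for each r in `radii_kpc`.

    One tree query at the largest radius replaces a naive scan over the
    full (unordered) particle arrays for every halo and every radius.
    """
    tree = cKDTree(positions)                    # O(N log N) build; queries are cheap
    idx = np.asarray(tree.query_ball_point(centre, float(np.max(radii_kpc))))
    if idx.size == 0:
        return np.zeros(len(radii_kpc))
    d = np.linalg.norm(positions[idx] - centre, axis=1)
    return np.array([masses[idx[d < r]].sum() for r in radii_kpc])

# Toy usage with random equal-mass particles (the real scripts load
# Illustris HDF5 snapshot chunks via readsnapHDF5.py)
rng = np.random.default_rng(0)
pos = rng.uniform(-300.0, 300.0, size=(20000, 3))   # positions in kpc
m = np.full(len(pos), 1.0)
profile = mass_within_radii(pos, m, np.zeros(3), np.arange(10.0, 310.0, 10.0))
```

The key design point is that the tree is built once per particle subset, so the per-halo cost depends on the number of particles inside the search sphere rather than on the full snapshot size.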
Also, one script (cumdist_mass_sigma_new.py) is currently missing, but it is one of the less important pieces of code.

------------

This is a how-to document on reproducing the plots and table values from Taylor et al. 2016. If you have any questions or concerns, feel free to email me at [email protected]. Note that in almost all cases, brackets (i.e. []) represent a variable input on the command line and are not actually typed.

1.) KD Tree codes

Names:
kdTree_10-300kpc.py
kdTree_10-300kpc_dark.py

Files needed:
A txt file with the paths to all of the particle files, individually listed
readsnapHDF5.py
readsubfHDF5.py
cosmo_const.py

Running the script:
python [script name] [run] [snap_number] [search_r] [pType] [partFileTxt]
run = Illustris-1, Illustris-Dark-1, etc.
snap_number = the number of the snapshot (135 for z = 0)
search_r = maximal search radius (in this case, 300 kpc). This is really redundant, but I haven't taken the time to eliminate the variable.
pType = dark matter (dm), stars (star), or gas (gas)
partFileTxt = the text file that contains the particle file paths

Output:
Numpy file called 'kdTree_10-300kpc_'+pType+'_'+run+'.npy'
Inside are arrays for each central sub-halo with M200,c in the range 1e11-1e13.
The first three indices of each array are the sub-halo number, the group number, and the sub-halo mass (the last of these is not used in the analysis; it is a mass determined by a binding condition rather than the typical M200,c or anything similar).
After that is M(<r) for r = 10-300 kpc.

2.) Getting the masses of the halos (M200,c, M200,m, Mvir, Mstar)

Scripts:
get_mcrit200.py
get_mcrit200_dark.py

Running:
python [script-name]

Files needed:
readsubfHDF5.py
cosmo_const.py
kdTree script output for the run of interest (see #1)

Output:
Numpy array with an array for each halo: [groupNum, m_crit200, m_mean200, m_topHat200, m_star]

Notes:
m_topHat200 = Mvir
You will have to do this for each Illustris run.
You will have to change the output file name (outFile, line #32) and the data being loaded in (line 16, variable massInsideData). Finally, everything is in units of solar masses.

3.) Figure 1, Part A - Predicting M(<r)

Scripts:
mass_dist_10-300_weighting_new.py
mass_dist_10-300_weighting_new_dark.py

Files needed:
kdTree output .npy files (see #1)
cosmo_const.py

Running:
python [script] [weight type] [mass type]
weight type = tophat or gauss (we used the latter)
mass type = 0, 1, 2, 3 (0 = total mass, 1 = DM, 2 = gas, 3 = stars)

Output:
Two numpy files, for example:
massdist_deason_e11-13_[mass type]_illustris3.npy
veldist...
where the first is the mass distribution and the other is the corresponding velocity curves. Note that the script will show the curves on screen, but you can exit and they will still save.
out_arrays = [r, median, low68Err, high68Err]

Notes:
You will have to edit outFile1 and outFile2 (lines 158 and 160) depending on the constraint (D12 or G14) and the simulation name.
You will also have to change lines 28-31 for the mass constraint at 50 kpc, and lines 48-50 to give the code the kdTree outputs that you are interested in.

4.) Figure 1, Part B - Fitting M(<r) with an NFW profile

Scripts:
overplot_massdists.py

Files needed:
cosmo_const.py
mpfit
Outputs from #3 for Illustris-1, 2, 3 for a particular constraint (e.g. D12)

Running:
python overplot_massdists.py

Output:
Will plot the two-panel plot from the paper's Figure 1, which you can then save.

Notes:
You will have to change the variables file1, file2, file3, which are the mass distributions for Illustris-1, 2, 3 (lines 19-21).

5.) Figure 2a - Velocity curves from total mass distributions

Script:
overplot_vcirc.py

Files needed:
cosmo_const.py
Outputs from #3 for Illustris-1, 2, 3 for total mass

Running:
python overplot_vcirc.py

Output:
Will plot Figure 2a (left panel), which you can then save.

Notes:
You will need to change the file1 and file2 variables (lines 11 & 12) to the D12 and G14 constraint massdist files (total mass; output from #3).

6.)
Figure 2b - Velocity curves from component mass distributions

Running:
python overplot_comp_dists.py

Files needed:
cosmo_const.py
Outputs from #3 for Illustris-1, 2, 3; in particular, the "veldist" files for each component type (dm, gas, star)

Notes:
You will have to change the file1, file2, file3 variables (lines 11-13), which are the file paths to the veldist files.

Output:
Will plot Figure 2b.

7.) Tables 1 & 2, part 1 - Weighing each individual halo by M(<50 kpc)

Scripts:
halo_weighing_m50.py
halo_weighing_m50_dark.py

Files needed:
kdTree output files for the simulation that you are interested in (see #1)

Running:
python [script] [constraint_mass] [constraint_sigma] [constraint_name]
constraint_mass = mean mass of the constraint (peak of the Gaussian)
constraint_sigma = standard deviation of the constraint
constraint_name = name of the constraint being used (e.g. D12)

Output:
Saves an array with the name given by the saveFile variable:
array = transpose([haloNum, groupNum, boolArray, totalMassInside50, totalMassInside80, totalMassInside100, totalMassInside250])
where boolArray is the weights given to each halo.

Notes:
You may need to change the inFile variables to suit the simulation being weighed.

8.) Tables 1, 2, 3 - Getting the confidence intervals (and individual posteriors)

Scripts:
probdist_mass_new_illustris.py

Files needed:
cosmo_const.py
Files from #2 and #7 (given by the variables on lines 12 and 13)

Running:
python [script] [constraint_name] [mass_type] [simulation]
constraint_name = name of the constraint (e.g. D12 or G14)
mass_type = mass over which the posterior will be made
simulation = simulation name

Output:
Will print out confidence intervals in units of 10^10 solar masses for the 68% and 90% ranges.
Will plot the posterior over [mass_type] space.
Will save an array named by the variable outFile (line 138):
array = [mass, likelihood]

Notes:
You will have to do this for all of the simulations and all masses that you would like confidence intervals for. Mass types are: mcrit200, mmean200, mvir, mstar, m100, and m250.

9.)
Figure 3 - Posterior distributions for M(<100 kpc)

Script:
overplot_likelihood_new_m100.py

Files needed:
cosmo_const.py
Illustris-1, 2, 3 posteriors for m100 for the Deason and Gibbons constraints (see #8)

Running:
python [script] [constraint_name] [mass_type] [simulation]

Output:
Will plot all of the posteriors for M(<100 kpc) over-plotted, which can then be saved.

Notes:
You may have to change the illustris#_file# variables (lines 6-12).

10.) Figure 4, part 1 - Calculating M200,c from a general M(<50 kpc) constraint

Scripts:
script_m50_mCrit200_auto
halo_weighing_m50_auto.py
cumdist_mass_auto.py

Running:
./script_m50_mCrit200_auto

Output:
This will produce an output .txt file (script_m50_mCrit200.txt) with columns: mean_number, M_crit200 mean, M_crit200 high error bar, M_crit200 low error bar.
The mean_number is used by the script to automatically generate evenly spaced mean constraints on M(<50 kpc):
mwMassMean = 10.**(np.log10(1.5*10.**11.) + (mean_number)*deltaM)
Note that, in cumdist_mass_auto.py, mean_number is given the name m50Value due to the cannibalized nature of the code (this could obviously be changed).
Finally, the errors are in the form mean +upper -lower, and assume a relative standard deviation of 10%.

Notes:
You can edit the script script_m50_mCrit200_auto by renaming it to the same name with a .txt extension.

11.) Figure 4, part 2 - Converting the .txt to a .npy file

Scripts:
txt_to_npy_auto.py

Running:
python txt_to_npy_auto.py [input_name] [output_name]
input_name = the .txt output file from #10
output_name = name of the .npy file

Output:
An .npy file with the name given by output_name:
[m50, mCrit200, highError, lowError]

12.)
Figure 4, part 3 - Fitting the scatter

Scripts:
m50_v_mCrit200_scatter.py

Running:
python m50_v_mCrit200_scatter.py

Output:
Will print out the best-fit values of a quadratic fit to the scatter of x = log10(m50/4e11) vs. y = log10(M200,c): y = Ax^2 + Bx + C
Example output:
[ 0.32509592   1.6193051   12.04541389] (results using curve_fit)
[ 0.00630671   0.00424532   0.00108466] (uncertainty from curve_fit)
[ 0.32509593   1.61930511  12.04541389] (results using leastsq)
Finally, it will plot the fit to the scatter, over-plotted on the scatter plot of m50 vs. mCrit200.

13.) Figure 4, part 4 - Final plotting of Figure 4 + median fit

Script:
m50_v_mCrit200.py

Running:
python m50_v_mCrit200.py [m50_weighting_npy]
m50_weighting_npy = output from #11

Output:
Will plot Figure 4, along with printing out the results from fitting the medians of m50 vs. mCrit200 (obtained from #10).
Example output:
[ 0.37301615  1.60312584  12.01361125] (best fit from mpfit)
[ 0.60983036  0.16185792   0.05504208] (error from mpfit)

Notes:
You will have to change the values used in calculating fitY (line 31) to whatever the best-fit parameters from the scatter fit are (see #12).

14.) Figure 5 - NFW fit to Illustris-Dark

Scripts:
nfw_fit_new.py

Running:
python nfw_fit_new.py

Output:
Will plot the M(<r) distribution from Illustris-Dark (Illustris-Dark-1 as currently written).

Notes:
Line 37 defines the name of the input data (the output of #3 for Illustris-Dark) that will be loaded in to be fit and plotted.

15.) Figure 6 - Illustris v. Illustris-Dark M(<r) and dM(<r)/dr

Script:
res_hydro_v_dark_new.py

Running:
python res_hydro_v_dark_new.py

Output:
Will output a plot of either the left or right panel of Figure 6, depending on whether you have line 118 or 119 commented out.

Notes:
You will have to change the file paths on lines 8-12:
8: Illustris-Dark output from the KD-Tree code (see #1)
9: Illustris output from the KD-Tree code for dark matter particles (see #1)
10: Illustris output from the KD-Tree code for star particles (see #1)
11: Illustris output from the KD-Tree code for gas particles (see #1)
12: Illustris output from #2
You may also have to change lines 33-34 for the range of masses you are comparing (in this case, within the 68% interval for the Illustris-1 M200,c result).

16.) Figure 7 - M50 v. Mstar v. M200,c color scatter plot

Script:
probdist_illustris1_colorscat.py

Running:
python probdist_illustris1_colorscat.py

Output:
Will plot Figure 7, which can be saved using the python GUI.

Notes:
You may want to change lines 10 and 29 for the file paths/names:
10: gives the name of the output of halo weighing (see #7)
29: loads in the data from the output of #2 (the string in np.load is the file path)

17.) Figure 8, Part A - Getting M200,c based on M50, M80, M100 with varying relative uncertainty

Scripts:
script_sigma_auto

Running:
./script_sigma_auto

Output:
Will output a .txt file with the relative uncertainty in M50 (or M80, or M100), along with M200,c and its upper and lower uncertainty (68%).

Notes:
You will have to do this for M50, M80, and M100, changing the two lines within the loop depending on whether you want M50, M80, or M100. You can edit the script by adding a .txt extension.
The two lines within the loop for each case:

M50:
python halo_weighing_m50.py [m50_value] $s
python cumdist_mass_sigma_new.py mcrit200 halo_weighing_m50_sigma.npy $s >> script_sigma_m50_mcrit200_deason.txt

M80:
python halo_weighing_m80.py [m80_value] $s
python cumdist_mass_sigma_new.py mcrit200 halo_weighing_m80_sigma.npy $s >> script_sigma_m80_mcrit200_deason.txt

M100:
python halo_weighing_m100.py [m100_value] $s
python cumdist_mass_sigma_new.py mcrit200 halo_weighing_m100_sigma.npy $s >> script_sigma_m100_mcrit200_deason.txt

18.) Figure 8, Part B - Converting the .txt files to .npy files

Scripts:
txt_to_npy_sigma_auto.py

Running:
python txt_to_npy_sigma_auto.py [txt_sigma_file] [outfile]
txt_sigma_file = output from #17
outfile = .npy version of txt_sigma_file

Output:
Will produce an .npy file containing [massUncertainty, mCrit200, highError, lowError].

Notes:
You will have to do this for M50, M80, and M100.

19.) Figure 8, Part C - Plotting M200,c based on M50, M80, M100 with varying relative uncertainty

Scripts:
plot_sigma_auto.py

Running:
python plot_sigma_auto.py

Output:
Will output Figure 8.

Notes:
You may want to change lines 6, 7, & 8 to match the names of your outputs from #18 for m50, m80, m100.

20.) Table 4 - Joint constraint from M50 & Mstar

Scripts:
halo_weighing_mstar+m50.py

Running:
python halo_weighing_mstar+m50.py [mstar_value] [mstar_sigma] [m50_value] [m50_sigma] [constraint_name]
mstar_value / m50_value = constraint on mstar / m50
mstar_sigma / m50_sigma = uncertainty on mstar / m50
constraint_name = name to label the file (usually the constraint name; e.g. D12+LN15)

Output:
Will produce .npy files similar to those from halo_weighing_m50.py, but weighed with both the M50 and Mstar preferred values.

Notes:
You can then produce confidence intervals using probdist_mass_new_illustris.py.
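The Gaussian ("gauss") weighting that runs through the steps above (e.g. #3, #7, #20) can be sketched as follows. This is a minimal illustration with hypothetical function names, not the original halo_weighing scripts: each halo gets a weight from a Gaussian likelihood of its M(<50 kpc) given the measured constraint, and posteriors on other masses (e.g. M200,c) come from weighted percentiles.

```python
import numpy as np

def gaussian_weights(m50, constraint_mass, constraint_sigma):
    """Weight each halo by a Gaussian likelihood of its M(<50 kpc)
    given the measured constraint (mean, standard deviation)."""
    return np.exp(-0.5 * ((m50 - constraint_mass) / constraint_sigma) ** 2)

def weighted_percentiles(values, weights, qs=(16.0, 50.0, 84.0)):
    """Weighted percentiles of e.g. M200,c: a median and 68% interval."""
    order = np.argsort(values)
    v, w = np.asarray(values)[order], np.asarray(weights)[order]
    cdf = np.cumsum(w) / np.sum(w)
    return np.interp(np.asarray(qs) / 100.0, cdf, v)

# Toy usage: halos whose M(<50 kpc) lies near the constraint dominate
m50 = np.array([3.0e11, 4.0e11, 5.0e11, 8.0e11])    # solar masses
m200 = np.array([0.8e12, 1.0e12, 1.3e12, 2.5e12])
w = gaussian_weights(m50, 4.0e11, 0.4e11)
lo, med, hi = weighted_percentiles(m200, w)
```

A tophat weight (the other option in #3) would simply be 1 inside a fixed window around the constraint and 0 outside.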
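The quadratic fit in step #12 (y = Ax^2 + Bx + C with x = log10(m50/4e11) and y = log10(M200,c)) can be reproduced with SciPy's curve_fit. A sketch under stated assumptions, not the original m50_v_mCrit200_scatter.py; the synthetic (x, y) arrays stand in for the scatter built from the kdTree outputs.

```python
import numpy as np
from scipy.optimize import curve_fit

def quad(x, A, B, C):
    # The model from step #12: y = A x^2 + B x + C
    return A * x**2 + B * x + C

# Synthetic scatter standing in for log10(m50/4e11) vs. log10(M200,c),
# generated around best-fit values of the same order as the paper's
rng = np.random.default_rng(1)
x = rng.uniform(-0.5, 0.5, 500)
y = quad(x, 0.33, 1.62, 12.05) + rng.normal(0.0, 0.02, x.size)

popt, pcov = curve_fit(quad, x, y)
perr = np.sqrt(np.diag(pcov))   # 1-sigma uncertainties, analogous to the script's printout
```

The script's second printed row (the diagonal of the covariance from curve_fit, square-rooted) is what `perr` computes here; leastsq, also used in #12, gives essentially identical best-fit values.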