diff --git a/README.md b/README.md
index e1d8846..acbd54e 100644
--- a/README.md
+++ b/README.md
@@ -1,73 +1,138 @@
 # DART_CASES
-DART CASE directories from CESM experiments.
-
-This is a way for us to curate what happens during the continued cycling of a CESM experiment.
-Some (most?) things are expected to stay static, but better to be safe than sorry.
-
-There is a complication that arises from trying to create a git directory
-from an existing CESM CASE directory, and there is also a complication that
-arises from trying to create a CESM CASE directory from an existing git directory.
-
-The following strategy is based on the fact the 'master' branch of the git repository
-should be empty and each experiment/case will be a unique branch name that will reflect
-the CESM 'case' name.
-
-The strategy is:
-
-1. create the CESM CASE in the usual way
-2. _clone_ the git repository into a temporary directory ... maybe named **bob** . You should get a branch called _master_ which is (hopefully) empty except for the (hidden) git administration files and this README.
-3. make a new git _branch_ in **bob** ... with the same name as the CASE. In the example below, the use of is context-sensitive. We are trying to make a git branch with the same name as the CESM case directory, so sometimes refers to a directory, sometimes it refers to a git branch.
-4. copy everything from the CESM CASE to **bob**
-5. move the CESM CASE directory _out of the way_ ... maybe call it **backup**
-6. rename **bob** to be the original CESM CASE directory name
-7. compare the new CESM CASE directory with **backup**
-8. add files to the local git repository - this should be on the branch that matches your CASE. You can confirm with _git status_
-9. commit them to the local git repository
-10. push the contents of the local git repository back to GitHub. When you cloned the repository in Step 2,
-you automatically get a _remote_ called _origin_ but the GitHub repository has no knowledge of your new branch, so there is a special syntax to push the new branch to the GitHub repository.
-11. delete **backup** - or at least make it readonly to prevent you from actually using it.
+DART CASE directories from CESM+DART experiments.
+
+This presents ways to archive the experiment setups for publication purposes
+and curate what happens during the continued cycling of a CESM experiment.
+Some (most?) things are expected to stay static, but better to be safe than confused.
+
+Creating a CESM case setup under git control is reasonably straightforward.
+Putting an existing case there is a little more complicated.
+
+The following strategies are based on making the 'main' ('master') branch
+of the git repository have no case files in it, and each experiment/case
+will be a unique branch named the same as the CESM 'CASE' name.
+The separation is made clearer by also putting each case in a subdirectory
+also named the same as the CASE.
+The most useful way to do this is to make a new git clone for each new case.
+This allows multiple cases to be active, and prevents a running job
+from one case making changes to files belonging to a different branch.
+All of these branches will be pushed to the same github repository.
+They should not be merged into main, in order to keep the size of the branches small,
+which allows more to fit within github's limit on free repositories.
+
+## Terminology
+
+| Word | Meaning|
+| :--------------- |:-----------------------------------------------------------------------------------------|
+|CASE | your CESM CASE name, which is the name of the assimilation and the git branch |
+|DART\_CASES | your github repository for archiving assimilation setups |
+|CASE_git | the local clone of DART\_CASES |
+|EXP | the directory where you want the CASE_git clones to be created |
+|CASEROOT | the path in which CESM builds CASE and from which assimilation jobs are launched. 'EXP/CASE_git/CASE' below |
+|DART | your clone of DART |
+
+## Create a New CASE Under git(hub)
+
+### Preliminaries
+
+0. If you're not an employee in NSF NCAR's Data Assimilation Research Section,
+   or are, but want to keep your cases separate from other DAReS CASEs,
+   fork https://github.com/NCAR/DART_CASES.git to your github.
+1. Put a copy of list\_of\_files\_to\_commit.csh (found in the top directory
+   of branch "main") into your shell\_scripts directory.
+   Edit it (or rewrite in your favorite language) to specify which files
+   should be under git control.
+   See comments in that file about how to choose the files.
+2. In DART/models/\/shell\_scripts/\
+   1. Set the CASE name for this assimilation.
+      If you're **re**using a DART\_CASES-controlled case name,
+      clean the existing git branch, which might mean just
+
+      ` git rm -r $CASE ; git branch -d $CASE `
+
+   2. Specify that this case will be built in EXP/CASE\_git/CASE.
+   3. Add lines like this before create\_newcase:
+
+      ```
+      git clone git@github.com:/DART_CASES.git $EXP/${CASE}_git
+      git -C $EXP/${CASE}_git branch -a | grep $CASE
+      if ($status == 0) exit
+      ```
+   4. Add lines near the end to put CASE under git control:
+
+      cd \$CASEROOT
+
+      git checkout -b \$CASE
+
+      \$DART/\/shell\_scripts/list\_of\_files\_to\_commit.csh
+
+      git commit -m "First commit of files in case \$CASE "
+
+   This will create a *branch* in your CASE_git clone which has the name \$CASE.
+
+### Production
+
+3. Run the setup script.
+   This will create a *directory* in your CASE_git clone which has the name \$CASE.
+   Follow the instructions about checking the setup.
+4. If needed, in the CASEROOT directory, edit input.nml, \*.xml, etc.
+   to define your assimilation job correctly.
+   Add, commit, and push these to a new (-u) experiment git(hub) branch.
+
+   ` git push -u origin ${CASE} `
+
+5. Submit the job.
+6. If git-controlled files need to be changed for future DA cycles,
+   add, commit, and push them to the (correct) branch.
+
+   ` git push origin ${CASE} `
+
+### Put an Existing CASE Under git(hub) Control in its Own Directory
+
+Example commands follow this list of steps.
+
+1. Change the CASE directory name (made by CESM) to ${CASE}\_orig. (Not strictly necessary.)
+2. Clone the DART\_CASES repository into ${CASE}\_git.
+3. In ${CASE}\_git make a new *branch* with the name ${CASE} and check it out.
+4. Copy everything from ${CASE}\_orig into a new subdirectory ./${CASE}.
+5. Compare this copy with ${CASE}\_orig.
+6. Make ${CASE}\_orig read-only, so that you don't accidentally use it.
+7. Update the CASEROOT variable in CESM.
+8. `git add` the files that should be archived. You can use a variant of list\_of\_files\_to\_commit.csh.
+9. Commit them (with a useful message).
+10. Push the contents of the CASE branch to DART\_CASES. There is a special syntax to push the new branch to the GitHub repository when the branch does not exist yet in the "origin".
+11. Ongoing: add, commit, and push any changes to the CASE which should be archived.
+
+
+#### Tcsh Example
 ```
-example[1]% cd cases/
+[step 0]% set CASE = 
 
-example[-]% cd ..
+[step 1]% mv ${CASE} ${CASE}_orig
 
-example[2]% git clone git@github.com:NCAR/DART_CASES.git bob
-Cloning into 'bob'...
-remote: Enumerating objects: 6, done.
-remote: Counting objects: 100% (6/6), done.
-remote: Compressing objects: 100% (4/4), done.
-remote: Total 6 (delta 1), reused 0 (delta 0), pack-reused 0
-Receiving objects: 100% (6/6), done.
-Resolving deltas: 100% (1/1), done.
+[step 2]% git clone git@github.com:/DART_CASES.git ${CASE}_git
 
-example[-]% cd bob
-example[3]% git checkout -b 
-example[-]% cd ..
+[step 3]% cd ${CASE}_git
+[step 3]% git checkout -b ${CASE}
 
-example[4]% rsync -av / bob/
-sending incremental file list
-./
-.case.run
-.env_mach_specific.csh
-.env_mach_specific.sh
-...
+[step 4]% rsync -av ../${CASE}_orig/ ${CASE}
 
-example[5]% mv backup
+[step 5]% diff -r ../${CASE}_orig ${CASE} | less
 
-example[6]% mv bob 
+[step 6]% chmod -R a-w ../${CASE}_orig
 
-example[7]% 
+[step 7]% xmlchange --caseroot 
 
-example[-]% cd 
+[step 8]% git add 
 
-example[8]% git add 
+[step 9]% git commit -m "First commit of files which must be archived"
 
-example[9]% git commit
+[step 10]% git push -u origin ${CASE}
 
-example[10]% git push -u origin 
+```
+
+## Wild West Method
+
+ If the methods above aren't available to you, then hopefully you can use parts of them
+ to set up your case under git control.
+ There are too many degrees of freedom to describe how to do it
+ one operating system command at a time.
 
-example[-]% cd ..
-example[11]% rm -rf backup
-```
diff --git a/list_of_files_to_commit.csh b/list_of_files_to_commit.csh
new file mode 100755
index 0000000..840b759
--- /dev/null
+++ b/list_of_files_to_commit.csh
@@ -0,0 +1,181 @@
+#!/bin/tcsh
+
+# A CESM case + DART has a large number of files because of the ensemble size.
+# Many of them change with every job submission, but have uninformative changes.
+# Here is a list of files archived for the CAM6+DART Reanalysis
+# before each month of assimilation or change in the setup.
+# This may not be comprehensive for your case, and probably names files which you don't need,
+# especially if you're running a non-F-compset (CAM) assimilation.
+# File types that have a file for each instance (member) are represented in this list
+# by only the first 2 instances, so that they can be compared to see whether there are
+# differences between instances.
+#
+# This list uses csh wildcards (*, []) to shorten the list.
+
+foreach ftype ( \
+   '*.csv' \
+   '.case.run' \
+   '.env_mach_specific.csh' \
+   '.env_mach_specific.sh' \
+   'Buildconf/camconf/CESM_cppdefs' \
+   'Buildconf/camconf/Filepath' \
+   'Buildconf/camconf/atm_in' \
+   'Buildconf/camconf/chem_mech.doc' \
+   'Buildconf/camconf/chem_mech.in' \
+   'Buildconf/camconf/config_cache.xml' \
+   'Buildconf/camconf/docn_in' \
+   'Buildconf/camconf/drv_flds_in' \
+   'Buildconf/camconf/namelist' \
+   'Buildconf/cice.input_data_list' \
+   'Buildconf/ciceconf/ice_in' \
+   'Buildconf/ciceconf/namelist_infile' \
+   'Buildconf/clmconf/config_cache.xml' \
+   'Buildconf/clmconf/drv_flds_in' \
+   'Buildconf/clmconf/lnd_in' \
+   'Buildconf/clmconf/namelist' \
+   'Buildconf/cpl.input_data_list' \
+   'Buildconf/cplconf/*_modelio.nml_000[12]' \
+   'Buildconf/cplconf/drv_in' \
+   'Buildconf/cplconf/namelist_infile' \
+   'Buildconf/cplconf/seq_maps.rc' \
+   'Buildconf/docn.input_data_list' \
+   'Buildconf/docnconf/docn.streams.txt.prescribed_000[12]' \
+   'Buildconf/docnconf/docn_in' \
+   'Buildconf/docnconf/namelist_infile' \
+   'Buildconf/mosart.input_data_list' \
+   'Buildconf/mosartconf/mosart_in' \
+   'Buildconf/mosartconf/namelist_infile' \
+   'CaseDocs/*_in_000[12]' \
+   'CaseDocs/*_modelio.nml_000[12]' \
+   'CaseDocs/chem_mech.doc' \
+   'CaseDocs/chem_mech.in' \
+   'CaseDocs/docn.streams.txt.prescribed_000[12]' \
+   'CaseDocs/drv_flds_in' \
+   'CaseDocs/drv_in' \
+   'CaseDocs/seq_maps.rc' \
+   'CaseStatus' \
+   'DART_config' \
+   'Depends.intel' \
+   'Macros.make' \
+   'README.case' \
+   'SourceMods/*/*' \
+   'Tools/Makefile' \
+   'Tools/check_lockedfiles' \
+   'Tools/getTiming' \
+   'Tools/mkDepends' \
+   'Tools/mkSrcfiles' \
+   'Tools/save_provenance' \
+   'archive_metadata' \
+   'assimilate.csh' \
+   'case.build' \
+   'case.cmpgen_namelists' \
+   'case.qstatus' \
+   'case.setup' \
+   'case.st_archive' \
+   'case.submit' \
+   'check_case' \
+   'check_input_data' \
+   'compress.csh' \
+   'data_scripts.csh' \
+   'diags_rean.csh' \
+   'env_archive.xml' \
+   'env_batch.xml' \
+   'env_build.xml' \
+   'env_case.xml' \
+   'env_mach_pes.xml' \
+   'env_mach_specific.xml' \
+   'env_run.xml' \
+   'input.nml' \
+   'launch_cf.sh' \
+   'mv_to_campaign.csh' \
+   'no_assimilate.csh' \
+   'pelayout' \
+   'pre_purge_check.csh' \
+   'pre_submit.csh' \
+   'purge.csh' \
+   'repack_project.csh' \
+   'repack_st_arch.csh' \
+   'software_environment.txt' \
+   'stage_cesm_files' \
+   'stage_dart_files' \
+   'submit_compress.csh' \
+   'update_dart_namelists' \
+   'user_nl_*_000[12]' \
+   'user_nl_cpl' \
+   )
+   git add $ftype
+end
+exit
+
+--------------
+Other files archived in the Reanalysis github:
+   'assimilate.csh.template' \
+   '.gitignore' \
+   'README.md' \
+   'stage_cesm_files.template' \
+Buildconf/cam.input_data_list.sorted
+Buildconf/clm.input_data_list.sorted
+CaseDocs/nml_in_0001.tar
+CESM_instructions.txt
+DART_instructions.txt
+O2-xHost_def-envir_2022-2-23/filter
+O2-xHost_def-envir_2022-2-23/minimal_build.out
+O2-xHost_def-envir_2022-2-23/obs_def_mod.f90
+O2-xHost_def-envir_2022-2-23/obs_kind_mod.f90
+O2-xHost_obs_typeOK_2022-3-10/filter
+O2-xHost_obs_typeOK_2022-3-10/obs_def_mod.f90
+O2-xHost_obs_typeOK_2022-3-10/obs_kind_mod.f90
+O2-xHost_obs_typeOK_2022-3-10/quickbuild.out
+add_user_docn.streams.csh
+all_but_submit
+assim.csh.added
+assim_no_debug4ben.csh
+assim_post_filter.csh
+backup_manually.casper
+backup_manually.cpl
+backup_manually.csh
+bias_from_obs_seq_output.csh
+call_mv_to_cs.csh
+case.setup.MPI_COMM_MAX
+caseroot_script_list
+cesm_exe_220214-143228.tgz
+chng_hybrid2branch.csh
+compress.csh_proj2camp
+compress_hist.csh
+compress_joblogs.csh
+compress_st-arch.csh
+copy_atts.csh
+diff.csh
+env_archive.xml.original
+env_batch.xml.original
+env_build.xml.original
+env_case.xml.original
+env_mach_pes.xml.original
+env_mach_specific.xml.original
+env_run.xml.original
+env_run_branch_first_2012.xml
+env_run_branch_second_2012.xml
+env_run_pre_branch.xml
+finish_june
+first_try
+fix_yearly_atts.csh
+hist_cleanup_2012-10
+logs/run_environment.all_cycles_1_month
+matlab_norm.csh
+mover.csh
+mover_proj2scratch.clm2_2011-3
+mover_proj2scratch.cpl_2011-3
+mover_proj2scratch.csh
+mv_2011-05_to_CS_intera.atm.hist
+mv_2011-05_to_CS_intera.esp.hist
+mv_2011-05_to_CS_intera.logs
+mv_2012-09-17-00000.csh
+recreate_Reanalysis.notes
+repack_hwm.csh
+repack_st_arch-thru2013-12.csh
+repack_st_arch.csh_proj2camp
+repack_st_arch_tidy_mess.csh
+sst+spinup.pptx
+submit_compress_hist.csh
+diags_batch.csh
+not_in_DART