179 changes: 122 additions & 57 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,73 +1,138 @@
# DART_CASES
DART CASE directories from CESM experiments.

This is a way for us to curate what happens during the continued cycling of a CESM experiment.
Some (most?) things are expected to stay static, but better to be safe than sorry.

There is a complication that arises from trying to create a git directory
from an existing CESM CASE directory, and there is also a complication that
arises from trying to create a CESM CASE directory from an existing git directory.

The following strategy is based on the fact the 'master' branch of the git repository
should be empty and each experiment/case will be a unique branch name that will reflect
the CESM 'case' name.

The strategy is:

1. create the CESM CASE in the usual way
2. _clone_ the git repository into a temporary directory ... maybe named **bob** . You should get a branch called _master_ which is (hopefully) empty except for the (hidden) git administration files and this README.
3. make a new git _branch_ in **bob** ... with the same name as the CASE. In the example below, the use of <your_casename> is context-sensitive. We are trying to make a git branch with the same name as the CESM case directory, so sometimes <your_casename> refers to a directory, sometimes it refers to a git branch.
4. copy everything from the CESM CASE to **bob**
5. move the CESM CASE directory _out of the way_ ... maybe call it **backup**
6. rename **bob** to be the original CESM CASE directory name
7. compare the new CESM CASE directory with **backup**
8. add files to the local git repository - this should be on the branch that matches your CASE. You can confirm with _git status_
9. commit them to the local git repository
10. push the contents of the local git repository back to GitHub. When you cloned the repository in Step 2,
you automatically get a _remote_ called _origin_ but the GitHub repository has no knowledge of your new branch, so there is a special syntax to push the new branch to the GitHub repository.
11. delete **backup** - or at least make it readonly to prevent you from actually using it.
DART CASE directories from CESM+DART experiments.

This presents ways to archive the experiment setups for publication purposes
and curate what happens during the continued cycling of a CESM experiment.
Some (most?) things are expected to stay static, but better to be safe than confused.

Creating a CESM case setup under git control is reasonably straightforward.
Putting an existing case there is a little more complicated.

The following strategies are based on making the 'main' ('master') branch
of the git repository have no case files in it, and each experiment/case
will be a unique branch named the same as the CESM 'CASE' name.
The separation is made clearer by putting each case in a subdirectory
that is also named after the CASE.
The most useful way to do this is to make a new git clone for each new case.
This allows multiple cases to be active, and prevents a running job
from one case making changes to files belonging to a different branch.
All of these branches will be pushed to the same github repository.
They should not be merged into main, in order to keep the size of the branches small,
which allows more to fit within github's limit on free repositories.
Review comment (Member, on lines +20 to +21):
the GitHub limit is everything (all branches) in the repo.
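
The branch-per-case layout, and the reviewer's point that size is counted over the whole repository, can be checked on a scratch repository. The following is a minimal sketch in POSIX sh (the setup scripts themselves are tcsh); the case names `case_a` and `case_b`, the file contents, and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Scratch demonstration: 'main' holds no case files, each case lives on its
# own branch in a subdirectory of the same name. All names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main      # name the first branch 'main'
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

echo "DART_CASES" > README.md
git add README.md
commit "README only"

for case in case_a case_b; do
    git checkout -q -b "$case" main        # one branch per case, started from main
    mkdir "$case"
    echo "&filter_nml /" > "$case/input.nml"
    git add "$case"
    commit "First commit of files in case $case"
done

main_files=$(git ls-tree -r --name-only main)
case_a_files=$(git ls-tree -r --name-only case_a)
echo "main:   $main_files"
echo "case_a: $case_a_files"

# Object counts and sizes cover the whole repository, i.e. every branch.
git count-objects -v
```

Because `git count-objects` reports on the object store rather than on one branch, keeping case branches unmerged keeps `main` empty of case files but does not by itself reduce the repository total.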


## Terminology

| Word            | Meaning                                                                                  |
|:----------------|:-----------------------------------------------------------------------------------------|
|CASE | your CESM CASE name, which is the name of the assimilation and the git branch |
|DART\_CASES | your github repository for archiving assimilation setups |
|CASE_git | the local clone of DART\_CASES |
|EXP | the directory where you want the CASE_git clones to be created |
|CASEROOT | the path in which CESM builds CASE and from which assimilation jobs are launched. 'EXP/CASE_git/CASE' below |
|DART | your clone of DART |
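
How these names combine into paths can be written out explicitly. This is a small sketch in POSIX sh; the values of `EXP` and `CASE` are placeholders, and only the relationship `CASEROOT = EXP/CASE_git/CASE` is taken from the table.

```shell
#!/bin/sh
# Hypothetical values; only the relationship between the names comes from the table.
EXP=/tmp/dart_experiments
CASE=my_case
CASE_git="$EXP/${CASE}_git"        # local clone of DART_CASES for this case
CASEROOT="$CASE_git/$CASE"         # where CESM builds the case
echo "clone:    $CASE_git"
echo "CASEROOT: $CASEROOT"
```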

## Create a New CASE Under git(hub)

### Preliminaries

0. If you're not an employee in NSF NCAR's Data Assimilation Research Section,
or are, but want to keep your cases separate from other DAReS CASEs,
fork https://github.com/NCAR/DART_CASES.git to your github.
1. Put a copy of list\_of\_files\_to\_commit.csh (found in the top directory
Review comment (Member):
why not add everything in the case?

of branch "main") into your shell\_scripts directory.
Review comment (Member):

this is shell_scripts in
DART/models/cam-fv/shell_scripts ?

Edit it (or rewrite it in your favorite language) to specify which files
should be under git control.
See comments in that file about how to choose the files.
2. In DART/models/\<your\_model\>/shell\_scripts/\<your\_setup\_script\>
1. Set the CASE name for this assimilation.
If you're **re**using a DART\_CASES-controlled case name,
clean the existing git branch, which might mean just

` git rm -r $CASE ; git branch -d $CASE `
Review comment (Member, on lines +47 to +51):

why would you delete an existing branch?
What if someone else made the case?


2. Specify that this case will be built in EXP/CASE\_git/CASE.
3. Add lines like this before create\_newcase

```
git clone git@github.com:<your_github>/DART_CASES.git $EXP/${CASE}_git
# Review comment (Member): this is for <your_github> rather than NCAR/DART_CASES
git branch -a | grep $CASE
if ($status == 0) exit
```
4. Add lines near the end to put CASE under git control:
+ cd \$CASEROOT
+ git checkout -b \$CASE
+ \$DART/\<your\_model\>/shell\_scripts/list\_of\_files\_to\_commit.csh
+ git commit -m "First commit of files in case \$CASE "

This will create a *branch* in your CASE_git clone which has the name \$CASE.
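
One thing to watch in the branch check above: `git branch -a | grep $CASE` matches any branch whose name merely contains `$CASE`, so a new case `run1` would be reported as existing if `run10` is already there. A possible exact-name check is sketched below in POSIX sh using `git show-ref --verify`; the helper name `branch_exists` and the branch names are hypothetical, and the setup script would need the tcsh equivalent.

```shell
#!/bin/sh
# branch_exists NAME: succeed if NAME is a local branch or a remote-tracking
# branch of 'origin'. Hypothetical helper, not part of DART or CESM.
branch_exists() {
    git show-ref --verify --quiet "refs/heads/$1" ||
    git show-ref --verify --quiet "refs/remotes/origin/$1"
}

# Scratch repository with one branch named run10.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/run10
echo "placeholder" > README.md
git add README.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m "seed"

if branch_exists run10; then exact=found;   else exact=missing;   fi
if branch_exists run1;  then partial=found; else partial=missing; fi
if git branch -a | grep -q run1; then substring=found; else substring=missing; fi

echo "run10 exact: $exact, run1 exact: $partial, run1 via grep: $substring"
```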

### Production

3. Run the setup script
This will create a *directory* in your CASE_git clone which has the name \$CASE.
Follow the instructions about checking the setup.
4. If needed, in the CASEROOT directory, edit input.nml, \*.xml, etc.
to define your assimilation job correctly.
Add, commit, and push these to a new (-u) experiment git(hub) branch.

` git push -u origin ${CASE} `

5. Submit the job
6. If git-controlled files need to be changed for future DA cycles,
add, commit, and push them to the (correct) branch.

` git push origin ${CASE} `
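
The difference between the first push (`-u`) and later pushes can be tried without touching GitHub. The sketch below, in POSIX sh, uses a local bare repository as a stand-in for `origin`; the case name, file contents, and directory names are assumptions made for the example.

```shell
#!/bin/sh
# A local bare repository stands in for the GitHub 'origin'; all names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git clone -q origin.git my_case_git 2>/dev/null   # cloning an empty repository prints a warning
cd my_case_git
CASE=my_case
git checkout -q -b "$CASE"
mkdir "$CASE"
echo "&filter_nml /" > "$CASE/input.nml"
git add "$CASE"
git -c user.name=demo -c user.email=demo@example.com commit -q -m "First commit of files in case $CASE"

# First push: -u creates the branch on origin and records it as the upstream.
git push -q -u origin "$CASE"
upstream=$(git rev-parse --abbrev-ref "@{u}")
echo "upstream: $upstream"

# Later pushes can name the branch explicitly, as in the README.
echo "debug = .false." >> "$CASE/input.nml"
git -c user.name=demo -c user.email=demo@example.com commit -q -am "Update input.nml"
git push -q origin "$CASE"
remote_head=$(git ls-remote --heads origin "$CASE" | cut -f1)
local_head=$(git rev-parse HEAD)
echo "origin and local agree: $remote_head"
```

The `-u` flag is only about recording the upstream; the branch would also be created on `origin` by a plain `git push origin ${CASE}`.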

### Put an Existing CASE Under git(hub) Control in its Own Directory

Example commands follow this list of steps.

1. Change the CASE directory name (made by CESM) to ${CASE}\_orig. (Not strictly necessary.)
2. Clone the DART\_CASES repository into ${CASE}\_git.
3. In ${CASE}\_git make a new *branch* with the name ${CASE} and check it out.
4. Copy everything from ${CASE}\_orig into a new subdirectory ./${CASE}
5. Compare this copy with ${CASE}\_orig
6. Make ${CASE}\_orig read-only, so that you don't accidentally use it.
7. Update the CASEROOT variable in CESM.
8. `git add` the files that should be archived. You can use a variant of list\_of\_files\_to\_commit.csh
9. Commit them (with a useful message)
10. Push the contents of the CASE branch to DART\_CASES. There is a special syntax to push the new branch to the GitHub repository when the branch does not exist yet in the "origin".
Review comment (Member):

Are you talking about push -u here?
You can push any branch to the origin.

11. Ongoing: add, commit, and push any changes to the CASE which should be archived.


#### Tcsh Example

```
example[1]% cd cases/<your_casename>
[step 0]% set CASE = <your_casename>

example[-]% cd ..
[step 1]% mv ${CASE} ${CASE}_orig

example[2]% git clone git@github.com:NCAR/DART_CASES.git bob
Cloning into 'bob'...
remote: Enumerating objects: 6, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 1), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (6/6), done.
Resolving deltas: 100% (1/1), done.
[step 2]% git clone git@github.com:<your_github>/DART_CASES.git ${CASE}_git

example[-]% cd bob
example[3]% git checkout -b <your_casename>
example[-]% cd ..
[step 3]% cd ${CASE}_git
[step 3]% git checkout -b ${CASE}

example[4]% rsync -av <your_casename>/ bob/
sending incremental file list
./
.case.run
.env_mach_specific.csh
.env_mach_specific.sh
...
[step 4]% rsync -av ../${CASE}_orig/ ${CASE}

example[5]% mv <your_casename> backup
[step 5]% diff -r ../${CASE}_orig ${CASE} | less

example[6]% mv bob <your_casename>
[step 6]% chmod -R a-w ../${CASE}_orig

example[7]% <satisfy yourself these directories are 'identical' - caveat the git administration files>
[step 7]% xmlchange --caseroot <full_path_new_CASEROOT>

example[-]% cd <your_casename>
[step 8]% git add <files_you_want_to_archive>

example[8]% git add <whatever_files_you_want>
[step 9]% git commit -m "First commit of files which must be archived"

example[9]% git commit
[step 10]% git push -u origin ${CASE}

example[10]% git push -u origin <your_casename>
```
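
The compare and read-only steps (5 and 6) can be rehearsed on scratch directories before touching a real case. This is a minimal sketch in POSIX sh; the directory names and file contents are invented, `cp -R` stands in for the `rsync` of step 4, and `chmod -R a-w` is one assumed way of making the backup read-only.

```shell
#!/bin/sh
# Scratch stand-ins for ${CASE}_orig and the copy inside the clone; names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p my_case_orig/CaseDocs
echo "&filter_nml /" > my_case_orig/input.nml
echo "atm_in" > my_case_orig/CaseDocs/atm_in_0001

cp -R my_case_orig my_case              # stands in for the rsync of step 4

# diff -r exits 0 only when every file in the two trees matches.
if diff -r my_case_orig my_case > /dev/null; then
    same=yes
else
    same=no
fi
echo "trees identical: $same"

# Remove write permission for everyone, recursively; read and search
# (directory execute) bits are left alone so the backup stays browsable.
chmod -R a-w my_case_orig
perms=$(ls -ld my_case_orig | cut -c1-10)
echo "backup permissions: $perms"
```

Removing only the write bits keeps the directory searchable, which a plain numeric mode such as 644 applied to a directory would not.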

## Wild West Method

If the methods above aren't available to you, then hopefully you can use parts of them
to set up your case under git control.
There are too many degrees of freedom to describe how to do it
one operating system command at a time.

example[-]% cd ..

example[11]% rm -rf backup
```
181 changes: 181 additions & 0 deletions list_of_files_to_commit.csh
Original file line number Diff line number Diff line change
@@ -0,0 +1,181 @@
#!/bin/tcsh

# A CESM case + DART has a large number of files because of the ensemble size.
# Many of them change with every job submission, but have uninformative changes.
# Here is a list of files archived for the CAM6+DART Reanalysis
# before each month of assimilation or change in the setup.
# This may not be comprehensive for your case, and probably names files which you don't need,
# especially if you're running a non-F-compset (CAM) assimilation.
# File types that have a file for each instance (member) are represented in this list
# by only the first 2 instances, so that they can be compared to see whether there are
# differences between instances.
#
# This list uses csh wildcards (*, []) to shorten the list.

foreach ftype ( \
'*.csv' \
'.case.run' \
'.env_mach_specific.csh' \
'.env_mach_specific.sh' \
'Buildconf/camconf/CESM_cppdefs' \
'Buildconf/camconf/Filepath' \
'Buildconf/camconf/atm_in' \
'Buildconf/camconf/chem_mech.doc' \
'Buildconf/camconf/chem_mech.in' \
'Buildconf/camconf/config_cache.xml' \
'Buildconf/camconf/docn_in' \
'Buildconf/camconf/drv_flds_in' \
'Buildconf/camconf/namelist' \
'Buildconf/cice.input_data_list' \
'Buildconf/ciceconf/ice_in' \
'Buildconf/ciceconf/namelist_infile' \
'Buildconf/clmconf/config_cache.xml' \
'Buildconf/clmconf/drv_flds_in' \
'Buildconf/clmconf/lnd_in' \
'Buildconf/clmconf/namelist' \
'Buildconf/cpl.input_data_list' \
'Buildconf/cplconf/*_modelio.nml_000[12]' \
'Buildconf/cplconf/drv_in' \
'Buildconf/cplconf/namelist_infile' \
'Buildconf/cplconf/seq_maps.rc' \
'Buildconf/docn.input_data_list' \
'Buildconf/docnconf/docn.streams.txt.prescribed_000[12]' \
'Buildconf/docnconf/docn_in' \
'Buildconf/docnconf/namelist_infile' \
'Buildconf/mosart.input_data_list' \
'Buildconf/mosartconf/mosart_in' \
'Buildconf/mosartconf/namelist_infile' \
'CaseDocs/*_in_000[12]' \
'CaseDocs/*_modelio.nml_000[12]' \
'CaseDocs/chem_mech.doc' \
'CaseDocs/chem_mech.in' \
'CaseDocs/docn.streams.txt.prescribed_000[12]' \
'CaseDocs/drv_flds_in' \
'CaseDocs/drv_in' \
'CaseDocs/seq_maps.rc' \
'CaseStatus' \
'DART_config' \
'Depends.intel' \
'Macros.make' \
'README.case' \
'SourceMods/*/*' \
'Tools/Makefile' \
'Tools/check_lockedfiles' \
'Tools/getTiming' \
'Tools/mkDepends' \
'Tools/mkSrcfiles' \
'Tools/save_provenance' \
'archive_metadata' \
'assimilate.csh' \
'case.build' \
'case.cmpgen_namelists' \
'case.qstatus' \
'case.setup' \
'case.st_archive' \
'case.submit' \
'check_case' \
'check_input_data' \
'compress.csh' \
'data_scripts.csh' \
'diags_rean.csh' \
'env_archive.xml' \
'env_batch.xml' \
'env_build.xml' \
'env_case.xml' \
'env_mach_pes.xml' \
'env_mach_specific.xml' \
'env_run.xml' \
'input.nml' \
'launch_cf.sh' \
'mv_to_campaign.csh' \
'no_assimilate.csh' \
'pelayout' \
'pre_purge_check.csh' \
'pre_submit.csh' \
'purge.csh' \
'repack_project.csh' \
'repack_st_arch.csh' \
'software_environment.txt' \
'stage_cesm_files' \
'stage_dart_files' \
'submit_compress.csh' \
'update_dart_namelists' \
'user_nl_*_000[12]' \
'user_nl_cpl' \
)
git add $ftype
end
exit

--------------
Other files archived in the Reanalysis github:
'assimilate.csh.template' \
'.gitignore' \
'README.md' \
'stage_cesm_files.template' \
Buildconf/cam.input_data_list.sorted
Buildconf/clm.input_data_list.sorted
CaseDocs/nml_in_0001.tar
CESM_instructions.txt
DART_instructions.txt
O2-xHost_def-envir_2022-2-23/filter
O2-xHost_def-envir_2022-2-23/minimal_build.out
O2-xHost_def-envir_2022-2-23/obs_def_mod.f90
O2-xHost_def-envir_2022-2-23/obs_kind_mod.f90
O2-xHost_obs_typeOK_2022-3-10/filter
O2-xHost_obs_typeOK_2022-3-10/obs_def_mod.f90
O2-xHost_obs_typeOK_2022-3-10/obs_kind_mod.f90
O2-xHost_obs_typeOK_2022-3-10/quickbuild.out
add_user_docn.streams.csh
all_but_submit
assim.csh.added
assim_no_debug4ben.csh
assim_post_filter.csh
backup_manually.casper
backup_manually.cpl
backup_manually.csh
bias_from_obs_seq_output.csh
call_mv_to_cs.csh
case.setup.MPI_COMM_MAX
caseroot_script_list
cesm_exe_220214-143228.tgz
chng_hybrid2branch.csh
compress.csh_proj2camp
compress_hist.csh
compress_joblogs.csh
compress_st-arch.csh
copy_atts.csh
diff.csh
env_archive.xml.original
env_batch.xml.original
env_build.xml.original
env_case.xml.original
env_mach_pes.xml.original
env_mach_specific.xml.original
env_run.xml.original
env_run_branch_first_2012.xml
env_run_branch_second_2012.xml
env_run_pre_branch.xml
finish_june
first_try
fix_yearly_atts.csh
hist_cleanup_2012-10
logs/run_environment.all_cycles_1_month
matlab_norm.csh
mover.csh
mover_proj2scratch.clm2_2011-3
mover_proj2scratch.cpl_2011-3
mover_proj2scratch.csh
mv_2011-05_to_CS_intera.atm.hist
mv_2011-05_to_CS_intera.esp.hist
mv_2011-05_to_CS_intera.logs
mv_2012-09-17-00000.csh
recreate_Reanalysis.notes
repack_hwm.csh
repack_st_arch-thru2013-12.csh
repack_st_arch.csh_proj2camp
repack_st_arch_tidy_mess.csh
sst+spinup.pptx
submit_compress_hist.csh
diags_batch.csh
not_in_DART
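
The `_000[12]` patterns in the `foreach` list of list_of_files_to_commit.csh select only the first two instances of a per-member file type. The sketch below illustrates that selection in POSIX sh on a scratch directory; the file names are hypothetical, and tcsh handles an unmatched pattern differently from sh, so this shows the selection logic rather than a drop-in replacement for the script.

```shell
#!/bin/sh
# Scratch directory with three hypothetical ensemble instances of one file type.
work=$(mktemp -d)
cd "$work"
touch user_nl_cam_0001 user_nl_cam_0002 user_nl_cam_0003 input.nml

selected=""
for pattern in 'user_nl_*_000[12]' 'input.nml' '*.csv'; do
    for f in $pattern; do               # unquoted on purpose: the shell expands the wildcard
        # An unmatched pattern stays as literal text in sh, so skip names that do not exist.
        [ -e "$f" ] || continue
        selected="$selected $f"
    done
done
echo "would be added:$selected"
```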