Performance optimisation: Add debug flags, and target cascadelake #210
|
The model version in the
|
|
🚀 Attempted to deploy 🖥️
|
|
🚀 Attempted to deploy 🖥️
|
|
What was the namelist error @manodeep? The WOMBATlite namelist does need to be updated with these versions. See https://github.com/ACCESS-NRI/access-om3-configs/pull/1225/changes#diff-864e627df2bf0220849373d949dfe5de1979c09c25b0eac03f7c8c647e92c89e |
|
Also |
|
🚀 Attempted to deploy 🖥️
|
|
@dougiesquire I am using the 25km IAF config. Both the 24th Feb and 17th March versions had this error on startup: |
|
I think you'll need the changes in ACCESS-NRI/access-om3-configs#1225 (merged into the 25km IAF WOMBATlite config yesterday). I believe @anton-seaice is updating the other configs as we speak |
|
Thanks both! |
|
Can confirm that the run with the first deployment is running successfully after cloning from the cherry-picked 1232 PR.
Do not merge - I am using this to test performance with various compiler flag combos. The test-bed is a 20-day run of the 25km IAF config.
debug flags
The -g3 -grecord-gcc-switches -fno-omit-frame-pointer flags are good to have and do (should?) not have any meaningful impact on performance. We might have to apply -fno-omit-frame-pointer after the optimisation flag (-O2 here), since -O1/-O2/etc implies -fomit-frame-pointer (which will presumably cancel out the previous flag). Similarly, -g implies -fno-omit-frame-pointer.
optimisation flags
Based on testing with OM2, the -xCORE-AVX2 flags produced a 5-10% performance improvement compared to -mavx2/-march=haswell etc. Testing whether a similar performance boost occurs by replacing -march=sapphirerapids -mtune=sapphirerapids with -xcascadelake.
🚀 The latest prerelease access-om3/pr210-2 at 8d863ee is here: #210 (comment) 🚀
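The flag-ordering concern in the debug flags section can be sketched in a few lines of Python. This is only an illustration: `with_frame_pointers` is a hypothetical helper, not part of the access-om3 build system, and it assumes that a later flag overrides an earlier one, which may or may not be how a given compiler resolves the conflict. It moves -fno-omit-frame-pointer to the end of the flag list so that it always follows any -O level flag.

```python
# Hypothetical helper, not part of any ACCESS-NRI build tooling.
KEEP_FP = "-fno-omit-frame-pointer"


def with_frame_pointers(flags):
    """Return a copy of `flags` with -fno-omit-frame-pointer placed last.

    Assumption: later flags take precedence over earlier ones, so putting
    the flag last stops an -O level from re-enabling -fomit-frame-pointer.
    """
    # Drop any existing occurrence, then append exactly one at the end.
    rest = [f for f in flags if f != KEEP_FP]
    return rest + [KEEP_FP]


# Debug flags listed before the optimisation level, as in the scenario above.
flags = with_frame_pointers(
    ["-g3", "-grecord-gcc-switches", "-fno-omit-frame-pointer", "-O2"]
)
print(" ".join(flags))
```

If the compiler in use already honours an explicit -fno-omit-frame-pointer regardless of position, the reordering should simply be a no-op in effect.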