Activities facilitating scientific software development skills at the Flatiron Institute
- Describe vision for Sciware activity
- Motivate and provide an overview of the topics we plan to discuss
- Hear multiple perspectives
- Increase awareness and adoption of software development best practices to improve scientific productivity and quality.
- A positive and inclusive learning environment for all experience levels.
- A blend of lecture, activities, and discussion to facilitate equal involvement across experience levels.
- Topics general enough to be relevant to anyone doing scientific research.
- For example, agnostic to technology and programming language
- To use the Flatiron Institute as a sandbox for development of something useful for all of science.
- Blend of presentations and intro followed by hands-on activities
- 1x/month for 2 hours
- Location TBD
- Avoid discussions between a few people on a narrow topic
- Provide time for people who haven't spoken to speak/ask questions
- Find ways to be sure everybody is following and that folks aren't lost
- Make it engaging enough for experts to attend
- Make it accessible enough for novices to attend
- Not how to write code (software development), but
- How to turn code into a product (process)
- "...systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software" (IEEE)
- "...sound engineering principles to economically obtain software that is reliable and works efficiently on real machines" (Fritz Bauer, 1972)
- groups of developers and non-developers working together
- requirements, agile, scrum
- waterfall, release schedules
- QA, support, product management
- coordination between individuals with different skill sets and levels
- Tools to write code faster
- Tools to find bugs sooner
- Shorten the iteration/feedback cycle
- Invest in file structure, function naming
- Comment for your (future) self
- Write tests to prevent later changes from breaking older parts ("regression" tests)
- Productive languages and tools mean less time coding
- Reusable components mean less code to maintain
- A few minutes of optimization can save days of run time
- Good resources (people, documentation) mean less time figuring it out yourself
- People from other fields can ask deceptively simple questions
- Passive way to solicit feedback and code improvements, verify correctness
- Advertise your work and techniques to more people
- Open source is democratizing
- Openness is a fundamental value of science
- Don’t have to reinvent the wheel
- “I have no idea why, but suddenly radio astronomers are interested in fixing bugs in my package?”
- “I just wanted to learn how debuggers work, and then I ended up writing big parts of the Julia compiler and garbage collector…”
- Physics, math, and CS people spontaneously collaborating to create a huge ecosystem of ODE/PDE solving tools
- Force ourselves to ask: what does this code actually do?
- Often writing tests spurs rewrites which make code clearer and more robust
- Easier to catch problems early and know immediately if a change broke something
- Good way to onboard new contributors
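As a minimal sketch of the testing points above, the Python block below pins down the current behavior of a small function with plain assertions, so that a later change which breaks it is noticed immediately. The function `normalize` and its test are hypothetical illustrations invented for this example, not code from any Flatiron project.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean length (hypothetical example function)."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalize a zero-length vector")
    return [x / length for x in vector]


def test_normalize():
    """Regression test: records what normalize is expected to do today."""
    result = normalize([3.0, 4.0])
    assert math.isclose(result[0], 0.6)
    assert math.isclose(result[1], 0.8)
    # The result should have unit length.
    assert math.isclose(sum(x * x for x in result), 1.0)
    # Edge case: the zero vector must be rejected, not silently divided by zero.
    try:
        normalize([0.0, 0.0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for the zero vector")


test_normalize()
```

Writing the edge-case assertion is an instance of asking "what does this code actually do?": it forces a decision about the zero vector that the function might otherwise leave implicit.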
- Duplicated code
- Variable names: too short, too long, not meaningful
- Functions: too long, too many arguments, no well-defined purpose
- Comments: redundant, not present, outdated
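To make the smells listed above concrete, here is a small hedged sketch in Python: first a version with duplicated code, cryptic names, and a comment that no longer describes the function, then one possible cleanup. Both versions and all names are invented for illustration.

```python
# Before: duplicated loop, cryptic names, and an outdated comment.
def f(a, b):
    # compute the mean
    s1 = 0
    for x in a:
        s1 = s1 + x
    m1 = s1 / len(a)
    s2 = 0
    for x in b:
        s2 = s2 + x
    m2 = s2 / len(b)
    return m1 - m2


# After: the repeated loop becomes one named helper with a single purpose.
def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def mean_difference(control, treatment):
    """Return mean(control) - mean(treatment)."""
    return mean(control) - mean(treatment)
```

The two versions return the same value for the same inputs; only the readability changes.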
- Yourself, collaborators, other researchers?
- Lifetime: 1 day, 1 week, 1 month, 1 year, ...?
- A version archiver (use `cp`)
- A synchronizer (use `rsync`)
- A new, exciting way to make you and your colleagues miserable
- A tool for code collaboration through story-telling
- Change (what?)
- Author (who?)
- Message (why?)
- Why was this line of code changed?
- What was there before?
- When was this bug introduced?
- What changed in this file between v0.1 and v0.2?
- Who has been contributing to this function?
- The benefit doesn't come from the tool, but from how you use it
Participants all working to actively foster an environment which encourages participation across experience levels, coding language fluency, technology choices, and scientific disciplines.
(These will always be a work in progress and will be updated, clarified, or expanded as needed.)
"Programs must be written for people to read, and only incidentally for machines to execute." (Abelson and Sussman, 1985)
n=>(g=(o,d=N=n+o)=>N%--d?g(o,d):d-1?g(o<0?-o:~o):N)
— Arnauld, Code Golf Stack Exchange
%Coeff = [sPCA_data.Coeff, conj(sPCA_data.Coeff)]; % John April 2016
Coeff = sPCA_data.Coeff;
Freqs = sPCA_data.Freqs;
eigval = sPCA_data.eigval;
clear sPCA_data;
%rad_Freqs = sPCA_data.rad_Freqs;
n_im = size(Coeff, 2);
%n_im = (size(Coeff, 2))/2; % John April 21, 2017
%normalize the coefficients
Coeff(Freqs==0, :) = Coeff(Freqs==0, :)/sqrt(2);
for i=1:n_im %% John April 21, 2017 %% No need to double the coefficients
Coeff(:, i) = Coeff(:, i) / norm(Coeff(:, i));
end
Coeff(Freqs==0, :)=Coeff(Freqs==0, :)*sqrt(2);
%Compute bispectrum
%[ Coeff_b, toc_bispec ] = Bispec_2Drot_large( Coeff, Freqs ); %If the number of images and number of Coefficients are large use Bispec_2Drot_large
%[ Coeff_b, toc_bispec ] = Bispec_2Drot_1( Coeff, Freqs );
%[ Coeff_b, Coeff_b_r, toc_bispec ] = Bispec_2Drot_large_v2( Coeff, Freqs );
[ Coeff_b, Coeff_b_r, toc_bispec ] = Bispec_2Drot_large( Coeff, Freqs, eigval );
How much time do you need to spend to understand what the code does?
What topics do we want to cover in the next two meetings?