This post describes the prerequisites that I will generally assume you have if you want to work with me. It also contains a list of references where you can learn these prerequisites. Please let me know if you find any additional resources particularly useful so I can add them for the benefit of others. This list is by definition incomplete - you should regard it as a minimum.
Table of Contents
- 1 Prerequisites
- 2 Standards and Expectations
- 3 Administration, Shell, Etc.
- 4 Programming
- 5 Documentation
- 6 Physics
This section describes various coding standards, conventions, and expectations for your workflow. Some of these are just ideas I am developing: feedback is welcome.
Like an experimentalist, please keep a lab notebook where you describe each day what you are working on, and summarize at the end of the day what you have accomplished. This serves several purposes:
- Quickly start working again: you have a summary of where you left off and what still needs to be done.
- History of work: Sometimes you will remember that you solved a similar problem before. This is where you can see what you solved and when. (The "when" might help in identifying corrupt simulation data that was made before a bug was fixed, for example.)
- Introspection: by keeping track of what you do, you can see where you are effective and where you waste time. This can help you develop better work habits.
How you do this is your choice: I tend to keep a `Notes.md` file in the root of each project where I record project-specific information. The only problem with this approach is that I must search each project to get an overall history of what I have done. Suggestions on how to get the benefit of both a global and local lab notebook are welcome. Here is an example:
    ...

    20 June 2018
    ============
    * Working on gpe/soc.py::State1 for SOC with modified dispersion.
      Documenting the use in Docs/SOC.ipynb. I am running into some issues
      with the initial state preparation which seems overly complex.
      Revisiting to see if we can make this simple again.
    * One thing that is complicating stuff is the search for time-dependent
      parameters in the `Experiment.init()` which uses
      `inspect.getmembers(self, predicate=callable)`. This "gets"
      `fiducial_V_TF` triggering an early call to `get_fiducial_V_TF()`.
      This is now preempted.
    * I am also trying to make bec.py and bec2.py mirror each other better,
      so I am renaming the key methods so that they match:
      get_Vs_TF() -> get_V_TF()
    * I tried implementing Maren's procedure of quenching into a state with
      SOC to measure the frequency of rotation on the Bloch sphere in
      `SOC/Catch and Release.ipynb`. This does not match the expectation:
      in particular, as Omega -> 0 with delta=0 one still has w=4E_R/hbar.
    * Made code a bit more robust with respect to these issues, but still
      have not resolved it.

    21 June 2018
    ============
    * Resolved issue with Maren's quenching procedure: the ground state with
      no phases corresponds to a state with k=-k_r in the rotating phase
      basis. With this included in the Bloch sphere picture, I get the
      correct frequency.

    ...
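One possible way to get a global view out of per-project notes is to merge them by date. The following is only a sketch under assumptions: every project lives in its own directory under a common root, each keeps a `Notes.md`, and each day's entry starts with a date line such as `20 June 2018` underlined with `=` as in the example above. The names `parse_notes` and `merge_notebooks` are hypothetical.

```python
from datetime import datetime
from pathlib import Path
import re

# A day entry starts with a line like "20 June 2018" followed by "=====".
DATE_LINE = re.compile(r"^\d{1,2} [A-Z][a-z]+ \d{4}$")
UNDERLINE = re.compile(r"^=+$")


def parse_notes(text):
    """Split notebook text into a list of (date, body) entries."""
    lines = text.splitlines()
    entries = []
    date, body = None, []
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if DATE_LINE.match(line) and UNDERLINE.match(nxt):
            if date is not None:
                entries.append((date, "\n".join(body).strip()))
            # Assumes English month names (the default C locale).
            date, body = datetime.strptime(line, "%d %B %Y").date(), []
            i += 2  # Skip the date line and its underline.
            continue
        if date is not None:
            body.append(lines[i])
        i += 1
    if date is not None:
        entries.append((date, "\n".join(body).strip()))
    return entries


def merge_notebooks(root):
    """Collect entries from every */Notes.md under root, sorted by date."""
    merged = []
    for notes in sorted(Path(root).glob("*/Notes.md")):
        for date, body in parse_notes(notes.read_text(encoding="utf-8")):
            merged.append((date, notes.parent.name, body))
    merged.sort(key=lambda entry: entry[:2])  # By date, then project name.
    return merged
```

Each merged entry is a `(date, project, body)` tuple, so printing them in order gives a single chronological notebook while the notes themselves stay with their projects.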
Please follow a coding style. Which guidelines you choose will depend on the language, but stick with them, and enforce them in your editor and tests.
Python: Follow most of the recommendations in PEP 8. You can check your code with the following tools:
Flake8. I ignore the following warnings (this goes in the `setup.cfg` file in my projects):

    [flake8]
    # E221 Spaces before operators (x  = 4 + 4)
    # E225 Requires space around *all* operators (I like compact x**2 for example.)
    # E226 Requires whitespace around arithmetic operators
    # E241 Extra spaces after commas a = (1,  2) is not allowed
    # W293 Blank line contains whitespace
    # W503 line break before binary operator
    ignore = E225,E226,W293,W503
    max-complexity = 15
    # This is the limit at which I can get 3 full emacs windows open.
    max-line-length = 85
Make sure your code is well tested using a framework like pytest, and run the tests frequently, ideally before every commit. Use code coverage tools such as coverage, which integrates with pytest. Aim for 100% code coverage through automated tests (difficult to achieve though).
- Get code working, write tests, then modify to make sure you don't break anything.
- Separate fast tests from slow tests so you can run the basic things quickly and regularly. Run the slow tests, but don't let them slow you down.
- Continuous integration (CI) is a great idea, but I have not implemented it yet, mostly due to issues with setup of Conda environments on the CI frameworks.
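As a rough illustration of separating fast tests from slow ones, here is a minimal sketch that uses only the standard library's `unittest` so that it runs without extra packages (pytest has markers for the same purpose and can also collect `unittest` test cases). The `RUN_SLOW` environment variable and the `trapezoid` function under test are made up for this example.

```python
import os
import unittest

# Hypothetical switch: set RUN_SLOW=1 in the environment to include slow tests.
RUN_SLOW = os.environ.get("RUN_SLOW", "0") == "1"


def trapezoid(f, a, b, n):
    """Integrate f from a to b with n trapezoids (illustrative code under test)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


class TestFast(unittest.TestCase):
    def test_linear_is_exact(self):
        # The trapezoid rule is exact for linear functions.
        self.assertAlmostEqual(trapezoid(lambda x: 2 * x, 0.0, 1.0, 4), 1.0)


@unittest.skipUnless(RUN_SLOW, "slow test: set RUN_SLOW=1 to run")
class TestSlow(unittest.TestCase):
    def test_converges_for_quadratic(self):
        # Many points: stands in for an expensive convergence check.
        value = trapezoid(lambda x: x**2, 0.0, 1.0, 200_000)
        self.assertAlmostEqual(value, 1 / 3, places=9)
```

Assuming this is saved as `test_example.py` (a hypothetical name), `python -m unittest` runs only the fast tests, while `RUN_SLOW=1 python -m unittest` also runs the slow ones.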
Version control all of your work, and host it on a site like our Heptapod server (Mercurial and private work), GitLab (has an educational program; WSU should apply for this as a whole, but we can do it for more finely grained divisions if needed), or GitHub (good for public work).
Personally I recommend Mercurial as I think it is much easier to use than git. If using our Heptapod server, then you should follow their workflow recommendations.
A couple of important points and limitations:
- Named branches can have only one head. If you want to work on an independent branch, you will either need to merge or rebase that before pushing.
Commit messages should start with a single line <~ 80 characters summarizing the commit, followed by details. E.g.:
    ENH,BUG,WIP: Almost working 1D code for homogeneous and lattice systems.

    homogeneous.py:
    - Added global arguments so you can increase integration precision for
      testing.
    - Use ufloat() for integrals (testing needs updates).
    - Added quadl which computes lattice integrals and twists for
      benchmarking. Note: This uses np.trapz which may not be exactly the
      same as the lattice code (needs testing).
    - Consistent order of arguments (mus, delta)
    - Add option to Homogeneous1D.get_BCS_v_n_e() to use lattice integration
      quadl() (L and N) for testing.

    test_homogeneous.py:
    - Started updating for ufloat() integrals... incomplete.

    vortex_2d.py:
    - Almost working BCS_1D() class (needs testing).
    - Previous code was missing factor of 1/dV for the density matrix.
      (Densities need to have this dimension.)
I use the following acronyms to start messages. Multiple acronyms can be combined, as in the `ENH,BUG,WIP:` prefix of the example above. I try to follow the NumPy conventions.
- CHK: Checkpoint of my work. Commit often to make sure you do not lose anything, but do not push these to public repos. In Mercurial you can do the following:

      hg com -m "CHK: Checkpoint"   # Commit all files
      hg bookmark -f CHK            # Force this revision to have bookmark CHK
      hg up -r "p1(tip)"            # Update back to the parent.
      hg revert --all -r CHK        # Revert all files back to the checkpoint.
      hg bookmark -d CHK            # Clear bookmark (optional)

  I usually just do this manually with explicit revision numbers. Now I can continue working, but have a backup of everything in the `CHK` revision just in case I need to revert something or check.
- WIP: Work in progress. Consider squashing these, but only if it makes sense.
- STY: Spaces, lines, PEP8, etc. Cleanup that does not affect execution. (I sometimes use SPC but STY is consistent with NumPy.)
- DOC: Update documentation. Please separate documentation updates from code updates. Update the code, then update the corresponding documentation. All notebook commits should be included here. Please strip out unneeded output before committing.
- ENH: Enhancement. New features etc.
- API: The public application programming interface has been changed. These revisions might require users to change their code.
- BUG: Demonstrate, work on, or fix a bug.
- TST: Add or update testing code.
- BLD: Update configuration or build scripts such as `setup.py`, requirements etc.
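The commit message conventions above are easy to check mechanically. The following is a minimal sketch, not an existing tool: the function name `check_commit_message` is hypothetical, the set of prefixes is copied from the list above, and the 80-character limit is a hard-coded reading of "<~ 80".

```python
# Prefixes taken from the list above.
KNOWN_PREFIXES = {"CHK", "WIP", "STY", "DOC", "ENH", "API", "BUG", "TST", "BLD"}
MAX_SUMMARY_LENGTH = 80  # The text asks for roughly 80 characters or fewer.


def check_commit_message(message):
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    summary = lines[0] if lines else ""
    if len(summary) > MAX_SUMMARY_LENGTH:
        problems.append(f"summary line has {len(summary)} characters")
    # Everything before the first colon is treated as the prefix list.
    prefix, sep, rest = summary.partition(":")
    if not sep:
        problems.append("summary line has no 'PREFIX:' part")
    else:
        unknown = [p for p in prefix.split(",") if p.strip() not in KNOWN_PREFIXES]
        if unknown:
            problems.append("unknown prefix: " + ", ".join(unknown))
        if not rest.strip():
            problems.append("summary line has no description")
    return problems
```

For example, `check_commit_message("ENH,BUG,WIP: Almost working 1D code for homogeneous and lattice systems.")` returns an empty list, while a message without a prefix is reported. A check like this could be wired into a commit hook, but that is left out here.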
You need to be comfortable using a Linux shell (I generally use bash). Even if you do not use a Linux computer, you will need to log in to other computers such as HPC clusters, where you will be presented with a shell. In particular, you should be able to do the following:
- Run programs from the command line.
- Know why one might need `./program` rather than just `program`.
- Capture the output and send it to a file and send errors to
- Run a program in the background, bring it to the foreground, kill it etc.
- See what processes are running with `jobs`, and then be able to send signals to running programs.
- Use GNU `screen` to run a program in a virtual terminal and reconnect if you are logged out of a server.
- Inspect and set environment variables.
- Manage your environment with startup files `.profile`, etc. Know the difference between these. Have a good process for quickly getting started working on a new computer. (I store all my settings in my configurations project, so I can just clone this and run a command `mmf_initial_setup` to get going.)
- Know how to use the `module` command (mostly for HPC clusters).
- `grep` etc. to look for information in files, or in the output of commands.
We often need to work on other computers, so you need to have some basic knowledge about how to connect to other computers on the internet. The standard and secure approach is to use SSH:
- `ssh` to log into other computers.
- `ssh-keygen` to set up passwordless login.
- `ssh` to forward ports from one computer to another. For example, we might like to connect to a `jupyter` notebook server running on one computer (a remote server) using a browser on our local machine (laptop). One can use port forwarding to do this securely without having to expose the server to the world (which would leave it open to being hacked).
- `scp` to copy files and directories securely from one machine to another.
- `rsync` in combination with `ssh` to do the same, but only sending changed files.
Many resources are available on the world-wide web, so you should understand the following:
- Know what a URL is: i.e.
- Know what an IP address is (note that there is a new standard IPv6 that is starting to be used more frequently).
- Know the difference between
- Know what a proxy is and how or why we might need to use one.
- `ping` to test if a server is up.
- Find out the IP and MAC address of the network devices your machine is using to connect to the internet.
- Efficiently cut and paste text.
- Search for content.
- Perform a search-and-replace with patterns (i.e. regexp).
- Syntax highlighting, auto indent, electric parentheses.
- Expand/hide sections of the file.
- Define useful abbreviations and expand them quickly as needed.
- Change the encoding of a file.
- Run programs like `pyflakes` etc. to check your code.
- Run a spell checker.
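Pattern-based search-and-replace works much the same way in most editors and in code. As a small illustration in Python, here is the rename `get_Vs_TF()` -> `get_V_TF()` mentioned in the lab notebook example, applied with a regular expression. The sample source text, including `get_Vs_TF_old`, is made up for this sketch.

```python
import re

source = """
V = self.get_Vs_TF()
Vs = [s.get_Vs_TF(x) for s in states]
other = self.get_Vs_TF_old()
"""

# \b marks a word boundary, and (?=\() requires an opening parenthesis
# right after the name, so get_Vs_TF_old is left alone.
pattern = re.compile(r"\bget_Vs_TF(?=\()")
renamed = pattern.sub("get_V_TF", source)
print(renamed)
```

A plain text replace of `get_Vs_TF` would also have changed `get_Vs_TF_old`, which is the kind of mistake a pattern helps avoid.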
Be familiar with Best Practices for Scientific Computing and know how to apply them to your work. (DRY, version control, testing, profiling, debugging, etc.)
- How to compile software. (Even if you don't write in C++ or Fortran, there will be times you need to use a library and you need to know how to build and install it.)
- The difference between static and dynamic libraries, where they go, how to link with them etc.
- How to use Makefiles.
- Know how to use Python. See the following:
- Python Language Tutorial: This is the place to start. Great tutorial written by the language author. Some people find that this is great up to and including section 5, but gets harder beyond this unless you are familiar with concepts like classes from other languages. At this point, Python for Dummies can be helpful.
- Python for Dummies and Python for Data Science for Dummies are useful if you do not have much exposure to data types, classes etc.
- A Student’s Guide to Python for Physical Modeling: Ome found this to be a good introduction to numpy arrays etc. and a useful place to start learning python for physics.
When things go wrong, you need to know how to figure out where the problem lies. You should be able to:
- Print or inspect various quantities at points in your code.
- Instrument your code with debugging symbols (for compiled code).
- Use a debugger to interactively inspect your code.
- Use a debugger to determine where a program crashed based on a core dump.
- Unit test your code.
- Test your tests for code coverage.
- Profile your code using a profiler.
- Optimize the slow spots of your code based on the output of the profiler.
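For the profiling steps in Python, the standard library's `cProfile` and `pstats` modules are one option. Here is a minimal sketch: the functions `slow_sum`, `fast_sum`, and `work` are made up to give the profiler something to measure.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive: builds a list before summing it."""
    return sum([i * i for i in range(n)])


def fast_sum(n):
    """Closed form for 0**2 + 1**2 + ... + (n-1)**2."""
    return (n - 1) * n * (2 * n - 1) // 6


def work():
    return slow_sum(200_000), fast_sum(200_000)


profiler = cProfile.Profile()
profiler.enable()
results = work()
profiler.disable()

# Sort by cumulative time and show the five most expensive entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report should show `slow_sum` dominating the run time, which is the hint about where to optimize. Both functions return the same value, so a unit test can guard the optimized version against regressions.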
Documenting your code and your work is essential. I recommend you develop a strict and regular strategy for documenting your progress. You should establish and regularly record your progress in the equivalent of an experimentalist's laboratory notebook.
- How to use LaTeX. (Including a good editing environment.)
- Install an up-to-date version of TeXLive.
Markup Languages (for wikis, notes, etc.)
Here are some resources that students have found useful for learning various topics.
A primer on quantum fluids: A gentle and practical introduction, but somewhat short on details.
Lagrangian (co-moving) and Eulerian formulation of fluids.
- Chiral Effective Field Theory and Nuclear Forces: Good survey of the current state of affairs (as of 2011) with a very nice appendix with details about the expansion etc. Start here.
- A Primer for Chiral Perturbation Theory: This provides some additional foundational details and is a good supplement to the previous review paper.