Prerequisites¶
This post describes the prerequisites that I will generally assume you have if you want to work with me. It also contains a list of references where you can learn these prerequisites. Please let me know if you find any additional resources particularly useful so I can add them for the benefit of others. This list is by definition incomplete - you should regard it as a minimum.
Standards and Expectations¶
This section describes various coding standards, conventions, and expectations for your workflow. Some of these are just ideas I am developing: feedback is welcome.
Lab Notebook¶
Like an experimentalist, please keep a lab notebook where you describe each day what you are working on, and summarize at the end of the day what you have accomplished. This serves several purposes:
- Quickly start working again: you have a summary of where you left off and what still needs to be done.
- History of work: Sometimes you will remember that you solved a similar problem. This is where you can see what you did and when you solved it. (The "when" might help in identifying corrupt simulation data that was made before a bug was fixed, for example.)
- Introspection: by keeping track of what you do, you can see where you are effective and where you waste time. This can help you develop better work habits.
How you do this is your choice: I tend to keep a Notes.md file in the root of each project where I record project-specific information. The only problem with this approach is that I must search each project to get an overall history of what I have done. Suggestions on how to get the benefit of both a global and local lab notebook are welcome. Here is an example:
...
20 June 2018
============
* Working on gpe/soc.py::State1 for SOC with modified dispersion. Documenting
the use in Docs/SOC.ipynb. I am running into some issues with the initial
state preparation which seems overly complex. Revisiting to see if we can
make this simple again.
* One thing that is complicating stuff is the search for time-dependent
parameters in the `Experiment.init()` which uses
`inspect.getmembers(self, predicate=callable)`. This "gets" `fiducial_V_TF`
triggering an early call to `get_fiducial_V_TF()`. This is now preempted.
* I am also trying to make bec.py and bec2.py mirror each other better, so I
am renaming the key methods so that they match:
get_Vs_TF() -> get_V_TF()
* I tried implementing Maren's procedure of quenching into a state with SOC to
measure the frequency of rotation on the Bloch sphere in
`SOC/Catch and Release.ipynb`. This does not match the expectation: in
particular, as Omega -> 0 with delta=0 one still has w=4E_R/hbar.
* Made code a bit more robust with respect to these issues, but still have
not resolved it.
21 June 2018
============
* Resolved issue with Maren's quenching procedure: the ground state with no
phases corresponds to a state with k=-k_r in the rotating phase basis. With
this included in the Bloch sphere picture, I get the correct frequency.
...
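One possible way to get a global view from per-project notebooks, offered only as a sketch: merge the dated entries of every Notes.md into one chronological list. It assumes each project is a subdirectory of a common root and that entries start with a date line such as "20 June 2018" underlined with "=" characters, as in the example above. The function names `parse_entries` and `collect_notebook` are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path

DATE_FORMAT = "%d %B %Y"  # matches headings such as "20 June 2018"
UNDERLINE = re.compile(r"^=+\s*$")


def parse_date(line):
    """Return the date written on this line, or None if it is not a date."""
    try:
        return datetime.strptime(line.strip(), DATE_FORMAT).date()
    except ValueError:
        return None


def parse_entries(text):
    """Split notebook text into (date, body) pairs at underlined date headings."""
    lines = text.splitlines()
    entries = []
    date, body = None, []
    i = 0
    while i < len(lines):
        underlined = i + 1 < len(lines) and UNDERLINE.match(lines[i + 1])
        heading_date = parse_date(lines[i]) if underlined else None
        if heading_date is not None:
            if date is not None:
                entries.append((date, "\n".join(body).strip()))
            date, body = heading_date, []
            i += 2  # skip the heading and its underline
            continue
        if date is not None:
            body.append(lines[i])
        i += 1
    if date is not None:
        entries.append((date, "\n".join(body).strip()))
    return entries


def collect_notebook(root):
    """Merge entries from every <root>/<project>/Notes.md, oldest first."""
    merged = []
    for notes in sorted(Path(root).glob("*/Notes.md")):
        for date, body in parse_entries(notes.read_text(encoding="utf-8")):
            merged.append((date, notes.parent.name, body))
    return sorted(merged)
```

Calling `collect_notebook` on the directory that holds your projects returns `(date, project, text)` tuples, so the local files remain the single source of truth and the global history is generated on demand.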
Coding¶
Style¶
Please follow a coding style. Which guidelines you choose will depend on the language, but stick with them, and enforce them in your editor and tests.
- Python: Follow most of the recommendations in PEP 8. You can check your code with the following tools:
  - Flake8. I ignore the following warnings (this goes in the setup.cfg file in my projects):
        [flake8]
        # E221 Spaces before operators (x = 4 + 4)
        # E225 Requires space around *all* operators (I like compact x**2 for example.)
        # E226 Requires whitespace around arithmetic operators
        # E241 Extra spaces after commas a = (1, 2) is not allowed
        # W293 Blank line contains whitespace
        # W503 line break before binary operator
        ignore = E225,E226,W293,W503
        max-complexity = 15
        # This is the limit at which I can get 3 full emacs windows open.
        max-line-length = 85
Testing¶
Make sure your code is well tested using a framework like pytest, and run the tests frequently, ideally before every commit. Use a code coverage tool such as coverage, which integrates with pytest. Aim for 100% code coverage through automated tests, even though this is difficult to achieve.
- Get code working, write tests, then modify to make sure you don't break anything.
- Separate fast tests from slow tests so you can run the basic things quickly and regularly. Run the slow tests, but don't let them slow you down.
- Continuous integration (CI) is a great idea, but I have not implemented it yet, mostly due to issues with setup of Conda environments on the CI frameworks.
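As a minimal sketch of separating fast tests from slow ones, the example below uses only the standard library's unittest (pytest can also collect `unittest.TestCase` classes). The `trapezoid` function, the test names, and the `RUN_SLOW` environment variable are all hypothetical, and the "slow" test is only a stand-in for a genuinely expensive one.

```python
import math
import os
import unittest

# Hypothetical switch: slow tests run only when RUN_SLOW=1 is set.
RUN_SLOW = os.environ.get("RUN_SLOW") == "1"


def trapezoid(f, a, b, n):
    """Integrate f from a to b using the trapezoid rule with n intervals."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + interior)


class TestTrapezoid(unittest.TestCase):
    def test_linear_function_is_exact(self):
        # Fast: the trapezoid rule is exact for linear integrands.
        self.assertAlmostEqual(trapezoid(lambda x: x, 0.0, 1.0, 4), 0.5)

    @unittest.skipUnless(RUN_SLOW, "set RUN_SLOW=1 to run slow tests")
    def test_second_order_convergence(self):
        # Stand-in for a slow test: halving the step size should reduce
        # the error by about a factor of 4.
        exact = 2.0  # integral of sin(x) over [0, pi]
        coarse = abs(trapezoid(math.sin, 0.0, math.pi, 1000) - exact)
        fine = abs(trapezoid(math.sin, 0.0, math.pi, 2000) - exact)
        self.assertAlmostEqual(coarse / fine, 4.0, places=1)
```

With this layout the quick checks run on every invocation, while the expensive ones run only when requested. With pytest, custom markers such as `@pytest.mark.slow` combined with `-m "not slow"` are a common alternative.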
Version Control¶
Version control all of your work, and host it on a site like our Heptapod server (Mercurial and private work), GitLab (has an educational program, but WSU should do this as a whole; it can be done on more finely grained divisions if needed), or GitHub (good for public work).
Personally, I recommend Mercurial as I think it is much easier to use than git. If using our Heptapod server, then you should follow their workflow recommendations:
- A couple of important points and limitations:
  - Named branches can have only one head. If you want to work on an independent branch, you will need to either merge or rebase it before pushing.
Commit messages should start with a single line of at most about 80 characters summarizing the commit, followed by details. E.g.:
ENH,BUG,WIP: Almost working 1D code for homogeneous and lattice systems.
homogeneous.py:
- Added global arguments so you can increase integration precision for
testing.
- Use ufloat() for integrals (testing needs updates).
- Added quadl which computes lattice integrals and twists for
benchmarking. Note: This uses np.trapz which may not be exactly the
same as the lattice code (needs testing).
- Consistent order of arguments (mus, delta)
- Add option to Homogeneous1D.get_BCS_v_n_e() to use lattice
integration quadl() (L and N) for testing.
test_homogeneous.py:
- Started updating for ufloat() integrals... incomplete.
vortex_2d.py:
- Almost working BCS_1D() class (needs testing).
- Previous code was missing factor of 1/dV for the density matrix.
(Densities need to have this dimension.)
I use the following acronyms to start messages (multiple acronyms can be combined, such as ENH,API,TST). I try to follow the NumPy conventions.
- CHK: Checkpoint of my work. Commit often to make sure you do not lose anything, but do not push these to public repos. In Mercurial you can do the following:
      hg com -m "CHK: Checkpoint"  # Commit all files
      hg bookmark -f CHK           # Force this revision to have bookmark CHK
      hg up -r "p1(tip)"           # Update back to the parent.
      hg revert --all -r CHK       # Revert all files back to the checkpoint.
      hg bookmark -d CHK           # Clear bookmark (optional)
  I usually just do this manually with explicit revision numbers. Now I can continue working, but have a backup of everything in the CHK revision just in case I need to revert something or check.
- WIP: Work in progress. Consider squashing these, but only if it makes sense.
- STY: Spaces, lines, PEP8, etc. Cleanup that does not affect execution. (I sometimes use SPC, but STY is consistent with NumPy.)
- DOC: Update documentation. Please separate documentation updates from code updates. Update the code, then update the corresponding documentation. All notebook commits should be included here. Please strip out unneeded output before committing using nbstripout.
- ENH: Enhancement. New features etc.
- API: The public application programming interface has been changed. These revisions might require users to change their code.
- BUG: Demonstrate, work on, or fix a bug.
- TST: Add or update testing code.
- BLD: Update configuration or build scripts such as setup.py, requirements, etc.
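The convention above is easy to check mechanically. The following is a hypothetical helper, not part of the workflow described here: it verifies that the summary line starts with known acronyms and stays within 80 characters. Treating 80 as a hard limit is an assumption, since the guideline only asks for roughly that length.

```python
import re

# Acronyms listed above. The hard 80-character limit is an assumption:
# the guideline only asks for roughly 80 characters.
PREFIXES = {"CHK", "WIP", "STY", "DOC", "ENH", "API", "BUG", "TST", "BLD"}
MAX_SUMMARY_LENGTH = 80


def check_commit_message(message):
    """Return a list of problems with a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    summary = lines[0] if lines else ""
    match = re.match(r"^([A-Z,]+): \S", summary)
    if match is None:
        problems.append("summary should look like 'ENH,TST: Short description'")
    else:
        unknown = [p for p in match.group(1).split(",") if p not in PREFIXES]
        if unknown:
            problems.append("unknown acronyms: " + ", ".join(repr(p) for p in unknown))
    if len(summary) > MAX_SUMMARY_LENGTH:
        problems.append(f"summary is longer than {MAX_SUMMARY_LENGTH} characters")
    return problems
```

A check like this could be run by hand or from a commit hook; the example summary line shown earlier passes it.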
Administration, Shell, Etc.¶
Working with the Shell¶
You need to be comfortable using a Linux shell (I generally use bash). Even if you do not use a Linux computer, you will need to log in to other computers such as HPC clusters, where you will be presented with a shell. In particular, you should be able to do the following:
Running Programs
- Run programs from the command line.
- Know why one might need ./program rather than just program.
- Capture the output and send it to a file and send errors to /dev/null.
- Use tee.
- Run a program in the background, bring it to the foreground, kill it etc.
- See what processes are running with ps and jobs, and then be able to send signals to running programs using kill or kill -KILL.
- Use GNU screen to run a program in a virtual terminal and reconnect if you are logged out of a server.
- Use nohup appropriately.
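The same redirection ideas carry over when you launch programs from Python. As a rough sketch (the helper name `run_and_capture` is hypothetical), the function below behaves like the shell line `command > output_path 2> /dev/null`:

```python
import subprocess


def run_and_capture(command, output_path):
    """Run command, writing stdout to output_path and discarding stderr.

    `command` is a list such as ["ls", "-l"]. Returns the exit code.
    """
    with open(output_path, "w") as output_file:
        result = subprocess.run(
            command,
            stdout=output_file,         # like "> output_path"
            stderr=subprocess.DEVNULL,  # like "2> /dev/null"
        )
    return result.returncode
```

Passing the command as a list avoids involving the shell at all, so no quoting of arguments is needed.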
Environment
- Inspect and set environment variables.
- Manage your environment with startup files .bashrc, .profile, etc. Know the difference between these. Have a good process for quickly getting started working on a new computer. (I store all my settings in my configurations project, so I can just clone this and run a command mmf_initial_setup to get going.)
- Know how to use the module command (mostly for HPC clusters).
Shell Tools
- Use find, locate, grep etc. to look for information in files, or in the output of commands.
Networking¶
We often need to work on other computers, so you need to have some basic knowledge about how to connect to other computers on the internet. The standard and secure approach is to use SSH:
SSH
- Use ssh to log into other computers.
- Use ssh-keygen to set up passwordless login. (Know about ~/.ssh/authorized_keys.)
- Use ssh to forward ports from one computer to another. For example, we might like to connect to a Jupyter notebook server running on one computer (a remote server) using a browser on our local machine (laptop). One can use port forwarding to do this securely without having to expose the server to the world (thereby allowing the server to be hacked).
- Use scp to copy files and directories securely from one machine to another.
- Use rsync in combination with ssh to do the same, but only sending changed files.
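To make the port-forwarding item concrete, here is a small sketch that only builds (and does not run) the corresponding ssh command. The host name, the helper `port_forward_command`, and the choice of port 8888 (Jupyter's usual default) are assumptions for illustration; `-N` and `-L` are standard OpenSSH options.

```python
import shlex


def port_forward_command(remote, local_port=8888, remote_port=8888):
    """Build an ssh command forwarding local_port to remote_port on remote."""
    forward = f"{local_port}:localhost:{remote_port}"
    # -N: do not run a remote command, just forward.
    # -L: listen on local_port and forward connections to the remote side.
    return ["ssh", "-N", "-L", forward, remote]


command = port_forward_command("user@cluster.example.edu")
print(shlex.join(command))
```

While such a command is running, pointing a local browser at `http://localhost:8888` reaches the notebook server through the encrypted SSH connection, so the server itself never needs to be exposed to the internet.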
Many resources are available on the world-wide web, so you should understand the following:
HTTP
- Know what a URL is: e.g. https://www.google.com:443.
- Know what an IP address is (note that there is a new standard IPv6 that is starting to be used more frequently).
- Know the difference between http and https.
- Know what a proxy is and how or why we might need to use one.
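As a quick illustration of the pieces of a URL and of the two IP address families, the sketch below uses only the Python standard library. The helper `effective_port` and the example addresses (taken from ranges reserved for documentation) are illustrative assumptions.

```python
import ipaddress
from urllib.parse import urlsplit

# A URL combines a scheme, a host, and optionally a port.
parts = urlsplit("https://www.google.com:443")
print(parts.scheme, parts.hostname, parts.port)

# Well-known default ports, used when the URL does not give one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


# IPv4 and IPv6 addresses are distinguished by their version.
print(ipaddress.ip_address("192.0.2.1").version)
print(ipaddress.ip_address("2001:db8::1").version)
```

This shows why `https://www.google.com` and `https://www.google.com:443` refer to the same server: 443 is simply the default port for https.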
Networking
- Use ping to test if a server is up.
- Find out the IP and MAC address of the network devices your machine is using to connect to the internet.
References¶
Editing Files¶
Get to know how to use a powerful text editor. I recommend Emacs or Vi. Whatever editor you choose, make sure you know how to:
- Efficiently cut and paste text.
- Search for content.
- Perform a search-and-replace with patterns (i.e. regexp).
- Use syntax highlighting, auto-indent, and electric parentheses.
- Expand/hide sections of the file.
- Define useful abbreviations and expand them quickly as needed.
- Change the encoding of a file.
- Run programs like pylint, pyflakes etc. to check your code.
- Run a spell checker.
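Pattern-based search-and-replace works the same way in every regexp-aware editor. As a sketch of the idea in Python, the hypothetical helper below renames an identifier only where it appears as a whole word, using the rename get_Vs_TF() -> get_V_TF() from the lab notebook example:

```python
import re


def rename_identifier(source, old, new):
    """Replace whole-word occurrences of `old` with `new` in `source`."""
    # \b matches a word boundary, so longer names that merely contain
    # `old` (such as get_Vs_TF_old) are left untouched.
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    # A function replacement inserts `new` literally, with no escape handling.
    return pattern.sub(lambda match: new, source)


code = "V = self.get_Vs_TF()\nother = get_Vs_TF_old()"
print(rename_identifier(code, "get_Vs_TF", "get_V_TF"))
```

The word-boundary anchors are what distinguish this from a plain textual replace, which would also have altered `get_Vs_TF_old`.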
Programming¶
Be familiar with Best Practices for Scientific Computing and know how to apply them to your work. (DRY, version control, testing, profiling, debugging, etc.)
Compiling¶
- How to compile software. (Even if you don't write in C++ or Fortran, there will be times you need to use a library and you need to know how to build and install it.)
- The difference between static and dynamic libraries, where they go, how to link with them etc.
- How to use Makefiles.
Python¶
- Know how to use Python. See the following:
- Python Language Tutorial: This is the place to start. Great tutorial written by the language author. Some people find this great up to and including section 5, but harder beyond that unless you are familiar with concepts like classes from other languages. At this point, Python for Dummies can be helpful.
- Python for Dummies and Python for Data Science for Dummies are useful if you do not have much exposure to data types, classes etc.
- A Student’s Guide to Python for Physical Modeling: Ome found this to be a good introduction to numpy arrays etc. and a useful place to start learning python for physics.
Debugging¶
When things go wrong, you need to know how to figure out where the problem lies. You should be able to:
- Print or inspect various quantities at points in your code.
- Instrument your code with debugging symbols (for compiled code).
- Use a debugger to interactively inspect your code.
- Use a debugger to determine where a program crashed based on a core dump.
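For Python code, the first and third items might look like the following sketch. The function `normalize` is a made-up example; `logging` and the built-in `breakpoint()` (which enters pdb by default) are part of the standard library.

```python
import logging

logger = logging.getLogger(__name__)


def normalize(values):
    """Scale values so that they sum to 1."""
    total = sum(values)
    # Inspect intermediate quantities without editing print statements
    # in and out: these messages appear only at the DEBUG level.
    logger.debug("normalize: values=%r total=%r", values, total)
    if total == 0:
        # Uncomment the next line to pause here in the pdb debugger and
        # inspect `values` interactively:
        # breakpoint()
        raise ValueError("cannot normalize values that sum to zero")
    return [v / total for v in values]


logging.basicConfig(level=logging.DEBUG)
print(normalize([1, 2, 1]))
```

Raising the level passed to `logging.basicConfig` to `logging.WARNING` silences the debug output without touching the function itself.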
Testing¶
- Unit test your code.
- Test your tests for code coverage.
Profiling and Optimization¶
- Profile your code using a profiler.
- Optimize the slow spots of your code based on the output of the profiler.
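As a minimal sketch of the first step in Python, the standard library's cProfile and pstats modules can report where time is spent. The function `slow_sum_of_squares` and the helper `profile_call` are hypothetical examples.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop, used here as something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, top=5):
    """Run func(*args) under the profiler; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the `top` most expensive entries.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The report lists the most expensive calls first, which tells you where optimization effort is likely to pay off before you change any code.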
Version Control¶
Documentation¶
Documenting your code and your work is essential. I recommend you develop a strict and regular strategy for documenting your progress. You should establish and regularly record your progress in the equivalent of an experimentalist's laboratory notebook.
LaTeX¶
- Know how to use LaTeX, including a good editing environment.
- Install an up-to-date version of TeX Live.
- https://tex.stackexchange.com
Markup Languages (for wikis, notes, etc.)¶
- ReStructuredText
- Markdown
Physics¶
Here are some resources that students have found useful for learning various topics.
Quantum Fluids¶
- A primer on quantum fluids: A gentle and practical introduction, but somewhat short on details.
- Lagrangian (co-moving) and Eulerian formulation of fluids.
Chiral Perturbation Theory and Effective Field Theory¶
- Chiral Effective Field Theory and Nuclear Forces: Good survey of the current state of affairs (as of 2011) with a very nice appendix with details about the expansion etc. Start here.
- A Primer for Chiral Perturbation Theory: This provides some additional foundational details and is a good supplement to the previous review paper.