The following is a recommended git workflow for developing notebooks. Its main goals are to encourage regular commits of notebook code and to provide release coordinators with flexibility when preparing to merge content into the master branch. Make sure you obtain contributing permissions for the repository before attempting to make a branch.
There are two types of branches:
master
Individual notebooks or stories
Most developers will do work solely within individual branches, making sure to occasionally fetch and merge changes from the master branch.
In a usual software development project, the second level would more commonly be referred to as feature branches. In this project, individual notebooks take the place of individual features.
Keeping to a single notebook per feature branch gives release coordinators the flexibility to grab notebooks that are ready and leave those that are not when it comes time to prepare for a release.
More than one developer can work on the same notebook branch, and you shouldn't be terribly concerned with checking in polished work at this level, except to the extent that it may disrupt anyone else working with you on the same notebook. A little communication goes a long way here. That said, we expect that it will largely be a single developer working on a particular notebook at a time.
Get comfortable with making regular, small commits to your notebook branch, and push to GitHub fairly often. This makes merging easier down the line, makes it easier for other developers to review progress, and ensures that your work is backed up. Make sure you are only tracking files that you are editing (avoid git add * where possible). Because Jupyter notebooks store execution metadata alongside the code, simply running a notebook will often show up as a change. In this case, git reset HEAD unstages the changes, git reset --hard discards all uncommitted changes, and git reset --hard origin/branch_name (after a git fetch) matches your branch to what's on the GitHub repo. Note that the last form also drops any local commits you have not pushed.
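As an illustration of staging only the files you meant to edit, here is a minimal sketch that runs in a throwaway repository. The file names (analysis.ipynb, other.ipynb) and the simulated metadata change are hypothetical, and git checkout -- <file> is used here as a narrower alternative to git reset --hard because it discards changes in a single file only.

```shell
#!/usr/bin/env bash
# Sketch only: runs in a throwaway repository; the file names are hypothetical.
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Developer"

# Initial state: two committed notebooks.
echo '{"cells": []}' > analysis.ipynb
echo '{"cells": []}' > other.ipynb
git add analysis.ipynb other.ipynb
git commit -q -m "Add starter notebooks"

# Simulate a real edit to one notebook and an incidental change
# (for example, new execution metadata) to the other.
echo '{"cells": ["new plot"]}' > analysis.ipynb
echo '{"cells": [], "metadata": {"run": 1}}' > other.ipynb

# Review what changed, then stage only the file you meant to edit.
git status --short
git add analysis.ipynb

# Throw away the incidental change instead of committing it.
git checkout -- other.ipynb

git commit -q -m "Add new plot to analysis notebook"
git status --short   # prints nothing: the working tree is clean
```

Naming files explicitly in git add keeps incidental notebook changes out of the commit history, which makes later reviews and merges easier to read.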
Occasionally developers working on the individual notebook branches will want to do the following:
git fetch
git merge origin/master # git fetch updates origin/master, not your local master
to make sure they’re up to date with the upstream changes and are always ready to be merged into master. This will become more important the closer those individual notebook branches get to being integrated into master.
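To judge how urgent a merge from master is, it can help to count the commits your branch is missing. The sketch below assumes a throwaway bare repository standing in for GitHub and a hypothetical branch named notebook_a; git rev-list --count is a standard git command, but using it this way is a suggestion rather than part of the workflow described here.

```shell
#!/usr/bin/env bash
# Sketch only: a throwaway "origin" and working repo; branch and file names are hypothetical.
set -euo pipefail

work="$(mktemp -d)"
git init -q --bare "$work/origin.git"
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.email "dev@example.com"
git config user.name "Example Developer"
git remote add origin "$work/origin.git"

echo "shared" > README.txt
git add README.txt
git commit -q -m "Add README"
git push -q -u origin master

# Start a notebook branch, then let master move ahead on the remote.
git checkout -q -b notebook_a
git checkout -q master
echo "release notes" > NOTES.txt
git add NOTES.txt
git commit -q -m "Add release notes"
git push -q origin master
git checkout -q notebook_a

# Refresh remote-tracking refs and count the commits this branch is missing.
git fetch -q
behind="$(git rev-list --count HEAD..origin/master)"
echo "notebook_a is $behind commit(s) behind origin/master"

# Bring the branch up to date.
git merge -q origin/master
```

A count of zero means the branch already contains everything on origin/master and no merge is needed.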
Notebooks will undergo review before being added to master. To start a review, make a pull request in the repository and add at least one other person as a reviewer. During this process, the Travis continuous integration service will run automatic code checks. To learn more about using Travis with Python, see here.
The master branch represents "releasable" code. In theory, anyone checking out code from the master branch should expect to see the best quality notebook code that we can offer. It won't necessarily be bug free, but it's free of the bugs we know about (and that are serious enough to block a release).
For commit messages, capitalize the first letter and omit the trailing period. Use the imperative: "Fix bug", not "Fixes bug" or "Fixed bug". Include the Jira task identifier in the commit message (not in the file title), for example: "CC-33 Add interactivity to the plots". For longer commit messages, write a short description followed by a blank line and then the longer description. Try to wrap at 72 characters; in Vim this is done with
:set textwidth=72
Use the body to explain what and why, but not how.
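One way to produce a message in this shape from the command line is to pass -m twice, since git joins each -m value as a separate paragraph. The sketch below runs in a throwaway repository; the notebook name plots.ipynb and the body text are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: throwaway repository; the notebook name and body text are made up.
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Developer"

echo '{"cells": []}' > plots.ipynb
git add plots.ipynb

# Each -m becomes its own paragraph, so the subject and body are
# separated by the blank line the guidelines ask for.
git commit -q \
  -m "CC-33 Add interactivity to the plots" \
  -m "Static figures hid the per-region detail reviewers asked about.
Sliders let readers explore each region without rerunning cells."

git log -1 --format=%B    # show the full message as stored
```

Running git commit without -m opens your editor instead, which is usually more comfortable for longer bodies.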
It's often helpful to tag a specific commit of the master branch when that commit represents code we know has been "released" and that users will likely not update from GitHub again until the next release. This can come in handy when tracking down bug reports that may or may not have been fixed since that release.
Individual developers working on notebook branches do not have to worry about tagging, but in the event that a release coordinator wants to tag a release, they can run the following:
git tag -a v1.0 -m "our first release"
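Tags are not sent to the remote by a plain git push, so a release coordinator would typically publish the tag explicitly. The sketch below assumes a throwaway bare repository standing in for GitHub and a hypothetical file notebook.txt; it also shows how a tag helps when checking what has changed since a release.

```shell
#!/usr/bin/env bash
# Sketch only: a throwaway "origin" and working repo; file names are hypothetical.
set -euo pipefail

work="$(mktemp -d)"
git init -q --bare "$work/origin.git"
git init -q "$work/release"
cd "$work/release"
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.email "coordinator@example.com"
git config user.name "Example Coordinator"
git remote add origin "$work/origin.git"

echo "v1 notebook" > notebook.txt
git add notebook.txt
git commit -q -m "Add notebook"
git push -q -u origin master

# Tag the released commit and publish the tag explicitly.
git tag -a v1.0 -m "our first release"
git push -q origin v1.0

# Later work lands on master after the release.
echo "bug fix" >> notebook.txt
git commit -q -am "Fix axis labels"

# List the commits made since the release was tagged.
git log --oneline v1.0..master
```

If a bug report comes in against v1.0, the final command shows which fixes the reporter does not have yet.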
Here are some of the common commands that developers will be using:
git add some_file
git add some_other_files*
git status
git commit
git push
This will create a branch locally and remotely, based off of master:
git checkout master # if not already there
git checkout -b branch_name
git push -u origin branch_name
# Fetch the remote branch and merge it into the local branch
git pull
git checkout branch_name # if not already there
git merge master
git diff staging notebook_a
Compare with upstream changes on the same branch and then merge (more complicated than git pull, but also avoids potential surprise merge conflicts):
git fetch
git checkout branch_name
git diff origin/branch_name
# assuming the changes are okay...
git merge origin/branch_name
Running git stash moves your local changes out of the way, making your repository look clean. It can be useful when you realize there are upstream changes but don't want to commit your current changes just to see those upstream changes.
When ready, you can then reapply your local changes via:
git stash pop
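The full round trip can be sketched as follows in a throwaway repository; the file name notes.txt is hypothetical, and the step where you would inspect upstream changes is left as a comment.

```shell
#!/usr/bin/env bash
# Sketch only: throwaway repository; the file name is hypothetical.
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Developer"

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# An uncommitted local edit.
echo "work in progress" >> notes.txt

git stash                # working tree now matches the last commit
git status --short       # prints nothing

# ...fetch, diff, or merge upstream changes here...

git stash pop            # the local edit is back
git status --short       # prints " M notes.txt"
```

If the upstream changes touch the same lines as the stashed edit, git stash pop reports a conflict and keeps the stash entry so nothing is lost.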