Lab code should be version controlled and hosted on an online repository (probably github). If you don't know how to do that, read on!
Git is a distributed version control system (DVCS). Basically, that means that you keep track of your code on whatever system you're working on, and then merge with whatever other systems need to have the code. No individual code location needs to be the "source of truth" (though in practice we can establish one).
The best way to learn git is to use it. Kevin will push you to use it frequently. If you run into problems, try to solve them on your own first, then ask Kevin for help!
In the meantime, this page has some resources to get you started.
These terms will come up frequently when using git and github, and are included here as a reference. A more complete list can be found here
stage (verb) - mark a file or change for inclusion in the next commit (eg if you run git add $FILE, then $FILE is now staged)
add - include the file or file change in the next commit
commit (verb) - register staged changes to the version history
push - move commits from your local repo to a remote. If you're not paying attention, this can lead to merge conflicts (which is OK! git is built to handle this).
pull - move commits from a remote to your local repo. If you're not paying attention, this can lead to merge conflicts.
pull request - on git social networks (eg github), an interface to comment on a branch that could be merged into the main branch. On gitlab and some other forges, this is called a "merge request" instead.
issue - on git social networks (eg github), an interface to discuss problems, potential features, or other information about a repo.
If you don't already have a github account, you can create one here. If you already have a github account, it is totally fine to use the same one. You are also welcome to create a work-specific one, though it can be complicated to manage multiple users on your local system.
If you haven't already, be sure to complete the new lab member form, and include your github username so that you can be added to lab repos.
Use git diff to look back through your changes, and if possible break them up into smaller commits (and force yourself to write informative commit messages!)
You don't need to push after every commit, but try to make sure to sync up with your remote before taking a break or ending for the day.
If you are working alone, you can get by with only the main branch. But the moment you start collaborating, you will need to use branches. It can be good to practice this habit even when it's not necessary.
If you are starting a brand new project or repository,
you can simply create a new directory (mkdir my_project),
and then inside that directory (cd my_project),
run git init to give the directory super powers.
Typically, the first file that you should create is a README.md where you describe the purpose of the repository.
Commit the readme, and you're off to the races!
As soon as is practical, create a remote on github,
and add it as the origin remote by copying the URL and running git remote add origin $URL.
Then git push -u origin main will push your local commits
and set origin as the default remote, and main on that remote as the default branch, for your local main branch.
For future pushes, you won't need to add that information.
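The steps above can be sketched as one short script. This is a minimal sketch under stated assumptions, not the lab's official setup: a local bare repository stands in for the empty repo you would create on github, and the name, email address, and commit message are placeholders.

```shell
#!/bin/sh
set -eu

# Scratch area so the sketch never touches real projects.
workdir=$(mktemp -d)

# Assumption: a local bare repo stands in for the empty github repo.
# In practice, URL is the address that github shows for your new repo.
git init --quiet --bare "$workdir/remote.git"
URL="$workdir/remote.git"

# Create the project directory and turn it into a git repository.
mkdir "$workdir/my_project"
cd "$workdir/my_project"
git init --quiet

# Placeholder identity, only needed if git is not configured yet.
git config user.name "Lab Member"
git config user.email "lab.member@example.com"

# First file: a README describing the purpose of the repository.
printf '# my_project\n\nWhat this repository is for.\n' > README.md
git add README.md
git commit --quiet -m "Add README describing the project"

# Make sure the branch is called main, whatever the local default is.
git branch -M main

# Register the remote as origin, then push and set the upstream.
git remote add origin "$URL"
git push --quiet -u origin main

# Show the current branch and the remote branch it now tracks.
git status --short --branch
```

Because of the -u flag, a plain git push or git pull from this directory now knows which remote and branch to use.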
More often, you will be starting from an existing repository.
If you are starting a new analysis project,
presentation, or poster,
you can also select the appropriate template repository,
and create a new repo on github from the template.
Otherwise, just find the repo, copy the URL, and run git clone $URL.
This will download the project to your local system
(into a directory with the same name as the project)
and set the remote as origin.
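As a rough illustration of the clone step, the sketch below builds a small stand-in repository locally so that it runs offline. The repo name my_analysis, the projects directory, and the identity are all hypothetical; with a real lab repo, only the git clone line and the two checks after it apply.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)

# Assumption: a stand-in for an existing lab repo, built locally.
# Normally URL is copied from the repo's page on github.
git init --quiet "$workdir/seed"
cd "$workdir/seed"
git config user.name "Lab Member"
git config user.email "lab.member@example.com"
echo "# my_analysis" > README.md
git add README.md
git commit --quiet -m "Add README"
git clone --quiet --bare "$workdir/seed" "$workdir/my_analysis.git"
URL="$workdir/my_analysis.git"

# The step from the text: clone the repo by URL.
mkdir "$workdir/projects"
cd "$workdir/projects"
git clone --quiet "$URL"

# The project lands in a directory named after the repo.
cd my_analysis
ls

# The place it came from is already registered as origin.
git remote -v
```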
In typical software development, branches are used to develop specific "features", typically one feature per branch. This keeps code that might be affecting other parts of the code base separate from the main branch until it's working and tested.
If you are working on software in the lab, this is also how we use branches, but if your repo is an analysis repo, the analogy breaks down a bit. Even so, you can and should use branches to organize work on a particular analysis, or when you start working on a manuscript or poster.
If nothing else, using a branch makes it easy to start a pull request (see the next section) to facilitate discussion of the thing you're working on.
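A minimal sketch of working on a branch might look like the following. The repo, the branch name growth-curve-analysis, and the file name are made up for illustration, and git checkout -b is used here only because it works on older git versions too.

```shell
#!/bin/sh
set -eu

# Throwaway repo with one commit on main (setup for the sketch only).
workdir=$(mktemp -d)
git init --quiet "$workdir/my_analysis"
cd "$workdir/my_analysis"
git config user.name "Lab Member"
git config user.email "lab.member@example.com"
echo "# my_analysis" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M main

# Start a branch for one piece of work (hypothetical name).
git checkout --quiet -b growth-curve-analysis

# Commit work on the branch as usual.
echo "notes on the growth curve analysis" > growth_curves.md
git add growth_curves.md
git commit --quiet -m "Start growth curve analysis notes"

# Back on main, the new file is absent until the branch is merged.
git checkout --quiet main
ls

# List branches; the asterisk marks the one currently checked out.
git branch
```

Keeping the work on its own branch means main stays in a known state while the analysis is still in flux.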
Pull requests (PRs) track discussion and code changes
before they are merged to the main branch.
They are not a feature of the version control system itself;
they are a feature of the "social network" layer
(eg github or gitlab).
Still, PRs are powerful tools for visualizing and organizing your work, and allow improved discussion and code review of specific topics.
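Since the PR itself lives on github rather than in git, the only command-line part is getting a branch onto the remote. The sketch below covers just that part, assuming a local bare repository as a stand-in for github and a hypothetical branch called poster-draft. Opening the PR is then done in github's web interface.

```shell
#!/bin/sh
set -eu

# Setup for the sketch: a repo whose main branch is already on the remote.
workdir=$(mktemp -d)
git init --quiet --bare "$workdir/remote.git"   # stand-in for github
git init --quiet "$workdir/my_project"
cd "$workdir/my_project"
git config user.name "Lab Member"
git config user.email "lab.member@example.com"
echo "# my_project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M main
git remote add origin "$workdir/remote.git"
git push --quiet -u origin main

# Do some work on a branch (hypothetical name and file).
git checkout --quiet -b poster-draft
echo "poster outline" > poster.md
git add poster.md
git commit --quiet -m "Add poster outline"

# Publish the branch so a PR can be opened against main.
git push --quiet -u origin poster-draft

# The commits a PR from poster-draft into main would contain.
git log --oneline main..origin/poster-draft
```

Further commits pushed to the same branch show up in the open PR, so the discussion and the code stay together.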
TODO
TODO
Each student, postdoc, or other staff member should have a personal repository that will be used to plan and keep track of progress, host meeting agendas/notes, host presentations, etc.
Each software or analysis project should have its own repo or repos.