Reproducible manuscripts with Git
Motivation
We have already talked about reproducible manuscripts with Quarto. One big advantage of writing in a markdown language is that it works very well with a version control system like Git, which lets us leverage the many advantages of version control.
Git
Git is a version control system. Some advantages:
- History: You can see the history of your project, who did what and when. Changes in the project can be easily tracked.
- Collaboration: You can easily work together with others on the same project.
- Backup: Once pushed to a remote like GitHub, your project lies online, so you don’t have to worry about backing it up.
Git is mainly used for working on code. However, markdown files are also text files and can therefore be easily version controlled with Git. For this text I assume you are already reasonably proficient in working with GitHub. If not, you can take a look at this Getting Started Guide.
Quarto + Git
Generally, there isn’t much new here if you already work with GitHub. You set up your repo and track your R and Quarto files with Git. In terms of reproducibility, this is as transparent as we can get. If we use GitHub throughout the whole project and make the project public, everyone can track what we have done, which decisions we made and why.
We can use Issues to discuss certain points with coauthors and can use Pull Requests and Reviews to discuss changes in the manuscript or analysis.
GitHub Actions
It is considered bad practice to commit rendered documents like PDF or HTML to GitHub. Instead, build them with GitHub Actions. This way, it is always clear what the current version is and how your code relates to the built output document. GitHub Actions are a way to automate your workflow. You can set up a workflow that runs every time you push to your repository. This can be used to check your code, run tests, or even build your manuscript. The setup is a bit more complex; the complete documentation can be found here.
In this section, I’ll present one possible workflow.
Even if your repository is private, publishing a document as shown in this workflow will make it public, so in principle everyone can see it.
1. renv
First, you have to set up renv by running renv::init() in the R console.
This will create a lockfile (renv.lock) containing the package versions and a folder (renv/) in which your project-specific packages are saved.
2. render
Now you can render your Quarto project once, using the Terminal (not the Console):
quarto render
Commit and push your changes!
Don’t commit your output file, such as the rendered HTML. You can keep it from appearing in your Git interface by adding *.html to your .gitignore file.
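The ignore rule above can also be added from the Terminal. This is a minimal sketch, assuming you run it from the project root and that an existing .gitignore ends with a newline:

```shell
# Create .gitignore if it does not exist yet (touch leaves an existing file unchanged).
touch .gitignore

# Append the pattern only if that exact line is not already present,
# so running this twice does not create duplicate entries.
grep -qxF '*.html' .gitignore || printf '%s\n' '*.html' >> .gitignore

# Show the resulting ignore rules.
cat .gitignore
```

Note that an ignore rule only affects untracked files. If an HTML file was committed earlier, it stays tracked until you remove it from the index with git rm --cached.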
3. gh-pages branch
After that, you have to set up a gh-pages branch (make sure you have committed all changes before creating the branch), again in the Terminal:
git checkout --orphan gh-pages
git reset --hard # make sure all changes are committed before running this!
git commit --allow-empty -m "Initialising gh-pages branch"
git push origin gh-pages
Your published content will be built from this branch. You don’t have to touch it after setting it up; the Actions we’ll build will take care of that.
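Because git reset --hard discards uncommitted edits to tracked files, it can help to check the working tree before creating the branch. The following is only a sketch of such a check; the function name tree_is_clean is made up for this example and is not part of Git or Quarto:

```shell
# Hypothetical helper: succeeds only inside a Git working tree with nothing to commit.
tree_is_clean() {
  # Return 2 when the current directory is not inside a Git working tree.
  git rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 2
  # --porcelain prints one line per changed or untracked file; empty output means clean.
  [ -z "$(git status --porcelain)" ]
}

if tree_is_clean; then
  echo "Working tree is clean; safe to create the gh-pages branch."
else
  echo "Not a clean Git working tree; commit your changes first." >&2
fi
```

The check is deliberately conservative: untracked files also count as changes, even though git reset --hard would leave them in place.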
4. publish
Finally, you can publish your Quarto document:
quarto publish gh-pages documentname.qmd
5. Action
To trigger this publishing every time you push to your main branch on GitHub, create a new directory in your project called .github/workflows. In this directory, create a file named publish.yml and fill it with the following code:
on:
  workflow_dispatch:
  push:
    branches: main

name: Quarto Publish

jobs:
  build-deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Set up Quarto
        uses: quarto-dev/quarto-actions/setup@v2

      - name: Install R
        uses: r-lib/actions/setup-r@v2
        with:
          r-version: '4.4.1'

      - name: Install R Dependencies
        uses: r-lib/actions/setup-renv@v2
        with:
          cache-version: 1

      - name: Render and Publish
        uses: quarto-dev/quarto-actions/publish@v2
        with:
          target: gh-pages
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
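The directory and file for this step can be created from the Terminal. A minimal sketch, assuming you are in the project root:

```shell
# -p creates both .github and .github/workflows and does not fail if they already exist.
mkdir -p .github/workflows

# Create an empty publish.yml to paste the workflow into
# (touch leaves the content of an existing file unchanged).
touch .github/workflows/publish.yml

# Confirm that the file is where GitHub Actions looks for workflows.
ls -l .github/workflows
```

After pasting the workflow into publish.yml, commit and push the file like any other change; GitHub only picks up workflows that are part of the repository.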
You need to check the Read and write permissions box under Workflow permissions in the Actions section of your repository Settings.
You can find the link to the published site under Settings - Pages. Copy it into the About field of your repo.
There are other workflows available as well.
Caveats
Some words of warning: Everything is online, so you should be careful not to upload sensitive data. Also, the fact that the whole process is visible to everyone might feel weird. Still, even if you keep the repo private, version control remains a great thing!
Exercises
- Set up a GitHub repository for the quarto project you worked on in the last sessions. If you don’t upload your stuff to a cloud.
- Make up some small Issue that you can write into the Issue section on GitHub.
- Fix this Issue on a new branch. Commit the changes, using closes #Issuenumber in the commit message, push everything and open a pull request.
- Assign someone from the group as reviewer.
- Review a pull request assigned to you.
- Set up a GitHub Actions workflow that automatically renders your document.
Footnotes
Image by Towfiqu barbhuiya on Unsplash.