Cleaner Scripts

R
R-SIG
best practice
R-SIG 02.12.2023
Published

October 9, 2023

Code conventions/Style Guides

For general code styling, multiple style guides exist:

Some general things we can look at when refactoring code

Note

Refactoring is the process of making many small improvements to code without altering the code’s output/result.

Markdown vs. Quarto vs. R scripts

Markdown has more dependencies, so I now use plain .R files when I don’t need the Markdown features. In general, however, I highly recommend working with Quarto; some use cases can be found in our Quarto tutorial.
Quarto is essentially the successor to R Markdown: it has more features and brings the R Markdown magic to other programming languages.

Load data

It should always be clear which data is loaded and why. Paths should work, so keep that in mind when dependent folders are moved.
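A minimal sketch of what this can look like in practice. The folder layout and the file name survey.csv are hypothetical; the example writes a small CSV into a temporary directory (a stand-in for the project root) so that it runs anywhere.

```r
# Hypothetical project layout: <project root>/data/survey.csv
project_dir <- tempdir()  # stand-in for the real project root
data_dir <- file.path(project_dir, "data")
dir.create(data_dir, showWarnings = FALSE)

# Create a small example file so the sketch is self-contained.
write.csv(
  data.frame(id = 1:3, score = c(10, 12, 9)),
  file.path(data_dir, "survey.csv"),
  row.names = FALSE
)

# Build the path from its parts instead of hard-coding one long string,
# and fail early with a clear message if the file has been moved.
survey_path <- file.path(data_dir, "survey.csv")
if (!file.exists(survey_path)) {
  stop("Data file not found: ", survey_path)
}

# Survey responses used in the analysis (say what is loaded and why).
survey <- read.csv(survey_path)
```

Building paths relative to the project root, rather than using absolute paths such as a personal home directory, is one way to keep scripts working on other machines. The here package offers a similar approach for real projects.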

Load packages at the top of the script.

At the very minimum, write down the version numbers (use sessionInfo()).
This way, it is at least roughly reproducible which packages you used for the script.
Much better and definitely recommended: use renv or repro for your R project.
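As a rough sketch, the top of a script could look like this. Base R packages are loaded here only so that the example runs anywhere; in a real script these would be the packages the analysis depends on. The renv calls are shown as comments and assume that renv is installed.

```r
# Load all packages at the top of the script.
library(stats)
library(utils)

# Record the R version and the versions of all attached packages.
session <- sessionInfo()
print(session)

# Look up the version of one specific package.
stats_version <- packageVersion("stats")
print(stats_version)

# With renv (assumption: renv is installed), package versions can be
# pinned per project instead of only being written down:
# renv::init()      # set up a project-specific library
# renv::snapshot()  # record the package versions in renv.lock
# renv::restore()   # reinstall the recorded versions on another machine
```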

Duplications

Duplication should be avoided.
It makes the code less readable and more error-prone. The most important tools to avoid duplication in R are:

Functions

Every time we do something more than once or twice, we should put it into a function. This has several advantages:

  1. We can give the function a name that conveys to the user what happens in the function. This makes the code more readable.
  2. We can easily make changes to the function once and don’t have to update it every time the action is performed.
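Both advantages can be illustrated with a small sketch. The data and the function name standardize are made up for this example.

```r
scores <- data.frame(
  math    = c(10, 12, 9, 15),
  reading = c(20, 18, 25, 22)
)

# Duplicated version: the same computation is written out for every column.
scores$math_z    <- (scores$math - mean(scores$math)) / sd(scores$math)
scores$reading_z <- (scores$reading - mean(scores$reading)) / sd(scores$reading)

# Refactored version: the computation lives in one named function.
# A change (here: ignoring missing values) only has to be made once.
standardize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

scores$math_z2    <- standardize(scores$math)
scores$reading_z2 <- standardize(scores$reading)

# Refactoring should not change the result.
all.equal(scores$math_z, scores$math_z2)
```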

In RStudio, when the cursor is in the function name, we can press F2 to quickly jump to the function definition.
We can also put our functions into another file.

Loops

Loops and apply-functions help us to repeat actions multiple times.
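A short sketch of the same task written both ways, using made-up data: computing the mean of every element of a list.

```r
values <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# for loop: preallocate the result vector, then fill it step by step.
means_loop <- numeric(length(values))
names(means_loop) <- names(values)
for (i in seq_along(values)) {
  means_loop[i] <- mean(values[[i]])
}

# apply-style: vapply() calls mean() on every element and checks that
# each result is a single number.
means_apply <- vapply(values, mean, numeric(1))

# Both approaches give the same named vector.
all.equal(means_loop, means_apply)
```

Which form to use is largely a matter of taste; vapply() is often shorter, while a for loop can be easier to read when each step involves several operations.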

Compartmentalization

It makes sense to split big scripts into multiple smaller ones.
This also increases readability and makes it easier to get an overview of what happens in a project. For example, we could put our self-defined functions into a functions.R script and load it into our main script with source("functions.R").
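A minimal sketch of this setup. To keep the example self-contained, functions.R is written to a temporary directory first; in a real project the file would simply sit next to the main script. The helper add_one is hypothetical.

```r
# Create a small functions.R file (stand-in for a real helper script).
functions_file <- file.path(tempdir(), "functions.R")
writeLines(
  c(
    "# Self-defined helper functions",
    "add_one <- function(x) {",
    "  x + 1",
    "}"
  ),
  functions_file
)

# Main script: load the helpers, then use them.
source(functions_file)

result <- add_one(c(1, 2, 3))
print(result)
```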

Version Control

Use version control for your code. A great option is Git, combined with GitHub for hosting and collaboration.

Footnotes

  1. Image by Markus Spiske on Unsplash.