Using Checkmate

R
checkmate
tutorial
Author

Edna Grewers

Published

December 1, 2023

1

Intro

With the checkmate package you can test, check and assert all kinds of arguments regarding type, length and much more. You can also write your own assert-functions.

This vignette gives a broad overview of how the package works, what the output looks like, and some example functions. For more details and a complete overview, go to the package's CRAN page, specifically the reference manual checkmate.pdf, or list all functions of the package with the following line of code (the package has to be loaded first, see Setup):

lsf.str("package:checkmate")
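The intro mentions that you can write your own assert-functions. A minimal sketch of how this might look, assuming a hypothetical check-function check_even() (not part of checkmate) and the package's makeAssertionFunction() helper:

```r
library(checkmate)

# Hypothetical check-function: returns TRUE on success,
# otherwise a string describing the problem.
check_even <- function(x) {
  if (!test_integerish(x, any.missing = FALSE)) {
    return("Must be integerish without missing values")
  }
  if (any(x %% 2 != 0)) {
    return("Must contain only even numbers")
  }
  TRUE
}

# Derive an assert-function from the check-function.
assert_even <- makeAssertionFunction(check_even)

check_even(c(2, 4, 6))   # TRUE
check_even(c(2, 3))      # returns the error string
assert_even(c(2, 4, 6))  # passes without visible output
```

The check-function follows the same convention as the built-in check-functions described below: TRUE or an error string. The generated assert-function turns that string into an error.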

Setup

Install the package from CRAN with the following code, then load it with library().

install.packages("checkmate")
library(checkmate)

Different Outputs

There are three main families of functions that we use: test, check and assert. They do similar things but produce different outputs. Every function can be written in two ways: with an underscore, as in test_numeric(), or in camel case, as in testNumeric(). Both work the same. For simplicity we’ll only use the underscore style.

To show the differences in output, we check arguments for numeric input as an example. Checks for other types and attributes follow further below.
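As a quick sanity check of the two naming styles, the following sketch compares the results of both spellings on the same input. The variable x is only an illustration.

```r
library(checkmate)

x <- c(6:1, 4)

# Both spellings should give identical results.
identical(test_numeric(x), testNumeric(x))
identical(check_numeric("hallo"), checkNumeric("hallo"))
```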

Test

Test-functions test whether an argument has certain attributes and return TRUE or FALSE.

test_numeric(c(6:1, 4))
[1] TRUE
test_numeric("hallo") 
[1] FALSE

Check

Check-functions check whether an argument has certain attributes and return either TRUE or a string containing an error message. The string tells you what kind of argument was expected and what was wrong with the one you gave.

check_numeric(c(6:1, 4))
[1] TRUE
check_numeric("hallo") 
[1] "Must be of type 'numeric', not 'character'"
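Because a check-function returns either TRUE or a string, its result can be reused in your own messages. A small sketch, where the variable res and the wording of the message are only illustrative:

```r
library(checkmate)

res <- check_numeric("hallo")

# isTRUE() distinguishes the success case from the error string.
if (!isTRUE(res)) {
  message("Input rejected: ", res)
}
```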

Assert

Assert-functions assert that an argument has certain attributes and throw an error if it doesn’t. If the argument is valid, they produce no visible output. The error message contains the string from the corresponding check-function.

assert_numeric(c(6:1, 4))
assert_numeric("hallo")
Error in eval(expr, envir, enclos): Assertion on '"hallo"' failed: Must be of type 'numeric', not 'character'.

If you assign the result of assert_numeric() to a variable x, it will contain the original object that you asserted, because a successful assert returns its input invisibly.

x <- assert_numeric(c(6:1, 4))
x
[1] 6 5 4 3 2 1 4
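A typical place for assert-functions is the top of your own functions. The following is a minimal sketch with a hypothetical helper safe_mean(), which validates its input before computing anything:

```r
library(checkmate)

# Hypothetical helper: stops with an informative error
# unless x is a non-empty numeric vector without missing values.
safe_mean <- function(x) {
  assert_numeric(x, any.missing = FALSE, min.len = 1)
  mean(x)
}

safe_mean(c(6:1, 4))
```

Calling safe_mean("hallo") would stop with an assertion error instead of returning a misleading result.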

Checking for Type

With lsf.str() you can see all functions of the package.

lsf.str("package:checkmate")

You can check for specific types of arguments, e.g. numeric, number, integer, double, character, string, logical, flag, missing, or for data structures, e.g. list, data_frame, array and so on. You can also check for other properties, e.g. true, subset, named or atomic, and much more.

Depending on what kind of output you want, you choose your function.

check_list(list())
[1] TRUE
check_list(1:9)
[1] "Must be of type 'list', not 'integer'"

If you are looking for a specific function you can use:

ls("package:checkmate", pattern = "atomic") 
[1] "assert_atomic"        "assert_atomic_vector" "check_atomic"        
[4] "check_atomic_vector"  "expect_atomic"        "expect_atomic_vector"
[7] "test_atomic"          "test_atomic_vector"  

Checking for Type: Data Frames

If you want to check the types of the elements of a more complex data structure like a list or a data frame, look at the arguments of the corresponding functions. Both check_list() and check_data_frame() have the argument types.

First we create an example data frame with named rows and columns. It has numeric and character columns.

df <- data.frame(klein = 1:3, mittel = 4:6, groß = c("7", "8", "9"), row.names = c("A", "B", "C"))
df
  klein mittel groß
A     1      4    7
B     2      5    8
C     3      6    9

Now we can check the types of the elements using the types argument. You can look at the whole data frame with df, at single columns using df[1] or df["klein"], or at single rows using df[1,]. The error message tells you the first element whose type is not among the ones you allowed.

# checking the whole data frame
check_data_frame(df, types = c("numeric", "character"))
[1] TRUE
check_data_frame(df, types = "numeric")
[1] "May only contain the following types: {numeric}, but element 3 has type 'character'"
# checking individual columns
check_data_frame(df[1], types = "numeric")
[1] TRUE
check_data_frame(df["klein"], types = "numeric")
[1] TRUE
check_data_frame(df[3], types = "numeric")
[1] "May only contain the following types: {numeric}, but element 1 has type 'character'"
# checking for individual rows
df[1,]
  klein mittel groß
A     1      4    7
check_data_frame(df[1,], types = "numeric")
[1] "May only contain the following types: {numeric}, but element 3 has type 'character'"

You can also check the type of an individual element with df[1,1], or just the content of a column with df$klein or df[,1], which drops the data frame structure and returns a plain vector. In this example they all have type integer. check_data_frame() no longer works on these, because the data is no longer a data frame. Use the normal checks from above instead.

check_data_frame(df$klein, types = "numeric")
[1] "Must be of type 'data.frame', not 'integer'"
check_integer(df$klein)
[1] TRUE
check_numeric(df[,1])
[1] TRUE
check_double(df[1,1])
[1] "Must be of type 'double', not 'integer'"

Checking for Type: Lists

Lists work the same way. We create an example list a with named elements of different types.

a <- list(zahlen = 1:9, mon = month.abb, creator = "IQB")
assert_list(a, types = c("numeric", "character"))
assert_character(a$mon)
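To see what a failing type check on a list looks like, here is a short sketch that reuses the example list a and switches to the check and test variants:

```r
library(checkmate)

a <- list(zahlen = 1:9, mon = month.abb, creator = "IQB")

# Not all elements are numeric, so this returns an error string.
check_list(a, types = "numeric")

# All elements are either integer or character, so this is TRUE.
test_list(a, types = c("integer", "character"))
```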

Checking for Length

You can check if an argument is scalar, but you cannot check for arbitrary lengths this way.

check_scalar(1)
[1] TRUE
check_scalar(1:5)
[1] "Must have length 1"
ls("package:checkmate", pattern = "length")
character(0)

To check for an arbitrary length, use the len argument of the test, check or assert functions.

check_character(month.abb, len = 12)
[1] TRUE
check_character(month.abb, len = 11)
[1] "Must have length 11, but has length 12"

There are other properties you can check in the same way, e.g. min.len, max.len or unique, or the number of characters of the elements of a character vector using n.chars, min.chars or max.chars.

check_character(month.abb, n.chars = 3, min.chars = 2)
[1] TRUE
check_character(month.abb, n.chars = 2)
[1] "All elements must have exactly 2 characters, but element 1 has 3 chararacters"
check_character(month.abb, n.chars = 3, max.chars = 2)
[1] "All elements must have at most 2 characters, but element 1 has 3 characters"
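The arguments min.len, max.len and unique are mentioned above but not shown. A brief sketch of how they might be used, with made-up input vectors:

```r
library(checkmate)

# Length must lie between 1 and 12: TRUE for month.abb.
check_character(month.abb, min.len = 1, max.len = 12)

# Requires at least 13 elements: returns an error string.
check_character(month.abb, min.len = 13)

# Duplicated values are rejected when unique = TRUE.
check_character(c("Jan", "Jan"), unique = TRUE)
```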

For more information on which arguments you can check, type ?check_character in the console.

?check_character

You can also check the length of a list or the number of columns or rows of a data frame in a similar way. Here we use our example objects from above.

assert_list(a, len = 3, min.len = 2, max.len = 3)
assert_data_frame(df, min.cols = 1, max.cols = 3, ncols = 3)
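Rows can be checked like columns, using nrows, min.rows and max.rows. The sketch below uses a stand-in for the example data frame with an ASCII-only column name (gross instead of groß) so that it runs in any locale:

```r
library(checkmate)

# Stand-in for the example data frame (assumption: ASCII column names).
df <- data.frame(klein = 1:3, mittel = 4:6, gross = c("7", "8", "9"),
                 row.names = c("A", "B", "C"))

# Exactly three rows: TRUE.
check_data_frame(df, nrows = 3, min.rows = 1)

# At least four rows required: returns an error string.
check_data_frame(df, min.rows = 4)
```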

Checking Names via Subset

The list and data_frame functions do not let you check for specific names. With them you can only check whether the object is named at all, or whether the names are unique, via the argument names for lists and col.names or row.names for data frames. Again we use our example objects.

# a <- list(zahlen = 1:9, mon = month.abb, creator = "IQB")
# df <- data.frame(klein = 1:3, mittel = 4:6, groß = c("7", "8", "9"), row.names = c("A", "B", "C"))

assert_list(a, names = "named")
assert_data_frame(df, col.names = "unique")

You can check for specific names by using the subset functions. They check whether every element of x is contained in the object given in the argument choices. If x is not a subset of choices, you’ll get an error message.

# lists
check_subset(x = "mon", choices = names(a))
[1] TRUE
# data frames
check_subset(c("klein", "mittel", "groß", "größer"), choices = colnames(df))
[1] "Must be a subset of {'klein','mittel','groß'}, but has additional elements {'größer'}"
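As an alternative to the subset functions, checkmate also offers check_names(), which works on a vector of names. A hedged sketch using the example list a and the arguments must.include and subset.of:

```r
library(checkmate)

a <- list(zahlen = 1:9, mon = month.abb, creator = "IQB")

# The names must contain at least "zahlen" and "mon": TRUE.
check_names(names(a), must.include = c("zahlen", "mon"))

# A required name is absent: returns an error string.
check_names(names(a), must.include = "year")

# Only "zahlen" and "mon" are allowed, but "creator" is present:
# returns an error string.
check_names(names(a), subset.of = c("zahlen", "mon"))
```

Whether check_subset() or check_names() reads better depends on the direction of the check: names you require versus names you allow.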

You can also check for unique values or missing values in a similar way. For more information, see the help pages.

?assert_list
?assert_data_frame

Footnotes

  1. Image by Felix mittermeier on Unsplash