Making Your First R Package 📦 (Instructor Notes)

SORTEE 2023

Author
Affiliation

Eric R. Scott

Communications & Cyber Technologies, University of Arizona

Introduction

Important

Go through introductions in SORTEE facilitator checklist

Introduce your background as an ecologist!

Slides

Tip

Remember, even if you don’t ever make an R package that goes on CRAN (or any R package at all), the skills covered in this workshop will be helpful!

Package Setup

Load needed packages

library(devtools)
Loading required package: usethis

Create new package with usethis

usethis::create_package("path/to/packagename")
Note

This will work best if learners start NOT in a project, so that tab auto-complete for the file path will start at ~/.

Files created

  • .gitignore - list files you don’t want to be on GitHub

  • .Rbuildignore - files that you want on GitHub, but that can’t be part of your R package

  • DESCRIPTION - contains package metadata

  • <package name>.Rproj - holds settings for the RStudio Project

  • NAMESPACE - read-only file that will be generated once we add some functions

  • R/ - this is where you’ll put R code for your package’s functions

Add a README

  • run use_readme_md() and edit the README.

  • You could use use_readme_rmd() if you want a README that executes some code, like running an example.

Functions

Add a function

  1. New .R file, save as greet.R

  2. Start with super simple:

    greet.R
    greet <- function() {
      cat("Hello!")
    }
    greet()
    Hello!
  3. load_all() and run greet()

  4. Add argument:

    greet.R
    greet <- function(name) {
      cat("Hello ", name, "!", sep = "")
    }
  5. load_all() and greet("yourname")

    greet("Eric")
    Hello Eric!
    greet()
    Error in greet(): argument "name" is missing, with no default
  6. greet() errors, so let’s add a default

    greet.R
    greet <- function(name = "User") {
      cat("Hello ", name, "!", sep = "")
    }
  7. load_all() and greet()

    greet()
    Hello User!

check()

Run check() and address stuff:

  • Edit author field in DESCRIPTION

  • run use_mit_license()

Add documentation

  1. With cursor in function definition, add roxygen skeleton and fill it out
  2. document() then ?greet
  3. Explore and commit changes
  4. Discuss roxygen2 fields and where to get more help on this
  5. What does @export mean? When would you not want a function to be exported?

check() again

should pass

Install

Should be able to install the package with install() now, load it with library(), etc.

Git/GitHub setup

If we want other people to be able to install our package, we need to put it on GitHub

Tip

This is all a one-time per computer setup

Configure Git

Get a checklist of what to do to set things up:

git_sitrep()
── Git global (user) 
• Name: 'Eric Scott'
• Email: 'scottericr@gmail.com'
• Global (user-level) gitignore file: '~/.gitignore'
• Vaccinated: TRUE
ℹ Defaulting to 'https' Git protocol
• Default Git protocol: 'https'
• Default initial branch name: 'main'


── GitHub user 

• Default GitHub host: 'https://github.com'
• Personal access token for 'https://github.com': '<discovered>'
• GitHub user: 'Aariq'
• Token scopes: 'gist, repo, user, workflow'
• Email(s): 'scottericr@gmail.com', 'ericrscott@arizona.edu (primary)'


── Active usethis project: '/Users/ericscott/Documents/GitHub/sortee-rpkg' ──



── Git local (project) 

• Name: 'Eric Scott'
• Email: 'scottericr@gmail.com'
• Default branch: 'main'
• Current local branch -> remote tracking branch:
  'main' -> 'origin/main'


── GitHub project 

• Type = 'ours'
• Host = 'https://github.com'
• Config supports a pull request = TRUE
• origin = 'cct-datascience/sortee-rpkg' (can push)
• upstream = <not configured>
• Desc = 'origin' is both the source and primary repo.
  
  Read more about the GitHub remote configurations that usethis supports at:
  'https://happygitwithr.com/common-remote-setups.html'

Add a global .gitignore to prevent you from accidentally sharing sensitive files on GitHub:

git_vaccinate()

Introduce yourself to git:

use_git_config(
  user.name = "your name",
  user.email = "youremail@arizona.com", #email you used to sign up for GitHub account
)

Set a default branch name:

#sets your default branch name to "main"
git_default_branch_configure()

Git and GitHub

There are some other things to do in git_sitrep(), but first, let’s make our first commit. Make a commit using the git pane in RStudio. Commit message can be something like “initial commit”

#check how we're doing
git_sitrep()

Set up a personal access token (PAT). This is necessary to securely send things between your computer and GitHub.

create_github_token() #takes you to external website. Add description, but keep defaults

Store that token on your computer:

gitcreds::gitcreds_set()
#paste in PAT from GitHub

Double-check that it worked:

git_sitrep()

Now everything should be set (except maybe upstream) with no ❌s

Sync with GitHub

use_git() #might not be necessary
use_github()
  • Inspect files on GitHub

  • Notice that files in .gitignore aren’t there

  • Notice how README.md is rendered

  • Now your package can be installed by anyone with remotes::install_github("username/packagename") or pak::pkg_install("username/packagename")

Add A dependency

Let’s make the output colorful with the cli package!

greet.R
greet <- function(name = "User") {
  cat("Hello ", cli::col_cyan(name), "!\n", sep = "")
}
greet("Eric")
Hello Eric!
  1. Notice the syntax—pkgname::function()—this is essential so that your package knows which package functions are from!

  2. Run check() and notice the warning:

    '::' or ':::' import not declared from: ‘cli’
  3. Solution: use_package("cli")

  4. Inspect DESCRIPTION and run check() again

Practice commit + push

Inspect on GitHub

Data

Add a data set

  • Read in dataset from URL (paste code into chat I guess?)

  • use_data() to add dataset to your package

#read in data as an R object
baked_goods <- read.csv("https://raw.githubusercontent.com/cct-datascience/sortee-rpkg/main/baked_goods_long.csv")
#use_data() on the R object to add to your package
use_data(baked_goods)
  • Inspect output and see what is changed

    • lazy data means the dataset is available when you load the package without having to run data(baked_goods). Just load your package and baked_goods is available.

    • .rda files are for saving R objects

  • Document the data set. https://r-pkgs.org/data.html#sec-documenting-data

    • Create a R/data.R, paste in the example and edit

    • Never @export a dataset, they’re available through lazy loading and this isn’t needed.

  • Run document() and check ?baked_goods

  • Run check(), commit and push

Testing

Writing tests

  • “Unit tests” check that your code gives desired output, errors when it’s supposed to, etc.

  • Set up with use_testthat()

  • Inspect changes

    • tests/

    • DESCRIPTION

  • Add a test for our function with use_test("greet")

  • Run test with button and with test() and with check()

  • Edit test to make sense for greet

test-greet.R
test_that("greet works with names", {
  expect_equal(greet("Eric"), "Hello Eric!")
})

test_that("greet works with default", {
  expect_equal(greet(), "Hello User!")
})

Continuous Integration

So you’ve got check() passing on your computer, but will your package work on a different OS? With a different version of R?

How can you find out and be alerted if changes you make will break the package for others?

Continuous Integration (CI) is a practice of automatically building your package (or running an analysis pipeline) on a variety of systems every time you make changes to your code (and push to GitHub)

We’ll use GitHub Actions and usethis to set this up.

Sharing your Package

Installing from GitHub

  • Add instructions to README for using remotes::install_github("username/packagename") or pak::pkg_install("username/packagename") to install from GitHub
  • You could also set up r-universe.

Package Website

  • Set up a website for your package with use_pkgdown() and use_pkgdown_github_pages()

Polishing your Package

  • Add a vignette to your package with use_vignette()

  • Add a code of conduct to your repo with use_code_of_conduct()

  • Make a checklist of what you need to do to submit your package to CRAN with use_release_issue()

Appendix

Set up build tools

This may not be strictly necessary until you want to build packages containing C or C++ code. Especially if you are using RStudio, you can set this aside for now. The IDE will alert you and provide support once you try to do something that requires you to setup your development environment

Important

If it isn’t going to work for someone, they can use Posit Cloud as a fallback. Build tools come pre-installed there.

macOS

Check if you already have xcode command line tools installed:

xcode-select -p

If you get an error instead of a path like /Library/Developer/CommandLineTools, then you need to install xcode command line tools with:

xcode-select --install

Windows

Install Rtools from https://cran.r-project.org/bin/windows/Rtools/

  • Do not select the box for “Edit the system PATH”. devtools and RStudio should put Rtools on the PATH automatically when it is needed.

  • Do select the box for “Save version information to registry”. It should be selected by default.