Harnessing the Power of HPC From the Comfort of RStudio

Eric R. Scott
University of Arizona

2024-10-17

My Background

  • Ecologist → “Scientific Programmer & Educator”

  • RStudio = my comfort zone ❤️

  • Attempted (unsuccessfully) to use HPC as PhD student

  • Successfully used HPC as postdoc

Barriers to HPC use

  • Requirement to use shell commands

  • Uncomfortable way of editing and running R code

  • Not seeing HPC resources as “for me”

Technologies to bridge the gap

Key skills that empowered me to use HPC while minimizing time outside of my comfort zone:

  1. GitHub

  2. renv 📦 for managing R package dependencies

  3. Open OnDemand

  4. targets 📦
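As a sketch, the renv workflow for keeping a project's package versions reproducible between a laptop and the HPC looks roughly like this (the function calls are renv's API; the exact order of steps is an assumption):

```r
# Sketch of a renv workflow (assumes the renv package is installed)
renv::init()      # create a project-local library and renv.lock
# ...install or update packages as usual, then:
renv::snapshot()  # record exact package versions in renv.lock
# On the HPC, after cloning the project from GitHub:
renv::restore()   # reinstall the recorded versions into the project library
```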

Open OnDemand

  • RStudio IDE in a web browser
  • Code run on HPC cores

[Screenshot: the form to configure an Open OnDemand instance of RStudio, with fields for Cluster, R version, Queue, Run Time, and Core Count]

I can avoid the command line entirely:

  • RStudio file pane for upload/download of files

  • RStudio git pane for interacting with git/GitHub

  • Run parallel R code on HPC cores without SLURM

  • Cons: can’t load additional modules (?)
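Because the RStudio session is already running on HPC cores, parallel R code can be written with ordinary base R tools; a minimal sketch (the core count here is an assumption standing in for whatever was requested in the Open OnDemand form):

```r
# Sketch: parallel R directly on the cores of an Open OnDemand RStudio
# session, with no SLURM script required
library(parallel)

n_cores <- 2  # assumed; match the cores requested in the Open OnDemand form
cl <- makeCluster(n_cores)
squares <- parLapply(cl, 1:8, function(i) i^2)  # run i^2 across workers
stopCluster(cl)
unlist(squares)
```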

targets

  • Make-like workflow management package for R
  • Skips computationally-intensive steps that are already up to date
  • Orchestrates parallel computing

targets

_targets.R

library(targets)
tar_source()                                      # <1>
tar_option_set(packages = c("dplyr", "ggplot2"))  # <2>
list(                                             # <3>
  tar_target(file, "data.csv", format = "file"),
  tar_target(data, read.csv(file)),
  tar_target(model, fit_model(data)),
  tar_target(plot, plot_model(model, data))
)

1. Sources all R scripts in R/ with custom functions fit_model() and plot_model()
2. Defines packages needed for the pipeline and other options
3. Defines the pipeline

targets

Visualize pipeline with tar_visnetwork()

targets

Run pipeline with tar_make()

targets::tar_make()
✔ skipped target file
✔ skipped target data
▶ dispatched target model
● completed target model [3.008 seconds, 2.879 kilobytes]
▶ dispatched target plot
● completed target plot [0.101 seconds, 1.081 kilobytes]
▶ ended pipeline [3.779 seconds]
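Once the pipeline has finished, completed targets can be pulled back into an interactive session; a minimal sketch using targets' own API (assumes the pipeline above has been run in the current project):

```r
library(targets)
model <- tar_read(model)   # return the stored value of one target
tar_load(c(model, plot))   # or load several targets into the global environment
```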

Parallel execution with crew

We can set up a crew controller to run targets in parallel.

_targets.R

library(targets)
tar_source()
tar_option_set(
  packages = c("dplyr", "ggplot2"),
  controller = crew::crew_controller_local(workers = 3)  # <1>
)
list(
  tar_target(file, "data.csv", format = "file"),
  tar_target(data, read.csv(file)),
  tar_target(model1, fit_model1(data)),  # <2>
  tar_target(model2, fit_model2(data)),  # <2>
  tar_target(model3, fit_model3(data))   # <2>
)

1. Sets up three R sessions that can run tasks in parallel
2. These three targets can all be run in parallel

This “local” controller also works on Open OnDemand!

On the HPC with crew.cluster

Use SLURM (or PBS, SGE, etc.) without writing a bash script!

crew.cluster::crew_controller_slurm(
  workers = 5,                                  # <1>
  slurm_partition = "standard",
  slurm_time_minutes = 1200,
  slurm_log_output = "logs/crew_log_%A.out",
  slurm_log_error = "logs/crew_log_%A.err",
  slurm_memory_gigabytes_per_cpu = 5,
  slurm_cpus_per_task = 2,
  script_lines = c(                             # <2>
    "#SBATCH --account kristinariemer",
    "module load R"
  ),
  seconds_idle = 600                            # <3>
)

1. Launches 5 R sessions as SLURM jobs
2. Lines added verbatim to the generated SBATCH script
3. Creates semi-transient workers that shut down after 10 minutes idle
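To put this controller to work, pass it to tar_option_set() in _targets.R, just as with the local controller earlier; a sketch (the resource values are placeholders copied from the example above):

```r
# Sketch: swapping the local controller for a SLURM controller in _targets.R
# (assumes the crew.cluster package is installed on the HPC)
library(targets)
tar_option_set(
  packages = c("dplyr", "ggplot2"),
  controller = crew.cluster::crew_controller_slurm(
    workers = 5,
    slurm_partition = "standard",
    seconds_idle = 600
  )
)
```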

Template repository

cct-datascience/targets-uahpc

  • Links to relevant tutorials for prerequisite skills

  • Example targets pipeline

  • Uses renv for package management

  • Example crew controllers with all required fields set

  • Includes run.sh to launch targets::tar_make() as a SLURM job
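As a sketch, such a run.sh might look like the following (job name, partition, and time limit are placeholder assumptions; the template repo's actual script may differ):

```shell
#!/bin/bash
#SBATCH --job-name=targets-pipeline
#SBATCH --partition=standard
#SBATCH --time=20:00:00
# Load R and run the pipeline; the crew.cluster controller inside
# _targets.R then launches its workers as additional SLURM jobs
module load R
Rscript -e 'targets::tar_make()'
```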

How we can help bridge the gap

  • Collaborative workshops led by HPC RSEs & Domain RSEs

  • Offer workshops on using HPC without the command line

  • HPC workshops tailored to R/RStudio users

  • Create a template repo for using targets on your HPC

Questions?

ericrscott@arizona.edu

@LeafyEricScott@fosstodon.org


crew technical details

  • nanonext, R bindings for NNG (Nanomsg Next Gen), which powers…

  • mirai, a “minimalist async evaluation framework for R”, which powers…

  • crew, a unifying interface for creating distributed worker launchers

Optimizing crew.cluster