2024-10-17
Ecologist → “Scientific Programmer & Educator”
Attempted (unsuccessfully) to use HPC as PhD student
Successfully used HPC as postdoc
Requirement to use shell commands
Uncomfortable way of editing and running R code
Not seeing HPC resources as “for me”
Key skills that empowered me to use HPC while minimizing time outside of my comfort zone:
GitHub
renv 📦 for managing R package dependencies
Open OnDemand
targets 📦

I can avoid the command line entirely:
RStudio file pane for upload/download of files
RStudio git pane for interacting with git/GitHub
Run parallel R code on HPC cores without SLURM
Cons: can’t load additional modules (?)
targets
_targets.R
R/ with custom functions fit_model() and plot_model()
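As a rough illustration of what the files in R/ might hold, here is a minimal sketch of the two functions named above. Only the names fit_model() and plot_model() come from the slide. The function arguments, the linear model, and the columns x and y are assumptions made for the example.

```r
# R/functions.R (hypothetical sketch; only the function names come from the talk)

# Fit a simple linear model. Assumes `data` has numeric columns `x` and `y`.
fit_model <- function(data) {
  stats::lm(y ~ x, data = data)
}

# Plot the observations with the fitted line on top.
# ggplot2 is only needed when this function is called, not when it is defined.
plot_model <- function(model, data) {
  data$fitted <- stats::predict(model, newdata = data)
  ggplot2::ggplot(data, ggplot2::aes(x = x, y = y)) +
    ggplot2::geom_point() +
    ggplot2::geom_line(ggplot2::aes(y = fitted))
}
```

Because tar_source() sources every script in R/, targets such as tar_target(model, fit_model(data)) can call these functions directly from _targets.R.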
Visualize pipeline with tar_visnetwork()
Run pipeline with tar_make()
✔ skipped target file
✔ skipped target data
▶ dispatched target model
● completed target model [3.008 seconds, 2.879 kilobytes]
▶ dispatched target plot
● completed target plot [0.101 seconds, 1.081 kilobytes]
▶ ended pipeline [3.779 seconds]
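The "skipped" lines in the log above mean those targets were already up to date, so targets did not rebuild them. The sketch below reproduces that behavior in a throwaway directory. The toy pipeline (mtcars data and a linear model) is an assumption for illustration, not the pipeline from the talk.

```r
library(targets)

# tar_dir() evaluates the code in a temporary directory,
# so nothing is written to the current project.
tar_dir({
  # Write a throwaway _targets.R with two toy targets (hypothetical).
  tar_script({
    fit_model <- function(data) lm(mpg ~ wt, data = data)
    list(
      tar_target(data, mtcars),
      tar_target(model, fit_model(data))
    )
  }, ask = FALSE)

  tar_make()                    # first run: both targets are built
  outdated <- tar_outdated()    # names of targets that would rerun now
  tar_make()                    # second run: both targets are skipped
  model_class <- class(tar_read(model))  # read a stored result back into R
})

print(outdated)
print(model_class)
```

Editing fit_model() or the data target and calling tar_make() again would rebuild only the affected targets.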
crew
We can set up a crew controller to run targets in parallel.
_targets.R
library(targets)
tar_source()
tar_option_set(
  packages = c("dplyr", "ggplot2"),
  controller = crew::crew_controller_local(workers = 3)
)
list(
  tar_target(file, "data.csv", format = "file"),
  tar_target(data, read.csv(file)),
  tar_target(model1, fit_model1(data)),
  tar_target(model2, fit_model2(data)),
  tar_target(model3, fit_model3(data))
)
This “local” controller also works on Open OnDemand!
crew.cluster
Use SLURM (or PBS, SGE, etc.) without writing a bash script!
crew.cluster::crew_controller_slurm(
  workers = 5,
  slurm_partition = "standard",
  slurm_time_minutes = 1200,
  slurm_log_output = "logs/crew_log_%A.out",
  slurm_log_error = "logs/crew_log_%A.err",
  slurm_memory_gigabytes_per_cpu = 5,
  slurm_cpus_per_task = 2,
  script_lines = c(
    "#SBATCH --account kristinariemer",
    "module load R"
  ),
  seconds_idle = 600
)
Links to relevant tutorials for prerequisite skills
Example targets pipeline
Uses renv for package management
Example crew controllers with all required fields set
Includes run.sh to launch targets::tar_make() as a SLURM job
Collaborative workshops led by HPC RSEs & Domain RSEs
Offer workshops on using HPC without the command line
HPC workshops tailored to R/RStudio users
Create a template repo for using targets on your HPC
crew technical details
nanonext, R bindings for NNG (Nanomsg Next Gen), which powers…
crew.cluster
seconds_idle