Setup & Prerequisites

Getting Started with the Workshop Materials

Quick checklist for software setup and package installation.

Welcome, and thanks for joining the short course, Prediction-Based Inference: Methods & Applications!

This page is a quick checklist to help you get set up before the session. If you’d like to follow along interactively during the workshop, these steps will make sure everything runs smoothly.

Quick Checklist (10-15 minutes)

Complete these before the short course:

  • Confirm you have a stable internet connection and a laptop.
  • Choose your environment:
    • Recommended (Option A): Docker + browser-based RStudio
    • Alternative (Option B): local R/RStudio install
  • Verify that ipd and the core R packages install successfully (Option B only).

Prerequisites

You should be comfortable with:

  • Base R and tidyverse syntax (dplyr, ggplot2, basic pipes).
  • Basic regression modeling (lm, glm).
  • Basic predictive modeling concepts (train/test split, predictions, model error).
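
As a rough gauge of the level assumed, the sketch below runs a train/test split, a linear model, and a test-set error calculation in base R. The built-in mtcars data, the seed, and the 70/30 split are illustrative assumptions, not workshop material.

```r
# Illustrative only: mtcars, the seed, and the 70/30 split are assumptions,
# not part of the workshop materials.
set.seed(1)

n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.7 * n))
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

# Fit a linear regression on the training rows only.
fit <- lm(mpg ~ wt + hp, data = train)

# Predict on the held-out rows and summarize the error as RMSE.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))
rmse
```

If each line here reads naturally to you, the core modules should be comfortable to follow.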

Helpful but optional (for the supplemental modules):

  • Bioconductor familiarity (ExpressionSet, AnnotationDbi, MLInterfaces).

Software Requirements

Option B: Local R + RStudio

You need R 4.4.1 or newer.

Install R and RStudio if you do not already have them.
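
Once R is installed, the version requirement can be checked from the console. This is a minimal sketch; the variable name min_version is introduced here for illustration.

```r
# Minimum version stated in this section; min_version is an illustrative name.
min_version <- "4.4.1"

# getRversion() returns the running R version as a comparable version object.
current <- as.character(getRversion())

if (getRversion() >= min_version) {
  message("R ", current, " meets the requirement.")
} else {
  warning("R ", current, " is older than ", min_version, "; please update R.")
}
```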

R Packages to Install Ahead of Time

If you use Option A (Docker), all required packages are already included in the workshop image and you can skip installation.

Core packages (required for Option B)

install.packages(c(
  "ipd", "MASS", "broom", "tidyverse", "patchwork", "scales",
  "future", "furrr", "GGally", "randomForest", "DALEX",
  "neuralnet", "partykit"
))
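
If some of these packages are already installed, one option is to install only the ones that are missing. The sketch below assumes a hypothetical helper, find_missing(), that is not part of the workshop code; the interactive() guard is a design choice so that the snippet never triggers installs from a non-interactive script.

```r
# Same core package list as above.
core_pkgs <- c(
  "ipd", "MASS", "broom", "tidyverse", "patchwork", "scales",
  "future", "furrr", "GGally", "randomForest", "DALEX",
  "neuralnet", "partykit"
)

# Hypothetical helper: returns the packages in `pkgs` that are not installed.
find_missing <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- find_missing(core_pkgs)

if (length(to_install) == 0L) {
  message("All core packages are already installed.")
} else if (interactive()) {
  # Only install when run from an interactive session (e.g., the RStudio console).
  install.packages(to_install)
} else {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
}
```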

Additional packages for the supplemental BCR-ABL module

During the short course, we will cover Getting Started, Measuring Adiposity, Proteomics with AlphaFold, and The Rashomon Quartet. You only need the following packages if you want to explore the supplemental BCR-ABL module.

install.packages(c("pROC"))

if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install(c(
  "ALL", "golubEsets", "AnnotationDbi", "hgu95av2.db",
  "hu6800.db", "MLInterfaces"
))

60-Second Setup Test

Run this in R/RStudio (for Option B, this confirms local setup is complete):

library(ipd)
library(MASS)
library(broom)
library(tidyverse)
library(patchwork)
library(scales)
library(future)
library(furrr)
library(GGally)
library(randomForest)
library(DALEX)
library(neuralnet)
library(partykit)
sessionInfo()

Optional check (reports every missing package at once; install any it lists and rerun this chunk):

required <- c(
  "ipd", "MASS", "broom", "tidyverse", "patchwork", "scales",
  "future", "furrr", "GGally", "randomForest", "DALEX",
  "neuralnet", "partykit"
)

missing <- required[!vapply(required, requireNamespace, logical(1), quietly = TRUE)]

if (length(missing) == 0L) {
  message("All required packages are available.")
} else {
  stop(sprintf("Missing required packages: %s", paste(missing, collapse = ", ")))
}

If this runs without errors, you are ready.

Data

We provide datasets for the modules that use real data. For Option B, download the data folder linked below into your local working directory. For Option A, the data are already available in the Docker image.

Link to Data Folder: https://github.com/thmccormick/ipd-short-course/tree/main/content/data
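
For Option B, a quick way to confirm the download landed in the right place is to list the folder from R. This is a sketch only: check_data_dir() is a hypothetical helper, and the folder name "data" is an assumption taken from the last component of the link above.

```r
# Hypothetical helper; the default folder name "data" is an assumption.
check_data_dir <- function(path = "data") {
  if (!dir.exists(path)) {
    stop("No '", path, "' folder found in ", getwd(), call. = FALSE)
  }
  files <- list.files(path, recursive = TRUE)
  if (length(files) == 0L) {
    stop("The '", path, "' folder is empty.", call. = FALSE)
  }
  # Return the relative paths of everything inside the folder.
  files
}
```

Calling check_data_dir() from your working directory returns the file paths it finds, or stops with a message if the folder is missing or empty.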

Support

If you hit setup issues before the session, contact Tyler H. McCormick (thmccormick@gmail.com).
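
When asking for help, it usually speeds things up to include your R version and the versions of the key packages. The sketch below gathers them without loading any packages; setup_report() and its default package list are hypothetical, not part of the workshop code.

```r
# Hypothetical helper for a support request; not part of the workshop code.
setup_report <- function(pkgs = c("ipd", "tidyverse")) {
  # Installed version as a string, or NA if the package is not installed.
  version_or_na <- function(p) {
    if (nzchar(system.file(package = p))) {
      as.character(packageVersion(p))
    } else {
      NA_character_
    }
  }
  list(
    r_version = R.version.string,
    platform = R.version$platform,
    packages = vapply(pkgs, version_or_na, character(1))
  )
}

str(setup_report())
```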