---
title: "A first containerization workflow with containr"
output:
  rmarkdown::html_vignette:
    css: styles.css
vignette: >
  %\VignetteIndexEntry{A first containerization workflow with containr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
date: "Created 2026-04-30 | Last updated `r Sys.Date()`"
---
## Before you start
This vignette walks through the complete `containr` workflow: generating a
`Dockerfile`, building a container image, inspecting local images, and pushing
the image to a registry. It assumes you are comfortable with R and have some
familiarity with the idea of containers. If you are new to containers and want
to understand why they complement `renv` rather than replace it, start with the
companion vignette *From renv to containers: why recording your R packages may
not be enough*.
Before running any of the code below, confirm that:
- your project uses `renv` and `renv.lock` exists in the project root;
- Podman or Docker is installed and running;
- you have access to a container registry if you plan to push the image.
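
The first two items can be checked from R before going further. The sketch below is a hypothetical helper, not part of `containr`. It only tests that `renv.lock` exists and that a `podman` or `docker` executable is on the `PATH`, which does not prove that the engine is actually running.

```{r, eval=FALSE}
# Hypothetical helper (not a containr function): checks the first two
# prerequisites for a project directory.
check_prerequisites <- function(project_dir = ".") {
  # Is there a lockfile in the project root?
  lockfile_found <- file.exists(file.path(project_dir, "renv.lock"))

  # Sys.which() returns "" for executables that are not on the PATH.
  tools <- Sys.which(c("podman", "docker"))
  tools_found <- names(tools)[nzchar(tools)]

  list(lockfile_found = lockfile_found, tools_found = tools_found)
}

check_prerequisites()
```

If `lockfile_found` is `FALSE` or `tools_found` is empty, resolve that before moving on to Step 1.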
At UW-Madison, `registry.doit.wisc.edu` is the default registry for the
CHTC-oriented workflow. The authentication guide, including how to create a
Personal Access Token with the right scopes, is at
.
If you are working through this vignette for the first time, pass
`dry_run = TRUE` to `build_image()` or `push_image()`. It prints the command
that would be run without executing it. Use it freely until you are confident
in each step.
---
## Step 1: Generate a Dockerfile
`generate_dockerfile()` reads your `renv.lock`, infers the system library
requirements of your R packages, and writes a `Dockerfile` in the project root.
It is the entry point into the `containr` workflow and the step that does the
most work on your behalf.
```{r, eval=FALSE}
library(containr)

# Write a commented Dockerfile into the project root
generate_dockerfile(
  r_version = "4.4.0",
  output = ".",
  comments = TRUE
)
```
The `r_version` argument should match the R version recorded in your
`renv.lock`. Using a consistent R version between your lockfile and your base
image reduces the chance of package installation failures inside the container.
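
One way to avoid typing the version by hand is to read it from the lockfile. The sketch below assumes the `jsonlite` package is installed and relies on `renv.lock` being a JSON file with an `R` entry that holds a `Version` field. `lock_r_version()` is a hypothetical helper name, not a `containr` function.

```{r, eval=FALSE}
library(jsonlite)

# Hypothetical helper: return the R version recorded in a renv lockfile.
lock_r_version <- function(lockfile = "renv.lock") {
  lock <- fromJSON(lockfile, simplifyVector = FALSE)
  lock$R$Version
}

# Possible use with the call shown above:
# generate_dockerfile(r_version = lock_r_version(), output = ".")
```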
The `comments = TRUE` argument annotates each instruction in the generated
`Dockerfile` with an explanation of what it does. This is useful when you are
learning containerization or reviewing the file with collaborators. Here is what
the generated `Dockerfile` looks like with comments enabled:
```dockerfile
# Base image: rocker/r-ver provides a minimal R installation on Ubuntu.
# Pinning the R version ensures the container matches your renv.lock.
FROM rocker/r-ver:4.4.0
# Suppress interactive prompts during apt-get package installation.
ENV DEBIAN_FRONTEND=noninteractive
# Install system libraries required by your R packages.
# These are inferred from the packages recorded in renv.lock.
RUN apt-get update && apt-get install -y \
    curl \
    git \
    libcurl4-openssl-dev \
    libssl-dev \
    libxml2-dev \
    libgit2-dev \
    cmake \
    make \
    libfreetype6-dev \
    libjpeg-dev \
    libpng-dev \
    libtiff-dev \
    libfontconfig1-dev \
    libfribidi-dev \
    libharfbuzz-dev \
    pandoc \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
# Set the working directory inside the container.
WORKDIR /home
# Copy renv.lock into the container so renv can restore the R package
# environment at build time.
COPY renv.lock /home/renv.lock
# Install renv from CRAN, then restore the R package environment from
# renv.lock. This step reproduces your exact package versions inside
# the container.
RUN R -e "install.packages('renv', repos='https://packagemanager.posit.co/cran/latest')"
RUN R -e "renv::restore()"
```
The `Dockerfile` is a plain text file. You can inspect it, edit it, and
regenerate it as many times as needed. If your project has unusual system
library requirements that `generate_dockerfile()` did not catch, add them to
the `install_syslibs` argument:
```{r, eval=FALSE}
generate_dockerfile(
  r_version = "4.4.0",
  output = ".",
  install_syslibs = c("libuv1-dev", "libwebp-dev")
)
```
If your analysis depends on data files or scripts that should be available
inside the container, pass them via `data_file` and `code_file`. The generated
`COPY` instructions preserve your local directory structure under `/home/` --
a file at `data-raw/sample.csv` locally becomes `/home/data-raw/sample.csv`
in the container. All files must be inside the current working directory (the
build context).
```{r, eval=FALSE}
generate_dockerfile(
  r_version = "4.4.0",
  data_file = "data-raw/sample.csv",
  code_file = "analysis.R",
  output = ".",
  comments = TRUE
)
```
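
The path mapping described above can be illustrated with a few lines of base R. This is only a sketch of the rule as stated in this vignette, not the code `containr` uses internally. `container_path()` is a hypothetical name, and the sketch assumes paths are written relative to the project root with forward slashes.

```{r, eval=FALSE}
# Hypothetical illustration of the local-to-container path mapping.
container_path <- function(path) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]

  # Absolute paths and paths that climb out with ".." would leave the
  # build context, so they are rejected here.
  if (startsWith(path, "/") || ".." %in% parts) {
    stop("File must be given relative to, and inside, the build context: ", path)
  }

  paste0("/home/", path)
}

container_path("data-raw/sample.csv")
#> [1] "/home/data-raw/sample.csv"
container_path("analysis.R")
#> [1] "/home/analysis.R"
```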
If your project uses RStudio Server rather than plain R, pass `r_mode =
"rstudio"` to use the `rocker/rstudio` base image instead:
```{r, eval=FALSE}
generate_dockerfile(
  r_version = "4.4.0",
  r_mode = "rstudio",
  output = "."
)
```
Take a few minutes to read the generated `Dockerfile` before moving on.
Understanding what it does makes the subsequent steps easier to reason about
and debug if something goes wrong.
---
## Step 2: Build the image
`build_image()` passes your `Dockerfile` to Podman or Docker and builds the
image locally. The first build pulls the base image and installs every R package
in your `renv.lock` from scratch, so it can take several minutes depending on
the size of your package library and your network connection.
```{r, eval=FALSE}
build_image(verbose = TRUE)
```
The `platform` argument defaults to `"linux/amd64"`, which is the architecture
used by CHTC and most HPC clusters. On Apple Silicon Macs, this means the image
targets a different architecture than the host. When Docker is the resolved
tool, `build_image()` automatically uses `docker buildx build` with `--load`
for cross-platform builds. For Podman, `--platform` is passed directly. If the
target platform differs from the host, a warning is emitted about potential
emulation issues.
Docker Desktop handles cross-platform builds more reliably than Podman's QEMU
emulation layer. If builds fail with segfaults under Podman, try
`tool = "docker"` or build on a native x86_64 machine.
```{r, eval=FALSE}
# Build for the host architecture (e.g. local use on Apple Silicon)
build_image(platform = NULL, verbose = TRUE)
# Build for ARM64 explicitly
build_image(platform = "linux/arm64", verbose = TRUE)
```
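
To see in advance whether a build will need emulation, you can compare the target platform with the host architecture. The helpers below are hypothetical and written in base R; they are not how `build_image()` performs its own check. They assume `Sys.info()[["machine"]]` reports `"arm64"` or `"aarch64"` on ARM hosts and `"x86_64"` on Intel or AMD hosts, and they pass any other value through unchanged.

```{r, eval=FALSE}
# Hypothetical helper: express the host architecture as a Linux platform string.
host_platform <- function(machine = Sys.info()[["machine"]]) {
  arch <- switch(machine,
    x86_64 = ,
    amd64 = "amd64",
    arm64 = ,
    aarch64 = "arm64",
    machine  # unknown values are passed through as-is
  )
  paste0("linux/", arch)
}

# Hypothetical helper: TRUE when the target differs from the host.
needs_emulation <- function(target = "linux/amd64", host = host_platform()) {
  !identical(target, host)
}

needs_emulation("linux/amd64", host = host_platform("arm64"))
#> [1] TRUE
needs_emulation("linux/amd64", host = host_platform("x86_64"))
#> [1] FALSE
```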
With `verbose = TRUE`, the build output streams to the console so you can watch
the installation progress. The output looks something like this:
```bash
i Resolving tool: using "docker"
i Target platform: "linux/amd64"
i Building image (no tag applied)
i Build context: /home/user/my-analysis, Dockerfile: Dockerfile
STEP 1/7: FROM rocker/r-ver:4.4.0
STEP 2/7: ENV DEBIAN_FRONTEND=noninteractive
STEP 3/7: RUN apt-get update && apt-get install -y ...
...
STEP 6/7: RUN R -e "install.packages('renv', ...)"
STEP 7/7: RUN R -e "renv::restore()"
v Image built successfully.
```
If you want to preview the build command without running it:
```{r, eval=FALSE}
build_image(dry_run = TRUE)
#> docker buildx build --platform linux/amd64 --load -f Dockerfile .
```
Subsequent builds are usually faster because Podman and Docker cache layers.
If only your `renv.lock` changed, the system library installation step is
reused from cache and only the R package installation step reruns.
A common first-build failure is a missing system library. If `renv::restore()`
fails inside the container with a message about a missing header file or a
failed compilation, add the relevant library to `install_syslibs` in
`generate_dockerfile()`, regenerate the `Dockerfile`, and rebuild.
---
## Step 3: Inspect local images
`list_images()` returns a data frame of images in the local image store. It is
the R equivalent of `podman image ls` or `docker image ls`.
```{r, eval=FALSE}
imgs <- list_images()
imgs
```
```
                                     repository    tag     image_id     created    size
1 registry.doit.wisc.edu/your.netid/my-analysis  1.0.0 974123909a36 2 hours ago 1.59 GB
2                                        <none> <none> 3b8f20dc1a47 3 hours ago 1.21 GB
```
The `image_id` column contains the hash you pass to `push_image()`. Untagged
images (those built without a name) appear with `<none>` in the `repository`
and `tag` columns. These accumulate during development as you rebuild with
different settings and can be pruned periodically.
Use `imgs$image_id[1]` to pass the most recently built image to the next step,
or select a specific row if you have multiple images and want to push a
particular one.
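
Selecting a specific row is ordinary data frame subsetting. The sketch below uses a small stand-in data frame with the column names shown in the output above, so it runs without a container engine. With real data, replace it with the result of `list_images()`.

```{r, eval=FALSE}
# Stand-in for list_images() output (values copied from the example above).
imgs <- data.frame(
  repository = c("registry.doit.wisc.edu/your.netid/my-analysis", NA),
  tag = c("1.0.0", NA),
  image_id = c("974123909a36", "3b8f20dc1a47"),
  stringsAsFactors = FALSE
)

# Keep only the row whose repository ends in the project name and whose
# tag matches; %in% treats missing values as non-matches.
is_target <- grepl("/my-analysis$", imgs$repository) & imgs$tag %in% "1.0.0"
target <- imgs[is_target, ]

target$image_id
#> [1] "974123909a36"
```

Checking that `nrow(target)` is exactly 1 before pushing guards against an ambiguous or empty match.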
---
## Step 4: Push the image to the registry
`push_image()` tags a local image with a registry path and pushes it to a
container registry. Before pushing, authenticate with the registry once in a
terminal. `containr` checks whether you are logged in before attempting the
push and errors with clear instructions if not.
```{r, eval=FALSE}
push_image(
  image_id = imgs$image_id[1],
  netid = "your.netid",
  project = "my-analysis",
  tag = "1.0.0"
)
```
The push output looks something like this:
```
ℹ Tagging image 974123909a36 as
registry.doit.wisc.edu/your.netid/my-analysis:1.0.0
ℹ Pushing to registry.doit.wisc.edu/your.netid/my-analysis:1.0.0
...
✔ Image pushed successfully.
ℹ Image URI: docker://registry.doit.wisc.edu/your.netid/my-analysis:1.0.0
```
To preview the tag and push commands without running them:
```{r, eval=FALSE}
push_image(
  image_id = imgs$image_id[1],
  netid = "your.netid",
  project = "my-analysis",
  tag = "1.0.0",
  dry_run = TRUE
)
#> podman tag 974123909a36 registry.doit.wisc.edu/your.netid/my-analysis:1.0.0
#> podman push registry.doit.wisc.edu/your.netid/my-analysis:1.0.0
```
A note on tagging: use explicit version tags like `"1.0.0"` rather than
`"latest"`. The `"latest"` tag is overwritten on every push, which makes it
difficult to reconstruct which image was used for a specific result. An explicit
version tag ties the image to a specific state of the analysis and can be
referenced unambiguously in submit files, documentation, and data management
plans.
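
The registry path in the output above follows the pattern `registry/netid/project:tag`. As a sketch, the hypothetical helper below assembles that reference and refuses `"latest"`, which is one way to enforce the tagging advice in your own scripts. It is not part of `containr`, and the default registry is simply the one used in this vignette.

```{r, eval=FALSE}
# Hypothetical helper: build a full image reference with an explicit tag.
image_reference <- function(netid, project, tag,
                            registry = "registry.doit.wisc.edu") {
  if (identical(tag, "latest")) {
    stop("Use an explicit version tag such as \"1.0.0\" rather than \"latest\".")
  }
  sprintf("%s/%s/%s:%s", registry, netid, project, tag)
}

image_reference("your.netid", "my-analysis", "1.0.0")
#> [1] "registry.doit.wisc.edu/your.netid/my-analysis:1.0.0"
```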
---
## Putting it together
The complete workflow from a project with an `renv.lock` to a pushed container
image takes four function calls:
```{r, eval=FALSE}
library(containr)
# 1. Generate the Dockerfile
generate_dockerfile(
  r_version = "4.4.0",
  output = ".",
  comments = TRUE
)
# 2. Build the image
build_image(verbose = TRUE)
# 3. Inspect local images
imgs <- list_images()
# 4. Push to the registry
push_image(
  image_id = imgs$image_id[1],
  netid = "your.netid",
  project = "my-analysis",
  tag = "1.0.0"
)
```
The image URI returned by `push_image()` is the reference you pass to any
downstream workflow that needs to run the containerized analysis. For HTCondor
submissions, it goes in the submit file:
```
container_image = docker://registry.doit.wisc.edu/your.netid/my-analysis:1.0.0
```
The `submitr` package handles the next step: generating the submit file,
uploading files to the submit node, dispatching the job, and retrieving
results. You can stop here if your goal is a portable, shareable R environment,
and return to `submitr` when the project is ready to run on CHTC.
---
## Troubleshooting
**The build fails with a missing system library.** Add the library to
`install_syslibs` in `generate_dockerfile()`, regenerate, and rebuild. The
error message from the failed compilation usually names the missing library or
header file directly.
**`renv::restore()` fails inside the container.** Check whether the package
requires a system library that `generate_dockerfile()` did not infer
automatically. Packages with compiled C or C++ code are the most common source
of this problem.
**`push_image()` errors with an authentication message.** Run `podman login
registry.doit.wisc.edu` in a terminal and authenticate before retrying. The
authentication guide is at
.
**The image is very large.** This is expected for projects with many packages:
image size is driven primarily by the number of R packages in your `renv.lock`
and their system library dependencies. Trimming unused packages from the
lockfile before building reduces the size.
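
To see what the lockfile currently records, you can list its packages. The sketch below assumes the `jsonlite` package is installed and relies on `renv.lock` being a JSON file whose `Packages` entry is keyed by package name. `lock_packages()` is a hypothetical helper, not a `containr` function.

```{r, eval=FALSE}
library(jsonlite)

# Hypothetical helper: names of all packages recorded in a renv lockfile.
lock_packages <- function(lockfile = "renv.lock") {
  lock <- fromJSON(lockfile, simplifyVector = FALSE)
  sort(names(lock$Packages))
}

# Possible use in a project root:
# pkgs <- lock_packages()
# length(pkgs)  # how many packages the image will install
```

Comparing this list with the packages your scripts actually load is a reasonable starting point for deciding what to trim.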
**The build fails with a QEMU segfault on Apple Silicon.** Building
`linux/amd64` images on ARM hosts requires emulation, which can crash during
R package installation. Switch to Docker Desktop (`tool = "docker"`), which
uses `buildx` and handles cross-platform builds more reliably. Alternatively,
build on a native x86_64 machine or via GitHub Actions.