ocaisa · Nov 7, 2025 · May 15, 2025 · Aug 21, 2025 · Nov 5, 2025
diff --git a/config.yaml b/config.yaml
@@ -65,7 +65,7 @@ contact: 'team@carpentries.org'
 
 # Order of episodes in your lesson
 episodes:
-  - 10-hpc-intro.md
+  - 10-hpc-intro.Rmd
   - 11-connecting.Rmd
   - 12-cluster.Rmd
   - 13-scheduler.Rmd
@@ -78,12 +78,15 @@ episodes:
 
 # Information for Learners
 learners:
+  - setup.md
 
 # Information for Instructors
 instructors:
+  - instructor-notes.Rmd
 
 # Learner Profiles
 profiles:
+  - learner-profiles.md
 
 # Customisation ---------------------------------------------
 #

diff --git a/episodes/10-hpc-intro.md b/episodes/10-hpc-intro.Rmd
@@ -4,6 +4,11 @@ teaching: 15
 exercises: 5
 ---
 
+```{r, echo=FALSE}
+# Source the external configuration script
+source("load_config.R")
+```
+
 ::::::::::::::::::::::::::::::::::::::: objectives
 
 - Describe what an HPC system is
@@ -22,15 +27,15 @@ Frequently, research problems that use computing can outgrow the capabilities
 of the desktop or laptop computer where they started:
 
 - A statistics student wants to cross-validate a model. This involves running
-  the model 1000 times -- but each run takes an hour. Running the model on
+  the model 1000 times — but each run takes an hour. Running the model on
   a laptop will take over a month! In this research problem, final results are
   calculated after all 1000 models have run, but typically only one model is
   run at a time (in **serial**) on the laptop. Since each of the 1000 runs is
   independent of all others, and given enough computers, it's theoretically
   possible to run them all at once (in **parallel**).
 - A genomics researcher has been using small datasets of sequence data, but
   soon will be receiving a new type of sequencing data that is 10 times as
-  large. It's already challenging to open the datasets on a computer --
+  large. It's already challenging to open the datasets on a computer —
   analyzing these larger datasets will probably crash it. In this research
   problem, the calculations required might be impossible to parallelize, but a
   computer with **more memory** would be required to analyze the much larger
@@ -54,7 +59,7 @@ problems in parallel**.
 
 ## Jargon Busting Presentation
 
-Open the [HPC Jargon Buster](../files/jargon#p1)
+Open the [HPC Jargon Buster](files/jargon.html#p1)
 in a new tab. To present the content, press `C` to open a **c**lone in a
 separate window, then press `P` to toggle **p**resentation mode.
 
@@ -71,48 +76,44 @@ results.
 ## Some Ideas
 
 - Checking email: your computer (possibly in your pocket) contacts a remote
-  machine, authenticates, and downloads a list of new messages; it also
-  uploads changes to message status, such as whether you read, marked as
-  junk, or deleted the message. Since yours is not the only account, the
-  mail server is probably one of many in a data center.
-- Searching for a phrase online involves comparing your search term against
-  a massive database of all known sites, looking for matches. This "query"
+  machine, authenticates, and downloads a list of new messages; it also uploads
+  changes to message status, such as whether you read, marked as junk, or
+  deleted the message. Since yours is not the only account, the mail server is
+  probably one of many in a data center.
+- Searching for a phrase online involves comparing your search term against a
+  massive database of all known sites, looking for matches. This "query"
   operation can be straightforward, but building that database is a
   [monumental task][mapreduce]! Servers are involved at every step.
-- Searching for directions on a mapping website involves connecting your
-  (A) starting and (B) end points by [traversing a graph][dijkstra] in
-  search of the "shortest" path by distance, time, expense, or another
-  metric. Converting a map into the right form is relatively simple, but
-  calculating all the possible routes between A and B is expensive.
+- Searching for directions on a mapping website involves connecting your (A)
+  starting and (B) end points by [traversing a graph][dijkstra] in search of
+  the "shortest" path by distance, time, expense, or another metric. Converting
+  a map into the right form is relatively simple, but calculating all the
+  possible routes between A and B is expensive.
 
 Checking email could be serial: your machine connects to one server and
 exchanges data. Searching by querying the database for your search term (or
-endpoints) could also be serial, in that one machine receives your query
-and returns the result. However, assembling and storing the full database
-is far beyond the capability of any one machine. Therefore, these functions
-are served in parallel by a large, ["hyperscale"][hyperscale] collection of
-servers working together.
-
-
+endpoints) could also be serial, in that one machine receives your query and
+returns the result. However, assembling and storing the full database is far
+beyond the capability of any one machine. Therefore, these functions are served
+in parallel by a large, ["hyperscale"][hyperscale] collection of servers
+working together.
 
 :::::::::::::::::::::::::
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
-
-
 [mapreduce]: https://en.wikipedia.org/wiki/MapReduce
 [dijkstra]: https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
 [hyperscale]: https://en.wikipedia.org/wiki/Hyperscale_computing
 
-
 :::::::::::::::::::::::::::::::::::::::: keypoints
 
-- High Performance Computing (HPC) typically involves connecting to very large computing systems elsewhere in the world.
-- These other systems can be used to do work that would either be impossible or much slower on smaller systems.
+- High Performance Computing (HPC) typically involves connecting to very large
+  computing systems elsewhere in the world.
+- These other systems can be used to do work that would either be impossible or
+  much slower on smaller systems.
 - HPC resources are shared by multiple users.
-- The standard method of interacting with such systems is via a command line interface.
+- The standard method of interacting with such systems is via a command line
+  interface.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
-
-
diff --git a/episodes/11-connecting.Rmd b/episodes/11-connecting.Rmd
@@ -4,6 +4,11 @@ teaching: 25
 exercises: 10
 ---
 
+```{r, echo=FALSE}
+# Source the external configuration script
+source("load_config.R")
+```
+
 ::::::::::::::::::::::::::::::::::::::: objectives
 
 - Configure secure access to a remote HPC system.
@@ -17,11 +22,6 @@ exercises: 10
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
-```{r, echo=FALSE}
-# Source the external configuration script
-source("load_config.R")
-```
-
 ## Secure Connections
 
 The first step in using a cluster is to establish a connection from our laptop
@@ -38,15 +38,17 @@ results.
 If you have ever opened the Windows Command Prompt or macOS Terminal, you have
 seen a CLI. If you have already taken The Carpentries' courses on the UNIX
 Shell or Version Control, you have used the CLI on your *local machine*
-extensively. The only leap to be made here is to open a CLI on a *remote machine*,
-while taking some precautions so that other folks on the network can't see (or
-change) the commands you're running or the results the remote machine sends
-back. We will use the Secure SHell protocol (or SSH) to open an encrypted
-network connection between two machines, allowing you to send \& receive text
-and data without having to worry about prying eyes.
-
-![](/fig/connect-to-remote.svg){max-width="50%" alt="Connect to cluster"}
-
+extensively. The only leap to be made here is to open a CLI on a *remote
+machine*, while taking some precautions so that other folks on the network
+can't see (or change) the commands you're running or the results the remote
+machine sends back. We will use the Secure SHell protocol (or SSH) to open an
+encrypted network connection between two machines, allowing you to send \&
+receive text and data without having to worry about prying eyes.
+
+![connect-to-remote.svg](fig/connect-to-remote.svg){
+  max-width="50%"
+  alt="Connect to cluster. "
+}
 
 SSH clients are usually command-line tools, where you provide the remote
 machine address as the only required argument. If your username on the remote

diff --git a/episodes/13-hpcc-scheduler/hpcc/section2.rmd b/episodes/13-hpcc-scheduler/hpcc/section2.rmd
diff --git a/episodes/13-scheduler.Rmd b/episodes/13-scheduler.Rmd
@@ -57,7 +57,7 @@ In this case, the job we want to run is a shell script -- essentially a
 text file containing a list of UNIX commands to be executed in a sequential
 manner. Our shell script will have three parts:
 
-- On the very first line, add ``r config$remote$bash_shebang``. The `#!`
+- On the very first line, add ``r config$remote$shebang``. The `#!`
   (pronounced "hash-bang" or "shebang") tells the computer what program is
   meant to process the contents of this file. In this case, we are telling it
   that the commands that follow are written for the command-line shell (what
@@ -75,7 +75,7 @@ manner. Our shell script will have three parts:
 ```
 
 ```bash
-`r config$remote$bash_shebang`
+`r config$remote$shebang`
 
 echo -n "This script is running on "
 hostname
@@ -163,7 +163,7 @@ resources we must customize our job script.
 Comments in UNIX shell scripts (denoted by `#`) are typically ignored, but
 there are exceptions. For instance the special `#!` comment at the beginning of
 scripts specifies what program should be used to run it (you'll typically see
-``r config$local$bash_shebang``). Schedulers like `r config$sched$name` also
+``r config$local$shebang``). Schedulers like `r config$sched$name` also
 have a special comment used to denote special scheduler-specific options.
 Though these comments differ from scheduler to scheduler,
 `r config$sched$name`'s special comment is ``r config$sched$comment``. Anything
@@ -179,7 +179,7 @@ name of a job. Add an option to the script:
 ```
 
 ```bash
-`r config$remote$bash_shebang`
+`r config$remote$shebang`
 `r config$sched$comment` `r config$sched$flag$name` hello-world
 
 echo -n "This script is running on "
@@ -253,7 +253,7 @@ for it on the cluster.
 ```
 
 ```bash
-`r config$remote$bash_shebang`
+`r config$remote$shebang`
 `r config$sched$comment` `r config$sched$flag$time` 00:01 # timeout in HH:MM
 
 echo -n "This script is running on "
@@ -282,7 +282,7 @@ wall time, and attempt to run a job for two minutes.
 ```
 
 ```bash
-`r config$remote$bash_shebang`
+`r config$remote$shebang`
 `r config$sched$comment` `r config$sched$flag$name` long_job
 `r config$sched$comment` `r config$sched$flag$time` 00:01 # timeout in HH:MM
 

diff --git a/episodes/14-environment-variables.Rmd b/episodes/14-environment-variables.Rmd
@@ -212,7 +212,7 @@ job was submitted.
 ```
 
 ```output
-`r config$remote$bash_shebang`
+`r config$remote$shebang`
 `r config$sched$comment` `r config$sched$flag$time` 00:00:30
 
 echo -n "This script is running on "
@@ -279,7 +279,7 @@ unless we type in the full path to the program,
 since the directory `/users/vlad` isn't in `PATH`.
 
 This means that I can have executables in lots of different places as long as
-I remember that I need to to update my `PATH` so that my shell can find them.
+I remember that I need to update my `PATH` so that my shell can find them.
 
 What if I want to run two different versions of the same program?
 Since they share the same name, if I add them both to my `PATH` the first one
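
Because this change renames the `bash_shebang` key to `shebang` in several episodes, it may be worth checking that no file still refers to the old name. The following is a minimal sketch, not part of the change itself: the helper name `find_old_shebang_key` is hypothetical, and it assumes a `grep` that supports `-r` and that the command is run from the lesson root, where the `episodes` directory seen in the paths above lives.

```shell
# Hypothetical helper (not part of this change): list every line under the
# given directory that still mentions the old key name `bash_shebang`.
# Prints matches as path:line:text, and returns grep's exit status:
# 0 if at least one match was found, 1 if none were found.
find_old_shebang_key() {
    grep -rn 'bash_shebang' "$1"
}

# Example usage, assuming the current directory is the lesson root:
#   find_old_shebang_key episodes
```

If the helper prints nothing and returns status 1, no references to the old key were found under that directory. Whether the configuration loaded by `load_config.R` defines the new key is a separate question that this sketch does not check.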