Content from Using RMarkdown


Last updated on 2025-09-10

Overview

Questions

  • How do you write a lesson using R Markdown and sandpaper?

Objectives

  • Explain how to use markdown with the new lesson template
  • Demonstrate how to include pieces of code, figures, and nested challenge blocks

Introduction


This is a lesson created via The Carpentries Workbench. It is written in Pandoc-flavored Markdown for static files and R Markdown for dynamic files that can render code into output. Please refer to the Introduction to The Carpentries Workbench for full documentation.

What you need to know is that there are three sections required for a valid Carpentries lesson template:

  1. questions are displayed at the beginning of the episode to prime the learner for the content.
  2. objectives are the learning objectives for an episode displayed with the questions.
  3. keypoints are displayed at the end of the episode to reinforce the objectives.
Challenge

Challenge 1: Can you do it?

What is the output of this command?

R

paste("This", "new", "lesson", "looks", "good")

OUTPUT

[1] "This new lesson looks good"
Challenge

Challenge 2: how do you nest solutions within challenge blocks?

You can add a line with at least three colons and a solution tag.

Figures


You can also include figures generated from R Markdown:

R

pie(
  c(Sky = 78, "Sunny side of pyramid" = 17, "Shady side of pyramid" = 5), 
  init.angle = 315, 
  col = c("deepskyblue", "yellow", "yellow3"), 
  border = FALSE
)
Figure: pie chart illusion of a pyramid (caption: Sun arise each and every morning)

Or you can use standard markdown for static figures with the following syntax:

![optional caption that appears below the figure](figure url){alt='alt text for accessibility purposes'}

Figure: Blue Carpentries hex person logo with no text (caption: You belong in The Carpentries!)
Callout

Callout sections can highlight information.

They are sometimes used to emphasise particularly important points but are also used in some lessons to present “asides”: content that is not central to the narrative of the lesson, e.g. by providing the answer to a commonly-asked question.

Math


One of our episodes contains \(\LaTeX\) equations when describing how to create dynamic reports with {knitr}, so we now use mathjax to describe this:

$\alpha = \dfrac{1}{(1 - \beta)^2}$ becomes: \(\alpha = \dfrac{1}{(1 - \beta)^2}\)

Cool, right?

Key Points
  • Use .md files for episodes when you want static content
  • Use .Rmd files for episodes when you need to generate output
  • Run sandpaper::check_lesson() to identify any issues with your lesson
  • Run sandpaper::build_lesson() to preview your lesson locally

Content from What is OMOP?


Last updated on 2025-09-10

Overview

Questions

  • What is OMOP?
  • What information would you expect to find in the person table?
  • What information would you expect to find in the condition_occurrence table?
  • How can you join these tables to aggregate information?

Objectives

  • Examine the diagram of the OMOP tables and the data specification
  • Interrogate the data in the tables
  • Join these tables to find the concept names

Setting up R


Getting started

Since we want to import the files called *.csv into our R environment, we need to be able to tell our computer where the file is. To do this, we will create a “Project” with RStudio that contains the data we want to work with. The “Projects” interface in RStudio not only creates a working directory for you, but also remembers its location (allowing you to quickly navigate to it). The interface also (optionally) preserves custom settings and open files to make it easier to resume work after a break.

Install the required packages

You will need the dplyr, remotes and readr packages from CRAN (the official package repository). You will also need omopcept, a package we have developed. It can be installed with:

remotes::install_github("SAFEHR-data/omopcept")

Create a new project

Introduction


OMOP is a format for recording Electronic Healthcare Records. It allows you to follow a patient journey through a hospital by linking every aspect to a standard vocabulary thus enabling easy sharing of data between hospitals, trusts and even countries.

OMOP CDM Diagram

Figure: a diagram showing the tables that occur in the OMOP-CDM, how they relate to each other and standard vocabularies (caption: The OMOP Common Data Model)

OMOP CDM stands for the Observational Medical Outcomes Partnership Common Data Model. You don’t really need to remember what OMOP stands for. Remembering that CDM stands for Common Data Model can help you remember that it is a data standard that can be applied to different data sources to create data in a Common (same) format. The table diagram will look confusing to start with but you can use data in the OMOP CDM without needing to understand (or populate) all 37 tables.

Challenge

Test yourself

Look at the OMOP-CDM figure and answer the following questions:

  1. Which table is the key to all the other tables?
  2. Which table allows you to distinguish between different stays in hospital?
Solution

  1. The Person table
  2. The Visit_occurrence table

There are a handful of core tables and columns that contain key information about a patient’s journey in the hospital. Here are seven tables to get you started:

  • person: uniquely identifies each person or patient and holds some demographic information. This is the central table that all other tables relate to.
  • condition_occurrence: records relating to a Person that suggest the presence of a medical condition.
  • drug_exposure: records about exposure of a patient to a drug.
  • procedure_occurrence: activities carried out by a healthcare provider on the patient with a diagnostic or therapeutic purpose.
  • measurement: numerical or categorical values obtained through standardized examination of a Person or a Person’s sample.
  • observation: clinical facts about a Person obtained in the context of examination, questioning or a procedure.
  • visit_occurrence: records of times when Persons engage with the healthcare system.

Why use OMOP?


Figure: a diagram showing that different sources of data, transformed to OMOP, can then be used by multiple analysis tools (caption: Why use the OMOP-CDM)

Once a database has been converted to the OMOP CDM, evidence can be generated using standardized analytics tools. This means that different tools can also be shared and reused. So using OMOP can help make your research FAIR (Findable, Accessible, Interoperable, Reusable).

Some simple tables


Loading Data

Now that we are set up with an RStudio project, we can be sure that the data and scripts we are using are all in our working directory. The data files should be located in the directory data, inside the working directory. Now we can load the data into R.

Challenge

Read the Data

There are three data files person.csv, condition_occurrence.csv and drug_exposure.csv. Read each of these into a table with the same name.

NOTE: The data does have headers.

R

person <- read.csv(file = "data/person.csv")
condition_occurrence <- read.csv(file = "data/condition_occurrence.csv")
drug_exposure <- read.csv(file = "data/drug_exposure.csv")

When you have read in the data, take some time to explore it.
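
If you are not sure where to start exploring, the sketch below shows a few base R functions that are often useful for a first look. So that it runs without the CSV files, it rebuilds the person table by hand from the values printed later in this episode; with the real data you would call the same functions on the tables you have just read in.

```r
# Rebuild the small person table by hand (values copied from the output
# shown later in this episode) so the example runs without the CSV file.
person <- data.frame(
  person_id = 1:4,
  year_of_birth = c(1980, 1971, 2000, 2010),
  gender_concept_id = c(8532, 8532, 8507, 8507),
  race_concept_id = c(46285833, 46286810, 46285836, 37394011)
)

dim(person)                    # number of rows and columns
names(person)                  # column names
str(person)                    # column types and first few values
head(person, 3)                # first three rows
summary(person$year_of_birth)  # range of birth years
```

The same calls work on condition_occurrence and drug_exposure.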

Adding concept names


You will have noticed that the contents of the tables are not terribly easy to understand. This is because everything in OMOP is viewed as a concept, which allows it to be related to one or more standard vocabularies such as SNOMED, ICD-10, etc.

We have developed a package that makes it very easy to add concept names to the tables.

You will need the function omopcept::omop_join_name_all(). This will look up the concept_id in the main table of concepts and add a column for the name of the concept associated with that id.
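
To get a feel for what such a lookup involves, here is a minimal sketch of the idea using dplyr only. It is not how omopcept is implemented: the two-row concept table and the object name person_named_sketch are made up for illustration, and a real concept table holds millions of rows.

```r
library(dplyr)

person <- tibble(
  person_id = 1:4,
  gender_concept_id = c(8532, 8532, 8507, 8507)
)

# Hypothetical two-row extract of the OMOP concept table (assumption:
# only the two gender concepts used in this example are included)
concept <- tibble(
  concept_id = c(8507, 8532),
  concept_name = c("MALE", "FEMALE")
)

# Look up each gender_concept_id in the concept table and keep the
# matching name in a new column called gender_concept_name
person_named_sketch <- person |>
  left_join(concept, by = join_by(gender_concept_id == concept_id)) |>
  rename(gender_concept_name = concept_name)

print(person_named_sketch)
```

omop_join_name_all() saves you from writing one such join for every column ending in concept_id.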

Challenge

Who’s who?

By creating tables that also have the names of the concepts, answer the following questions:

  1. How old is the black gentleman?
  2. In which month was an unspecified fever prevalent in the hospital?
  3. What was the ethnicity of the patient not affected by this fever?
  4. Give a description of the patient who received Amoxicillin because they were wheezing.

R

library(omopcept)
person_named <- person |> omop_join_name_all()

OUTPUT

Warning: downloading a subset of omop vocab files, pre-processed.
If you want to make sure you have the vocabs you need, download from Athena, save locally & call `omop_vocabs_preprocess()`

OUTPUT

downloading concept file, may take a few minutes, this only needs to be repeated if the package is re-installed

R

condition_occurrence_named <- condition_occurrence |> omop_join_name_all()
drug_exposure_named <- drug_exposure |> omop_join_name_all()
Solution

  1. 25 (or 24 if he hasn’t had his birthday this year)
  2. July
  3. Don’t know - it hasn’t been specified
  4. A 53/54 white female

Joining and interrogating the tables


Using join

We established when looking at the diagram that the person table was the key to accessing all the other tables. In fact it is the person_id column that is the actual key that will allow us to join with other tables.

So we can join two of the tables together to get information about the different conditions suffered by each person.

I am going to use a left join because I want a record of every person and the conditions they may have.

R

library(dplyr)
person_condition <- 
  person_named |> 
  left_join(condition_occurrence_named, by = join_by(person_id) )

This produces a new table with all the column names from both tables and six rows.
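
The difference between join types is easier to see on a tiny example. The sketch below uses made-up tables (people and conditions are hypothetical names, not part of the lesson data) in which person 3 has no recorded condition.

```r
library(dplyr)

# Hypothetical data: three people, only two of whom have conditions
people <- tibble(person_id = c(1, 2, 3))
conditions <- tibble(
  person_id = c(1, 1, 2),
  condition = c("Fever", "Wheezing", "Fever")
)

# left_join keeps every person; person 3 gets NA for condition
left <- left_join(people, conditions, by = join_by(person_id))

# inner_join keeps only people that appear in both tables
inner <- inner_join(people, conditions, by = join_by(person_id))

nrow(left)   # 4 rows: two for person 1, one each for persons 2 and 3
nrow(inner)  # 3 rows: person 3 is dropped
```

A left join is the safer choice when you want to count everyone, including people with no matching records.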

Using count

Challenge

Challenge

Count the number of people with each condition

R

person_condition |> count(gender_concept_name, condition_concept_name)

OUTPUT

# A tibble: 5 × 3
  gender_concept_name condition_concept_name     n
  <chr>               <chr>                  <int>
1 FEMALE              Fever, unspecified         2
2 FEMALE              Nausea and vomiting        1
3 FEMALE              Wheezing                   1
4 MALE                Fever, unspecified         1
5 MALE                Nausea and vomiting        1

This produces a table:

Figure: a table showing the different conditions listed with the number of males and females suffering from them (caption: A table of the condition counts)
Challenge

For the Confident

Using the “GiBleed” database, work out the number of male and female patients with each condition_concept_id. The solution assumes that cdm is a reference to that database.

R

cdm$person |> 
  left_join( cdm$condition_occurrence, by = join_by(person_id) ) |> 
  group_by(condition_concept_id, gender_concept_id) |>
  summarise(num_persons = n_distinct(person_id)) |>
  collect() |>
  ungroup() |> 
  omopcept::omop_join_name_all() |> 
  #remove some columns to make display clearer
  select(-condition_concept_id, -gender_concept_id) |> 
  arrange(condition_concept_name)  

OUTPUT

# A tibble: 158 × 3
   condition_concept_name    gender_concept_name num_persons
   <chr>                     <chr>                     <dbl>
 1 Acute allergic reaction   MALE                         57
 2 Acute allergic reaction   FEMALE                       59
 3 Acute bacterial sinusitis FEMALE                      418
 4 Acute bacterial sinusitis MALE                        368
 5 Acute bronchitis          FEMALE                     1300
 6 Acute bronchitis          MALE                       1243
 7 Acute cholecystitis       FEMALE                       29
 8 Acute cholecystitis       MALE                          6
 9 Acute viral pharyngitis   MALE                       1284
10 Acute viral pharyngitis   FEMALE                     1322
# ℹ 148 more rows

Solutions for working out Who’s who programmatically.

Challenge

How old is the black gentleman?

R

library(dplyr)

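# Keep only the rows whose race concept name mentions "black"
black_person <- person_named %>%
  filter(grepl("black", race_concept_name, ignore.case = TRUE)) %>%
  select(year_of_birth)

# The question implies there is only one such person, so take the first row
birth_year <- black_person$year_of_birth[1]
age <- 2025 - birth_year

print(age)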

OUTPUT

[1] 25
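
The solution above hard-codes the year 2025. As a sketch of one alternative, the helper below (age_in_year is a hypothetical name, not a function from any package) takes the reference year as an argument and defaults to the current year. Because only the year of birth is used, the result can still be one year too high if the birthday has not yet happened.

```r
# Hypothetical helper: age from year of birth alone.
# reference_year defaults to the current calendar year.
age_in_year <- function(year_of_birth,
                        reference_year = as.integer(format(Sys.Date(), "%Y"))) {
  reference_year - year_of_birth
}

# Fixing the reference year makes the result reproducible
age_in_year(2000, reference_year = 2025)  # 25

# Works on a whole column at once
age_in_year(c(1980, 1971, 2000, 2010), reference_year = 2025)
```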
Challenge

In which month was an unspecified fever prevalent in the hospital?

The lubridate package has a function, month(), that extracts the month from a date:

R

library(lubridate)

OUTPUT


Attaching package: 'lubridate'

OUTPUT

The following objects are masked from 'package:base':

    date, intersect, setdiff, union

R

jan_month <- month(ymd("2025-01-30"), label = TRUE)
print(jan_month)

OUTPUT

[1] Jan
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

which returns the month as the labelled value Jan.

The dplyr package has a function, mutate(), that allows you to add columns to a table:

R

new_person_table <- mutate(person, age = 2025 - year_of_birth)
print(new_person_table)

OUTPUT

  person_id year_of_birth gender_concept_id race_concept_id age
1         1          1980              8532        46285833  45
2         2          1971              8532        46286810  54
3         3          2000              8507        46285836  25
4         4          2010              8507        37394011  15

which adds a column called age to the person table

R

library(dplyr)
library(lubridate)

fever_months <- condition_occurrence_named %>%
  # Filter for rows where the condition name contains "fever"
  filter(grepl("fever", condition_concept_name, ignore.case = TRUE)) %>%
  # Extract month from the condition_start_date
  mutate(month = month(condition_start_date, label = TRUE)) %>%
  # Select just the month column for the results
  select(month) %>%
  # Count occurrences by month
  group_by(month) %>%
  summarise(count = n()) %>%
  # Sort by count (descending)
  arrange(desc(count))

# This gives us a table with the months where people had a fever and how many people had the fever each month. The question tells us that there is only one month so we select that.
fever <- fever_months$month[1]

# View the results
print(fever)

OUTPUT

[1] Jul
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
Challenge

What was the ethnicity of the patient not affected by this fever?

R

library(dplyr)

people_without_condition <- person_named %>%
  # anti_join keeps only the people who have no matching fever record
  anti_join(
    condition_occurrence_named %>%
      filter(grepl("fever", condition_concept_name, ignore.case = TRUE)),
    by = "person_id"
  )

ethnicity <- people_without_condition$race_concept_name

# View results
print(ethnicity)

OUTPUT

[1] "Ethnicity not stated"
Challenge

Give a description of the patient who received Amoxicillin because they were wheezing?

R

library(dplyr)

# Query to find persons prescribed amoxicillin for wheezing
amoxicillin_for_wheezing <- person_named %>%
  # Join with condition_occurrence table 
  inner_join(condition_occurrence_named, by = "person_id") %>%
  # Filter for wheezing condition
  filter(grepl("wheez", condition_concept_name, ignore.case = TRUE)) %>%
  # Join with drug_exposure table to find medications
  inner_join(drug_exposure_named, by = "person_id") %>%
  # Filter for amoxicillin prescriptions
  filter(grepl("amoxicillin", drug_concept_name, ignore.case = TRUE)) 
  
# again the question implies there is only one person
age <- 2025 - amoxicillin_for_wheezing$year_of_birth[1]
gender <- amoxicillin_for_wheezing$gender_concept_name[1]
ethnicity <- amoxicillin_for_wheezing$race_concept_name[1]

# View results
print(age)

OUTPUT

[1] 54

R

print(gender)

OUTPUT

[1] "FEMALE"

R

print(ethnicity)

OUTPUT

[1] "White: English or Welsh or Scottish or Northern Irish or British - England and Wales ethnic category 2011 census"
Key Points
  • Using a standard makes it much easier to share data
  • OMOP uses concepts to link data to standard vocabularies
  • R can be used to join and interrogate data

Content from Why OMOP?


Last updated on 2025-09-10

Overview

Questions

  • Why use OMOP?
  • Why not use spreadsheets?
  • What are the advantages of OMOP?
  • What are the disadvantages of OMOP?

Objectives

  • Examine the diagram of the OMOP tables and the data specification
  • Familiarise yourself with the vocabulary schema
  • Join two or more tables together
  • Attempt to join data from spreadsheets with different structures
  • Describe the pros and cons of using OMOP vs raw data, and why this is the way forward
  • Use Athena and other OHDSI tools for reference
  • Describe the full landscape of OMOP tools and the community

Challenge

Challenge 1: Compare data from two separate OMOP data sets

Let’s read in an OMOP extract called extract_1 from the local files.

R

omop_dataset_file_location_1 <- here::here("extracts/uclh1")

extract_1 <- read_omop_dataset(omop_dataset_file_location_1)

A colleague at the hospital is familiar with the events that occurred in hospital to one of the patients in the dataset. This patient has been identified by the data team as the anonymised patient with person id 7.

R

extract_1$person |>
  filter(person_id==7) |>
  select(person_id, race_concept_id, gender_concept_id, year_of_birth) |>
  omopcept::omop_join_name_all() |>
  collect()
Checklist

Verify that the patient details match your colleague’s description.

Let’s take a sample of patients in this dataset, selecting those same columns.

R

extract_1_pt_sample <- extract_1$person |>
  slice_sample(n = 10) |>
  select(person_id, race_concept_id, gender_concept_id, year_of_birth) |>
  collect()

We’ve received another OMOP dataset from another site.

R

omop_dataset_file_location_2 <- here::here("extracts/other_site_1")

extract_2 <- read_omop_dataset(omop_dataset_file_location_2)

Let’s take a sample of patients from the second extract and bind them together.

Callout

Note that, because the structure of the data (table names, columns and data types) is set by the OMOP specification, we can bind these two datasets together without error. We can also re-apply the same code, only changing the reference to the new extract.

R

extract_2_pt_sample <- extract_2$person |>
  slice_sample(n = 10) |>
  select(person_id, race_concept_id, gender_concept_id, year_of_birth) |>
  collect()
  
bind_rows(extract_1_pt_sample, extract_2_pt_sample)

R

dplyr::tibble(person_id = c(101,102,201,202), year_of_birth = c(1992,1993,1994,1995))
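
If you do not have the two extracts to hand, the following sketch shows the same binding step on two small hand-made samples. The object names site_1_sample and site_2_sample are hypothetical; the values are the ones used in the tibble above. The .id argument is an optional extra that records which dataset each row came from.

```r
library(dplyr)

# Two hand-made samples with identical column names and types,
# standing in for samples from two OMOP extracts
site_1_sample <- tibble(person_id = c(101, 102), year_of_birth = c(1992, 1993))
site_2_sample <- tibble(person_id = c(201, 202), year_of_birth = c(1994, 1995))

# Because the structure matches, the rows stack without any renaming.
# Naming the inputs and setting .id adds a column identifying the source.
combined <- bind_rows(
  site_1 = site_1_sample,
  site_2 = site_2_sample,
  .id = "site"
)

print(combined)
```

Keeping a source column like this is a design choice: person_id values are only unique within one extract, so the pair of site and person_id is what identifies a patient after binding.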

Content from Chapter 3 OMOP measurement and observation tables


Last updated on 2025-09-10

Overview

Questions

  • What do the measurement and observation tables contain?
  • How do we access values?

Objectives

  • Understand that these tables contain concept_ids and the values recorded for them
  • Know that values can be numeric with units and/or categorical, specified by a standard concept_id that can be looked up in the concept table

Introduction


The OMOP measurement and observation tables contain information collected about a person. The difference between them is that measurement contains numerical or categorical values collected by a standardised process, whereas observation contains less standardised clinical facts. Measurements are predominantly lab tests, with a few exceptions such as blood pressure or function tests.

A person_id column means that there can be multiple records per person.

Columns are similar between measurement and observation.

Concepts and values


Data are stored as questions and answers. A question (e.g. Pulse rate) is defined by a concept_id and the answer is stored in a value column.

The measurement_concept_id or observation_concept_id columns define what has been recorded. Here are some examples:

| Example Measurement concepts | Example Observation concepts |
|---|---|
| Respiratory rate | Respiratory function |
| Pulse rate | Wound dressing observable |
| Hemoglobin saturation with oxygen | Mandatory breath rate |
| Body temperature | Body position for blood pressure measurement |
| Diastolic blood pressure | Alcohol intake - finding |
| Arterial oxygen saturation | Tobacco smoking behavior - finding |
| Body weight | Vomit appearance |
| Leukocytes [#/volume] in Blood | State of consciousness and awareness |

These columns store the values:

| column name | data type | example | concept_name |
|---|---|---|---|
| value_as_number | numeric value | 1.2 | - |
| unit_concept_id | units of the numeric value | 9529 | kilogram |
| value_as_concept_id | categorical value | 4328749 | High |
| operator_concept_id | optional operators | 4172704 | > |

Where values are a concept_id, the name of that concept can be looked up in the concept table that is part of the OMOP vocabularies and included in most CDM instances. We show this below.
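
Before working with a full CDM, it can help to see the lookup on a couple of rows. This is a minimal sketch on hand-made data: the two measurement rows are invented, and the two-row concept table contains only the concept ids listed in the table above.

```r
library(dplyr)

# Hypothetical measurement rows: one numeric result with a unit,
# one categorical result (assumption: invented for illustration)
measurement <- tibble(
  measurement_id = 1:2,
  value_as_number = c(1.2, NA),
  unit_concept_id = c(9529, NA),
  value_as_concept_id = c(NA, 4328749)
)

# Tiny stand-in for the concept table
concept_names <- tibble(
  concept_id = c(9529, 4328749),
  concept_name = c("kilogram", "High")
)

# Join the concept table once per concept_id column,
# renaming the name column after each join
measurement_named <- measurement |>
  left_join(concept_names, by = join_by(unit_concept_id == concept_id)) |>
  rename(unit_concept_name = concept_name) |>
  left_join(concept_names, by = join_by(value_as_concept_id == concept_id)) |>
  rename(value_as_concept_name = concept_name)

print(measurement_named)
```

Rows with no unit or no categorical value simply get NA in the corresponding name column.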

Looking at numeric measurement values


R


install.packages("MeasurementDiagnostics")
library(dplyr)

cdm <- MeasurementDiagnostics::mockMeasurementDiagnostics()

# first we can see that some concepts have a value_as_number
freq_numeric <- cdm$measurement |>
  filter(!is.na(value_as_number)) |> 
  count(measurement_concept_id, unit_concept_id) |> 
  # join concept names
  left_join(select(cdm$concept, concept_id, concept_name), by=join_by(measurement_concept_id==concept_id)) |> 
  rename(measurement_concept_name=concept_name) |> 
  left_join(select(cdm$concept, concept_id, concept_name), by=join_by(unit_concept_id==concept_id)) |> 
  rename(unit_concept_name=concept_name) |> 
  collect()
  
freq_numeric |> head(3)
  

Output of numeric values


TODO: these data from MeasurementDiagnostics aren’t great, but they are a start.

  • Good: it is a complete CDM with the concept table for joining names.
  • Bad: few values and unrealistic units.

OUTPUT


  measurement_concept_id unit_concept_id     n measurement_concept_name                         unit_concept_name
                   <int>           <dbl> <dbl> <chr>                                            <chr>
1                3002069            9529    50 Alkaline phosphatase.bone/Alkaline phosphatase.… kilogram
2                3012056            9529    50 Uroporphyrin 3 isomer [Moles/volume] in Urine    kilogram
3                3026074            9529    50 Uroporphyrin 3 isomer [Moles/volume] in Stool    kilogram
Challenge

Challenge 1: Can you adapt the code above to output the frequency of categorical values?

R

# other concepts have a value_as_concept_id
freq_categorical <- cdm$measurement |>
  filter(!is.na(value_as_concept_id)) |> 
  count(measurement_concept_id, value_as_concept_id) |> 
  # join concept names
  left_join(select(cdm$concept, concept_id, concept_name), by=join_by(measurement_concept_id==concept_id)) |>  
  rename(measurement_concept_name=concept_name) |> 
  left_join(select(cdm$concept, concept_id, concept_name), by=join_by(value_as_concept_id==concept_id)) |> 
  rename(value_as_concept_name=concept_name) |> 
  collect()
  
freq_categorical |> head(3)

OUTPUT

  measurement_concept_id value_as_concept_id     n measurement_concept_name                 value_as_concept_name
                   <int>               <dbl> <dbl> <chr>                                    <chr>
1                3001467             4328749    33 Alkaline phosphatase.bone [Enzymatic ac… High
2                3002069             4267416    33 Alkaline phosphatase.bone/Alkaline phos… Low
3                3011539             4267416    33 Uroporphyrin 3 isomer [Moles/volume] in… Low

When a measurement or observation was done


There are date and time columns indicating when a measurement or observation was performed.

A visit_occurrence_id specifies the visit that it occurred in.
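
As a rough sketch of how these columns might be used, the example below filters measurements to one month and counts records per visit. The three rows are invented for illustration; measurement_date is the date column of the OMOP measurement table.

```r
library(dplyr)

# Hypothetical measurement rows (assumption: invented for illustration)
measurement <- tibble(
  person_id = c(1, 1, 2),
  visit_occurrence_id = c(10, 10, 20),
  measurement_date = as.Date(c("2025-07-01", "2025-07-02", "2025-08-15"))
)

# Keep only measurements taken in July 2025
july <- measurement |>
  filter(
    measurement_date >= as.Date("2025-07-01"),
    measurement_date <= as.Date("2025-07-31")
  )

# Count how many measurements were recorded during each visit
per_visit <- measurement |>
  count(visit_occurrence_id)

print(july)
print(per_visit)
```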

TODO do we want to add more here with example data

Source data columns


There are columns that can store data from the source database that hasn’t been standardised to the same level. Usually it is not a good idea to use these in analyses because any code is unlikely to work on different data. The source columns can be used to check how the standardised data were arrived at.

Source columns in the measurement table:

measurement_source_value
measurement_source_concept_id
unit_source_value
value_source_value

Advanced: Linking measurement and observation tables to other tables, e.g. specimen


There are measurement_event_id and observation_event_id columns that can link a record to the primary key in another table e.g. specimen_id. If these are used then meas_event_field_concept_id or obs_event_field_concept_id need to contain the concept_id corresponding to the linked table (in this case specimen).

For microbiology results observation_concept_id can store the genus & species of an organism & measurement_concept_id can store its growth.

TODO: question on the above: do they need to be linked by fact_relationship to deal with multiple organisms from the same specimen?

TODO: do we want to add more here with example data?

Key Points
  • the measurement table contains numeric or categorical results of a standardised process
  • the observation table contains less standardised clinical facts

Content from Placeholder Chapter 4


Last updated on 2025-09-10

Overview

Questions

  • What

Objectives

  • 1


Content from Placeholder Chapter 5


Last updated on 2025-09-10

Overview

Questions

  • What

Objectives

  • 1


Content from Placeholder Chapter 6


Last updated on 2025-09-10

Overview

Questions

  • What

Objectives

  • 1

Introduction


This is a lesson created via The Carpentries Workbench. It is written in Pandoc-flavored Markdown for static files and R Markdown for dynamic files that can render code into output. Please refer to the Introduction to The Carpentries Workbench for full documentation.

What you need to know is that there are three sections required for a valid Carpentries lesson template:

  1. questions are displayed at the beginning of the episode to prime the learner for the content.
  2. objectives are the learning objectives for an episode, displayed alongside the questions.
  3. keypoints are displayed at the end of the episode to reinforce the objectives.
Challenge

Challenge 1: Can you do it?

What is the output of this command?

R

paste("This", "new", "lesson", "looks", "good")

OUTPUT

[1] "This new lesson looks good"
Challenge

Challenge 2: how do you nest solutions within challenge blocks?

You can add a line with at least three colons and a solution tag.

Figures


You can also include figures generated from R Markdown:

R

pie(
  c(Sky = 78, "Sunny side of pyramid" = 17, "Shady side of pyramid" = 5), 
  init.angle = 315, 
  col = c("deepskyblue", "yellow", "yellow3"), 
  border = FALSE
)
[Figure: pie chart illusion of a pyramid. Caption: Sun arise each and every morning]

Or you can use standard markdown for static figures with the following syntax:

![optional caption that appears below the figure](figure url){alt='alt text for accessibility purposes'}

[Figure: Blue Carpentries hex person logo with no text. Caption: You belong in The Carpentries!]
Callout

Callout sections can highlight information.

They are sometimes used to emphasise particularly important points but are also used in some lessons to present “asides”: content that is not central to the narrative of the lesson, e.g. by providing the answer to a commonly-asked question.

Math


One of our episodes contains \(\LaTeX\) equations when describing how to create dynamic reports with {knitr}, so we now use MathJax to render them:

$\alpha = \dfrac{1}{(1 - \beta)^2}$ becomes: \(\alpha = \dfrac{1}{(1 - \beta)^2}\)

Cool, right?

Key Points
  • Use .md files for episodes when you want static content
  • Use .Rmd files for episodes when you need to generate output
  • Run sandpaper::check_lesson() to identify any issues with your lesson
  • Run sandpaper::build_lesson() to preview your lesson locally

Content from Placeholder Chapter 7


Last updated on 2025-09-10

Overview

Questions

  • What

Objectives

  • 1

Introduction


This is a lesson created via The Carpentries Workbench. It is written in Pandoc-flavored Markdown for static files and R Markdown for dynamic files that can render code into output. Please refer to the Introduction to The Carpentries Workbench for full documentation.

What you need to know is that there are three sections required for a valid Carpentries lesson template:

  1. questions are displayed at the beginning of the episode to prime the learner for the content.
  2. objectives are the learning objectives for an episode, displayed alongside the questions.
  3. keypoints are displayed at the end of the episode to reinforce the objectives.
Challenge

Challenge 1: Can you do it?

What is the output of this command?

R

paste("This", "new", "lesson", "looks", "good")

OUTPUT

[1] "This new lesson looks good"
Challenge

Challenge 2: how do you nest solutions within challenge blocks?

You can add a line with at least three colons and a solution tag.

Figures


You can also include figures generated from R Markdown:

R

pie(
  c(Sky = 78, "Sunny side of pyramid" = 17, "Shady side of pyramid" = 5), 
  init.angle = 315, 
  col = c("deepskyblue", "yellow", "yellow3"), 
  border = FALSE
)
[Figure: pie chart illusion of a pyramid. Caption: Sun arise each and every morning]

Or you can use standard markdown for static figures with the following syntax:

![optional caption that appears below the figure](figure url){alt='alt text for accessibility purposes'}

[Figure: Blue Carpentries hex person logo with no text. Caption: You belong in The Carpentries!]
Callout

Callout sections can highlight information.

They are sometimes used to emphasise particularly important points but are also used in some lessons to present “asides”: content that is not central to the narrative of the lesson, e.g. by providing the answer to a commonly-asked question.

Math


One of our episodes contains \(\LaTeX\) equations when describing how to create dynamic reports with {knitr}, so we now use MathJax to render them:

$\alpha = \dfrac{1}{(1 - \beta)^2}$ becomes: \(\alpha = \dfrac{1}{(1 - \beta)^2}\)

Cool, right?

Key Points
  • Use .md files for episodes when you want static content
  • Use .Rmd files for episodes when you need to generate output
  • Run sandpaper::check_lesson() to identify any issues with your lesson
  • Run sandpaper::build_lesson() to preview your lesson locally

Content from Placeholder Chapter 8


Last updated on 2025-09-10

Overview

Questions

  • What

Objectives

  • 1

Introduction


This is a lesson created via The Carpentries Workbench. It is written in Pandoc-flavored Markdown for static files and R Markdown for dynamic files that can render code into output. Please refer to the Introduction to The Carpentries Workbench for full documentation.

What you need to know is that there are three sections required for a valid Carpentries lesson template:

  1. questions are displayed at the beginning of the episode to prime the learner for the content.
  2. objectives are the learning objectives for an episode, displayed alongside the questions.
  3. keypoints are displayed at the end of the episode to reinforce the objectives.
Challenge

Challenge 1: Can you do it?

What is the output of this command?

R

paste("This", "new", "lesson", "looks", "good")

OUTPUT

[1] "This new lesson looks good"
Challenge

Challenge 2: how do you nest solutions within challenge blocks?

You can add a line with at least three colons and a solution tag.

Figures


You can also include figures generated from R Markdown:

R

pie(
  c(Sky = 78, "Sunny side of pyramid" = 17, "Shady side of pyramid" = 5), 
  init.angle = 315, 
  col = c("deepskyblue", "yellow", "yellow3"), 
  border = FALSE
)
[Figure: pie chart illusion of a pyramid. Caption: Sun arise each and every morning]

Or you can use standard markdown for static figures with the following syntax:

![optional caption that appears below the figure](figure url){alt='alt text for accessibility purposes'}

[Figure: Blue Carpentries hex person logo with no text. Caption: You belong in The Carpentries!]
Callout

Callout sections can highlight information.

They are sometimes used to emphasise particularly important points but are also used in some lessons to present “asides”: content that is not central to the narrative of the lesson, e.g. by providing the answer to a commonly-asked question.

Math


One of our episodes contains \(\LaTeX\) equations when describing how to create dynamic reports with {knitr}, so we now use MathJax to render them:

$\alpha = \dfrac{1}{(1 - \beta)^2}$ becomes: \(\alpha = \dfrac{1}{(1 - \beta)^2}\)

Cool, right?

Key Points
  • Use .md files for episodes when you want static content
  • Use .Rmd files for episodes when you need to generate output
  • Run sandpaper::check_lesson() to identify any issues with your lesson
  • Run sandpaper::build_lesson() to preview your lesson locally