Measurements and Observations

Last updated on 2026-02-08

Overview

Questions

How do we access measurements and observations?

Objectives

Know that measurements are mainly lab results and other records like pulse rate
Know observations are other facts obtained through questioning or direct observation
Understand that concept ids identify the measurement or observation, and that values are stored in value_as_number or value_as_concept_id
Be able to join to the concept table to find a particular measurement or observation concept by name

Introduction

This episode covers the OMOP measurement and observation tables.

Callout

For this episode we will be using a sample OMOP CDM database that is pre-loaded with data. This database is a simplified version of a real-world OMOP CDM database and is intended for educational purposes only.

(UCLH only) This will come in the same form as you would get data if you asked for a data extract via the SAFEHR platform (i.e. a set of parquet files).

As part of the setup prior to this course you were asked to download and install the sample database. If you have not done this yet, please refer to the setup instructions provided earlier in the course. For now, we will assume that you have the sample OMOP CDM database available on your local machine at the following path: workshop/data/public/ and the functions in a folder workshop/code.

You will then need to load the database as shown in the previous episode.

R

open_omop_dataset <- function(dir) {
  open_omop_schema <- function(path) {
    # iterate over the table-level folders
    list.dirs(path, recursive = FALSE) |>
      # name each list element after its folder (the table name)
      purrr::set_names(~ basename(.)) |>
      # "lazy-open" the parquet files in each folder
      purrr::map(arrow::open_dataset)
  }
  # iterate over the top-level (schema) folders
  list.dirs(dir, recursive = FALSE) |>
    # name each list element after its folder (the schema name)
    purrr::set_names(~ basename(.)) |>
    purrr::map(open_omop_schema)
}

R

omop <- open_omop_dataset("./data/")

You will also need the helper functions we created in the previous episode to look up concept names and ids.

R

library(arrow)
library(dplyr)
get_concept_name <- function(id) {
  omop$public$concept |>
    filter(concept_id == !!id) |>
    select(concept_name) |>
    collect()
}

R

get_concept_id <- function(name) {
  omop$public$concept |>
    filter(concept_name == !!name) |>
    select(concept_id) |>
    collect()
}

The OMOP measurement and observation tables contain information collected about a person.

The difference between them is that measurement contains numerical or categorical values collected by a standardised process, whereas observation contains less standardised clinical facts. Measurements are often lab results, vital signs or other clinical measurements such as height, weight, blood pressure, pulse rate, respiratory rate, oxygen saturations etc. Observations are other facts obtained through questioning or direct observation, for example smoking status, alcohol intake, family history, symptoms reported by the patient etc.

Both tables have a person_id column that links each record to a person, so there can be multiple records per person.

Columns are similar between measurement and observation.

Concepts and values

Data are stored as questions and answers. A question (e.g. Pulse rate) is defined by a concept_id and the answer is stored in a value column.

The measurement_concept_id or observation_concept_id columns define what has been recorded. Here are some examples:

Example Measurement concepts	Example Observation concepts
Respiratory rate	Respiratory function
Pulse rate	Wound dressing observable
Hemoglobin saturation with oxygen	Mandatory breath rate
Body temperature	Body position for blood pressure measurement
Diastolic blood pressure	Alcohol intake - finding
Arterial oxygen saturation	Tobacco smoking behavior - finding
Body weight	Vomit appearance
Leukocytes [#/volume] in Blood	State of consciousness and awareness

Challenge

Looking at the measurement and observation tables, identify the columns that might store a value and its associated information (e.g. units).

Show me the solution

The following columns store values and their associated information:

column name	data type	example	concept_name
value_as_number	numeric value	1.2	-
unit_concept_id	units of the numeric value	9529	kilogram
value_as_concept_id	categorical value	4328749	High
operator_concept_id	optional operators	4172704	>

Note that where a value is a concept_id, the name of that concept can be looked up in the concept table, which is part of the OMOP vocabularies and is included in most CDM instances.
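As a minimal sketch of that lookup, the example below joins a small made-up measurement table to a toy concept table. The concept ids and names come from the table above and from the Heart rate concept used later in this episode; the measurement rows themselves are invented for illustration and are not taken from the sample database.

```r
library(dplyr)

# Toy concept table: ids and names as listed in this episode.
concept_toy <- tibble(
  concept_id   = c(9529L, 4328749L, 4172704L, 3027018L),
  concept_name = c("kilogram", "High", ">", "Heart rate")
)

# Made-up measurement rows (assumption, for illustration only):
# one numeric value and one categorical value.
measurement_toy <- tibble(
  measurement_id         = 1:2,
  measurement_concept_id = c(3027018L, 3027018L),
  value_as_number        = c(72, NA),
  value_as_concept_id    = c(NA, 4328749L)
)

# Look up the name of the categorical value. Rows without a
# value_as_concept_id have no match and get NA as their name.
decoded <- measurement_toy |>
  left_join(concept_toy, by = join_by(value_as_concept_id == concept_id)) |>
  rename(value_as_concept_name = concept_name)

print(decoded)
```

The same join pattern applies to unit_concept_id and operator_concept_id, since they are also concept ids.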

Look at the column values we have got in the tables associated with our database.

R

omop$public$measurement |> colnames() |> print()

OUTPUT

 [1] "measurement_id"         "person_id"              "measurement_concept_id"
 [4] "measurement_date"       "measurement_datetime"   "operator_concept_id"
 [7] "value_as_number"        "value_as_concept_id"    "unit_concept_id"
[10] "range_low"              "range_high"             "visit_occurrence_id"

R

omop$public$observation |> colnames() |> print()

OUTPUT

[1] "observation_id"         "person_id"              "observation_concept_id"
[4] "observation_date"       "observation_datetime"   "value_as_number"
[7] "value_as_string"        "value_as_concept_id"    "visit_occurrence_id"

You can see that for observations the main value is a string or a concept, whereas for a measurement the main value is a number accompanied by the concept id of a unit.
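One way to make this comparison explicit is to strip the table-specific prefix from the column names and take set differences. The sketch below copies the column names from the two outputs above into plain character vectors so that it runs without the database; strip_prefix is a hypothetical helper introduced only for this example.

```r
# Column names copied from the outputs above.
measurement_cols <- c(
  "measurement_id", "person_id", "measurement_concept_id",
  "measurement_date", "measurement_datetime", "operator_concept_id",
  "value_as_number", "value_as_concept_id", "unit_concept_id",
  "range_low", "range_high", "visit_occurrence_id"
)
observation_cols <- c(
  "observation_id", "person_id", "observation_concept_id",
  "observation_date", "observation_datetime", "value_as_number",
  "value_as_string", "value_as_concept_id", "visit_occurrence_id"
)

# Hypothetical helper: remove the leading table name so that, for example,
# measurement_date and observation_date compare as equal.
strip_prefix <- function(x) sub("^(measurement|observation)_", "", x)

only_in_measurement <- setdiff(strip_prefix(measurement_cols), strip_prefix(observation_cols))
only_in_observation <- setdiff(strip_prefix(observation_cols), strip_prefix(measurement_cols))

print(only_in_measurement)
print(only_in_observation)
```

With the real tables, the two vectors could instead be obtained with colnames() as shown above.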

Looking at observation values

Let’s focus on observations.

Now we could go through each table and use our get_concept_name function to work out what all these measurements and observations are, but that could get a bit tedious!

Instead, let’s join to the concept table and produce a table that gives us the human-readable names from the start.

Challenge

By joining to the concept table produce a version of the observation table with concept names. Only include columns that are relevant to the value.

Show me the solution

R

library(dplyr)
# Pre-load concept names and ids
# (select the two columns before collecting, so only they are read into memory)
concepts <- omop$public$concept |>
  select(concept_id, concept_name) |>
  collect()

# Create a mini observation table with only the columns relevant to value
mini_observation <- omop$public$observation |>
  select(observation_id, person_id, observation_concept_id, value_as_concept_id, value_as_number) |>
  collect()

# Join to get names of the observation concept id
# Rename the new column to observation_concept_name
# Relocate the new column to be after observation_concept_id
mini_observation <- mini_observation |>
  left_join(concepts, by=join_by(observation_concept_id == concept_id)) |>
  rename(observation_concept_name = concept_name) |>
  relocate(observation_concept_name, .after = observation_concept_id)

# Repeat the join to get names of the value concept id
mini_observation <- mini_observation |>
  left_join(concepts, by = join_by(value_as_concept_id == concept_id)) |>
  rename(value_as_concept_name = concept_name) |>
  relocate(value_as_concept_name, .after = value_as_concept_id)

Now we can look at this named table. View() opens the table in a data viewer, so it needs an interactive session such as RStudio; elsewhere it fails with an "unable to start data viewer" error.

R

View(mini_observation)
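Where no data viewer is available, a printed summary is one alternative. The sketch below counts records per observation concept name. It uses a small made-up stand-in for mini_observation (the names are borrowed from the example concepts listed earlier; the rows are invented) so that it runs on its own.

```r
library(dplyr)

# Made-up stand-in for mini_observation (assumption, for illustration only).
mini_observation_toy <- tibble(
  observation_id = 1:4,
  observation_concept_name = c(
    "Alcohol intake - finding",
    "Vomit appearance",
    "Alcohol intake - finding",
    "Tobacco smoking behavior - finding"
  )
)

# Count how many records there are for each observation concept,
# most frequent first.
observation_counts <- mini_observation_toy |>
  count(observation_concept_name, sort = TRUE)

print(observation_counts)
```

Applied to the real mini_observation table, the same count() call gives a quick overview of which observations are recorded most often.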

Social indexes

It is interesting to note that some observations relate to social indices, such as indices of deprivation. As their concept names indicate, these observations apply to England only.

Challenge

Create a mini version of the concept table that contains only the concepts relating to social indices.

Show me the solution

R

social_concepts <- omop$public$concept |>
  filter(concept_id %in% c(35812882, 35812883, 35812884, 35812885, 35812888)) |>
  collect()

social_concepts

OUTPUT

# A tibble: 5 × 6
  concept_id concept_name               domain_id vocabulary_id standard_concept
       <int> <chr>                      <chr>     <chr>         <chr>
1   35812882 Index of Multiple Depriva… Observat… UK Biobank    ""
2   35812883 Income score (England)     Observat… UK Biobank    ""
3   35812884 Employment score (England) Observat… UK Biobank    ""
4   35812885 Health score (England)     Observat… UK Biobank    ""
5   35812888 Crime score (England)      Observat… UK Biobank    ""
# ℹ 1 more variable: concept_class_id <chr>

These concepts come from the UK Biobank vocabulary and have an empty standard_concept field, so this is an instance of non-standard concepts being used within OMOP.
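A minimal sketch of how such concepts could be picked out by their standard_concept flag is shown below. It uses a toy concept table: the two UK Biobank rows are taken from the output above, while the Heart rate row, including its vocabulary and its "S" flag, is an assumption added only to provide a contrasting standard concept.

```r
library(dplyr)

# Toy concept table (illustration only).
concept_toy <- tibble(
  concept_id       = c(35812883L, 35812888L, 3027018L),
  concept_name     = c("Income score (England)", "Crime score (England)", "Heart rate"),
  vocabulary_id    = c("UK Biobank", "UK Biobank", "LOINC"),  # LOINC is an assumption
  standard_concept = c("", "", "S")                           # "S" is an assumption
)

# Non-standard concepts have a missing or empty standard_concept value.
non_standard <- concept_toy |>
  filter(is.na(standard_concept) | standard_concept == "")

print(non_standard)
```

Checking for both NA and the empty string covers the two ways a missing flag might appear once the data are loaded.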

Looking at measurement values

Let’s now look at measurements. As we said before, measurements are often numerical values with associated units. This can arise from lab results or vital signs.

Challenge

Consider the concept with the name Heart rate. Use the measurement and concept tables to answer the following questions:

What are the units associated with this measurement concept?
What is the average value recorded for this measurement across all persons?
What class of concept is this measurement concept?

Show me the solution

What are the units associated with Heart rate?

R

# Get the concept id for Heart rate  
heart_rate_id <- get_concept_id("Heart rate")$concept_id
heart_rate_id

OUTPUT

[1] 3027018

R

# Filter measurement table for this concept id
heart_rate_measurements <- omop$public$measurement |>
  filter(measurement_concept_id == heart_rate_id) |>
  collect()
# Get the unique unit concept ids
unique_units <- unique(heart_rate_measurements$unit_concept_id)
get_concept_name(unique_units)

OUTPUT

# A tibble: 1 × 1
  concept_name
  <chr>
1 per minute
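get_concept_name compares concept_id to a single id with ==, which fits the case here because only one unit is present. If a measurement were recorded with several units, a lookup based on %in% would be one option. The sketch below shows this with a hypothetical helper, get_concept_names, and a toy concept table built from ids and names listed earlier in this episode; neither is part of the course functions.

```r
library(dplyr)

# Toy concept table: ids and names from the value columns table earlier.
concept_toy <- tibble(
  concept_id   = c(9529L, 4328749L, 4172704L),
  concept_name = c("kilogram", "High", ">")
)

# Hypothetical helper: look up the names of several concept ids at once.
# The concept table is passed in so the function can run on any table
# with concept_id and concept_name columns.
get_concept_names <- function(ids, concept_table) {
  concept_table |>
    filter(concept_id %in% !!ids) |>
    select(concept_id, concept_name) |>
    collect()
}

looked_up <- get_concept_names(c(9529L, 4172704L), concept_toy)
print(looked_up)
```

Returning concept_id alongside concept_name keeps it clear which name belongs to which id when more than one row comes back.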

What is the average value recorded for Heart rate across all persons?

R

average_heart_rate <- mean(heart_rate_measurements$value_as_number, na.rm = TRUE)
average_heart_rate

OUTPUT

[1] 95
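Note that a mean taken across all records gives more weight to people with more recordings. As a sketch of the alternative, the example below summarises per person first, using made-up heart rate values (not taken from the sample database) so that the difference is easy to see.

```r
library(dplyr)

# Made-up heart rate records (assumption, for illustration only).
heart_rate_toy <- tibble(
  person_id       = c(1L, 1L, 2L, 2L, 3L),
  value_as_number = c(60, 80, 90, NA, 100)
)

# Mean across all records, ignoring missing values.
overall_mean <- mean(heart_rate_toy$value_as_number, na.rm = TRUE)

# One row per person: number of non-missing records and their mean.
per_person <- heart_rate_toy |>
  group_by(person_id) |>
  summarise(
    n_records       = sum(!is.na(value_as_number)),
    mean_heart_rate = mean(value_as_number, na.rm = TRUE)
  )

# Mean of the per-person means: each person counts once.
mean_of_person_means <- mean(per_person$mean_heart_rate)

print(per_person)
print(overall_mean)
print(mean_of_person_means)
```

Which of the two summaries is appropriate depends on whether the question is about records or about people.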

What class of concept is Heart rate?

R

heart_rate_class <- omop$public$concept |>
  filter(concept_id == heart_rate_id) |>
  select(concept_class_id) |>
  collect() 
heart_rate_class

OUTPUT

# A tibble: 1 × 1
  concept_class_id
  <chr>
1 Clinical Observation

exercise on operator concepts

exercise on value_as_concept_id

Key Points

Know that measurements are mainly lab results and other records like pulse rate
Know that observations are other facts obtained through questioning or direct observation
Understand that concept ids identify the measurement or observation, and that values are stored in value_as_number or value_as_concept_id
Be able to join to the concept table to find a particular measurement or observation concept by name
Understand that different clinical questions can be answered by querying by patient and/or visit, or by summarising across all records
