Measurements and Observations

Last updated on 2026-02-08

Estimated time: 0 minutes

Overview

Questions

  • How do we access measurements and observations?

Objectives

  • Know that measurements are mainly lab results and other records like pulse rate

  • Know observations are other facts obtained through questioning or direct observation

  • Understand that concept ids identify the measurement or observation, while values are stored in value_as_number or value_as_concept_id

  • Be able to join to the concept table to find a particular measurement or observation concept by name

Introduction


This episode covers the OMOP measurement and observation tables.

Callout

For this episode we will be using a sample OMOP CDM database that is pre-loaded with data. This database is a simplified version of a real-world OMOP CDM database and is intended for educational purposes only.

(UCLH only) This will come in the same form as you would get data if you asked for a data extract via the SAFEHR platform (i.e. a set of parquet files).

As part of the setup prior to this course you were asked to download and install the sample database. If you have not done this yet, please refer to the setup instructions provided earlier in the course. For now, we will assume that you have the sample OMOP CDM database available on your local machine at the following path: workshop/data/public/ and the functions in a folder workshop/code.

You will then need to load the database as shown in the previous episode.

R

open_omop_dataset <- function(dir) {
  open_omop_schema <- function(path) {
    # iterate table level folders
    list.dirs(path, recursive = FALSE) |>
      # extract folder name from path
      # and use it as index for named list
      purrr::set_names(~ basename(.)) |>
      # "lazy-open" list of parquet files
      # from specified folder
      purrr::map(arrow::open_dataset)
  }
  # iterate top-level folders
  list.dirs(dir, recursive = FALSE) |>
    # extract folder name from path
    # and use it as index for named list
    purrr::set_names(~ basename(.)) |>
    purrr::map(open_omop_schema)
}

R

omop <- open_omop_dataset("./data/")

and the useful functions we created in the previous episode to look up concept names/ids.

R

library(arrow)
library(dplyr)
get_concept_name <- function(id) {
  omop$public$concept |>
    filter(concept_id == !!id) |>
    select(concept_name) |>
    collect()
}

R

get_concept_id <- function(name) {
  omop$public$concept |>
    filter(concept_name == !!name) |>
    select(concept_id) |>
    collect()
}
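
The two helpers above match a single id or name with ==. When several ids need looking up at once (for instance a vector of unit concept ids), a variant based on %in% may be more convenient. The following is only a sketch: toy_concept is a small made-up stand-in for omop$public$concept (reusing ids and names that appear later in this episode), and get_concept_names() is a hypothetical helper, not part of the course code.

```r
library(dplyr)

# Toy stand-in for omop$public$concept (assumption: not the real table)
toy_concept <- data.frame(
  concept_id   = c(3027018, 9529, 4328749, 4172704),
  concept_name = c("Heart rate", "kilogram", "High", ">")
)

# Hypothetical helper: look up names for one or more concept ids.
# Returning concept_id as well keeps each name tied to its id.
get_concept_names <- function(ids, concept = toy_concept) {
  concept |>
    filter(concept_id %in% !!ids) |>
    select(concept_id, concept_name)
}

lookup <- get_concept_names(c(9529, 4328749))
lookup
```

With the real data, the concept argument would presumably be omop$public$concept followed by collect(), as in the helpers above.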

The OMOP measurement and observation tables contain information collected about a person.

The difference between them is that measurement contains numerical or categorical values collected by a standardised process, whereas observation contains less standardised clinical facts. Measurements are often lab results, vital signs or other clinical measurements such as height, weight, blood pressure, pulse rate, respiratory rate, oxygen saturations etc. Observations are other facts obtained through questioning or direct observation, for example smoking status, alcohol intake, family history, symptoms reported by the patient etc.

Each record is linked to a person through the person_id column, so there can be multiple records per person.

Columns are similar between measurement and observation.

Concepts and values


Data are stored as questions and answers. A question (e.g. Pulse rate) is defined by a concept_id and the answer is stored in a value column.

The measurement_concept_id or observation_concept_id columns define what has been recorded. Here are some examples:

Example Measurement concepts         Example Observation concepts
Respiratory rate                     Respiratory function
Pulse rate                           Wound dressing observable
Hemoglobin saturation with oxygen    Mandatory breath rate
Body temperature                     Body position for blood pressure measurement
Diastolic blood pressure             Alcohol intake - finding
Arterial oxygen saturation           Tobacco smoking behavior - finding
Body weight                          Vomit appearance
Leukocytes [#/volume] in Blood       State of consciousness and awareness

Challenge

Looking at the measurement and observation tables, identify the columns that might store a value and associated information (e.g. units).

The values are stored in the following columns:

column name           description                  example    concept_name
value_as_number       numeric value                1.2        -
unit_concept_id       units of the numeric value   9529       kilogram
value_as_concept_id   categorical value            4328749    High
operator_concept_id   optional operators           4172704    >

Note that where a value is a concept_id, the name of that concept can be looked up in the concept table, which is part of the OMOP vocabularies and is included in most CDM instances.
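
To make the role of the value columns concrete, the sketch below combines operator, number, unit and categorical value into one readable string per record. It is an illustration under stated assumptions: the rows in toy_measurement are made up, toy_concept holds only the example concepts from the table above, and add_name() is a hypothetical helper.

```r
library(dplyr)

# Toy stand-ins (assumption: illustrative rows, not real data)
toy_concept <- data.frame(
  concept_id   = c(9529, 4328749, 4172704),
  concept_name = c("kilogram", "High", ">")
)
toy_measurement <- data.frame(
  measurement_id      = 1:3,
  operator_concept_id = c(NA, 4172704, NA),
  value_as_number     = c(1.2, 1.2, NA),
  unit_concept_id     = c(9529, 9529, NA),
  value_as_concept_id = c(NA, NA, 4328749)
)

# Hypothetical helper: join the concept name for the id column `id_col`
# and store it in a new column called `name_col`
add_name <- function(data, id_col, name_col) {
  lookup <- setNames(toy_concept, c(id_col, name_col))
  left_join(data, lookup, by = id_col)
}

readable <- toy_measurement |>
  add_name("operator_concept_id", "operator_name") |>
  add_name("unit_concept_id", "unit_name") |>
  add_name("value_as_concept_id", "value_name") |>
  mutate(
    # Numeric records: optional operator, number, unit.
    # Records without a number fall back to the categorical value name.
    value_text = if_else(
      is.na(value_as_number),
      value_name,
      trimws(paste(coalesce(operator_name, ""), value_as_number, unit_name))
    )
  ) |>
  select(measurement_id, value_text)

readable
```

In this toy data the three records read as "1.2 kilogram", "> 1.2 kilogram" and "High".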

Look at the columns we have in the measurement and observation tables of our database.

R

omop$public$measurement |> colnames() |> print()

OUTPUT

 [1] "measurement_id"         "person_id"              "measurement_concept_id"
 [4] "measurement_date"       "measurement_datetime"   "operator_concept_id"
 [7] "value_as_number"        "value_as_concept_id"    "unit_concept_id"
[10] "range_low"              "range_high"             "visit_occurrence_id"   

R

omop$public$observation |> colnames() |> print()

OUTPUT

[1] "observation_id"         "person_id"              "observation_concept_id"
[4] "observation_date"       "observation_datetime"   "value_as_number"
[7] "value_as_string"        "value_as_concept_id"    "visit_occurrence_id"   

You can see that for observations the main value is a string or a concept, whereas for a measurement the main value is a number accompanied by the concept id of a unit.
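
One way to check which value columns are actually used is to count the non-missing entries in each. The sketch below does this on a made-up toy_observation data frame (an assumption, not the course data); on the real table you would probably collect() first, since it is not certain that arrow supports every dplyr helper used here.

```r
library(dplyr)

# Toy stand-in for a collected observation table (assumption: made-up rows)
toy_observation <- data.frame(
  observation_id      = 1:4,
  value_as_number     = c(NA, 12.5, NA, NA),
  value_as_string     = c("Never smoked", NA, NA, "Lying"),
  value_as_concept_id = c(NA, NA, 4328749, NA)
)

# Count non-missing entries in every column whose name starts with value_as_
value_counts <- toy_observation |>
  summarise(across(starts_with("value_as_"), ~ sum(!is.na(.x))))

value_counts
```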

Looking at observation values


Let’s focus on observations.

We could go through each table and use our get_concept_name function to work out what all these measurements and observations are, but that would quickly get tedious!

Instead, let’s join to the concept table and produce a table that gives us the human-readable names from the start.

Challenge

By joining to the concept table, produce a version of the observation table with concept names. Only include columns that are relevant to the value.

R

library(dplyr)
# Pre-load concept names and ids
concepts <- omop$public$concept |>
  select(concept_id, concept_name) |>
  collect()

# Create a mini observation table with only the columns relevant to value
mini_observation <- omop$public$observation |>
  select(observation_id, person_id, observation_concept_id, value_as_concept_id, value_as_number) |>
  collect()

# Join to get names of the observation concept id
# Rename the new column to observation_concept_name
# Relocate the new column to be after observation_concept_id
mini_observation <- mini_observation |>
  left_join(concepts, by=join_by(observation_concept_id == concept_id)) |>
  rename(observation_concept_name = concept_name) |>
  relocate(observation_concept_name, .after = observation_concept_id)

# Repeat the join to get names of the value concept id
mini_observation <- mini_observation |>
  left_join(concepts, by = join_by(value_as_concept_id == concept_id)) |>
  rename(value_as_concept_name = concept_name) |>
  relocate(value_as_concept_name, .after = value_as_concept_id)

Now we can look at this named table.

R

View(mini_observation)

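Note that View() needs an interactive session (for example RStudio). In a non-interactive session it fails with the error "unable to start data viewer"; head(mini_observation) can be used there instead.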
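
Once the names are attached, a quick summary by concept name shows which observations are most common. The sketch below assumes a data frame shaped like mini_observation; toy_mini_observation and its rows are made up for illustration, using concept names from the examples earlier in this episode.

```r
library(dplyr)

# Toy stand-in for mini_observation (assumption: made-up rows)
toy_mini_observation <- data.frame(
  observation_id           = 1:5,
  person_id                = c(1, 1, 2, 3, 3),
  observation_concept_name = c(
    "Alcohol intake - finding", "Vomit appearance",
    "Alcohol intake - finding", "Alcohol intake - finding",
    "Respiratory function"
  )
)

# Number of records and of distinct persons per observation concept,
# most frequent concept first
observation_summary <- toy_mini_observation |>
  group_by(observation_concept_name) |>
  summarise(n_records = n(), n_persons = n_distinct(person_id)) |>
  arrange(desc(n_records))

observation_summary
```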

Social indexes

Could be skipped if short of time

It is interesting to note that some observations relate to social indices such as deprivation indices. As the "(England)" suffix in their concept names indicates, these indices apply to England only.

Challenge

Create a mini version of the concepts table that contains only the concepts relating to social indices.

R

social_concepts <- omop$public$concept |>
  filter(concept_id %in% c(35812882, 35812883, 35812884, 35812885, 35812888)) |>
  collect()

social_concepts

OUTPUT

# A tibble: 5 × 6
  concept_id concept_name               domain_id vocabulary_id standard_concept
       <int> <chr>                      <chr>     <chr>         <chr>
1   35812882 Index of Multiple Depriva… Observat… UK Biobank    ""
2   35812883 Income score (England)     Observat… UK Biobank    ""
3   35812884 Employment score (England) Observat… UK Biobank    ""
4   35812885 Health score (England)     Observat… UK Biobank    ""
5   35812888 Crime score (England)      Observat… UK Biobank    ""
# ℹ 1 more variable: concept_class_id <chr>

This is an instance of non-standard concepts being used within OMOP: the standard_concept column is empty for all of them. It shows how local data can be mapped to OMOP concepts even if they are not part of the standard vocabulary.
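
Rather than hard-coding ids, concepts like these can also be found by name and then checked against the standard_concept column. The sketch below is only illustrative: toy_concept is a made-up stand-in for the concept table, and the vocabulary_id and standard_concept values given for the Heart rate row are assumptions added for contrast.

```r
library(dplyr)

# Toy stand-in for the concept table (assumption: not the real table;
# the Heart rate row's vocabulary and standard flag are assumed values)
toy_concept <- data.frame(
  concept_id       = c(35812883, 35812884, 35812888, 3027018),
  concept_name     = c("Income score (England)", "Employment score (England)",
                       "Crime score (England)", "Heart rate"),
  vocabulary_id    = c("UK Biobank", "UK Biobank", "UK Biobank", "LOINC"),
  standard_concept = c("", "", "", "S")
)

# Find the England scores by name (fixed = TRUE treats the brackets literally)
# and flag whether each concept is marked as standard ("S")
england_scores <- toy_concept |>
  filter(grepl("score (England)", concept_name, fixed = TRUE)) |>
  mutate(is_standard = !is.na(standard_concept) & standard_concept == "S")

england_scores
```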

Looking at measurement values


Let’s now look at measurements. As we said before, measurements are often numerical values with associated units. This can arise from lab results or vital signs.

Challenge

Consider the concept with the name Heart rate. Use the measurement and concept tables to answer the following questions:

  1. What are the units associated with this measurement concept?

  2. What is the average value recorded for this measurement across all persons?

  3. What class of concept is this measurement concept?

  1. What are the units associated with Heart rate?

R

# Get the concept id for Heart rate  
heart_rate_id <- get_concept_id("Heart rate")$concept_id
heart_rate_id

OUTPUT

[1] 3027018

R

# Filter measurement table for this concept id
heart_rate_measurements <- omop$public$measurement |>
  filter(measurement_concept_id == heart_rate_id) |>
  collect()
# Get the unique unit concept ids
unique_units <- unique(heart_rate_measurements$unit_concept_id)
get_concept_name(unique_units)

OUTPUT

# A tibble: 1 × 1
  concept_name
  <chr>
1 per minute

  2. What is the average value recorded for Heart rate across all persons?

R

average_heart_rate <- mean(heart_rate_measurements$value_as_number, na.rm = TRUE)
average_heart_rate

OUTPUT

[1] 95

  3. Get the class of concept for Heart rate

R

heart_rate_class <- omop$public$concept |>
  filter(concept_id == heart_rate_id) |>
  select(concept_class_id) |>
  collect() 
heart_rate_class

OUTPUT

# A tibble: 1 × 1
  concept_class_id
  <chr>
1 Clinical Observation
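
The unit lookup and the average can also be combined in one grouped summary, which guards against averaging values recorded in different units, and the same records can be summarised per person instead of across everyone. The sketch below uses made-up data: toy_measurement is an assumption, and 1001 is a placeholder unit id, not a real OMOP concept id.

```r
library(dplyr)

# Toy stand-ins (assumption: made-up rows; 1001 is a placeholder unit id)
toy_concept <- data.frame(
  concept_id   = c(3027018, 1001),
  concept_name = c("Heart rate", "per minute")
)
toy_measurement <- data.frame(
  person_id              = c(1, 1, 2, 3),
  measurement_concept_id = c(3027018, 3027018, 3027018, 3027018),
  value_as_number        = c(80, 100, 90, NA),
  unit_concept_id        = c(1001, 1001, 1001, 1001)
)

heart_rate <- toy_measurement |>
  filter(measurement_concept_id == 3027018)

# One row per unit: record count and mean, ignoring missing values
heart_rate_by_unit <- heart_rate |>
  left_join(toy_concept, by = join_by(unit_concept_id == concept_id)) |>
  group_by(unit_name = concept_name) |>
  summarise(
    n_records  = n(),
    mean_value = mean(value_as_number, na.rm = TRUE)
  )

# One row per person: a person with only missing values gets NaN
heart_rate_by_person <- heart_rate |>
  group_by(person_id) |>
  summarise(mean_value = mean(value_as_number, na.rm = TRUE))

heart_rate_by_unit
heart_rate_by_person
```

If the by-unit summary returned more than one row, the overall mean would mix units and would need converting or splitting before use.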

exercise on operator concepts

exercise on value_as_concept_id

Key Points
  • Know that measurements are mainly lab results and other records like pulse rate
  • Know that observations are other facts obtained through questioning or direct observation
  • Understand that concept ids identify the measurement or observation, while values are stored in value_as_number or value_as_concept_id
  • Be able to join to the concept table to find a particular measurement or observation concept by name
  • Understand that different clinical questions can be answered by querying by patient and/or visit, or by summarising across all records