Chapter 3 OMOP measurement and observation tables

Last updated on 2025-09-10

Overview

Questions

What do the measurement and observation tables contain?
How do we access values?

Objectives

Understand that these tables contain concept_ids and the values recorded against them
Know that values can be numeric with units, and/or categorical, specified by a standard concept_id that can be looked up in the concept table

Introduction

The OMOP measurement and observation tables contain information collected about a person. The difference between them is that measurement contains numerical or categorical values collected by a standardised process, whereas observation contains less standardised clinical facts. Measurements are predominantly lab tests, with a few exceptions such as blood pressure or function tests.

Each record has a person_id identifying the person it belongs to, and a person can have multiple records.
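As a minimal sketch of what multiple records per person looks like, the toy data frame below stands in for a measurement table. The rows are made up for illustration, and an in-memory data frame is assumed rather than a database-backed CDM.

```r
library(dplyr)

# Made-up toy rows for illustration only; not real patient data
measurement <- data.frame(
  measurement_id = 1:5,
  person_id = c(1L, 1L, 1L, 2L, 3L),
  measurement_concept_id = c(3002069, 3012056, 3002069, 3002069, 3012056)
)

# One row per person_id, with the number of measurement records in n_records
records_per_person <- measurement |>
  count(person_id, name = "n_records")

records_per_person
```

In this toy data person 1 has three records and persons 2 and 3 have one each.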

Columns are similar between measurement and observation.

Concepts and values

Data are stored as questions and answers. A question (e.g. Pulse rate) is defined by a concept_id and the answer is stored in a value column.

The measurement_concept_id or observation_concept_id column defines what has been recorded. Here are some examples:

Example Measurement concepts	Example Observation concepts
Respiratory rate	Respiratory function
Pulse rate	Wound dressing observable
Hemoglobin saturation with oxygen	Mandatory breath rate
Body temperature	Body position for blood pressure measurement
Diastolic blood pressure	Alcohol intake - finding
Arterial oxygen saturation	Tobacco smoking behavior - finding
Body weight	Vomit appearance
Leukocytes [#/volume] in Blood	State of consciousness and awareness

The following columns store the values:

column name	data type	example	concept_name
value_as_number	numeric value	1.2	-
unit_concept_id	units of the numeric value	9529	kilogram
value_as_concept_id	categorical value	4328749	High
operator_concept_id	optional operators	4172704	>

Where values are a concept_id, the name of that concept can be looked up in the concept table that is part of the OMOP vocabularies and included in most CDM instances. We show this below.
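To illustrate how the value columns fit together, here is a minimal sketch that turns them into a readable string. It assumes small in-memory data frames: the toy concept table holds only the three concepts from the table above, the measurement rows are made up, and `concept_names()` is a hypothetical helper rather than part of any package. With a database-backed CDM you would more likely use joins.

```r
library(dplyr)

# Toy concept table holding only the three concepts from the table above
concept <- data.frame(
  concept_id = c(9529, 4328749, 4172704),
  concept_name = c("kilogram", "High", ">")
)

# Made-up rows: numeric with operator, numeric only, categorical only
measurement <- data.frame(
  measurement_id = 1:3,
  value_as_number = c(1.2, 70, NA),
  unit_concept_id = c(9529, 9529, NA),
  operator_concept_id = c(4172704, NA, NA),
  value_as_concept_id = c(NA, NA, 4328749)
)

# Hypothetical helper: look up concept names for a vector of concept_ids
concept_names <- function(ids) {
  concept$concept_name[match(ids, concept$concept_id)]
}

readable <- measurement |>
  mutate(
    unit = concept_names(unit_concept_id),
    operator = concept_names(operator_concept_id),
    value_name = concept_names(value_as_concept_id),
    # prefer the categorical name, otherwise build "operator number unit"
    value_text = case_when(
      !is.na(value_name) ~ value_name,
      !is.na(operator) ~ paste(operator, value_as_number, unit),
      TRUE ~ paste(value_as_number, unit)
    )
  ) |>
  select(measurement_id, value_text)

readable
```

The three toy rows come out as "> 1.2 kilogram", "70 kilogram" and "High".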

Looking at numeric measurement values

R


# install the package only if it is not already available
if (!requireNamespace("MeasurementDiagnostics", quietly = TRUE)) {
  install.packages("MeasurementDiagnostics")
}
library(dplyr)

# create a small mock CDM that includes the concept table
cdm <- MeasurementDiagnostics::mockMeasurementDiagnostics()

# first we can see that some concepts have a value_as_number
freq_numeric <- cdm$measurement |>
  filter(!is.na(value_as_number)) |>
  count(measurement_concept_id, unit_concept_id) |>
  # join the name of the measurement concept
  left_join(select(cdm$concept, concept_id, concept_name),
            by = join_by(measurement_concept_id == concept_id)) |>
  rename(measurement_concept_name = concept_name) |>
  # join the name of the unit concept
  left_join(select(cdm$concept, concept_id, concept_name),
            by = join_by(unit_concept_id == concept_id)) |>
  rename(unit_concept_name = concept_name) |>
  # bring the result from the database into R
  collect()

freq_numeric |> head(3)

Output of numeric values

TODO these data from MeasurementDiagnostics aren't ideal but they are a start.

Good: it is a complete CDM, including the concept table for joining names.
Bad: there are few values and the units are unrealistic.

OUTPUT


  measurement_concept_id unit_concept_id     n measurement_concept_name                         unit_concept_name
                   <int>           <dbl> <dbl> <chr>                                            <chr>
1                3002069            9529    50 Alkaline phosphatase.bone/Alkaline phosphatase.… kilogram
2                3012056            9529    50 Uroporphyrin 3 isomer [Moles/volume] in Urine    kilogram
3                3026074            9529    50 Uroporphyrin 3 isomer [Moles/volume] in Stool    kilogram

Challenge

Challenge 1: Can you adapt the code above to output the frequency of categorical values?

Code

R

# other concepts have a value_as_concept_id
freq_categorical <- cdm$measurement |>
  filter(!is.na(value_as_concept_id)) |>
  count(measurement_concept_id, value_as_concept_id) |>
  # join the name of the measurement concept
  left_join(select(cdm$concept, concept_id, concept_name),
            by = join_by(measurement_concept_id == concept_id)) |>
  rename(measurement_concept_name = concept_name) |>
  # join the name of the categorical value concept
  left_join(select(cdm$concept, concept_id, concept_name),
            by = join_by(value_as_concept_id == concept_id)) |>
  rename(value_as_concept_name = concept_name) |>
  collect()

freq_categorical |> head(3)

OUTPUT

  measurement_concept_id value_as_concept_id     n measurement_concept_name                 value_as_concept_name
                   <int>               <dbl> <dbl> <chr>                                    <chr>
1                3001467             4328749    33 Alkaline phosphatase.bone [Enzymatic ac… High
2                3002069             4267416    33 Alkaline phosphatase.bone/Alkaline phos… Low
3                3011539             4267416    33 Uroporphyrin 3 isomer [Moles/volume] in… Low

When a measurement or observation was done

There are date and time columns indicating when a measurement or observation was performed.

A visit_occurrence_id specifies the visit that it occurred in.
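A minimal sketch of using these columns, assuming a made-up in-memory data frame in place of a real measurement table. The column names measurement_date and visit_occurrence_id follow the OMOP CDM. The rows and the date window are invented for illustration.

```r
library(dplyr)

# Made-up toy rows for illustration only
measurement <- data.frame(
  measurement_id = 1:4,
  person_id = c(1L, 1L, 2L, 2L),
  visit_occurrence_id = c(10L, 10L, 20L, 21L),
  measurement_date = as.Date(c("2024-01-05", "2024-01-05",
                               "2024-02-01", "2024-03-15"))
)

# Keep measurements from January and February 2024,
# then count how many were recorded in each visit
per_visit <- measurement |>
  filter(measurement_date >= as.Date("2024-01-01"),
         measurement_date < as.Date("2024-03-01")) |>
  count(visit_occurrence_id, name = "n_measurements")

per_visit
```

Here visit 10 has two measurements in the window and visit 20 has one. The March measurement in visit 21 is filtered out.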

TODO do we want to add more here with example data

Source data columns

There are columns that can store data from the source database that haven't been standardised to the same level. It is usually not a good idea to use these in analyses, because code that relies on them is unlikely to work on a different dataset. The source columns can be used to check how the standardised data were arrived at.

Source columns in the measurement table:

measurement_source_value
measurement_source_concept_id
unit_source_value
value_source_value
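One way to check how the standardised data were arrived at is to see which source values were mapped to each standard concept. The sketch below assumes an in-memory data frame. The source codes and units are invented for illustration and do not come from any real source system.

```r
library(dplyr)

# Made-up toy rows; the source values are invented local codes
measurement <- data.frame(
  measurement_concept_id = c(3002069, 3002069, 3012056),
  measurement_source_value = c("ALP-B", "alk phos bone", "UROP3"),
  unit_source_value = c("U/L", "u/l", "nmol/L")
)

# For each standard concept, count how many distinct source values map to it
mapping_check <- measurement |>
  group_by(measurement_concept_id) |>
  summarise(n_source_values = n_distinct(measurement_source_value))

mapping_check
```

A standard concept with many distinct source values is not necessarily a problem, but it can be a useful place to start when reviewing a mapping.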

Advanced: Linking measurement and observation tables to other tables e.g. specimen

There are measurement_event_id and observation_event_id columns that can link a record to the primary key in another table e.g. specimen_id. If these are used then meas_event_field_concept_id or obs_event_field_concept_id need to contain the concept_id corresponding to the linked table (in this case specimen).
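The link described above can be sketched as a filter followed by a join. This assumes in-memory toy data frames with made-up rows. `specimen_field_concept_id` is a placeholder value, not a real concept_id: look up the concept that denotes the specimen table in your own vocabulary.

```r
library(dplyr)

# Placeholder only: replace with the concept_id that your vocabulary
# uses to denote the specimen table
specimen_field_concept_id <- 999999

# Made-up toy rows for illustration only
specimen <- data.frame(
  specimen_id = c(501L, 502L),
  specimen_date = as.Date(c("2024-01-05", "2024-01-06"))
)

measurement <- data.frame(
  measurement_id = 1:3,
  measurement_event_id = c(501L, 502L, NA),
  meas_event_field_concept_id = c(999999, 999999, NA)
)

# Keep measurements whose event points at the specimen table,
# then join on the specimen primary key
linked <- measurement |>
  filter(meas_event_field_concept_id == specimen_field_concept_id) |>
  inner_join(specimen, by = join_by(measurement_event_id == specimen_id))

linked
```

Filtering on meas_event_field_concept_id first matters because measurement_event_id can point at different tables, so the same number could be a key in more than one of them.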

For microbiology results, observation_concept_id can store the genus and species of an organism, and measurement_concept_id can store its growth.

TODO question: in the above, do they need to be linked by fact_relationship to deal with multiple organisms from the same specimen?

TODO do we want to add more here with example data?

Key Points

The measurement table contains numeric or categorical results of a standardised process
The observation table contains less standardised clinical facts