Chapter 3 OMOP measurement and observation tables
Last updated on 2025-09-10
Overview
Questions
- What do the measurement and observation tables contain?
- How do we access values?
Objectives
- Understand that these tables record what was measured or observed as a concept_id, together with the resulting value
- Know that values can be numeric with units
- and/or categorical, given as a standard concept_id that can be looked up in the concept table
Introduction
The OMOP measurement and observation tables contain information collected about a person. The difference between them is that measurement contains numerical or categorical values collected by a standardised process, whereas observation contains less standardised clinical facts. Measurements are predominantly lab tests, with a few exceptions such as blood pressure or function tests.
A person_id column means that there can be multiple records per person.
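As a minimal sketch of what multiple records per person looks like in practice, the example below counts measurement records by person_id. The data frame is a made-up toy table, and its concept_ids (1001, 1002) are hypothetical placeholders rather than real OMOP concepts.

```r
library(dplyr)

# Toy measurement rows (made-up values, not real patient data);
# the concept_ids 1001 and 1002 are hypothetical placeholders
measurement <- data.frame(
  measurement_id = 1:5,
  person_id = c(1L, 1L, 1L, 2L, 3L),
  measurement_concept_id = c(1001L, 1001L, 1002L, 1001L, 1002L)
)

# Count how many measurement records each person has
records_per_person <- measurement |>
  count(person_id, name = "n_records") |>
  arrange(desc(n_records))

records_per_person
```

The same pattern should work on the observation table by swapping in its column names.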
Columns are similar between measurement and observation.
Concepts and values
Data are stored as questions and answers. A question (e.g. Pulse rate) is defined by a concept_id and the answer is stored in a value column.
The measurement_concept_id or observation_concept_id column defines what has been recorded. Here are some examples:
Example Measurement concepts | Example Observation concepts |
---|---|
Respiratory rate | Respiratory function |
Pulse rate | Wound dressing observable |
Hemoglobin saturation with oxygen | Mandatory breath rate |
Body temperature | Body position for blood pressure measurement |
Diastolic blood pressure | Alcohol intake - finding |
Arterial oxygen saturation | Tobacco smoking behavior - finding |
Body weight | Vomit appearance |
Leukocytes [#/volume] in Blood | State of consciousness and awareness |
The following columns store the values:
column name | data type | example | concept_name |
---|---|---|---|
value_as_number | numeric value | 1.2 | - |
unit_concept_id | units of the numeric value | 9529 | kilogram |
value_as_concept_id | categorical value | 4328749 | High |
operator_concept_id | optional operators | 4172704 | > |
Where values are a concept_id, the name of that concept can be looked up in the concept table that is part of the OMOP vocabularies and included in most CDM instances. We show this below.
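To make the value columns concrete, here is a small sketch that looks up names for unit_concept_id, operator_concept_id and value_as_concept_id. It assumes a tiny stand-in concept table holding only the three concept_ids from the table above; the measurement rows and the output column names (unit_name, operator_name, value_as_concept_name) are made up for illustration.

```r
library(dplyr)

# Tiny stand-in for the concept table, using the concept_ids from the table above
concept <- data.frame(
  concept_id = c(9529L, 4328749L, 4172704L),
  concept_name = c("kilogram", "High", ">")
)

# Toy measurement rows (made-up values); NA means the column was not filled in
measurement <- data.frame(
  measurement_id = 1:3,
  value_as_number = c(1.2, 5, NA),
  unit_concept_id = c(9529L, 9529L, NA),
  operator_concept_id = c(NA, 4172704L, NA),
  value_as_concept_id = c(NA, NA, 4328749L)
)

# Only the columns needed for a name lookup
concept_names <- select(concept, concept_id, concept_name)

# Join the concept table once per concept_id column, renaming after each join
measurement_named <- measurement |>
  left_join(concept_names, by = join_by(unit_concept_id == concept_id)) |>
  rename(unit_name = concept_name) |>
  left_join(concept_names, by = join_by(operator_concept_id == concept_id)) |>
  rename(operator_name = concept_name) |>
  left_join(concept_names, by = join_by(value_as_concept_id == concept_id)) |>
  rename(value_as_concept_name = concept_name)

measurement_named
```

In this toy data the second row would read as "> 5 kilogram", while the third row has no number and carries only the categorical value High.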
Looking at numeric measurement values
R
# install the package once; it provides a mock CDM to practise on
install.packages("MeasurementDiagnostics")
library(dplyr)
cdm <- MeasurementDiagnostics::mockMeasurementDiagnostics()
# first we can see that some concepts have a value_as_number
freq_numeric <- cdm$measurement |>
  filter(!is.na(value_as_number)) |>
  count(measurement_concept_id, unit_concept_id) |>
  # join concept names
  left_join(
    select(cdm$concept, concept_id, concept_name),
    by = join_by(measurement_concept_id == concept_id)
  ) |>
  rename(measurement_concept_name = concept_name) |>
  left_join(
    select(cdm$concept, concept_id, concept_name),
    by = join_by(unit_concept_id == concept_id)
  ) |>
  rename(unit_concept_name = concept_name) |>
  collect()
freq_numeric |> head(3)
Output of numeric values
TODO: these data from MeasurementDiagnostics aren’t great but they are a start
- Good: it is a complete CDM with the concept table for joining names.
- Bad: there are few values and the units are unrealistic.
OUTPUT
measurement_concept_id unit_concept_id n measurement_concept_name unit_concept_name
<int> <dbl> <dbl> <chr> <chr>
1 3002069 9529 50 Alkaline phosphatase.bone/Alkaline phosphatase.… kilogram
2 3012056 9529 50 Uroporphyrin 3 isomer [Moles/volume] in Urine kilogram
3 3026074 9529 50 Uroporphyrin 3 isomer [Moles/volume] in Stool kilogram
Challenge 1: Can you adapt the code above to output the frequency of categorical values?
R
# other concepts have a value_as_concept_id
freq_categorical <- cdm$measurement |>
  filter(!is.na(value_as_concept_id)) |>
  count(measurement_concept_id, value_as_concept_id) |>
  # join concept names
  left_join(
    select(cdm$concept, concept_id, concept_name),
    by = join_by(measurement_concept_id == concept_id)
  ) |>
  rename(measurement_concept_name = concept_name) |>
  left_join(
    select(cdm$concept, concept_id, concept_name),
    by = join_by(value_as_concept_id == concept_id)
  ) |>
  rename(value_as_concept_name = concept_name) |>
  collect()
freq_categorical |> head(3)
OUTPUT
measurement_concept_id value_as_concept_id n measurement_concept_name value_as_concept_name
<int> <dbl> <dbl> <chr> <chr>
1 3001467 4328749 33 Alkaline phosphatase.bone [Enzymatic ac… High
2 3002069 4267416 33 Alkaline phosphatase.bone/Alkaline phos… Low
3 3011539 4267416 33 Uroporphyrin 3 isomer [Moles/volume] in… Low
When a measurement or observation was done
There are date and time columns indicating when a measurement or observation was performed.
A visit_occurrence_id specifies the visit that it occurred in.
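A short sketch of how the date and visit columns might be used is given below. It assumes the measurement_date column of the OMOP measurement table (observation has an equivalent observation_date) and a made-up toy table; the ids and dates are invented for illustration.

```r
library(dplyr)

# Toy measurement rows (made-up ids and dates)
measurement <- data.frame(
  measurement_id = 1:4,
  person_id = c(1L, 1L, 1L, 2L),
  visit_occurrence_id = c(10L, 10L, 11L, 20L),
  measurement_date = as.Date(c("2024-01-05", "2024-01-05", "2024-03-20", "2024-02-11"))
)

# Keep measurements taken on or after a chosen date
recent <- measurement |>
  filter(measurement_date >= as.Date("2024-02-01"))

# Count measurements and find the earliest measurement date within each visit
per_visit <- measurement |>
  group_by(person_id, visit_occurrence_id) |>
  summarise(
    n_measurements = n(),
    first_date = min(measurement_date),
    .groups = "drop"
  )

per_visit
```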
TODO: do we want to add more here with example data?
Source data columns
Some columns store data from the source database that have not been standardised to the same level. It is usually not a good idea to use these in analyses, because code that relies on source values is unlikely to work on data from a different source. The source columns can, however, be used to check how the standardised data were arrived at.
Source columns in the measurement table:
measurement_source_value
measurement_source_concept_id
unit_source_value
value_source_value
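One way to check how standardised data were arrived at is to tabulate source values against the standard concept they were mapped to. The sketch below uses a made-up toy table; the concept_ids 1001 and 1002 and all source values are hypothetical. It also assumes the OMOP convention that a concept_id of 0 marks a record with no matching standard concept.

```r
library(dplyr)

# Toy measurement rows; concept_ids 1001 and 1002 and the source values are hypothetical
measurement <- data.frame(
  measurement_id = 1:4,
  measurement_concept_id = c(1001L, 1001L, 1002L, 0L),
  measurement_source_value = c("HR", "Heart rate", "Weight", "local_code_x"),
  unit_source_value = c("bpm", "/min", "kg", NA)
)

# Tabulate which source values were mapped to each standard concept
mapping_check <- measurement |>
  count(measurement_concept_id, measurement_source_value, unit_source_value)

# Source values that were not mapped (assumes concept_id 0 means "no matching concept")
unmapped <- measurement |>
  filter(measurement_concept_id == 0) |>
  distinct(measurement_source_value)

mapping_check
unmapped
```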
Advanced: Linking measurement and observation tables to other tables, e.g. specimen
There are measurement_event_id and observation_event_id columns that can link a record to the primary key in another table, e.g. specimen_id. If these are used, then meas_event_field_concept_id or obs_event_field_concept_id needs to contain the concept_id corresponding to the linked table (in this case specimen).
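The linkage described above could be followed with a join along these lines. This is only a sketch on made-up toy tables: the value 999999 is a hypothetical placeholder for the concept_id that identifies the specimen link, and the real value would need to be looked up in the concept table of your CDM.

```r
library(dplyr)

# Hypothetical placeholder; look up the real concept_id for the specimen link in your CDM
specimen_field_concept_id <- 999999L

# Toy specimen rows (made-up ids and dates)
specimen <- data.frame(
  specimen_id = c(501L, 502L),
  specimen_date = as.Date(c("2024-01-05", "2024-01-09"))
)

# Toy measurement rows; the third record is not linked to any other table
measurement <- data.frame(
  measurement_id = 1:3,
  measurement_event_id = c(501L, 502L, NA),
  meas_event_field_concept_id = c(specimen_field_concept_id, specimen_field_concept_id, NA)
)

# Keep only records flagged as linked to specimen, then join on the primary key
linked <- measurement |>
  filter(meas_event_field_concept_id == specimen_field_concept_id) |>
  inner_join(specimen, by = join_by(measurement_event_id == specimen_id))

linked
```

Filtering on the field concept before joining matters because measurement_event_id can point at different tables, so the id alone does not say which table it belongs to.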
For microbiology results, observation_concept_id can store the genus and species of an organism, and measurement_concept_id can store its growth.
TODO: question on the above: do they need to be linked by fact_relationship to deal with multiple organisms from the same specimen?
TODO: do we want to add more here with example data?
Key Points
- The measurement table contains numeric or categorical results of a standardised process
- The observation table contains less standardised clinical facts