Version: 9.4.0

FHIRPath

The library provides functions for converting FHIRPath expressions into Spark Columns. These columns can be used for filtering resources with boolean expressions or extracting values from complex FHIR data structures.

FHIRPath is a path-based navigation and extraction language designed specifically for FHIR data. It provides more expressive power than search parameters, enabling complex queries and data extraction operations.

Boolean filtering

FHIRPath expressions that evaluate to boolean values can be used with the filter operation to select resources that match specific criteria.

In this example, we filter patients by gender using a FHIRPath boolean expression.

from pathling import PathlingContext

pc = PathlingContext.create()
data_source = pc.read.ndjson("data/ndjson")
patients = data_source.read("Patient")

# Filter patients using a boolean FHIRPath expression.
gender_filter = pc.fhirpath_to_column("Patient", "gender = 'male'")
patients.filter(gender_filter).select("id", "gender").show()

Results in:

id                                    gender
8ee183e2-b3c0-4151-be94-b945d6aa8c6d  male
93ee0b14-4f22-4c1a-93e2-b4e5c0d7f0d6  male

Value extraction

FHIRPath expressions can extract values from FHIR resources. These expressions can be used with select or withColumn operations to add derived columns to your dataset.

The expression should evaluate to a single value per resource; otherwise the result is collection-valued rather than a single value. A function such as first() reduces a collection to one item.

from pathling import PathlingContext

pc = PathlingContext.create()
data_source = pc.read.ndjson("data/ndjson")
patients = data_source.read("Patient")

# Extract the first given name using FHIRPath.
first_name = pc.fhirpath_to_column("Patient", "name.given.first()")
patients.withColumn("given_name", first_name).select("id", "given_name").show()

Results in:

id                                    given_name
8ee183e2-b3c0-4151-be94-b945d6aa8c6d  Collin
7b4d8c2f-9a3e-4d5b-8c1f-2e3d4c5b6a7d  Patrica
93ee0b14-4f22-4c1a-93e2-b4e5c0d7f0d6  John
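
To see why first() matters here, the following plain-Python sketch imitates how a path such as name.given flattens nested repeating elements into one collection. It does not use the library and says nothing about how the library evaluates expressions internally; traverse and first are hypothetical helpers written for illustration, and the sample patient is made up.

```python
# Illustrative only: a plain-Python imitation of FHIRPath collection semantics.
# `traverse` and `first` are hypothetical helpers, not part of any library.

def traverse(collection, field):
    """Step into `field` for every item, flattening repeated elements."""
    out = []
    for item in collection:
        value = item.get(field)
        if value is None:
            continue  # a missing element contributes nothing
        if isinstance(value, list):
            out.extend(value)
        else:
            out.append(value)
    return out


def first(collection):
    """Return the first item, or None for an empty collection."""
    return collection[0] if collection else None


# Made-up sample resource with two names.
patient = {
    "resourceType": "Patient",
    "name": [
        {"use": "official", "given": ["John", "Andrew"]},
        {"use": "nickname", "given": ["Johnny"]},
    ],
}

# name.given: every given name across every name entry.
given = traverse(traverse([patient], "name"), "given")
print(given)         # ['John', 'Andrew', 'Johnny']

# name.given.first(): a single value, suitable for a scalar column.
print(first(given))  # John
```

Without first(), the path yields three values for this one patient, which is the collection-valued case the section advises against.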

Complex expressions

FHIRPath supports complex expressions including path traversal, filtering, and function calls. This enables sophisticated queries that would be difficult to express with search parameters alone.

from pathling import PathlingContext

pc = PathlingContext.create()
data_source = pc.read.ndjson("data/ndjson")
patients = data_source.read("Patient")

# Extract the family name from official names only.
official_family = pc.fhirpath_to_column(
    "Patient",
    "name.where(use = 'official').family.first()"
)
patients.withColumn("official_family", official_family).select(
    "id", "official_family"
).show()

Results in:

id                                    official_family
8ee183e2-b3c0-4151-be94-b945d6aa8c6d  Runte378
7b4d8c2f-9a3e-4d5b-8c1f-2e3d4c5b6a7d  Donnelly735
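
The expression above chains three steps: where() keeps only the name entries that satisfy a condition, .family steps into each remaining entry, and first() picks one value. A rough plain-Python equivalent, assuming made-up name data and a hypothetical helper called official_family_of, might look like this. It is a sketch of the semantics only, not of the library's implementation.

```python
# Illustrative only: plain-Python imitation of
#   name.where(use = 'official').family.first()
# `official_family_of` is a hypothetical helper; the data below is made up.

def official_family_of(patient):
    names = patient.get("name", [])
    # where(use = 'official'): keep entries whose `use` equals 'official'.
    official = [n for n in names if n.get("use") == "official"]
    # .family: collect the family name from each remaining entry.
    families = [n["family"] for n in official if "family" in n]
    # .first(): one value, or None when the collection is empty.
    return families[0] if families else None


with_official = {
    "resourceType": "Patient",
    "name": [
        {"use": "usual", "family": "Smith"},
        {"use": "official", "family": "Smythe"},
    ],
}
without_official = {
    "resourceType": "Patient",
    "name": [{"use": "usual", "family": "Jones"}],
}

print(official_family_of(with_official))     # Smythe
print(official_family_of(without_official))  # None
```

The second patient shows the empty case: when no entry passes the where() condition, there is nothing for first() to return.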

Combining with other operations

FHIRPath columns can be integrated into larger transformation pipelines, combining filtering and value extraction with other DataFrame operations.

from pathling import PathlingContext

pc = PathlingContext.create()
data_source = pc.read.ndjson("data/ndjson")
patients = data_source.read("Patient")

# Build a pipeline that filters and extracts data.
gender_filter = pc.fhirpath_to_column("Patient", "gender = 'male'")
birth_year = pc.fhirpath_to_column("Patient", "birthDate.toString().substring(0, 4)")

result = (
    patients
    .filter(gender_filter)
    .withColumn("birth_year", birth_year)
    .select("id", "gender", "birth_year")
    .orderBy("birth_year")
)
result.show()

Results in:

id                                    gender  birth_year
8ee183e2-b3c0-4151-be94-b945d6aa8c6d  male    1967
93ee0b14-4f22-4c1a-93e2-b4e5c0d7f0d6  male    1995
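
The birth_year expression relies on the arguments of the FHIRPath substring function being a zero-based start position and a length, not a start and an end index as in a Python slice. The sketch below mirrors that reading in plain Python. fhirpath_substring is a hypothetical helper, the sample date is made up, and the edge-case handling is a simplified interpretation of the FHIRPath specification rather than a statement about how the library behaves.

```python
# Illustrative only: plain-Python imitation of FHIRPath substring(start, length).
# `fhirpath_substring` is a hypothetical helper, not part of any library.

def fhirpath_substring(text, start, length=None):
    """Return `length` characters from zero-based `start`, or None if out of range."""
    if start < 0 or start >= len(text):
        return None  # stands in for an empty result
    if length is None:
        return text[start:]  # no length: take the rest of the string
    if length <= 0:
        return ""  # simplified handling, assumed from the specification
    return text[start:start + length]


birth_date = "1967-03-12"  # made-up sample value

print(fhirpath_substring(birth_date, 0, 4))  # 1967
print(fhirpath_substring(birth_date, 5, 2))  # 03
print(fhirpath_substring(birth_date, 5))     # 03-12
```

Note that substring(5, 2) returns two characters starting at position 5, whereas the Python slice birth_date[5:2] would be empty.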

Single resource evaluation

FHIRPath expressions can also be evaluated against a single FHIR resource provided as a JSON string, without requiring the resource to be part of a DataFrame. This is useful for ad-hoc evaluation, testing expressions, or working with individual resources.

The method returns a list of typed results along with the inferred return type of the expression.

from pathling import PathlingContext

pc = PathlingContext.create()

patient_json = '{"resourceType": "Patient", "id": "example", "gender": "male", "name": [{"family": "Smith", "given": ["John"]}]}'

# Evaluate a FHIRPath expression against a single resource.
result = pc.evaluate_fhirpath("Patient", patient_json, "name.family")
for value in result["results"]:
    print(f"{value['type']}: {value['value']}")
print(f"Return type: {result['expectedReturnType']}")

Results in:

string: Smith
Return type: string
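
When the resource starts life as a Python dictionary, serialising it with json.dumps avoids quoting mistakes in a hand-written JSON string. The sketch below does that and then unpacks a result of the shape shown above. sample_result is a hand-written stand-in that copies only the keys used in the example (results, type, value, expectedReturnType); it is not real library output, and values_of is a hypothetical helper.

```python
import json

# Build the JSON string from a dictionary instead of writing it by hand.
patient = {
    "resourceType": "Patient",
    "id": "example",
    "gender": "male",
    "name": [{"family": "Smith", "given": ["John"]}],
}
patient_json = json.dumps(patient)

# Hand-written stand-in mirroring the result shape used in the example above.
# Assumption: a real result has at least these keys; it may contain more.
sample_result = {
    "results": [{"type": "string", "value": "Smith"}],
    "expectedReturnType": "string",
}


def values_of(result):
    """Collect just the values from a result dictionary (hypothetical helper)."""
    return [entry["value"] for entry in result["results"]]


print(values_of(sample_result))             # ['Smith']
print(sample_result["expectedReturnType"])  # string
```

The patient_json string produced here can be passed wherever the hand-written string was used in the example above.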