Skip to main content
Version: 9.4.0

Export

The export operation allows FHIR data to be exported from Pathling in bulk. This implementation follows the FHIR Bulk Data Access specification.

Export levels

Pathling supports export at multiple levels:

LevelEndpointDescription
SystemGET [base]/$exportExport all resources in the server
Patient typeGET [base]/Patient/$exportExport all resources in the Patient compartment
Patient instanceGET [base]/Patient/[id]/$exportExport resources for a specific patient
GroupGET [base]/Group/[id]/$exportExport resources for patients in a specific group

Parameters

NameCardinalityTypeDescription
_outputFormat0..1stringThe format for exported files. Accepts application/fhir+ndjson, application/ndjson, ndjson, application/vnd.apache.parquet, or parquet. Defaults to application/fhir+ndjson. See Output formats.
_since0..1instantOnly include resources where meta.lastUpdated is after this time.
_until0..1instantOnly include resources where meta.lastUpdated is before this time.
_type0..*stringComma-delimited list of resource types to export. If omitted, all supported types are exported. Invalid types cause an error unless the Prefer: handling=lenient header is included.
_typeFilter0..*stringFHIR search queries to filter exported resources by type. Each value has the format [ResourceType]?[search-params] (e.g., Patient?gender=male). Multiple filters for the same type are combined with OR logic. See Type filters.
_elements0..*stringComma-delimited list of elements to include. Specify as [type].[element] (e.g., Patient.name) or [element] for all types. Only top-level elements are supported. Mandatory elements are always included.

Type filters

The _typeFilter parameter allows you to apply FHIR search queries to filter resources during export. Each filter specifies a resource type and a search query string, separated by ?.

For example, to export only active medication requests and male patients:

GET [base]/$export?_type=MedicationRequest,Patient&_typeFilter=MedicationRequest?status=active&_typeFilter=Patient?gender=male HTTP/1.1
Accept: application/fhir+json
Prefer: respond-async

Multiple filters per type

When multiple _typeFilter values target the same resource type, they are combined using OR logic. For example, the following request exports patients who are either male or born after 2000:

GET [base]/$export?_type=Patient&_typeFilter=Patient?gender=male&_typeFilter=Patient?birthdate=gt2000-01-01 HTTP/1.1

Implicit type inclusion

If _typeFilter is provided without _type, the resource types referenced in the filters are automatically included in the export. For example, this request exports only Patient resources:

GET [base]/$export?_typeFilter=Patient?gender=male HTTP/1.1

Validation

  • Each _typeFilter value must contain a ? separating the resource type from the search query.
  • The resource type must be a valid FHIR resource type supported by the server.
  • If _type is also specified, the resource types in _typeFilter must be a subset of _type. In strict mode, a mismatch causes an error. With Prefer: handling=lenient, mismatched filters are silently ignored.
  • Search parameters must be valid for the specified resource type.

Output formats

Pathling supports two output formats for bulk export:

NDJSON

The default format is NDJSON (Newline Delimited JSON), which outputs one FHIR resource per line as JSON. This format is specified by the FHIR Bulk Data Access specification and is widely supported by FHIR tools.

Use application/fhir+ndjson, application/ndjson, or ndjson as the _outputFormat value.

Parquet

Pathling also supports export to Apache Parquet format, which provides efficient columnar storage suitable for analytics workloads. Parquet files are significantly smaller than NDJSON and can be queried directly using tools like Apache Spark, DuckDB, or pandas.

Use application/vnd.apache.parquet or parquet as the _outputFormat value.

The Parquet output follows the Pathling Parquet specification, which defines how FHIR resources are represented in the Parquet schema.

Asynchronous processing

The export operation uses the FHIR Asynchronous Request Pattern. Include a Prefer: respond-async header to initiate asynchronous processing.

Kick-off request

Parameters can be passed either as query parameters in a GET request or as a FHIR Parameters resource in the body of a POST request.

Using GET with query parameters:

GET [base]/$export?_type=Patient,Observation&_outputFormat=parquet HTTP/1.1
Accept: application/fhir+json
Prefer: respond-async

Using POST with a Parameters resource:

POST [base]/$export HTTP/1.1
Accept: application/fhir+json
Content-Type: application/fhir+json
Prefer: respond-async

{
"resourceType": "Parameters",
"parameter": [
{
"name": "_type",
"valueString": "Patient,Observation"
},
{
"name": "_outputFormat",
"valueString": "parquet"
}
]
}

Kick-off response

HTTP/1.1 202 Accepted
Content-Location: [base]/$exportstatus/[job-id]

Polling

Poll the URL from Content-Location until you receive a 200 OK response with the export manifest.

  • 202 Accepted — Export still in progress. Check X-Progress header for status.
  • 200 OK — Export complete. Response body contains the manifest.

Response manifest

When the export completes, the response contains a JSON manifest:

{
"transactionTime": "2025-01-15T10:30:00.000Z",
"request": "https://pathling.example.com/fhir/$export",
"requiresAccessToken": true,
"output": [
{
"type": "Patient",
"url": "https://pathling.example.com/fhir/$result?job=abc123&file=Patient.ndjson"
},
{
"type": "Observation",
"url": "https://pathling.example.com/fhir/$result?job=abc123&file=Observation.ndjson"
}
],
"deleted": [],
"error": []
}

Manifest fields

FieldDescription
transactionTimeThe time the export was initiated
requestThe original kick-off request URL
requiresAccessTokenWhether authentication is required to download result files
outputArray of exported files with resource type and download URL
deletedArray of deleted resources (not currently supported)
errorArray of OperationOutcome files for any errors

Retrieving results

Download exported files using the URLs in the manifest. Each URL points to the $result endpoint with job and file query parameters.

For NDJSON exports:

GET [base]/$result?job=abc123&file=Patient.ndjson HTTP/1.1
Accept: application/fhir+ndjson

For Parquet exports:

GET [base]/$result?job=abc123&file=Patient.parquet HTTP/1.1
Accept: application/vnd.apache.parquet

The file extension in the manifest URLs reflects the requested output format.

Python example

The following Python script demonstrates the complete export workflow.

Run the script using uv:

uv run export_client.py

Export client

#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# dependencies = ["requests"]
# ///
"""Demonstrates the $export operation with async polling."""

import time
import requests

BASE_URL = "https://pathling.example.com/fhir"
OUTPUT_DIR = "./export"


def kick_off_export(resource_types=None):
"""Initiate a bulk export."""
url = f"{BASE_URL}/$export"
params = {}
if resource_types:
params["_type"] = ",".join(resource_types)

headers = {
"Accept": "application/fhir+json",
"Prefer": "respond-async"
}

response = requests.get(url, params=params, headers=headers)

if response.status_code == 202:
status_url = response.headers.get("Content-Location")
print(f"Export started, polling: {status_url}")
return status_url
else:
response.raise_for_status()


def poll_status(status_url, timeout=3600):
"""Poll the status endpoint until export completes."""
start = time.time()
interval = 2.0

while time.time() - start < timeout:
response = requests.get(
status_url,
headers={"Accept": "application/fhir+json"}
)

if response.status_code == 200:
print("Export complete")
return response.json()
elif response.status_code == 202:
progress = response.headers.get("X-Progress", "unknown")
print(f"In progress: {progress}")
time.sleep(interval)
interval = min(interval * 1.5, 30.0)
else:
response.raise_for_status()

raise TimeoutError(f"Export timed out after {timeout} seconds")


def download_files(manifest, output_dir):
"""Download all files from the export manifest."""
import os
os.makedirs(output_dir, exist_ok=True)

for item in manifest.get("output", []):
resource_type = item["type"]
url = item["url"]
filename = f"{resource_type}.ndjson"
filepath = os.path.join(output_dir, filename)

print(f"Downloading {resource_type}...")
response = requests.get(
url,
headers={"Accept": "application/fhir+ndjson"}
)
response.raise_for_status()

with open(filepath, "wb") as f:
f.write(response.content)
print(f" Saved to {filepath}")


def main():
"""Execute the complete export workflow."""
# Export specific resource types (or None for all)
resource_types = ["Patient", "Observation", "Condition"]

print(f"Starting export for: {', '.join(resource_types)}")
status_url = kick_off_export(resource_types)

manifest = poll_status(status_url)
print(f"Manifest: {manifest}")

download_files(manifest, OUTPUT_DIR)
print("Export complete!")


if __name__ == "__main__":
main()