Version: 9.7.1

Import (ping and pull)

The $import-pnp operation allows bulk FHIR data to be imported using a "Ping and Pull" workflow. Instead of the client pushing data, Pathling fetches data directly from a remote FHIR server's bulk export endpoint.

This implementation follows the SMART Bulk Data Import - Ping and Pull Approach proposal.

How it works

  1. The client provides the URL of a bulk export endpoint on a remote FHIR server
  2. Pathling initiates a bulk export on that server
  3. Pathling polls the export status until complete
  4. Pathling downloads and imports the exported files
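
The four steps above can be sketched as a small client for the remote server's bulk export endpoint. This is a minimal illustration, not Pathling's implementation: `Reply`, `HttpGet`, `ping_and_pull` and the injected `http_get` callable are hypothetical names, and the request headers and the manifest's `output` array follow the usual bulk export conventions as an assumption about the remote server.

```python
from typing import Any, Callable, NamedTuple


class Reply(NamedTuple):
    """Hypothetical, minimal stand-in for an HTTP response."""

    status: int
    headers: dict[str, str]
    body: Any


# Hypothetical transport signature: (url, request headers) -> Reply.
HttpGet = Callable[[str, dict[str, str]], Reply]


def ping_and_pull(export_url: str, http_get: HttpGet, max_polls: int = 100) -> dict[str, Any]:
    """Run steps 2 to 4 and return the downloaded file bodies keyed by URL."""
    # Step 2: initiate the bulk export on the remote server.
    kick_off = http_get(
        export_url,
        {"Accept": "application/fhir+json", "Prefer": "respond-async"},
    )
    if kick_off.status != 202:
        raise RuntimeError(f"Export kick-off failed with status {kick_off.status}")
    status_url = kick_off.headers["Content-Location"]

    # Step 3: poll the export status until it reports completion.
    # A real client would sleep between polls; omitted to keep the sketch short.
    for _ in range(max_polls):
        status = http_get(status_url, {"Accept": "application/json"})
        if status.status == 200:
            break
        if status.status != 202:
            raise RuntimeError(f"Export failed with status {status.status}")
    else:
        raise TimeoutError("Export did not complete within the polling budget")

    # Step 4: download every file listed in the export manifest.
    return {
        entry["url"]: http_get(entry["url"], {"Accept": "application/fhir+ndjson"}).body
        for entry in status.body["output"]
    }
```

Injecting the transport keeps the control flow visible and lets the sketch run without network access; in practice the callable would wrap an HTTP library and add authentication.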

Request

POST [base]/$import-pnp
Content-Type: application/fhir+json
Prefer: respond-async

The operation requires the asynchronous request pattern. Include the Prefer: respond-async header.

Example request

This example imports Patient and Observation resources modified after a specific timestamp, using merge mode for incremental updates:

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "exportUrl",
      "valueUrl": "https://source-server.example.com/fhir/$export"
    },
    {
      "name": "mode",
      "valueCoding": {
        "code": "merge"
      }
    },
    {
      "name": "_type",
      "valueString": "Patient"
    },
    {
      "name": "_type",
      "valueString": "Observation"
    },
    {
      "name": "_since",
      "valueInstant": "2025-01-01T00:00:00Z"
    }
  ]
}

Parameters

Name | Cardinality | Type | Description
exportUrl | 1..1 | url | The URL of the bulk export endpoint to import from. Can include query parameters (e.g., $export?_type=Patient).
exportType | 0..1 | Coding | The type of export: dynamic (default) includes data as of export completion, static includes data as of export initiation.
mode | 0..1 | Coding | Controls how data is merged with existing resources. See Save modes. Defaults to merge.
inputFormat | 0..1 | Coding | The format of source files. Defaults to application/fhir+ndjson. See Supported formats.

Bulk export parameters

These parameters are passed through to the remote bulk export endpoint, allowing you to filter the exported data without embedding query parameters in the export URL.

Name | Cardinality | Type | Description
_type | 0..* | string | Resource types to include. Repeat for multiple types.
_since | 0..1 | instant | Export resources modified after this timestamp.
_until | 0..1 | instant | Export resources modified before this timestamp.
_outputFormat | 0..1 | string | Output format for the export. Defaults to application/fhir+ndjson.
_elements | 0..* | string | Elements to include in output. Experimental; support varies by server.
_typeFilter | 0..* | string | FHIR search queries to filter resources (e.g., Patient?active=true).
includeAssociatedData | 0..* | code | Associated data sets to include (e.g., LatestProvenanceResources). Experimental.
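
As an illustration of how the pass-through parameters map onto the request body, the following sketch assembles a Parameters resource from keyword arguments. The helper name `build_import_parameters` and its argument names are hypothetical; the `value[x]` choices follow the types listed in the tables above.

```python
def build_import_parameters(export_url, mode=None, types=(), since=None,
                            until=None, type_filters=()):
    """Build a FHIR Parameters body for $import-pnp (hypothetical helper)."""
    parameter = [{"name": "exportUrl", "valueUrl": export_url}]

    # Optional save mode; when omitted, the server default applies.
    if mode is not None:
        parameter.append({"name": "mode", "valueCoding": {"code": mode}})

    # Repeating parameters appear once per value.
    for resource_type in types:
        parameter.append({"name": "_type", "valueString": resource_type})
    for type_filter in type_filters:
        parameter.append({"name": "_typeFilter", "valueString": type_filter})

    # Time window filters are single-valued instants.
    if since is not None:
        parameter.append({"name": "_since", "valueInstant": since})
    if until is not None:
        parameter.append({"name": "_until", "valueInstant": until})

    return {"resourceType": "Parameters", "parameter": parameter}
```

For example, `build_import_parameters("https://source-server.example.com/fhir/$export", types=["Patient"], type_filters=["Patient?active=true"])` yields a body with one `exportUrl`, one `_type` and one `_typeFilter` entry.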

Save modes

Mode | Description
merge | Match resources by ID; update existing resources and add new ones (default)
overwrite | Delete and replace all existing resources of each type
append | Add new resources without modifying existing ones
ignore | Skip resources that already exist
error | Fail if any resources already exist
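
To make the differences between the modes concrete, here is a toy model that applies each mode to in-memory lists of resources of a single type. It is a literal reading of the descriptions above, not Pathling's implementation, and several details are assumptions: `apply_save_mode` is a hypothetical name, append is modelled as adding without any ID check, ignore is modelled per resource, and error is modelled as failing whenever any resource of the type is already present. The server's actual behaviour may differ.

```python
def apply_save_mode(existing, incoming, mode):
    """Return the stored resources after importing `incoming` (toy model)."""
    if mode == "merge":
        # Match by ID: incoming versions replace existing ones, new IDs are added.
        merged = {resource["id"]: resource for resource in existing}
        merged.update({resource["id"]: resource for resource in incoming})
        return list(merged.values())
    if mode == "overwrite":
        # Everything previously stored for this type is discarded.
        return list(incoming)
    if mode == "append":
        # Assumption: no ID check, so duplicates are possible.
        return list(existing) + list(incoming)
    if mode == "ignore":
        # Assumption: resources whose ID is already stored are skipped.
        existing_ids = {resource["id"] for resource in existing}
        return list(existing) + [r for r in incoming if r["id"] not in existing_ids]
    if mode == "error":
        # Assumption: any pre-existing resource of this type causes a failure.
        if existing:
            raise ValueError("Resources already exist")
        return list(incoming)
    raise ValueError(f"Unknown save mode: {mode}")
```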

Security

Export URL whitelist

To prevent credential leakage and server-side request forgery (SSRF), the $import-pnp operation validates the exportUrl against a configurable whitelist.

  • pathling.import.pnp.allowableExportUrls - A list of URL prefixes which are allowable for use as export URLs. Any exportUrl that does not start with one of these prefixes is rejected with a 400 Bad Request.

This list is mandatory. If it is empty, the operation rejects every request, regardless of whether PNP credentials have been configured. This prevents $import-pnp from being used as an SSRF or warehouse-poisoning vector.
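
The prefix check described above can be illustrated with a few lines of Python. This is a sketch of the rule as stated, not the server's code, and `is_export_url_allowed` is a hypothetical name.

```python
def is_export_url_allowed(export_url, allowable_prefixes):
    """Return True if export_url starts with one of the configured prefixes."""
    # any() over an empty list is False, so an empty whitelist rejects everything.
    return any(export_url.startswith(prefix) for prefix in allowable_prefixes)
```

Because the match is on a plain string prefix, it is safer to end each prefix with a path separator. A prefix of `https://source-server.example.com` would also match `https://source-server.example.com.attacker.test/`, whereas `https://source-server.example.com/` would not.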

Authentication interlock

When PNP credentials (clientId/clientSecret or privateKeyJwk) are configured, Pathling authentication (pathling.auth.enabled) must also be enabled. If authentication is disabled but PNP credentials are present, the server logs a warning at startup and rejects all $import-pnp requests.

Manifest URL validation

After receiving the export manifest, Pathling validates that every download URL in the manifest uses the same origin (scheme + host + port) as the original exportUrl. If a manifest references a different origin, the job fails with an error and no bearer token is sent to the off-host URL.
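
A same-origin comparison of this kind might look like the sketch below, which uses the standard library's `urllib.parse.urlsplit`. The function names are hypothetical, the manifest is assumed to list its files under an `output` array with a `url` field, and treating a missing port as the scheme's default (80 or 443) is an assumption about how origins are normalised.

```python
from urllib.parse import urlsplit

# Default ports used when a URL does not state one explicitly (assumption).
_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else _DEFAULT_PORTS.get(scheme)
    # urlsplit lower-cases the hostname already.
    return (scheme, parts.hostname, port)


def off_origin_urls(export_url, manifest):
    """List manifest download URLs whose origin differs from export_url's."""
    expected = origin(export_url)
    return [
        entry["url"]
        for entry in manifest.get("output", [])
        if origin(entry["url"]) != expected
    ]
```

An empty result means every download URL shares the export URL's origin; any entry in the result would be grounds to fail the job before a token is sent.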

Asynchronous processing

The operation uses the FHIR Asynchronous Request Pattern.

Kick-off response

HTTP/1.1 202 Accepted
Content-Location: [base]/$importstatus/[job-id]

Polling

Poll the URL from Content-Location until you receive a 200 OK response.

  • 202 Accepted — Import still in progress
  • 200 OK — Import complete

Response

On completion, the operation returns a FHIR Parameters resource:

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "transactionTime",
      "valueInstant": "2025-01-15T10:30:00.000Z"
    },
    {
      "name": "request",
      "valueUrl": "https://pathling.example.com/fhir/$import-pnp"
    },
    {
      "name": "output",
      "part": [
        {
          "name": "inputUrl",
          "valueUrl": "https://source-server.example.com/fhir/export/Patient.ndjson"
        }
      ]
    }
  ]
}
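
A client will often want the list of files that were imported. The sketch below pulls the `inputUrl` values out of a completion response shaped like the example above; `imported_urls` is a hypothetical helper name, and the code assumes each `output` parameter carries its URLs in `part` entries named `inputUrl`, as the example shows.

```python
def imported_urls(result):
    """Collect the inputUrl values from a completed $import-pnp response."""
    urls = []
    for parameter in result.get("parameter", []):
        # Only "output" parameters describe imported files.
        if parameter.get("name") != "output":
            continue
        for part in parameter.get("part", []):
            if part.get("name") == "inputUrl":
                urls.append(part["valueUrl"])
    return urls
```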

Python example

The following Python script demonstrates invoking the $import-pnp operation.

Run the script using uv:

uv run import_pnp_client.py

Import PnP client

#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# dependencies = ["requests"]
# ///
"""Demonstrates the $import-pnp operation."""

import time

import requests

PATHLING_URL = "https://pathling.example.com/fhir"
SOURCE_EXPORT_URL = "https://source-server.example.com/fhir/$export"


def kick_off_import(export_url, mode="merge", types=None, since=None):
    """Initiate a ping and pull import.

    Args:
        export_url: The bulk export endpoint URL.
        mode: Save mode (overwrite, merge, append, ignore, error).
        types: Optional list of resource types to include.
        since: Optional ISO timestamp for incremental export.

    Returns:
        The status URL to poll, taken from the Content-Location header.
    """
    parameters = [
        {"name": "exportUrl", "valueUrl": export_url},
        {"name": "mode", "valueCoding": {"code": mode}},
    ]

    # Add resource type filters.
    if types:
        for resource_type in types:
            parameters.append({"name": "_type", "valueString": resource_type})

    # Add timestamp filter for incremental sync.
    if since:
        parameters.append({"name": "_since", "valueInstant": since})

    params = {"resourceType": "Parameters", "parameter": parameters}

    headers = {
        "Content-Type": "application/fhir+json",
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",
    }

    response = requests.post(
        f"{PATHLING_URL}/$import-pnp",
        json=params,
        headers=headers,
    )

    # Raise on 4xx/5xx; any other non-202 status means the job was not accepted.
    response.raise_for_status()
    if response.status_code != 202:
        raise RuntimeError(f"Expected 202 Accepted, got {response.status_code}")

    status_url = response.headers.get("Content-Location")
    if not status_url:
        raise RuntimeError("202 response did not include a Content-Location header")

    print(f"Import started, polling: {status_url}")
    return status_url


def poll_status(status_url, timeout=7200):
    """Poll the status endpoint until import completes."""
    start = time.time()
    interval = 5.0

    while time.time() - start < timeout:
        response = requests.get(
            status_url,
            headers={"Accept": "application/fhir+json"},
        )

        if response.status_code == 200:
            print("Import complete")
            return response.json()

        if response.status_code != 202:
            # Raise on 4xx/5xx, and refuse to loop on any other unexpected status.
            response.raise_for_status()
            raise RuntimeError(f"Unexpected status {response.status_code}")

        progress = response.headers.get("X-Progress", "unknown")
        print(f"In progress: {progress}")

        # Back off gradually, capping the wait at 60 seconds.
        time.sleep(interval)
        interval = min(interval * 1.5, 60.0)

    raise TimeoutError(f"Import timed out after {timeout} seconds")


def main():
    """Execute the ping and pull import."""
    print(f"Starting import from: {SOURCE_EXPORT_URL}")

    # Import only Patient and Observation resources modified since 2025-01-01.
    status_url = kick_off_import(
        SOURCE_EXPORT_URL,
        mode="merge",
        types=["Patient", "Observation"],
        since="2025-01-01T00:00:00Z",
    )
    result = poll_status(status_url)

    print("Result:")
    print(result)


if __name__ == "__main__":
    main()