Bulk submit
The bulk submit operation enables external systems (Data Providers) to push large FHIR datasets to Pathling using a staged submission workflow. This implementation follows the Argonaut $bulk-submit specification.
Operations
The bulk submit functionality consists of two operations:
| Operation | Endpoint | Purpose |
|---|---|---|
| $bulk-submit | POST [fhir base]/$bulk-submit | Submit data manifests for processing |
| $bulk-submit-status | POST [fhir base]/$bulk-submit-status | Check submission status and retrieve results |
Submission workflow
The bulk submit operation uses a staged workflow:
- Submit manifests: The Data Provider sends one or more requests with submissionStatus: in-progress and provides the manifestUrl pointing to a bulk data manifest. Pathling downloads the files in the background.
- Mark complete: Once all manifests have been submitted and downloads are finished, the Data Provider sends a request with submissionStatus: complete to trigger the import into the data warehouse.
- Poll status: The Data Provider uses $bulk-submit-status to check the processing status and retrieve results.
$bulk-submit operation
POST [fhir base]/$bulk-submit
Parameters
| Name | Cardinality | Type | Description |
|---|---|---|---|
| submitter | 1..1 | Identifier | Identifier for the submitting system. Must match an entry in the server's allowed submitters list. |
| submissionId | 1..1 | string | Unique identifier for this submission, generated by the submitter. |
| submissionStatus | 1..1 | Coding | Status of the submission: in-progress, complete, or aborted. |
| manifestUrl | 0..1 | string | URL of the bulk export manifest. Provided with in-progress to submit manifests for downloading. |
| fhirBaseUrl | 0..1 | string | Base URL of the FHIR server that produced the manifest. Required when manifestUrl is provided. |
| oauthMetadataUrl | 0..1 | string | Explicit URL to OAuth 2.0 metadata. If not provided, SMART discovery is used on fhirBaseUrl. |
| replacesManifestUrl | 0..1 | string | URL of a previous manifest to abort and replace with the new manifestUrl. |
| fileRequestHeader | 0..* | - | Custom HTTP headers to include when downloading files from the manifest. |
| metadata | 0..1 | - | Optional metadata including label and description for the submission. |
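The rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch rather than part of Pathling: the helper name build_bulk_submit_parameters and its keyword arguments are hypothetical, and it covers only the string-valued parameters (fileRequestHeader and metadata are left out because their structure is not described here).

```python
VALID_STATUSES = {"in-progress", "complete", "aborted"}


def build_bulk_submit_parameters(
    submitter_system,
    submitter_value,
    submission_id,
    status,
    manifest_url=None,
    fhir_base_url=None,
    oauth_metadata_url=None,
    replaces_manifest_url=None,
):
    """Build a Parameters body for $bulk-submit (hypothetical helper)."""
    if status not in VALID_STATUSES:
        raise ValueError(f"Unknown submissionStatus: {status}")
    # The parameter table states that fhirBaseUrl is required with manifestUrl.
    if manifest_url and not fhir_base_url:
        raise ValueError("fhirBaseUrl is required when manifestUrl is provided")

    parameters = [
        {
            "name": "submitter",
            "valueIdentifier": {
                "system": submitter_system,
                "value": submitter_value,
            },
        },
        {"name": "submissionId", "valueString": submission_id},
        {"name": "submissionStatus", "valueCoding": {"code": status}},
    ]
    # Optional parameters are only emitted when a value was supplied.
    optional = {
        "manifestUrl": manifest_url,
        "fhirBaseUrl": fhir_base_url,
        "oauthMetadataUrl": oauth_metadata_url,
        "replacesManifestUrl": replaces_manifest_url,
    }
    for name, value in optional.items():
        if value is not None:
            parameters.append({"name": name, "valueString": value})
    return {"resourceType": "Parameters", "parameter": parameters}


# Example: replace a previously submitted manifest with a corrected one.
replacement_body = build_bulk_submit_parameters(
    "https://example.com/systems",
    "hospital-ehr",
    "submission-2025-001",
    "in-progress",
    manifest_url="https://source-server.example.com/export/manifest-v2.json",
    fhir_base_url="https://source-server.example.com/fhir",
    replaces_manifest_url="https://source-server.example.com/export/manifest.json",
)
```

Failing fast on the client keeps obviously invalid requests from reaching the server, but the server remains the authority on what it accepts.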
Submission status values
| Status | Description |
|---|---|
| in-progress | Submit manifests for downloading. Multiple in-progress requests can be sent to add manifests. |
| complete | Signal that all manifests have been submitted. Triggers the import once all downloads are complete. |
| aborted | Cancel the submission. All in-progress downloads are stopped and the submission is marked as aborted. |
Example request (submit manifest)
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "submitter",
      "valueIdentifier": {
        "system": "https://example.com/systems",
        "value": "hospital-ehr"
      }
    },
    {
      "name": "submissionId",
      "valueString": "submission-2025-001"
    },
    {
      "name": "submissionStatus",
      "valueCoding": {
        "code": "in-progress"
      }
    },
    {
      "name": "manifestUrl",
      "valueString": "https://source-server.example.com/export/manifest.json"
    },
    {
      "name": "fhirBaseUrl",
      "valueString": "https://source-server.example.com/fhir"
    }
  ]
}
Example request (mark complete)
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "submitter",
      "valueIdentifier": {
        "system": "https://example.com/systems",
        "value": "hospital-ehr"
      }
    },
    {
      "name": "submissionId",
      "valueString": "submission-2025-001"
    },
    {
      "name": "submissionStatus",
      "valueCoding": {
        "code": "complete"
      }
    }
  ]
}
Example request (abort submission)
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "submitter",
      "valueIdentifier": {
        "system": "https://example.com/systems",
        "value": "hospital-ehr"
      }
    },
    {
      "name": "submissionId",
      "valueString": "submission-2025-001"
    },
    {
      "name": "submissionStatus",
      "valueCoding": {
        "code": "aborted"
      }
    }
  ]
}
Example request (with explicit OAuth metadata URL)
When the source server's OAuth metadata is not at the standard SMART discovery
location, you can provide an explicit oauthMetadataUrl:
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "submitter",
      "valueIdentifier": {
        "system": "https://example.com/systems",
        "value": "hospital-ehr"
      }
    },
    {
      "name": "submissionId",
      "valueString": "submission-2025-001"
    },
    {
      "name": "submissionStatus",
      "valueCoding": {
        "code": "in-progress"
      }
    },
    {
      "name": "manifestUrl",
      "valueString": "https://source-server.example.com/export/manifest.json"
    },
    {
      "name": "fhirBaseUrl",
      "valueString": "https://source-server.example.com/fhir"
    },
    {
      "name": "oauthMetadataUrl",
      "valueString": "https://auth.example.com/.well-known/oauth-authorization-server"
    }
  ]
}
$bulk-submit-status operation
POST [fhir base]/$bulk-submit-status
This operation retrieves the processing status and results for a submission.
Parameters
| Name | Cardinality | Type | Description |
|---|---|---|---|
| submitter | 1..1 | Identifier | The identifier of the submitting system. |
| submissionId | 1..1 | string | The unique identifier of the submission to check. |
Required headers
| Header | Value | Description |
|---|---|---|
| Accept | application/fhir+json | Specifies the response format. |
| Prefer | respond-async | Indicates asynchronous processing is expected. |
Responses
| Status | Description |
|---|---|
| 202 | Processing in progress. Poll again later. Content-Location and X-Progress headers provided. |
| 200 | Processing complete. Response body contains the status manifest. |
| 4XX | Error. Response body contains an OperationOutcome. |
Response headers
| Header | Description |
|---|---|
| Content-Location | URL to poll for status updates (returned with 202). |
| X-Progress | Progress percentage (e.g., 50%) indicating how much processing is done. |
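A client that wants to show a progress bar needs the X-Progress value as a number. The sketch below assumes the header always has the shape shown in the example above (an integer followed by a percent sign); that format is inferred from the "50%" example, not confirmed by a specification, so anything else is treated as unknown. The function name parse_progress is hypothetical.

```python
import re


def parse_progress(header_value):
    """Return the X-Progress percentage as an int, or None if unparseable."""
    if header_value is None:
        return None
    # Assumed format: digits followed by "%", with optional whitespace.
    match = re.fullmatch(r"\s*(\d{1,3})\s*%\s*", header_value)
    if match is None:
        return None
    percent = int(match.group(1))
    # Values outside 0..100 are not meaningful percentages.
    return percent if 0 <= percent <= 100 else None
```

Returning None instead of raising lets a polling loop keep going when the header is missing or carries free text.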
Example request
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "submitter",
      "valueIdentifier": {
        "system": "https://example.com/systems",
        "value": "hospital-ehr"
      }
    },
    {
      "name": "submissionId",
      "valueString": "submission-2025-001"
    }
  ]
}
Configuration
The bulk submit operation requires server configuration to specify allowed submitters and source URLs. Submitters can optionally be configured with OAuth2 credentials for authenticated file downloads.
Configuration options
| Property | Type | Description |
|---|---|---|
| pathling.bulkSubmit.allowedSubmitters | List | List of allowed submitter identifiers (system and value) |
| pathling.bulkSubmit.allowableSources | List | URL prefixes allowed for manifest and file URLs |
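To illustrate what a URL prefix allow-list means in practice, here is a minimal sketch that uses plain string-prefix matching. This is an assumption about how allowableSources behaves; the server may normalise or compare URLs differently. The function name is_allowed_source is hypothetical.

```python
def is_allowed_source(url, allowable_sources):
    """Return True if the URL starts with one of the configured prefixes."""
    return any(url.startswith(prefix) for prefix in allowable_sources)


# Prefixes end with "/" so that a look-alike host cannot match.
allowed = ["https://source-server.example.com/", "s3://my-bucket/"]
manifest_ok = is_allowed_source(
    "https://source-server.example.com/export/manifest.json", allowed
)
lookalike_ok = is_allowed_source(
    "https://source-server.example.com.attacker.example/manifest.json", allowed
)
```

Under plain prefix matching, a prefix without the trailing slash (for example https://source-server.example.com) would also match the look-alike host in the second call, which is why the trailing slash is worth keeping in configured prefixes.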
Submitter configuration
Each submitter in the allowedSubmitters list supports the following properties:
| Property | Type | Required | Description |
|---|---|---|---|
| system | string | Yes | The identifier system for the submitter. |
| value | string | Yes | The identifier value for the submitter. |
| clientId | string | No | OAuth2 client ID for authenticated file downloads. |
| clientSecret | string | No | OAuth2 client secret for symmetric authentication. |
| privateKeyJwk | string | No | Private key in JWK format for asymmetric (JWT) authentication. |
| scope | string | No | OAuth2 scope to request (e.g., system/*.read). |
| tokenExpiryTolerance | number | No | Seconds before token expiry to refresh (default: 120). |
| useFormForBasicAuth | boolean | No | Send credentials in form body instead of Authorization header (default: true). |
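One way to read tokenExpiryTolerance is as a safety window: a cached token is treated as stale once the current time is within that many seconds of its expiry. The sketch below shows that interpretation only; the function name needs_refresh, the use of epoch seconds, and the inclusive boundary are assumptions rather than documented server behaviour.

```python
import time


def needs_refresh(expires_at, tolerance_seconds=120, now=None):
    """Return True once the token is within the tolerance window of expiry.

    expires_at and now are epoch seconds; now defaults to the current time.
    """
    current = time.time() if now is None else now
    return current >= expires_at - tolerance_seconds
```

With the default of 120 seconds, a token expiring at t=1000 would be refreshed from t=880 onwards, so a download that starts just before expiry does not fail halfway through.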
When OAuth credentials are configured, Pathling will:
- Discover the token endpoint via SMART configuration (from fhirBaseUrl) or use the explicit oauthMetadataUrl if provided in the request.
- Acquire an access token using the configured credentials.
- Include the access token in the Authorization header when fetching manifests and files (if the manifest specifies requiresAccessToken: true).
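The steps above can be sketched without any network calls. SMART App Launch publishes its discovery document at [fhir base]/.well-known/smart-configuration, and both that document and standard OAuth 2.0 authorization server metadata carry a token_endpoint field. The function names below are hypothetical, the Bearer scheme is the usual convention for bulk data downloads rather than something this page states, and the code illustrates the order of decisions, not Pathling's internal implementation.

```python
def smart_configuration_url(fhir_base_url):
    """Return the SMART discovery document URL for a FHIR base URL."""
    return fhir_base_url.rstrip("/") + "/.well-known/smart-configuration"


def metadata_url(fhir_base_url, oauth_metadata_url=None):
    """Prefer the explicit metadata URL, else fall back to SMART discovery."""
    return oauth_metadata_url or smart_configuration_url(fhir_base_url)


def token_endpoint(metadata):
    """Extract token_endpoint from an already-parsed metadata document."""
    endpoint = metadata.get("token_endpoint")
    if not endpoint:
        raise ValueError("Metadata document has no token_endpoint")
    return endpoint


def download_headers(access_token, requires_access_token):
    """Attach the token only when the manifest asks for it."""
    if requires_access_token:
        return {"Authorization": f"Bearer {access_token}"}
    return {}
```

Acquiring the token itself (client secret or signed JWT assertion) is left out, since it depends on which credentials are configured for the submitter.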
Example configuration
pathling:
  bulkSubmit:
    allowedSubmitters:
      # Submitter without OAuth - files must be publicly accessible.
      - system: "https://example.com/systems"
        value: "public-submitter"
      # Submitter with symmetric (client_secret) OAuth authentication.
      - system: "https://example.com/systems"
        value: "hospital-ehr"
        clientId: "pathling-client"
        clientSecret: "secret-value"
        scope: "system/*.read"
      # Submitter with asymmetric (JWT) OAuth authentication.
      - system: "https://example.com/systems"
        value: "clinic-system"
        clientId: "pathling-jwt-client"
        privateKeyJwk: '{"kty":"EC","crv":"P-384","d":"...","x":"...","y":"..."}'
        scope: "system/*.read"
    allowableSources:
      - "https://source-server.example.com/"
      - "s3://my-bucket/"
Python example
The following Python script demonstrates the complete bulk submit workflow, including manifest submission, marking complete, and status polling.
Run the script using uv:
uv run bulk_submit_client.py
Bulk submit client
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# dependencies = ["requests"]
# ///
"""Demonstrates the $bulk-submit workflow with status polling."""

import time
import uuid

import requests

# Configuration
PATHLING_URL = "https://pathling.example.com/fhir"
SUBMITTER_SYSTEM = "https://example.com/systems"
SUBMITTER_VALUE = "hospital-ehr"
MANIFEST_URL = "https://source-server.example.com/export/manifest.json"
FHIR_BASE_URL = "https://source-server.example.com/fhir"


def build_submitter():
    """Build the submitter identifier."""
    return {
        "system": SUBMITTER_SYSTEM,
        "value": SUBMITTER_VALUE
    }


def submit_manifest(submission_id, manifest_url, fhir_base_url):
    """Submit a manifest for downloading."""
    params = {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "submitter", "valueIdentifier": build_submitter()},
            {"name": "submissionId", "valueString": submission_id},
            {
                "name": "submissionStatus",
                "valueCoding": {"code": "in-progress"}
            },
            {"name": "manifestUrl", "valueString": manifest_url},
            {"name": "fhirBaseUrl", "valueString": fhir_base_url}
        ]
    }
    response = requests.post(
        f"{PATHLING_URL}/$bulk-submit",
        json=params,
        headers={"Content-Type": "application/fhir+json"}
    )
    response.raise_for_status()
    print(f"Submitted manifest: {manifest_url}")


def mark_complete(submission_id):
    """Mark the submission as complete to trigger the import."""
    params = {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "submitter", "valueIdentifier": build_submitter()},
            {"name": "submissionId", "valueString": submission_id},
            {
                "name": "submissionStatus",
                "valueCoding": {"code": "complete"}
            }
        ]
    }
    response = requests.post(
        f"{PATHLING_URL}/$bulk-submit",
        json=params,
        headers={"Content-Type": "application/fhir+json"}
    )
    response.raise_for_status()
    print(f"Marked submission as complete: {submission_id}")


def poll_status(submission_id, timeout=3600):
    """Poll the status endpoint until processing is complete."""
    params = {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "submitter", "valueIdentifier": build_submitter()},
            {"name": "submissionId", "valueString": submission_id}
        ]
    }
    headers = {
        "Content-Type": "application/fhir+json",
        "Accept": "application/fhir+json",
        "Prefer": "respond-async"
    }

    # Initial kick-off request
    response = requests.post(
        f"{PATHLING_URL}/$bulk-submit-status",
        json=params,
        headers=headers
    )
    if response.status_code == 200:
        print("Processing already complete")
        return response.json()
    elif response.status_code != 202:
        # Raises for 4XX/5XX; any other unexpected status is also an error.
        response.raise_for_status()
        raise RuntimeError(f"Unexpected status code: {response.status_code}")

    status_url = response.headers.get("Content-Location")
    if not status_url:
        raise ValueError("No Content-Location header in 202 response")
    print(f"Polling status at: {status_url}")

    start = time.time()
    interval = 2.0
    while time.time() - start < timeout:
        response = requests.get(
            status_url,
            headers={"Accept": "application/fhir+json"}
        )
        if response.status_code == 200:
            print("Processing complete")
            return response.json()
        elif response.status_code == 202:
            progress = response.headers.get("X-Progress", "unknown")
            print(f"In progress: {progress}")
            # Back off gradually, capping the wait at 30 seconds.
            time.sleep(interval)
            interval = min(interval * 1.5, 30.0)
        else:
            # Raises for 4XX/5XX; stop rather than spin on anything else.
            response.raise_for_status()
            raise RuntimeError(f"Unexpected status code: {response.status_code}")
    raise TimeoutError(f"Status polling timed out after {timeout} seconds")


def main():
    """Execute the complete bulk submit workflow."""
    submission_id = f"submission-{uuid.uuid4()}"
    print("Starting bulk submit workflow")
    print(f"Submission ID: {submission_id}")
    print()

    # Step 1: Submit the manifest for downloading.
    submit_manifest(submission_id, MANIFEST_URL, FHIR_BASE_URL)

    # Step 2: Mark the submission as complete to trigger the import.
    mark_complete(submission_id)

    # Step 3: Poll for status.
    result = poll_status(submission_id)
    print()
    print("Result:")
    print(result)


if __name__ == "__main__":
    main()