

Pathling is a set of tools that make it easier to use FHIR® within data analytics. It is built on Apache Spark, and includes both language libraries and a server implementation.

It is primarily aimed at the following use cases:

  1. Exploratory data analysis – Exploration of hypotheses, assessment of assumptions, and selection of appropriate statistical tools and techniques.
  2. Patient cohort selection – Selection and retrieval of patient records based upon complex inclusion and exclusion criteria.
  3. Data preparation – Processing and re-shaping data in preparation for use with statistical and machine learning tools.


Pathling provides three main components:

  1. Encoders – a library that can turn FHIR data into Spark data sets, ready for SQL query or use within Spark applications.
  2. Language libraries – libraries that help you use FHIR data within data analytics workflows and applications.
  3. Server – a FHIR server implementation that can provide query services for analytics applications.



Pathling implements a language called FHIRPath as a way of referring to FHIR data within your queries. It helps to reduce the complexity of navigating FHIR data structures, as compared to more general query languages such as SQL.

Further information about the FHIRPath syntax and functions that Pathling supports is available in the Pathling documentation.
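To give a feel for why path-based navigation is simpler than hand-written traversal, the following is a minimal sketch in plain Python. It is not Pathling's implementation and does not use the Pathling API: the `navigate` helper and the sample `patient` record are hypothetical, written only to illustrate how a path such as `name.given` collects values across repeating elements without explicit loops in the query itself.

```python
from typing import Any, List


def navigate(resource: dict, path: str) -> List[Any]:
    """Hypothetical helper: evaluate a dotted path against a FHIR-like dict.

    Each step is applied to every item in the current collection, and
    repeating elements (lists) are flattened into a single collection.
    """
    current: List[Any] = [resource]
    for step in path.split("."):
        next_items: List[Any] = []
        for item in current:
            # Skip primitives and items that lack the requested element.
            if not isinstance(item, dict) or step not in item:
                continue
            value = item[step]
            if isinstance(value, list):
                next_items.extend(value)
            else:
                next_items.append(value)
        current = next_items
    return current


# Illustrative Patient resource (made-up sample data).
patient = {
    "resourceType": "Patient",
    "name": [
        {"use": "official", "family": "Chalmers", "given": ["Peter", "James"]},
        {"use": "usual", "given": ["Jim"]},
    ],
    "gender": "male",
}

given_names = navigate(patient, "name.given")
gender = navigate(patient, "gender")
missing = navigate(patient, "address.city")
```

The single expression `name.given` stands in for two nested loops over `name` and `given`, and a path that matches nothing yields an empty collection rather than an error. Real FHIRPath adds functions, operators and typing on top of this basic navigation.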

Licensing and attribution

Pathling is a product of the Australian e-Health Research Centre, CSIRO, licensed under the CSIRO Open Source Software Licence Agreement. This means that you are free to use, modify and redistribute the software as you wish, even for commercial purposes.

If you use Pathling within your research, please consider citing it using the instructions on our GitHub page.

Pathling is experimental software; use it at your own risk. A full description of the current known issues is available on our GitHub page.