SQL on FHIR is not currently available to new customers. This documentation is maintained for existing legacy users only. If you have questions about your current implementation, please contact your account team.
You can query your FHIR data on the 1up Platform with the power of SQL and the business intelligence (BI) tools that you're most familiar with. Though the 1upHealth FHIR API makes it easy for you to operate on individual or bulk FHIR resources, it doesn't include built-in analytics. You can use SQL on FHIR to run your analytics, discover insights, and get value from your FHIR data.
SQL on FHIR allows you to view your FHIR data in relational format, and access it over standard SQL connections. Before you use SQL on FHIR, we recommend that you review some basic concepts that will help understand how the product works, such as resources, resource types, and data mapping.
The FHIR specification defines over a hundred (and counting) types of data that form the basis of interoperability for different healthcare use cases. Commonly used FHIR resources include Patient, Encounter, ExplanationOfBenefit and Provider.
Every type of resource in SQL on FHIR is represented by a table. A row in one of those tables represents a single resource of that type.
FHIR resources can refer to each other, such as when an Encounter resource refers to a particular Patient resource. In the 1upHealth FHIR server, these references use local resource identifiers, in the form of RESOURCE_TYPE/ID. For example, if an Encounter references a Patient, the reference is in this format: Patient/abc123.
Every FHIR resource is stored as its own table in SQL on FHIR. These tables follow the fhir_r4 schema are of the format fhir_r4.resourcetype. For example, Patient resources can be found in the fhir_r4.patient table and Encounter resources are in the fhir_r4.encounter table.
The columns in each table directly relate to the FHIR specification for each resource. For example, the FHIR attribute meta is represented by the SQL on FHIR column meta.
Commonly referenced data that is of a simple data type have been unnested into their own columns. To use complex data types, such as arrays and structs, you must unnest them in your SQL queries.
Structured Query Language (SQL) is used to query relational database systems. It is built into most common BI tools including Tableau, Looker, and Power BI. SQL on FHIR only provides read-only access to data. You cannot modify data using SQL.
SQL on FHIR’s query engine is built on Amazon Redshift, and has a documented list of data types and supported features. SQL queries from other databases might not work without modification.
For information about the supported SQL dialect, see the Amazon Redshift SQL Reference.
The 1upHealth SQL on FHIR platform is HIPAA-compliant. Each environment is secured with multi-layer security, and an authentication and authorization architecture that is based on industry best practices. The strategies we use to protect your data are below:
| Strategy | Description |
|---|---|
| Network security | Access to your SQL endpoint is restricted to users on your VPN using source IP policies. |
| Data protection | Data is secured by server-side encryption at rest and in-flight. |
| Authentication | Authentication requires individual user accounts with access keys and secrets, which are securely distributed. |
| Authorization | Individuals are assigned to groups with the correct roles for their SQL access. |
| Query limits & logging | Data scanning limit per query is 500 GB and the data usage alert is set to 500 GB per hour. |
You can use any SQL client that supports Amazon Redshift as a data source. The following information may be needed to connect your BI tool to Amazon Redshift:
- JDBC Endpoint URL: Provided by 1upHealth.
- Example:
jdbc:redshift://endpoint:port/database
- Example:
- Username: Your personal access key, provided by 1up.
- Password: Your personal secret key, provided by 1up.
Reference the documentation for your tool on configuration instructions. A few popular tools are linked below:
SQL on FHIR defines what type of data can be stored in each column. This impacts the functions available for each column and is valuable to understand data returned from a query.
The following data types are used to describe singular values in a column. They can be further aggregated using super for more complex data types.
varchar— Variable length character stringbigint— 64-bit signed integerdouble— 64-bit signed double-precision floating point numberboolean— Values are true and falsetimestampz— Date and time, UTC timezonedate— Date is in ISO format (YYYY-MM-DD)
Complex data types contain collections of simple data types or nested data. Because of the nested nature of FHIR, several columns are represented by the super data type. You must use different notation to access the individual values inside this complex data type. See the Amazon Redshift documentation for more detailed type information.
A few guidelines are listed below:
- Use the
superdata type to store semi-structured data or documents as values. - A super data type can either be an Amazon Redshift scalar value (such as a null, a boolean, a number, or string value) or a complex value (such as an array of values or a structure).
- To unnest this data type, Amazon Redshift uses the PartiQL syntax to iterate over SUPER arrays in the FROM clause of a query.
SQL on FHIR does not currently support time zone information, but timestamps are stored and processed as UTC. Where there is a timestamp given, such as in the meta.lastUpdated field of any resource, you can assume it refers to UTC. This applies to the simple data types of timestampz and date, either in their own columns or as part of a super.
SQL on FHIR includes enrichment tables, FHIR data tables, views that you can use to generate analytics and insights, and visibility into your FHIR data. You can use these definitions to write SQL queries and to help you understand the context of the data in the SQL tables.
You can use the following enrichment tables to help you analyze and get insights into your FHIR data. Some of the tables can also be combined with the views to increase your visibility into your data.
enr_rxnconso: Map NCD codes and RX codes. Use with thevw_mapping_rxnorm_to_ndcview.enr_rxnrel: Map NCD codes and RX codes. Use with thevw_mapping_rxnorm_to_ndcview.enr_rxnsat: Map NCD codes and RX codes. Use with thevw_mapping_rxnorm_to_ndcview.enr_valueset: Limited value sets from eCQM (Electronic Quality Measure).enr_zip_lat_long: Latitude and longitude values for US ZIP codes. Use to create geographic visualizations.
You can use these FHIR data tables to get more detailed information about your FHIR resource data. Each table contains the information for one type of FHIR resource, which is denoted in the final part of the table name. Within each table, the columns directly map to the attributes defined by the FHIR specification for that resource.
| FHIR data table | FHIR response type |
|---|---|
| fhir_r4.condition | Condition |
| fhir_r4.coverage | Coverage |
| fhir_r4.encounter | Encounter |
| fhir_r4.explanationofbenefit | Explanation of Benefit |
| fhir_r4.observation | Observation |
| fhir_r4.organization | Organization |
| fhir_r4.patient | Patient |
| fhir_r4.practitioner | Practitioner |
You can use the following views to help you evaluate data quality and referential integrity. You can also use them to get view profile information for key FHIR tables, such as Patient, Practitioner, and Explanation of Benefits.
| View | Description |
|---|---|
vw_loaded_resources | Count of resources loaded by day. |
vw_ref_int_with_path_and_privclientid_aggr | Aggregate information on referential integrity among resources. If a FHIR resource contains a reference, this is the count of resources that exist (match) or don’t exist (mismatch). |
vw_ref_int_with_path_and_privclientid_details | Detailed information on referential integrity. Lists all references and whether they exist (match) or don’t exist (mismatch). |
When you use 1up SQL on FHIR, we recommend that you use the following best practices for writing your SQL queries to use SQL on FHIR. Whether you are familiar with SQL syntax, or new to writing SQL queries, this information is valuable because it describes the specific recommended methods of querying for 1up SQL on FHIR.
SQL on FHIR is a columnar store. We recommend that you only query the fields that are required for efficient and optimized performance. For more information about columnar storage, see Amazon Redshift's documentation.
Amazon Redshift has a list of reserved words. Some of the table fields that you use with SQL on FHIR might coincide with the reserved words. To successfully query the fields without encountering issues, make sure to include double quotes (“ “) around the field name.
select p."language"
from fhir_r4.patient p
limit 10;Some fields in FHIR tables include mixed-case values. To make sure you can successfully query the fields without encountering issues, you can enable the case-sensitive identifier.
SHOW enable_case_sensitive_identifier;SET enable_case_sensitive_identifier to TRUE;One example of a common mixed-case query is the valueCodeableConcept field in the fhir_r4.observation table.
select o."valueCodeableConcept"
from fhir_r4.observation o
limit 10;You can use the SUPER data type to store semistructured data as values. FHIR data is highly nested. You can use the SUPER data type to more easily store and interact with these complex data types.
A common task in SQL on FHIR is to query array data. SQL on FHIR supports the unnest function, which is used to expand an array into a relation that can be queried.
For example, you can query a FHIR table for Practitioner names. The name column in the fhir_r4.practitioner table is an array of structs, which contains a nest array for the given name.
name: array | struct ->
| id:varchar(256)
| use:varchar(256)
| family:varchar(256)
| given: array | varchar(256)
| prefix: array | varchar(256)
| suffix: array | varchar(256)When you write a query for this example, you must consider that fhir_r4.practitioner.name is an array of varying length that contains 0 or more elements. If you query only the first element from the array, you lose all other values that might be present. To get every element in the array, you must use the unnest function.
SELECT
id,
unnested_name.family as "Last Name",
given as "First Name"
FROM fhir_r4.practitioner as p,
p."name" as unnested_name,
unnested_name.given as given
LIMIT 10;You can use the following JSON Functions to convert SQL data from one type to another:
JSON_SERIALIZE: Convert the SUPER data type to the VARCHAR data type.JSON_PARSE: Convert the VARCHAR data type to the SUPER data type.
Joins are used to combine records from two or more related using a common identifier. You can use a join to query for a patient and all of that patient’s observations. SQL on FHIR uses local resource identifiers, in the RESOURCE_TYPE/ID form.
For example, an Observation resource contains a reference to a patient with the Patient/123xyz resource identifier. The Patient resource uses only the ID portion of that reference, which means that before you can use join you must first separate the local resource identifier resource type from the ID.
You can separate the local resource identifier with the split_part function in SQL on FHIR. In the Observation FHIR resource, the Patient identifier is found in the reference nested attribute of subject, which you can call using dot notation.
split_part(json_serialize(fhir_r4.observation.subject.reference), '/', 2) AS member_idThis function separates the string in fhir_r4.observation.subject.reference on the / slash character, and returns the second part, which is the ID.
Another function used in this example is json_serialize. The split_part function uses varchar as the first argument. Because fhir_f4.observation.subject.reference is a SUPER data type, you can use the json_serialize function to convert it to a VARCHAR data type.
SELECT
split_part(json_serialize(o.subject.reference), '/', 2) AS member_id
FROM
fhir_r4.observation o
INNER JOIN fhir_r4.patient p ON split_part(json_serialize(o.subject.reference), '/', 2) = p.id
LIMIT 10;You can use the following example SQL queries to gather common insights about your data. Each query demonstrates the techniques that 1up recommends you use when querying SQL on FHIR, such as unnesting, joining, and correctly pulling data.
These queries are written to pull data from common customer use cases. In some situations you might find that null values are returned or you might not see any results. The specific data available in your instance of SQL on FHIR varies depending on what data has been ingested into 1upHealth‘s FHIR Server.
You can use the examples in this section to learn how to query for common patient information and limit a query to a certain cohort of patients.
You can use aggregation to count Patient resources for patients born after a certain date. You can use the to_date function to specify the date for date fields, such as birthdate.
SELECT DISTINCT count(*)
FROM fhir_r4.patient
WHERE birthdate < to_date('1951-12-26', 'yyyy-mm-dd');You can use the date_add function to find resources relative to a date.
SELECT DISTINCT
p.*
FROM fhir_r4.patient p
WHERE
p.birthdate > dateadd(MONTH, -6, getdate())
LIMIT 10;SELECT
p.*
FROM
fhir_r4.patient p
WHERE
p.deceasedboolean = true OR deceaseddatetime is not null
LIMIT 10;This query joins a patient resource to their observations and returns the observation’s code and display text.
WITH observation AS
(SELECT
split_part(json_serialize(o.subject.reference), '/', 2) AS member_id,
code_coding_unnested.code AS observation_code,
code_coding_unnested.display AS observation_display
FROM
fhir_r4.observation AS o,
o.code.coding AS code_coding_unnested
)
SELECT
member_id,
observation_code,
observation_display
FROM observation AS o
INNER JOIN fhir_r4.patient ON o.member_id = fhir_r4.patient.id
LIMIT 10;Vital signs are stored in patient observations, and can be joined to the patient resource to return the patient’s member ID.
WITH observation AS (
SELECT
split_part(json_serialize(o.subject.reference), '/', 2) AS member_id,
category_coding_unnested.version AS category_version,
category_coding_unnested.code AS category_code,
category_coding_unnested.display AS category_display,
code_coding_unnested.code AS observation_code,
code_coding_unnested.display AS observation_display
FROM
fhir_r4.observation AS o,
o.category AS category_unnested,
o.code.coding AS code_coding_unnested,
category_unnested.coding AS category_coding_unnested
)
SELECT o.*
FROM observation AS o
INNER JOIN fhir_r4.patient p ON o.member_id = p.id
WHERE
o.category_display = 'Vital Signs'
LIMIT 10;Smoking status is one type of observation stored in a survey observation category. You can retrieve all surveys from observations and use the smoking_status_display column to get the smoking status.
with observation as (
SELECT
distinct split_part(json_serialize(o.subject.reference), '/', 2) AS member_id,
category_coding_unnested.display AS category_display,
valueCodeableConcept_coding_unnested.code AS smoking_status_code,
valueCodeableConcept_coding_unnested.display AS smoking_status_display
FROM
fhir_r4.observation o,
o.category category_unnested,
category_unnested.coding category_coding_unnested,
o."valueCodeableConcept".coding valueCodeableConcept_coding_unnested
)
select o.*
from observation as o
INNER JOIN fhir_r4.patient p ON o.member_id = p.id
WHERE
category_display = 'survey'
LIMIT 10;Coverage resources include data that can help you understand payments.
Show members that have coverage started or ended within the last 3 months.
The word end is a reserved word and must be double-quoted.
SELECT
*
FROM fhir_r4.coverage
WHERE
(
json_serialize(period.start) > dateadd(month, -3, getdate())
OR json_serialize(period."end") > dateadd(month, -3, getdate())
)
LIMIT 10;You can join coverage data to a patient to find any data that overlaps.
SELECT
count(id) as n,
split_part(json_serialize(c.beneficiary.reference), '/', 2) as patient_id,
period.start as period_start,
period."end" as period_end
FROM
fhir_r4.coverage c
GROUP BY
2, 3, 4
HAVING
n > 1
ORDER BY
1