# SQL on FHIR

Deprecated
SQL on FHIR is not currently available to new customers. This documentation is maintained for existing legacy users only. If you have questions about your current implementation, please contact your account team.

You can query your FHIR data on the 1up Platform with the power of SQL and the business intelligence (BI) tools that you're most familiar with. Though the 1upHealth FHIR API makes it easy for you to operate on individual or bulk FHIR resources, it doesn't include built-in analytics. You can use SQL on FHIR to run your analytics, discover insights, and get value from your FHIR data.

SQL on FHIR allows you to view your FHIR data in relational format, and access it over standard SQL connections. Before you use SQL on FHIR, we recommend that you review some basic concepts that will help understand how the product works, such as resources, resource types, and data mapping.

## FHIR Resources

The FHIR specification [defines over a hundred](http://hl7.org/fhir/R4/resourcelist.html) (and counting) types of data that form the basis of interoperability for different healthcare use cases. Commonly used FHIR resources include Patient, Encounter, ExplanationOfBenefit and Provider.

Every type of resource in SQL on FHIR is represented by a table. A row in one of those tables represents a single resource of that type.

## FHIR Resource Identifiers

FHIR resources can refer to each other, such as when an Encounter resource refers to a particular Patient resource. In the 1upHealth FHIR server, these references use local resource identifiers, in the form of `RESOURCE_TYPE/ID`. For example, if an Encounter references a Patient, the reference is in this format: `Patient/abc123`.

## How FHIR Maps to Relational Data

Every FHIR resource is stored as its own table in SQL on FHIR. These tables follow the fhir_r4 schema are of the format `fhir_r4.resourcetype`. For example, Patient resources can be found in the `fhir_r4.patient` table and Encounter resources are in the `fhir_r4.encounter` table.

The columns in each table directly relate to the FHIR specification for each resource. For example, the FHIR attribute `meta` is represented by the SQL on FHIR column `meta`.

Commonly referenced data that is of a simple data type have been unnested into their own columns. To use complex data types, such as arrays and structs, you must unnest them in your SQL queries.

## SQL

Structured Query Language (SQL) is used to query relational database systems. It is built into most common BI tools including Tableau, Looker, and Power BI. SQL on FHIR only provides read-only access to data. You cannot modify data using SQL.

SQL on FHIR’s query engine is built on Amazon Redshift, and has a documented list of data types and supported features. SQL queries from other databases might not work without modification.

For information about the supported SQL dialect, see the [Amazon Redshift SQL Reference](https://docs.aws.amazon.com/redshift/latest/dg/cm_chap_SQLCommandRef.html).

## Security, Authentication, Authorization & Limits

The 1upHealth SQL on FHIR platform is HIPAA-compliant. Each environment is secured with multi-layer security, and an authentication and authorization architecture that is based on industry best practices. The strategies we use to protect your data are below:

| Strategy | Description |
|  --- | --- |
| Network security | Access to your SQL endpoint is restricted to users on your VPN using source IP policies. |
| Data protection | Data is secured by server-side encryption at rest and in-flight. |
| Authentication | Authentication requires individual user accounts with access keys and secrets, which are securely distributed. |
| Authorization | Individuals are assigned to groups with the correct roles for their SQL access. |
| Query limits & logging | Data scanning limit per query is 500 GB and the data usage alert is set to 500 GB per hour. |


## Setup your SQL client

You can use any SQL client that supports Amazon Redshift as a data source. The following information may be needed to connect your BI tool to Amazon Redshift:

* JDBC Endpoint URL: Provided by 1upHealth.
  * Example: `jdbc:redshift://endpoint:port/database`
* Username: Your personal access key, provided by 1up.
* Password: Your personal secret key, provided by 1up.


Reference the documentation for your tool on configuration instructions. A few popular tools are linked below:

* [DBeaver](https://dbeaver.com/docs/dbeaver/Database-driver-Amazon-Redshift/)
* [Tableau](https://help.tableau.com/current/pro/desktop/en-us/examples_amazonredshift.htm)
* [Power BI](https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-data-sources)


## Data types

SQL on FHIR defines what type of data can be stored in each column. This impacts the functions available for each column and is valuable to understand data returned from a query.

### Simple data types

The following data types are used to describe singular values in a column. They can be further aggregated using `super` for more complex data types.

* `varchar` — Variable length character string
* `bigint` — 64-bit signed integer
* `double` — 64-bit signed double-precision floating point number
* `boolean` — Values are true and false
* `timestampz` — Date and time, UTC timezone
* `date` — Date is in ISO format (YYYY-MM-DD)


### Complex data types

Complex data types contain collections of simple data types or nested data. Because of the nested nature of FHIR, several columns are represented by the `super` data type. You must use different notation to access the individual values inside this complex data type. See the [Amazon Redshift documentation](https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html) for more detailed type information.

A few guidelines are listed below:

* Use the `super` data type to store semi-structured data or documents as values.
* A super data type can either be an Amazon Redshift scalar value (such as a null, a boolean, a number, or string value) or a complex value (such as an array of values or a structure).
* To [unnest this data type](https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#unnest), Amazon Redshift uses the PartiQL syntax to iterate over SUPER arrays in the FROM clause of a query.


## Time and time zones

SQL on FHIR does not currently support time zone information, but timestamps are stored and processed as UTC. Where there is a timestamp given, such as in the `meta.lastUpdated` field of any resource, you can assume it refers to UTC. This applies to the simple data types of `timestampz` and `date`, either in their own columns or as part of a `super`.

## Table and view definitions

SQL on FHIR includes enrichment tables, FHIR data tables, views that you can use to generate analytics and insights, and visibility into your FHIR data. You can use these definitions to write SQL queries and to help you understand the context of the data in the SQL tables.

### Enrichment tables

You can use the following enrichment tables to help you analyze and get insights into your FHIR data. Some of the tables can also be combined with the views to increase your visibility into your data.

* `enr_rxnconso`: Map NCD codes and RX codes. Use with the `vw_mapping_rxnorm_to_ndc` view.
* `enr_rxnrel`: Map NCD codes and RX codes. Use with the `vw_mapping_rxnorm_to_ndc` view.
* `enr_rxnsat`: Map NCD codes and RX codes. Use with the `vw_mapping_rxnorm_to_ndc` view.
* `enr_valueset`: Limited value sets from eCQM (Electronic Quality Measure).
* `enr_zip_lat_long`: Latitude and longitude values for US ZIP codes. Use to create geographic visualizations.


### FHIR data tables

You can use these FHIR data tables to get more detailed information about your FHIR resource data. Each table contains the information for one type of FHIR resource, which is denoted in the final part of the table name. Within each table, the columns directly map to the attributes defined by the FHIR specification for that resource.

| FHIR data table | FHIR response type |
|  --- | --- |
| fhir_r4.condition | [Condition](https://www.hl7.org/fhir/R4/condition.html#resource) |
| fhir_r4.coverage | [Coverage](https://www.hl7.org/fhir/R4/coverage.html#resource) |
| fhir_r4.encounter | [Encounter](https://www.hl7.org/fhir/R4/encounter.html#resource) |
| fhir_r4.explanationofbenefit | [Explanation of Benefit](https://www.hl7.org/fhir/R4/explanationofbenefit.html#resource) |
| fhir_r4.observation | [Observation](https://www.hl7.org/fhir/R4/observation.html#resource) |
| fhir_r4.organization | [Organization](https://www.hl7.org/fhir/R4/organization.html#resource) |
| fhir_r4.patient | [Patient](https://www.hl7.org/fhir/R4/patient.html#resource) |
| fhir_r4.practitioner | [Practitioner](https://www.hl7.org/fhir/R4/practitioner.html#resource) |


### Views

You can use the following views to help you evaluate data quality and referential integrity. You can also use them to get view profile information for key FHIR tables, such as Patient, Practitioner, and Explanation of Benefits.

| View | Description |
|  --- | --- |
| `vw_loaded_resources` | Count of resources loaded by day. |
| `vw_ref_int_with_path_and_privclientid_aggr` | Aggregate information on referential integrity among resources. If a FHIR resource contains a reference, this is the count of resources that exist (match) or don’t exist (mismatch). |
| `vw_ref_int_with_path_and_privclientid_details` | Detailed information on referential integrity. Lists all references and whether they exist (match) or don’t exist (mismatch). |


## Best practices

When you use 1up SQL on FHIR, we recommend that you use the following best practices for writing your SQL queries to use SQL on FHIR. Whether you are familiar with SQL syntax, or new to writing SQL queries, this information is valuable because it describes the specific recommended methods of querying for 1up SQL on FHIR.

SQL on FHIR is a columnar store. We recommend that you only query the fields that are required for efficient and optimized performance. For more information about columnar storage, see [Amazon Redshift's documentation](https://docs.aws.amazon.com/redshift/latest/dg/c_columnar_storage_disk_mem_mgmnt.html).

### Reserved words

Amazon Redshift has a list of [reserved words](https://docs.aws.amazon.com/redshift/latest/dg/r_pg_keywords.html). Some of the table fields that you use with SQL on FHIR might coincide with the reserved words. To successfully query the fields without encountering issues, make sure to include double quotes (“ “) around the field name.

#### Reserved word example


```sql
select p."language"
from fhir_r4.patient p 
limit 10;
```

### Case sensitivity

Some fields in FHIR tables include mixed-case values. To make sure you can successfully query the fields without encountering issues, you can enable the case-sensitive identifier.

#### Verify case sensitivity status


```sql
SHOW enable_case_sensitive_identifier;
```

#### Enable case sensitivity


```sql
SET enable_case_sensitive_identifier to TRUE;
```

#### Query mixed-case fields

One example of a common mixed-case query is the `valueCodeableConcept` field in the `fhir_r4.observation` table.


```sql
select o."valueCodeableConcept"
from fhir_r4.observation o
limit 10;
```

### Use SUPER

You can use the SUPER data type to store semistructured data as values. FHIR data is highly nested. You can use the SUPER data type to more easily store and interact with these complex data types.

### Unnest

A common task in SQL on FHIR is to query array data. SQL on FHIR supports the `unnest` function, which is used to expand an array into a relation that can be queried.

For example, you can query a FHIR table for Practitioner names. The `name` column in the `fhir_r4.practitioner` table is an array of `structs`, which contains a nest array for the `given` name.

#### Example nested array of `structs`


```sql
name: array | struct ->
  | id:varchar(256)
  | use:varchar(256)
  | family:varchar(256)
  | given: array | varchar(256)
  | prefix: array | varchar(256)
  | suffix: array | varchar(256)
```

When you write a query for this example, you must consider that `fhir_r4.practitioner.name` is an array of varying length that contains 0 or more elements. If you query only the first element from the array, you lose all other values that might be present. To get every element in the array, you must use the `unnest` function.

#### Example query to get all elements in the array with `unnest`


```sql
SELECT 
  id,
  unnested_name.family as "Last Name",
  given as "First Name"
FROM fhir_r4.practitioner as p,
  p."name" as unnested_name,
  unnested_name.given as given
LIMIT 10;
```

### Use JSON functions

You can use the following JSON Functions to convert SQL data from one type to another:

* `JSON_SERIALIZE`: Convert the SUPER data type to the VARCHAR data type.
* `JSON_PARSE`: Convert the VARCHAR data type to the SUPER data type.


### Join patient resources

Joins are used to combine records from two or more related using a common identifier. You can use a `join` to query for a patient and all of that patient’s observations. SQL on FHIR uses local resource identifiers, in the `RESOURCE_TYPE/ID` form.

For example, an Observation resource contains a reference to a patient with the `Patient/123xyz` resource identifier. The Patient resource uses only the `ID` portion of that reference, which means that before you can use `join` you must first separate the local resource identifier resource type from the ID.

You can separate the local resource identifier with the `split_part` function in SQL on FHIR. In the Observation FHIR resource, the Patient identifier is found in the `reference` nested attribute of `subject`, which you can call using dot notation.

#### Example `split_part` function


```js
split_part(json_serialize(fhir_r4.observation.subject.reference), '/', 2) AS member_id
```

This function separates the string in `fhir_r4.observation.subject.reference` on the `/` slash character, and returns the second part, which is the ID.

Another function used in this example is `json_serialize`. The `split_part` function uses `varchar` as the first argument. Because `fhir_f4.observation.subject.reference` is a SUPER data type, you can use the `json_serialize` function to convert it to a VARCHAR data type.

#### Example query to get all Observation and Patient data for observations related to a patient


```sql
SELECT
  split_part(json_serialize(o.subject.reference), '/', 2) AS member_id
FROM
  fhir_r4.observation o
  INNER JOIN fhir_r4.patient p ON split_part(json_serialize(o.subject.reference), '/', 2) = p.id
LIMIT 10;
```

## Example queries

You can use the following example SQL queries to gather common insights about your data. Each query demonstrates the techniques that 1up recommends you use when querying SQL on FHIR, such as unnesting, joining, and correctly pulling data.

These queries are written to pull data from common customer use cases. In some situations you might find that null values are returned or you might not see any results. The specific data available in your instance of SQL on FHIR varies depending on what data has been ingested into 1upHealth‘s FHIR Server.

### Querying Patient resources

You can use the examples in this section to learn how to query for common patient information and limit a query to a certain cohort of patients.

#### Count of patients born after a date

You can use aggregation to count Patient resources for patients born after a certain date. You can use the `to_date` function to specify the date for date fields, such as `birthdate`.


```sql
SELECT DISTINCT count(*)
FROM fhir_r4.patient
WHERE birthdate < to_date('1951-12-26', 'yyyy-mm-dd');
```

#### Birth & death queries

You can use the date_add function to find resources relative to a date.

##### Query for patients born in the last 6 months


```sql
SELECT DISTINCT
    p.*
FROM fhir_r4.patient p 
WHERE 
    p.birthdate > dateadd(MONTH, -6, getdate())
LIMIT 10;
```

##### Query for deceased patients


```sql
SELECT
  p.*
FROM 
  fhir_r4.patient p
WHERE 
  p.deceasedboolean = true OR deceaseddatetime is not null
LIMIT 10;
```

#### Find Patient observations

This query joins a patient resource to their observations and returns the observation’s code and display text.


```sql
WITH observation AS 
(SELECT
  split_part(json_serialize(o.subject.reference), '/', 2) AS member_id,
  code_coding_unnested.code AS observation_code,
  code_coding_unnested.display AS observation_display
FROM
  fhir_r4.observation AS o, 
  o.code.coding AS code_coding_unnested
 )
 SELECT 
     member_id, 
     observation_code, 
     observation_display 
 FROM observation AS o
 INNER JOIN fhir_r4.patient ON o.member_id = fhir_r4.patient.id
LIMIT 10;
```

#### Find vital signs

Vital signs are stored in patient observations, and can be joined to the patient resource to return the patient’s member ID.


```sql
WITH observation AS (
    SELECT 
      split_part(json_serialize(o.subject.reference), '/', 2) AS member_id,
      category_coding_unnested.version AS category_version,
      category_coding_unnested.code AS category_code,
      category_coding_unnested.display AS category_display,
      code_coding_unnested.code AS observation_code,
      code_coding_unnested.display AS observation_display
    FROM 
      fhir_r4.observation AS o,
      o.category AS category_unnested,
      o.code.coding AS code_coding_unnested,
      category_unnested.coding AS category_coding_unnested
)
SELECT o.*
FROM observation AS o
INNER JOIN fhir_r4.patient p ON o.member_id = p.id
WHERE 
  o.category_display = 'Vital Signs'
LIMIT 10;
```

#### Find smoking status

Smoking status is one type of observation stored in a survey observation category. You can retrieve all surveys from observations and use the smoking_status_display column to get the smoking status.


```sql
with observation as (
    SELECT
      distinct split_part(json_serialize(o.subject.reference), '/', 2) AS member_id,
      category_coding_unnested.display AS category_display,
      valueCodeableConcept_coding_unnested.code AS smoking_status_code,
      valueCodeableConcept_coding_unnested.display AS smoking_status_display
    FROM
      fhir_r4.observation o,
      o.category category_unnested,
      category_unnested.coding category_coding_unnested,
      o."valueCodeableConcept".coding valueCodeableConcept_coding_unnested 
 )
select o.* 
from observation as o
  INNER JOIN fhir_r4.patient p ON o.member_id = p.id
WHERE
  category_display = 'survey'
LIMIT 10;
```

### Query Coverage resources

Coverage resources include data that can help you understand payments.

#### Find Coverage data in a date range

Show members that have coverage started or ended within the last 3 months.

Reserved word
The word `end` is a reserved word and must be double-quoted.


```sql
SELECT
  *
FROM fhir_r4.coverage
WHERE 
  (
    json_serialize(period.start) > dateadd(month, -3, getdate())
    OR json_serialize(period."end") > dateadd(month, -3, getdate())
  )
LIMIT 10;
```

#### Find duplicates in Coverage data

You can join coverage data to a patient to find any data that overlaps.


```sql
SELECT 
  count(id) as n, 
  split_part(json_serialize(c.beneficiary.reference), '/', 2) as patient_id,
  period.start as period_start,
  period."end" as period_end
FROM 
  fhir_r4.coverage c
GROUP BY
2, 3, 4
HAVING 
  n > 1
ORDER BY 
  1
```