Querying PerformanceBridge

The core usefullness of PerformanceBridge for developers is the data available. All applications have read access to the PerformanceBridge data. Each application has a chance to extend that data model with application specific information (e.g., a peer review application stores reviews that link back to radiology exams and employees). Whether you are querying your app's data or the PerformanceBridge data you will use the SELECT endpoint.

The SELECT endpoint is found at https://pbserver.example.com/pb/api/query/select. Before jumping into the code or other examples let's briefly explain design goals of this endpoint.

Wait - This page includes the boiler plate for using the SELECT endpoint as well as the language features. If you are using a client library, you can skip the "SELECT boiler plate" section.

PerformanceBridge was designed to enable healthcare applications of many varieties. We tend to group this applications into different classes which reflect a set of platform features that are needed to build them. For those applications dealing with individual patient workflows an API based on a standard, such as FHIR, might be sufficient. However, for applications that deal with aggregation of data (like business intelligence tools), a traditional REST model is a poor fit. Trying to do even basic aggregation, such as counting the number of inpatient exams that took place on a given day, could involve hundreds of separate http requests. Such an anti-pattern would lead to those needing aggregated data copying large portions of the data model to a different database and querying that instead. For this, and other reasons, we have created an endpoint that speaks a query language. The query language we chose is Structured Query Language (SQL). The PerformanceBridge SQL (PBSQL) implementation is based on the PostgreSQL syntax of SQL, but they are not a pure 1:1 mapping. We have also added some language features that help traversing normalized tables easier for developers. We'll demonstrate and discuss this in conjunction with describing the interaction with the endpoint on this page.

Warning - Your PBSQL queries are not run directly on the database. As such things like SQL injection attacks are prevented at the API layer. However, It's still best practice to ensure data is sanatized before sending it to the API. The API does not protect from storing javascript in a text field for instance.

SELECT boiler plate

The api's authentication scheme includes passing a Content-Hash header. Your SELECT statement is part of that header. Here is an example of how to create a call to the select endpoint.

As you can see in the above examples. The Content-Hash is created by running a SHA512 hash on the body of the request encoded in base64. The body consists of a JSON object containing parameters and select information. The parameters are a list of JSON encoded values that correspond to $# references in your select string. The select string is your PBSQL SELECT statement. If no parameters are needed you should pass an empty list as the value for parameters.

Here is an example HTTP request:

POST /pb/api/query/select HTTP/1.1
Host: pbapi.example.com
Content-Type: text/json
Content-Hash: UYShY0WAaD/+x+ldTSXUeSTgworyYfkNW18pYRp61fQRWIVwRTUbosrAW4tSGgRqXEoIWg+OBCX7A1Ag0o3hKg==
Date: 2021-07-22T09:36:56-04:00
Authorization: PB tutorial:vbrCXddMr/GMNTEMUZuMZDHIA9Gt4ls+7JQvYl1TTOxRv1vaLVPqfSqc2BrcvbDg2CLL0nufaE2BlD+wpCdwcw==

{"select":"select * from rad_exams limit 1","parameters":[]}

The response body will be a JSON string containing an object with a count and results key. The count is the number of records returned and the results contains a list of objects that represent the query results. Here is an example respone body from the above request (formatting added):

{
    "count": 1,
    "results": [{
        "accession": "817",
        "current_report_id": null,
        "current_status_id": 13,
        "exam_priority_id": 5,
        "external_system_id": 21,
        "first_final_report_id": null,
        "first_prelim_report_id": null,
        "first_report_id": null,
        "id": 22847,
        "last_final_report_id": null,
        "last_prelim_report_id": null,
        "order_id": 24415,
        "patient_mrn_id": 24428,
        "procedure_id": 853,
        "rad_exam_department_id": 6,
        "resource_id": 113,
        "site_class_id": 865,
        "site_id": 6,
        "site_sublocation_id": 14,
        "updated_at": "2012-3-28T17:55:0Z",
        "visit_id": 202
    }]
}

Error codes

When communication with the API fails HTTP error codes are used to explain the kind of failure and, depending on the failure, a response body will contain more details. The errors are:

  • 200 - No error, returned as expected.
  • 400 - Bad request - The body of the response should describe the reason for the bad request. With the SELECT endpoint this is most often a syntax error in the SQL statement.
  • 401 - Unauthorized - The request did not pass that authentication. Be sure your application is registered, you are using the correct hmac secret key, and your headers follow the scheme described in by the authentication layer.
  • 500 - Something went wrong with the API. Please contact the PerformanceBridge team to determine what went wrong.

PBSQL

As stated above PBSQL is based on PostgreSQL. For the most part you can follow the PostgreSQL syntax and it will work. The exceptions are:

  • Joins need to be more explicit. They must specify a direction (LEFT or RIGHT) at least. Example: SELECT * FROM rad_exams re LEFT JOIN procedures p ON p.id = re.procedure_id
  • Aliasing in the SELECT clause must use the AS keyboard. Example: SELECT count(id) as total FROM rad_exams
  • The CUBE syntax is not yet supported.

Dot Notation

The PerformanceBridge data model is highly normalized. As such it can take several table joins to capture all the data you may need for your users. To make this easier on developers we introduced dot notation syntax to PBSQL. This allows traversal of tables without using joins in your query. The RDBMS backing PerformanceBridge will still perform the joins, but you don't have to state them. Let's walk through some examples of how this can be used to make simpler queries.

Note - Dot notation is not required. It servers no purpose beyond developer happiness. If you don't like it or you're struggling to use it then fall back on traditional joins.

Getting exam information without dot notation:

SELECT re.accession, p.name, pmrn.mrn, o.order_number, proc.description, pt.patient_type
FROM rad_exams re
  LEFT JOIN procedures proc ON proc.id = re.procedure_id
  LEFT JOIN orders o ON o.id = re.order_id
  LEFT JOIN patient_mrns pmrn ON pmrn.id = re.patient_mrn_id
  LEFT JOIN patients p ON p.id = pmrn.patient_id
  LEFT JOIN site_classes sc ON sc.id = re.site_class_id
  LEFT JOIN patient_types pt ON pt.id = sc.patient_type_id
WHERE patient_type = $1
LIMIT 10

Getting order information with dot notation:

SELECT .accession, .patient_mrns.patient.name, .patient_mrns.mrn, .orders.order_number, .procedures.description, .site_classes.patient_types.patient_type
FROM rad_exams
WHERE .site_classes.patient_types.patient_type = $1
LIMIT 10

All dot notation syntax starts with a . and is followed by a structure consisting of either field or a series of fields/tables ending in a field name or a * starting from the first table in the FROM clause. In this case rad_exams. There is no inflection for plurality based on cardinality as there is in some object relationtional mapping libraries. Just use the name of the table. Naturally, there needs to be some link between these tables. If for instance I was to write SELECT .employees.name FROM rad_exams this would return an error as there is no foriegn key associated with the employees table in rad_exams.

What happens when I have multiple foriegn keys in one table that all point to the same table? In this case you would use the field name of the foriegn key explicitly in order to traverse to the other table. Here's an example of getting the employee for each of the roles associated with rad_exam_personnel:

Without dot notation

SELECT e.name
FROM rad_exam_personnel rep
  LEFT JOIN employees e ON rep.technologist_id = e.id
LIMIT 10

With dot notation

SELECT .technologist_id.name FROM rad_exam_personnel LIMIT 10

You may be thinking if I can just use the field name to get at the data in a table it references do I ever need to use the table names in my dot notation? The answer is no. Here is an example of the query we first wrote using only the field names as a way to traverse the tables.

SELECT .accession, .patient_mrn_id.patient_id.name, .patient_mrn_id.mrn, .order_id.order_number, .procedure_id.description, .site_class_id.patient_type_id.patient_type
FROM rad_exams
WHERE .site_class_id.patient_type_id.patient_type = $1
LIMIT 10

Troubleshooting - Should you ever try to reference a table in dot notation that has multiple foreign keys referring to the table where you are in the dot notation (e.g. .rad_reports.employees.name) an error will be thrown by the API stating that there is an ambiguous reference being used.

Dot notation can be used in any core clause (SELECT, WHERE, ORDER BY, GROUP BY) and within functions; including complex ones such as windowing functions. However, it cannot yet be used inside of sub selects.

Finally, what if you have a table that is referenced by other tables that have multiple foriegn keys to it, but you want to start your query from the parent table? For this we'll use a special syntax to specify which foreign field you want to use for the join. The syntax is tablename:fk_field.field. tablename is the foreign table, fk_field is the foreign field on which the join will be done. Finally, field is the foreign field you want to select. It's also possible to specify * as the field. The most common case for this is when thinking about data related to employees. Here's our query for rad_reports for a given employee.

Without dot notation

SELECT e.id, e.name, count(rr.id) as total_reports
FROM employees e
  LEFT JOIN rad_reports rr ON rr.rad1_id = e.id
GROUP BY e.id, e.name
ORDER BY total_reports DESC
LIMIT 10

With dot notation

SELECT .id, .name, count(.rad_reports:rad1_id.id) as total_reports
FROM employees
GROUP BY .id, .name
ORDER BY total_reports DESC
LIMIT 10

Here we use .rad_reports:rad1_id to refer to rad_reports.rad1_id as what we want to match against employees.id. It's possible to continue the dot notation as well. .rad_reports:rad1_id.rad_exams.accession is valid.

Note - You may have noticed we are making an assumption in every example of dot notation. If you did, well done. We are always assuming that there is a primary key for any given table and that the name of the primary key is id. This is a convention that is enforced by our best practices and is consistent throughout the data model.

Helpful Functions and Patterns

As this endpoint provides the full power of SQL there are many ways to build an application around it. There are some helpful tools that we want to call out here that we hope will benefit you.

JSON functions

With the popularity of front end frameworks it's common to use the backend of an application to simply deliver json. SQL typically comes back as tables. This works well when dealing with 1:1 cardinality between models, but it's possible to pull down complex relationships in a single query using functions like array_agg, json_agg, and row_to_json. Here is an example of using the json_agg function being used to retrieve radiology exams and a list of associated reports.

The query:

SELECT
  id,
  accession,
  current_status_id,
  (SELECT json_agg(rs) FROM (SELECT * FROM rad_reports rr WHERE rr.rad_exam_id = re.id) as rs) as reports
FROM rad_exams re
LIMIT 1

The response body:

{
   "count":1,
   "results":[
      {
         "accession":"1007",
         "current_status_id":1,
         "id":1,
         "reports":[
            {
               "body_fts":"'able':97 'accepted':142 'always':147 'annoyances':141 'avoid':171 'avoided':113 'beguiled':16 'belongs':47 'best':103 'blame':46 'blinded':28 'bound':41 'business':129 'cannot':33 'cases':72 'certain':116 'charms':21 'choice':88 'circumstances':117 'claims':122 'demoralized':18 'denounce':6 'desire':30 'dislike':11 'distinguish':79 'duty':54,124 'easy':77 'else':166 'endures':168 'ensue':43 'equal':45 'every':104,111 'fail':51 'foresee':34 'free':82 'frequently':132 'greater':163 'hand':4 'holds':148 'hour':83 'indignation':9 'like':102 'man':145 'matters':151 'men':12 'moment':26 'nothing':93 'obligations':127 'occur':133 'owing':119 'pain':36,70,112 'pains':169,173 'perfectly':74 'pleasure':23,105 'pleasures':135,159,164 'power':86 'prevents':94 'principle':154 'rejects':158 'repudiated':139 'righteous':8 'saying':64 'secure':161 'selection':156 'shrinking':66 'simple':75 'therefore':146 'toil':68 'trouble':38 'untrammelled':90 'weakness':56 'welcomed':109 'wise':144 'worse':172",
               "id":1,
               "impression_fts":"",
               "rad1_id":621,
               "rad2_id":622,
               "rad3_id":null,
               "rad4_id":null,
               "rad_exam_id":1,
               "report_body":"On the other hand, we denounce with righteous indignation and dislike men who are so beguiled and demoralized by the charms of pleasure of the moment, so blinded by desire, that they cannot foresee the pain and trouble that are bound to ensue; and equal blame belongs to those who fail in their duty through weakness of will, which is the same as saying through shrinking from toil and pain. These cases are perfectly simple and easy to distinguish. In a free hour, when our power of choice is untrammelled and when nothing prevents our being able to do what we like best, every pleasure is to be welcomed and every pain avoided. But in certain circumstances and owing to the claims of duty or the obligations of business it will frequently occur that pleasures have to be repudiated and annoyances accepted. The wise man therefore always holds in these matters to this principle of selection: he rejects pleasures to secure other greater pleasures, or else he endures pains to avoid worse pains.",
               "report_event":"2021-04-07T11:10:30-04:00",
               "report_impression":null,
               "report_status_id":1,
               "transcriptionist_id":null,
               "updated_at":"2021-07-06T11:05:17-04:00"
            }
         ]
      }
   ]
}

Creating a query builder

Each language has facilities for string concatenation and interpolation. As your queries have more user controlled filters/selections you will likely have a need for creating queries programmatically. While we will not provide client libraries for all languages, here are some rules we recommend for effective query builders and an example implementation in Ruby.

  1. If you are using a language with Objects, ensure your query builder instances are immutable and always return themselves. This allows for chaining functions without the dangers of underlying state change should you need to set your query to a variable to be used/manipulated later.
  2. Build simple functions for the core clauses of a SELECT statement.
  3. Provide room for prepared statements rather than building your own sanatizing methods.

This is a ruby implementation of a query builder. It refers to an API.select method that does the actual API communication described above.

class BridgeClient::Query

  # Class method missing
  def self.method_missing(method,*args,&block)
    # Provides a mechanism for any valid instance method
    # to instantiate a new query object
    if self.valid_part?(method)
      self.new.send(method,*args,&block)
    else
      super(method,*args,&block)
    end
  end

  # A list of valid instance methods to be used by
  # the above method missing check
  def self.valid_part?(part)
    [:select,:from,:where,:joins,:group,:order,:limit,:offset].include?(part)
  end

  # Instance method missing
  def method_missing(method,*args,&block)
    # Dispatch to the appropriate setter function based
    # on the name of the method
    if [:select,:joins,:where,:group,:order].include?(method)
      self.add(method,args)
    elsif [:from,:limit,:offset].include?(method) and args.size == 1
      self.set(method,args[0])
    elsif [:paginate].include?(method)
      limit = args[0][:per_page]
      offset = limit * (args[0][:page] - 1)
      self.set(:limit,limit).set(:offset,offset)
    else
      super(method,*args,&block)
    end
  end


  def initialize(params={},klass=nil)
    @klass = klass
    if params.class == String
      params = {:from => params}
    end
    # These parameters also provide the function names through the method missing clauses above
    @params = {
      :select => [],
      :joins => [],
      :where => [],
      :group => [],
      :order => [],
      :parameter_count => 0,
      :parameters => []
    }.merge(params)
  end

  def to_s(part=nil)
    case part
    when :select
      "SELECT " + (@params[:select] == [] ? "*" : @params[:select].join(", "))
    when :from
      "FROM " + @params[:from]
    when :joins
      @params[:joins] == [] ? "" : @params[:joins].join(" ")
    when :where
      @params[:where] == [] ? "" : "WHERE " + @params[:where].map {|clause| clause =~ /^\(/ ? "(" + clause + ")" : clause }.join(" AND ")
    when :group
      @params[:group] == [] ? "" : "GROUP BY " + @params[:group].join(",")
    when :order
      @params[:order] == [] ? "" : "ORDER BY " + @params[:order].join(",")
    when :limit
      @params[:limit] ? "LIMIT #{@params[:limit]}" : ""
    when :offset
      @params[:offset] ? "OFFSET #{@params[:offset]}" : ""
    when nil
      [to_s(:select),to_s(:from),to_s(:joins),to_s(:where),to_s(:group),to_s(:order),to_s(:limit),to_s(:offset)].join(" ").strip
    end
  end

  # Handle Arrays as parameters
  def add(part,args)
    # If one argument value and it's a list then assume it's the expected args list
    args = args[0] if args.size == 1 and args[0].class == Array
    local_pcount = @params[:parameter_count]
    added_params = args.slice(1,args.size)
    string = args[0]
    string.gsub!("?") do |match|
      # Test if the given parameter that matches the "?" is an array
      # if it is then flatten the array and build an extra $index reference
      # each item in the array, joined by commas
      if added_params[local_pcount - @params[:parameter_count]].class == Array
        (1..added_params[local_pcount - @params[:parameter_count]].size).map {|i| local_pcount += 1; "$#{local_pcount}"}.join(", ")
      else
        local_pcount += 1 # increment current param count
        "$#{local_pcount}"
      end
    end
    # Flatten any array's that are arguments to the parameters
    flat_params = added_params.inject([]) {|list,param| param.class == Array ? list + param : list << param }
    parameters = @params[:parameters] + flat_params
    string = args[0]
    new_params = @params.merge({part => @params[part] + [string], :parameters => parameters, :parameter_count => local_pcount})
    self.class.new(new_params,@klass)
  end

  def set(part,value)
    self.class.new(@params.merge({part => value}),@klass)
  end

  def query!
    # Basic query validity check (need a from clause)
    if @params[:from]
      BridgeClient::API.select(self.to_s, @params[:parameters])
    else
      throw("FROM clause never set and is required for a query")
    end
  end

  def query
    # basic memoization to prevented repeated api calls should the exact query be called multiple times
    @response ||= query!
  end

  def list
    # basic memoization to prevent repeated list processing should the exact query be called multiple times
    @list ||= ((@klass and @params[:select] == []) ? @klass.records(query["results"]) : query["results"])
  end

  def list!
    query!; list
  end

  def first
    @first ||= self.limit(1).list().first
  end

  def first!
    self.limit(1).list().first
  end

  def clone
    self.class.new(@params,@klass)
  end

end