
MPOG PHI Scrubbing

Introduction

In order to run the PHI Scrubber, you will need to be aware of a few items beforehand.

The application will remove every identifier it detects, but there will always be a minimal (non-zero) risk that some free-text notes still contain identifiers.

Sites need to run their own PHI Scrubbing Application to ensure accuracy.

The PHI Scrubbing Application uses several dictionaries to identify which strings to remove and which strings should be kept:

MPOG Dictionaries (preloaded into application)
o US Census Bureau list of common names (first and last names)
o Snomed – the most comprehensive list of healthcare terminology
o Common perioperative terms and acronyms (based upon MPOG data)

Local Dictionary
o Local institution-specific provider name and identifier dictionary, which must be loaded by your IT team

Initially, please run only one week of data to determine whether the program is scrubbing correctly. If there is a problem, it is easier to retrieve one week of data than your entire database.

The PHI scrubbing is done on a local level before being uploaded to the MPOG main repository. When you are ready to run the PHI Scrubbing Application, run it on your local MPOG server (as opposed to from your workstation) to optimize performance.

Every time you upload new data into MPOG, please rerun your local directory to include any new employees who work at your institution. For example, if you upload data from May 1, 2012 through July 31, 2012, your institution may have several new residents or faculty who are not in your local directory. You will need to have your programmer rerun the scripts and verify the new employees.

Your institution's IT person can perform the initial review of the PHI scrubbing, removing obvious terms such as CRNA, attending, anes, provider, etc. from the local dictionary. However, the PI is responsible for the final review of the dictionary. The PI must verify that clinical terms such as Miller and Macintosh have been removed from the local dictionary so that they are not scrubbed, and is responsible for all data transferred to MPOG.

If at any time you need assistance, please contact Tory Lacca either by e-mail or by phone (734-936-8081).


Directions for PHI Scrubbing:

o Launch the MPOG Application Suite
o Choose the PHI Scrubbing Application

PHI Scrubbing App


1. PHI Scrubbing Process (‘De-ID’ tab)

Case Selection Options

“Case Set”
o “De-ID all cases”: Allows users to scrub all cases (within the specified date range, if one is specified)
o “Cases Waiting for De-ID”: Allows users to scrub only the cases which are marked as needing scrubbing (cases where MPOG_Deid_Status_CD = 0)
o Can also choose to scrub just a single case
    Will De-ID regardless of whether the MPOG_Deid_Status_CD is 0 or 1
    Note: If the user also has a date range specified, and the specified case does not fall inside that range, then the case will not be scrubbed when the PHI scrubbing process is run; instead, the user will be notified with an error.

“Date Range”
o Optional, combines with the “Case Set” criteria
o User can specify a date range of cases to identify
    Inclusive (i.e. 2/18/2011 – 2/19/2011 consists of 2 days’ worth of cases)
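The selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual code: the `Case` class, the `select_cases` function and its parameter names are hypothetical, and `deid_status` merely stands in for MPOG_Deid_Status_CD.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Case:
    case_id: int
    case_date: date
    deid_status: int  # stand-in for MPOG_Deid_Status_CD (0 = waiting for De-ID)


def select_cases(cases: List[Case], case_set: str = "all",
                 start: Optional[date] = None, end: Optional[date] = None,
                 single_case_id: Optional[int] = None) -> List[Case]:
    """Return the cases that would be scrubbed (hypothetical helper)."""
    def in_range(case: Case) -> bool:
        # The date range is optional and inclusive on both ends.
        return ((start is None or case.case_date >= start)
                and (end is None or case.case_date <= end))

    if single_case_id is not None:
        case = next((c for c in cases if c.case_id == single_case_id), None)
        if case is None:
            raise LookupError(f"case {single_case_id} not found")
        if not in_range(case):
            # Outside the range: not scrubbed, the user gets an error instead.
            raise ValueError(f"case {single_case_id} is outside the date range")
        return [case]  # selected whether its status is 0 or 1

    selected = [c for c in cases if in_range(c)]
    if case_set == "waiting":  # the "Cases Waiting for De-ID" option
        selected = [c for c in selected if c.deid_status == 0]
    return selected


cases = [Case(1, date(2011, 2, 18), 1), Case(2, date(2011, 2, 19), 0),
         Case(3, date(2011, 2, 20), 0)]
two_days = select_cases(cases, "all", date(2011, 2, 18), date(2011, 2, 19))
print([c.case_id for c in two_days])  # [1, 2]
```

Note that the range 2/18/2011 – 2/19/2011 picks up cases from both days, matching the inclusive behavior described above.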


Running the PHI Scrubbing Process

Start
o Once the case selection has been set, the user simply hits the “Start De-ID Process” button to begin.
o Upon starting, the program sets the “MPOG_Deid_Status_CD” field of all selected cases to 0. This is done so that if the scrubbing process times out, or must be paused/stopped, the PHI scrubbing process does not have to re-scrub every case, but rather only the ones where MPOG_Deid_Status_CD = 0.

Pause
o Once the De-ID process has been started by the user, the Pause button is enabled.
    When the user presses this button, the De-ID process is actually stopped (like hitting the “Stop” button). When the program is un-paused, it restarts itself (like hitting the “Start” button) but only De-IDs the remaining cases that need De-ID (i.e. the program acts as if the “Cases Waiting for De-ID” option is checked).

Stop
o Hit this button at any time after the scrubbing process has been started to stop the process without the possibility of restarting (unlike Pause).
    Any cases that have been scrubbed before the process was killed will remain scrubbed.
    All cases that were selected but did not get scrubbed before the process was killed will still have their “MPOG_Deid_Status_CD” set to 0.
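The Start/Pause/Stop behavior amounts to a resumable loop driven by the status flag. The sketch below illustrates that idea only; `run_deid`, `scrub_case` and `should_stop` are hypothetical names, a plain dictionary stands in for the MPOG_Deid_Status_CD column, and setting the flag to 1 after a case is scrubbed is an assumption based on the 0/1 values mentioned above.

```python
from typing import Callable, Dict, List


def run_deid(status: Dict[int, int], selected_ids: List[int],
             scrub_case: Callable[[int], None],
             should_stop: Callable[[], bool] = lambda: False,
             resume: bool = False) -> bool:
    """Scrub the selected cases; return True if every case was processed."""
    if not resume:
        # On a fresh start, every selected case is marked as waiting (0).
        for case_id in selected_ids:
            status[case_id] = 0
    for case_id in selected_ids:
        if status[case_id] != 0:
            continue  # already scrubbed before a pause or stop
        if should_stop():
            return False  # remaining cases keep status 0
        scrub_case(case_id)
        status[case_id] = 1  # assumption: 1 marks a scrubbed case
    return True


status: Dict[int, int] = {}
scrubbed: List[int] = []
stop_signals = iter([False, False, True])  # stop before the third case

finished = run_deid(status, [10, 11, 12], scrubbed.append,
                    should_stop=lambda: next(stop_signals))
print(finished, status)  # False {10: 1, 11: 1, 12: 0}

# Un-pausing behaves like "Cases Waiting for De-ID": only case 12 is scrubbed.
finished = run_deid(status, [10, 11, 12], scrubbed.append, resume=True)
print(finished, scrubbed)  # True [10, 11, 12]
```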

Progress Display
o The application has a progress bar which shows the current progress, to a granularity of 1%. The progress bar updates itself every 1%.
o The total number of cases is displayed in text above the progress bar.
    The number of cases completed is calculated by taking the percent complete and multiplying by the total number of cases (so this number is rounded to the nearest 1% of the total cases to be scrubbed).
o Estimated Time Remaining
    The process displays an estimated time remaining.
    The number is calculated by taking how long the process has been running and what percent has been completed, calculating how long it has taken to complete 1% of the work, and then multiplying that number by the percent of work remaining.
    The estimated time remaining is recalculated and updated whenever the progress bar is updated (that is, every 1%).
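The progress arithmetic can be written out as a short sketch. The function names are hypothetical, and rounding the completed-case count to a whole number is an assumption; the formulas themselves follow the description above.

```python
from typing import Optional


def cases_completed(percent_complete: int, total_cases: int) -> int:
    # Percent complete multiplied by the total number of cases.
    return round(total_cases * percent_complete / 100)


def estimated_seconds_remaining(elapsed_seconds: float,
                                percent_complete: int) -> Optional[float]:
    if percent_complete <= 0:
        return None  # no estimate is possible before the first 1% is done
    seconds_per_percent = elapsed_seconds / percent_complete
    return seconds_per_percent * (100 - percent_complete)


# 40% done after 120 seconds: 3 seconds per 1%, 60% remaining.
print(cases_completed(40, 250))                # 100
print(estimated_seconds_remaining(120.0, 40))  # 180.0
```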


2. AIMS Dictionary Configuration (‘Configuration’ tab)

The first step in the process is to make sure your programmer loads the institution-specific provider PHI dictionary. The scrubbing specifications are located at the bottom of the document. If your programmers need any assistance, please have them contact Mark Dehring ([email protected]).

The Scrubbing Application searches through several dictionaries to establish what strings should be considered PHI and removed and which strings should be kept:

MPOG Dictionary (preloaded with program)
o US Census: Includes all common first and last names
o Snomed: Most comprehensive list of medical terms (strings to keep)
o Common Perioperative Terms (strings to keep)

Local Institution specific provider PHI Dictionary: consists of all the local provider names and identifiers

Patient specific: for each case, the application looks up that patient’s identifiers (SSN, name, date of birth, etc.) in the local MPOG tables. These identifiers supersede all other dictionaries and are always removed from any text being scrubbed by the De-ID application.

The purpose of the institution specific provider PHI dictionary is to allow the PHI scrubbing process to remove provider names or identifiers that you do not want included in your MPOG contribution. Although the scrubbing algorithm can use MPOG dictionaries to remove nationally known common names (e.g. Kevin), local dictionaries are needed for uncommon names (e.g. Sachin)

Local dictionary

The ‘AIMS PHI Dictionary Search’ is your local dictionary


o The various search string selection options correspond to the following MPOG_Dictionarytype_concept_IDs:
    “All Search Strings” (19016-19020)
    “Site Common Words” (19016)
    “First Names” (19017)
    “Last Names” (19018)
    “Identifiers” (19020)
    “Initials” (19019)
    “Search String”
        If the custom search string checkbox is not selected, the program will display all the words found in the dictionary for the selected categories.
        If the “Search String:” checkbox is selected, then the program will search the selected categories for the specified search string.
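As an illustration of how the search options map onto concept IDs, here is a minimal sketch. The concept ID values come from the list above; everything else is assumed for the example: the dictionary is modeled as a list of (word, concept_id) pairs, `search_dictionary` is a hypothetical name, and the case-insensitive substring match is a guess, since the matching rule is not specified here.

```python
from typing import Iterable, List, Optional, Set, Tuple

CATEGORY_CONCEPT_IDS = {
    "All Search Strings": set(range(19016, 19021)),  # 19016-19020
    "Site Common Words": {19016},
    "First Names": {19017},
    "Last Names": {19018},
    "Initials": {19019},
    "Identifiers": {19020},
}


def search_dictionary(entries: Iterable[Tuple[str, int]],
                      categories: Iterable[str],
                      search_string: Optional[str] = None) -> List[Tuple[str, int]]:
    wanted: Set[int] = set()
    for category in categories:
        wanted |= CATEGORY_CONCEPT_IDS[category]
    results = [(word, cid) for word, cid in entries if cid in wanted]
    if search_string is not None:
        # Assumption: case-insensitive substring match.
        needle = search_string.lower()
        results = [(word, cid) for word, cid in results if needle in word.lower()]
    return results


entries = [("Miller", 19018), ("Kevin", 19017), ("KM", 19019), ("12345", 19020)]
print(search_dictionary(entries, ["First Names", "Last Names"]))
# [('Miller', 19018), ('Kevin', 19017)]
print(search_dictionary(entries, ["All Search Strings"], "mil"))
# [('Miller', 19018)]
```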

Because the local dictionary is based upon your AIMS provider list, it may contain some names or identifiers that are useful clinical terms

o Some common names such as “Miller” would be removed from your MPOG contribution if left in your PHI dictionary. However, the research value of “Miller” is high enough that it is likely that you would prefer to REMOVE this provider’s name from your local PHI dictionary

o In addition, many centers use test or “dummy” users for testing and training that are named with clinical terms: “Attending test” or “Resident Training”. Each of these would be considered PHI unless you remove it from your PHI dictionary

First, check ‘All Search Strings’ and then click ‘Search’ – this will provide you with the list of all the strings in your local database.
o Scroll through the results to review all strings.

Each institution must determine what provider information they are comfortable leaving in their database (i.e. provider numbers, initials, clinical terms, etc.)

To remove a string, highlight it and click ‘Remove from AIMS Dictionary’
o Remove all numbers three digits or less
o Remove all items that may be associated with clinical outcomes or terms*
o Names that may be acronyms (three characters or less; leave in unless it is a patient’s name)

MPOG scrubs on a case-by-case basis.
o For example, if a patient’s name is Miller, MPOG will scrub it from the specific case. If a patient’s name is Miller and Dr. Miller provided anesthesia, then Miller will be removed across the entire case.
o MPOG sets the following priorities to the strings:
    1. Case-by-case/patient information (trumps all)
    2. Clinical terminology
    3. Provider names

*PLEASE NOTE: Clinical terms will need to be removed from the local dictionary. Clinical terms such as Miller and Macintosh will then not be scrubbed unless the term is the patient’s name.
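The three priorities can be read as an ordered series of checks. The sketch below is an assumption-laden illustration rather than the application's logic: `classify_word` is a hypothetical name, the three sets are hypothetical stand-ins for the patient record, the clinical term lists and the local provider dictionary, and matching on lower-cased words is assumed.

```python
from typing import Set, Tuple


def classify_word(word: str, patient_identifiers: Set[str],
                  clinical_terms: Set[str],
                  provider_names: Set[str]) -> Tuple[str, str]:
    """Return (resulting word, rule applied) using the stated priorities."""
    key = word.lower()  # assumption: case-insensitive comparison
    if key in patient_identifiers:   # 1. patient information trumps all
        return "[PHI]", "patient information"
    if key in clinical_terms:        # 2. clinical terminology is kept
        return word, "clinical terminology"
    if key in provider_names:        # 3. provider names are removed
        return "[PHI]", "provider name"
    return word, "no rule matched"


clinical = {"miller", "macintosh"}
providers = {"miller", "sachin"}

# "Miller" is kept as a clinical term when the patient is not named Miller...
print(classify_word("Miller", {"smith"}, clinical, providers))
# ('Miller', 'clinical terminology')

# ...but is removed for a case whose patient is named Miller.
print(classify_word("Miller", {"miller"}, clinical, providers))
# ('[PHI]', 'patient information')
```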


Compare Provider Strings to Common Words

Once you have removed all terms in your local dictionary, click on ‘Compare provider strings to common words’
o This compares the provider names in your local dictionary to any clinical terms which are listed in Snomed (i.e. CRNA, attending, Miller, anes, etc.)
    Go through the list and determine which terms are common words that you would want to keep (e.g. if you have a provider named Pain, you must weigh the research value against the privacy risk).

Adding/Removing words from AIMS_PhiDictionary (i.e. Configuration)
o If text has been entered into the search string box, users can select a concept type in the “Select Type” box (to the right of the search box), and then click “Add to Dictionary” or “Remove from Dictionary” in order to add/remove the specified word/concept_id pair to/from the AIMS dictionary.
o If the word/concept_id pair already exists in the dictionary (for Adding), or the word/concept_id pair does not exist in the dictionary (for Removing), the user will be told so, and the Add/Remove command will not be executed.
o The only restriction on what words are allowed to be added is that words are not allowed to begin or end in whitespace (the user will be notified if a word they try to add begins or ends in whitespace)
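The add/remove rules can be summarized with a small in-memory model. This is a sketch under stated assumptions: the class name `PhiDictionaryModel`, its methods and the message strings are all hypothetical, and a Python set stands in for the AIMS_PhiDictionary table.

```python
from typing import Set, Tuple


class PhiDictionaryModel:
    """Hypothetical in-memory stand-in for the AIMS_PhiDictionary table."""

    def __init__(self) -> None:
        self._pairs: Set[Tuple[str, int]] = set()

    def add(self, word: str, concept_id: int) -> Tuple[bool, str]:
        if word != word.strip():
            # The only restriction: no leading or trailing whitespace.
            return False, "word may not begin or end in whitespace"
        if (word, concept_id) in self._pairs:
            return False, "pair already exists; nothing added"
        self._pairs.add((word, concept_id))
        return True, "added"

    def remove(self, word: str, concept_id: int) -> Tuple[bool, str]:
        if (word, concept_id) not in self._pairs:
            return False, "pair does not exist; nothing removed"
        self._pairs.remove((word, concept_id))
        return True, "removed"


dictionary = PhiDictionaryModel()
print(dictionary.add("Miller", 19018))     # (True, 'added')
print(dictionary.add("Miller", 19018))     # (False, 'pair already exists; nothing added')
print(dictionary.add(" Miller", 19018))    # (False, 'word may not begin or end in whitespace')
print(dictionary.remove("Miller", 19018))  # (True, 'removed')
```

The same word may appear under more than one concept ID, which is why the model keys on the word/concept_id pair rather than on the word alone.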


3. De-ID Sample Testing Tab

This application runs scrubbing logic for a given sample string and explains what was done with each word in the string.

Users can create any string they want or they can copy an actual piece of data from the database and use this sample tester to see what will happen to the string if the PHI Scrubbing is run (without actually having to De-ID the string)

Run a test string
o Example:
    Type in a text stream: “my patient has the name Kevin.”
    Resulting string: my patient has the name [PHI]
o Result Grid:
    Source word = the original word before PHI removal
    Rule Triggered = the reason why the word was removed or not removed (usually this is the highest priority reason)
    Resulting word = the word after PHI removal (it will be either the original word or [PHI])


Associate with an MPOG CASE

You can also run a text stream against a specific case to ensure that the PHI is being scrubbed from cases. Users enter a valid MPOG case ID, and the test shows the results of the De-ID process for that case.

Type in a text stream
Enter the Case ID from a specific case and click test (obtain the MPOG case record number by going to the MPOG case viewer application).
o Without associating with an MPOG case, no patient specific data scrubbing can be tested. Only name strings in the US Census Bureau and MPOG staff identifier list would be removed. So if there is patient PHI in the test string that is not a common name or institution specific staff name, e.g. “Kheterpal”, association with an MPOG case will show the actual result that will happen if De-ID is run.

“Show whitespace and delimiters”
o Displays the whitespace and the delimiter characters that were present in the test string as words in the Results data grid.


De-identification Specifications

PHI Scrubbing Algorithm Summary

[Figure: flowchart of the PHI scrubbing algorithm. Decision steps shown: Word Length <=2; Patient Identifier; Common Periop Term; Local Staff Identifier; UMLS / SNOMED Term; Scoring System based off common names / words. Each step leads to a Keep or Remove outcome.]


Targeted Columns

AIMS_IntraopCaseInfo
o AIMS_Preoperative_Diagnosis_Text
o AIMS_Scheduled_Procedure_Text
o AIMS_Actual_Procedure_Text

AIMS_IntraopInputOutputs
o AIMS_IO_Comment

AIMS_IntraopMedications
o AIMS_Med_Comment

AIMS_IntraopNoteComponents
o AIMS_Component_Value_Text

AIMS_IntraopNotes
o AIMS_Value_Text
o AIMS_User_Comment

AIMS_IntraopPhysiologic
o AIMS_Value_Text
o AIMS_User_Comment

AIMS_LabValues
o AIMS_Comment

AIMS_Outcomes
o AIMS_Value_Text
o AIMS_User_Comment

AIMS_Preop
o AIMS_Value_text
o AIMS_User_Comment

AIMS_PreopDetails
o AIMS_Value_text

NSQIP_Case_Form
o Comments
o Created_by
o Updated_by
o All custom text fields

NSQIP_Concurrent_Procedures
o Created_by
o Updated_by

NSQIP_Intraop_Occurrences
o Created_by
o Updated_by
o Comments_intra

NSQIP_Other_Procedures
o Created_by
o Updated_by

NSQIP_Postop_Occurrences
o Comments_post
o Created_by
o Updated_by

NSQIP_Return_CPT_Code
o Created_by
o Updated_by

NSQIP_Return_ICD9_Code
o Created_by
o Updated_by

Excluded Tables (Minimal Risk of PHI or Staff Info)

AIMS_IntraopInputOutputTotals
AIMS_Patients
o No PHI columns will be transferred
AIMS_Sites

Algorithm

All columns listed above are processed through the scrubbing algorithm
The text of the column is divided into component text segments using common delimiters
o Space
o . – ( )/ \- “ ” ’ ‘ + < > *
o The list of delimiters is stored in the database and configurable over time
o The list of delimiters is synchronized across sites
o Each text segment is then compared against the de-ID text strings (listed below)
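Splitting on a configurable delimiter list can be sketched with a regular expression built from that list. This is an illustration only: `split_segments` is a hypothetical name, the hard-coded `DELIMITERS` list is one reading of the characters shown above (in the application the list lives in the database), and the `keep_delimiters` option is added here purely to show that the original text can be reassembled.

```python
import re
from typing import List, Sequence

# One reading of the delimiter characters listed above (an assumption).
DELIMITERS = [" ", ".", "–", "(", ")", "/", "\\", "-",
              "“", "”", "’", "‘", "+", "<", ">", "*"]


def split_segments(text: str, delimiters: Sequence[str] = DELIMITERS,
                   keep_delimiters: bool = False) -> List[str]:
    # Escape each delimiter so characters like "(" and "*" are taken literally.
    pattern = "|".join(re.escape(d) for d in delimiters)
    if keep_delimiters:
        pattern = f"({pattern})"  # a capturing group keeps delimiters in the output
    return [part for part in re.split(pattern, text) if part]


text = "Dr. Miller (attending)"
print(split_segments(text))
# ['Dr', 'Miller', 'attending']
print(split_segments(text, keep_delimiters=True))
# ['Dr', '.', ' ', 'Miller', ' ', '(', 'attending', ')']
```

Keeping the delimiters makes it possible to replace individual segments and then join the pieces back into a string with the original spacing and punctuation intact.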

Patient identifier scrubbing
o Uses all identifiers in the AIMS_Patients table for that particular patient that have character length of 3 or greater
    Looks for exact match for following columns
        AIMS_first_name
        AIMS_last_name
        AIMS_Middle_name
        AIMS_reg_num
        AIMS_ssn (numbers only)
        AIMS_dob (numbers only)
        AIMS_Address_Street_1
        AIMS_Address_street_2
        AIMS_Address_city
        AIMS_Address_State_Province
        AIMS_Address_postal_Code
        AIMS_Phone_number
    Looks for permutations of the following
        AIMS_ssn with “-” in between segments (999-99-9999)
        AIMS_ssn without “-” in between (“999999999”)
        AIMS_reg_num without leading zeroes
        AIMS_reg_num padded with leading zeroes up to 15 characters long
        AIMS_dob in all numeric formats with delimiters
            o 99/99/9999
            o 9/9/9999
            o 9/9/99
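Generating the identifier permutations listed above might look like the following sketch. The function names are hypothetical, and two details are assumptions not stated here: dates are written month first, and a registration number consisting only of zeroes yields no variants.

```python
from datetime import date
from typing import Set


def ssn_variants(ssn: str) -> Set[str]:
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        return set()
    # With and without "-" between the segments.
    return {digits, f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"}


def reg_num_variants(reg_num: str, max_len: int = 15) -> Set[str]:
    stripped = reg_num.lstrip("0")
    if not stripped:
        return set()  # assumption: an all-zero number produces nothing
    variants = {stripped}  # without leading zeroes
    for width in range(len(stripped) + 1, max_len + 1):
        variants.add(stripped.rjust(width, "0"))  # zero-padded up to max_len
    return variants


def dob_variants(dob: date) -> Set[str]:
    # Assumption: month/day/year order.
    return {
        f"{dob.month:02d}/{dob.day:02d}/{dob.year:04d}",  # 99/99/9999
        f"{dob.month}/{dob.day}/{dob.year}",              # 9/9/9999
        f"{dob.month}/{dob.day}/{dob.year % 100:02d}",    # 9/9/99
    }


print(sorted(ssn_variants("123-45-6789")))     # ['123-45-6789', '123456789']
print(len(reg_num_variants("00012345")))       # 11
print(sorted(dob_variants(date(1950, 2, 8))))  # ['02/08/1950', '2/8/1950', '2/8/50']
```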

Common names removal
o Use a synchronized MPOG table for US census common names
    2 categories: ambiguous and unambiguous
    For now, treat ambiguous and unambiguous as the same thing
o Checking logic
    If a text segment is an exact match to an identifier in this database, checks to see if it is also in the medical terms database
    If the text segment is also in the DEFINITE medical terms database, does not remove
    If the text segment is NOT in the medical terms database, it does remove
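The checking logic for common names reduces to two lookups. In this sketch the table contents, the category labels assigned to each name, and the function name are hypothetical, and lower-casing before comparison is an assumption.

```python
from typing import Dict, Set

# Hypothetical samples of the census names table and the medical terms list.
CENSUS_NAMES: Dict[str, str] = {"kevin": "unambiguous", "miller": "ambiguous"}
MEDICAL_TERMS: Set[str] = {"miller", "macintosh"}


def is_removed_as_common_name(segment: str,
                              census_names: Dict[str, str] = CENSUS_NAMES,
                              medical_terms: Set[str] = MEDICAL_TERMS) -> bool:
    key = segment.lower()  # assumption: case-insensitive exact match
    if key not in census_names:
        return False  # not a census name, so this rule does not apply
    # The category (ambiguous or unambiguous) is stored but, for now,
    # both categories are treated the same way.
    return key not in medical_terms


print(is_removed_as_common_name("Kevin"))     # True: name, not a medical term
print(is_removed_as_common_name("Miller"))    # False: also a medical term
print(is_removed_as_common_name("propofol"))  # False: not a census name
```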

Provider identifier removal
o Institution specific provider identifiers are removed by comparison of text segments against a database table of institution-specific providers
o An MPOG Config Database table contains the list of all target staff identifiers
    First names
    Last names
    Pager IDs
    Doctor Numbers
    User IDs
    Any other identifiers / codes the institution wishes to remove
o Each institution loads their identifiers

Miscellaneous identifier removal
o Find string “UNOS” (lower or upper case) and remove next 8 characters to remove UNOS # from operative procedures

Text segment replacement is with “[PHI]”
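The UNOS rule can be approximated with a single regular expression. This sketch makes assumptions that are not stated above: the word “UNOS” itself is kept, the following characters (up to 8) are replaced with “[PHI]” rather than deleted, and mixed-case spellings also match. `scrub_unos` is a hypothetical name.

```python
import re

# "UNOS" in any letter case, followed by 1 to 8 characters of any kind.
UNOS_PATTERN = re.compile(r"(UNOS).{1,8}", re.IGNORECASE | re.DOTALL)


def scrub_unos(text: str) -> str:
    # Keep the matched "UNOS" text and replace what follows with [PHI].
    return UNOS_PATTERN.sub(lambda match: match.group(1) + "[PHI]", text)


print(scrub_unos("Liver transplant UNOS ABC1234 donor"))
# Liver transplant UNOS[PHI] donor
print(scrub_unos("unos#1234567 ok"))
# unos[PHI] ok
```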