SNOMED CT CSETS (CONSTRAINED SETS)– ITS PLACE & ITS USE
Dr SB Bhattacharyya
MBBS, MBA, FCGP
Member, National EHR Standardisation Committee, MoH&FW, GoI
Member, Health Informatics Sectional Committee, MHD 17, BIS
Hony. State Secretary (2015), IMA Haryana
President (2010 – 2011), IAMI
Constrained Set?
■ The word constrain means “to control or limit something” (Cambridge Online Dictionary)
■ SNOMED CT makes extensive use of refsets (reference sets) for a wide range of purposes, with each refset serving a specific purpose
■ Refsets need to conform to certain specific rules and guidelines regarding their preparation, distribution and maintenance
■ Designing a refset takes a long time, and the designing entity needs to have a namespace assigned to it
■ This makes the rapid and effective use of SNOMED CT in individual systems cumbersome at best and impractical at worst
Dr SB Bhattacharyya© 2
Solving this conundrum
■ Since within a system it is pretty much laissez-faire, or anything goes, it is wise to use a constrained set of SNOMED CT that suits the purpose
■ For example, for gender or laterality, a small, specially created list of SNOMED CT codes should work excellently (actual list follows in the next slide)
■ Thus, wherever there is a requirement for a system to have a list presented to the user for their selection, this small list serves the purpose
■ This limited list is termed a “constrained set” or CSET (a portmanteau of the two words that it refers to) by the author
Csets for Gender & Laterality
Gender
248152002 | female |
32570681000036106 | indeterminate sex |
32570691000036108 | intersex |
248153007 | male |
407374003 | transsexual |
Laterality
7771000 | left |
24028007 | right |
51440002 | bilateral |
Created using Cliniclue®
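As a minimal sketch of how a system might hold these two csets and offer them as pick-lists, the following Python keeps each cset as a list of (ConceptId, Term) pairs. The ConceptIds and Terms are copied from the lists above; the names `GENDER_CSET`, `LATERALITY_CSET`, `pick_list` and `concept_id_for` are hypothetical and chosen only for illustration. ConceptIds are kept as strings here, on the assumption that the longer identifiers may not fit a native integer type in every downstream system.

```python
# Hypothetical in-memory form of the two csets listed above.
# ConceptIds and Terms come from the slide; all names are illustrative.
GENDER_CSET = [
    ("248152002", "female"),
    ("32570681000036106", "indeterminate sex"),
    ("32570691000036108", "intersex"),
    ("248153007", "male"),
    ("407374003", "transsexual"),
]

LATERALITY_CSET = [
    ("7771000", "left"),
    ("24028007", "right"),
    ("51440002", "bilateral"),
]


def pick_list(cset):
    """Return the Terms to present to the user, in cset order."""
    return [term for _concept_id, term in cset]


def concept_id_for(cset, term):
    """Return the ConceptId to store for the Term the user selected."""
    for concept_id, candidate in cset:
        if candidate == term:
            return concept_id
    raise KeyError(f"{term!r} is not in this cset")
```

A user interface would call `pick_list(LATERALITY_CSET)` to populate a drop-down and `concept_id_for(...)` to turn the selection into the value that is stored.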
Explication
■ This constrained list works very well and suits the purpose of helping users to fill in gender or anatomical side
■ The IHTSDO construction rules for expressions state that either of the following formats is acceptable (only pre-coordinated types are shown here)
– ConceptId
– ConceptId | Term |
■ Thus, let’s say, for “bilateral” laterality, either of the following works
– 51440002
– 51440002 | bilateral |
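The two acceptable forms can be read and written with a few lines of code. The sketch below is an illustration only: it handles just the two pre-coordinated forms shown above and is not a full parser for SNOMED CT expressions. The function names `parse_expression` and `format_expression` are assumptions made for this example.

```python
import re

# Accepts "ConceptId" or "ConceptId | Term |" (pre-coordinated forms only).
_EXPRESSION = re.compile(r"^\s*(\d+)\s*(?:\|\s*(.*?)\s*\|)?\s*$")


def parse_expression(text):
    """Split an expression into (concept_id, term); term is None if absent."""
    match = _EXPRESSION.match(text)
    if match is None:
        raise ValueError(f"not a pre-coordinated expression: {text!r}")
    concept_id, term = match.groups()
    return concept_id, term


def format_expression(concept_id, term=None):
    """Render either the ConceptId-only or the ConceptId | Term | form."""
    if term is None:
        return concept_id
    return f"{concept_id} | {term} |"
```

With this, `parse_expression("51440002 | bilateral |")` and `parse_expression("51440002")` yield the same ConceptId, so a system can accept either form on input and decide independently which form to emit.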
Explication [contd.]
■ It is important to debate the merits and demerits of such an approach
■ Not only must the pros and cons be considered but also the end-result should justify it
■ For starters, let us briefly study the refset approach
■ It should be noted that refsets are meant to be exchanged with external entities in their entirety and need to be updated after every release – international or national
■ It should also be noted that the term “system” here means any system that uses SNOMED CT
Refsets
■ Namespace required if refsets are shared with external entities/systems
■ Needs regeneration after every release
■ Regeneration can be automated using pre-set scripts (e.g., SQL, Perl, etc.) that need to be designed in-house
RefSet field structure
NRC, India© 8
Field                    RefSet type A   RefSet type B   RefSet type C
ID                       yes             yes             yes
Effective time           yes             yes             yes
Active                   yes             yes             yes
Module ID                yes             yes             yes
RefSetID                 yes             yes             yes
ReferencedComponentID    yes             yes             yes
Extra field 1            -               yes             yes
Extra field 2            -               -               yes
Extra field 3            -               -               yes
■ Fields always present: ID, Effective time, Active, Module ID, RefSetID, ReferencedComponentID
■ Identifier: ID
■ Versioning component: Effective time, Active
■ RefSetID: the reference set
■ ReferencedComponentID: the ID of the component being referred to
■ Additional fields: Extra field 1 to Extra field 3
Data Management
■ When data is managed, it is the expressions that are stored and exchanged
■ An expression may carry both a machine-processable part (ConceptId) and a human-readable part (Term), or just the machine-processable part (ConceptId); which form to use is largely a system design issue, which is an internal matter
■ Thus, system designers only need to consider that which is necessary to capture, store, retrieve, display, exchange, process and query the data
■ Anything else is not related to the system functionalities
■ Refsets have largely a governance connotation
Comparing Refsets with Csets
Refsets
■ Formal, Exchangeable
■ Not easily reproducible – needs namespace
■ Needs to be adapted for system use – cannot be used as-is for data capture, storage, retrieval, query and exchange
Csets
■ Informal, Non-exchangeable
■ Easily reproducible – does not need any namespace
■ Ready-to-use for data capture, storage, retrieval, query and exchange
Cset
Pros
■ Quick to develop and ready-to-use
■ Can design, create, deploy and re-use as per specific system requirements
Cons
■ Needs a team with properly trained and experienced professionals to design and create
■ Needs updating with every release – international and national
Logic
■ Most of the data needs to be captured as pre-coordinated expressions (the form available from international or national releases), in either the ConceptId-only or the ConceptId | Term | format, so system designers need access to these expressions to store them in their databases and use them as-is
■ For queries that aggregate data, transitive closure tables are required; otherwise, either the ConceptId alone or the Term alone is enough to return the proper records
■ Csets are easy to create: they are mostly built on-the-fly, and even a moderately complex one hardly takes more than an hour, provided the right domain experts are available to guide the designers
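To illustrate what a transitive closure table contains, the sketch below derives every (descendant, ancestor) pair from a list of direct is-a pairs. It is a minimal illustration under stated assumptions: the function name `transitive_closure` is hypothetical, the identifiers "A", "B" and "C" are placeholders rather than real ConceptIds, and a production system would more likely build the table inside the database from a release's relationship data.

```python
from collections import defaultdict


def transitive_closure(is_a_pairs):
    """Return every (descendant, ancestor) pair implied by direct is-a pairs.

    is_a_pairs: iterable of (child_id, parent_id) tuples.
    """
    parents = defaultdict(set)
    for child, parent in is_a_pairs:
        parents[child].add(parent)

    closure = set()
    for start in list(parents):
        stack = list(parents[start])  # direct parents still to visit
        seen = set()
        while stack:
            ancestor = stack.pop()
            if ancestor in seen:
                continue
            seen.add(ancestor)
            closure.add((start, ancestor))
            # Walk further up the hierarchy from this ancestor.
            stack.extend(parents.get(ancestor, ()))
    return closure


# Placeholder hierarchy: C is-a B, B is-a A.
example = transitive_closure([("C", "B"), ("B", "A")])
```

With the closure available, a query such as "all records coded with A or any of its subtypes" reduces to a lookup of the rows whose ancestor is A, which is what makes aggregation practical.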
Method
■ A good SNOMED CT tool like CliniClue® or Snow OWL® is required
■ A terminologist would be ideal but it may be tough for system vendors to hire
■ The next best person to do this type of work is a health informatics professional who is familiar with SNOMED CT
■ Alternatively, the following may be considered as a team, since this type of work cannot be done by one person; that would be too error-prone and consequently risky
■ Someone familiar with the tool being used is usually acceptable
Method [contd.]
■ Someone well-conversant with SNOMED CT as a whole is required
■ A good DBA who can design the database in such a manner that duplicates are removed – the way SNOMED CT is modelled, the same term may be present in different hierarchies
■ A domain expert – specialist, doctor, nurse, dentist, paramedic, etc. – is required to identify all the Terms (preferred as well as synonyms) needed for that domain (clinical finding, procedure, disorder, allergy, etc.) and to ensure that none has been left out
■ During system use, only the Terms are displayed, while the ConceptIds are stored and/or exchanged with or without the Terms; the Term-to-ConceptId and ConceptId-to-Term mapping is done at the API level
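A minimal sketch of such an API-level mapping is shown below, assuming the cset rows are available as (ConceptId, Term, is_preferred) tuples. The class name `CsetMapper` and its methods are hypothetical. The ConceptIds for left and right come from the laterality cset; the synonym strings "left side" and "right side" are invented purely for illustration and are not taken from a SNOMED CT release.

```python
class CsetMapper:
    """Maps displayed Terms to stored ConceptIds and back (illustrative)."""

    def __init__(self, rows):
        # rows: iterable of (concept_id, term, is_preferred)
        self._id_by_term = {}
        self._preferred_by_id = {}
        for concept_id, term, is_preferred in rows:
            # Any Term, preferred or synonym, resolves to its ConceptId.
            self._id_by_term[term.lower()] = concept_id
            if is_preferred:
                self._preferred_by_id[concept_id] = term

    def to_concept_id(self, term):
        """Term chosen by the user -> ConceptId to store or exchange."""
        return self._id_by_term[term.lower()]

    def to_term(self, concept_id):
        """Stored ConceptId -> preferred Term to display."""
        return self._preferred_by_id[concept_id]


# "left side" and "right side" are made-up synonyms for this example only.
mapper = CsetMapper([
    ("7771000", "left", True),
    ("7771000", "left side", False),
    ("24028007", "right", True),
    ("24028007", "right side", False),
])
```

Here `mapper.to_concept_id("Left side")` resolves to the same ConceptId as "left", while `mapper.to_term("7771000")` always returns the preferred Term for display.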
Method [contd.]
■ The best way is to identify the Term that best describes the domain concept (marital status, laparoscopic procedure, lipid profile, etc.) and construct the SQL statement that will extract all the necessary subtype children and descendants that will form the required constrained list of values
■ For maintenance purposes, rebuilding the csets for every subsequent official release of SNOMED CT, which happens every six months, can be automated by running these scripts to build a new cset
■ The need to manually check the cset does not go away, though; a check is still needed to ensure that all the required concepts and their corresponding preferred terms as well as synonyms are incorporated
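One way such an extraction script could look is sketched below, using Python's built-in `sqlite3` module and a recursive SQL query. Everything about the schema is an assumption made for this example: the tables `concept_term` and `is_a` are simplified stand-ins, not the layout of an actual release, and the identifiers and terms in the sample data are placeholders rather than real SNOMED CT content.

```python
import sqlite3

# Recursive query: start from the direct children of :root, then keep
# following is-a links downwards until no new descendants are found.
DESCENDANTS_SQL = """
WITH RECURSIVE descendants(concept_id) AS (
    SELECT child_id FROM is_a WHERE parent_id = :root
    UNION
    SELECT is_a.child_id
    FROM is_a
    JOIN descendants ON is_a.parent_id = descendants.concept_id
)
SELECT d.concept_id, t.term
FROM descendants AS d
JOIN concept_term AS t ON t.concept_id = d.concept_id
ORDER BY t.term
"""


def build_cset(conn, root_concept_id):
    """Return (concept_id, term) rows for all descendants of the root."""
    return conn.execute(DESCENDANTS_SQL, {"root": root_concept_id}).fetchall()


# Placeholder data: ids and terms are NOT real SNOMED CT content.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_term (concept_id TEXT, term TEXT);
CREATE TABLE is_a (child_id TEXT, parent_id TEXT);
INSERT INTO concept_term VALUES
    ('1', 'status (root)'), ('2', 'status A'),
    ('3', 'status B'), ('4', 'status B1');
INSERT INTO is_a VALUES ('2', '1'), ('3', '1'), ('4', '3');
""")

cset_rows = build_cset(conn, "1")
```

Re-running `build_cset` against the tables loaded from a newer release regenerates the list, after which the manual check described above still applies.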
Cset field structure (proposed)
Field            CSet type A   CSet type B   Role
ConceptId        yes           yes           Concept Identifier
Term             yes           yes           Term
Effective time   -             yes           Versioning component
Active           -             yes           Versioning component
CsetId           -             yes           Cset Identifier
■ Fields always present: ConceptId, Term
■ The CsetId uniquely identifies the Cset that the ConceptId belongs to. A single ConceptId may be assigned to multiple Csets as necessary.
Cset Types
Type A
■ Separate tables for each domain item like Gender, Employment Status, Drug & Medicament, Absence findings, etc.
■ No CsetId needed
■ Easy to build and maintain
■ Requires regeneration of separate tables with every release – several run cycles
Type B
■ One table where every domain is uniquely identified through CsetId that is self-determined and self-generated
■ Complicated to build and maintain
■ Requires running several scripts in series that populate and update a single table with every release – automated single run cycle
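The Type B layout can be sketched as a single table keyed by CsetId. The example below uses Python's built-in `sqlite3` module; the table name `cset`, the column names, the CsetId values "GENDER" and "LATERALITY", and the effective time "20150101" are all assumptions chosen for illustration (the CsetId is self-determined, so any scheme would do). Only a subset of the gender and laterality members listed earlier is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE cset (
    cset_id        TEXT NOT NULL,     -- self-determined identifier
    concept_id     TEXT NOT NULL,
    term           TEXT NOT NULL,
    effective_time TEXT NOT NULL,     -- placeholder value in this sketch
    active         INTEGER NOT NULL,  -- 1 = active, 0 = inactive
    PRIMARY KEY (cset_id, concept_id, term)
)
""")

rows = [
    ("GENDER", "248152002", "female", "20150101", 1),
    ("GENDER", "248153007", "male", "20150101", 1),
    ("LATERALITY", "7771000", "left", "20150101", 1),
    ("LATERALITY", "24028007", "right", "20150101", 1),
    ("LATERALITY", "51440002", "bilateral", "20150101", 1),
]
conn.executemany("INSERT INTO cset VALUES (?, ?, ?, ?, ?)", rows)


def active_members(conn, cset_id):
    """Return the active (concept_id, term) rows of one cset, by term."""
    cursor = conn.execute(
        "SELECT concept_id, term FROM cset "
        "WHERE cset_id = ? AND active = 1 ORDER BY term",
        (cset_id,),
    )
    return cursor.fetchall()
```

A Type A design would instead use one two-column table per domain, which avoids the CsetId column at the cost of regenerating each table separately.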
References
■ Matter based on ideas formulated by Dr SB Bhattacharyya
■ Some matter sourced from presentations prepared by Dr Karanvir Singh in consultation with IHTSDO on behalf of National Release Centre, India
Thanks!