Domain Specific Multi-stage Query Language for Medical Document Repositories

Model Driven Querying

S. Bhalla

VLDB PhD Workshop 2013, 9/23/2013

to Advance Knowledge for Humanity

Introduction

Specialized Domains
- Biomedical, agriculture, medical/healthcare
- Require: effective search and query mechanisms
- Insufficient: search-engine-like, keyword-based searches

Medical domain
- Complex
- Example:
  - Medical professionals: specific technical articles (on a particular topic)
  - General public: general information (about a disease or medicine)

How to retrieve information related to a medical query
- Query: information about "general AIDS information" in a medical search tool such as PubMed
- Result: 1000s of documents covering different aspects of AIDS, such as treatment, drug therapy, transmission, diagnosis, and history

Introduction (1)

Medical Information
- Knowledge: evolved over 10s of years
- Contains: well-defined terms and processes
- Available: on the Web

Types of medical information (diagram)
- Patient-specific information: patient-encounter recordings
- Knowledge-based information: medical literature, Web documents

Introduction (2)

Complexity of Knowledge-based Resources
- Heterogeneous end-user groups: patients, researchers, doctors, and other experts
- Variation in information requirements: patient treatment, self-diagnosis, general health information
- Structure of medical documents (scientific papers, encyclopedias, and other literature): unique, well-defined

Introduction (3): Specialized Documents

Case of medical encyclopedias
- Comprehensive medical guide: for patients and clinicians
- Authoritative source: NLM (National Library of Medicine)
- Paper-based resources: now in electronic format
- Frequently referred to: by medical domain users
- Examples: MedlinePlus, WebMD, A.D.A.M., Merriam-Webster Medical Dictionary

Introduction (4): Why Query

External knowledge base: clinicians
- Evidence-based medicine
- During different stages of point-of-care
  - Assessment plan of treatment
  - Patient diagnosis
- Improve quality of care: authoritative information required

Self-diagnosis
- During early appearance of symptoms
- Personal knowledge: patients and their relatives

The Underlying Structure

The Hierarchical Structure (diagram)
- Topic of the document
  - Subtopics: Subtopic 1, Subtopic 2, ..., Subtopic n, each with its own content
  - Miscellaneous/related content: Content topic 1, Content topic 2, ..., Content topic n, each with its own content
- Flow of contents: organized as stages of point-of-care
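The hierarchy on this slide can be pictured as a small tree. The following is only a sketch under assumptions: the `Node` class, the `walk` helper, and all labels are hypothetical names chosen for illustration and do not come from the talk.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """One node of the document hierarchy: a topic, a subtopic, or a content block."""
    label: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> Iterator[str]:
    """Yield an indented outline line for the node and all of its descendants."""
    yield "  " * depth + node.label
    for child in node.children:
        yield from walk(child, depth + 1)


# Hypothetical document: labels are placeholders, not taken from a real repository.
document = Node("Topic of the document", children=[
    Node("Subtopics", children=[
        Node("Subtopic 1", text="content"),
        Node("Subtopic 2", text="content"),
    ]),
    Node("Miscellaneous/related content", children=[
        Node("Content topic 1", text="content"),
    ]),
])

outline = list(walk(document))
```

Keeping the children in document order preserves the "flow of contents" that the slide describes as following the stages of point-of-care.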

Introduction (5): The End Users

Variable:
a. Demographic characteristics
b. Tasks/purpose
c. Computer/domain expertise

Practitioners and researchers
- Well-versed: domain knowledge and terminologies
- Require: precise, complete, accurate, and timely results

Patients and their relatives
- Not well-versed: domain knowledge and terminologies
- Require: general information

User groups (diagram): healthcare workers, specialized researchers, patients and their relatives

The Medical Queries

Evidence-based Queries
- Intent: diagnostic
- Raised by: clinicians/experts
- Target resources: online medical repositories (e.g. medical encyclopedia)
- Example: “Cases where helicobacter bacteria causes peptic ulcer”

Hypothesis-directed Queries
- Intent: non-diagnostic
- Raised by: novice users/patients
- Target resources: online medical repositories (e.g. medical encyclopedia)
- Example: “Treatment in case of high fever and dizziness”

The Query Flows

- Occurrence: evidence-based and hypothesis-directed queries
- Represent: stages of information seeking
- Comprise: varying levels of query complexity

[Figure: query flow in four numbered stages, 1 to 4]

The Research Gap

Query: Find the chances of "Cancer Risk" in patients who show the symptom "Sleep Deprivation" and have been exposed to "Radiation" (but not to "Environmental Toxins") and who do not have a "Genetic Disorder".

Help needed (diagram): healthcare expert, paper-based resources

- Results: large in number, irrelevant
- Failure: keyword search, domain-specific search tools
- Require: precise and easy-to-use database-style query methods

Key steps:
1. Schema understandable by users
2. Identify resources to query
3. Identify granularity of results

Bridging the Gap

Aim: effective online medical information
- Transform: document repository into a user-level schema
- Enable: high-level query language
- Target audience: skilled and semi-skilled users
- Utilize: query capabilities of a database query language
- Assist: domain experts using the query language
- Facilitate: in-depth queries, granular results

Querying the New Way

Traditional method (diagram): medical expert, search method, resource

Results returned
- Lack specificity
- Long list of full documents
- Trustworthiness of resources unknown

Proposed method (diagram): medical expert, query method (high-level query language), user-level schema, resource

Results returned
- Specific, granular
- Segments of documents matching the query criteria
- Trustworthy/authoritative sources only

Proposed Approach

Key Features

User-Level Schema
- Universal, concept-level schema
- Attributes: understandable by domain experts and novice users; query-able
- Granular results

Multi-stage Query Language
- Maps the multi-stage diagnostic process to a step-by-step query flow
- Interactive querying: view results, add concept
- Continuous query refinement
- Supported queries: simple, medium, complex, recursive

Outline

Two-step framework
- Offline process: create the user-level schema
- Online process: enable the multi-stage query language

Offline process
- Use: Web segmentation algorithm, domain concepts
- Result: automatic creation of a user-level schema

Online process
- Enable: multi-stage query language
- Use: user-level schema
- Results: granular, segment-level, context-based
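As a rough illustration of the two steps, the sketch below splits a page into segments at its `<h2>` headings (offline) and then returns a single segment by concept name (online). Splitting on headings is a deliberately simple stand-in and is not the Web segmentation algorithm used in the talk; `HeadingSegmenter`, `build_schema`, `query_segment`, and the sample page are all hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class HeadingSegmenter(HTMLParser):
    """Collects the text that follows each <h2> heading into one segment."""

    def __init__(self) -> None:
        super().__init__()
        self.segments: Dict[str, str] = {}
        self._in_heading = False
        self._current: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_heading = True
            self._current = ""

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_heading = False
            self._current = self._current.strip()
            self.segments[self._current] = ""

    def handle_data(self, data):
        if self._in_heading:
            self._current += data          # still reading the heading text
        elif self._current is not None:
            self.segments[self._current] += data  # body text of the open segment


def build_schema(html: str) -> Dict[str, str]:
    """Offline step: map each heading (treated as a concept) to its segment text."""
    parser = HeadingSegmenter()
    parser.feed(html)
    parser.close()
    return {name: " ".join(text.split()) for name, text in parser.segments.items()}


def query_segment(schema: Dict[str, str], concept: str) -> Optional[str]:
    """Online step: return only the segment for one concept, not the whole page."""
    return schema.get(concept)


# Hypothetical page; the headings and text are invented for illustration.
page = "<h1>Flu</h1><h2>Causes</h2><p>A virus.</p><h2>Symptoms</h2><p>Fever, <b>cough</b>.</p>"
schema = build_schema(page)
```

The point of the split is that the expensive work (segmentation) happens once, while each online query only looks up named segments.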

The Method

Two-step Framework

Data Model

Data Model

Tree Structured Repository

[Figure: tree-structured repository, with node labels H1, f1, f2]

Example: MedlinePlus Medical Encyclopedia

Data Model (1)

Data Model (2): The Schema

Attributes: diagnostic concepts/terms
- Understandable by expert and novice users
- Do not change frequently

Data Model (3): An XML Document

Example: MedlinePlus Medical Encyclopedia

Document corresponding to “Aarskog Syndrome”

Segments (figure labels): Title, Causes, Symptoms, Treatment
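A document of this kind might look like the XML below, where each child element is one segment. This is a sketch under assumptions: the element names simply mirror the segment labels on the slide, the actual XML layout used by the author is not shown in the deck, and the segment texts are placeholders rather than MedlinePlus content.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: element names mirror the segment labels on the slide;
# the text values are placeholders, not content from MedlinePlus.
xml_text = """
<document>
  <Title>Aarskog Syndrome</Title>
  <Causes>placeholder text for causes</Causes>
  <Symptoms>placeholder text for symptoms</Symptoms>
  <Treatment>placeholder text for treatment</Treatment>
</document>
"""

root = ET.fromstring(xml_text)

# Each child element is one query-able segment of the document.
segment_names = [child.tag for child in root]

# Address a single segment by name instead of retrieving the whole document.
title = root.findtext("Title")
```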

Data Model (4): Query Effort

Query: Find whether "Oxygen therapy" works for the treatment of "Chronic Respiratory Failure" when the symptoms are "Lethargy" OR "Shortness of breath".

Advanced search (MedlinePlus): not possible

Proposed method: easy to use

SELECT attribute = “Disease_name”
WHERE
Attribute “Disease_name” = “Chronic Respiratory Failure”
AND
Attribute “Treatment” = “Oxygen therapy”
AND
Attribute “Symptoms” = “Lethargy”
OR
Attribute “Symptoms” = “Shortness of breath”

Figure labels: result segment, context of user query
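To make the filter concrete, the sketch below evaluates the same conditions over a few in-memory records. It assumes the intended grouping is disease AND treatment AND (either symptom), which the slide does not state explicitly; the `records`, `has`, and `matches` names and all record values are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, List[str]]

# Hypothetical records; values are illustrative, not extracted from MedlinePlus.
records: List[Record] = [
    {"Disease_name": ["Chronic Respiratory Failure"],
     "Treatment": ["Oxygen therapy"],
     "Symptoms": ["Lethargy", "Confusion"]},
    {"Disease_name": ["Chronic Respiratory Failure"],
     "Treatment": ["Ventilation"],
     "Symptoms": ["Shortness of breath"]},
    {"Disease_name": ["Asthma"],
     "Treatment": ["Oxygen therapy"],
     "Symptoms": ["Shortness of breath"]},
]


def has(record: Record, attribute: str, value: str) -> bool:
    """True if the attribute (segment) of the record mentions the value."""
    return value in record.get(attribute, [])


def matches(record: Record) -> bool:
    # Assumed grouping: disease AND treatment AND (symptom A OR symptom B).
    return (
        has(record, "Disease_name", "Chronic Respiratory Failure")
        and has(record, "Treatment", "Oxygen therapy")
        and (has(record, "Symptoms", "Lethargy")
             or has(record, "Symptoms", "Shortness of breath"))
    )


selected = [r["Disease_name"][0] for r in records if matches(r)]
```

The parentheses around the two symptom tests matter: without them, AND binds more tightly than OR, so any record listing "Shortness of breath" would match regardless of disease or treatment.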

Data Model (5): Granular Results

Figure labels: queried attributes/segments; query results (context, granular)

Each result is a segment, a combination of:
- Concept/context in the query
- Item of concern (content enclosed in a segment)

An Example

Query: Find other symptoms where “chronic kidney failure” is caused by “anemia”

- Queried segment: Symptoms
- Segments in query: Causes = “anemia” and Disease_name = “Chronic kidney failure”
- Result segment: Symptoms
- Context: disease_name = “chronic kidney failure” & causes = “anemia” (SYMPTOM segment)
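One way to read this example is that a result carries both the requested segment and the conditions that selected it. The sketch below illustrates that idea only; `SegmentResult`, `query`, substring matching, and the sample documents are assumptions made for illustration, not the talk's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SegmentResult:
    """A granular result: the requested segment plus the context that selected it."""
    segment: str
    content: str
    context: Dict[str, str]


# Hypothetical documents; the segment texts are placeholders.
documents: List[Dict[str, str]] = [
    {"Disease_name": "Chronic kidney failure",
     "Causes": "anemia",
     "Symptoms": "placeholder symptom text A"},
    {"Disease_name": "Chronic kidney failure",
     "Causes": "diabetes",
     "Symptoms": "placeholder symptom text B"},
]


def query(docs: List[Dict[str, str]], queried_segment: str,
          conditions: Dict[str, str]) -> List[SegmentResult]:
    """Return the queried segment of every document whose segments satisfy all conditions."""
    results = []
    for doc in docs:
        # Case-insensitive substring match stands in for real concept matching.
        if all(value.lower() in doc.get(name, "").lower()
               for name, value in conditions.items()):
            results.append(SegmentResult(queried_segment,
                                         doc.get(queried_segment, ""),
                                         dict(conditions)))
    return results


results = query(documents, "Symptoms",
                {"Disease_name": "chronic kidney failure", "Causes": "anemia"})
```

Returning the context alongside the content lets the user see why a segment was retrieved, which a list of full documents does not show.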

Next Step: Multi-stage Query Language

Proposed Query Language (1)

- XQBE: medical document repositories
- Create queries: drag-and-drop interface

Query: Find cases where a person is inflicted with “peptic ulcer” due to “helicobacter pylori bacteria”

Attributes understandable by end users
1. Case = disease_name; Value = ??
2. Due to = Causes; Value = “helicobacter pylori bacteria”
3. Inflicted with = Symptoms; Value = “peptic ulcer”

Query effort: minimal learning curve; computer expertise not required

Proposed Query Language (2)

Multi-stage Query-by-Concept
- Concept: query-able attribute (topic, sub-topic, medical concept)
- Query effort: dynamic selection of attributes; no computer expertise
- Query process (figure)

An example: cases where fever is caused by Pneumonia and Tuberculosis
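The step-by-step refinement could be sketched as follows: each added concept narrows the result set of the previous stage, and every intermediate set is kept so the user can view it before adding the next concept. The `MultiStageQuery` class, the inverted `index`, and the document ids are hypothetical and only illustrate the interaction pattern.

```python
from typing import Dict, List, Set, Tuple

Concept = Tuple[str, str]  # (attribute, value)

# Hypothetical index: concept -> ids of documents whose segment contains it.
index: Dict[Concept, Set[str]] = {
    ("Symptoms", "fever"): {"doc1", "doc2", "doc3"},
    ("Causes", "pneumonia"): {"doc1", "doc2"},
    ("Causes", "tuberculosis"): {"doc2", "doc4"},
}


class MultiStageQuery:
    """Keeps the candidate set after every stage so the user can inspect it."""

    def __init__(self, index: Dict[Concept, Set[str]]) -> None:
        self.index = index
        self.stages: List[Set[str]] = []

    def add_concept(self, attribute: str, value: str) -> Set[str]:
        """Add one concept and narrow the previous stage's results."""
        hits = self.index.get((attribute, value), set())
        current = hits if not self.stages else self.stages[-1] & hits
        self.stages.append(set(current))
        return self.stages[-1]


q = MultiStageQuery(index)
stage1 = q.add_concept("Symptoms", "fever")
stage2 = q.add_concept("Causes", "pneumonia")
stage3 = q.add_concept("Causes", "tuberculosis")
```

Because the stages are stored, a user could step back to an earlier stage and branch with a different concept, which is one possible reading of "continuous query refinement".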

Another Example

Query: Find cases where three clinical concepts (“cough”, “no sore-throat”, and “had no sterol injection”) occur in the context of symptoms, along with a concept having a sub-key (i.e. “non sterol injection at the left side”)

- Possible: XQBE on specialized medical repositories
- Possible: multi-stage query-by-concept query language

Evaluation Plan

Data sets
- MedlinePlus document repository
- Health topics (900+), encyclopedia (4000+), drugs (12000+)

Set of queries
- 50 test queries (multi-staged)
- Using diseases, medication, etc.

Quantitative studies
- Evaluation metrics
- Accuracy of segment extraction (schema creation): precision and recall
- Reduction in search space

Qualitative studies
- Usability studies
- Actual end users
- Query performance
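For the quantitative metrics, precision and recall of segment extraction can be computed by comparing the extracted segment labels with a hand-made reference list. The sketch below uses the standard definitions; the `search_space_reduction` formula is an assumed definition (the deck does not give one), and all labels and counts are invented for illustration.

```python
from typing import Set, Tuple


def precision_recall(extracted: Set[str], expected: Set[str]) -> Tuple[float, float]:
    """Precision = correct / extracted, recall = correct / expected."""
    correct = len(extracted & expected)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall


def search_space_reduction(total_segments: int, segments_returned: int) -> float:
    """Assumed definition: fraction of the repository a query did not have to return."""
    return 1.0 - segments_returned / total_segments


# Hypothetical segment labels for one page: segmenter output vs. a hand-made reference list.
extracted = {"Causes", "Symptoms", "Treatment", "Navigation"}
expected = {"Causes", "Symptoms", "Treatment", "Prevention"}

p, r = precision_recall(extracted, expected)
reduction = search_space_reduction(total_segments=1000, segments_returned=50)
```

In this invented case three of the four extracted labels are correct and three of the four expected labels are found, so precision and recall are both 0.75.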

Initial Achievements

- HTML to XML conversion as per the proposed model
- XQuery on XML
- Integration with XQBE
- Query by concept: enumeration using paper and pencil

Challenges

- Scalability: similarly structured repositories
- List: query operations needed
- Implementation: above query operations
- Query language: user interface

Related Work

Domain-specific Information Retrieval
- Similarity- and popularity-based models: insufficient for domain experts
- “Information granulation” needs to be considered in huge document repositories

Form-based Query Interfaces
- Easy to use
- Limited access to the database
- Complex queries: large number of forms
- Varying medical concepts: large number of fields in forms

Beyond Single-page Web Search Results
- Provide granular results for the user’s search
- Return segments from multiple or related Web documents as results

High-level Graphical Query Languages
- Easy to use and understand
- Little or no programming effort required by the user
- Common languages: QBE, XQBE

Summary and Conclusions

Proposed Multi-stage Query Language
1. Aim: effective online medical resources
2. Key feature: user-level schema
3. Facilitates: granular/context-based results
4. Support: healthcare experts
5. Minimize: learning curve for novice users
6. Reduce: dependency on general-purpose search engines

Provide: Web-user-level activity with little or no programming effort for healthcare experts

Questions
