Tiran Software
-TURKUAZ Project-
RadeX
Tahir Bilal, Onur Deniz, Soner Kara, M. Mert Karadağlı
Assistant: Umut Eroğul
Instructor: Meltem T. Yöndem
Outline
Problem Definition
Important Aspects
Our Approach
General Structure
Analyzer Component
Searcher Component
Current Status
Prototype
Tool and Resources
Q/A
Problem Definition
Billions of radiology reports
Unfortunately, they are stored in free-text format
Hard to search and retrieve
Need for searchable information
Important Aspects
Text Mining
NLP: Information Extraction, Morphological Analysis, Named Entity Recognition
Machine Learning: Neural Networks, Decision Trees, ...
Our Approach
RadeX (Radiology Data Extractor) will enable:
Modular machine learning component
Support for internal/external dictionary connection
Template-based approach for finalizing
General Structure
Analyzer Component
Preprocesses free text
Looks up internal and external lexicons
Gives semantics to words
Extracts searchable data
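The analyzer steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the two lexicons, their tags, and the function names `preprocess`, `lookup` and `analyze` are hypothetical and do not come from the RadeX code.

```python
import re

# Hypothetical lexicons: the slides do not specify their contents or format.
INTERNAL_LEXICON = {"nodule": "FINDING", "lung": "ANATOMY"}
EXTERNAL_LEXICON = {"opacity": "FINDING", "liver": "ANATOMY"}


def preprocess(text):
    """Lowercase the free text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lookup(word):
    """Return a semantic tag, trying the internal lexicon before the external one."""
    if word in INTERNAL_LEXICON:
        return INTERNAL_LEXICON[word]
    return EXTERNAL_LEXICON.get(word)


def analyze(text):
    """Return (word, tag) pairs for every token found in a lexicon."""
    pairs = []
    for word in preprocess(text):
        tag = lookup(word)
        if tag is not None:
            pairs.append((word, tag))
    return pairs


print(analyze("Small nodule in the left lung; no opacity."))
```

Words that appear in neither lexicon are simply dropped here; a fuller analyzer would presumably pass them on to stemming or a learned component.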
Searcher Component
Sends query strings to the database
Retrieves the corresponding information
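A minimal sketch of the searcher idea, assuming the extracted data sits in a relational table. The `findings` table, its columns and the sample rows are hypothetical; the slides do not name the database or its schema, so an in-memory SQLite database stands in for it.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'findings' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE findings (report_id INTEGER, anatomy TEXT, finding TEXT)")
    conn.executemany(
        "INSERT INTO findings VALUES (?, ?, ?)",
        [(1, "lung", "nodule"), (2, "liver", "cyst"), (3, "lung", "opacity")],
    )
    return conn


def search(conn, anatomy):
    """Send a parameterised query and return the matching rows."""
    cursor = conn.execute(
        "SELECT report_id, finding FROM findings WHERE anatomy = ? ORDER BY report_id",
        (anatomy,),
    )
    return cursor.fetchall()


conn = build_demo_db()
print(search(conn, "lung"))
```

Passing the search term as a bound parameter, rather than pasting it into the query string, keeps user input from being interpreted as SQL.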
General Structure (cont.)
Preprocessing.
Connecting and using external sources.
Database implementation.
Applying SVM to an unrelated but tagged corpus.
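To illustrate what training an SVM on a tagged corpus involves, here is a small sketch of a linear SVM trained by subgradient descent on the hinge loss. It is a simplified stand-in, not the SVM implementation used in the project: the corpus, the labels (+1 = SPORT, -1 = FINANCE), the stopword list and all hyperparameters are made up for the example.

```python
import re
from collections import defaultdict

STOPWORDS = {"the", "a"}

# Hypothetical tagged corpus from an unrelated domain (+1 = SPORT, -1 = FINANCE).
TRAIN = [
    ("the team won the match", 1),
    ("the player scored a goal", 1),
    ("the bank raised interest rates", -1),
    ("the market closed lower", -1),
]


def features(sentence):
    """Bag-of-words counts with stopwords removed."""
    counts = defaultdict(float)
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word not in STOPWORDS:
            counts[word] += 1.0
    return counts


def score(weights, bias, feats):
    """Linear decision value w.x + b for a sparse feature dict."""
    return sum(weights.get(word, 0.0) * value for word, value in feats.items()) + bias


def train(data, epochs=20, lr=0.1, lam=0.01):
    """Minimise L2-regularised hinge loss with plain subgradient descent."""
    weights, bias = defaultdict(float), 0.0
    for _ in range(epochs):
        for sentence, label in data:
            feats = features(sentence)
            violated = label * score(weights, bias, feats) < 1.0
            for word in list(weights):
                weights[word] *= 1.0 - lr * lam  # shrink step from the L2 term
            if violated:  # hinge loss is active: move towards the example
                for word, value in feats.items():
                    weights[word] += lr * label * value
                bias += lr * label
    return weights, bias


def predict(model, sentence):
    """Return +1 or -1 according to the sign of the decision value."""
    weights, bias = model
    return 1 if score(weights, bias, features(sentence)) >= 0 else -1


model = train(TRAIN)
print(predict(model, "the team scored"), predict(model, "the bank closed"))
```

A real experiment would normally use an established SVM library and a much larger tagged corpus; the point here is only the shape of the training loop.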
Current Status
Mapping Turkish terms to English translations.
Finding the stems of unknown words.
Constructing lexicons.
Features of verbs, adjectives, nouns...
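The stemming and term-mapping steps above could look something like the sketch below. It is a naive suffix stripper, not Zemberek's morphological analysis, and the suffix list and the small Turkish-to-English lexicon are hypothetical examples chosen for illustration.

```python
# Hypothetical data: a tiny suffix list and Turkish-to-English lexicon.
SUFFIXES = ["lerde", "larda", "ler", "lar", "de", "da", "te", "ta"]
TR_TO_EN = {"akciğer": "lung", "karaciğer": "liver", "lezyon": "lesion", "kitle": "mass"}


def stem(word):
    """Strip suffixes until the word is in the lexicon or nothing else matches.

    Input is assumed to be lowercase already.
    """
    current = word
    while current not in TR_TO_EN:
        for suffix in SUFFIXES:
            # Keep at least three characters so short words are not destroyed.
            if current.endswith(suffix) and len(current) > len(suffix) + 2:
                current = current[: -len(suffix)]
                break
        else:
            return current  # no suffix matched: give up with what is left
    return current


def translate(word):
    """Map a Turkish word to its English term via its stem, or None if unknown."""
    return TR_TO_EN.get(stem(word))


for word in ["lezyonlar", "akciğerde", "karaciğerlerde", "tomografide"]:
    print(word, "->", stem(word), "->", translate(word))
```

Blind suffix stripping over-stems some words and ignores vowel harmony, which is one reason a proper morphological analyzer is preferable for Turkish.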
Current Status (cont.)
In the prototype we will be able to:
.. decompose reports into sub-parts, sentences and words
.. analyze words using Zemberek and a stemmer
.. give semantics to words via internal/external lexicons
.. extract simple information using pre-defined templates
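The decomposition and template steps of the prototype might be sketched as below. Everything specific here is an assumption: the `NAME:` section headers, the sample report, and the single template "finding in the anatomy" are invented for illustration, and word analysis with Zemberek is left out.

```python
import re

# Hypothetical template: "<finding> in the [left|right] <anatomy>".
TEMPLATE = re.compile(r"(?P<finding>\w+) in the (?:(?:left|right) )?(?P<anatomy>\w+)")


def decompose(report):
    """Split a report into {section: [[word, ...], ...]} using 'NAME:' headers."""
    sections = {}
    for line in report.strip().splitlines():
        name, _, body = line.partition(":")
        sentences = [s.strip() for s in body.split(".") if s.strip()]
        sections[name.strip()] = [s.split() for s in sentences]
    return sections


def extract(report):
    """Apply the template to each sentence and collect the matched fields."""
    records = []
    for section, sentences in decompose(report).items():
        for words in sentences:
            match = TEMPLATE.search(" ".join(words).lower())
            if match:
                records.append({"section": section, **match.groupdict()})
    return records


REPORT = """
FINDINGS: Small nodule in the left lung. Heart size is normal.
IMPRESSION: Cyst in the liver.
"""
print(extract(REPORT))
```

Sentences that match no template, such as "Heart size is normal", produce no record in this sketch.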