
Tiran Software


-TURKUAZ Project-

RadeX

Tahir Bilal
Onur Deniz
Soner Kara
M. Mert Karadağlı

A: Umut Eroğul
Instructor: Meltem T. Yöndem

Outline

Problem Definition
Important Aspects
Our Approach
General Structure
  Analyzer Component
  Searcher Component
Current Status
Prototype
Tools and Resources
Q/A

Problem Definition

Billions of radiology reports
Unfortunately, they are stored in free-text format
Hard to search and retrieve
Need for searchable information

Important Aspects

Text Mining

NLP
  Information Extraction
  Morphological Analysis
  Named Entity Recognition

Machine Learning
  Neural Networks, Decision Trees ...

Our Approach

RadeX (Radiology Data Extractor) will enable ...

Modular machine learning component

Support for internal/external dictionary connection

Template-based approach for finalizing

General Structure

Analyzer Component
  Preprocesses free text
  Looks up internal and external lexicons
  Gives semantics to words
  Extracts searchable data

Searcher Component
  Sends query strings to the database
  Retrieves the corresponding information
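
The two components above can be illustrated with a small sketch. This is not the RadeX implementation: the lexicon entries, the table layout and the function names (`analyze`, `build_index`, `search`) are assumptions made for illustration, the sample reports are in English, and an in-memory SQLite database stands in for the project database.

```python
import re
import sqlite3

# Hypothetical internal lexicon: term -> semantic class (assumed entries).
LEXICON = {
    "liver": "ANATOMY",
    "kidney": "ANATOMY",
    "cyst": "FINDING",
    "mass": "FINDING",
}

def analyze(report_id, text):
    """Analyzer: preprocess free text, look words up in the lexicon and
    return (report_id, term, semantic_class) rows."""
    tokens = re.findall(r"\w+", text.lower())  # very rough preprocessing
    return [(report_id, t, LEXICON[t]) for t in tokens if t in LEXICON]

def build_index(reports):
    """Store the extracted rows in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE facts (report_id INTEGER, term TEXT, sem_class TEXT)"
    )
    for report_id, text in reports.items():
        db.executemany(
            "INSERT INTO facts VALUES (?, ?, ?)", analyze(report_id, text)
        )
    return db

def search(db, term):
    """Searcher: send a query to the database and return matching report ids."""
    rows = db.execute(
        "SELECT DISTINCT report_id FROM facts WHERE term = ? ORDER BY report_id",
        (term,),
    )
    return [row[0] for row in rows]

reports = {
    1: "A simple cyst is seen in the right kidney.",
    2: "Liver is normal. No mass detected.",
    3: "Cyst in the liver.",
}
db = build_index(reports)
print(search(db, "cyst"))
```

Note that this sketch indexes report 2 under "mass" even though the report negates it; handling such context is exactly the part a plain keyword index cannot do.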

General Structure (cont.)

Current Status

Preprocessing.

Connecting and using external sources.

Database implementation.

Applying SVM to an unrelated but tagged corpus.
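
As an illustration of the SVM step, the sketch below writes tagged samples in the SVM-Light training file layout, where each line is a target (`+1` or `-1`) followed by `index:value` pairs in increasing index order. The feature names and the two samples are invented for the example; the slides do not say which features or which corpus were used.

```python
def build_vocab(samples):
    """Give every distinct feature name a 1-based index, in order of first use."""
    vocab = {}
    for features, _ in samples:
        for name in features:
            vocab.setdefault(name, len(vocab) + 1)
    return vocab

def to_svmlight(samples, vocab):
    """Turn (feature names, boolean label) pairs into SVM-Light training lines."""
    lines = []
    for features, label in samples:
        # SVM-Light expects feature indices in increasing order.
        ids = sorted({vocab[name] for name in features})
        target = "+1" if label else "-1"
        lines.append(" ".join([target] + [f"{i}:1" for i in ids]))
    return lines

# Hypothetical binary features for two tagged tokens.
samples = [
    (["capitalized", "in_lexicon"], True),
    (["in_lexicon", "suffix=de"], False),
]
vocab = build_vocab(samples)
for line in to_svmlight(samples, vocab):
    print(line)
```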

Current Status (cont.)

Mapping Turkish terms to English translations.

Finding stem of unknown words.

Constructing lexicons.

Features of verbs, adjectives, nouns...
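
A minimal sketch of how stem finding and term mapping might fit together, assuming a small hand-made Turkish-to-English lexicon. The fallback shown here (longest lexicon entry that is a prefix of the word) is a stand-in chosen for illustration, not the stemmer used in the project, and `TR_EN`, `find_stem` and `translate` are hypothetical names.

```python
# Hypothetical Turkish -> English term lexicon (assumed entries).
TR_EN = {
    "karaciğer": "liver",
    "böbrek": "kidney",
    "akciğer": "lung",
    "kist": "cyst",
}

def find_stem(word, lexicon):
    """Return the longest prefix of `word` that is a lexicon entry, or None."""
    word = word.lower()
    for end in range(len(word), 0, -1):
        if word[:end] in lexicon:
            return word[:end]
    return None

def translate(word):
    """Map an inflected Turkish word to its English term, if the stem is known."""
    stem = find_stem(word, TR_EN)
    return TR_EN[stem] if stem else None

print(translate("karaciğerde"))  # inflected form of "karaciğer"
print(translate("kistler"))      # plural of "kist"
```

Prefix matching fails when the stem itself changes under suffixation, for example "böbreği" (from "böbrek"), which is one reason a real morphological analyzer is needed.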

In Prototype we will be able to...

... decompose reports into sub-parts, sentences and words,

... analyze words using Zemberek and a stemmer,

... give semantics to words via internal/external lexicons,

... extract simple information using pre-defined templates.
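
The decomposition and template steps listed above could look roughly like the following. Everything specific here is an assumption: the `NAME:` section headers, the English sample report, and the single regular-expression template are invented to show the idea, and the morphological analysis step is left out.

```python
import re

def split_sections(report):
    """Split a report into {section name: text} using 'NAME:' line headers."""
    parts = re.split(r"^([A-Z]+):\s*", report, flags=re.MULTILINE)
    # parts looks like [preamble, name1, body1, name2, body2, ...]
    return {parts[i]: parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}

def split_sentences(text):
    """Split text after sentence-final punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def split_words(sentence):
    """Lower-case a sentence and return its word tokens."""
    return re.findall(r"\w+", sentence.lower())

# Hypothetical template: "<size> mm <finding> in the <location>"
TEMPLATE = re.compile(
    r"(?P<size>\d+)\s*mm\s+(?P<finding>\w+)\s+in\s+the\s+(?P<location>[\w ]+?)[.,]"
)

def extract(sentence):
    """Fill the template from one sentence, or return None if it does not fit."""
    match = TEMPLATE.search(sentence)
    return match.groupdict() if match else None

report = """FINDINGS: There is a 12 mm cyst in the left kidney. The liver is normal.
IMPRESSION: Simple renal cyst."""

sections = split_sections(report)
for sentence in split_sentences(sections["FINDINGS"]):
    print(split_words(sentence), extract(sentence))
```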

Tools & Resources

SVM-Light
WordNet
JWNL
TDK / Zargan
Zemberek
PostgreSQL

Any Questions?