Information Extraction and Knowledge Management in Newspapers

PRESENTED BY: GROUP 1

NIA/DESI/POPON/JIMMY/RASYID

- Information Extraction (IE) is important for archiving data from paper text documents so that the data can be easily maintained and reprocessed

- Gathering the information manually consumes too much time and energy

- To create a system that gathers information and provides it to users in a simple way

- To classify documents in a database

- To provide a good algorithm for information extraction

- To provide an application that helps the Sub-Directorate of Public Opinion, BPS, archive data from newspapers

- Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured documents

- OCR is a system that identifies printed text (typed or produced by a printer) on paper and processes it with a particular algorithm into characters that can be recognized and turned into information

- Document classification is the last step
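To make "structured information from unstructured text" concrete, here is a minimal sketch in Python. It assumes the OCR step has already produced plain text. The sample sentence, the field names, and the two regular expressions are hypothetical and chosen only for illustration; the slides do not specify which fields are extracted or how.

```python
import re

# Hypothetical OCR output for one news item (not taken from the slides).
SAMPLE = ("Jakarta, 12 March 2014. Inflation in February reached "
          "0.26 percent, BPS reported.")

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# A day, a month name, and a four-digit year, e.g. "12 March 2014".
DATE_RE = re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b")
# A number followed by the word "percent", e.g. "0.26 percent".
PERCENT_RE = re.compile(r"(\d+(?:\.\d+)?) percent")


def extract_fields(text):
    """Turn free text into a small structured record (a dict)."""
    date = DATE_RE.search(text)
    return {
        "date": date.group(0) if date else None,
        "percentages": [float(p) for p in PERCENT_RE.findall(text)],
    }


record = extract_fields(SAMPLE)
print(record)
```

Pattern matching like this only covers fields with a predictable surface form; anything looser would need a different technique.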

1. Input (newspaper/e-paper)

2. Cropping + Image Processing

3. OCR

4. Summarizing

5. Classification
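The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the group's implementation: the slides do not name the algorithms used here. Thresholding stands in for image processing, word-frequency sentence scoring stands in for summarizing, and a small multinomial Naive Bayes with add-one smoothing stands in for classification. The OCR step itself is left out; the variable `ocr_text` is a hypothetical stand-in for its output, and the training data is invented.

```python
import math
import re
from collections import Counter


def binarize(gray, threshold=128):
    """Image-processing step: map grayscale pixels (0-255) to 0 (ink) or 1 (paper)."""
    return [[0 if px < threshold else 1 for px in row] for row in gray]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))
    ranked = sorted(sentences,
                    key=lambda s: sum(freq[w] for w in tokenize(s)),
                    reverse=True)
    keep = set(ranked[:max_sentences])
    # Emit the kept sentences in their original order.
    return " ".join(s for s in sentences if s in keep)


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.doc_counts = Counter(labels)
        self.word_counts = {}
        for doc, label in zip(docs, labels):
            self.word_counts.setdefault(label, Counter()).update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for w in tokenize(doc):
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data and OCR output, for illustration only.
clf = NaiveBayes().fit(
    ["prices inflation market rupiah", "match goal team league"],
    ["economy", "sport"],
)
ocr_text = ("Inflation rose in March. Market prices pushed inflation higher. "
            "Officials will meet next week.")

summary = summarize(ocr_text)
category = clf.predict(summary)
print(summary, "->", category)
```

Summing word frequencies favors longer sentences, so a real summarizer would likely normalize by sentence length or use a different scoring scheme.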

[Figure: process flow. Input (newspaper, e-paper, etc.) -> Process (OCR, get information) -> Output (plain text)]


THANK YOU

FOR YOUR ATTENTION
