Open Source eDiscovery


Presentation for Women in eDiscovery, Houston, TX


Open Source eDiscovery

Presentation for "Women in eDiscovery", Houston, TX

12/15/2011

Open source eDiscovery

• Pre-history
• Present capabilities
• Foreseeable future
• Vision

Qualifications

• MS Math
• MS Computer Science
• Mensa, Languages (10)

• Oil: patents, books, awards, software
• Projects...

• JD - eDiscovery
• eDiscovery 1
• eDiscovery 2
• Free Discovery

Following the People with Luck

Watch the people who made it

My first project: writing eDiscovery for 1 computer
Ending with 30

My second project: writing eDiscovery for an unlimited cluster
Ending with Big Data

Big Data! Enter Hadoop

 

Hadoop = Big Data

 

Big Data History

• 2004 - Google reveals their big data technology
• 2005 - It becomes open source with Hadoop
• 2008 - eDiscovery on the cluster
• 2009-2011 - Big Data explosion

Writing a book

Hadoop in Practice for Manning

Getting invited

• YouTube
• Microsoft Bing
• Facebook
• Google
• Yahoo

So what is FreeEed?

• Applied knowledge gained from eDiscovery applications and competitor analysis

• Big Data
• Open source

Built for Big Data

Write the code once, then run it on one computer or on thousands

• One machine
• Many private computers (cluster)
• Many rented Amazon computers
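
The "write once, run on one or many machines" idea can be illustrated with a minimal map/reduce-style sketch. This is not FreeEed's code: the function names (`map_document`, `reduce_counts`, `run`) are hypothetical, and a local thread pool merely stands in for the cluster that Hadoop would provide. The point is that the per-document logic stays the same whether it runs on one worker or many.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_document(text):
    """Map step (hypothetical): count the words in a single document."""
    return Counter(text.lower().split())


def reduce_counts(partials):
    """Reduce step (hypothetical): merge per-document counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run(documents, workers=1):
    """Run the same map/reduce logic on one worker or on several.

    The thread pool is only a stand-in for a cluster; on Hadoop the
    framework would distribute the map step across machines instead.
    """
    if workers == 1:
        partials = [map_document(doc) for doc in documents]
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            partials = list(pool.map(map_document, documents))
    return reduce_counts(partials)


docs = ["the contract", "The email and the contract"]
single = run(docs, workers=1)
many = run(docs, workers=4)
print(single == many)
```

Only the `workers` argument changes between the two calls; `map_document` and `reduce_counts` are untouched.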

What is a cluster

Many computers organized together

What is a Hadoop cluster?

• A group of computers ready to work together
• Hadoop allows them to share the workload
• Fault-tolerant

What is open source?

Many programmers working together

Open source for eDiscovery

• Low cost for the user
• Ideal for in-house implementation
• Better code quality
• Open collaboration
• Fast development using existing open source tools and applications

FreeEed present capabilities

 

• Text extraction
• Culling
• Flexible search syntax
• Scalability
• PDF imaging
• Runs on Windows, Mac, Linux, Hadoop cluster
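
Culling narrows a collection before review. Below is a minimal sketch, assuming culling by date range and keyword; the `Document` fields and the `cull` function are hypothetical illustrations and do not reflect FreeEed's actual API or search syntax.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    """Hypothetical record for one extracted document."""
    custodian: str
    sent: date
    text: str


def cull(documents, keywords, start, end):
    """Keep documents dated within [start, end] that mention any keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    kept = []
    for doc in documents:
        in_range = start <= doc.sent <= end
        has_keyword = any(keyword in doc.text.lower() for keyword in wanted)
        if in_range and has_keyword:
            kept.append(doc)
    return kept


collection = [
    Document("alice", date(2011, 3, 1), "Draft of the merger agreement"),
    Document("bob", date(2009, 6, 5), "Merger rumors"),
    Document("carol", date(2011, 4, 2), "Lunch on Friday?"),
]
responsive = cull(collection, ["merger"], date(2011, 1, 1), date(2011, 12, 31))
print([doc.custodian for doc in responsive])
```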

FreeEed processing stages

  

• Staging, maintaining the integrity of the data
• Processing - text/native/exceptions/pdf
• Review - Concordance/Future review platform
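
One common way to maintain data integrity at the staging step is to record a cryptographic hash of each file and re-check it later. The slides do not say how FreeEed does this; the sketch below assumes SHA-256 from Python's standard `hashlib`, and `fingerprint`, `stage`, and `verify` are hypothetical names.

```python
import hashlib


def fingerprint(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage(paths):
    """Build a manifest mapping each staged path to its hash."""
    return {str(path): fingerprint(path) for path in paths}


def verify(manifest):
    """Return the paths whose current hash no longer matches the manifest."""
    return [path for path, expected in manifest.items()
            if fingerprint(path) != expected]
```

An empty list from `verify` means every staged file still matches the hash recorded at staging time.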

FreeEed screens

Project, Settings, History

FreeEed immediate future - 3 months

• Amazon cloud processing
• Multiple enhancements (imaging, deduping, OCR, etc.)

Next organizational steps

• Support
• Development
• In-house EDD

Exciting future steps

• Enhanced capabilities based on cloud power
• iPad/Chrome tablet eDiscovery
• Big Data technology for review
• Text understanding: predictive coding, automated privilege review, clustering, email chains

 Advanced FreeEed technology will be a powerful weapon in future legal battles
