markkerzner
Presentation for Women in eDiscovery, Houston, TX
Open Source eDiscovery
Presentation for "Women in eDiscovery" Houston, TX
12/15/2011
Open source eDiscovery
• Pre-history
• Present capabilities
• Foreseeable future
• Vision
Qualifications
• MS Math
• MS Computer Science
• Mensa, Languages (10)
• Oil: patents, books, awards, software
• Projects...
• JD - eDiscovery
• eDiscovery 1
• eDiscovery 2
• Free Discovery
Following the People with Luck
Watch the people who made it
My first project: writing eDiscovery for 1 computer
Ending with 30
My second project: writing eDiscovery for an unlimited cluster
Ending with Big Data
Big Data! Enter Hadoop
Hadoop = Big Data
Big Data History
• 2004 - Google reveals their big data technology
• 2005 - It becomes open source with Hadoop
• 2008 - eDiscovery on the cluster
• 2009-2011 - Big Data explosion
Writing a book
Hadoop in Practice for Manning
Getting invited
• YouTube
• Microsoft Bing
• Facebook
• Google
• Yahoo
So what is FreeEed
• Applied knowledge gained from eDiscovery applications and competitor analysis
• Big Data
• Open source
Built for Big Data
Write the code once, make it work either on 1 or on 1000s of computers
• One machine
• Many private computers (cluster)
• Many rented Amazon computers
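A minimal sketch of the "write once, run anywhere" idea, not FreeEed's actual code: the per-document step is written once, and only the runner changes. The names `count_words`, `run_on_one_machine`, and `run_on_many_workers` are hypothetical, and a thread pool stands in for the machines of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def count_words(document: str) -> int:
    # Hypothetical per-document step standing in for real processing.
    return len(document.split())


def run_on_one_machine(step: Callable[[str], int], documents: List[str]) -> List[int]:
    # Process the documents one after another.
    return [step(doc) for doc in documents]


def run_on_many_workers(step: Callable[[str], int], documents: List[str],
                        workers: int = 4) -> List[int]:
    # Threads stand in for cluster nodes; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(step, documents))
```

Both runners accept the same `step`, so the processing logic does not need to know how many workers it runs on.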
What is a cluster
Many computers organized together
What is a Hadoop cluster?
• A group of computers ready to work together
• Hadoop allows them to share the workload
• Fault-tolerant
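Hadoop shares the workload using the MapReduce model. The sketch below is an in-memory toy of that model (a word count), assuming nothing about FreeEed's own jobs; the function names are illustrative. On a real cluster, the map and reduce tasks run on different machines, and a task that fails is rerun elsewhere.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    # Map: emit a (word, 1) pair for every word in one document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Shuffle: group all values that share the same key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    # Reduce: combine the values for each key into one total.
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    pairs: List[Tuple[str, int]] = []
    for doc in documents:
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle(pairs))
```

Because each document is mapped independently, the map step can be split across as many machines as are available.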
What is open source?
Many programmers working together
Open source for eDiscovery
• Low cost for the user
• Ideal for in-house implementation
• Better code quality
• Open collaboration
• Fast development using existing open source tools and applications
FreeEed present capabilities
• Text extraction
• Culling
• Flexible search syntax
• Scalability
• PDF Imaging
• Runs on Windows, Mac, Linux, Hadoop cluster
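Culling means narrowing a collection to the documents worth reviewing. The sketch below illustrates the idea with a keyword and date-range filter; the `Document` fields and the `cull` function are assumptions made for this example and do not reflect FreeEed's search syntax.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass
class Document:
    # Hypothetical minimal record; real collections carry far more metadata.
    name: str
    text: str
    sent: date


def cull(documents: Iterable[Document], keywords: List[str],
         start: date, end: date) -> List[Document]:
    # Keep a document only if it falls in the date range
    # and mentions at least one keyword (case-insensitive).
    wanted = [k.lower() for k in keywords]
    kept = []
    for doc in documents:
        in_range = start <= doc.sent <= end
        has_keyword = any(k in doc.text.lower() for k in wanted)
        if in_range and has_keyword:
            kept.append(doc)
    return kept
```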
FreeEed processing stages
• Staging, maintaining the integrity of the data
• Processing - text/native/exceptions/PDF
• Review - Concordance/Future review platform
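One common way to maintain data integrity during staging is to record a cryptographic hash of every file and compare it again later. The sketch below assumes SHA-256 and uses hypothetical `stage` and `verify` helpers; it is an illustration of the technique, not a description of how FreeEed stages data.

```python
import hashlib
from typing import Dict, List


def fingerprint(data: bytes) -> str:
    # SHA-256 digest as a hex string; any change to the data changes it.
    return hashlib.sha256(data).hexdigest()


def stage(files: Dict[str, bytes]) -> Dict[str, str]:
    # Build a manifest mapping each file name to its fingerprint.
    return {name: fingerprint(content) for name, content in files.items()}


def verify(files: Dict[str, bytes], manifest: Dict[str, str]) -> List[str]:
    # Return the names of files whose content no longer matches the manifest.
    return [name for name, content in files.items()
            if fingerprint(content) != manifest.get(name)]
```

An empty list from `verify` means every file still matches what was staged.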
FreeEed screens
Project, Settings, History
FreeEed immediate future - 3 months
• Amazon cloud processing
• Multiple enhancements (imaging, deduping, OCR, etc.)
Next organizational steps
• Support
• Development
• In-house EDD
Exciting future steps
• Enhanced capabilities based on cloud power
• iPad/Chrome tablet eDiscovery
• Big Data technology for review
• Text Understanding: predictive coding, automated privilege review, clustering, email chains
Advanced FreeEed technology will be a powerful weapon in future legal battles