DESCRIPTION
As presented at All Things Open on 22 October 2014 in Raleigh, NC http://allthingsopen.org/talks/lumify/

Most people interested in analyzing information aren’t data scientists, statisticians, or software engineers. They’re far more likely to have degrees in journalism, political science, or business administration rather than computer science or statistics. If the last math class you took was Algebra 1 during sophomore year of high school, and you’ve deliberately avoided doing any advanced quantitative work in your career, you probably won’t be interested in hearing about algorithms, eigenvectors, or distributional semantics.

This type of analyst thinks in terms of real-world objects, relationships, events, etc. They require tools built with their needs in mind, ideally supporting a variety of use cases centered around the analysis of objects and activities.

Lumify is an open source project to create a big data fusion, analysis, and visualization platform designed for anyone to use. Its intuitive web-based interface helps users discover connections and explore relationships in their data via a suite of analytic options and BI tools, including 2D and 3D graph visualizations, full-text faceted search, dynamic histograms, interactive geographic maps, and collaborative workspaces shared in real-time with cell-level security.

In this presentation, we’ll describe the development of Lumify, showcasing many of the open source tools enabling its processing and analytic capabilities (e.g., Hadoop, Storm, Accumulo, Elasticsearch, Cytoscape, OpenNLP, OpenCV), and our goal of fostering an open source community around Lumify. We’ll conclude with a live demo showing how Lumify can help dismantle one of the world’s deadliest drug cartels.
Lumify is intended for all kinds of users, including:
- business and financial analysts
- law enforcement and legal staff
- intelligence analysts
- investigative journalists
- students and researchers
- genealogists and family historians

More info: http://lumify.io
photo credit: “Journalists Protest against rising violence during march in Mexico” by Knight Foundation (https://flic.kr/p/9u8uRW) CC-BY-SA 2.0
Battling Drug Cartels with Big Data using Lumify
Charlie Greenbacker @greenbacker
Previously Worked For
Father / Son
Individuals Supporting the Narcotics Trafficking Activities of the SINALOA CARTEL and/or ESPARRAGOZA MORENO
Colombian Businesses
Individuals Acting on Behalf of Hugo CUELLAR HURTADO and/or John Fredy CUELLAR SILVA
Mexican Businesses
Colombian and Mexican Family Members
Located at Cl 57 No. 24-72, Bogota, Colombia
Located at Av. Prolongacion Vallarta No. 600, Zona Centro, Tlajomulco de Zuniga, Jalisco C.P. 45640, Mexico
SINALOA CARTEL Leaders (Previously-Designated)
Hugo CUELLAR HURTADO; DOB 18 May 1947; POB Florencia, Caqueta, Colombia; Cedula No. 17622278 (Colombia); C.U.R.P. CUHH470518HNELRG00 (Mexico)
John Fredy CUELLAR SILVA; DOB 17 May 1976; POB Florencia, Caqueta, Colombia; Cedula No. 79904164 (Colombia); R.F.C. CUSJ760517HNE (Mexico)
Ofelia Margarita MIRAMONTES GUTIERREZ; DOB 24 Apr 1968; POB Guadalajara, Jalisco, Mexico; C.U.R.P. MIGO680424MJCRTF03 (Mexico)
Jenny Johanna CUELLAR SILVA; DOB 11 Jul 1980; POB Florencia, Caqueta, Colombia; Cedula No. 52708729 (Colombia)
Victor Hugo CUELLAR SILVA; DOB 18 Oct 1985; POB Bogota, Colombia; Cedula No. 1032359750 (Colombia)
Gabriela AMARILLAS LOPEZ; DOB 21 Sep 1979; POB Culiacan, Sinaloa, Mexico; C.U.R.P. AALG790921MSLMPB09 (Mexico)
Lucy Amparo VARGAS NUNEZ; DOB 26 Mar 1958; POB San Pedro, Valle, Colombia; Cedula No. 38858512 (Colombia)
CASA COMERCIAL UNIQUINCE COMPRAVENTA; Av. 15 No. 124-09 LC 102, Bogota, Colombia; Matricula Mercantil No 00666561 (Colombia)
CASA COMERCIAL ORO RAPIDO; Cra. 11 # 13-28, Girardot, Cundinamarca, Colombia; Matricula Mercantil No 19022 (Colombia)
HOTEL PARAISO RESORT EN ARRENDAMIENTO; Calle 3 No. 1-33/17, Rivera, Huila, Colombia; Matricula Mercantil No 0000104026 (Colombia)
AGRO Y COMERCIO DE SANTABARBARA LAGROMER S. EN C.; NIT # 800016670-7 (Colombia)
INVERSIONES HUNEL LTDA.; NIT # 800223039-6 (Colombia)
COMPANIA AGROCOMERCIAL CUETA S. EN C.; NIT # 800007394-0 (Colombia)
The CUELLAR HURTADO Network operates in Colombia and Mexico. It is primarily invested in agricultural companies and pawn shops.
CASA DE EMPENO GUADALAJARA, S.A. DE C.V. (a.k.a. EMPENOS PRESTAFACIL); R.F.C. CEG-000629-9H7 (Mexico)
Av. Lopez Cotilla No. 100, Guadalajara, Jalisco, Mexico
Av. De La Mancha No. 738, Zapopan, Jalisco, Mexico
PRENDA TODO, S.A. DE C.V. (a.k.a. CASA DE EMPENO PRENDA TODO); Andador Medrano 2845, Guadalajara Centro, Guadalajara, Jalisco, Mexico; Zacarias Rubio No. 1609, San Miguel de Huentitan El Alto, Guadalajara, Jalisco, Mexico; R.F.C. PTO000504DM5 (Mexico)
COOPERATIVA AVESTRUZ CUEMIR, S.C. DE R.L. DE C.V.; Folio Mercantil No. 42877-1 (Mexico)
AGRICOLA Y GANADERA CUEMIR, S.P.R. DE R.I. (a.k.a. RANCHO LA HERRADURA CUEMIR); Folio Mercantil No. 17919-1 (Mexico)
Juan Jose ESPARRAGOZA MORENO
(a.k.a. "El Azul")
Dual Colombian/Mexican Nationals
Ismael ZAMBADA GARCIA
(a.k.a. "El Mayo")
Joaquin GUZMAN LOERA
(a.k.a. "El Chapo")
U.S. Department of the Treasury
Office of Foreign Assets Control
Foreign Narcotics Kingpin Designation Act
February 2014
SINALOA CARTEL
Leonidas VARGAS VARGAS (Major Medellin Cartel Drug Trafficker Murdered in Madrid, Spain - 2009)
(Not Designated)
Mexican authorities arrested Guzman Loera on February 22, 2014 in Mazatlan, Sinaloa.
1.usa.gov/1qORIWu
Information Overload
photo credit: “Paperwork 2” by Issac Bowen (https://flic.kr/p/5ccdaS) CC-BY-SA 2.0
Collaboration is Hard
photo credit: "Jigsaw puzzling at OCP" by Artaxerxes (https://flic.kr/p/5ccdaS) CC-BY-SA 3.0
Budgets Don’t Scale
photo credit: “Usa national debt 20 April 2012” by Valugi (http://bit.ly/1jrvVBC) CC-BY-SA 3.0
open source software
Law Enforcement
photo credit: “CID agent at crime scene” by US Army (http://bit.ly/1wH13mY) Public Domain
Business Analysts
photo credit: “Numbers And Finance” by Ken Teegardin (https://flic.kr/p/9rn9Yh) CC-BY-SA 2.0
Intelligence Analysts
photo credit: “FBI Analyst” by FBI (http://bit.ly/1Cl3I7Q) Public Domain
Research Staff
photo credit: “A researcher at The National Archives in Kew” by the UK National Archives (http://bit.ly/1n9dhR8) CC-BY 3.0
Built on scalable open source tech
Hadoop CDH 4
Accumulo ElasticSearch
tesseract CLAVIN CMU Sphinx OpenNLP OpenCV ffmpeg
Storm
Secure Graph (securegraph.org)
Key Concepts in Lumify
Ontology: structure for organizing information (i.e., your data model)
Entities: any “thing” you want to represent (e.g., person, place, event)
Relationships: a link between two entities (e.g., leader of, works for, sibling of)
Properties: data about an entity (e.g., first name, last name, date of birth)
Graph: collection of entities and the relationships between them
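The five concepts above can be illustrated with a small in-memory model. This is only a sketch in Python, not Lumify's actual API: the class names, the `ontology` dictionary layout, and the identifiers `p1` and `o1` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Any "thing" in the graph, typed by an ontology concept."""
    entity_id: str
    concept: str                      # e.g. "person", "organization"
    properties: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A labeled, directed link between two entities."""
    source_id: str
    label: str                        # e.g. "leader of", "sibling of"
    target_id: str


class Graph:
    """A collection of entities and the relationships between them."""

    def __init__(self, ontology):
        # ontology: maps each allowed concept to its allowed property names
        self.ontology = ontology
        self.entities = {}
        self.relationships = []

    def add_entity(self, entity):
        allowed = self.ontology.get(entity.concept)
        if allowed is None:
            raise ValueError(f"unknown concept: {entity.concept}")
        unknown = set(entity.properties) - set(allowed)
        if unknown:
            raise ValueError(f"properties not in ontology: {sorted(unknown)}")
        self.entities[entity.entity_id] = entity

    def relate(self, source_id, label, target_id):
        if source_id not in self.entities or target_id not in self.entities:
            raise KeyError("both entities must exist before linking them")
        self.relationships.append(Relationship(source_id, label, target_id))

    def neighbors(self, entity_id):
        """Return (label, other_id) pairs for every link touching entity_id."""
        out = []
        for r in self.relationships:
            if r.source_id == entity_id:
                out.append((r.label, r.target_id))
            elif r.target_id == entity_id:
                out.append((r.label, r.source_id))
        return out


# Hypothetical ontology and identifiers, used only for this example.
ontology = {
    "person": {"name", "date_of_birth", "place_of_birth"},
    "organization": {"name"},
}
graph = Graph(ontology)
graph.add_entity(Entity("p1", "person", {"name": "Joaquin Guzman Loera"}))
graph.add_entity(Entity("o1", "organization", {"name": "Sinaloa Cartel"}))
graph.relate("p1", "leader of", "o1")
```

Here the ontology acts as a schema check: an entity with an unknown concept or an undeclared property is rejected before it reaches the graph.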
Cell-level Security in Lumify
Joaquin Guzman Loera
DOB: 1957-04-04
POB: Badiraguato
Nationality: Mexican
Founded: 2010-01-11
Location: Mexico City
Employees: 121
Zarka de Mexico
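One way to picture cell-level security is that every individual property value carries its own access label, so two users looking at the same entity can see different subsets of its properties. The Python sketch below is a simplification under stated assumptions: the `Cell` class, the label names `analyst` and `restricted`, and the "user must hold every required label" rule are hypothetical, and real visibility expressions in a store such as Accumulo are richer than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One property value plus the authorizations required to read it."""
    name: str
    value: str
    required: frozenset = frozenset()   # empty set = visible to everyone


def visible_properties(cells, user_auths):
    """Return only the cells whose required labels the user fully holds."""
    auths = frozenset(user_auths)
    return {c.name: c.value for c in cells if c.required <= auths}


# Hypothetical labels; the property values are the ones shown on the slide.
person = [
    Cell("name", "Joaquin Guzman Loera"),
    Cell("dob", "1957-04-04", frozenset({"analyst"})),
    Cell("nationality", "Mexican", frozenset({"analyst", "restricted"})),
]

print(visible_properties(person, []))
print(visible_properties(person, ["analyst"]))
```

Filtering happens per property rather than per entity, which is the point of the technique: a collaborator without the right authorizations still sees that the entity exists, just not the protected values.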
Text Extraction in Lumify
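[Diagram: input types and the tools that feed the text extractor]
text docs, structured data: extractor
images: OCR (tesseract), then extractor
audio: CMU Sphinx, then extractor
video: CMU Sphinx and OCR (tesseract), then extractor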
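The extraction diagram amounts to routing each input through the tools that can turn it into text. A minimal Python sketch of that routing follows. The functions `ocr` and `transcribe` are placeholders that do not call tesseract or CMU Sphinx, the media type names are hypothetical, and sending video through both stages is an assumption read off the slide rather than a confirmed detail of Lumify.

```python
def ocr(data):
    # Placeholder for an OCR engine such as tesseract.
    return f"<ocr text from {data}>"


def transcribe(data):
    # Placeholder for a speech recognizer such as CMU Sphinx.
    return f"<transcript of {data}>"


def passthrough(data):
    # Text and structured data need no conversion before extraction.
    return data


ROUTES = {
    "text": [passthrough],
    "structured": [passthrough],
    "image": [ocr],
    "audio": [transcribe],
    "video": [transcribe, ocr],   # assumption: audio track plus frames
}


def extract_text(media_type, data):
    """Run every stage registered for this media type and collect the text."""
    try:
        stages = ROUTES[media_type]
    except KeyError:
        raise ValueError(f"unsupported media type: {media_type}") from None
    return [stage(data) for stage in stages]
```

Keeping the routes in a table means a new input type only needs a new entry, not a new code path.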
Text Enrichment in Lumify
• Apache OpenNLP
  • Named Entity Recognition
• CLAVIN
  • Geospatial Entity Resolution
Geospatial Data in Lumify
Structured Data: geotags & coordinates in database records, metadata, etc.
Semi-structured Data: location fields & addresses in spreadsheets, etc.
Unstructured Data: place names mentioned in text documents
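The three kinds of geospatial data differ mainly in how much work it takes to reach a coordinate. The Python sketch below shows the contrast with a toy gazetteer lookup. It is far simpler than a real resolver such as CLAVIN (no disambiguation, no multi-word names), and the `GAZETTEER` table, its approximate coordinates, and the function names are assumptions for illustration.

```python
import re

# Toy gazetteer with approximate coordinates (illustrative only).
GAZETTEER = {
    "bogota": (4.71, -74.07),
    "guadalajara": (20.67, -103.35),
}


def from_structured(record):
    """Structured data: coordinates are already explicit fields."""
    return [(record["lat"], record["lon"])]


def lookup_places(text):
    """Resolve known place names mentioned anywhere in a string.

    Works for a semi-structured location field or address as well as
    for free text, since both are just strings to search.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return [GAZETTEER[w] for w in words if w in GAZETTEER]


# Semi-structured: an address field taken from the chart earlier in the deck.
print(lookup_places("Located at Cl 57 No. 24-72, Bogota, Colombia"))
```

Structured records skip the lookup entirely, while addresses and free text both depend on the quality of the gazetteer behind the resolver.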
try.lumify.io
live demo