What is Lantea
• Open source big data platform
• Rich ETL (Extract-Transform-Load) features
• A platform that helps data scientists collect and process data easily
• Importing data from different sources is straightforward
Highlighted features of Lantea
• Supports many data sources across different media
• Query aggregated data via SQL
• Collect data easily from websites, local file systems, emails and databases
• Export data in many formats and through APIs
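To make "query aggregated data via SQL" concrete, here is a minimal sketch. It does not use Lantea's own API, which these slides do not show. Python's built-in sqlite3 module stands in for the query layer, and the table name `records`, its columns, and the helper `aggregate_by_source` are assumptions made for illustration.

```python
import sqlite3


def aggregate_by_source(rows):
    """Load (source, value) rows into an in-memory table and aggregate them with SQL.

    The table layout is hypothetical; it only illustrates the idea of
    querying collected data with a GROUP BY statement.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE records (source TEXT, value REAL)")
        conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT source, COUNT(*), SUM(value) FROM records "
            "GROUP BY source ORDER BY source"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Sample rows as they might arrive from different collectors (made-up data).
rows = [("website", 10.0), ("email", 2.5), ("website", 5.0), ("database", 7.0)]
summary = aggregate_by_source(rows)
for source, count, total in summary:
    print(source, count, total)
```

Each result row holds the source name, the number of records from that source, and the sum of their values.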
Target Users of Lantea
• Data Scientists
• Marketing Analysts
• Managers who need BI
• Researchers
• Big data/BI Developers
• Machine Learning / Deep Learning Developers
(Diagram: target users grouped as Non-Commercial and Commercial: Researchers, Data Scientists, Big data/BI Developers, Marketing Analysts, Open source developers, Managers who need BI)
Essential Elements of Big Data Platform
• Data/File Extraction
• Data Cleaning and Filtering
• Different ways of Analyzing data
• Real-time Processing
• Data Collection from Different Sources
• Connect to Different Database Types
• Analysis Result Rendering
• Advanced Parameter Adjustment
(Diagram: Big Data, with Extraction, Cleaning, Analysis, Data Processing, Data Collection and Parameter Adjustment)
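The extraction, cleaning and analysis elements listed above can be pictured as a small pipeline. The sketch below is not Lantea code. It uses only the Python standard library, and the function names `extract`, `clean` and `analyze` and the sample CSV are assumptions chosen for illustration.

```python
import csv
import io
from collections import Counter


def extract(raw_text):
    """Extraction: parse CSV text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(records):
    """Cleaning and filtering: normalize the city field, drop records without one."""
    cleaned = []
    for record in records:
        city = (record.get("city") or "").strip().title()
        if city:
            cleaned.append({"city": city})
    return cleaned


def analyze(records):
    """Analysis: count records per city."""
    return Counter(r["city"] for r in records)


# Made-up input with inconsistent casing, stray spaces and a missing value.
RAW = "name,city\nAnn, london\nBob,\nCho,Paris \nDee,LONDON\n"
result = analyze(clean(extract(RAW)))
print(dict(result))
```

Keeping each stage as a separate function means a stage can be swapped, for example a different extractor per file type, without touching the others.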
Third-party Projects Included
• Toxy – Data Extraction framework
• Spidey – Web Spider framework
• EQueue – Queue Implementation
• CacheAdapter – Cache Provider
• Irony – Compiler Implementation
• ServiceStack.Redis – Redis Client
• ScrapySharp – HTML Parser and Selector
• Autofac – IOC Container
• Log4net – Configurable Logging System
• Datatables.js – Web Spreadsheet
• Thinktecture IdentityServer – Social account integration
• Nepy – Parsers for Natural Language Processing
Architecture Design v1
Key Features
• Web Crawling Service
• Data Extraction Service
• Queue Service
• CQLR (Common Query Language Runtime)
• Rich output formats and APIs
• RESTful and OData support
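One way to picture how a queue service can sit between the web crawling service and the data extraction service is the producer/consumer sketch below. It is an assumption about the general pattern, not Lantea's actual design. Python's standard `queue` and `html.parser` modules stand in for the queue and parser components, and the page contents are made up so no network access is needed.

```python
import queue
import threading
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(pages, work_queue):
    """Producer: puts (url, html) pairs on the queue, then a None sentinel."""
    for url, html in pages.items():
        work_queue.put((url, html))
    work_queue.put(None)


def extract_titles(work_queue, results):
    """Consumer: takes pages off the queue and extracts their titles."""
    while True:
        item = work_queue.get()
        if item is None:
            break
        url, html = item
        parser = TitleParser()
        parser.feed(html)
        parser.close()
        results[url] = parser.title.strip()


def run_pipeline(pages):
    """Runs the consumer in a thread while the producer fills the queue."""
    work_queue = queue.Queue()
    results = {}
    consumer = threading.Thread(target=extract_titles, args=(work_queue, results))
    consumer.start()
    crawl(pages, work_queue)
    consumer.join()
    return results


# Hypothetical in-memory "crawled" pages.
PAGES = {
    "http://example.com/a": "<html><head><title>Page A</title></head><body>a</body></html>",
    "http://example.com/b": "<html><head><title> Page B </title></head><body>b</body></html>",
}
titles = run_pipeline(PAGES)
print(titles)
```

The queue decouples the two services: the crawler can keep fetching while extraction catches up, and either side can be scaled or replaced independently.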