Importance of Data Integration for Successful Transformation
Using digitalization in crop protection as an example
Udo Kämpf
SAP Chemicals Conference 2018, Prague
Digital Farming is impacting all areas of agricultural production
Sowing | Soil cultivation | Fertilizing | Logistics | Harvesting | Handling systems | Bio-energy | Inventory | Livestock | Accounting
Smart home | Administration finance | Crop protection | Decision-making support
18.10.2018 Importance of Data Integration for Successful Transformation
Agricultural practices have always been shaped by disruptive technologies
Agriculture 1.0 | Agriculture 2.0 Green Revolution | Agriculture 3.0 Precision Agriculture | Agriculture 4.0 Digital Agriculture | Agriculture 5.0 Autonomous Systems
1880 | 1940 | 1990 | 2018 | +2030
In crop protection, digitalization will make real-time product recommendations possible
Saving time & money, while increasing yields
[Figure: data volume plotted against the time dimension (Yearly, Monthly, Daily, Always) and the spatial dimension (Country, Region, Farm, Field, Plant); 2018 marked]
Data lake
Business needs for digitalization in crop protection
Integrated data are the driver for digitalization in agriculture
Get access to integrated data
• No compromise on data privacy & security
• Data Lake is the single source for integrated data
• Sourcing from and to the respective leading systems (e.g. CRM, master data)
• Updates in near real-time or batch
• Persistence of data
• Provide integrated data for business applications through APIs
• Company-wide governance and continuous development
• Reduce complexity & control cost
• Enable data science
Typical data types
• Weather forecast
• Historic weather
• Commodity prices
• Crop protection: Product master data
• Crop protection: Product efficacy
• Crop protection: Product label data
• Crop protection: Product registration data
• Crop protection: Product recommendation data
• Crops & Crop growing models & Pests
• Disease forecasts / alerts
• CRM & Sales data
• EHS data
Pre-Data Lake simplified landscape
Application-specific data integration
18.10.2018 AP Data Lake 2018 - Project Overview and Charter
Multiple data sources, often inconsistent
• Multiple interfaces
• Multiple data exchange formats
• Many point-to-point connections
• Application-specific data integration
Registration
Active Ingredient
Product Efficacy
Products
Crops
Pest
Advice Area Advisor
Channel synchronization
Maglis
Web CMS
Apps & other 3rd party services
SAP Hybris
Input Data – Crop, Pest, Product, AI, Registration, CRM, Efficacy
Application.
Customer
Datalake Delivery: DevOps SCRUM Process
Requestor | AP Data Strategy Team | AP Data Lake DevOps Teams
Design | Analysis | Continuous Development & Operations
Categorize | Analyze | Prioritize | Monitor | Revise / Approve
Product Backlog | Sprint Planning | Sprint Backlog | Dev. Cycle | Working increment of the software
• Information Demand Definition
• High-level concept
High-level demand qualification:
• First Prioritization (Demand Reassessment)
• Identify required data domain & sources
• Alignment with Data Owners (authorization)
• Early consulting: Y/N
Early consulting:
• Identify dependencies and preconditions
• Define technical data requirements, data catalogue
• If required then POC
• ~ estimate & solution
• Best practice template
• Offer verification
• Confirmation & Prioritization
• Concept Approval ⇒ Complete Data Catalogue
• Verification and approval of concept changes
Data Lake simplified landscape
Data integration for multiple consumers from a single source of reference
• Single interface
• Single well-defined format
• Data source for all digital applications
Multiple Data Sources harmonized into Data Lake
Registration
Competitor Information
Product Efficacy
Corporate Products
Crops
Pest
Advice Area Advisor
Data Lake: Land | Load | Online
Channel synchronization
Input Data – Crop, Pest, Product, AI, Registration, CRM, Efficacy
Maglis
Web CMS
Apps & other 3rd party services
Single Source of Reference
Product Advice
Customer
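The contrast between the application-specific landscape and the Data Lake landscape can be illustrated with simple interface arithmetic. The sketch below is only an illustration under stated assumptions: it assumes the worst case in which every consuming application integrates every source itself, versus one connection per system to a central hub. The function names and the example counts are hypothetical and do not come from the slides.

```python
def point_to_point_interfaces(sources: int, consumers: int) -> int:
    """Worst case (assumption): every consumer integrates every source itself."""
    return sources * consumers


def hub_interfaces(sources: int, consumers: int) -> int:
    """Each source and each consumer keeps a single connection to the hub."""
    return sources + consumers


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the growth pattern.
    n_sources, n_consumers = 7, 4
    print("point-to-point:", point_to_point_interfaces(n_sources, n_consumers))
    print("via data lake: ", hub_interfaces(n_sources, n_consumers))
```

With these assumed counts the point-to-point case needs 28 interfaces and the hub case 11; the gap widens as either side grows, which is one way to read "reduce complexity & control cost".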
Data Lake
Current high-level architecture
Technology | Purpose
HDFS | Primary data store
SAP HANA | Real-time data store - Online API & Reporting
HBase (NoSQL) | Real-time data store - Online API
Spark (in-memory) | Real-time data processing engine
HIVE | Batch data processing engine
Kafka | Live streaming platform
Oozie | Application workflow
API gateway, WebHDFS, HIVE connector, SFTP, Sqoop | Data integration service
Knox, Ranger & API Gateway | Security component
Tableau | Reporting component
R | Data analysis component
ElasticSearch | Search engine
Talend | Master data management tool
[Architecture diagram: Land Zone, Load Zone, Online Store; Real-Time Data Processing; Batch Data Processing; Streaming Platform; Workflow; MDM; Reporting; interfaces: API Gateway, HIVE-JDBC Connector, WebHDFS, SQOOP]
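The flow through the Land Zone, Load Zone and Online Store might be pictured roughly as in the following sketch. It is plain Python rather than the actual Hadoop stack, and it rests on assumptions the slides do not spell out: that the Land Zone keeps records as delivered, that the Load Zone holds them in one harmonized format, and that the Online Store serves keyed lookups to APIs. All field names, the `FIELD_MAP` mapping and the sample records are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from source-specific field names to common names.
FIELD_MAP = {
    "registration": {"prod_no": "product_id", "country": "country"},
    "efficacy": {"ProductID": "product_id", "pest": "pest"},
}


def harmonize(source: str, raw: dict) -> dict:
    """Rename a source's fields to the single well-defined format."""
    return {common: raw[src] for src, common in FIELD_MAP[source].items()}


@dataclass
class DataLake:
    land_zone: list = field(default_factory=list)     # raw records as delivered
    load_zone: list = field(default_factory=list)     # harmonized records
    online_store: dict = field(default_factory=dict)  # keyed for API lookups

    def land(self, source: str, record: dict) -> None:
        """Persist the record unchanged, tagged with its source."""
        self.land_zone.append({"source": source, "raw": record})

    def load(self) -> None:
        """Batch step: harmonize everything currently in the land zone."""
        self.load_zone = [harmonize(e["source"], e["raw"]) for e in self.land_zone]

    def publish(self) -> None:
        """Merge harmonized records per product for online access."""
        for rec in self.load_zone:
            self.online_store.setdefault(rec["product_id"], {}).update(rec)


if __name__ == "__main__":
    lake = DataLake()
    lake.land("registration", {"prod_no": "P-001", "country": "DE"})
    lake.land("efficacy", {"ProductID": "P-001", "pest": "aphids"})
    lake.load()
    lake.publish()
    print(lake.online_store["P-001"])
```

In this sketch a consumer only ever reads `online_store`, so it never needs to know the source-specific field names.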
Data provisioning systems: Registration | Product Master Data | Crops | Pests | CRM | Product Label Data | Product Efficacy
Data consuming systems: Digital Farming App | App5 | App…
Data Lake Challenges
• Deliver while managing dependencies: many different systems have been or are involved, some of them under development while the Data Lake was being built
• Prioritization of data demands: all running projects queued up immediately to be considered
• Not easy to move code between environments (dev, uat, prod) for certain technologies (e.g. SAP HANA)
• Testing & Go-Live: prepare all interfacing systems with test data and a test strategy before testing; align Go-Live dates
• Test approach for developers, business analysts, automated, and on demand
• Managing a data catalogue in Excel is cumbersome
Conclusions
• Data integration is key for digitalization in crop protection
• The data lake approach can help to build new business applications in a legacy-systems world
  • Creating well-documented provisioning and consuming APIs
• Governance is key to avoid creating a data mess
  • Data stewards are key to achieving data quality in the systems to be integrated into the data lake
  • Structured process for integrating new data into the data lake
  • Guided by the data strategy team and data stewards
• A data catalogue / inventory system is essential to maintain an overview of data sources, data consumers, and the respective provisioning and consuming APIs
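As a rough idea of what such a catalogue could track beyond a spreadsheet, the sketch below records each dataset with its source system, data steward, provisioning API and consumers. It is a minimal in-memory illustration, not the system used in the project; the class names, API paths and steward name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    dataset: str
    source_system: str
    data_steward: str
    provisioning_api: str
    consumers: dict = field(default_factory=dict)  # consumer -> consuming API


class DataCatalogue:
    def __init__(self) -> None:
        self._entries: dict[str, CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        """Add a dataset; duplicates are rejected to keep one entry per dataset."""
        if entry.dataset in self._entries:
            raise ValueError(f"dataset already registered: {entry.dataset}")
        self._entries[entry.dataset] = entry

    def add_consumer(self, dataset: str, consumer: str, consuming_api: str) -> None:
        """Record which application reads the dataset and through which API."""
        self._entries[dataset].consumers[consumer] = consuming_api

    def consumers_of(self, dataset: str) -> list[str]:
        """Who is affected if this dataset changes?"""
        return sorted(self._entries[dataset].consumers)

    def datasets_used_by(self, consumer: str) -> list[str]:
        """Which datasets does this application depend on?"""
        return sorted(n for n, e in self._entries.items() if consumer in e.consumers)


if __name__ == "__main__":
    catalogue = DataCatalogue()
    # All values below are hypothetical examples.
    catalogue.register(CatalogueEntry(
        "product_registration", "Registration", "steward-a", "/provision/registration"))
    catalogue.add_consumer(
        "product_registration", "Digital Farming App", "/consume/registration")
    print(catalogue.consumers_of("product_registration"))
    print(catalogue.datasets_used_by("Digital Farming App"))
```

The two query methods answer the impact questions that are hard to answer from a flat Excel sheet: who consumes a dataset, and what a given application depends on.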