• High Rice and Wheat Production
• Extremely High Maternal Mortality, Infant Mortality Rates
• High Malnutrition
Irrigation availability
High-yielding, disease-resistant seeds
Crop rotation literacy
Accessibility to markets
Pulses need more water
Pulses are more susceptible to diseases
Training & educating farmers
Showcase the market to farmers
What can fix it?
Number of land holdings
Size of land holdings
Dispersal of operated area
Tenancy Status
Sources of irrigation
Irrigation status
Cropping Intensity
Crops
Data Acquisition: Data lying in government servers
Agriculture Census: Data and challenges
Data Curation: Single file for each geography and crop
Input Survey: Data and challenges
Data Acquisition: Data lying in difficult government servers
Data Curation: Single file for each geography and crop
Challenges and Solutions
Data Curation: Available in PDF files or multiple poorly formatted Excel files
• India’s population to cross China’s by 2022
• Population to grow up to 1.7b by 2050
• PDF parsing using image recognition
• Data cleaning using pattern recognition
• Scalable scripts for automating thousands of files
Integrated village-wise reports of disease outbreak in India
Bi-weekly
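As an illustration of the "data cleaning using pattern recognition" step listed above, here is a minimal sketch that assumes "pattern recognition" means regular-expression rules applied to cells extracted from PDF or Excel files. The function names `clean_number` and `clean_rows`, the footnote markers, and the rule that unparseable cells become `None` are all assumptions for illustration, not the actual pipeline.

```python
import re

# Assumed pattern for a plain number once separators are stripped.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def clean_number(cell):
    """Return the cell as a float, or None if it is not a number.

    Handles Indian-style digit grouping ("1,23,456"), stray spaces,
    and trailing footnote markers such as "*" (assumed conventions).
    """
    text = cell.strip()
    text = re.sub(r"[*†#]+$", "", text)            # drop trailing footnote markers
    text = text.replace(",", "").replace(" ", "")  # drop digit-group separators
    if _NUMBER.fullmatch(text):
        return float(text)
    return None  # placeholders such as "-" or "N.A." become missing values


def clean_rows(rows):
    """Apply clean_number to every cell of a table given as a list of rows."""
    return [[clean_number(cell) for cell in row] for row in rows]
```

Because the rules live in one small function, the same script can be run over every extracted file in a loop, which is one way to read the "scalable scripts" bullet.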
Challenges and Solutions
Data Stitching
• Phonetic Matching
• Fuzzy Logic Matches
15%: the match rate we got the first time we matched Census village names to DISE village names
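The two data-stitching techniques named above can be combined roughly as follows. This is a hedged sketch, not the method actually used: it assumes standard American Soundex for the phonetic part and `difflib.SequenceMatcher` from the Python standard library for the fuzzy part. The function names, the 0.8 threshold, and the 0.1 phonetic bonus are hypothetical choices for illustration.

```python
import re
from difflib import SequenceMatcher

# Standard Soundex consonant groups; vowels and H, W, Y carry no code.
_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name):
    """Four-character Soundex code of a name (spaces are ignored)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate repeated codes
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code is kept
    return (code + "000")[:4]


def normalize(name):
    """Lower-case, drop punctuation and digits, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z ]", " ", name.lower()).split())


def best_match(name, candidates, threshold=0.8, phonetic_bonus=0.1):
    """Return (candidate, score) for the closest name, or (None, score)."""
    target = normalize(name)
    best, best_score = None, 0.0
    for cand in candidates:
        other = normalize(cand)
        score = SequenceMatcher(None, target, other).ratio()  # fuzzy similarity
        if soundex(target) == soundex(other):
            score = min(1.0, score + phonetic_bonus)  # assumed phonetic boost
        if score > best_score:
            best, best_score = cand, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score
```

For example, `best_match("Rampur Khurd", ["Rampoor Khurd", "Ramgarh", "Sitapur"])` picks "Rampoor Khurd", a spelling variant that an exact string comparison would miss.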