Communications in Computer and Information Science 941

Commenced Publication in 2007
Founding and Former Series Editors: Phoebe Chen, Alfredo Cuzzocrea, Xiaoyong Du, Orhun Kara, Ting Liu, Dominik Ślęzak, and Xiaokang Yang

Editorial Board

Simone Diniz Junqueira Barbosa
Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro, Brazil

Joaquim Filipe
Polytechnic Institute of Setúbal, Setúbal, Portugal

Ashish Ghosh
Indian Statistical Institute, Kolkata, India

Igor Kotenko
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia

Krishna M. Sivalingam
Indian Institute of Technology Madras, Chennai, India

Takashi Washio
Osaka University, Osaka, Japan

Junsong Yuan
University at Buffalo, The State University of New York, Buffalo, USA

Lizhu Zhou
Tsinghua University, Beijing, China

More information about this series at http://www.springer.com/series/7899

Leman Akoglu • Emilio Ferrara • Mallayya Deivamani • Ricardo Baeza-Yates • Palanisamy Yogesh (Eds.)

Advances in Data Science
Third International Conference on Intelligent Information Technologies, ICIIT 2018
Chennai, India, December 11–14, 2018
Proceedings

Editors

Leman Akoglu
Carnegie Mellon University
Pittsburgh, PA, USA

Emilio Ferrara
University of Southern California
Marina Del Rey, CA, USA

Mallayya Deivamani
CEG
Anna University
Chennai, India

Ricardo Baeza-Yates
Northeastern University at Silicon Valley
San Jose, CA, USA

Palanisamy Yogesh
Anna University
Chennai, India

ISSN 1865-0929
ISSN 1865-0937 (electronic)
Communications in Computer and Information Science
ISBN 978-981-13-3581-5
ISBN 978-981-13-3582-2 (eBook)
https://doi.org/10.1007/978-981-13-3582-2

Library of Congress Control Number: 2018962935

© Springer Nature Singapore Pte Ltd. 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

General Chairs’ Preface

On behalf of the Organizing Committee, we are pleased to welcome you to the proceedings of the Third International Conference on Intelligent Information Technologies (ICIIT 2018) organized by the Department of Information Science and Technology, College of Engineering Guindy (CEG), Anna University Chennai, India. ICIIT successfully brought together researchers and developers, with the purpose of identifying challenging problems in recent technologies.

We were delighted to present four outstanding keynote speakers: Dr. C. Mohan from IBM Almaden Research Center, USA; Prof. Jure Leskovec from Stanford University, USA; Prof. Raj Reddy from Carnegie Mellon University, USA; and Dr. Rajeev Rastogi from Amazon, India.

ICIIT 2018 featured a signature event – “Industry Day” – to share practices among academia and industry. The industry keynote speakers were: Dr. Lipika Dey from Tata Consultancy Services, India; Amruta Joshi from Google, India; Hari Vasudev from Walmart Labs, India; and Ravi Vijayaraghavan from Flipkart, India.

We are grateful to the many authors who submitted their work to the ICIIT technical program. The Program Committee was led by Leman Akoglu and Emilio Ferrara. A report on the paper selection process appears in the PC Chairs’ Preface.

We also thank the other chairs in the organization team: Prof. Saswati Mukherjee for acting as a convener of the conference; Dr. Mallayya Deivamani for publicizing the event to attract submissions and for managing the website, handling the proceedings process, and the local arrangements, thus ensuring the conference ran smoothly; Prof. Swamynathan Sankara Narayanan for acting as finance chair; Dr. Muthusamy Chelliah, Flipkart, India for acting as panel chair; and Dr. Shalini Urs, MYRA School of Business, Mysuru, India for acting as tutorial chair.

We are grateful to the sponsors of the conference, Indian Space Research Organization (ISRO) and Flipkart India, for their generous sponsorship and support. We would also like to express our gratitude to the College of Engineering Guindy (CEG), Anna University Chennai for hosting and organizing this conference. Last but not least, our sincere thanks go to all the local team members and volunteer helpers for their hard work to make the event possible. We hope you enjoy the proceedings of ICIIT 2018.

Ricardo Baeza-Yates
Palanisamy Yogesh

PC Chairs’ Preface

On behalf of the Program Committee, it is our pleasure to present to you the proceedings of the International Conference on Intelligent Information Technologies (ICIIT 2018) held during December 11–13, 2018, at the College of Engineering Guindy, Anna University Chennai, India. ICIIT 2018 acted as a forum for researchers, scientists, academics and industrialists to present their latest research results and research perspectives on the conference theme, “Data Science and Analytics.”

The conference received 74 submissions from all over the world. After a rigorous peer-review process involving 235 reviews in total, 15 full-length articles were accepted for oral presentation and for inclusion in the CCIS proceedings. This corresponds to an acceptance rate of 20% and is intended to maintain the high standards of the conference proceedings. The papers included in this CCIS volume cover a wide range of topics in data science foundations, data management and processing technologies, and data analytics and its applications.

The pre-conference tutorials conducted on December 10, 2018, covered the thrust areas of data science and analytics. The technical program started on December 11, 2018, and continued for two days. Non-overlapping oral and poster sessions ensured that all attendees had the opportunity to interact personally with presenters. The conference featured distinguished keynote speakers: Dr. Mohan Chandrasekaran of IBM, USA; Prof. Jure Leskovec of Stanford University, USA; Prof. Raj Reddy of Carnegie Mellon University, USA; and Dr. Rajeev Rastogi of Amazon, India.

We take this opportunity to thank the authors of all submitted papers for their hard work, adherence to the deadlines, and patience with the review process. The quality of a refereed volume depends mainly on the expertise and dedication of the reviewers. We are thankful to the reviewers for their timely effort and help to make this conference successful. We thank Prof. Ricardo Baeza-Yates of NTENT and Northeastern University at SV, USA, and Prof. Palanisamy Yogesh of Anna University, Chennai, for providing valuable guidelines and inspiration to overcome various difficulties in the process of organizing this conference as general co-chairs. We would like to thank the track chairs – Dr. Suren Byna of Lawrence Berkeley National Laboratory, USA; Dr. Amruta Joshi of Google, India; Dr. Eleanor Loh of Deliveroo, UK; Dr. Mallayya Deivamani of College of Engineering Guindy (CEG), India; Dr. Moumita Sinha of Adobe, USA; and Dr. Ravi Vijayaraghavan of Flipkart, India – for their effort toward the review process of ICIIT 2018. We thank Prof. Saswati Mukherjee and Prof. Swamynathan S. for their endless effort in all aspects as conference convener and finance chair, respectively. For the publishing process at Springer, we would like to thank Leonie Kunz, Yeshmeena Bisht, Suvira Srivastav, and Nidhi Chandhoke for their constant help and cooperation.

Our sincere and heartfelt thanks go to Prof. M. K. Surappa, Vice-chancellor of Anna University, Chennai, Prof. J. Kumar, Registrar of Anna University, Chennai, and Prof. Geetha T. V., Dean, College of Engineering Guindy (CEG), Anna University, for their support toward ICIIT 2018 and for providing the infrastructure at CEG to organize the conference. We are indebted to the faculty, staff, and students of the Department of Information Science and Technology for their tireless efforts that made ICIIT 2018 at CEG possible. We would also like to thank the sponsors Indian Space Research Organization (ISRO) and Flipkart for their support. We would also like to thank the participants of this conference, who put the conference above all hardships. In addition, we would like to express our appreciation and thanks to all the people whose efforts made this conference a grand success.

December 2018
Leman Akoglu
Emilio Ferrara

Organization

ICIIT 2018 was organized by the Department of Information Science and Technology,College of Engineering Guindy, Anna University, Chennai, India.

Chief Patron

Surappa M. K. (Vice-chancellor)    Anna University, India

Patrons

Kumar J. (Registrar)    Anna University, India
Geetha T. V. (Dean)    CEG, Anna University, India

General Co-chairs

Ricardo Baeza-Yates    NTENT and Northeastern University, USA
Palanisamy Yogesh    CEG, Anna University, India

Program Co-chairs

Leman Akoglu    Carnegie Mellon University, USA
Emilio Ferrara    University of Southern California, USA

Track Chairs

Suren Byna    Lawrence Berkeley National Laboratory, USA
Amruta Joshi    Google, India
Eleanor Loh    Deliveroo, UK
Mallayya Deivamani    CEG, Anna University, India
Moumita Sinha    Adobe, USA
Ravi Vijayaraghavan    Flipkart, India

Convener

Saswati Mukherjee    CEG, Anna University, India

Local Arrangements Chair and Proceedings Chair

Mallayya Deivamani    CEG, Anna University, India

Finance Chair

Swamynathan Sankara Narayanan    CEG, Anna University, India

Panel Chair

Muthusamy Chelliah    Flipkart, India

Tutorial Chair

Shalini Urs    MYRA School of Business, Mysuru, India

Organizing Committee

Ranjani Parthsarathi
Uma G. V.
Indhumathi J.
Sridhar S.
Vani K.
Geetha Ramani R.
Mala T.
Sendhil Kumar S.
Kulothungan K.
Vijayalakshmi M.
Thangaraj N.
Indra Gandhi K.
Uma E.
Abirami S.
Geetha P.
Vidhya K.
Selvi Ravindran
Muthuraj R.
Bama Srinivasan
Sairamesh L.
Pandiyaraju V.
Vijaykumar T. J.
Narashiman D.
Prabhavathy P.
Shunmuga Perumal P.
Ezhilarasi V.
Tina Esther Trueman
Kanimozhi S.
Sindhu T.
Senthilnayaki B.
Jasmine R. L.
Riasudheen H.
Yuvaraj B. R.
Mohana Bhindu K.
Mahalakshmi G.

Technical Review Board

Abdullah Tansel    The City University of New York, USA
Akhilesh Bajaj    The University of Tulsa, Oklahoma
Alberto Cano    Virginia Commonwealth University, USA
Alfredo Cuzzocrea    University of Trieste, Italy
Andrea Clematis    IMATI-CNR, Italy
Antonis Sidiropoulos    Aristotle University of Thessaloniki, Greece
Azad Naik    Microsoft Research, USA
Bharath Balasubramanian    AT&T Labs, USA

Bolong Zheng    Aalborg University, Denmark
Chitra Babu    SSN College of Engineering, India
David Lillis    University College Dublin, Ireland
Dharavath Ramesh    Indian Institute of Technology – Dhanbad, India
Dhiraj Sangwan    CSIR-CEERI, Pilani, India
Ehsan Ullah    Qatar Computing Research Institute, Qatar
Eleni Mangina    University College Dublin, Ireland
Felix Gessert    University of Hamburg, Germany
Feng Yan    University of Nevada, Reno, USA
G. C. Nandi    IIIT – Allahabad, India
Grigori Sidorov    National Polytechnic Institute, IPN, Mexico
Guilherme Desouza    University of Missouri, USA
Gunter Saake    University of Magdeburg, Germany
Han Fang    Facebook, USA
Hasan Kurban    Indiana University, USA
Hiba Arnaout    Saarland Informatics Campus, Germany
Houjun Tang    Lawrence Berkeley National Lab (LBNL), USA
Hu Chun    Google Inc., USA
Jang Hyun Kim    Sungkyunkwan University, South Korea
Jay Lofstead    Sandia National Laboratories, California, USA
Jesús Camacho-Rodríguez    Hortonworks Inc., USA
Jian Wu    The Pennsylvania State University, USA
Jiawen Yao    University of Texas at Arlington, USA
Jingchao Ni    The Pennsylvania State University, USA
Ka-Chun Wong    City University of Hong Kong, SAR China
Kalidas Yeturu    Indian Institute of Technology – Tirupati, India
Kanchana R.    SSN College of Engineering, India
Krishnaprasad Thirunarayan    Kno.e.sis Center, Wright State University, USA
Laurent Anne    University of Montpellier, LIRMM, CNRS, France
Laurent D’Orazio    University of Rennes, France
Li-Shiang Tsay    North Carolina A&T State University, USA
Liting Hu    Florida International University, USA
Manar Mohammed    Miami University, USA
Manas Gaur    Kno.e.sis Center, Wright State University, USA
Manoj Thuasidas    Singapore Management University, Singapore
Marijn Ten Thij    VU University Amsterdam, The Netherlands
Mehmet Dalkilic    Indiana University, USA
Michele Melchiori    University of Brescia, Italy
Mike Jackson    Birmingham City University, USA
Mingjie Tang    Hortonworks, USA
Mirco Schoenfeld    HFP/TU Munich, Germany
Mohammad Haque    The University of Newcastle, Australia
Murat Ünalır    Ege University, Turkey
Pradeep Kumar    IIM – Lucknow, India

Pramod Kumar Singh    ABV-IIITM Gwalior, India
Prateek Jain    Nuance Research Lab, Sunnyvale, USA
Prem Jayaraman    Swinburne University of Technology, Australia
Qing Liu    Hong Kong Baptist University, SAR China
Rouzbeh Shirvani    Howard University, USA
Rukshan Athauda    The University of Newcastle, Australia
Ruppa Thulasiram    University of Manitoba, Canada
Sabina Petride    Oracle Labs, USA
Samiulla Z. Shaikh    IBM Research Bangalore, India
Sandjai Bhulai    VU University Amsterdam, The Netherlands
Sanjay Singh    CSIR-CEERI, Pilani, India
Santosh Singh Rathore    NIT – Jalandhar, India
Scott Klasky    Oak Ridge National Laboratory, USA
Sergio Greco    University of Calabria, Italy
Seung-Hwa Chung    Bennett University, India
Shelly Sachdeva    National Institute Technology – Delhi, India
Sicong Zhang    Facebook, USA
Sofian Maabout    University of Bordeaux, France
Somyava Das    Teradata-Aster, USA
Sourav Sen Gupta    Nanyang Technological University, Singapore
Sraban Kumar Mohanty    IIIT – Jabalpur, India
Subhash Bhalla    Indian Institute of Technology – Delhi, India
Sujala Shetty    BITS Pilani, Dubai Campus, UAE
Sukhamay Kundu    Louisiana State University, USA
Suren Byna    Lawrence Berkeley National Laboratory, USA
Swapna Gottipati    Singapore Management University, Singapore
T. S. Narayanan    IIITDM Kancheepuram, India
T. T. Mirnalinee    SSN College of Engineering, India
Tanmoy Chakraborty    IIIT – Delhi, India
Vijaya Saradhi V.    Indian Institute of Technology – Guwahati, India
Venkatesh-Prasad Ranganath    Kansas State University, USA
Vijil Chenthamarakshan    IBM AI Research, USA
Wellington Cabrera    Teradata, USA
Wilfred W. Godfrey    ABV-IIITM Gwalior, India
Xiaomo Liu    Thomson Reuters Research, USA
Xin Cao    University of New South Wales, Australia
Xiong Haoyi    Missouri University of Science and Technology
Yi-Shin Chen    National Tsing Hua University, Taiwan
Zhiming Zhao    University of Amsterdam, The Netherlands

Additional Reviewers

Bing Xie    Oak Ridge National Lab, USA
Dongkuan Xu    The Pennsylvania State University, USA
Felix Enigo    SSN College of Engineering, India
Kumar Vinayak    IIT Hyderabad, India
Özgü Can    Ege University, Turkey
Udit Arora    IIITD, India
Weiqing Yang    Hortonworks, USA
Yanbo Liang    Hortonworks, USA
Yongyang Yu    Purdue University, USA
Yu Hao    UNSW, Australia

Keynotes

Blockchains Untangled: Public, Private, Smart Contracts, Applications, Issues

C. Mohan

IBM Almaden Research Center, USA

Abstract. The concept of a distributed ledger was invented as the underlying technology of the public or permissionless Bitcoin cryptocurrency network. But the adoption and further adaptation of it for use in the private or permissioned environments is what I consider to be of practical consequence and hence only such private blockchain systems will be the focus of this talk.

Computer companies like IBM, Intel, Oracle, Baidu and Microsoft, and many key players in different vertical industry segments have recognized the applicability of blockchains in environments other than cryptocurrencies. IBM did some pioneering work by architecting and implementing Fabric, and then open sourcing it. Now Fabric is being enhanced via the Hyperledger Consortium as part of The Linux Foundation. There is a great deal of momentum behind Hyperledger Fabric throughout the world. Other private blockchain efforts include Enterprise Ethereum, Hyperledger Sawtooth and R3 Corda.

While currently there is no standard in the private blockchain space, all the ongoing efforts involve some combination of persistence, transaction, encryption, virtualization, consensus and other distributed systems technologies. Some of the application areas in which blockchain systems have been leveraged are: global trade digitization, derivatives processing, e-governance, Know Your Customer (KYC), healthcare, food safety, supply chain management and provenance management.

In this talk, I will describe some use-case scenarios, especially those in production deployment. I will also survey the landscape of private blockchain systems with respect to their architectures in general and their approaches to some specific technical areas. I will also discuss some of the opportunities that exist and the challenges that need to be addressed. Since most of the blockchain efforts are still in a nascent state, the time is right for mainstream database and distributed systems researchers and practitioners to get more deeply involved to focus on the numerous open problems. Extensive blockchain-related collateral can be found at http://bit.ly/CMbcDB.

Graph Representation Learning

Jure Leskovec

Stanford University, USA

Abstract. Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. However, traditional machine learning approaches have relied on user-defined heuristics to extract features encoding structural information about a graph. In this talk I will discuss methods that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning and nonlinear dimensionality reduction. I will provide a conceptual review of key advancements in this area of representation learning on graphs, including random-walk based algorithms and graph convolutional networks.
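
As an illustration only (it is not taken from the talk), the sketch below shows the sampling stage that random-walk based embedding methods typically start from: short uniform random walks are drawn from every node, and the resulting node sequences are then usually fed to a sequence-embedding model, which is omitted here. The toy graph and the names `random_walk`, `sample_walks` and `toy_graph` are hypothetical and chosen for this example.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Return a uniform random walk of at most `length` nodes from `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def sample_walks(adjacency, walks_per_node, length, seed=0):
    """Draw `walks_per_node` walks starting at every node of the graph."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a fresh random order
        for node in nodes:
            walks.append(random_walk(adjacency, node, length, rng))
    return walks


# Hypothetical toy graph: two triangles joined by the edge c-d.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}

walks = sample_walks(toy_graph, walks_per_node=2, length=5)
```

Nodes that co-occur often in such walks tend to sit close together in the graph, which is the signal an embedding model would then try to preserve in the low-dimensional space.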

AI: Background, History and Future Opportunities

Raj Reddy

Carnegie Mellon University, USA

Abstract. This talk will provide the background and history of AI in an attempt to clarify the sources of misinformation about AI in the media recently. Many of these predictions are based on flawed reasoning and incorrect extrapolations and will not happen. Robots will not take over the world. In this talk, we will review tools, techniques and advances in AI over the past half century and explore what might be next. We will discuss how these Intelligent Agents will lead to a Knowledge as a Service (KaaS) industry and create a marketplace for apps that provide KaaS.

Machine Learning @ Amazon

Rajeev Rastogi

Amazon, India

Abstract. In this talk, I will first provide an overview of key problem areas where we are applying Machine Learning (ML) techniques within Amazon such as product demand forecasting, product search, and information extraction from reviews, and associated technical challenges. I will then talk about three specific applications where we use a variety of methods to learn semantically rich representations of data: question answering where we use deep learning techniques, product size recommendations where we use probabilistic models, and fake reviews detection where we use tensor factorization algorithms.

Industrial Invited Talks

Designing Automated Decision-Making Systems – Today, Tomorrow and the Day After

Lipika Dey

Tata Consultancy Services, India

Abstract. Recent advances in AI technologies backed by the success of Siri, Facebook, Google and Alexa have raised the expectations of Enterprises to automate several routine tasks that involve dealing with unstructured information from Video, Speech and Text data and heretofore were exclusively dealt with by human agents. While there is indeed enough scope to employ AI technologies in conjunction with Data Analytics, designing such systems needs to address many more issues than the accuracy and performance of the underlying analytics algorithms. Besides addressing increasing concerns about privacy and security for such systems, ensuring learnability, trustability and explainability are some of the crucial questions that are increasingly coming up. This talk will discuss these concerns and some attempts that are being made to bring order to the madness.

Data Sciences in the Cloud

Amruta Joshi

Google, India

Abstract. With growing awareness of the importance of Data Sciences, the field is getting more and more sophisticated. We now regularly process larger amounts of data and use more complex algorithms. This new landscape comes with its own set of challenges: it requires large-scale infrastructure, robust multi-level security, and more sophisticated multi-feature algorithms to process the data. Cloud provides powerful tools to tackle all of these challenges. In this talk we will see how Data Science and Cloud work together to help you manage your big data and convert it into meaningful insights at large scale.

Using Big Data to Change the Way the World Shops

Hari Vasudev

Walmart Labs, India

Abstract. With the growing convergence of Big Data, ML, AI and Cloud, there is a huge transformation underway in Retail today that is impacting Customers, Merchants, global Supply Chains and Employees. Walmart’s vision for Big Data is to deliver data-driven experiences that help customers save money (and time) so they can live better. In this talk, we’ll explore how Big Data algorithms and sciences are being used at the world’s largest retailer to fundamentally change the way the world shops and deliver unmatched omni-channel shopping experiences.

Analytics and Decision Sciences for Ecommerce - An Overview and Use Cases on Pricing and Selection

Ravi Vijayaraghavan

Flipkart, India

Abstract. The Analytics and Decision Sciences organisation at Flipkart has the charter of leveraging science to enable robust data-driven decision-making. Key focus areas of this organisation are business growth and continuously improving customer experience. My talk will present the overall landscape of the Analytics organisation and the areas we cover.

Two key expectations of any consumer from an ecommerce platform are –

1. The availability of a really wide assortment/selection of products and
2. A price that is true value for money

Following the overview, I will present use cases related to applications of statistics, machine learning and optimisation in addressing these two expectations of our consumers.

Adoption of Analytics in Engineering

Srinath Jangam

L&T Construction, Inc., India

Abstract. Artificial intelligence is poised to unleash the next wave of digital disruption, and companies should prepare for it now. There are real-life benefits for a few early adopting firms, making it more important than ever for all sectors to accelerate their digital/IoT transformations and AI adoption. The five AI technology systems to focus on are robotics and autonomous vehicles, computer vision, language, virtual agents, and machine learning, which includes deep learning and underpins many recent advances in the other AI technologies. Early evidence suggests that AI can deliver real value to serious adopters and can be a powerful force for disruption. Early AI adopters that combine strong digital/IoT capability with proactive strategies have higher profit margins and expect the performance gap with other firms to widen in the future. AI promises benefits, but also poses urgent challenges that cut across firms, developers, government, and workers. The workforce needs to be reskilled to exploit AI rather than compete with it.

Tutorials

Big Data or Right Data? Opportunities and Challenges

Ricardo Baeza-Yates

NTENT and Northeastern University at SV, USA

Abstract. Big data nowadays is a fashionable topic, independently of what people mean when they use this term. But being big is just a matter of volume, although there is no clear agreement on the size threshold. On the other hand, it is easy to capture large amounts of data using a brute force approach. So, the real goal should not be big data but to ask ourselves, for a given problem, what is the right data and how much of it is needed. For some problems, this would imply big data, but for most problems much less data is needed. Hence, in this presentation, we cover the opportunities and the challenges behind big data. Regarding the challenges, we explore the trade-offs involved with the main problems that arise with big data: scalability, redundancy, bias, the filter bubble and privacy.

Deep Learning Models for Image Processing Tasks

C. Chandra Sekhar

Indian Institute of Technology Madras, India

Abstract. Shallow learning models based on conventional machine learning techniques for pattern classification, such as Gaussian mixture models, multilayer feedforward neural networks and support vector machines, use hand-picked features as input to the models. Recently, several deep learning models have been explored for learning a suitable representation from the image data and then using the learnt representation for performing image pattern analysis tasks such as image classification, annotation and captioning. In this talk, we present deep learning models such as the stacked autoencoder, the deep convolutional neural network and the stacked restricted Boltzmann machine for learning a suitable representation from the image data. Then, we present deep learning model-based approaches to image classification, image annotation and image captioning.

Signal Processing Guided Machine Learning

Hema A. Murthy

Indian Institute of Technology Madras, India

Abstract. Machine learning has become ubiquitous today. Big data analytics has become the buzzword. Build, train, and deploy/transfer is the paradigm that has become the “mantra” today. The more data available, the more robust the systems are at prediction. The major problem with machine learning is that of obtaining huge amounts of curated data. In the context of speech technologies, in a country like India with a large linguistic diversity, getting data that is accurate for training is difficult. Another issue is that of simultaneous collection of data in multiple languages. Is there a way to reduce the amount of data required for training a machine learning system? In this tutorial, we show how signal processing can be used to guide machine learning algorithms. In particular, we study problems in speech synthesis, recognition, Indian music analysis, and computational brain research, where efforts are made to first process the signal before subjecting it to machine learning. Signal processing yields accurate results in the particular, while it may lead to a large number of insertions, deletions, and substitutions. Using machine learning and signal processing in tandem, we show that the amount of data required for training systems can be reduced significantly.

Visual Analytics: “Bringing Data to Life”

Jaya Sreevalsan Nair

International Institute of Information Technology Bangalore, India

Abstract. John Tukey, the mathematician, once said the following about analytics: “This is my favorite part about analytics: Taking boring flat data and bringing it to life through visualization.” It remains true to a great extent even today, in the time of big data. The objective of this tutorial is to impress upon the audience the need for visualization as an essential part of larger data science workflows. Visualization in itself has evolved from providing summaries to facilitating complex exploratory analysis of data. This tutorial will demonstrate how data can be formatted to make the best use of some of the time-tested visualization techniques, and how visualizations enable the overall data analysis.

Biological/Genomic Data Science: Moving Beyond Correlation to Causation

Manikandan Narayanan

Indian Institute of Technology Madras, India

Abstract. Discovering causal relations in a complex system is a fundamental pursuit in many sciences and disciplines. When controlled intervention experiments to determine cause and effect are not feasible or ethical, causal inference is surprisingly possible from observational data alone. Its theory (models/assumptions/language) and practice (concrete applications in biology/medicine) are the focus of this tutorial. You will find this tutorial appealing if you find causal inference from observational data intriguing (e.g., how can one break the symmetry of an observed correlation between two variables to determine the causal direction, or sever the links to not only known but also unknown confounding factors?) and valuable (in terms of its broad applications, including bioinformatics applications ranging from identifying causal risk factors of human diseases to gene regulatory networks underlying living cells).

We will start with causal discovery between two variables using the so-called mediation-based and Mendelian Randomization (MR) approaches that are analogous to Randomized Controlled Trials popularized by Ronald Fisher, and then move on to multivariate causal discovery using the framework of Bayesian networks and do-calculus pioneered by Judea Pearl. We intend to cover modern developments and data resources that aid causal discovery from biomedical/genomic data (for instance, one recent resource pools 11 billion correlations between genetic variants and health/disease-related outcomes from genome-wide association studies, which are waiting to be mined for new causal factors for human health and disease).

All relevant biology and causality concepts will be introduced. A basic knowledge of probability/statistics is assumed.

Social Network Analysis: Making the Invisible Visible

Shalini R. Urs

MYRA School of Business, Mysore, India

Abstract. Over the past decade, there has been a growing public fascination with the complex “connectedness” of modern society, especially since the emergence of social networking sites. Whether it is the rapid spread of news, the tipping point of social/political movements gathering momentum, or the cascading of epidemics and financial crises around the world with alacrity and intensity, all are attributed to the connectedness of today’s society. Many scientific disciplines have come together and evolved into a new discipline of network science focused on understanding how these complex connected systems operate. Social Network Analysis (SNA) has emerged as an approach and a tool to uncover and understand the hidden side of connections. This tutorial will introduce the basic concepts of a network, their attributes and their measures such as Centrality, Components, Cohesion, Geodesic, Density and Degree, Cores, Cliques and others. We will also introduce graph theory, which underpins network science and serves as a primary tool in the broader examination of networks. With the help of examples from across different domains, we will help participants understand the dynamics of social networks and how this understanding can be used, from uncovering terrorist networks to “The Network of Global Corporate Control.” This tutorial will introduce the participants to some of the essential software tools such as Gephi, Pajek, NodeXL, Cytoscape and NetworkX. Participants will be shown how to install the tools, import data and analyze the network with the help of examples. A comparison of these five software tools concerning features and performance will be presented.
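As a rough illustration of a few of the measures named in the abstract (Degree, Density, degree Centrality and Geodesic distance), the following minimal sketch computes them for a small undirected network using only the Python standard library. The five-node network and all function names are hypothetical and chosen for illustration; they are not taken from the tutorial. Tools such as NetworkX provide comparable functionality out of the box.

```python
from collections import deque

# Hypothetical toy network (undirected ties), invented for illustration only.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree(adjacency):
    """Degree: the number of ties each node has."""
    return {node: len(nbrs) for node, nbrs in adjacency.items()}


def density(adjacency):
    """Density: existing ties divided by all possible ties, 2m / (n(n-1))."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    return 2 * m / (n * (n - 1))


def degree_centrality(adjacency):
    """Degree centrality: degree normalised by the maximum possible degree, n-1."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def geodesic_length(adjacency, source, target):
    """Geodesic: length of a shortest path, found by breadth-first search.

    Returns None when either node is unknown or no path exists.
    """
    if source not in adjacency or target not in adjacency:
        return None
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distance[node]
        for nbr in adjacency[node]:
            if nbr not in distance:
                distance[nbr] = distance[node] + 1
                queue.append(nbr)
    return None


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    print("degree:", degree(adj))
    print("density:", density(adj))
    print("degree centrality:", degree_centrality(adj))
    print("geodesic A-E:", geodesic_length(adj, "A", "E"))
```

In this toy network, node C has the most ties and therefore the highest degree centrality, which is the kind of structural position that such measures are meant to surface.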

Contents

Data Science Foundations

GCRITICPA: A CRITIC and Grey Relational Analysis Based Service Ranking Approach for Cloud Service Selection . . . . . . . . . . 3

Gireesha Obulaporam, Nivethitha Somu, Gauthama Raman Mani Iyer Ramani, Akshya Kaveri Boopathy, and Shankar Sriram Vathula Sankaran

Cloud Enabled Intrusion Detector and Alerter Using Improved Deep Learning Technique . . . . . . . . . . 17

K. Kanagaraj, S. Swamynathan, and A. Karthikeyan

Temporal and Stochastic Modelling of Attacker Behaviour . . . . . . . . . . 30

Rahul Rade, Soham Deshmukh, Ruturaj Nene, Amey S. Wadekar, and Ajay Unny

Automatic License Plate Recognition Using Deep Learning . . . . . . . . . . 46

Bhavin Dhedhi, Prathamesh Datar, Anuj Chiplunkar, Kashish Jain, Amrith Rangarajan, and Jayshree Kundargi

Data Management and Processing Technologies

Towards Reliable Storage for Cloud Systems with Selective Data Encryption and Splitting Strategy . . . . . . . . . . 61

Z. Asmathunnisa and P. Yogesh

Smart Solar Energy Based Irrigation System with GSM . . . . . . . . . . 75

C. Bhuvaneswari, K. Vasanth, S. M. Shyni, and S. Saravanan

Data Analytics and its Applications

Understanding the Role of Visual Features in Emoji Similarity . . . . . . . . . . 89

Sunny Rai, Apar Garg, and Shampa Chakraverty

Semantic Network Based Cognitive, NLP Powered Question Answering System for Teaching Electrical Motor Concepts . . . . . . . . . . 98

Atul Prakash Prajapati, Ashish Chandiok, and D. K. Chaturvedi

Novel Wrapper-Based Feature Selection for Efficient Clinical Decision Support System . . . . . . . . . . 113

R. Vanaja and Saswati Mukherjee

Linear and Nonlinear Analysis of Cardiac and Diabetic Subjects . . . . . . . . . . 130

Ulka Shirole, Manjusha Joshi, and Pritish Bagul

A Study on Discontinuity Pattern in Online Social Networks Data Using Regression Discontinuity Design . . . . . . . . . . 141

K. Sailaja Kumar, D. Evangelin Geetha, and T. V. Suresh Kumar

“Senator, We Sell Ads”: Analysis of the 2016 Russian Facebook Ads Campaign . . . . . . . . . . 151

Ritam Dutt, Ashok Deb, and Emilio Ferrara

Semantic Query-Based Patent Summarization System (SQPSS) . . . . . . . . . . 169

K. Girthana and S. Swamynathan

Proposed Strategy for Allergy Prediction Based on Weather Forecasting and Social Media Analysis . . . . . . . . . . 180

Sugandha Sharma, Anmol Sachan, and Harneet Singh

Wind Characteristics and Weibull Parameter Analysis to Predict Wind Power Potential Along the South-East Coastline of Tamil Nadu . . . . . . . . . . 190

P. S. Maran, P. M. Velumurugan, and B. Prabhu Dass Batvari

Author Index . . . . . . . . . . 201
