21ST ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT

CIKM Maui, Hawaii

SHERATON MAUI RESORT & SPA
MAUI, HAWAII
OCT 29 - NOV 2, 2012
www.cikm2012.org

TABLE OF CONTENTS

Conference Chairs’ Welcome

Program Chairs’ Welcome

Conference Organization

Keynotes
    Keynote: User Engagement: The Network Effect Matters!
    Keynote: Learning Similarity Measures based on Random Walks
    Keynote: Compressed Data Structures with Relevance

Industry Day

Main Conference Program
    Tuesday, October 30, 2012
    Wednesday, October 31, 2012
    Thursday, November 1, 2012

Posters

Demonstrations

Workshops

Social Programs
    Conference Welcome Reception
    Banquet

Local Information

Sponsors

CIKM 2012 Chair’s Welcome

On behalf of the organizing committee, it is my genuine honor and great pleasure to welcome you to

the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) in

Maui, Hawaii! I hope this conference proves to be both interesting and beneficial.

Since its inception, the CIKM conference has provided a unique international forum for the

presentation, discussion and dissemination of research findings in data management, information

retrieval and knowledge management. The purpose of the conference is to identify challenging

problems facing the development of future knowledge and information systems and to shape future

research directions through the publication of high-quality applied and theoretical research

findings. The conference has been a leading forum in which experts from academia, industry and

the public sector gather to exchange ideas, research achievements and technical developments in

multidisciplinary research areas.

CIKM is one of the world’s most recognized conferences in the field, and this year it

received the highest number of submissions in its history, as can be seen from the

following statistics:

1492 abstracts submitted

1088 full papers, 229 posters, and 70 demo papers submitted

146 papers accepted for presentation as full papers (13.4% acceptance rate) and 157

papers accepted as short papers (27.8% cumulative acceptance rate)

The increased number of submissions alone is a great demonstration of the lively research areas

that contribute to the CIKM area. In addition, CIKM 2012 will host 15 workshops on cutting-edge

areas of research and a dedicated Industry Event featuring leading industrial practitioners. We are

grateful to all authors who chose to submit their work to CIKM 2012 and are very excited by the

final program.

CIKM values interdisciplinary research and we are proud to present three keynote speakers for the

main conference (Dr. Ricardo Baeza-Yates, Prof. William Cohen, and Prof. Jeffrey S. Vitter) and four

keynote speakers for the Industry Event (Drs. Eric Brill, Raghu Ramakrishnan, Tom Malloy, and

Xuedong Huang), all of whom will give presentations that cross discipline boundaries. I deeply

appreciate their time and commitment to deliver their speeches and share their cutting-edge

research experiences and insightful comments on their research topics.

Putting together the CIKM 2012 program represents a huge amount of team effort on behalf of many

people. The dedication and support of the Organizing Committee members helped overcome

obstacles to deliver a successful conference program on time. First of all, the contributions to the

technical program were selected by a renowned program committee headed by our three program

chairs, Guy Lebanon, Mohammed Zaki, and Haixun Wang. I am grateful to them for all their hard

work in managing such a complex reviewing task and to their senior program committee members,

program committee members and additional reviewers who freely gave their time, effort and

intelligence to the difficult task of selecting which contributions would form the program. I am also

very grateful to Wolfgang Gatterbauer, who monitored and managed the process of editing the

proceedings to meet very tight production schedules.

I am also grateful to Evgeniy Gabrilovich and Dou Shen for putting together an excellent Industry Event;

Dimitrios Gunopulos and Alin Dobra for organizing and managing the workshop program; Amelie

Marian and Paul N. Bennett for managing the demonstrations track; Arvin Agah and Deng Cai for handling

poster papers; Qi He, Prem Melville, and Yilong Yin, who were great publicity chairs; Yi Chang and

Ya Zhang, who actively sought sponsorship and whose hard work allowed us to award grants to

several student attendees; and Peter Scheuermann, who organized the Best Paper Awards

committee. The local arrangement team has been superbly led by Lipyeow Lim and Debasis

Bhattacharya, who have put a huge effort into every aspect of the conference. Our treasurers, Bo

Luo and Eun K Park, have worked hard to set up the registration site and monitor the registration

process. Finally, I value the effort of Jong Cheol Jeong (webmaster).

In addition, I would like to express my appreciation to all of our sponsors: Microsoft Research, eBay,

Yahoo! Labs, IBM Research, Yandex, Adobe, Google, and Rakuten. All their generous sponsorship

and support made this conference successful and possible. I would also like to thank the sponsoring

ACM SIGs, SIGIR and SIGWEB, and their respective chairs, James Allan (SIGIR) and Simon Harper

(SIGWEB), as well as Charles Nicholas (CIKM Liaison), for their advice and support.

Finally, I would like to thank the CIKM Steering Committee for their vote of confidence in us and their

consistent support and guidance. Our special thanks go, in particular, to Eun K Park and David Grossman.

It is a great honor and pleasure to accept the responsibilities and challenges of general chair. I hope

that attendees find the technical program to be interesting and productive towards their research

endeavors. I hope that you all take full advantage of these opportunities for professional development

and that the conference will provide you with a valuable opportunity to share ideas with other

researchers and practitioners from institutions around the world. I further hope that the conference

will be stimulating, informative, enjoyable, and a fulfilling experience to all who attend it.

Enjoy the CIKM 2012 conference and your stay in Maui, Hawaii!

Xue-wen Chen

ACM CIKM 2012 General Chair

Wayne State University, USA

Program Committee Chairs’ Welcome

Welcome to the 21st ACM International Conference on Information and Knowledge Management

(CIKM). Over the past two decades CIKM has been serving as a leading forum for researchers from

the database, information retrieval, and knowledge management communities. The purpose of the

conference is to disseminate the challenges and solutions for the next generation of knowledge and

information systems, and to shape future research directions through the publication of high-

quality research papers, both theoretical and applied. This year’s call for papers attracted 1088

submissions from all over the world. The program committee accepted 146 full papers (13.4%),

and an additional 157 short papers (27.8% cumulative acceptance rate).

We thank and acknowledge everyone who made this strong technical program possible. We thank

the authors for submitting their best work for consideration at CIKM. Our unreserved gratitude goes to

the program committee and external reviewers, who worked extremely hard to provide high

quality reviews and feedback on the submissions. The senior program committee members deserve

much recognition for their meta-reviews and ensuring a smooth reviewing workflow. We would

like to thank the general chair, Prof. Xue-wen Chen, for assembling an excellent organizing team,

who have put together a strong and exciting program including the invited talks, papers, posters,

workshops and demos. Thanks are also due to Joonseok Lee and Seungyeon Kim who helped

administer the IR track. Finally, we would like to thank all our corporate sponsors for their generous

support.

We hope that you will find this program interesting and thought-provoking. Please enjoy the

conference and the opportunity to network with friends and colleagues from around the world.

Guy Lebanon
CIKM12 IR Track Chair
Georgia Institute of Technology

Haixun Wang
CIKM12 DB Track Chair
Microsoft Research Asia

Mohammed J. Zaki
CIKM12 KM Track Chair
Rensselaer Polytechnic Institute

CIKM’12 Organization Team

Conference Chair
Xue-wen Chen, Wayne State University, USA

Program Chairs
Guy Lebanon, Georgia Institute of Technology (IR Track)
Mohammed Zaki, Rensselaer Polytechnic Institute (KM Track)
Haixun Wang, Microsoft Research Asia (DB Track)

Industry Chairs
Evgeniy Gabrilovich, Google
Dou Shen, CityGrid Media

Workshop Chairs
Dimitrios Gunopulos, University of Athens
Alin Dobra, University of Florida

Local Chairs
Lipyeow Lim, University of Hawaii
Debasis Bhattacharya, University of Hawaii

Sponsorship Chairs
Yi Chang, Yahoo!
Ya Zhang, Shanghai Jiaotong University

Publicity Chairs
Qi He, IBM Research
Prem Melville, IBM Research
Yilong Yin, Shandong University

Proceedings Chair
Wolfgang Gatterbauer, Carnegie Mellon University

Demonstration Chairs
Amelie Marian, Rutgers University
Paul N. Bennett, Microsoft Research

Poster Chairs
Arvin Agah, University of Kansas
Deng Cai, Zhejiang University

Award Chair
Peter Scheuermann, Northwestern University

Treasurers
Bo Luo, University of Kansas
Eun K Park, California State University, Chico

Web Master
Jong Cheol Jeong, University of Kansas

CIKM’12 Keynote Speakers

Tuesday, October 30, 2012

User Engagement: The Network Effect Matters!

Ricardo Baeza-Yates

VP of Yahoo! Research

Abstract: In the online world, user engagement refers to the quality of the user experience that emphasizes the positive aspects of the interaction with a web application and, in particular, the phenomena associated with wanting to use that application longer and more frequently. This definition is motivated by the observation that successful web applications are not just used, but they are engaged with. Users invest time, attention, and emotion into them. User engagement is measured in many ways, through methods of self-reporting (e.g., questionnaires), observer methods (e.g., facial expression analysis, speech analysis, desktop actions, etc.), neuro-physiological signal processing methods (e.g., respiratory and cardiovascular accelerations and decelerations, muscle spasms, etc.), and from a web analytics perspective (through online behavior metrics that assess users' depth of engagement with a site). These methods represent various tradeoffs between scale of data and depth of understanding (for instance, surveys are small-scale but deep, whereas clicks are large-scale but shallow in understanding). Little work has been done to integrate these various measures into a coherent understanding of engagement success. Online providers aim not only to engage users with each service, but across all services in their network. They spend increasing effort to direct users to various services (e.g., using hyperlinks to help users navigate to and explore other services); in other words, to increase user traffic between their services. Nothing is known about how users engage across such a network of Web sites, something we call networked user engagement. We address this problem by combining techniques from web analytics and mining, information retrieval evaluation, and existing works on user engagement coming from the domains of information science, multimodal human computer interaction and cognitive psychology.

In this way, we can combine insights from big data with deep analysis of human behavior in the lab or through crowd-sourcing experiments. This way of thinking is crucial to many areas, going beyond the web, and will in time lead to a new genre of computational social sciences that transcend specific applications on the internet.

This talk comprises three parts:

(1) First, we define user engagement, list its many characteristics as identified in the research and analytic literature, and discuss through real examples the challenges associated with measuring user engagement.

(2) Second, we describe recent data-driven approaches looking at user engagement through the development of new measures that allow for a better representation of how users engage with and across different web services, what we call networked user engagement.

(3) Finally, we will describe how emerging research directions looking at affect and cognition as well as graph-related measures are providing additional insights into measuring networked user engagement.

This work is being done in collaboration with Mounia Lalmas, Janette Lehmann, and Georges Dupret from Yahoo! Labs as well as Elad Yom-Tov (Microsoft Research).

http://users.dcc.uchile.cl/~rbaeza/

Bio: Ricardo Baeza-Yates is VP of Yahoo! Research for Europe, Middle East and Latin America, leading the labs at Barcelona, Spain and Santiago, Chile, since 2006, as well as supervising the lab in Haifa, Israel since 2008. He has also been a part-time Professor at the Dept. of Information and Communication Technologies of the Universitat Pompeu Fabra in Barcelona, Spain, since 2005. Until 2005 he was Professor and Director of the Center for Web Research at the Department of Computer Science of the Engineering School of the University of Chile. He obtained a Ph.D. from the University of Waterloo, Canada, in 1989. Before that, he obtained two master's degrees (M.Sc. CS & M.Eng. EE) and the electrical engineering degree from the University of Chile, Santiago. He is co-author of the best-selling Modern Information Retrieval textbook, published in 1999 by Addison-Wesley with a second enlarged edition in 2011, as well as co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991; and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 300 other publications. He has received the Organization of American States award for young researchers in exact sciences (1993) and the CLEI Latin American distinction for contributions to CS in the region (2009). In 2003 he was the first computer scientist to be elected to the Chilean Academy of Sciences. In 2007 he was awarded the Graham Medal for innovation in computing, given by the University of Waterloo to distinguished alumni. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow.

Wednesday, October 31, 2012

Learning Similarity Measures based on Random Walks

William Cohen

Research Professor, Carnegie Mellon University

Abstract: The scientific literature can be represented as a graph of documents, terms, and meta-data, with edges corresponding to containment of a term in a document, authorship of a document by a person, and so on. One popular way of querying such a graph is via queries based on proximity measures, such as Random Walk with Restart (RWR).

In this talk, we describe a novel learnable proximity measure based on RWR. Instead of introducing one weight per edge label, as in most prior work, we introduce one weight for each edge label sequence. In this model proximity is defined by a weighted combination of simple "random walk experts", each corresponding to conducting a random walk constrained to follow a particular sequence of labeled edges.
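The idea of path-constrained "random walk experts" can be illustrated with a minimal Python sketch. This is not the speaker's implementation: the toy graph, the edge labels, the function names, and the hand-set weights are all hypothetical, and in the talk the weights are learned from data rather than fixed by hand.

```python
from collections import defaultdict

# Hypothetical toy graph of (source, edge label, target) triples.
EDGES = [
    ("alice", "wrote", "paper1"),
    ("alice", "wrote", "paper2"),
    ("paper1", "mentions", "geneA"),
    ("paper1", "mentions", "geneB"),
    ("paper2", "mentions", "geneA"),
]


def path_constrained_walk(edges, start, label_sequence):
    """One 'expert': the distribution over end nodes of a random walk from
    `start` that must follow the given sequence of edge labels."""
    outgoing = defaultdict(list)
    for src, label, dst in edges:
        outgoing[(src, label)].append(dst)

    dist = {start: 1.0}
    for label in label_sequence:
        next_dist = defaultdict(float)
        for node, prob in dist.items():
            targets = outgoing.get((node, label), [])
            # Choose uniformly among edges carrying the required label;
            # probability mass at nodes with no such edge is dropped.
            for target in targets:
                next_dist[target] += prob / len(targets)
        dist = dict(next_dist)
    return dist


def proximity(edges, start, weighted_paths):
    """Weighted combination of experts; `weighted_paths` is a list of
    (weight, label_sequence) pairs. Weights here are illustrative only."""
    scores = defaultdict(float)
    for weight, label_sequence in weighted_paths:
        expert = path_constrained_walk(edges, start, label_sequence)
        for node, prob in expert.items():
            scores[node] += weight * prob
    return dict(scores)
```

For example, `path_constrained_walk(EDGES, "alice", ["wrote", "mentions"])` returns `{'geneA': 0.75, 'geneB': 0.25}`, because both of the author's papers mention geneA but only one mentions geneB.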

Experiments on eight tasks using graphs based on literature from two subdomains of biology show that the new learning method significantly outperforms the prior methods. We extend the method to support two additional types of experts to model intrinsic properties of entities: "query-independent experts", which generalize the PageRank measure, and "popular entity experts" which allow rankings to be adjusted for particular entities that are especially important.

Finally, we present experiments in which we use this approach to learn relationships in the ontology of NELL, a wide-coverage, large-scale information extraction system for web data. We show that these types of learnable "proximity measures" are general enough to accurately model a significant number of real-world relations, and that they outperform an alternative technique that learns to model relations based on more traditional logical rules.

Bio: William Cohen received his bachelor's degree in Computer Science from Duke University in 1984, and a PhD in Computer Science from Rutgers University in 1990. From 1990 to 2000 Dr. Cohen worked at AT&T Bell Labs and later AT&T Labs-Research, and from April 2000 to May 2002 Dr. Cohen worked at Whizbang Labs, a company specializing in extracting information from the web. Dr. Cohen is President of the International Machine Learning Society, an Action Editor for the Journal of Machine Learning Research, and an Action Editor for the journal ACM Transactions on Knowledge Discovery from Data. He is also an editor, with Ron Brachman, of the AI and Machine Learning series of books published by Morgan & Claypool. In the past he has also served as an action editor for the journal Machine Learning, the journal Artificial Intelligence, and the Journal of Artificial Intelligence Research.

He was General Chair for the 2008 International Machine Learning Conference, held July 6-9 at the University of Helsinki, in Finland; Program Co-Chair of the 2006 International Machine Learning Conference; and Co-Chair of the 1994 International Machine Learning Conference. Dr. Cohen was also the co-Chair for the 3rd Int'l AAAI Conference on Weblogs and Social Media, which was held May 17-20, 2009 in San Jose, and is the co-Program Chair for the 4th Int'l AAAI Conference on Weblogs and Social Media, which will be held May 23-26 at George Washington University in Washington, D.C. He is an AAAI Fellow, and in 2008, he won the SIGMOD "Test of Time" Award for the most influential SIGMOD paper of 1998. Dr. Cohen's research interests include information integration and machine learning, particularly information extraction, text categorization and learning from large datasets. He holds seven patents related to learning, discovery, information retrieval, and data integration, and is the author of more than 100 publications.

Thursday, November 1, 2012

Compressed Data Structures with Relevance

Jeffrey S. Vitter

Executive Vice Chancellor, the University of Kansas

Abstract: We describe recent breakthroughs in the field of compressed data structures, in which the data structure is stored in a compressed representation that still allows fast answers to queries. We focus in particular on compressed data structures to support the important application of pattern matching on massive document collections. Given an arbitrary query pattern in textual form, the job of the data structure is to report all the locations where the pattern appears. Another variant is to report all the documents that contain at least one instance of the pattern. We are particularly interested in reporting only the most relevant documents, using a variety of notions of relevance. We discuss recently developed techniques that support fast search in these contexts as well as under additional positional and temporal constraints.
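The three query types named in the abstract (report all locations, report matching documents, report only the most relevant documents) can be sketched in Python as follows. This sketch uses a plain, uncompressed suffix array built naively, so it shows the queries only and not the compressed data structures the talk is about. The function names, the sample documents, the separator character, and the choice of occurrence count as the notion of relevance are all assumptions made for illustration.

```python
import bisect
from collections import Counter

SEP = "\x00"  # assumed not to occur in any document or pattern

# Hypothetical sample collection.
DOCS = ["banana bandana", "a band plays", "no match here"]


def build_index(docs):
    """Concatenate the documents and build a suffix array (naive construction)."""
    text = SEP.join(docs) + SEP
    starts, pos = [], 0
    for doc in docs:
        starts.append(pos)       # offset of each document in `text`
        pos += len(doc) + 1      # +1 for the separator
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    return text, starts, suffix_array


def occurrences(index, pattern):
    """Report all locations of a non-empty pattern via two binary searches."""
    text, _, sa = index
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:               # first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first, hi = lo, len(sa)
    while lo < hi:               # first suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])


def top_k_documents(index, pattern, k):
    """Report the k documents with the most occurrences as (doc_id, count)."""
    _, starts, _ = index
    counts = Counter(
        bisect.bisect_right(starts, pos) - 1 for pos in occurrences(index, pattern)
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

With the sample collection, `top_k_documents(build_index(DOCS), "ban", 2)` returns `[(0, 2), (1, 1)]`: the first document contains the pattern twice, the second once, and the third not at all.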

Bio: Dr. Jeffrey Vitter (M.B.A., Duke University, 2002; Ph.D., Stanford University, 1980; B.S. with highest honors, University of Notre Dame, 1977) is the provost and executive vice chancellor and the Roy A. Roberts Distinguished Professor at the University of Kansas. Previously he was on the faculty at Texas A&M University, where from 2008-2009 he served as provost and executive vice president for academics, with additional responsibilities for the academic mission of Texas A&M University in Doha, Qatar. From 2002-2008, Dr. Vitter served as the Frederick L. Hovde Dean of the College of Science and Professor in the Department of Computer Science at Purdue University. From 1993-2002, Dr. Vitter held a distinguished professorship at Duke University, where he was the Gilbert, Louis, and Edward Lehrman Professor. He served at Duke as chair of the Department of Computer Science from 1993-2001 and as co-director and founding member of the Center for Geometric and Biological Computing. From 1980-1992, he progressed through the faculty ranks and served in leadership roles in the Department of Computer Science at Brown University. Dr. Vitter serves on the Board of Advisors for the School of Science and Engineering at Tulane University. From 2000-2009 Dr. Vitter served on the Board of Directors of the Computing Research Association (CRA), and he continues to co-chair its Government Affairs Committee. He chaired ACM SIGACT, the Special Interest Group on Algorithms and Computation Theory, of the world's largest computer professional society, the Association for Computing Machinery.

Dr. Vitter is a Fellow of the Guggenheim Foundation, the American Association for the Advancement of Science, the Association for Computing Machinery, and the Institute of Electrical and Electronics Engineers. He was named a National Science Foundation Presidential Young Investigator and is a Fulbright Scholar. He has over 280 book, journal, conference, and patent publications, primarily on the algorithmic aspects of processing massive amounts of information. He is an ISI highly cited researcher with a Google Scholar h-index of 60.

CIKM’12 Industry Event

CIKM 2012 will include an Industry Event, which will be held on the last day of the main conference (November 1, 2012) in parallel with the technical tracks. The Industry Event will include a series of invited talks by influential technical leaders, who will present the state of the art in industrial research and development in information retrieval, knowledge management, databases, and data mining.

MORNING
10:15 - 10:20 Opening remarks
10:20 - 11:05 KEYNOTE: Eric Brill, eBay
11:05 - 11:35 David Carmel, IBM Research
11:35 - 12:20 KEYNOTE: Raghu Ramakrishnan, Microsoft

MIDDAY
1:30 - 2:00 AnHai Doan, WalmartLabs & UW-Madison
2:00 - 2:30 Chao Liu, Tencent
2:30 - 3:00 Daniel Tunkelang, LinkedIn
3:00 - 3:30 Rajesh Parekh, Groupon

AFTERNOON
4:00 - 4:45 KEYNOTE: Tom Malloy, Adobe
4:45 - 5:15 Christopher Olston, Google
5:15 - 6:00 KEYNOTE: Xuedong Huang, Microsoft

Title: Having A Great Career in Research

Eric Brill, Vice President of Research, eBay

Abstract: You will spend a huge chunk of your life sleeping in your bed and working at your job. Therefore it makes sense to buy the most comfortable bed you can afford, and to have the most awesome career possible. Over the years I've seen many brilliant (and not brilliant) people dead-end in cruddy jobs, while others seem to magically land themselves in dream positions. In this talk I'll discuss the fundamental differences between these two groups that lead one to crud and the other to awesome, hopefully providing you with snippets of wisdom you can apply to yourself to ensure you have an awesome research career.

http://labs.ebay.com/eric-brill.html

Bio: Eric Brill is Vice President of Research at eBay, where he runs eBay Research Labs (eRL). He has been an innovation manager, building and managing research teams, for 20 years. His technical expertise is in machine learning and data mining over very large data sets, as well as statistical natural language processing, information search/retrieval and online advertising. Eric has published more than 70 academic papers and has more than 30 issued patents. Prior to eBay, Eric was a professor of computer science at Johns Hopkins, and spent a decade at Microsoft Research.

Title: Is This Entity Relevant to Your Needs?

David Carmel, Research Staff Member, IBM Haifa Research Lab

Abstract: Relevance is a fundamental concept, though not completely understood, in Information Science as well as Information Retrieval (IR). For many years researchers have been dealing with the question of what makes a document relevant to a specific user's need. While there is still no clear consensus on the meaning of this concept, many successful IR models have been developed for ranking search results based on their "relevance likelihood". The blurriness of the relevance concept also arises in new emerging IR domains such as searching over entity relationship data (ERD). Search in this domain is driven by the identification, extraction, and exploitation of real-world entities and their relationships, as represented in unstructured or semi-structured textual sources. What makes such entities relevant to the user? Is it the same question that the IR community has dealt with for many decades? Can we adopt existing IR models into this new domain in a straightforward manner? Is similarity measurement between entities and the user's query enough for identifying relevant items? In this talk I'll provide an overview of some approaches that deal with relevance approximation in several related areas such as question answering and faceted search. Then I'll raise some research directions that are related to the fundamental questions mentioned above in the ERD domain. I'll briefly describe the results of some experiments we have conducted recently with entity ranking approaches. I will also argue that for many information needs in the ERD domain, exploratory search is essential, as users should interactively explore the rich and complicated domain for relevant entities, either by restricting the search results to specific facets such as the entity type or other entity attributes, or through graph navigation.

Bio: David is a Research Staff Member at the Information Retrieval group at IBM Haifa Research Lab. David's research is focused on search in the enterprise, query performance prediction, social search, and text mining. For several years David taught the Introduction to IR course at the CS department at Haifa University. At IBM, David is a key contributor to IBM enterprise search offerings.

http://researcher.ibm.com/view.php?person=il-CARMEL

David is a co-founder of the Juru search engine, which provides integrated search capabilities to several IBM products and was used as a search platform for several studies in the TREC conferences. David has published more than 80 papers in IR and Web journals and conferences, and serves on the editorial board of the IR journal and as a senior PC member or an Area Chair of many conferences (SIGIR, WWW, WSDM, CIKM). He organized a number of workshops and taught several tutorials at SIGIR and WWW. David is co-author of the book "Estimating the Query Difficulty for Information Retrieval", published by Morgan & Claypool in 2010, and co-author of the paper "Learning to estimate query difficulty", which won the Best Paper Award at SIGIR 2005. David earned his PhD in Computer Science from the Technion, Israel Institute of Technology in 1997.

Title: Social Media, Data Integration, and Human Computation

AnHai Doan, Chief Scientist @WalmartLabs and Professor at UW-Madison

Abstract: Social media has emerged as a major frontier on the World-Wide Web, with applications ranging from helping teenagers track Justin Bieber to e-commerce to fostering revolutions. In this talk I will discuss our work in this area, as carried out at Wisconsin, Kosmix, and @WalmartLabs. I describe how we integrate data from "traditional" Web sources to build a global taxonomy, greatly expand it with social-media data, then leverage it to build consumer-facing applications. Example applications include building topic pages, detecting Twitter events, and monitoring these events. I discuss the critical role of data integration and human computation in processing social media. Finally, I discuss how all of these can help the emerging area of social commerce, and why Walmart recently acquired Kosmix to make inroads into this new and exciting area.

Bio: AnHai Doan is an Associate Professor at the University of Wisconsin-Madison. His interests cover databases, AI, and Web, with a current focus on data integration, large-scale knowledge bases, social media, crowdsourcing, and human computation. He received the ACM Doctoral Dissertation Award in 2003, a CAREER Award in 2004, and a Sloan Fellowship in 2007. AnHai was Chief Scientist of Kosmix, a social media startup acquired by Walmart in 2011. Currently he also works as Chief Scientist of @WalmartLabs, a research and development lab devoted to integrating social and mobile data for e-commerce.

http://pages.cs.wisc.edu/~anhai/

Title: From HyperText to HyperTEC

Xuedong Huang, Chief Architect, Microsoft

Abstract: The hypertext-based metaphor has gained widespread acceptance as a web interaction metaphor. Website designers compose web pages, associate them with hyperlinks and hypertext, and have users follow the web structure to digest information. This simple metaphor is website-centric. Users are typically in the walled garden of each website. To complete their tasks, users have to navigate between and interact with various websites. To navigate to a different website, search is needed by simply typing a few keywords. Today, search and browsing are two distinct web activities. With the fast adoption of mobile and touch-enabled devices, the web is now more accessible with richer contextual information. A new web interaction metaphor based on HyperTEC (Touch, Entity, Context) will enable users to seamlessly integrate the search and browsing experience. HyperTEC is more user-centric. A user can touch the device with a predefined gesture to review and explore contextual results powered by the modern search engine. The browsing context and touched entity are taken into account for enhanced search results, moving from traditional keyword-based search to entity-centric exploring. While display and search advertising played a key role for e-commerce, HyperTEC-powered search and browsing could open a new chapter for e-commerce in the future.

Bio: Dr. Xuedong Huang is a Distinguished Engineer and chief architect of Microsoft Advertising in Microsoft's Online Services Division. He previously worked on core search technologies for Bing and helped to create a significantly improved architecture from Bing's speller to ranker. He holds over 60 U.S. patents contributing significantly in the areas of signal processing, speech recognition, natural language understanding, multimodal/gesture UI technologies, core search and online advertising.

Huang received the 1992 Allen Newell research excellence leadership medal, the 1993 IEEE Signal Processing Society Paper Award, and 2003 and 2004 SpeechTek Top 10 Leaders for the Speech Industry awards. He was named a Fellow of the IEEE in 2000 for his contributions to spoken language technologies, and was honored with the Asian American Engineer of the Year Award in 2011. Huang has been the honorary dean and professor of the College of Software Engineering for his alma mater, Hunan University, China, and a member of the advisory committee for the University of Washington Electrical Engineering Department and China's National Supercomputer Center (Changsha).

http://www.microsoft.com/presspass/exec/de/Huang/default.mspx

Title: Question Answering through Tencent Open Platform

Chao Liu, Research Director, Tencent, China

Abstract: Tencent Inc. is the biggest Internet company in China, with more than 700 million monthly active users and more than 170 million users online simultaneously at peak times. While Tencent started from an IM client called QQ in the 1990s, it has been constantly developing into a platform covering most aspects of Internet life. In this talk, we will first introduce Tencent Open Platform, and then illustrate how the platform is leveraged by the question answering service to index more questions/answers with higher precision and recall and to achieve a faster answer rate. We illustrate the architecture, design principles, and implementation details with real examples, and put forward challenges for open discussion.

Bio: Chao Liu is the deputy director at the Social Search department in Tencent, Inc. Before joining Tencent, he was a researcher at Microsoft Research at Redmond, and led the Data Intelligence Group. His research has been focused on Web services (e.g., search and ads) and data mining, with about 40 conference/journal publications and many research results transferred to Microsoft Bing search engine. Chao has been on the program and organizing committees of many conferences (e.g., SIGIR, SIGKDD, WWW), and actively campaigns for the mutualism between academia and industry. Chao earned his PhD in Computer Science from the University of Illinois at Urbana-Champaign in 2007, and B.S. in Computer Science from Peking University in 2003.

http://www.linkedin.com/pub/chao-liu/42/a8a/aa0

Title: Revolutionizing Digital Marketing with Big Data Analytics

Tom Malloy, SVP and Chief Software Architect, Adobe, USA

Abstract: The marketing function in the enterprise is undergoing disruptive changes. The well-known aphorism, "Half the money I spend on advertising is wasted; the trouble is I don't know which half", is no longer an acceptable guideline for the modern marketer. Marketing is rapidly evolving, employing less art and more science. This evolution presents an unprecedented opportunity for the analytics professional to apply her skills to a wide range of challenging real world problems.

Bio: Tom Malloy is senior vice president and chief software architect at Adobe. He runs Adobe's Advanced Technology Labs, spearheading the company's long-term research and development initiatives. Malloy is responsible for defining Adobe's technology strategy as well as overseeing his team of computer scientists who are delivering the next generations of Adobe software innovations. Some of Malloy's most significant contributions have included the expansion of Adobe's products to the Windows® environment, development of advanced document security technologies, and the extension of Adobe® PDF as a de facto industry standard for automating document-based enterprise processes. Prior to joining Adobe in 1986, Malloy worked as a key software developer for Apple Computer. Malloy sits on the board of Aklara, an electronic auction firm, and is a member of ACM and IEEE. He holds three patents as well as bachelor's and master's degrees in computer science from Stanford University.

http://www.adobe.com/aboutadobe/pressroom/executivebios/tommalloy.html

Title: Programming and Debugging Large-Scale Data Processing Workflows

Christopher Olston, Staff Research Scientist, Google

Abstract: This talk gives an overview of my team's work on large-scale data processing at Yahoo! Research. The talk begins by introducing two data processing systems we helped develop: PIG, a dataflow programming environment and Hadoop-based runtime, and NOVA, a workflow manager for Pig/Hadoop. The bulk of the talk focuses on debugging, and looks at what can be done before, during and after execution of a data processing operation:

* Pig's automatic EXAMPLE DATA GENERATOR is used before running a Pig job to get a feel for what it will do, enabling certain kinds of mistakes to be caught early and cheaply. The algorithm behind the example generator performs a combination of sampling and synthesis to balance several key factors (realism, conciseness and completeness) of the example data it produces.

* INSPECTOR GADGET is a framework for creating custom tools that monitor Pig job execution. We implemented a dozen user-requested tools, ranging from data integrity checks to crash cause investigation to performance profiling, each in just a few hundred lines of code.

* IBIS is a system that collects metadata about what happened during data processing, for post-hoc analysis. The metadata is collected from multiple sub-systems (e.g. Nova, Pig, Hadoop) that deal with data and processing elements at different granularities (e.g. tables vs. records; relational operators vs. reduce task attempts) and offer disparate ways of querying it. IBIS integrates this metadata and presents a uniform and powerful query interface to users.

Bio: Christopher Olston is a staff research scientist at Google, working on structured data. He previously worked at Yahoo! (principal research scientist) and Carnegie Mellon (assistant professor). He holds computer science degrees from Stanford (2003 Ph.D., M.S.; funded by NSF and Stanford fellowships) and UC Berkeley (B.S. with highest honors). Olston just started at Google in November 2011, so he hasn't done anything there yet. At Yahoo, Olston co-created Apache Pig, which is used for large-scale data processing by LinkedIn, Netflix, Salesforce, Twitter, Yahoo and others, and is offered by Amazon as a cloud service. Olston gave the 2011 Symposium on Cloud Computing keynote, and won the 2009 SIGMOD best paper award. During his flirtation with academia, Olston taught undergrad and grad courses at Berkeley, Carnegie Mellon and Stanford, and signed several Ph.D. dissertations.

http://infolab.stanford.edu/~olston/

Title: Leveraging Data to Power Local Commerce

Rajesh Parekh, Director of Research, Groupon

Abstract: Groupon's pioneering concept of daily deals in local commerce has rapidly evolved as a key enabler connecting online and mobile users with offline local merchants. At first glance, the problem of connecting users to merchants appears to be the widely studied problem in computational advertising of matching users to advertisers. However, there are several unique twists in local deals that present interesting opportunities for large-scale data mining. I will provide an overview of some challenging data problems, such as user deal personalization and deal portfolio selection, and present a "view from the trenches" on the key insights learned, approaches for solving these problems, and opportunities for continued innovation in this area.

Bio: Dr. Rajesh Parekh is Director of Research at Groupon where he focuses on applying data mining, machine learning, and optimization algorithms to solving challenging problems in the space of daily deals. Prior to Groupon, Rajesh was Senior Director of Research at Yahoo! Labs where he led the display advertising targeting sciences. At Yahoo! he received the You Rock award for his work on real-time prediction of news-worthy queries, and the Data Wizard award for designing the system that optimizes the number of sponsored ads shown on a search results page. Rajesh earned his Ph.D. in Computer Science from Iowa State University. He received the Research Excellence Award for his dissertation research on constructive learning algorithms and the Teaching Excellence Award for his contributions to the introductory Computer Literacy and Applications course. He has authored over 25 research publications and has filed 20 patents. He is actively involved in the data mining community and is the co-chair of the new Industry Practice Expo track at the KDD 2012 conference.

http://www.cs.iastate.edu/~parekh/

Title: The Future of Information Discovery and Search: Content Optimization, Interactivity, Semantics, and Social Networks

Raghu Ramakrishnan, Technical Fellow and CTO Information Services, Microsoft

Abstract: The nature of information discovery has been transformed over the past few years. I will discuss some of the underlying trends that have re-shaped how users keep up with news (about the world, about their communities, about their friends and colleagues), discover and explore topics of interest, and search for specific information they require. First, as people consume information increasingly from websites and digital devices, algorithmic techniques for selecting content have revolutionized the traditional notion of a static publication in which every user saw the same content and presentation: personalized, context-sensitive targeting is becoming the norm, and the role of an editor who shapes this user experience is changing so as to leverage the algorithmic tools to achieve a desired editorial voice. Second, social networks are emerging as a ubiquitous, near-instantaneous distribution channel that publishers must take into account in order to maximize their reach. Third, search is becoming semantically richer, and the distinction between searching for information and discovering information serendipitously is blurring: increasingly, contextual information is triggering relevant searchable companion experiences. For example, while watching a TV program, users can see a stream of relevant entities and topics such as celebrities in a movie or teams and players in a game of soccer, and by clicking, retrieve more detailed information on these entities and topics. I will present an overview of these trends, highlighting the computational opportunities and challenges.

Bio: Raghu Ramakrishnan is a Technical Fellow and CTO for Information Services at Microsoft, and heads the Cloud and Information Services Lab (CISL). He was previously a professor at the University of Wisconsin-Madison, and a Yahoo! Fellow. While serving as Chief Scientist for the portal, cloud and search divisions at Yahoo!, he drove content recommendation algorithms (CORE), cloud data stores (PNUTS), and semantic search ("Web of Things"). In 1999, he founded QUIQ, a company that introduced a cloud-based question-answering service. He has written the widely-used text "Database Management Systems". Ramakrishnan has received several awards, including the ACM SIGKDD Innovations Award and the SIGMOD 10-year Test-of-Time Award. He is a Fellow of the ACM and IEEE.

http://research.yahoo.com/Raghu_Ramakrishnan

Title: Data By The People, For The People

Daniel Tunkelang, Principal Data Scientist, LinkedIn

Abstract: LinkedIn has a unique data collection: the 160M+ members who use LinkedIn are also the content those same members access using our information retrieval products. LinkedIn members performed over 4 billion professionally-oriented searches in 2011, most of those to find and discover other people. Every LinkedIn search and recommendation is deeply personalized, reflecting the user's current employment, career history, and professional network. In this talk, I will describe some of the challenges and opportunities that arise from working with this unique corpus. I will discuss work we are doing in the areas of relevance, recommendation, and reputation, as well as the ecosystem we have developed to incent people to provide the high-quality semi-structured profiles that make LinkedIn so useful.

Bio: Daniel Tunkelang leads the data science team at LinkedIn, which analyzes terabytes of data to produce products and insights that serve LinkedIn's members. Prior to LinkedIn, Daniel led a local search quality team at Google. Daniel was a founding employee of faceted search pioneer Endeca (recently acquired by Oracle), where he spent ten years as Chief Scientist. He has authored fourteen patents, written a textbook on faceted search, created the annual workshop on human-computer interaction and information retrieval (HCIR), and participated in the premier research conferences on information retrieval, knowledge management, databases, and data mining (SIGIR, CIKM, SIGMOD, SIAM Data Mining). Daniel holds a PhD in Computer Science from CMU, as well as BS and MS degrees from MIT.

http://www.cs.cmu.edu/~quixote/

CIKM’12 Main Conference Program

Tuesday, October 30, 2012

Conference Opening (8:10 – 8:30) Room: Maui Ballroom

Welcome: Conference Chair

Program introduction: PC Chairs

Keynote Speech (8:30 – 9:30) Chair: Xue-wen Chen Room: Maui Ballroom

Title: User Engagement: The Network Effect Matters!

Speaker: Ricardo Baeza-Yates, Yahoo! Research

Coffee Break (9:30 – 10:15)

Session 1 (10:15 – 12:20) KM Track: Recommender Systems Chair: Cornelia Caragea Room: Wailuku

LogUCB: An Explore-Exploit Algorithm For Comments Recommendation

Dhruv Kumar Mahajan, Rajeev Rastogi, Charu Tiwari, Adway Mitra

DQR: A Probabilistic Approach to Diversified Query Recommendation

Ruirui Li, Ben Kao, Bin Bi, Reynold Cheng, Eric Lo

Dynamic Covering for Recommendation Systems

Ioannis Antonellis, Anish Das Sarma, Shaddin Dughmi

MEET: A Generalized Framework for Reciprocal Recommender Systems

Lei Li, Tao Li

Social Contextual Recommendation

Meng Jiang, Peng Cui, Rui Liu, Qiang Yang, Fei Wang, Wenwu Zhu, Shiqiang Yang

Session 2 (10:15 – 12:20) KM Track: Pattern Mining Chair: Feida Zhu Room: Kahului

Mining High Utility Itemsets without Candidate Generation

Mengchi Liu, Junfeng Qu

A General Framework to Encode Heterogeneous Information Sources for Contextual Pattern Mining

Weishan Dong, Wei Fan, Lei Shi, Changjin Zhou, Xifeng Yan

Incorporating Occupancy into Frequent Pattern Mining for High Quality Pattern Recommendation

Linpeng Tang, Lei Zhang, Ping Luo, Min Wang

PARMA: A Parallel Randomized Algorithm for Approximate Association Rules Mining in MapReduce

Matteo Riondato, Justin A DeBrabant, Rodrigo Fonseca, Eli Upfal

Interactive Pattern Mining on Hidden Data: A Sampling-based Solution

Mansurul Bhuiyan, Snehasis Mukhopadhyay, Mohammad Al Hasan

Session 3 (10:15 – 12:20) IR Track: Evaluation Methodologies Chair: Guy Lebanon Room: Kihei and Wailea

An Analysis of Systematic Judging Errors in Information Retrieval

Gabriella Kazai, Nick Craswell, Emine Yilmaz, S.M.M. Tahaghoghi

On Caption Bias in Interleaving Experiments

Katja Hofmann, Fritz Behr, Filip Radlinski

Alternative Assessor Disagreement and Retrieval Depth

William Webber, Praveen Chandar, Ben Carterette

Incorporating Variability in User Behavior into Systems Based Evaluation

Ben Carterette, Evangelos Kanoulas, Emine Yilmaz

Constructing Test Collections by Inferring Document Relevance via Extracted Relevant Information

Shahzad Rajput, Matthew Ekstrand-Abueg, Virgil Pavlu, Javed A. Aslam

Session 4 (10:15 – 12:20) IR Track: Social Media Search Chair: Jun Wang Room: Kapalua

Twevent: Segment-based Event Detection from Tweets

Chenliang Li, Aixin Sun, Anwitaman Datta

Making Your Interests Follow You on Twitter

Marco Pennacchiotti, Fabrizio Silvestri, Hossein Vahabi, Rossano Venturini

Generating Event Storylines from Microblogs

Chen Lin, Chun Lin, Jingxuan Li, Dingding Wang, Yang Chen, Tao Li

Social Book Search: Comparing Topical Relevance Judgements and Book Suggestions for Evaluation

Marijn Koolen, Jaap Kamps, Gabriella Kazai

Content-Based Crowd Retrieval on the Real-Time Web

Krishna Y Kamath, James Caverlee

Short Paper Session S1 (10:15 – 12:20) KM Track: Text/Web Mining Chair: Ingmar Weber Room: Napili

Automatically Embedding Newsworthy Links to Articles

Hakan Ceylan, Ioannis Arapakis, Pinar Donmez, Mounia Lalmas

Feature Selection Based on Term Frequency and T-Test for Text Categorization

Deqing Wang, Hui Zhang, Rui Liu, Weifeng Lv

Extraction of Topic Evolutions from References in Scientific Articles and Its GPU Acceleration

Tomonari Masada, Atsuhiro Takasu

Reconciling Ontologies and the Web of Data

Ziawasch Abedjan, Johannes Lorey, Felix Naumann

Exploiting Latent Relevance for Relational Learning of Ubiquitous Things

Lina Yao, Quan Z. Sheng

Mining Coherent Anomaly Collections On Web Data

Hanbo Dai, Feida Zhu, Ee-Peng Lim, HweeHwa Pang

Mining Topic-level Opinion Influence in Microblog

Daifeng Li, Xin Shuai, Guozheng Sun, Jie Tang, Ying Ding, Zhipeng Luo

Exploiting Enriched Contextual Information for Mobile App Classification

Hengshu Zhu, Huanhuan Cao, Enhong Chen, Hui Xiong, Jilei Tian

Incorporating Word Correlation into Tag-Topic Model for Semantic Knowledge Acquisition

Fang Li, Tingting He, Xinhui Tu, Xiaohua Hu

PriSM: Discovering and Prioritizing Severe Technical Issues from Product Discussion Forums

Rashmi Gangadharaiah, Rose Catherine

Community-Based Classification of Noun Phrases in Twitter

Freddy Chong Tat Chua, William W Cohen, Justin Betteridge, Ee-Peng Lim

Joint Bilingual Name Tagging for Parallel Corpora

Qi Li, Haibo Li, Heng Ji, Wen Wang, Jing Zheng, Fei Huang

Short Paper Session S2 (10:15 – 12:20) KM Track: Networks and Graphs Chair: Fusheng Wang Room: Kula and Hana

Influence and Similarity on Heterogeneous Networks

Guan Wang, Qingbo Hu, Philip S. Yu

GRAFT: An Approximate Graphlet Counting Algorithm for Large Graph Analysis

Mahmudur Rahman, Mansurul Bhuiyan, Mohammad Al Hasan

Fast Approximation of Steiner Trees in Large Graphs

Andrey Gubichev, Thomas Neumann

Measuring Robustness of Complex Networks under MVC Attack

Rong-Hua Li, Jeffrey Xu Yu, Xin Huang, Hong Cheng, Zechao Shang

Meta Path-Based Collective Classification in Heterogeneous Information Networks

Xiangnan Kong, Philip S. Yu, Ying Ding, David J. Wild

Discretionary Social Network Data Revelation with a User-Centric Utility Guarantee

Yi Song, Panagiotis Karras, Sadegh Nobari, Giorgos Cheliotis, Mingqiang Xue, Stephane Bressan

Empirical Validation of the Buckley-Osthus Model for the Web Host Graph: Degree and Edge Distributions

Maxim Zhukovskiy, Dmitry Vinogradov, Yuri Pritykin, Liudmila Ostroumova, Evgeniy Grechnikov, Gleb Gusev, Pavel Serdyukov, Andrei Raigorodskii

gSCorr: Modeling Geo-Social Correlations for New Check-ins on Location-Based Social Networks

Huiji Gao, Jiliang Tang, Huan Liu

Unsupervised Discovery of Opposing Opinion Networks From Forum Discussions

Yue Lu, Hongning Wang, ChengXiang Zhai, Dan Roth

WiSeNet: Building a Wikipedia-based Semantic Network with Ontologized Relations

Andrea Moro, Roberto Navigli

Shaping Communities out of Triangles

Arnau Prat-Pérez, David Dominguez-Sal, Josep M Brunat, Josep-Lluis Larriba-Pey

Degree Relations of Triangles in Real-world Networks and Graph Models

Nurcan Durak, Ali Pinar, Tamara G. Kolda, C. Seshadhri

Lunch Provided by the Conference (12:20 – 1:30)

Session 5 (1:30 – 3:35) KM Track: Link and Graph Mining Chair: Ricardo Baeza-Yates Room: Wailuku

Graph Classification: A Diversified Discriminative Feature Selection Approach

Yuanyuan Zhu, Jeffrey Xu Yu, Hong Cheng, Lu Qin

Multi-Scale Link Prediction

Donghyuk Shin, Si Si, Inderjit S Dhillon

An Analysis of How Ensembles of Collective Classifiers Improve Predictions in Graphs

Hoda Eldardiry, Jennifer Neville

Density Index and Proximity Search in Large Graphs

Nan Li, Xifeng Yan, Zhen Wen, Arijit Khan

Gelling, and Melting, Large Graphs by Edge Manipulation

Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad, Michalis Faloutsos, Christos Faloutsos

Session 6 (1:30 – 3:35) IR Track: Language Technologies Chair: Oren Kurland Room: Kahului

One Seed to Find Them All: Mining Opinion Features via Association

Zhen Hai, Kuiyu Chang, Gao Cong

Topic-Driven Reader Comments Summarization

Zongyang Ma, Aixin Sun, Quan Yuan, Gao Cong

Visualizing Timelines: Evolutionary Summarization via Iterative Reinforcement between Text and Image Streams

Rui Yan, Xiaojun Wan, Mirella Lapata, Wayne Xin Zhao, Pu-Jen Cheng, Xiaoming Li

Fast Multi-task Learning for Query Spelling Correction

Xu Sun, Anshumali Shrivastava, Ping Li

Cross-Argument Inference for Implicit Discourse Relation Recognition

Yu Hong, Xiaopei Zhou, Tingting Che, Jianmin Yao, Qiaoming Zhu, Guodong Zhou

Session 7 (1:30 – 3:35) DB Track: Graph and Knowledge Base Chair: Atish Das Sarma Room: Kihei and Wailea

Interpreting Keyword Queries over Web Knowledge Bases

Jeffrey Pound, Alexander K Hudek, Ihab F Ilyas, Grant Weddell

RDF Pattern Matching using Sortable Views

Zhihong Chong, He Chen, Zhenjie Zhang, Hu Shu, Guilin Qi, Aoying Zhou

Efficient Algorithms for Generalized Subgraph Query Processing

Wenqing Lin, Xiaokui Xiao, James Cheng, Sourav S Bhowmick

G-SPARQL: A Hybrid Engine for Querying Large Attributed Graphs

Sherif Sakr, Sameh Elnikety, Yuxiong He

A Graph-Based Approach for Ontology Population with Named Entities

Wei Shen, Jianyong Wang, Ping Luo, Min Wang

Session 8 (1:30 – 3:35) DB Track: Temporal, Spatial, and Multimedia Databases Chair: Kyuseok Shim Room: Kapalua

Decomposition-by-Normalization (DBN): Leveraging Approximate Functional Dependencies for Efficient Tensor Decomposition

Mijung Kim, K. Selçuk Candan

A Filter-based Protocol for Continuous Queries over Imprecise Location Data

Yifan Jin, Reynold Cheng, Ben Kao, Kam-Yiu Lam, Yinuo Zhang

Leveraging Read Rates of Passive RFID Tags for Real-Time Indoor Location Tracking

Da Yan, Zhou Zhao, Wilfred Ng

Location-Aware Instant Search

Ruicheng Zhong, Ju Fan, Guoliang Li, Kian-Lee Tan, Lizhu Zhou

Indexing Uncertain Spatio-Temporal Data

Tobias Emrich, Hans-Peter Kriegel, Nikos Mamoulis, Matthias Renz, Andreas Züfle

Short Paper Session S3 (1:30 – 3:35) KM Track: Recommendation and Summary Chair: Parvathi Chundi Room: Napili

A Simple Approach to the Design of Site-Level Extractors Using Domain-Centric Principles

Chong Long, Xiubo Geng, Chang Xu, Sathiya Keerthi

Graph-Based Workflow Recommendation: On Improving Business Process Modeling

Bin Cao, Jianwei Yin, Shuiguang Deng, Dongjing Wang, Zhaohui Wu

What is Happening Right Now ... That Interests Me?

Ernesto Diaz-Aviles, Lucas Drumond, Zeno Gantner, Lars Schmidt-Thieme, Wolfgang Nejdl

PRemiSE: Personalized News Recommendation via Implicit Social Experts

Chen Lin, Runquan Xie, Lei Li, Zhenhua Huang, Tao Li

Time-aware Topic Recommendation Based on Micro-blogs

Huizhi Liang, Yue Xu, Dian Tjondronegoro, Peter Christen

Topic-Sensitive Probabilistic Model for Expert Finding in Question Answer Communities

Guangyou Zhou, Siwei Lai, Kang Liu, Jun Zhao

The Early-Adopter Graph and its Application to Web-Page Recommendation

Ida Mele, Francesco Bonchi, Aristides Gionis

Real-Time Bid Optimization for Group-Buying Ads

Raju Balakrishnan, Rushi P Bhatt

A Probabilistic Approach to Mining Geospatial Knowledge from Social Annotations

Suradej Intagorn, Kristina Lerman

Providing Grades and Feedback for Student Summaries by Ontology-based Information Extraction

Fernando Gutierrez, Dejing Dou, Stephen Fickas, Gina Griffiths

Using Program Synthesis for Social Recommendations

Alvin Cheung, Armando Solar-Lezama, Samuel Madden

Web-Scale Multi-Task Feature Selection for Behavioral Targeting

Amr Ahmed, Mohamed Aly, Abhimanyu Das, Alexander J Smola, Tasos Anastasakos

Dynamic Effects of Ad Impressions on Commercial Actions in Display Advertising

Joel Barajas, Ram Akella, Marius Holtan, Jaimie Kwon, Aaron Flores, Victor Andrei

Short Paper Session S4 (1:30 – 3:35) IR Track: Web Search Chair: Amélie Marian Room: Kula and Hana

Content-Based Relevance Estimation on the Web Using Inter-Document Similarities

Fiana Raiber, Oren Kurland, Moshe Tennenholtz

Estimating Interleaved Comparison Outcomes from Historical Click Data

Katja Hofmann, Shimon Whiteson, Maarten de Rijke

Ranking News Events by Influence Decay and Information Fusion for Media and Users

Liang Kong, Shan Jiang, Rui Yan, Shize Xu, Yan Zhang

Leveraging Tagging for Neighborhood-aware Probabilistic Matrix Factorization

Le Wu, Enhong Chen, Qi Liu, Linli Xu, Tengfei Bao, Lei Zhang

Federated Search in the Wild

Dong Nguyen, Thomas Demeester, Dolf Trieschnigg, Djoerd Hiemstra

Task Tours: Helping Users Tackle Complex Search Tasks

Ahmed Hassan, Ryen W White

Structured Query Reformulations in Commerce Search

Sreenivas Gollapudi, Samuel Ieong, Anitha Kannan

Characterizing Web Search Queries that Match Very Few or No Results

Ismail Sengor Altingovde, Roi Blanco, Berkant Barla Cambazoglu, Rifat Ozcan, Erdem Sarigil, Özgür Ulusoy

The Downside of Markup: Examining the Harmful Effects of CSS and Javascript on Indexing Today's Web

Karl Gyllstrom, Carsten Eickhoff, Arjen P. de Vries, Marie-Francine Moens

A Unified Optimization Framework for Auction and Guaranteed Delivery in Online Advertising

Konstantin Salomatin, Tie-Yan Liu, Yiming Yang

Sentiment-Focused Web Crawling

Gural Vural, B. Barla Cambazoglu, Pinar Senkul

GTE: A Distributional Second-Order Co-Occurrence Approach to Improve the Identification of Top Relevant Dates in Web Snippets

Ricardo Campos, Gaël Dias, Alípio Jorge, Célia Nunes

Coffee Break (3:35 – 4:00)

Session 9 (4:00 – 5:40) KM Track: Matrix Methods and Anomaly Detection Chair: Xingquan Zhu Room: Wailuku

Local Anomaly Descriptor: A Robust Unsupervised Algorithm for Anomaly Detection based on Diffusion Space

Hao Huang, Hong Qin, Shinjae Yoo, Dantong Yu

Fast and Reliable Anomaly Detection in Categorical Data

Leman Akoglu, Hanghang Tong, Jilles Vreeken, Christos Faloutsos

TALMUD: Transfer Learning for Multiple Domains

Orly Moreno, Bracha Shapira, Lior Rokach, Guy Shani

Utilizing Common Substructures to Speedup Tensor Factorization for Mining Dynamic Graphs

Wei Liu, Jeffrey Chan, James Bailey, Christopher Leckie, Ramamohanarao Kotagiri

Session 10 (4:00 – 5:40) KM Track: Social Networks Chair: Ashwin Machanavajjhala Room: Kahului

Predicting Emerging Social Conventions in Online Social Networks

Farshad Kooti, Winter A. Mason, Krishna P. Gummadi, Meeyoung Cha

Collective Intelligence in the Online Social Network of Yahoo! Answers and Its Implications

Ze Li, Haiying Shen, Joseph Edward Grant

From Face-to-Face Gathering To Social Structure

Chunyan Wang, Mao Ye, Wang-chien Lee

Delineating Social Network Data Anonymization via Random Edge Perturbation

Mingqiang Xue, Panagiotis Karras, Raissi Chedy, Panos Kalnis, Hung Keng Pung

Session 11 (4:00 – 5:40) IR Track: Advertising Chair: Oren Kurland Room: Kihei and Wailea

Multiview Hierarchical Bayesian Regression Model and Application to Online Advertising

Tianbing Xu, Ruofei Zhang, Zhen Guo

Visual Appearance of Display Ads and Its Effect on Click Through Rate

Javad Azimi, Ruofei Zhang, Yang Zhou, Vidhya Navalpakkam, Jianchang Mao, Xiaoli Fern

The Wisdom of Advertisers: Mining Subgoals via Query Clustering

Takehiro Yamamoto, Tetsuya Sakai, Mayu Iwata, Chen Yu, Ji-Rong Wen, Katsumi Tanaka

Sequential Selection of Correlated Ads by POMDPs

Shuai Yuan, Jun Wang

Session 12 (4:00 – 5:40) IR Track: System Architecture, Distributed IR, Scalability Chair: Arun Iyengar Room: Kapalua

Diversity in Blog Feed Retrieval

Mostafa Keikha, Fabio Crestani, Bruce W Croft

Efficient Retrieval of Recommendations in a Matrix Factorization Framework

Noam Koenigstein, Parikshit Ram, Yuval Shavitt

KORE: Keyphrase Overlap Relatedness for Entity Disambiguation

Johannes Hoffart, Stephan Seufert, Dat Ba Nguyen, Martin Theobald, Gerhard Weikum

Shard Ranking and Cutoff Estimation for Topically Partitioned Collections

Anagha Kulkarni, Almer S. Tigelaar, Djoerd Hiemstra, Jamie Callan

Short Paper Session S5 (4:00 – 5:40) DB Track: Search, Retrieval and Big Data Chair: Sameh Elnikety Room: Napili

Top-k Retrieval Using Conditional Preference Networks

Hongbing Wang, Xuan Zhou, Wujin Chen, Peisheng Ma

LINDA: Distributed Web-of-Data-Scale Entity Matching

Christoph Böhm, Gerard de Melo, Felix Naumann, Gerhard Weikum

Finding the Optimal Path over Multi-Cost Graphs

Yajun Yang, Jeffrey Xu Yu, Hong Gao, Jianzhong Li

CloST: A Hadoop-based Storage System for Big Spatio-Temporal Data Analytics

Haoyu Tan, Wuman Luo, Lionel M. Ni

Loyalty-based Selection: Retrieving Objects That Persistently Satisfy Criteria

Zhitao Shen, Muhammad Aamir Cheema, Xuemin Lin

Optimizing Data Migration for Cloud-based Key-value Stores

Xiulei Qin, Wenbo Zhang, Wei Wang, Jun Wei, Xin Zhao, Tao Huang

A New Tool for Multi-Level Partitioning in Teradata

Young-Kyoon Suh, Ahmad Ghazal, Alain Crolotte, Pekka Kostamaa

Short Paper Session S6 (4:00 – 5:40) DB Track: Query and Indexing Chair: Jianzhong Li Room: Kula and Hana

Sort-based Query-adaptive Loading of R-trees

Daniar Achakeev, Bernhard Seeger, Peter Widmayer

Diversifying Query Results on Semi-Structured Data

Mahbub Hasan, Abdullah Mueen, Vassilis Tsotras, Eamonn Keogh

An Efficient Index for Massive IOT Data in Cloud Environment

Youzhong Ma, Jia Rao, Weisong Hu, Xiaofeng Meng, Xu Han, Yu Zhang, Yunpeng Chai, Chunqiu Liu

Impact Neighborhood Indexing (INI) in Diffusion Graphs

Jung Hyun Kim, K. Selcuk Candan, Maria Luisa Sapino

Applying Weighted Queries on Probabilistic Databases

Sebastian Lehrack

Fast PCA Computation in a DBMS with Aggregate UDFs and LAPACK

Carlos Ordonez, Naveen Mohanam, Carlos Garcia-Alvarado, Predrag T. Tosic, Edgar Martinez

A Probabilistic Approach to Correlation Queries in Uncertain Time Series Data

Mahsa Orang, Nematollaah Shiri

Short-paper Posters (7:15 – 9:00)

Room: Maui Ballroom

Chair: Lipyeow Lim

KM Track

Influence and Similarity on Heterogeneous Networks

Guan Wang, Qingbo Hu, Philip S. Yu

GRAFT: An Approximate Graphlet Counting Algorithm for Large Graph Analysis

Mahmudur Rahman, Mansurul Bhuiyan, Mohammad Al Hasan

Mining Long-lasting Exploratory User Interests from Search History

Bin Tan, Yuanhua Lv, ChengXiang Zhai

Fast Approximation of Steiner Trees in Large Graphs

Andrey Gubichev, Thomas Neumann

Automatically Embedding Newsworthy Links to Articles

Hakan Ceylan, Ioannis Arapakis, Pinar Donmez, Mounia Lalmas

A Simple Approach to the Design of Site-Level Extractors Using Domain-Centric Principles

Chong Long, Xiubo Geng, Chang Xu, Sathiya Keerthi

Reconciling Ontologies and the Web of Data

Ziawasch Abedjan, Johannes Lorey, Felix Naumann

Efficient Extraction of Ontologies from Domain Specific Text Corpora

Tianyu Li, Pirooz Chubak, Laks V.S. Lakshmanan, Rachel Pottinger

Effective and Efficient? Bilingual Sentiment Lexicon Extraction using Collocation Alignment

Zheng Lin, Songbo Tan, Xueqi Cheng, Xueke Xu, Weisong Shi

Exploiting Latent Relevance for Relational Learning of Ubiquitous Things

Lina Yao, Quan Z. Sheng

Discovering Personally Semantic Places from GPS Trajectories

Mingqi Lv, Ling Chen, Gencai Chen

Mining Coherent Anomaly Collections On Web Data

Hanbo Dai, Feida Zhu, Ee-Peng Lim, HweeHwa Pang

Meta Path-Based Collective Classification in Heterogeneous Information Networks

Xiangnan Kong, Philip S. Yu, Ying Ding, David J. Wild

Discretionary Social Network Data Revelation with a User-Centric Utility Guarantee

Yi Song, Panagiotis Karras, Sadegh Nobari, Giorgos Cheliotis, Mingqiang Xue, Stephane Bressan

Empirical Validation of the Buckley-Osthus Model for the Web Host Graph: Degree and Edge Distributions

Maxim Zhukovskiy, Dmitry Vinogradov, Yuri Pritykin, Liudmila Ostroumova, Evgeniy Grechnikov, Gleb Gusev, Pavel Serdyukov, Andrei Raigorodskii

gSCorr: Modeling Geo-Social Correlations for New Check-ins on Location-Based Social Networks

Huiji Gao, Jiliang Tang, Huan Liu

Swimming against the Streamz: Search and Analytics over the Enterprise Activity Stream

Ido Guy, Tal Steier, Maya Barnea, Inbal Ronen, Tal Daniel

What is Happening Right Now ... That Interests Me?

Ernesto Diaz-Aviles, Lucas Drumond, Zeno Gantner, Lars Schmidt-Thieme, Wolfgang Nejdl

Frequent grams based Embedding for Privacy Preserving Record Linkage

Luca Bonomi, Li Xiong, Rui Chen, Benjamin C. M. Fung

If You are Happy and You Know It... Tweet

Amir Asiaee T., Mariano Tepper, Arindam Banerjee, Guillermo Sapiro

Hierarchical Topic Integration Through Semi-supervised Hierarchical Topic Modeling

Xian-Ling Mao, Jing He, Hongfei Yan, Xiaoming Li

PriSM: Discovering and Prioritizing Severe Technical Issues from Product Discussion Forums

Rashmi Gangadharaiah, Rose Catherine

Preprocessing of Informal Mathematical Discourse in Context of Controlled Natural Language

Raúl Ernesto Gutiérrez de Piñerez Reyes, Juan Francisco Díaz Frías

PathRank: A Novel Node Ranking Measure on a Heterogeneous Graph for Recommender Systems

Sangkeun Lee, Sungchan Park, Minsuk Kahng, Sang-goo Lee

Exploring the Existing Category Hierarchy to Automatically Label the Newly-arising Topics in cQA

Guangyou Zhou, Li Cai, Kang Liu, Jun Zhao

Query-Focused Multi-document Summarization Based on Query-Sensitive Feature Space

Wenpeng Yin, Yulong Pei, Fan Zhang, Lian'en Huang

Time-aware Topic Recommendation Based on Micro-blogs

Huizhi Liang, Yue Xu, Dian Tjondronegoro, Peter Christen

iSampling: Framework for Developing Sampling Methods Considering User's Interest

Jinoh Oh, Hwanjo Yu

WiSeNet: Building a Wikipedia-based Semantic Network with Ontologized Relations

Andrea Moro, Roberto Navigli

Shaping Communities out of Triangles

Arnau Prat-Pérez, David Dominguez-Sal, Josep M Brunat, Josep-Lluis Larriba-Pey

Relational Co-Clustering via Manifold Ensemble Learning

Ping Li, Jiajun Bu, Chun Chen, Zhanying He

SemaFor: Semantic Document Indexing using Semantic Forests

George Tsatsaronis, Iraklis Varlamis, Kjetil Nørvåg

Measuring Website Similarity using an Entity-Aware Click Graph

Pablo N Mendes, Peter Mika, Hugo Zaragoza, Roi Blanco

Community-Based Classification of Noun Phrases in Twitter

Freddy Chong Tat Chua, William W Cohen, Justin Betteridge, Ee-Peng Lim

Real-Time Bid Optimization for Group-Buying Ads

Raju Balakrishnan, Rushi P Bhatt

A Probabilistic Approach to Mining Geospatial Knowledge from Social Annotations

Suradej Intagorn, Kristina Lerman

Providing Grades and Feedback for Student Summaries by Ontology-based Information Extraction

Fernando Gutierrez, Dejing Dou, Stephen Fickas, Gina Griffiths

Joint Bilingual Name Tagging for Parallel Corpora

Qi Li, Haibo Li, Heng Ji, Wen Wang, Jing Zheng, Fei Huang

Using Program Synthesis for Social Recommendations

Alvin Cheung, Armando Solar-Lezama, Samuel Madden

Web-Scale Multi-Task Feature Selection for Behavioral Targeting

Amr Ahmed, Mohamed Aly, Abhimanyu Das, Alexander J Smola, Tasos Anastasakos

Balanced Coverage of Aspects for Text Summarization

Takuya Makino, Hiroya Takamura, Manabu Okumura

Dynamic Effects of Ad Impressions on Commercial Actions in Display Advertising

Joel Barajas, Ram Akella, Marius Holtan, Jaimie Kwon, Aaron Flores, Victor Andrei

A Hybrid Approach for Efficient Provenance Storage

Yulai Xie, Dan Feng, Zhipeng Tan, Lei Chen, Kiran-Kumar Muniswamy-Reddy, Yan Li, Darrell D. E. Long

IR Track

Content-Based Relevance Estimation on the Web Using Inter-Document Similarities

Fiana Raiber, Oren Kurland, Moshe Tennenholtz

Trust Prediction via Aggregating Heterogeneous Social Networks

Jin Huang, Feiping Nie, Heng Huang, Yi-Cheng Tu

Estimating Interleaved Comparison Outcomes from Historical Click Data

Katja Hofmann, Shimon Whiteson, Maarten de Rijke

Automatic Image Annotation Using Tag-Related Random Search over Visual Neighbors

Zijia Lin, Guiguang Ding, Mingqing Hu, Jianmin Wang, Jiaguang Sun

Diversionary Comments under Political Blog Posts

Jing Wang, Clement T Yu, Philip S Yu, Bing Liu, Weiyi Meng

Discover Breaking Events with Popular Hashtags in Twitter

Anqi Cui, Min Zhang, Yiqun Liu, Shaoping Ma, Kuo Zhang

Query Likelihood with Negative Query Generation

Yuanhua Lv, ChengXiang Zhai

On the Connections between Explicit Semantic Analysis and Latent Semantic Analysis

Chao Liu, Yi-Min Wang

Variance Maximization via Noise Injection for Active Sampling in Learning to Rank

Wenbin Cai, Ya Zhang

More Than Relevance: High Utility Query Recommendation By Mining Users' Search Behaviors

Xiaofei Zhu, Jiafeng Guo, Xueqi Cheng, Yanyan Lan

Finding Nuggets in IP Portfolios: Core Patent Mining through Textual Temporal Analysis

Po Hu, Minlie Huang, Peng Xu, Weichang Li, Adam K Usadi, Xiaoyan Zhu

Customizing Search Results for Non-Native Speakers

Theodoros Lappas, Michail Vlachos

Do Ads Compete or Collaborate? Designing Click Models with Full Relationship Incorporated

Xin Xin, Irwin King, Ritesh Agrawal, Michael R. Lyu, Heyan Huang

Ranking News Events by Influence Decay and Information Fusion for Media and Users

Liang Kong, Shan Jiang, Rui Yan, Shize Xu, Yan Zhang

Leveraging Tagging for Neighborhood-aware Probabilistic Matrix Factorization

Le Wu, Enhong Chen, Qi Liu, Linli Xu, Tengfei Bao, Lei Zhang

Semantic Context Learning with Large-Scale Weakly-Labeled Image Set

Yao Lu, Wei Zhang, Ke Zhang, Xiangyang Xue

Sketch-based Indexing of n-Words

Samuel Huston, J. Shane Culpepper, W. Bruce Croft

Interactive and Context-Aware Tag Spell Check and Correction

Francesco Bonchi, Ophir Frieder, Franco Maria Nardini, Fabrizio Silvestri, Hossein Vahabi

Federated Search in the Wild

Dong Nguyen, Thomas Demeester, Dolf Trieschnigg, Djoerd Hiemstra

From sBoW to dCoT: Marginalized Encoders for Text Representation

Zhixiang (Eddie) Xu, Minmin Chen, Kilian Q. Weinberger, Fei Sha

Structured Query Reformulations in Commerce Search

Sreenivas Gollapudi, Samuel Ieong, Anitha Kannan

Towards Jointly Extracting Aspects and Aspect-Specific Sentiment Knowledge

Xueke Xu, Songbo Tan, Yue Liu, Xueqi Cheng, Zheng Lin

Collaborative Ranking: Improving the Relevance for Tail Queries

Ke Zhou, Xin Li, Hongyuan Zha

BiasTrust: Teaching Biased Users About Controversial Topics

V.G.Vinod Vydiswaran, ChengXiang Zhai, Dan Roth, Peter Pirolli

Recommending Citations: Translating Papers into References

Wenyi Huang, Saurabh Kataria, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles, Lior Rokach

Discovering Logical Knowledge for Deep Question Answering

Zhao Liu, Xipeng Qiu, Ling Cao, Xuanjing Huang

Mining Noisy Tagging from Multi-label Space

Zhongang Qi, Ming Yang, Zhongfei (Mark) Zhang, Zhengyou Zhang

Learning from Mistakes: Towards a Correctable Learning Algorithm

Karthik Raman, Krysta M Svore, Ran Gilad-Bachrach, Chris J.C. Burges

Consento: A New Framework for Opinion Based Entity Search and Summarization

Jaehoon Choi, Donghyeon Kim, Seongsoon Kim, Junkyu Lee, Sangrak Lim, Sunwon Lee, Jaewoo Kang

Search Result Presentation Based on Faceted Clustering

Benno Stein, Tim Gollub, Dennis Hoppe

PolariCQ: Polarity Classification of Political Quotations

Rawia Awadallah, Maya Ramanath, Gerhard Weikum

A Comprehensive Analysis of Parameter Settings for Novelty-Biased Cumulative Gain

Teerapong Leelanupab, Guido Zuccon, Joemon M. Jose

Map to Humans and Reduce Error - Crowdsourcing for Deduplication Applied to Digital Libraries

Mihai Georgescu, Dang Duc Pham, Claudiu S. Firan, Wolfgang Nejdl, Julien Gaugaz

Full-Text Citation Analysis: Enhancing Bibliometric and Scientific Publication Ranking

Xiaozhong Liu, Jinsong Zhang, Chun Guo

Detecting Offensive Tweets via Topical Feature Discovery over a Large Scale Twitter Corpus

Guang Xiang, Bin Fan, Ling Wang, Jason Hong, Carolyn Rose

The Downside of Markup: Examining the Harmful Effects of CSS and Javascript on Indexing Today's Web

Karl Gyllstrom, Carsten Eickhoff, Arjen P. de Vries, Marie-Francine Moens

You Should Read This! Let Me Explain You Why

Roi Blanco, Diego Ceccarelli, Claudio Lucchese, Raffaele Perego, Fabrizio Silvestri

Characterizing Web Search Queries that Match Very Few or No Results

Ismail Sengor Altingovde, Roi Blanco, Berkant Barla Cambazoglu, Rifat Ozcan, Erdem Sarigil, Özgür Ulusoy

A Unified Optimization Framework for Auction and Guaranteed Delivery in Online Advertising

Konstantin Salomatin, Tie-Yan Liu, Yiming Yang

Modeling Browsing Behavior for Click Analysis in Sponsored Search

Azin Ashkan, Charles L. A. Clarke

Sentiment-Focused Web Crawling

Gural Vural, B. Barla Cambazoglu, Pinar Senkul

User Guided Entity Similarity Search Using Meta-Path Selection in Heterogeneous Information Networks

Xiao Yu, Yizhou Sun, Brandon Norick, Tiancheng Mao, Jiawei Han

GTE: A Distributional Second-Order Co-Occurrence Approach to Improve the Identification of Top Relevant Dates in Web Snippets

Ricardo Campos, Gaël Dias, Alípio Jorge, Célia Nunes

Stochastic Simulation of Time-Biased Gain

Mark D. Smucker, Charles L. A. Clarke

SonetRank: Leveraging Social Networks to Personalize Search

Abhijith Kashyap, Reza Amini, Vagelis Hristidis

Predicting Web Search Success with Fine-grained Interaction Data

Qi Guo, Dmitry Lagun, Eugene Agichtein

Multi-Session Re-Search: In Pursuit of Repetition and Diversification

Sarah K Tyler, Yi Zhang

Theme Chronicle Model: Chronicle Consists of Timestamp and Topical Words over Each Theme

Noriaki Kawamae

Fast Top-K Similarity Queries Via Matrix Compression

Yucheng Low, Alice X Zheng

Exploiting Concept Hierarchy for Result Diversification

Wei Zheng, Hui Fang, Conglei Yao

DB Track

Sort-based Query-adaptive Loading of R-trees

Daniar Achakeev, Bernhard Seeger, Peter Widmayer

Schema-Free Structured Querying of DBpedia Data

Lushan Han, Tim Finin, Anupam Joshi

Discovering Conditional Inclusion Dependencies

Jana Bauckmann, Ziawasch Abedjan, Ulf Leser, Heiko Müller, Felix Naumann

Diversifying Query Results on Semi-Structured Data

Mahbub Hasan, Abdullah Mueen, Vassilis Tsotras, Eamonn Keogh

SliceSort: Efficient Sorting of Hierarchical Data

Quoc Trung Tran, Chee-Yong Chan

Efficient Buffer Management for Piecewise Linear Representation of Multiple Data Streams

Qing Xie, Jia Zhu, Mohamed A. Sharaf, Xiaofang Zhou, Chaoyi Pang

On Skyline Groups

Chengkai Li, Nan Zhang, Naeemul Hassan, Sundaresan Rajasekaran, Gautam Das

Finding the Optimal Path over Multi-Cost Graphs

Yajun Yang, Jeffrey Xu Yu, Hong Gao, Jianzhong Li

An Efficient Index for Massive IOT Data in Cloud Environment

Youzhong Ma, Jia Rao, Weisong Hu, Xiaofeng Meng, Xu Han, Yu Zhang, Yunpeng Chai, Chunqiu Liu

Clustering Wikipedia Infoboxes to Discover their Types

Thanh Hoang Nguyen, Huong Dieu Nguyen, Viviane Moreira, Juliana Freire

Keyword-based k-Nearest Neighbor Search in Spatial Databases

Guoliang Li, Jing Xu, Jianhua Feng

Credibility-Based Product Ranking for C2C Transactions

Rong Zhang, Chaofeng Sha, Minqi Zhou, Aoying Zhou

Location Selection for Utility Maximization with Capacity Constraints

Yu Sun, Jin Huang, Yueguo Chen, Rui Zhang, Xiaoyong Du

Efficient Estimation of Dynamic Density Functions with an Application to Outlier Detection

Abdulhakim Ali Qahtan, Xiangliang Zhang, Suojin Wang

A Positional Access Method for Relational Databases

Dongzhe Ma, Jianhua Feng, Guoliang Li

Real-Time Aggregate Monitoring with Differential Privacy

Liyue Fan, Li Xiong

Efficient Distributed Locality Sensitive Hashing

Bahman Bahmani, Ashish Goel, Rajendra Shinde

Author-Conference Topic-Connection Model for Academic Network Search

Jianwen Wang, Xiaohua Hu, Xinhui Tu, Tingting He

Impact Neighborhood Indexing (INI) in Diffusion Graphs

Jung Hyun Kim, K. Selcuk Candan, Maria Luisa Sapino

Loyalty-based Selection: Retrieving Objects That Persistently Satisfy Criteria

Zhitao Shen, Muhammad Aamir Cheema, Xuemin Lin

Star-Join: Spatio-Textual Similarity Join

Sitong Liu, Guoliang Li, Jianhua Feng

Adapt: Adaptive Database Schema Design for Multi-Tenant Applications

Jiacai Ni, Guoliang Li, Jun Zhang, Lei Li, Jianhua Feng

A New Tool for Multi-Level Partitioning in Teradata

Young-Kyoon Suh, Ahmad Ghazal, Alain Crolotte, Pekka Kostamaa

Fast PCA Computation in a DBMS with Aggregate UDFs and LAPACK

Carlos Ordonez, Naveen Mohanam, Carlos Garcia-Alvarado, Predrag T. Tosic, Edgar Martinez

Scaling Multiple-Source Entity Resolution using Statistically Efficient Transfer Learning

Sahand N Negahban, Benjamin I. P. Rubinstein, Jim Gemmell

A Probabilistic Approach to Correlation Queries in Uncertain Time Series Data

Mahsa Orang, Nematollaah Shiri

On Bundle Configuration for Viral Marketing in Social Networks

De-Nian Yang, Wang-Chien Lee, Nai-Hui Chia, Mao Ye, Hui-Ju Hung

Conference Reception (6:40 – 9:00)

Room: Courtyard

Wednesday, October 31, 2012

Conference Opening (8:00 – 8:20) Room: Maui Ballroom

Best Paper Award Announcement

CIKM 2013 Presentation

Keynote Speech (8:20 – 9:20) Chair: Mohammed Zaki Room: Maui Ballroom

Title: Learning Similarity Measures based on Random Walks

Speaker: William Cohen, Carnegie Mellon University

Coffee Break (9:20 – 10:05)

Session 13 (10:05 – 12:10) KM Track: Advertisement and Products Chair: Atish Das Sarma Room: Wailuku

Daily-Deal Selection for Revenue Maximization

Theodoros Lappas, Evimaria Terzi

Enabling Direct Interest-Aware Audience Selection

Ariel Fuxman, Anitha Kannan, Zhenhui Li, Panayiotis Tsaparas

Influence Propagation in Adversarial Setting: How to Defeat Competition with Least Amount of Investment

Shahrzad Shirazipourazad, Brian Bogard, Harsh Vachhani, Arunabha Sen, Paul Horn

Large-scale Item Categorization for e-Commerce

Dan Shen, Jean-David Ruvini, Badrul Sarwar

Matching Product Titles using Web-based Enrichment

Vishrawas Gopalakrishnan, Suresh Parthasarathy Iyengar, Amit Madaan, Rajeev Rastogi, Srinivasan Sengamedu

Session 14 (10:05 – 12:10) KM Track: Clustering Chair: Hans-Peter Kriegel Room: Kahului

Scalable Clustering of Signed Networks Using Balance Normalized Cut

Kai-Yang Chiang, Joyce Jiyoung Whang, Inderjit S. Dhillon

Maximum Margin Clustering on Evolutionary Data

Xuhui Fan, Lin Zhu, Longbing Cao, Xia Cui, Yew-Soon Ong

Document-Topic Hierarchies from Document Graphs

Tim Weninger, Yonatan Bisk, Jiawei Han

Improving Document Clustering Using Automated Machine Translation

Xiang Wang, Buyue Qian, Ian Davidson

Right-Protected Data Publishing with Hierarchical Clustering Preservation

Michail Vlachos, Aleksander Wieczorek, Johannes Schneider

Session 15 (10:05 – 12:10) IR Track: Recommendation Systems Chair: Emine Yilmaz Room: Kihei and Wailea

Metaphor: A System for Related Search Recommendations

Azarias Reda, Yubin Park, Mitul Tiwari, Christian Posse, Sam Shah

Exploring Personal Impact for Group Recommendation

Xingjie Liu, Yuan Tian, Mao Ye, Wang-Chien Lee

The Efficient Imputation Method for Neighborhood-based Collaborative Filtering

Yongli Ren, Gang Li, Jun Zhang, Wanlei Zhou

Multi-Faceted Ranking of News Articles using Post-Read Actions

Deepak Agarwal, Bee-Chung Chen, Xuanhui Wang

A Decentralized Recommender System for Effective Web Credibility Assessment

Thanasis G. Papaioannou, Jean-Eudes Ranvier, Alexandra Olteanu, Karl Aberer

Session 16 (10:05 – 12:10) IR Track: Digital Libraries and Citation Analysis Chair: Wolfgang Nejdl Room: Kapalua

Towards an Effective and Unbiased Ranking of Scientific Literature through Mutual Reinforcement

Xiaorui Jiang, Xiaoping Sun, Hai Zhuge

A Math-Aware Search Engine for Math Question Answering System

Tam T. Nguyen, Kuiyu Chang, Siu Cheung Hui

Contextualization using Hyperlinks and Internal Hierarchical Structure of Wikipedia Documents

Muhammad Ali Norozi, Paavo Arvola, Arjen P. de Vries

Understanding Book Search Behavior on the Web

Jinyoung Kim, Henry Feild, Marc Cartright

Temporal Corpus Summarization Using Submodular Word Coverage

Ruben Sipos, Adith Swaminathan, Pannaga Shivaswamy, Thorsten Joachims

Lunch On Your Own (12:10 – 01:30)

Session 17 (1:30 – 3:35) KM Track: Text Mining Chair: Jamie Salvador Argullo Room: Wailuku

TCSST: Transfer Classification of Short & Sparse Text Using External Data

Guodong Long, Ling Chen, Xingquan Zhu, Chengqi Zhang

The Generalized Dirichlet Distribution in Enhanced Topic Detection

Karla L Caballero, Joel Barajas, Ram Akella

Modeling Topic Hierarchies with the Recursive Chinese Restaurant Process

Joon Hee Kim, Dongwoo Kim, Suin Kim, Alice Oh

Two-part Segmentation of Text Documents

Deepak P, Karthik Visweswariah, Nirmalie Wiratunga, Sadiq Sani

On the Design of LDA Models for Aspect-based Opinion Mining

Samaneh Moghaddam, Martin Ester

Session 18 (1:30 – 3:35) IR Track: Formal Retrieval Models and Learning to Rank Chair: Yi Zhang Room: Kahului

Predicting Query Performance for Fusion-Based Retrieval

Gad Markovits, Anna Shtok, Oren Kurland, David Carmel

Back to the Roots: A Probabilistic Framework for Query-Performance Prediction

Oren Kurland, Anna Shtok, Shay Hummel, Fiana Raiber, David Carmel, Ofri Rom

Learning to Rank for Robust Question Answering

Arvind Agarwal, Hema Raghavan, Karthik Subbian, Prem Melville, Richard D Lawrence, David C Gondek, James Fan

Learning to Rank By Aggregating Expert Preferences

Maksims N Volkovs, Hugo Larochelle, Richard S Zemel

Learning to Rank Duplicate Bug Reports

Jian Zhou, Hongyu Zhang

Session 19 (1:30 – 3:35) DB Track: Probabilistic and Uncertain Data Chair: Daisy Zhe Wang Room: Kihei and Wailea

A Model-based Approach for RFID Data Stream Cleansing

Zhou Zhao, Wilfred Ng

What is the IQ of your Data Transformation System?

Giansalvatore Mecca, Paolo Papotti, Salvatore Raunich, Donatello Santoro

On the Foundations of Probabilistic Information Integration

Fereidoon Sadri

GPU Acceleration of Probabilistic Frequent Itemset Mining from Uncertain Databases

Yusuke Kozawa, Toshiyuki Amagasa, Hiroyuki Kitagawa

Completeness of Queries over SQL Databases

Werner Nutt, Simon Razniewski

Session 20 (1:30 – 3:35) DB Track: Top-k and Nearest Neighbor Queries Chair: Eduard C. Dragut Room: Kapalua

Being Picky-Processing Top-K Queries with Set-Defined Selections

Aleksandar Stupar, Sebastian Michel

Finding Top k Most Influential Spatial Facilities over Uncertain Objects

Liming Zhan, Ying Zhang, Wenjie Zhang, Xuemin Lin

Efficient Safe-Region Construction for Moving Top-K Spatial Keyword Queries

Weihuang Huang, Guoliang Li, Kian-Lee Tan, Jianhua Feng

Monochromatic and Bichromatic Reverse Nearest Neighbor Queries on Land Surfaces

Da Yan, Zhou Zhao, Wilfred Ng

Pay-as-you-go Maintenance of Precomputed Nearest Neighbors in Large Graphs

Tom Crecelius, Ralf Schenkel

Coffee Break (3:35 – 3:50)

Session 21 (3:50 – 5:30) KM Track: Spatial and Temporal Methods Chair: Jalal Mahmud Room: Wailuku

Spatial Influence vs. Community Influence: Modeling the Global Spread of Social Media

Krishna Y Kamath, James Caverlee, Zhiyuan Cheng, Daniel Z Sui

TUT: A Statistical Model for Detecting Trends, Topics and User Interests in Social Media

Xuning Tang, Christopher C. Yang

Predicting Aggregate Social Activities Using Continuous-Time Stochastic Process

Shu Huang, Min Chen, Bo Luo, Dongwon Lee

Acquiring Temporal Constraints between Relations

Partha Pratim Talukdar, Derry Wijaya, Tom Mitchell

Session 22 (3:50 – 5:30) IR Track: Web Search Chair: Fabrizio Silvestri Room: Kahului

Towards Optimum Query Segmentation: In Doubt Without

Matthias Hagen, Martin Potthast, Anna Beyer, Benno Stein

Leaving So Soon? Understanding and Predicting Web Search Abandonment Rationales

Abdigani Diriye, Ryen White, Georg Buscher, Susan Dumais

Click Patterns: An Empirical Representation of Complex Query Intents

Huizhong Duan, Emre Kiciman, ChengXiang Zhai

Domain Dependent Query Reformulation for Web Search

Van Dang, Giridhar Kumaran, Adam Troy

Session 23 (3:50 – 5:30) DB Track: Web Data Management Chair: Lipyeow Lim Room: Kihei and Wailea

An Automatic Blocking Mechanism for Large-Scale De-duplication Tasks

Anish Das Sarma, Ankur Jain, Ashwin Machanavajjhala, Philip Bohannon

Processing Continuous Text Queries Featuring Non-Homogeneous Scoring Functions

Nelly Vouzoukidou, Bernd Amann, Vassilis Christophides

Comprehension-Based Result Snippets

Abhijith Kashyap, Vagelis Hristidis

An Effective Rule Miner for Instance Matching in a Web of Data

Xing Niu, Shu Rong, Haofen Wang, Yong Yu

Short Paper Session S7 (3:50 – 5:30) IR Track: Ranking and Recommendation Chair: Hong Cheng Room: Kapalua

Variance Maximization via Noise Injection for Active Sampling in Learning to Rank

Wenbin Cai, Ya Zhang

More Than Relevance: High Utility Query Recommendation By Mining Users' Search Behaviors

Xiaofei Zhu, Jiafeng Guo, Xueqi Cheng, Yanyan Lan

Recommending Citations: Translating Papers into References

Wenyi Huang, Saurabh Kataria, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles, Lior Rokach

Discovering Logical Knowledge for Deep Question Answering

Zhao Liu, Xipeng Qiu, Ling Cao, Xuanjing Huang

Consento: A New Framework for Opinion Based Entity Search and Summarization

Jaehoon Choi, Donghyeon Kim, Seongsoon Kim, Junkyu Lee, Sangrak Lim, Sunwon Lee, Jaewoo Kang

Search Result Presentation Based on Faceted Clustering

Benno Stein, Tim Gollub, Dennis Hoppe

Entity Centric Query Expansion for Enterprise Search

Xitong Liu, Hui Fang, Fei Chen, Min Wang

Automatic Query Expansion Based on Tag Recommendation

Vitor Oliveira, Guilherme Gomes, Fabiano Belem, Wladmir Brandao, Jussara Almeida, Nivio Ziviani, Marcos Gonçalves

Query Recommendation for Children

Sergio Duarte Torres, Djoerd Hiemstra, Ingmar Weber, Pavel Serdyukov

Conference Banquet (5:30 – 8:30)

Thursday, November 1, 2012

Keynote Speech (8:30 – 9:30) Chair: Haixun Wang Room: Maui Ballroom

Title: Compressed Data Structures with Relevance

Speaker: Jeffrey S. Vitter, University of Kansas

Coffee Break (9:30 – 10:15)

Industry Day Morning Session (10:15 – 12:20) Chair: Evgeniy Gabrilovich Room: Wailuku

10:15 – 10:20: Opening Remarks

10:20 – 11:05: Keynote Talk, Having a Great Career in Research

Eric Brill, eBay

11:05 – 11:35, Is This Entity Relevant to Your Needs?

David Carmel, IBM Research

11:35 – 12:20, Keynote Talk, The Future of Information Diversity and Search: Content Optimization, Interactivity, Semantics, and Social Networks

Raghu Ramakrishnan, Microsoft

Session 24 (10:15 – 12:20) KM Track: Information Extraction Chair: Chengkai Li Room: Kahului

Non-stationary Bayesian Networks based on Perfect Simulation

Yi Jia, Wenrong Zeng, Jun Huan

Active Learning for Relation Type Extension with Local and Global Data Views

Ang Sun, Ralph Grishman

Segmenting Web-Domains and Hashtags using Length Specific Models

Sriram Srinivasan, Sourangshu Bhattacharya, Rudrasis Chakraborty

Crosslingual Distant Supervision for Extracting Relations of Different Complexity

Andre Blessing, Hinrich Schütze

Labeling by Landscaping: Classifying Token in Context by Pruning and Decorating Trees

Siddharth Patwardhan, Branimir Boguraev, Apoorv Agarwal, Alessandro Moschitti, Jennifer Chu-Carroll

Session 25 (10:15 – 12:20) IR Track: Topic Modeling and Content and Sentiment Analysis Chair: Paul McNamee Room: Kihei and Wailea

G-WSTD: A Framework For Geographic Web Search Topic Discovery

Di Jiang, Jan Vosecky, Kenneth Wai-Ting Leung, Wilfred Ng

Supporting Factual Statements with Evidence from the Web

Chee Wee Leong, Silviu Cucerzan

Role-explicit Query Identification and Intent Role Annotation

Haitao Yu, Fuji Ren

Towards Concept-Based Translation Models Using Search Logs for Query Expansion

Jianfeng Gao, Jian-Yun Nie

Joint Topic Modeling for Event Summarization across News and Social Media Streams

Wei Gao, Peng Li, Kareem Darwish

Session 26 (10:15 – 12:20) DB Track: Query Processing, Optimization and Performance Chair: Ariel Fuxman Room: Kapalua

CGStream: Continuous Correlated Graph Query for Data Streams

Shirui Pan, Xingquan Zhu

Efficient Influence-Based Processing of Market Research Queries

Anastasios Arvanitis, Antonios Deligiannakis, Yannis Vassiliou

Deco: Declarative Crowdsourcing

Aditya Ganesh Parameswaran, Hyunjung Park, Hector Garcia-Molina, Neoklis Polyzotis, Jennifer Widom

Predicting the Effectiveness of Keyword Queries on Databases

Shiwen Cheng, Arash Termehchy, Vagelis Hristidis

You Can Stop Early with COLA: Online Processing of Aggregate Queries in the Cloud

Yingjie Shi, Xiaofeng Meng, Fusheng Wang, Yantao Gan

Short Paper Session S8 (10:15 – 12:20) KM Track: Learning and Knowledge Discovery Chair: Qi He Room: Napili

Hierarchical Co-Clustering Based on Entropy Splitting

Wei Cheng, Xiang Zhang, Feng Pan, Wei Wang

Adapting Vector Space Model to Ranking-based Collaborative Filtering

Shuaiqiang Wang, Jiankai Sun, Byron J Gao, Jun Ma

Joint Relevance and Answer Quality Learning for Question Routing in Community QA

Guangyou Zhou, Kang Liu, Jun Zhao

Learning Spectral Embedding via Iterative Eigenvalue Thresholding

Fanhua Shang, L. C. Jiao, Yuanyuan Liu, Fei Wang

Discovering Personally Semantic Places from GPS Trajectories

Mingqi Lv, Ling Chen, Gencai Chen

Swimming against the Streamz: Search and Analytics over the Enterprise Activity Stream

Ido Guy, Tal Steier, Maya Barnea, Inbal Ronen, Tal Daniel

Frequent grams based Embedding for Privacy Preserving Record Linkage

Luca Bonomi, Li Xiong, Rui Chen, Benjamin C. M. Fung

If You are Happy and You Know It... Tweet

Amir Asiaee T., Mariano Tepper, Arindam Banerjee, Guillermo Sapiro

Hierarchical Topic Integration Through Semi-supervised Hierarchical Topic Modeling

Xian-Ling Mao, Jing He, Hongfei Yan, Xiaoming Li

iSampling: Framework for Developing Sampling Methods Considering User's Interest

Jinoh Oh, Hwanjo Yu

Relational Co-Clustering via Manifold Ensemble Learning

Ping Li, Jiajun Bu, Chun Chen, Zhanying He

Measuring Website Similarity using an Entity-Aware Click Graph

Pablo N Mendes, Peter Mika, Hugo Zaragoza, Roi Blanco

A Hybrid Approach for Efficient Provenance Storage

Yulai Xie, Dan Feng, Zhipeng Tan, Lei Chen, Kiran-Kumar Muniswamy-Reddy, Yan Li, Darrell D. E. Long

Short Paper Session S9 (10:15 – 12:20) IR Track: Search and Advanced IR Chair: Arunabha Sen Room: Kula and Hana

Customizing Search Results for Non-Native Speakers

Theodoros Lappas, Michail Vlachos

Sketch-based Indexing of n-Words

Samuel Huston, J. Shane Culpepper, W. Bruce Croft

Interactive and Context-Aware Tag Spell Check and Correction

Francesco Bonchi, Ophir Frieder, Franco Maria Nardini, Fabrizio Silvestri, Hossein Vahabi

From sBoW to dCoT: Marginalized Encoders for Text Representation

Zhixiang (Eddie) Xu, Minmin Chen, Kilian Q. Weinberger, Fei Sha

BiasTrust: Teaching Biased Users About Controversial Topics

V.G.Vinod Vydiswaran, ChengXiang Zhai, Dan Roth, Peter Pirolli

A Comprehensive Analysis of Parameter Settings for Novelty-Biased Cumulative Gain

Teerapong Leelanupab, Guido Zuccon, Joemon M. Jose

Differences in Effectiveness Across Sub-collections

Mark Sanderson, Andrew Turpin, Ying Zhang, Falk Scholer

Map to Humans and Reduce Error - Crowdsourcing for Deduplication Applied to Digital Libraries

Mihai Georgescu, Dang Duc Pham, Claudiu S. Firan, Wolfgang Nejdl, Julien Gaugaz

You Should Read This! Let Me Explain You Why

Roi Blanco, Diego Ceccarelli, Claudio Lucchese, Raffaele Perego, Fabrizio Silvestri

User Guided Entity Similarity Search Using Meta-Path Selection in Heterogeneous Information Networks

Xiao Yu, Yizhou Sun, Brandon Norick, Tiancheng Mao, Jiawei Han

Multi-Session Re-Search: In Pursuit of Repetition and Diversification

Sarah K Tyler, Yi Zhang

Fast Top-K Similarity Queries Via Matrix Compression

Yucheng Low, Alice X Zheng

Lunch Provided by the Conference (12:20 – 01:30)

Industry Day Midday Session (1:30 – 3:35) Chair: Evgeniy Gabrilovich Room: Wailuku

1:30 – 2:00, Social Media, Data Integration, and Human Computation

AnHai Doan, WalmartLabs and UW-Madison

2:00 – 2:30, Question Answering Through Tencent Open Platform

Chao Liu, Tencent

2:30 – 3:00, Data by the People, for the People

Daniel Tunkelang, LinkedIn

3:00 – 3:30, Leveraging Data to Power Local Commerce

Rajesh Parekh, Groupon

Session 27 (1:30 – 3:35) KM Track: Classification and Semantic Methods Chair: Siddharth Patwardhan Room: Kahului

A Novel Local Patch Framework for Fixing Supervised Learning Models

Yilei Wang, Bingzheng Wei, Jun Yan, Yang Hu, Zhi-Hong Deng, Zheng Chen

Automated Feature Weighting in Naive Bayes for High-dimensional Data Classification

Lifei Chen, Shengrui Wang

Learning to Discover Complex Mappings from Web Forms to Ontologies

Yuan An, Xiaohua Hu, Il-Yeol Song

Modeling Semantic Relations between Visual Attributes and Object Categories via Dirichlet Forest Prior

Xin Chen, Xiaohua Hu, Zhongna Zhou, Yuan An, Tingting He, E.K. Park

CoNet: Feature Generation for Multi-View Semi-Supervised Learning with Partially Observed Views

Brian Quanz, Jun Huan

Session 28 (1:30 – 3:35) IR Track: Multimedia and User Feedback Chair: Paul McNamee Room: Kihei and Wailea

Generating Facets for Phone-based Navigation of Structured Data

Krishna Kummamuru, Ajith Jujjuru, Mayuri Duggirala

The Effect of Aggregated Search Coherence on Search Behavior

Jaime Arguello, Robert Capra

Improving Bag-of-visual-Words Model with Spatial-Temporal Correlation for Video Retrieval

Lei Wang, Dawei Song, Eyad Elyan

Exploring and Predicting Search Task Difficulty

Jingjing Liu, Chang Liu, Michael Cole, Nicholas J. Belkin, Xiangmin Zhang

Iterative Relevance Feedback with Adaptive Exploration/Exploitation Trade-off

Nicolae Suditu, François Fleuret

Session 29 (1:30 – 3:35) DB Track: Emerging and Advanced Topics Chair: Anish Das Sarma Room: Kapalua

A Practical Concurrent Index for Solid-State Drives

Risi Thonangi, Shivnath Babu, Jun Yang

Robust Distributed Indexing for Locality-Skewed Workloads

Mu-Woong Lee, Seung-won Hwang

Efficient Provenance Storage For Relational Queries

Zhifeng Bao, Henning Koehler, Liwei Wang, Xiaofang Zhou, Shazia Sadiq

Generically Extending Anonymization Algorithms to Deal with Successive Queries

Manuel Barbosa, Alexandre Pinto, Bruno Gomes

Authentication of Moving Range Queries

Duncan Yung, Eric Lo, Man Lung Yiu

Short Paper Session S10 (1:30 – 3:35) IR Track: Click Models, Learning and Mining Chair: Haixun Wang Room: Napili

Do Ads Compete or Collaborate? Designing Click Models with Full Relationship Incorporated

Xin Xin, Irwin King, Ritesh Agrawal, Michael R. Lyu, Heyan Huang

Finding Nuggets in IP Portfolios: Core Patent Mining through Textual Temporal Analysis

Po Hu, Minlie Huang, Peng Xu, Weichang Li, Adam K Usadi, Xiaoyan Zhu

Semantic Context Learning with Large-Scale Weakly-Labeled Image Set

Yao Lu, Wei Zhang, Ke Zhang, Xiangyang Xue

Mining Noisy Tagging from Multi-label Space

Zhongang Qi, Ming Yang, Zhongfei (Mark) Zhang, Zhengyou Zhang

Learning from Mistakes: Towards a Correctable Learning Algorithm

Karthik Raman, Krysta M Svore, Ran Gilad-Bachrach, Chris J.C. Burges

PolariCQ: Polarity Classification of Political Quotations

Rawia Awadallah, Maya Ramanath, Gerhard Weikum

Modeling Browsing Behavior for Click Analysis in Sponsored Search

Azin Ashkan, Charles L. A. Clarke

User Activity Profiling with Multi-Layer Analysis

Hongxia Jin

Stochastic Simulation of Time-Biased Gain

Mark D. Smucker, Charles L. A. Clarke

Predicting Web Search Success with Fine-grained Interaction Data

Qi Guo, Dmitry Lagun, Eugene Agichtein

Mining Sentiment Terminology Through Time

Hadi Amiri, Tat-Seng Chua

Short Paper Session S11 (1:30 – 3:35) DB Track: Advanced DB Topics Chair: Seungwon Hwang Room: Kula and Hana

Efficient Logging for Enterprise Workloads on Column-Oriented In-Memory Databases

Johannes Wust, Joos-Hendrick Boese, Frank Renkes, Sebastian Blessing, Jens Krueger, Hasso Plattner

Discovering Conditional Inclusion Dependencies

Jana Bauckmann, Ziawasch Abedjan, Ulf Leser, Heiko Müller, Felix Naumann

Efficient Buffer Management for Piecewise Linear Representation of Multiple Data Streams

Qing Xie, Jia Zhu, Mohamed A. Sharaf, Xiaofang Zhou, Chaoyi Pang

On Skyline Groups

Chengkai Li, Nan Zhang, Naeemul Hassan, Sundaresan Rajasekaran, Gautam Das

Clustering Wikipedia Infoboxes to Discover their Types

Thanh Hoang Nguyen, Huong Dieu Nguyen, Viviane Moreira, Juliana Freire

Efficient Estimation of Dynamic Density Functions with an Application to Outlier Detection

Abdulhakim Ali Qahtan, Xiangliang Zhang, Suojin Wang

Real-Time Aggregate Monitoring with Differential Privacy

Liyue Fan, Li Xiong

Efficient Distributed Locality Sensitive Hashing

Bahman Bahmani, Ashish Goel, Rajendra Shinde

Star-Join: Spatio-Textual Similarity Join

Sitong Liu, Guoliang Li, Jianhua Feng

Adapt: Adaptive Database Schema Design for Multi-Tenant Applications

Jiacai Ni, Guoliang Li, Jun Zhang, Lei Li, Jianhua Feng

Scaling Multiple-Source Entity Resolution using Statistically Efficient Transfer Learning

Sahand N Negahban, Benjamin I. P. Rubinstein, Jim Gemmell

On Bundle Configuration for Viral Marketing in Social Networks

De-Nian Yang, Wang-Chien Lee, Nai-Hui Chia, Mao Ye, Hui-Ju Hung

Coffee Break (3:35 – 4:00)

Industry Day Afternoon Session (4:00 – 6:00) Chair: Evgeniy Gabrilovich Room: Wailuku

4:00 – 4:45, Keynote Talk, Revolutionizing Digital Marketing with Big Data Analytics

Tom Malloy, Adobe

4:45 – 5:15, Programming and Debugging Large-Scale Data Processing Workflows

Christopher Olston, Google

5:15 – 6:00, Keynote Talk, From HyperText to HyperTEC

Xuedong Huang, Microsoft

Session 30 (4:00 – 5:40) KM Track: Novel Applications Chair: Yuan An Room: Kahului

Model the Complex Dependence Structures of Financial Variables by Using Canonical Vine

Wei Wei, Xuhui Fan, Jinyan Li, Longbing Cao

A Unified Learning Framework for Auto Face Annotation by Mining Web Facial Images

Dayong Wang, Steven Chu Hong Hoi, Ying He

Efficient Jaccard-based Diversity Analysis of Large Document Collections

Fan Deng, Stefan Siersdorfer, Sergej Zerr

Knowing Where and How Criminal Organizations Operate Using Web Content

Michele Coscia, Viridiana Rios

Session 31 (4:00 – 5:40) IR Track: Social Networks Chair: Qi He Room: Kihei and Wailea

Social Recommendation Across Multiple Relational Domains

Meng Jiang, Peng Cui, Fei Wang, Qiang Yang, Wenwu Zhu, Shiqiang Yang

Mining Competitive Relationships by Learning across Heterogeneous Networks

Yang Yang, Jie Tang, Jacklyne Keomany, Yanting Zhao, Juanzi Li, Ying Ding, Tian Li, Liangwei Wang

Evaluating Geo-Social Influence in Location-Based Social Networks

Chao Zhang, Lidan Shou, Ke Chen, Gang Chen, Yijun Bei

The Walls Have Ears: Optimize Sharing for Visibility and Privacy in Online Social Networks

Thang N. Dinh, Yilin Shen, My T. Thai

Short Paper Session S12 (4:00 – 5:40) IR Track: Social Networks Chair: Prem Melville Room: Napili

Trust Prediction via Aggregating Heterogeneous Social Networks

Jin Huang, Feiping Nie, Heng Huang, Yi-Cheng Tu

Diversionary Comments under Political Blog Posts

Jing Wang, Clement T Yu, Philip S Yu, Bing Liu, Weiyi Meng

Discover Breaking Events with Popular Hashtags in Twitter

Anqi Cui, Min Zhang, Yiqun Liu, Shaoping Ma, Kuo Zhang

Interest-Matching Information Propagation in Multiple Online Social Networks

Yilin Shen, Thang N. Dinh, Huiyuan Zhang, My T. Thai

Quality Models for Microblog Retrieval

Jaeho Choi, W. Bruce Croft, Jin Young Kim

Query-biased Learning to Rank for Real-time Twitter Search

Xin Zhang, Ben He, Tiejian Luo, Baobin Li

Location-Sensitive Resources Recommendation in Social Tagging Systems

Chang Wan, Ben Kao, David W. Cheung

Detecting Offensive Tweets via Topical Feature Discovery over a Large Scale Twitter Corpus

Guang Xiang, Bin Fan, Ling Wang, Jason Hong, Carolyn Rose

CIKM’12 Poster Session

Wednesday, October 31, 2012

Poster Session (10:05 – 12:10) KM Track Chair: Lipyeow Lim Room: Napili

Learning to Rank for Hybrid Recommendation

Jiankai Sun, Shuaiqiang Wang, Byron J. Gao, Jun Ma

Importance Weighted Passive Learning

Shuaiqiang Wang, Xiaoming Xi, Yilong Yin

A Tag-Centric Discriminative Model for Web Objects Classification


Outlier Detection using Centrality and Center-Proximity

Duck-Ho Bae, Seo Jeong, Sang-Wook Kim, Minsoo Lee

An Effective Category Classification Method Based on a Language Model for Question Category Recommendation on a cQA service

Kyoungman Bae, Youngjoong Ko

Clustering Short Text Using Ncut-weighted Non-negative Matrix Factorization

Xiaohui Yan, Jiafeng Guo, Shenghua Liu, Xue-qi Cheng, Yanfeng Wang

Polygene-based Evolution: A Novel Framework for Evolutionary Algorithms

Shuaiqiang Wang, Byron J. Gao, Shuangling Wang, Guibao Cao, Yilong Yin

A Tensor Encoding Model for Semantic Processing

Michael Symonds, Peter D Bruza, Laurianne Sitbon, Ian Turner

Accelerating Locality Preserving Nonnegative Matrix Factorization

Guanhong Yao, Cai Deng

The Twitaholic Next Door.

Patrick Bamba, Julien Subercaze, Christophe Gravier, Nabil Benmira, Jimi Fontaine

Information Propagation in Social Rating Networks

Priyanka Garg, Irwin King, Michael R. Lyu

Maximizing Revenue from Strategic Recommendations under Decaying Trust

Paul Dütting, Monika Henzinger, Ingmar Weber

Weighted Linear Kernel with Tree Transformed Features For Malware Detection

Prakash Mandayam Comar, Lei Liu, Sabyasachi Saha, Antonio Nucci, Pang-Ning Tan

Learning to Predict the Cost-Per-Click for Your Ad Words

Chieh-Jen Wang, Hsin-Hsi Chen

Dual Word and Document Seed Selection for Semi-supervised Sentiment Classification

Shengfeng Ju, Shoushan Li, Yan Su, Guodong Zhou, Yu Hong, Xiaojun Li

On Empirical Tradeoffs in Large Scale Hierarchical Classification

Rohit Babbar, Ioannis Partalas, Eric Gaussier, Cecile Amblard

An Interaction Framework of Service-oriented Ontology Learning

Jingsong Zhang, Yinglin Wang, Hao Wei

Infobox Suggestion for Wikipedia Entities

Afroza Sultana, Quazi Mainul Hasan, Ashis Kumer Biswas, Soumyava Das, Habibur Rahman, Chris Ding, Chengkai Li

Time Feature Selection for Identifying Active Household Members

Pedro G. Campos, Alejandro Bellogin, Fernando Diez, Ivan Cantador

Text Classification with Relatively Small Positive Documents and Unlabeled Data

Fumiyo Fukumoto, Takeshi Yamamoto, Suguru Matsuyoshi, Yoshimi Suzuki

On Compressing Weighted Time-evolving Graphs

Wei Liu, Andrey Kan, Jeffrey Chan, James Bailey, Christopher Leckie, Jian Pei, Ramamohanarao Kotagiri

Graph-based Collective Classification for Tweets

Yajuan Duan, Furu Wei, Ming Zhou, Heung-Yeung Shum

A Word-Order Based Graph Representation For Relevance Identification

Lakshmi Ramachandran, Edward F Gehringer

Tracing Clusters in Evolving Graphs with Node Attributes

Brigitte Boden, Stephan Günnemann, Thomas Seidl

Prediction of Retweet Cascade Size over Time

Andrey Kupavskii, Liudmila Ostroumova, Alexey Umnov, Svyatoslav Usachev, Pavel Serdyukov, Gleb Gusev, Andrey Kustarev

An Efficient and Simple Under-sampling Technique for Imbalanced Time Series Classification

Guohua Liang, Chengqi Zhang

Top-N Recommendation through Belief Propagation

Jiwoon Ha, Soon-Hyoung Kwon, Sang-Wook Kim, Christos Faloutsos, Sunju Park

Mining Advices from Weblogs

Alfan Farizki Wicaksono, Sung-Hyon Myaeng

Parallel Proximal Support Vector Machine for High-Dimensional Pattern Classification

Zhenfeng Zhu, Xingquan Zhu, Yangdong Ye, Yue-Fei Guo, Xiangyang Xue

On Using Category Experts for Improving the Performance and Accuracy in Recommender Systems

Won-Seok Hwang, Ho-Jong Lee, Sang-Wook Kim, Minsoo Lee

Finding Influential Products on Social Domination Game

Jinyoung Yeo, Jin-woo Park, Seung-won Hwang

Entity Resolution using Search Engine Results

Madian Khabsa, Pucktada Treeratpituk, C. Lee Giles

Tweet Classification Based on Their Lifetime Duration

Hikaru Takemura, Keishi Tajima

Scalable Collaborative Filtering Using Incremental Update and Local Link Prediction

Xiao Yang, Zhaoxin Zhang, Ke Wang

Composing Activity Groups in Social Networks

Cheng-Te Li, Man-Kwan Shan

A Co-training based Method for Chinese Patent Semantic Annotation

Xu Chen, Zhiyong Peng, Cheng Zeng

Automatic Labeling Hierarchical Topics

Xian-Ling Mao, Zhao-Yan Ming, Zheng-Jun Zha, Tat-Seng Chua, Hongfei Yan, Xiaoming Li

An Unsupervised Method for Author Extraction from Web Pages Containing User-Generated Content

Jing Liu, Xinying Song, Jingtian Jiang, Chin-Yew Lin

Hierarchical Target Type Identification for Entity-oriented Queries

Krisztian Balog, Robert Neumayer

Dictionary based Sparse Representation for Domain Adaptation

Rishabh Mehrotra, Rushabh Agrawal, Syed Aqueel Haider

Poster Session (1:30 – 3:35) IR Track Chair: Lipyeow Lim Room: Napili

Selecting Expansion Terms as a Set via Integer Linear Programming

Qi Zhang, Yan Wu, Xuanjing Huang

An Evaluation and Enhancement of Densitometric Fragmentation for Content Slicing Reuse

Killian Levacher, Seamus Lawless, Vincent Wade

Mathematical Equation Retrieval Using Plain Words as a Query

Shinil Kim, Seon Yang, Youngjoong Ko

Serial Position Effects of Clicking Behavior on Result Pages Returned by Search Engines

Mingda Wu, Shan Jiang, Yan Zhang

Towards Measuring the Visualness of a Concept

Jin-Woo Jeong, Xin-Jing Wang, Dong-Ho Lee

Fast Candidate Generation for Two-Phase Document Ranking: Postings List Intersection with Bloom Filters

Nima Asadi, Jimmy Lin

Semantically Coherent Image Annotation with a Learning-based Keyword Propagation Strategy

Chaoran Cui, Jun Ma, Shuaiqiang Wang, Shuai Gao, Tao Lian

Language Processing for Arabic Microblog Retrieval

Kareem Darwish, Walid Magdy, Ahmed Mourad

Hierarchical Image Annotation Using Semantic Hierarchies

Hichem Bannour, Céline Hudelot

On the Inference of Average Precision from Score Distributions

Ronan Cummins

An Evaluation of Corpus-driven Measures of Medical Concept Similarity for Information Retrieval

Bevan Koopman, Guido Zuccon, Peter Bruza, Laurianne Sitbon, Michael Lawley

A Constraint to Automatically Regulate Document-Length Normalisation

Ronan Cummins, Colm O'Riordan

Bridging Offline and Online Social Graph Dynamics

Manuel Gomez Rodriguez, Monica Rogati

Predicting the Performance of Passage Retrieval for Question Answering

Eyal Krikon, David Carmel, Oren Kurland

Coarse-to-Fine Sentence-level Emotion Classification based on the Intra-sentence Features and Sentential Context

Jun Xu, Ruifeng Xu, Qin Lu, Xiaolong Wang

Query-Performance Prediction and Cluster Ranking: Two Sides of the Same Coin

Oren Kurland, Fiana Raiber, Anna Shtok

Learning to Rank Search Results for Time-Sensitive Queries

Nattiya Kanhabua, Kjetil Nørvåg

On Active Learning in Hierarchical Classification

Yu Cheng, Kunpeng Zhang, Yusheng Xie, Ankit Agrawal, Alok Choudhary

Question-Answer Topic Model for Question Retrieval in Community Question Answering

Zongcheng Ji, Fei Xu, Bin Wang, Ben He

How Do Humans Distinguish Different People with Identical Names on the Web?

Harumi Murakami, Yuki Miyake

Enhancing Product Search by Best-Selling Prediction in E-Commerce

Bo Long, Jiang Bian, Anlei Dong, Yi Chang

Survival Analysis for Freshness in Microblogging Search

Gianni Amati, Giuseppe Amodeo, Carlo Gaibisso

Information Preservation in Static Index Pruning

Ruey-Cheng Chen, Chia-Jung Lee, Chiung-Min Tsai, Jieh Hsiang

Temporal Models for Microblogs

Jaeho Choi, W. Bruce Croft

I want what I need! Analyzing Subjectivity of Online Forum Threads

Prakhar Biyani, Cornelia Caragea, Amit Singh, Prasenjit Mitra

Improving the Performance of the Reinforcement Learning Model for Answering Complex Questions

Yllias Chali, Sadid A. Hasan, Kaisar Imam

Relation Regularized Subspace Recommending for Related Scientific Articles

Qing Zhang, Jianwu Li, Zhiping Zhang, Li Wang

Exploring the Cluster Hypothesis, and Cluster-Based Retrieval, over the Web

Fiana Raiber, Oren Kurland

A Picture Paints a Thousand Words: a Method of Generating Image-text Timelines

Shize Xu, Liang Kong, Yan Zhang

Short-Text Domain Specific Key Terms/Phrases Extraction Using an n-gram Model with Wikipedia

M. Atif Qureshi, Colm O'Riordan, Gabriella Pasi

A New Probabilistic Model for Top-k Ranking Problem

Shuzi Niu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng

Large Scale Analysis of Changes in English Vocabulary over Recent Time

Adam Jatowt, Katsumi Tanaka

Climbing the App Wall: Enabling Mobile App Discovery through Context-Aware Recommendations

Alexandros Karatzoglou, Linas Baltrunas, Karen Church, Matthias Böhmer

TwiSent: A Multistage System for Analyzing Sentiment in Twitter

Subhabrata Mukherjee, Akshat Malu, Balamurali A.R., Pushpak Bhattacharyya

Twitter Hyperlink Recommendation with User-Tweet-Hyperlink Three-way Clustering

Dehong Gao, Renxian Zhang, Wenjie Li, Yuexian Hou

Concavity in IR Models

Stéphane Clinchant

Extracting Interesting Association Rules from Toolbar Data

Ilaria Bordino, Debora Donato, Barbara Poblete

Predicting CTR of New Ads via Click Prediction

Alexander Kolesnikov, Yury Logachev, Valeriy Topinskiy

An Examination of Content Farms in Web Search using Crowdsourcing

Richard McCreadie, Craig Macdonald, Iadh Ounis, Jim Giles, Ferris Jabr

Demographic Context in Web Search Re-ranking

Eugene Kharitonov, Pavel Serdyukov

Poster Session (3:50 – 5:30) IR + DB Track Chair: Lipyeow Lim Room: Napili

IR Track

On the Usefulness of Query Features for Learning to Rank

Craig Macdonald, Rodrygo L.T. Santos, Iadh Ounis

Session-based Query Performance Prediction

Andrey Kustarev, Yury Ustinovskiy, Anna Mazur, Pavel Serdyukov

A Latent Pairwise Preference Learning Approach for Recommendation from Implicit Feedback

Yi Fang, Luo Si

Topic Based Pose Relevance Learning In Dance Archives

Reede Ren, John Collomosse, Joemon Jose

PhotoFall: Discovering Weblog Stories Through Photographs

Christopher Wienberg, Andrew S. Gordon

RESQ: Rank-Energy Selective Query Forwarding for Distributed Search Systems

Amin Teymorian, Xiao Qin, Ophir Frieder

The Face of Quality in Crowdsourcing Relevance Labels

Gabriella Kazai, Jaap Kamps, Natasa Milic-Frayling

Data Filtering in Humor Generation

Pawel Dybala, Rafal Rzepka, Kenji Araki, Kohichi Sayama

Predicting Primary Categories of Business Listings for Local Search

Changsung Kang, Jeehaeng Lee, Yi Chang

Where Do the Query Terms Come from? An Analysis of Query Reformulation in Collaborative Web Search

Zhen Yue, Jiepu Jiang, Shuguang Han, Daqing He

Learning to Recommend with Social Relation Ensemble

Lei Guo, Jun Ma, Zhumin Chen, Haoran Jiang

A Scalable Approach For Performing Proximal Search For Verbose Patent Search Queries

Sumit Bhatia, Bin He, Qi He, Scott Spangler

Is Wikipedia Too Difficult? Comparative Analysis of Readability of Wikipedia, Simple Wikipedia and Britannica

Adam Jatowt, Katsumi Tanaka

Finding Food Entity Relationships using User-generated Data in Recipe Service

Young-joo Chung

SRGSIS: A Novel Framework Based on Social Relationship Graph for Social Image Search

Bo Lu, Ye Yuan, Guoren Wang

Exploring Simultaneous Keyword and Key Sentence Extraction: Improve Graph-based Ranking Using Wikipedia

Xun Wang, Lei Wang, Jiwei Li, Sujian Li

Estimating Query Difficulty for News Prediction Retrieval

Nattiya Kanhabua, Kjetil Nørvåg

Recency-Sensitive Model of Web Page Authority

Maxim Zhukovskiy, Dmitry Vinogradov, Gleb Gusev, Pavel Serdyukov, Andrei Raigorodskii

Evaluating Reward and Risk for Vertical Selection

Ke Zhou, Ronan Cummins, Mounia Lalmas, Joemon M. Jose

Contextual Evaluation of Query Reformulations in a Search Session by User Simulation

Jiepu Jiang, Daqing He, Shuguang Han, Zhen Yue, Chaoqun Ni

DB Track

Information-complete and Redundancy-free Keyword Search Over Large Data Graphs

Byron J. Gao, Zhumin Chen, Qi Kang

Spatial-aware Interest Group Queries in Location-based Social Networks

Yafei Li, Dingming Wu, Jianliang Xu, Byron Choi, Weifeng Su

Probabilistic Ranking in Fuzzy Object Databases

Thomas Bernecker, Tobias Emrich, Hans-Peter Kriegel, Matthias Renz, Andreas Züfle

Enabling Ontology Based Semantic Queries in Biomedical Database Systems

Shuai Zheng, Fusheng Wang, James Lu, Joel Saltz

Similarity Search in 3D Object-Based Video Data

Jakub Lokoc, Jurgen Wunschmann, Tomas Skopal, Albrecht Rothermel

Continuous Top-k Query for Graph Streams

Shirui Pan, Xingquan Zhu

Latent Topics in Graph-Structured Data

Christoph Böhm, Gjergji Kasneci, Felix Naumann

Fast and Accurate Incremental Entity Resolution Relative to an Entity Knowledge Base

Michael Welch, Chris Drome, Aamod Sane

CIKM’12 Demonstration Session

Wednesday, October 31, 2012 Chair: Amélie Marian Room: Kula and Hana

Demo Session 1 (10:05 – 12:10) IR and DB

Demo Session 2 (1:30 – 3:35) DB and KM

Demo Session 3 (3:50 – 5:30) KM and IR

KM Track

LUKe and MIKE: Learning from User Knowledge and Managing Interactive Knowledge Extraction

Steffen Metzger, Michael Stoll, Katja Hose, Ralf Schenkel

PRAVDA-live: Interactive Knowledge Harvesting

Yafang Wang, Maximilian Dylla, Zhaochun Ren, Marc Spaniol, Gerhard Weikum

4Is of Social Bully Filtering: Identity, Inference, Influence, and Intervention

Yunfei Chen, Lanbo Zhang, Aaron Michelony, Yi Zhang

Ionomics Atlas - A Tool To Explore Interconnected Ionomic, Genomic and Environmental Data

Eduard C. Dragut, Mourad Ouzzani, Amgad Madkour, Nabeel Mohamed, Peter Baker, David E. Salt

CarbonDB: a Semantic Life Cycle Inventory Database

Benjamin Bertin, Vasile-Marian Scuturici, Jean-Marie Pinon, Emmanuel Risler

Supporting Temporal Analytics for Health-Related Events in Microblogs

Nattiya Kanhabua, Sara Romano, Avaré Stewart, Wolfgang Nejdl

InCaToMi: Integrative Causal Topic Miner Between Textual and Non-textual Time Series Data

Hyun Duk Kim, ChengXiang Zhai, Thomas A. Rietz, Daniel Diermeier, Meichun Hsu, Malu Castellanos, Carlos A. Ceja Limon

A Tool for Automated Evaluation of Algorithms

Philipp Kranen, Stephan Wels, Tim Rohlfs, Sebastian Raubach, Thomas Seidl

IR Track

A Summarization Tool for Time-Sensitive Social Media

Walid Magdy, Ahmed Ali, Kareem Darwish

CrowdTiles: Presenting Crowd-based Information for Event-driven Information Needs

Stewart Whiting, Ke Zhou, Joemon Jose, Omar Alonso, Teerapon Leelanupab

ESA: Emergency Situation Awareness via Microbloggers

Jie Yin, Sarvnaz Karimi, Bella Robinson, Mark Cameron

Cager: A Framework for Cross-page Search

Zhumin Chen, Byron J. Gao, Qi Kang

Mixed-Initiative Conversational System using Question-Answer Pairs Mined from the Web

Wilson Wong, Lawrence Cavedon, John Thangarajah, Lin Padgham

PicAlert!: A System for Privacy-Aware Image Classification and Retrieval

Sergej Zerr, Stefan Siersdorfer, Jonathon Hare

TASE: A Time-Aware Search Engine

Sheng Lin, Peiquan Jin, Xujian Zhao, Lihua Yue

Gumshoe Quality Toolkit: Administering Programmable Search

Zhuowei Bao, Benny Kimelfeld, Yunyao Li, Sriram Raghavan, Huahai Yang

Simultaneous Realization of Page-centric Communication and Search

Yuhki Shiraishi, Jianwei Zhang, Yukiko Kawai, Toyokazu Akiyama

MOUNA: Mining Opinions to Unveil Neglected Arguments

Mouna Kacimi, Johann Gamper

DB Track

MAGIK: Managing Completeness of Data

Ognjen Savkovic, Mirza Paramita, Sergey Paramonov, Werner Nutt

Exploration of Monte-Carlo based Probabilistic Query Processing in Uncertain Graphs

Tobias Emrich, Hans-Peter Kriegel, Johannes Niedermayer, Matthias Renz, André Suhartha, Andreas Züfle

The Nautilus Analyzer: Understanding and Debugging Data Transformations

Melanie Herschel, Hanno Eichelberger

Demonstrating ProApproX 2.0: A Predictive Query Engine for Probabilistic XML

Asma Souihli, Pierre Senellart

HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple Twig Pattern Queries

Hyebong Choi, Kyong-Ha Lee, Soo-Hyong Kim, Yoon-Joon Lee, Bongki Moon

MADden: Query-Driven Statistical Text Analytics

Christan Earl Grant, Joir-dan Gumbs, Kun Li, Daisy Zhe Wang, George Chitouras

STFMap: Query- and Feature-Driven Visualization of Large Time Series Data Sets

K. Selçuk Candan, Rosaria Rossini, Maria Luisa Sapino, Xiaolan Wang

Primates: A Privacy Management System for Social Networks

Imen Ben Dhia, Talel Abdessalem, Mauro Sozio

AMADA: Web Data Repositories in the Amazon Cloud

Andrés Aranda-Andújar, Francesca Bugiotti, Jesús Camacho-Rodríguez, Dario Colazzo, François Goasdoué, Zoi Kaoudi, Ioana Manolescu

CIKM’12 Workshop Program

Monday, October 29, 2012

DUBMMSM - Data-driven User Behavioral Modelling and Mining from Social Media Room: Wailuku

9:00 - 9:15 am - Welcome and Introduction

9:15 - 9:30 am - Madness session (5 madness papers, 3 min each).

Analyzing Social Media Friendship for Personalization - Jonghyun Han, Hyunju Lee

A Collective Synchronous Behavior Model on Social Media - Victor Liang, Vincent Ng

Probabilistic Macro Behavioral Targeting - Yusheng Xie

Pinteresting: Towards a Better Understanding of User Interests - Ana-Maria Popescu

The Framework of a People Recommender Based on a Time Series of User Preferences - Kosuke Takano, Kin Fun Li

9:30 - 10:30 am - Paper Session 1 (2 papers + discussions)

Ranking and Combining Social Network Data for Web Personalization - Yi Zeng

Please Spread: Recommending Tweets for Retweeting with Implicit Feedback - Sheng Wang, Xiaobo Zhou, Ziqi Wang, Ming Zhang

10:30 - 11:00 am - Coffee Break

11:00 - 12:00 pm - Paper Session 2 (2 papers + discussions)

Identifying and Characterizing User Communities on Twitter during Crisis Events - Aditi Gupta, Anupam Joshi, Ponnurangam Kumaraguru

Using Social Data for Resume Job Matching - David Hardtke, Jacob Bollinger, Ben Martin

12:00 - 2:00 pm - Lunch

2:00 - 3:30 pm - Paper Session 3 (3 papers + discussions)

Twitter User Behavior Understanding with Mood Transition Prediction - Aditya Mogadala, Vasudeva Varma

Analyzing Sentiments From Street Harassment Stories - Parvathi Chundi, April Corbet

Modeling Online Collective Emotions - David Garcia, Frank Schweitzer

3:30 - 4:00 pm - Coffee Break

4:00 - 5:00 pm - Panel

5:00 - 5:15 pm - Closing remarks


CloudDB - 2012 The Third International Workshop on Cloud Data Management

Room: Kahului

8:15-8:30 -- Welcome by the Chair

8:30-10:00 -- Keynote Session 1: OLTP

8:30-9:15 -- Keynote 1: Carlo Curino, Microsoft. Benchmarking OLTP/Web Databases in the Cloud: the OLTP-Bench Framework (Carlo Curino, Djellel Difallah, Andrew Pavlo, Phil Cudre-Mauroux)

9:15-10:00 -- Keynote 2: Prof. Mohamed Sharaf, The University of Queensland: Data Freshness in Key-Value Data Stores

10:00-10:30 -- Coffee Break

10:30-12:00 -- Session 1: Workload-Aware Processing

10:30-11:00 -- Toward Non-Intrusive Elastic Query Processing in the Cloud. Ticiana Coelho Da Silva, Mário Nascimento, Jose Macedo, Flávio R. C. Sousa, Javam Machado

11:00-11:30 -- The Yahoo! Cloud Datastore Load Balancer. Markus Klems, Adam Silberstein, Jianjun Chen, Masood Mortazavi, Andrews Albert Sahaya, P.P.S. Narayan

11:30-12:00 -- HEDC: A Histogram Estimator For Data in the Cloud. Yingjie Shi, Xiaofeng Meng, Fusheng Wang, Yantao Gan

12:00-1:30 -- Lunch

1:30-3:00 -- Keynote Session 2: Analytics and Social

1:30-2:15 -- Keynote 3: Prof. Geoffrey Fox, Indiana University Bloomington: Large Scale Data Analytics on Clouds

2:15-3:00 -- Keynote 4: Prof. Ashwin Machanavajjhala, Duke University: Challenges in Enabling Social Applications At Scale

3:00-3:30 -- Coffee Break

3:30-5:30 -- Session 2: Security, Privacy, Analytics

3:30-3:55 -- Cloud Computing for Environment-Friendly Data Centers. Michael Pawlish, Aparna S. Varde, Stefan A. Robila

3:55-4:20 -- A Security Aware Stream Data Processing Scheme on the Cloud and its Efficient Execution Methods. Katsuhiro Tomiyama, Hideyuki Kawashima, Hiroyuki Kitagawa

4:20-4:55 -- Differentially Private Top-k Query over MapReduce. Xu Han, Miao Wang, Xiaojian Zhang, Xiaofeng Meng

4:55-5:20 -- Facilitating Real-Time Graph Mining. Zhuhua Cai, Dionysios Logothetis, Georgos Siganos

5:20-5:30 -- Wrap-up and Summary


WKR/CDMW - The 2012 International Workshop on Web-scale Knowledge Representation, Retrieval and Reasoning & City Data Management 2012 Workshop

Room: Kihei

9:00-9:20 -- Introduction [Spyros Kotoulas (IBM Research)]

Session 1 Web-scale Knowledge Representation, Retrieval and Reasoning. Session Chair: Yi Zeng (Chinese Academy of Sciences)

9:20-9:45 -- A Distributed, Semiotic-Inductive, and Human-Oriented Approach to Web-Scale Knowledge Retrieval. Edy Portmann, Michael Alexander Kaufmann, Cédric Graf

9:45-10:10 -- OmpiJava - A Tool For Development Of High-Performance Reasoning Applications For The Semantic Web. Alexey Cheptsov

10:10-10:35 -- Efficient Mining of Correlated Sequential Patterns Based on Null Hypothesis. Cindy Xide Lin, Ming Ji, Marina Danilevsky, Jiawei Han

Session 2 City Data Management. Session Chair: Spyros Kotoulas (IBM Research)

10:50-11:15 -- DataBridges: Data Integration for Digital Cities. Melanie Herschel, Ioana Manolescu

11:15-11:40 -- U2STRA: High-Performance Data Management of Ubiquitous Urban Sensing Trajectories on GPGPUs. Jianting Zhang, Simin You, Le Gruenwald

11:40-12:05 -- Qualitative Representation of Building Sites Annoyance. Fatiha Amanzougarene, Mohamed Chachoua, Karine Zeitouni

12:05-12:30 -- Discussion, Chair: Yi Zeng (Chinese Academy of Sciences)


SHB - International Workshop on Smart Health and Wellbeing Room: Wailea

Opening ceremony

Keynote Speech

Session 1:

Moving from Descriptive to Causal Analytics: Case Study of Discovering Knowledge from US Health Indicators Warehouse. Jack Schryver, Mallikarjun Shankar and Songhua Xu

An Automated Data Utility Clustering Methodology using Data Constraint Rules. Stuart Morton, Malika Mahoui and P. Joseph Gibson

Designing the Reconciled Schema for a Pharmacovigilance Data Warehouse Through a Temporally-Enhanced ER Model. Riccardo Lora, Alberto Sabaini, Carlo Combi and Ugo Moretti

Session 2:

Towards Large-scale Twitter Mining for Drug-related Adverse Events. Jiang Bian, Umit Topaloglu and Fan Yu

Social Media Mining for Drug Safety Signal Detection. Christopher C. Yang, Haodong Yang, Ling Jiang, and Mi Zhang

Session 3:

An Architecture for Personalized Health Information Retrieval. Nikhil Yadav and Christian Poellabauer

Combining Multi-level Evidence for Medical Record Retrieval. Dongqing Zhu and Ben Carterette

Simulating Prosthetic Vision with Distortions for Retinal Prosthesis Design. Mahadevan Subramaniam, Parvathi Chundi and Eyal Margalit


BookOnline - Online Books, Complementary Social Media Room: Kapalua

8.50-9.00 Welcome

9.00-10.10 Keynote Session Chair: Gabriella Kazai (Microsoft Research)

Maribeth Back (FX Palo Alto) Revisiting the Future of Reading: The Research and Design Behind XFR

10.10-10.30 Session: Search and Discovery Session Chair: Monica Landoni (University of Lugano)

eBook meets Tabletop: Using Collaborative Visualization for Search and Serendipity in On-line Book Repositories R. Rädle, A. Weiler, S. Huber, H.-C. Jetter, S. Mansmann, H. Reiterer, and M. Scholl

10.30-11.00 Coffee

11.00-11.40 Session cont’d: Search and Discovery Session Chair: Monica Landoni (University of Lugano)

Spread Co-citation Relationship as a Measure for Document Retrieval M. Eto

Search and Exploration of Scanned Books M.-A. Cartright, J. Dalton, and J. Allan

11.45-12.25 Session: Personalization and Recommendation Session Chair: Peter Brusilovsky (University of Pittsburgh)

Personalized Recommendations on Books for K-12 Readers M. S. Pera and Y.-K. Ng

Stylometric Relevance-feedback towards a Hybrid Book Recommendation Algorithm P. C. Vaz, D. M. de Matos, and B. Martins

12.30-13.30 Lunch

13.30-14.40 Keynote Session Chair: Gabriella Kazai (Microsoft Research)

Natasa Milic-Frayling (Microsoft Research) The Future of Digital

14.40-15.20 Session: Reading Experience Beyond Text Session Chair: Carsten Eickhoff (Delft University of Technology)

Accessible, Large-Print, Listening & Talking E-book (ALLT) A. Attarwala, R. Baecker, and C. Munteanu

Need for Automatically Generated Narration D. A. Evans and J. Reichenbach

15.20-16.00 Open-Discussion

16.00-16.30 Coffee

16.30-17.00 Open-Discussion cont’d

17.00-17.30 Report back

17.30 Close


DTMBIO – ACM Sixth International Workshop on Data and Text Mining in Biomedical Informatics Room: Napili

Session 1: Keynote Address. Session Chair: Doheon Lee (KAIST, Korea)

Gwan-Su Yi (KAIST)

Session 2: Mining Clinical Data and Text. Session Chair: Hua Xu (Vanderbilt University, US)

Lexicon-free and context-free drug names identification methods using Hidden Markov Models and Pointwise Mutual Information (Jacek Malyszko; Agata Filipowska)

Clinical Entity Recognition using Structural Support Vector Machines with Rich Features (Buzhou Tang; Hongxin Cao; Yonghui Wu; Min Jiang; Hua Xu)

Coffee Break

Inferring Appropriate Eligibility Criteria in Clinical Trial Protocols Without Labeled Data (Angelo Restificar; Sophia Ananiadou)

Predicting Baby Feeding Method from Unstructured Electronic Health Record (Ashwani Rao; Kristin Maiden; Benjamin Carterette; Deborah Ehrentha)

Extracting Structured Information from Free-Text Medication Prescriptions Using Dependencies (Andrew MacKinlay; Karin Verspoor)

Lunch Break

Session 3: Mining Biological Data and Text. Session Chair: Min Song (Yonsei University, Korea)

Indexing Methods for Efficient Protein 3D Surface Search. Sungchul Kim; Sael Lee; Hwanjo Yu

Protein Complex Prediction via Bottleneck-Based Graph Partitioning. Jaegyoon Ahn; Dae Hyun Lee; Youngmi Yoon; Yunku Yeu; Sanghyun Park

Finding associations among SNPs for prostate cancer using collaborative filtering. Rohit Kugaonkar; Aryya Gangopadhyay; Yelena Yesha; Anupam Joshi; Yaacov Yesha; Michael Grasso; Mary Brady; Naphtali Rishe

Prediction of E3-specific Substrates by Using Known E3-Substrate Network. Youngwoong Han; Gwan-Su Yi

Detecting Type 2 Diabetes Causal SNP Combinations from GWAS Dataset with Optimal Filtration. Chiyong Kang; Hyeji Yu; Gwan-Su Yi

Coffee Break

TNMCA: Generation and Application of Network Motif-Based Inference Models for Drug Repositioning. Jaejoon Choi; Kwangmin Kim; Min Song; Doheon Lee

High Precision Rule Based PPI Extraction and Per-Pair Basis Performance. Junkyu Lee; Seongsoon Kim; Sunwon Lee; Kyubum Lee; Jaewoo Kang

Rule-based whole body modeling for analyzing multi-compound effects. Woochang Hwang; Yongdeuk Hwang; Sunjae Lee; Doheon Lee


MIXHS - The 2nd International Workshop on Managing Interoperability and complexity in Health Systems Room: Kula + Hana

9:00-9:10 Welcome

9:10-10:40 Session 1: Ontology-based Application on Clinical Data. Session Chair: Guoqian Jiang (Mayo Clinic, USA)

Clinical Clarity versus Terminological Order – The Readiness of SNOMED CT Concept Descriptors for Primary Care. Zhe He, Michael Halper, Yehoshua Perl and Gai Elhanan

Extraction and analysis of the structure of labels in biomedical ontologies. Manuel Quesada-Martínez, Jesualdo Tomás Fernández-Breis and Robert Stevens

Clinical Data Analysis using Ontology-guided Rule Learning. Hua Min and Janusz Wojtusiak

10:40-11:00 Coffee Break

11:00-12:30 Session 2: Electronic Health Systems Interoperability and Integration. Session Chair: Cui Tao (Mayo Clinic, USA)

Harmonization of Detailed Clinical Models with Clinical Study Data Standards. Guoqian Jiang, Julie Evans, Tom Oniki, Joey Coyle, Landen Bain, Stan Huff, Rebecca Kush and Christopher Chute

Modeling UIMA Type System Using Web Ontology Language – towards Interoperability among UIMA-based NLP Tools. Hongfang Liu, Stephen Wu, Cui Tao and Christopher Chute

Quality Assessment of Electronic Health Information Management Systems. Matt-Mouley Bouamrane, Cui Tao and Frances Mair

12:30-14:00 Lunch Break

14:00-16:00 Session 3: Bio-Medical Knowledge Representation & Engineering. Session Chair: Hua Min (George Mason University, USA)

Construction and Maintenance of Clinical Pathway using Data Mining Methods. Shusaku Tsumoto, Haruko Iwata and Shoji Hirano

Optimizing Semantic MEDLINE for Translational Science Studies Using Semantic Web Technologies. Cui Tao, Yuji Zhang, Guoqian Jiang, Matt-Mouley Bouamrane and Christopher Chute

Bridging the Unstructured and Structured Worlds: an Adaptive Self Learning Medical Form Generating System. Shuai Zheng, Fusheng Wang and James Lu

A Hybrid Approach to Finding Negated and Uncertain Expressions in Biomedical Documents. Kazuki Fujikawa, Kazuhiro Seki and Kuniaki Uehara

16:00 Workshop Concluding remarks


Friday, November 2, 2012

PIKM - The 5th Ph.D. Workshop in Information and Knowledge Management

Room: Wailuku

8:45 Opening: Aparna Varde

9:00 Session 1: Database Systems

9:00 When Big Data Leads to Lost Data. V.M. Megler and David Maier

9:30 Querying External Source Code Files of Programs Connecting to a Relational Database. Carlos Garcia-Alvarado and Carlos Ordonez

10:00 SciQL: A Query Language for Unified Scientific Data Processing and Management. Javad Chamanara and Birgitta König-Ries

10:30 Coffee Break

11:00 Session 2: Knowledge Management / Data Mining

11:00 Feature Selection for Link Prediction. Ye Xu and Dan Rockmore

11:30 Exploring and Analyzing Documents with Online Analytical Processing. Grzegorz Drzadzewski and Frank Tompa

12:00 Is That Scene Dangerous?: Transferring Knowledge Over a Video Stream. Omar U Florez and Curtis Dyreson

12:30 Lunch Break

14:00 Keynote: Advice for Young Jedi Knights and PhD Students (Invited talk) Ingmar Weber

15:00 Session 3: Information Retrieval

15:00 iTop: Interaction Based Topic Centric Community Discovery on Twitter. Denzil Correa, Ashish Sureka and Mayank Pundir

15:30 Coffee Break

16:00 Search Tactics as Means of Examining Search Processes in Collaborative Exploratory Web Search. Zhen Yue, Shuguang Han, Jiepu Jiang and Daqing He

16:30 Assessing the Relationship between Context, User Preferences, and Content in Search Behavior. Hanna Knäusl and Bernd Ludwig

17:00 Recommendation Using Linked Data. Rouzbeh Meymandpour and Joseph Davis


17:30-18:30 Session 4: Posters

Intent-Aware Temporal Query Modeling for Keyword Suggestion. Fredrik Johansson, Tobias Färdig, Vinay Jethava and Svetoslav Marinov

Towards an Advanced System for Real-Time Event Detection in High Volume Data Streams. Andreas Weiler, Svetlana Mansmann and Marc Scholl

Multilevel Business Process Modeling: Motivation, Approach, Design Issues and Applications. Christoph Schütz, Michael Schrefl and Lois Delcambre

Towards a More Efficient and Personalized Advertisement Content in On-line Social Networks. Patxi Galán-García, Carlos Laorden and Pablo G. Bringas


PLEAD - Politics, Elections and Data

Room: Kahului

Session 1:

Invited talk: "The Diffusion of Political Memes in Social Media" by Filippo Menczer

Session 2:

"From Twindex to PredictWise: A Quick Overview of Political Analysis Tools" by Ingmar Weber

"Political Polarization and Popularity in Online Participatory Media: an Integrated Approach" by David Garcia, Fernando Mendez, Uwe Serdult and Frank

Session 3:

"Party Cohesion in Presidential Races: Applying Social Network Theory to the 2011 Preprimary" by Andrew Dowdle, Song Yang, Scott Limbocker, Patrick Stewart and Karen Sebold

"Opinions Network for Politically Controversial Topics" by Rawia Awadallah, Maya Ramanath and Gerhard Weikum

"French Presidential Elections: What are the Most Efficient Measures for Tweets?" by Flavien Bouillot, Pascal Poncelet, Mathieu Roche, Dino Lenco, Elnaz Bigdeli and Stan Matwin

Session 4:

"The Price of Precision: Voter Microtargeting and its Potential Harms to the Democratic Process" by Solon Barocas

Panel discussion with representatives from academia, industry and media; moderated by Ana-Maria Popescu



ESAIR - Fifth Workshop on Exploiting Semantic Annotations in Information Retrieval

Room: Kihei

9:15-10:00 Keynote Session I: Keynote Presentation [Chair: Jussi Karlgren]

10:00-10:30 Coffee Break

10:30-11:15 Keynote Session II: Keynote Presentation [Chair: Jaap Kamps]

11:15-12:30 Boaster and Poster Session [Chair: Jaap Kamps]

Krisztian Balog and Kjetil Nørvåg / On the Use of Semantic Knowledge Bases for Temporally-aware Entity Retrieval

Amitava Das and Björn Gambäck / Exploiting 5W Annotations for Opinion Tracking

Ann-Marie Eklund / Why query annotations may help in providing accurate public health information

Sumio Fujita, Georges Dupret and Ricardo Baeza-Yates / Semantics of Query Rewriting Patterns in Search Logs

Arunav Mishra, Sairam Gurajada and Martin Theobald / Design and Evaluation of an IR-Benchmark for SPARQL Queries with Full-text Conditions

Tadashi Nomoto and Noriko Kando / Conceptualizing Documents with Wikipedia

Sana Sellami and Claudia Catalin Gutiérrez Rodríguez / Semantic Annotation: What About Quality?

Petr Sojka / Exploiting Semantic Annotations in Math Information Retrieval

Giovanni Yoko Kristianto, Goran Topic, Minh-Quoc Nghiem and Akiko Aizawa / Annotating Scientific Papers for Mathematical Formulae Search

Masaharu Yoshioka and Noriko Kando / Multifaceted analysis of news articles by using semantic annotated information

12:30-14:00 Lunch

14:00-15:30 Breakout session: Two breakout groups in parallel

semantic search [Chair/Reporter: Jussi Karlgren/Peter Mika]

structured retrieval [Chair/Reporter: Vanessa Murdock/Jaap Kamps]

15:30-16:00 Coffee

16:00-17:30 Final session [Chair: tba]: Reporting from breakout groups and concluding remarks.

18:00++ Social program: Dinner and symposium drinks and continued discussion!


CrowdSens - 1st International Workshop on Multimodal Crowd Sensing

Room: Wailea

Keynote Address. Session Chair: Haggai Roitman (IBM Research)

Invited Talk, Ido Guy: Crowdsourcing in the enterprise

Session 1. Session Chair: Haggai Roitman (IBM Research)

Algorithm for Representative Democracy Voting in Social Network. Zeinab Saeidi

Conceptual Modeling Principles for Crowdsourcing. Roman Lukyanenko; Jeffrey Parsons

Event Detection using Twitter and Structured Semantic Query Expansion. Heather S. Packer; Sina Samangooei; Jonathon S. Hare; Nicholas Gibbins; Paul Lewis

Invited Talk. Session Chair: Haggai Roitman (IBM Research)

Invited Talk, Manuel Cebrian: Using Friends as Sensors to Detect Planetary-Scale Contagious Outbreaks

Session 2. Session Chair: Haggai Roitman (IBM Research)

Harnessing the Crowds for Smart City Sensing. Haggai Roitman; Jonathan Mamou; Sameep Mehta; Aharon Satt; L. V. Subramaniam

Greaaaat bargains starting from just 99p!!!! :-) Brand Perception in the Social Media. Michal Shmueli-Scheuer; Benjamin Sznajder; Doron Cohen; Ariel Raviv; David Konopnicki; Haggai Roitman

Session 3: Discussion. Session Chair: Haggai Roitman (IBM Research)



IKM2DR - Information and Knowledge Management for Developing Regions

Room: Kapalua

First AM Session: (chair: Nitendra Rajput)

Keynote address [Ricardo Baeza-Yates, Yahoo]

Direction Setting Panel (Ricardo Baeza-Yates, Doug Oard, Nitendra Rajput)

Second AM Session: (chair: Luz Quiroga)

Domain-specific search in Indian Languages [Nikihil Pattisapu, SKYPE]

Speech retrieval for India [Pekka Kallioniemi]

Query by Babbling for speech retrieval [Doug Oard]

Named entity recognition for Indian Languages [Mahathi Bhagavatula, SKYPE]

Lunch and Discussion Tables: Two discussion tables, led by Krishna Kummamuru and Luz Quiroga

First PM Session: (chair: William Webber)

Invited talk (abstract uploaded) [Anitha Kannan, MSR]

Two parallel breakout sessions to develop a research agenda (Facilitator: Rajput)

Second PM Session: (chair: Doug Oard)

Report-outs from the three breakout sessions (by group-selected reporters)

Future directions panel discussion (Luz Quiroga, Nitendra Rajput, William Webber)

No-Host Evening Events:

Drinks: 5:30 PM Hula Grill (sunset is 5:54 PM)

Dinner: 6:30 PM Hula Grill



WIDM - Twelfth International Workshop on Web Information and Data Management

Room: Napili

8:45-9:00 Welcome

9:00-10:00 Keynote

Search Beyond the Web: Data from Social Networks and Native Apps. Maria Grineva.

10:00-10:30 Web Data I

Modeling Topic Trends on the Social Web Using Temporal Signatures. Laura Christiansen, Thomas Schimoler, Robin Burke and Bamshad Mobasher.

10:30-11:00 Coffee break

11:00-12:30 Web Data II

XPath satisfiability with downward and sibling axes is tractable under most of real-world DTDs. Yasunori Ishihara, Kenji Hashimoto, Shogo Shimizu and Toru Fujiwara.

A Multi-layer Data Representation of Trajectories in Social Networks Based on Points of Interest. Reinaldo Braga, Ali Tahir, Michela Bertolotto and Hervé Martin.

A Distributed Index for Efficient Parallel Top-k Keyword Search on Massive Graphs. Ming Zhong and Mengchi Liu.

12:30-14:00 Lunch break

14:00-15:30 Web Context

Managing Analysis Context. Hua Li and Rafael Alonso.

Using Social Tags to Infer Context in Hybrid Music Recommendation. Negar Hariri, Bamshad Mobasher and Robin Burke.

SNOPS: a smart environment for Cultural Heritage applications. Vincenzo Moscato, Antonio Picariello, Angelo Chianese, Flora Amato and Giancarlo Sperlì.

15:30-16:00 Coffee break

16:00-17:30 Web Information Engineering

Web Crawler Middleware for Search Engine Digital Libraries: A Case Study for CiteSeerX. Jian Wu, Pradeep Teregowda, Madian Khabsa, Douglas Jordan and C. Lee Giles.

TitleFinder: Extracting the Headline of News Web Pages based on Cosine Similarity and Overlap Scoring Similarity. Hadi Mohammadzadeh, Thomas Gottron, Franz Schweiggert and Gerhard Heyer.

M3D: A Tool for the Model Driven Development of Web Applications. Mario Luca Bernardi, Marta Cimitile, Giuseppe Di Lucca and Fabrizio Maria Maggi.

17:30-17:45 Closing remarks



DOLAP - Fifteenth International Workshop on Data Warehousing and OLAP

Room: Kula + Hana

8:30-8:45 Workshop Welcome and Introduction. Matteo Golfarelli

8:45-9:30 Invited Talk. Chair: Il-Yeol Song

Pekka Kostamaa (Teradata). Efficient Big Data Analytics using SQL and Map-Reduce

9:30-10:45 Session 1: OLAP Query processing and Trends. Chair: Alkis Simitsis

Bernd Neumayr, Stefan Anderlik and Michael Schrefl. Towards Ontology-based OLAP: Datalog-based Reasoning over Multidimensional Ontologies (25 mins)

Patrick Marcel, Rokia Missaoui and Stefano Rizzi. Towards Intensional Answers to OLAP Queries for Analytical Sessions (25 mins)

Carlos Garcia-Alvarado and Carlos Ordonez. Query Processing on Cubes with Dimension Ontologies (25 mins)

11:05-13:00 Session 2: Data Warehouse Design and Maintainability. Chair: Alfredo Cuzzocrea

Petar Jovanovic, Oscar Romero, Alkis Simitsis and Alberto Abello. ORE: An Iterative Approach to the Design and Evolution of Multi-Dimensional Schemas (25 mins)

Svetlana Mansmann, Nafees Ur Rehman, Andreas Weiler and Marc H Scholl. Discovering OLAP dimensions in semi-structured data (25 mins)

Nicolas Prat, Imen Megdiche and Jacky Akoka. Multidimensional Models Meet the Semantic Web: Defining and Reasoning on OWL-DL Ontologies for OLAP (25 mins)

Alejandro Maté, Juan Trujillo, Elisa De Gregorio and Il-Yeol Song. Improving the Maintainability of Data Warehouse Designs: Modeling Relationships between Sources and Requirements (25 mins)

Stefan Berger and Michael Schrefl. FedDW Global Schema Architect - UML-based Design Tool for the Integration of Logical Data Mart Schemas (16 mins)

13:00-14:00 Lunch break

14:00-15:45 Session 3: Performance and Benchmarking. Chair: Carlos Ordonez

Stephan Mueller. An In-Depth Analysis of Data Aggregation Cost Factors in a Columnar In-Memory Database (25 mins)

Chantola Kit, Marouane Hachicha and J. Darmont. Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies (16 mins)

Craig Stanfill. Type 2 Slowly Changing Dimensions: A Case Study Using the Co>Operating System. (16 mins)

Jianting Zhang, Simin You and Le Gruenwald. High-Performance Online Spatial and Temporal Aggregations on Multi-core CPUs and Many-Core GPUs (16 mins)

Doulkifli Boukraa, Omar Boussaid, Fadila Bentayeb and Djamel Eddine Zegour. Managing a Fragmented XML Data Cube with Oracle and Timesten (16 mins)

Arian Baer and Lukasz Golab. Towards Benchmarking Stream Data Warehouses (16 mins)


16:05-17:30 Session 4: Warehousing complex data. Chair: Patrick Marcel

Elio Masciari. Warehousing and Querying Trajectory Data Streams With Error Estimation (25 mins)

Michel De Rougemont and Phuong Thao Cao. Approximate Answers to OLAP Queries on Streaming Data Warehouses (25 mins)

Alfredo Cuzzocrea and Paolo Serafino. Enhanced Clustering of Complex Database Objects in the ClustCube Framework (16 mins)

Mu Yin, Bin Wu and Zengfeng Zeng. HMGraph OLAP: A Novel Framework for Multi-dimensional Heterogeneous Network Analysis (16 mins)


Courtyard (Tuesday October 30, 2012 6:40pm-9:00pm)

We take pleasure in welcoming delegates registered for the main conference to a Welcome Reception held in the Courtyard of the Sheraton Maui Resort.


Conference Banquet (Wednesday October 31, 2012 5:30pm-8:30pm)

Join the Tihati cast on a spectacular voyage through the South Pacific: an exciting and colorful presentation of the Polynesians from Hawaii, Tahiti, Rarotonga, New Zealand and Samoa, set to pulsating and syncopated drum beats. The show moves from the celebratory festival dances of Tahiti and the hypnotic dances of the Tuamotus to the legendary love story of the enchanted winds of Maui and the breathtaking fireknife dance of Samoa. Those in love can dance with a loved one during the Hawaiian Wedding Song.

5:30 p.m. Lei greeting, cocktails, Hawaiian music, Hawaiian games and activities.

6:15 p.m. Imu

6:30 p.m. All-You-Can-Eat Hawaiian Luau Buffet

7:30 p.m. "Tihati Polynesia Revue"

Note: Dependent on crowd size and weather, luau may start/end earlier.

Grilled Teriyaki Steak (Off the grill)
Sauteed Mahi Mahi with Macadamia Nuts & Capers
Lomi Lomi Salmon
Kula Greens with Cucumber & Tomato Salad
Pasta Salad
Sliced Papaya and Pineapple
Kalua Pork
Steamed Sweet Potatoes
Poi
Potato Macaroni Salad
Fried Rice
Assorted Rolls & Butter
Assorted Desserts
Coffee & Tea


The thought of lying on sun soaked beaches regularly named “the best” by travel magazines is enough to make any of your friends jealous. But once you arrive on Maui, you’ll see there’s so much more for them to envy. Most flights arrive at Maui’s main airport, Kahului Airport (OGG). Many airlines fly direct to Maui while others include Maui as a stopover. You’ll find resorts and hotels of every size and budget in Kapalua, Kaanapali, Lahaina, Kihei, Makena and Wailea on the sunny western coast as well as one resort in Hana in East Maui. It’s about a 45-minute drive from Kahului Airport to Lahaina. Once you’ve settled in you’ll want to explore Maui’s sweeping canvas of attractions. The western, or leeward side, is the drier side of the island and features Maui’s world-famous beaches including the beautiful Kaanapali Beach, home to a nightly sunset cliff diving ceremony. West Maui is also home to historic Lahaina, where you can find great shopping, dining and entertainment.

The eastern, or windward side, of the island is the wetter side of the island, home to the lush Iao Valley and the scenic road to Hana. The cool, elevated slopes of Haleakala are where you can find the farms and gardens of Upcountry Maui and the soaring summit of Haleakala National Park. There is so much to see and do on Maui it’s best to plan ahead. Just don’t forget to send your friends a postcard.

Featured Sites and Attractions

Lahaina, Maui. Lahaina is a historic whaling village and lively West Maui hot spot. (Maui, Whale Watching, Culture and Arts, History, Luau, Dining, Shopping)

Haleakala National Park, Maui. Haleakala National Park is a scenic national park on the island of Maui and home to Maui’s highest peak. (Maui, History, Landmark, Land Adventures, Hiking)

Hana, Maui. Hana is a small, untouched town on Maui’s eastern coastline. To get here visitors must travel one of the world’s most scenic drives. (Maui, Natural Beauty, Waterfalls, Off the beaten path, Landmark, Culture and Arts, Land Adventures)

Kula, Maui. Upcountry Maui’s rustic town of Kula is known for its produce farms and botanical gardens. (Maui, Natural Beauty, Plantations, Farms and Gardens, Off the beaten path)



CIKM’12 SPONSORS

GOLD sponsors

SILVER sponsors

BRONZE sponsors

http://www.google.com/

http://www.adobe.com/

http://global.rakuten.com/

http://www.yandex.com/

http://research.microsoft.com/en-us/

http://labs.ebay.com/

http://labs.yahoo.com/

http://www.research.ibm.com/