Human-in-the-loop data exploration is seeing renewed interest in the big data science and data management communities. With the rise of big data analytics, this area is growing to encompass not only approaches and algorithms for finding the next best data items to explore but also interactivity, i.e., accounting for feedback from human users during exploration. Interactivity is essential both to account for needs that evolve during exploration and to customize the discovery process. In this tutorial, we focus on the exploration of Composite Items (CIs), which requires repeated interaction with human users. CIs address complex information needs that arise in a wide variety of emerging applications.
The tutorial has three parts. We will first review CI applications and shapes (15 min). We will then discuss three big research questions (60 min): (i) existing algorithms for CI formation, (ii) human-in-the-loop CIs, and (iii) optimization opportunities. We will conclude with research directions (15 min).
The proposed tutorial is timely. It brings together several related efforts and addresses unsolved questions in the emerging area of human-in-the-loop exploration of complex information needs. The tutorial is relevant to the general area of data science and more specifically to Scalable Analytics, Data Mining, Clustering and Knowledge Discovery, Indexing, Query Processing and Optimization, and Crowdsourcing. The technical topics covered are constrained optimization, ranking semantics, clustering, algorithms, and empirical evaluations.

Senjuti Basu Roy

Senjuti Basu Roy is an Assistant Professor at NJIT. Senjuti’s broader research interests lie in the area of data and content management of web and structured data, with a focus on exploration, analytics, and algorithms. In recent years, her research has focused on designing principled algorithms and systems that require man-machine collaboration. She was the PC Co-chair of the SIGMOD 2018 mentorship track and the VLDB 2018 PhD Workshop program. Senjuti was a co-organizer of ExploreDB 2016 (co-located with SIGMOD 2016) and the IEEE Workshop on Human-in-the-loop Methods and Human Machine Collaboration in BigData (IEEE HMData 2017, 2018) (co-located with IEEE Big Data). She has organized an NSF workshop on converging human and technological perspectives in crowdsourcing research. Senjuti has published more than 55 research papers in top-tier international conferences and journals. Her research is funded by the National Science Foundation, the Office of Naval Research, the National Institutes of Health, Microsoft Research, and Multicare Health Systems.