Abstract : In today’s ever-growing demand for fast data analytics, heterogeneity severely undermines performance. On one hand, data format variety forces people either to load their data into a single format, spending tons of resources and often losing valuable structural information, or to deploy a separate database system for each data type plus an integration tool to bring all the results together. All options are costly and waste valuable resources. On the other hand, “franken-chips” equipped with different types of potent compute units are severely under-utilised when running data analytics, as we are used to coding with a CPU in mind and other core types are employed opportunistically, as an accelerating luxury. Nevertheless, hardware roadmaps indicate increasing levels of compute heterogeneity, and accelerator-level parallelism (ALP) is indeed the new way to make the best out of any hardware platform. Writing programs that are both fast and portable across all kinds of microarchitectures, however, remains an unsolved tradeoff. I will show how just-in-time (JIT) data virtualisation and code generation technologies can help execute queries fast across all kinds of data without costly preparation or heavy installations, as well as enable excellent utilisation of different hardware devices.
Bio : Anastasia Ailamaki is a Professor of Computer and Communication Sciences at EPFL and the CEO and co-founder of RAW Labs SA, a Swiss company that enables digital transformation for enterprises through real-time analysis of heterogeneous big data. Previously, she was on the faculty of the Computer Science Department at CMU, where she held the Finmeccanica endowed chair. She has received the 2019 ACM SIGMOD Edgar F. Codd Innovations Award, the 2019 EDBT Test of Time award, the 2018 Nemitsas Prize in Computer Science, an ERC Consolidator Award (2013), the European Young Investigator Award from the European Science Foundation (2007), an Alfred P. Sloan Research Fellowship (2005), and ten best-paper awards at database, storage, and computer architecture conferences. She is an ACM fellow, an IEEE fellow, and an elected member of the Swiss, the Belgian, and the Cypriot National Research Councils.
Abstract : Our minds make inferences that appear to go far beyond standard machine learning. Whereas people can learn richer representations and use them for a wider range of learning tasks, machine learning algorithms have been mainly employed in a stand-alone context, constructing a single function from a table of training examples. In this talk, I shall touch upon a view on machine learning, called probabilistic programming, that can help capture these human learning aspects by combining high-level programming languages and probabilistic machine learning — the high-level language helps reduce the cost of modelling, and probabilities help quantify when a machine does not know something. Since probabilistic inference remains intractable, existing approaches leverage deep learning for inference. Instead of “going down the full neural road,” I shall argue for sum-product networks, a deep but tractable architecture for probability distributions. This can speed up inference in probabilistic programs, as I shall illustrate for unsupervised science understanding, and even pave the way towards automating density estimation, making machine learning accessible to a broader audience of non-experts.
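To make the tractability claim concrete, here is a minimal, self-contained sketch of a sum-product network over two binary variables. This is an illustration, not code from the talk, and the class names are invented for the example: product nodes multiply children over disjoint variable scopes, sum nodes take weighted mixtures, and marginalising a variable amounts to making its leaves evaluate to 1 — so any marginal costs a single bottom-up pass over the network.

```python
class Leaf:
    """Bernoulli leaf over one variable."""
    def __init__(self, var, p_true):
        self.var, self.p = var, p_true

    def eval(self, assignment):
        v = assignment.get(self.var)  # None means "marginalised out"
        if v is None:
            return 1.0                # summing a Bernoulli over both values
        return self.p if v else 1.0 - self.p

class Product:
    """Multiplies children with disjoint variable scopes."""
    def __init__(self, children):
        self.children = children

    def eval(self, assignment):
        out = 1.0
        for c in self.children:
            out *= c.eval(assignment)
        return out

class Sum:
    """Weighted mixture; weights must sum to 1."""
    def __init__(self, weighted_children):
        self.wc = weighted_children   # list of (weight, node) pairs

    def eval(self, assignment):
        return sum(w * c.eval(assignment) for w, c in self.wc)

# A mixture of two fully factorised distributions over X and Y.
spn = Sum([
    (0.6, Product([Leaf("X", 0.9), Leaf("Y", 0.2)])),
    (0.4, Product([Leaf("X", 0.1), Leaf("Y", 0.7)])),
])

p_joint = spn.eval({"X": 1, "Y": 0})  # 0.6*0.9*0.8 + 0.4*0.1*0.3
p_marg = spn.eval({"X": 1})           # 0.6*0.9 + 0.4*0.1, with Y summed out
```

Both the joint and the marginal take one linear-time pass over the network; this is the tractability that distinguishes sum-product networks from general deep density models.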
This talk is based on joint work with many people, including Carsten Binnig, Zoubin Ghahramani, Andreas Koch, Alejandro Molina, Sriraam Natarajan, Robert Peharz, Constantin Rothkopf, Thomas Schneider, Patrick Schramwoski, Xiaoting Shao, Karl Stelzner, Martin Trapp, Isabel Valera, Antonio Vergari, and Fabrizio Ventola.
Bio : Kristian Kersting is a full professor (W3) for AI and ML at TU Darmstadt. After receiving his Ph.D. from U. Freiburg in 2006, he was with MIT, Fraunhofer IAIS, U. Bonn, and TU Dortmund. His main research interests are (deep) probabilistic programming and learning. Kristian has published over 170 peer-reviewed articles. He is an EurAI Fellow, an ELLIS Fellow and received the inaugural German AI Award (Deutscher KI-Preis) 2019, as well as several paper awards (TPM 2019, AIIDE 2015, ECML 2006) and the EurAI Dissertation Award 2006. Kristian has been on the (senior) PC of major AI/ML conferences (e.g. AAAI, ICML, IJCAI, NeurIPS, ICLR, and CVPR) and co-chaired the PC of ECML PKDD.
Abstract : Early use of knowledge graphs, before the start of this century, involved building knowledge graphs manually or semi-automatically and applying them to semantic applications such as search, browsing, personalization, and advertisement. Taalee/Semagix Semantic Search in 2000 had a KG that covered many domains and supported search with an equivalent of today’s infobox.
Along with the growth of big data, machine learning became the preferred technique for searching, analyzing, and deriving insights from such data. We observed the complementary nature of bottom-up (machine learning-driven) and top-down (semantic, knowledge graph and planning based) techniques.
Recently we have seen growing efforts involving the shallow use of a knowledge graph to improve the semantic and conceptual processing of data [3, 4]. The future promises deeper, congruent incorporation of knowledge graphs into the learning techniques themselves (which we call knowledge-infused learning), where knowledge graphs play a critical role in hybrid, integrated intelligent systems that combine statistical AI (bottom-up) and symbolic AI (top-down) techniques. Throughout this talk, we will provide real-world examples, products, and applications where the knowledge graph played a pivotal role.
Bio : Prof. Amit Sheth is an Educator, Researcher, and Entrepreneur. Prior to joining the University of South Carolina as the founding director of the university-wide AI Institute, he was the LexisNexis Ohio Eminent Scholar and executive director of the Ohio Center of Excellence in Knowledge-enabled Computing. He is a Fellow of IEEE, AAAI, and AAAS. He is among the most highly cited computer scientists worldwide. He has founded three companies by licensing his university research outcomes, including the first Semantic Web company in 1999, which pioneered technology similar to what is found today in Google Semantic Search and Knowledge Graph. He is particularly proud of his students’ exceptional success in academia, in industry research labs, and as entrepreneurs.
Abstract : What is the minimum amount of communication required to compute a query in parallel, on a cluster of servers? In the simplest case, when we join two relations without skewed data, we can get away with simply reshuffling the data once. But if the data is skewed, or if we need to compute multiple joins, then it turns out that the total communication cost is significantly larger than the input data. In this talk I will describe a class of algorithms for which we can prove formally that their communication cost is optimal. I will briefly review standard join algorithms (partitioned hash join and broadcast join), then describe the novel class of hypercube-based algorithms, and their analysis based on the fractional edge packing of the query.
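The partitioning idea behind the hypercube algorithms can be illustrated in a few lines. The following Python sketch is hypothetical (function names are invented, a single process simulates a d × d × d grid of servers, and Python’s built-in hash stands in for independent hash functions); it evaluates the triangle query Q(a,b,c) :- R(a,b), S(b,c), T(c,a). Each tuple fixes two of the three grid coordinates and is replicated along the free one, so after a single communication round every output triangle is fully present on exactly one server, which then joins locally.

```python
import itertools

def hypercube_servers(rel, tup, d):
    """Grid coordinates (x, y, z) of the servers that receive `tup`,
    where x ~ hash(a), y ~ hash(b), z ~ hash(c)."""
    h = lambda v: hash(v) % d
    if rel == "R":               # R(a,b): a fixes x, b fixes y, z is free
        a, b = tup
        return [(h(a), h(b), z) for z in range(d)]
    if rel == "S":               # S(b,c): b fixes y, c fixes z, x is free
        b, c = tup
        return [(x, h(b), h(c)) for x in range(d)]
    if rel == "T":               # T(c,a): a fixes x, c fixes z, y is free
        c, a = tup
        return [(h(a), y, h(c)) for y in range(d)]

def hypercube_join(R, S, T, d):
    """One communication round, then a local triangle join per server."""
    grid = {coord: ([], [], [])
            for coord in itertools.product(range(d), repeat=3)}
    for t in R:
        for coord in hypercube_servers("R", t, d):
            grid[coord][0].append(t)
    for t in S:
        for coord in hypercube_servers("S", t, d):
            grid[coord][1].append(t)
    for t in T:
        for coord in hypercube_servers("T", t, d):
            grid[coord][2].append(t)
    out = set()
    for Rl, Sl, Tl in grid.values():     # each server joins its fragment
        for (a, b) in Rl:
            for (b2, c) in Sl:
                if b2 == b and (c, a) in Tl:
                    out.add((a, b, c))
    return out
```

Note the cost profile this sketch exposes: with p = d³ servers, each tuple is replicated to only d = p^(1/3) of them, which is exactly the kind of replication factor the fractional-edge-packing analysis in the talk bounds and proves optimal.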
Bio : Dan Suciu is a Professor in Computer Science at the University of Washington. He received his Ph.D. from the University of Pennsylvania in 1995, was a principal member of the technical staff at AT&T Labs, and joined the University of Washington in 2000. Suciu conducts research in data management, with an emphasis on topics related to Big Data and data sharing, such as probabilistic data, data pricing, parallel data processing, and data security. He is a co-author of two books, Data on the Web: from Relations to Semistructured Data and XML (1999) and Probabilistic Databases (2011). He is a Fellow of the ACM, holds twelve US patents, received best-paper awards at SIGMOD 2000, SIGMOD 2019, and ICDT 2013, the ACM PODS Alberto Mendelzon Test of Time Award in 2010 and 2012, the 10 Year Most Influential Paper Award at ICDE 2013, and the VLDB Ten Year Best Paper Award in 2014, and is a recipient of the NSF CAREER Award and an Alfred P. Sloan Fellowship. Suciu serves on the VLDB Board of Trustees and is an associate editor for the Journal of the ACM, the VLDB Journal, ACM TWEB, and Information Systems, and a past associate editor for ACM TODS and ACM TOIS. Suciu's PhD students Gerome Miklau, Christopher Re, and Paris Koutris received the ACM SIGMOD Best Dissertation Award in 2006, 2010, and 2016, respectively, and Nilesh Dalvi was a runner-up in 2008.