Abstract: Audio comprehension—including speech, non-speech sounds, and music—is essential for AI agents to interact effectively with the world. Yet research in audio processing has lagged behind areas like language and vision, hindered by limited datasets and by the need for architectures and training methods suited to the inherent complexities of audio. At the same time, the rise of Large Language Models (LLMs) offers promising new directions: LLMs have shown a remarkable ability to understand and reason about the world through language, pushing forward foundational audio tasks like Automatic Speech Recognition (ASR), cross-modal retrieval, and audio captioning.
Our group is working to bridge this gap through a series of innovative solutions, starting with GAMA, our large audio-language model designed for advanced audio perception and complex reasoning. GAMA combines a specialized architecture, optimized audio encoding, and a novel alignment dataset, positioning it as a leader across benchmarks for audio understanding, reasoning, and hallucination reduction. Good representations are key to advancing perception, and GAMA’s development builds on our past work, such as MAST, SLICER, and EH-MAM, novel approaches for learning strong audio representations from unlabeled data. Complementing this, we introduced ReCLAP, a state-of-the-art audio-language encoder, and CompA, one of the first projects to tackle compositional reasoning in audio-language models—a critical challenge given audio’s inherently compositional nature.
Looking forward, we envision Large Audio-Language Models (LALMs) becoming integral to daily life, capable of conversational speech QA, information-extraction-based QA, and answering knowledge-driven questions about diverse audio inputs. Achieving these ambitious goals requires advances in both data and architectures. Synthio, our latest synthetic data generation framework, supports this mission by generating training data for complex audio understanding. Progress must also be measurable, so we are dedicated to establishing comprehensive benchmarks; our recent work, MMAU, rigorously tests LALMs on real-world tasks.
Speaker’s Bio: Dinesh Manocha is the Paul Chrisman-Iribe Chair in Computer Science & ECE and a Distinguished University Professor at the University of Maryland, College Park. His research interests include virtual environments, physically-based modeling, and robotics. His group has developed a number of software packages that have become standards and are licensed to 60+ commercial vendors. He has published more than 800 papers and supervised 52 PhD dissertations. He is a Fellow of AAAI, AAAS, ACM, IEEE, and NAI; a member of the ACM SIGGRAPH and IEEE VR Academies; and a recipient of the Bézier Award from the Solid Modeling Association. He received the Distinguished Alumni Award from IIT Delhi and the Distinguished Career in Computer Science Award from the Washington Academy of Sciences. He was a co-founder of Impulsonic, a developer of physics-based audio simulation technologies, which was acquired by Valve Inc. in November 2016.