Spatial databases came into prominence 20 years ago with the introduction of spatial data types into traditional relational databases. Over the last 20 years, many advances in spatial databases brought spatial data management into mainstream applications. With the recent developments in Big Data, NoSQL databases, Cloud Computing, Edge Computing, and AI technologies, there is disruption in the industry. The idea of consolidating all of the spatial data into one single database is being challenged. In this talk, we present the evolution of spatial data management to an autonomous spatial data platform. We present a reference architecture for this vision and describe how spatial applications can be built on top of this autonomous data platform. We also describe how Spatial Data Science applications can leverage this autonomous data platform.
We present our work in developing MiSTIC, a generalised spatio-temporal data mining approach, and show how it can be used to extract spatial regions or zones that exhibit consistent temporal behaviour. In the health case study, the extracted zones concur well with the topographical and other constraints that limit a disease's sphere of influence. Further, the approach can help improve estimates of the risk to these regions when there is an outbreak. In the other case study, on crop production (yield) systems, MiSTIC is further extended to indicate regions and sub-regions that are consistently high or low performers across the years, irrespective of the input similarity or variability in soil, climate, agricultural practices, etc. These results indicate that the high and low performers might be more strongly influenced over time by other regulatory and controlling factors at a spatial/regional scale than by the input-output interactions alone.
|10:00-10:15||Tea / Coffee Break|
This talk will focus on our recent efforts in adopting machine learning techniques for big spatial data and applications. This includes pursuing two orthogonal but related directions. First, injecting spatial awareness into machine learning techniques and applications, which will result in higher accuracy for such applications. Second, taking advantage of recent advances in machine learning techniques to boost the usability, deployment, scalability, and accuracy of long-standing spatial and spatio-temporal data analysis techniques. For the first direction, we will present the Sya system, a full-fledged spatial machine learning-based probabilistic knowledge base construction system. For the second direction, we will present machine-learning-based techniques for spatial autologistic regression and shortest path queries.
The physical surface of the landscape is undergoing transformation, either naturally or due to human interference, giving rise to Land Cover Land Use Change (LCLUC). Expanding urban regions and the consequent LCLUC have emerged as one of the major anthropogenic sources of global environmental degradation, bringing numerous stresses to landscapes, vegetation, habitats, water, etc. LCLUC at a sub-continent to global level can be monitored through high temporal and low spatial resolution data, such as those obtained from Landsat at 30 m or MODIS Terra/Aqua at 250 m spatial resolution. Because of their wider IFOV, these satellites make it possible to map large areas of Earth’s surface quickly and inexpensively. However, different land cover (LC) types jointly occupy a single pixel, and the resulting spectral measurement is a composite of the individual spectra. The intrinsic scale of spatial variation in LC is usually finer than the scale of sampling imposed by the image pixels. Due to this scale-resolution mismatch, the spatial resolution of the details on the ground is coarser than what is required, leading to sub-pixel heterogeneity and imposing limitations on modelling with these data sets. The talk will present some attempts to resolve the mixed pixel problem and also discuss some of the emerging challenges in geospatial research.
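As an illustration of the mixed pixel problem described in this abstract, the sketch below treats a pixel's spectrum as a linear combination of two land cover spectra and recovers the sub-pixel fractions by least squares. This is a minimal example of linear spectral unmixing under stated assumptions, not the method presented in the talk: the two-class restriction, the function name `unmix_two_endmembers`, and all reflectance values are hypothetical.

```python
def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Estimate sub-pixel fractions of two land cover classes.

    Assumes the linear mixture model
        pixel = f * endmember_a + (1 - f) * endmember_b
    and solves for f by least squares, clipped to the range [0, 1].
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    resid = [p - b for p, b in zip(pixel, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmember spectra must differ")
    fraction_a = sum(r * d for r, d in zip(resid, diff)) / denom
    fraction_a = min(1.0, max(0.0, fraction_a))
    return fraction_a, 1.0 - fraction_a


# Hypothetical reflectances in three spectral bands (illustrative values only).
vegetation = [0.05, 0.08, 0.50]
built_up = [0.20, 0.25, 0.30]

# Simulate a pixel covered 70% by vegetation and 30% by built-up land.
mixed = [0.7 * v + 0.3 * b for v, b in zip(vegetation, built_up)]

frac_veg, frac_built = unmix_two_endmembers(mixed, vegetation, built_up)
print(round(frac_veg, 3), round(frac_built, 3))
```

With more than two land cover classes, the same model becomes a constrained least-squares problem over a matrix of endmember spectra.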
Traditionally, navigational systems have supported preference metrics such as minimization of distance or travel time. Such preference metrics are probably sufficient for developed nations. However, navigation in a developing country has its own unique challenges, which open the way for novel CS research. For instance, in a developing nation setting, roads may not always be marked appropriately. In such a scenario, it would be interesting to develop computational techniques that can determine routes which balance the trade-off between optimality and ease of navigation (e.g., easily identifiable, with fewer turns). Techniques developed for this problem can be generalized to the broader problem of path-finding under multiple objective functions. For example, minimizing the number of traffic lights while keeping the overall length of the route within 20% of the optimal route.
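The traffic-light example can be sketched as a constrained shortest-path search. The code below is a minimal illustration, not the technique from the talk: it first runs Dijkstra's algorithm to find the optimal length, then runs a label-setting search that minimizes traffic lights among routes no longer than the optimal length plus a slack, here assumed to be 20%. The graph layout, function names, and edge values are all hypothetical.

```python
import heapq


def shortest_length(graph, source, target):
    """Plain Dijkstra on edge length; returns the optimal route length."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, length, _lights in graph.get(u, []):
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def fewest_lights_within_budget(graph, source, target, slack=0.20):
    """Minimize traffic lights subject to length <= (1 + slack) * optimal."""
    best = shortest_length(graph, source, target)
    if best == float("inf"):
        return None
    budget = best * (1.0 + slack)
    # Labels are popped in (lights, length) order, so the first label
    # that reaches the target has the fewest lights within the budget.
    heap = [(0, 0.0, source, [source])]
    settled = {}  # node -> list of non-dominated (lights, length) labels
    while heap:
        lights, length, u, path = heapq.heappop(heap)
        if u == target:
            return path, length, lights
        if any(l <= lights and d <= length for l, d in settled.get(u, [])):
            continue  # dominated by a label already expanded at this node
        settled.setdefault(u, []).append((lights, length))
        for v, edge_len, edge_lights in graph.get(u, []):
            new_len = length + edge_len
            if new_len <= budget:
                heapq.heappush(
                    heap, (lights + edge_lights, new_len, v, path + [v])
                )
    return None


# Hypothetical road network: node -> [(neighbour, length_km, traffic_lights)].
roads = {
    "A": [("B", 1.0, 2), ("C", 1.2, 1), ("E", 2.0, 0)],
    "B": [("D", 1.0, 2)],
    "C": [("D", 1.1, 0)],
    "E": [("D", 2.0, 0)],
}

path, length, lights = fewest_lights_within_budget(roads, "A", "D")
print(path, round(length, 2), lights)
```

In this example the shortest route A-B-D passes four lights, the route A-E-D passes none but is twice the optimal length, and the search returns A-C-D, which has the fewest lights among routes within the length budget.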
Spatial data infrastructure (SDI) is the infrastructure that facilitates the discovery, access, management, distribution, reuse, and preservation of digital geospatial assets. These assets include maps, data, Web services, and toolboxes. The role of SDI has become more significant in today’s big data age, now that geospatial data has become a vital component of planning, monitoring, and decision making. User demands for diverse geospatial data and Web services to solve the problems in their application domains are ever increasing. Building a geospatial platform catering to a large user base is a challenge on many fronts, such as generating large-scale data assets, creating software platforms for geospatial data handling, building the hardware platform, and optimizing the delivery mechanism. The talk focuses on these challenges and on solutions for building an enterprise-scale geospatial platform for data visualization, delivery, and governance applications at national scale.
Observations from the oceans are the backbone for providing operational oceanographic services (potential fishing zone advisory services, ocean state forecasts, storm surges, cyclones, monsoon variability, tsunamis, etc.), research and development, calibration and validation of satellite sensors, parameterizing key processes for models, and verifying model simulations. The Ocean Observing System (OOS) comprises in-situ and remote sensing platforms measuring a suite of marine meteorological and oceanographic parameters on a broad spectrum of spatial and temporal scales.
Preservation of long-term oceanographic data and its availability from a single source would facilitate a multi-disciplinary approach to understanding the oceans better and bringing out new insights. Further, technological advances over the last two decades in oceanographic sensors and in communication and computing systems have enabled ocean scientists to acquire data in real time from a variety of ocean observing platforms. Management of heterogeneous and voluminous oceanographic data is imperative to ensure high-quality data for research and for data-driven decision making in operational oceanographic services.
Ocean data management is becoming an integral part of ocean observation programmes, with emerging technologies such as sensor observing services and web-based data services serving the data with analytical and visualization features on the fly. We briefly present the ocean observing systems in the Indian Ocean, the operational oceanographic services provided by India, and the development of an ocean data and information system, an end-to-end data management system that provides data services to stakeholders, along with recent developments.
The Geographic Information Science & Technology (GIS&T) community has been making key contributions to evolving research challenges in data sciences. There has been a recent explosion in the amount of spatial data produced by devices such as smartphones, satellites, space telescopes, and medical devices, among others. The variety of this spatial data makes it widely used across important applications. This workshop will serve as a tutorial focusing on the challenges in geospatial data science, as well as a forum for interaction between researchers and the government and private agencies that are tasked with creating or leveraging geospatial data. The workshop aims to familiarize database researchers with the R&D opportunities in the area of geospatial technology and applications, and to encourage them to apply for research funding in these areas. It will highlight present national-level initiatives to create mission-critical geospatial infrastructure, make DB researchers aware of the challenging tasks of data collection, effective storage and access, map visualization, and data analysis, and invite them to participate in these research areas. While participants are expected to be interested in database technologies/data sciences, no specific background in geospatial technologies is expected.
The 4-hour workshop will include:
1) Keynote lectures that will cover major advances in geospatial data sciences and provide both the industry and research perspective.
2) Invited R&D talks by researchers in India working on geospatial data science problems.
3) Overview of R&D challenges/opportunities delivered by experts and practitioners from Government agencies.
Title: Evolution of Spatial Data Management: Isolated Spatial databases to Autonomous Spatial Data Platform
Bio : Siva Ravada is the Senior Director of Development for the Oracle Spatial and Graph and Mapping technologies at Oracle. Siva has been with Oracle for over 17 years, leading the spatial development activities at Oracle, both for on-premise systems and for cloud services. Prior to joining Oracle, Siva received his PhD degree from the University of Minnesota with a thesis on parallel algorithms for spatial databases. Under his leadership, spatial technology has been incorporated into different Oracle products, including the database, middleware, and applications. He is also a well-known researcher in the industry, with over 50 patents and publications in the area of spatial databases.
Title: Machine Learning for Big Spatial Data and Applications
Bio : Mohamed Mokbel (PhD, Purdue University, MSc, BSc, Alexandria University) is Chief Scientist at Qatar Computing Research Institute and a Professor at University of Minnesota. His current research interests focus on systems and machine learning techniques for big spatial data and applications. Mohamed is an ACM Distinguished Scientist. His research work has been recognized by the VLDB 10-years Best Paper Award, four conference Best Paper Awards, and the NSF CAREER Award. Mohamed is the past elected Chair of ACM SIGSPATIAL, current Editor-in-Chief for Distributed and Parallel Databases Journal, and on the editorial board of ACM Books, ACM TODS, VLDB Journal, ACM TSAS, and GeoInformatica journals. He has also served as PC Vice Chair of ACM SIGMOD and PC Co-Chair for ACM SIGSPATIAL and IEEE MDM.
Bio : Soumya K Ghosh is a Professor in the Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur (IIT Kharagpur), India. Prior to IIT Kharagpur, he worked for the Indian Space Research Organization in the area of remote sensing and geographic information systems for natural resource management. His primary areas of research include Spatial Data Science, Spatial Databases and Services, and Cloud Computing. He has published more than 250 articles in peer-reviewed journals and international conference proceedings. He has been awarded the National Geospatial Chair Professorship by the Department of Science and Technology, Government of India.
Bio : Umesh Bellur is a faculty member in the Department of Computer Science at IIT Bombay, which he currently heads. His areas of research are distributed computing and systems, and he is currently investigating problems in data center resource management as it pertains to serverless computing, large-scale event delivery networks, and infrastructure-as-a-service clouds. He has published over 50 peer-reviewed papers at well-respected fora and was awarded the IBM Faculty Award as well as the VMware award for systems research. Prior to joining IIT in 2003, he was the cofounder of a startup in Silicon Valley that was subsequently acquired by IBM.
Bio : Partha Sarathi Acharya is the CEO of the Indian NSDI and a Scientist at the Department of Science and Technology (DST), Government of India. He has been associated with DST’s flagship program titled ‘Natural Resources Data Management System (NRDMS)’, which led to the birth of the Spatial Data Infrastructure (SDI) Initiatives at national and state levels. He has interests in promoting scientific and technical activities in areas like framing geospatial policies, geospatial interoperability, spatial database design and data modelling, sectoral decision support systems, and development of training modules and kits. He obtained his Master of Science in Physics from Utkal University, Bhubaneswar and M.Phil. in Computer Science from Jawaharlal Nehru University, New Delhi. He has worked as a Visiting Fellow at the State University of New York, Buffalo, USA designing an experimental spatial data model for the development of GIS databases under the UNDP-assisted project “GIS-based Technologies for Local Level Development Planning”.
Bio : Sumit is a Sr. Scientist at the Advanced Lab for Geographic Information Sciences and Engineering at the Department of Computer Science and Engineering, IIT Bombay and RA Consultant to Genesys International. He completed his Masters at the City University, London and has worked as a researcher with IIT Bombay, University of Muenster, Keio University, Chubu University and Ordnance Survey. His primary research domain has been spatial databases and analytics. He has published papers in international journals and conferences on topics related to geospatial ontologies, data integration and spatial data infrastructures.