Multimodal geospatial data – Shrinking time from acquisition to actionable insights

Disputes and conflicts abroad call for rigorous contingency planning on the part of U.S. defense and intelligence operations. Part of this preparation involves having accurate, up-to-the-minute intelligence drawn from multiple geospatial modalities.

Geospatial knowledge has always been core to military intelligence, but its highly dynamic nature makes it vital to collapse the time window between data capture and its analysis and dissemination. Today, this latency is growing more pronounced as data volume, variety and velocity increase at a mind-boggling rate. The resulting lag in access to information leads to slower, less accurate decision-making, which can undermine geo-intelligence.

Why is geospatial data performance (fast query time) such an obstacle? Various types of geospatial data are often kept in purpose-built data silos. Not having all this data in one place is a major impediment, forcing geospatial analysts to resort to inefficient, time-consuming workarounds to consolidate and analyze data in aggregate, where it becomes inherently richer.

Any single dataset provides only a finite amount of information and serves a limited number of purposes. Integrating and linking these datasets is the key to unlocking valuable insights that lead to better decision-making. Spatial overlays – the joining and viewing together of separate datasets that share all or part of the same area – are one example. In a defense context, this ability to view data in aggregate by superimposing layers can translate into increased situational awareness, such as mapping terrain alongside the movement of people to identify hotspots for illegal border crossings or drug trafficking.
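As a concrete illustration of a spatial overlay, here is a minimal sketch using the open-source GeoPandas library to intersect two vector layers covering the same region; the file names, attribute names and choice of map projection are placeholder assumptions for illustration only:

    import geopandas as gpd

    # Two hypothetical vector datasets covering the same region
    # (file names and attributes are placeholders for illustration).
    terrain = gpd.read_file("terrain_zones.geojson")      # polygons of terrain types
    movements = gpd.read_file("movement_tracks.geojson")  # observed movement corridors

    # Reproject both layers to a common projected CRS so overlap areas are
    # meaningful; the right projection depends on the region being analyzed.
    terrain = terrain.to_crs(epsg=3857)
    movements = movements.to_crs(epsg=3857)

    # Spatial overlay: keep only the areas where the two layers intersect,
    # producing one layer whose rows carry attributes from both inputs.
    hotspots = gpd.overlay(terrain, movements, how="intersection")

    # Rank candidate hotspots by the size of the overlapping area.
    hotspots["overlap_area"] = hotspots.geometry.area
    print(hotspots.sort_values("overlap_area", ascending=False).head())

The value comes not from either layer on its own but from the combined attributes of the intersected geometries.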

Traditionally, data analysts have had to download multiple file formats and develop their own processing pipelines to synthesize and enrich data in this way. Before starting a processing task, an analyst has to search various databases to find the needed data, then download that complex data in multiple formats as inputs to a processing pipeline, with each input requiring its own API. In a defense example, target detection using hyperspectral data requires a custom processing pipeline incorporating aerial imagery for context and possibly point clouds for advanced 3D visualization.
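To make that fragmentation concrete, the sketch below shows the kind of per-format glue code such a pipeline typically begins with, assuming the open-source rasterio, laspy and GeoPandas libraries; the file paths are placeholders, not a reference to any specific system:

    import rasterio           # aerial/satellite imagery (e.g., GeoTIFF)
    import laspy              # LiDAR point clouds (LAS/LAZ)
    import geopandas as gpd   # vector layers (GeoJSON, shapefiles)

    # Each modality arrives in its own format and is read through its own API,
    # leaving the analyst to stitch three separate data models together by hand.
    with rasterio.open("aerial_scene.tif") as src:
        imagery = src.read()          # pixel bands as a NumPy array
        imagery_bounds = src.bounds   # spatial extent of the raster

    point_cloud = laspy.read("lidar_tile.las")
    xyz = (point_cloud.x, point_cloud.y, point_cloud.z)   # 3D point coordinates

    targets = gpd.read_file("candidate_targets.geojson")  # vector annotations

    # Aligning coordinate systems, resolutions and timestamps across these
    # three inputs is left entirely to custom, per-project code.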

Naturally, this convoluted approach hinders rapid data processing that spans multiple sources. Without a single, consolidated place for all geospatial analytics and machine learning, deeper data contextualization remains out of reach.

Rapid processing across multiple data sources is the key to achieving this integrated information richness that supports more informed decision-making. Beyond basic data access and capture, this type of analysis adds even more complexity because heterogeneous tools are used to analyze each data type. For example, advanced imagery analytics currently require custom tools with limited API integration. Imagine the power that could be unleashed if a single API optimized data access and could integrate all of these tools.

Finally, today’s geospatial analysts face restrictive computing limitations, given that data and compute are largely kept separate. Analysts often have to take the time to spin up clusters, a task outside their core competency that can slow time to insight even further. Advances in serverless architectures can eliminate this problem, allowing developers to spin applications up or down as often as needed, without worrying about hardware availability.

Any industry relying on geospatial data needs a new approach, one that is capable of delivering insights in minutes as opposed to days or weeks. This can be achieved through:

  • A single platform to support all data types – there needs to be an efficient and unified method to store and analyze all geospatial data and derivative results.
  • Distributed and highly scalable computing that allows geospatial analysts to fully embrace the cloud and run any pipeline at scale without having to spin up and manage clusters.
  • Finally, all of this needs to be accomplished while protecting sensitive information and ensuring data integrity. There should be secure, isolated on-premises capabilities that satisfy data sovereignty requirements for both your mission and your partners.

Geospatial knowledge continues to offer a deep well of insights that are used for the betterment of defense and intelligence operations and, at a higher level, human society. However, the volume, variety, and velocity of this data require a new approach to manage it cohesively since current methods are too fragmented. Doing so will be the key to maximizing the power of geospatial information in the coming years, hopefully transforming data into life-changing intelligence within increasingly short timespans.

Norman Barker is vice president of geospatial at TileDB.
