Spatial Intelligence is AI's ability to understand, reason about, and interact with the three-dimensional physical world. This emerging field combines computer vision, robotics, and physics simulation to enable AI systems to perceive depth, spatial relationships, and physical properties—moving beyond 2D understanding to true 3D world comprehension.
What is Spatial Intelligence?
Spatial Intelligence represents AI's next frontier—enabling machines to understand the 3D world as humans do. Unlike traditional AI that processes flat images or text, spatially intelligent systems can:
- Perceive and reason about 3D space
- Understand physical relationships between objects
- Navigate and manipulate the real world
- Build persistent spatial maps
- Simulate physics and predict outcomes
This technology powers autonomous vehicles, robotics, AR/VR, and the next generation of AI agents that interact with physical environments.
Core topics covered:
- 3D Scene Understanding: Depth perception, object localization, spatial relationships
- Computer Vision: LiDAR, RGB-D cameras, stereo vision, SLAM
- Robotics Navigation: Path planning, obstacle avoidance, embodied AI
- Large Geospatial Models (LGMs): Spatial AI foundation models
- World Models: Simulating physical environments
- AR/VR Applications: Spatial anchoring, mixed reality
- Autonomous Systems: Self-driving, drones, warehouse robots
- Physics Simulation: Understanding gravity, collision, dynamics
- Spatial Mapping: 3D reconstruction, point clouds
- Embodied AI: Agents that learn through physical interaction
- 3D Spatial Reasoning: Point clouds, camera operations, view switching
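Many of the topics above reduce to simple 3D geometry. As a minimal, self-contained sketch (the object coordinates and the camera-style axis convention are invented for illustration, not taken from any listed system), qualitative spatial relationships can be derived directly from object positions:

```python
# Toy 3D spatial-relationship reasoning: given object centers in a
# shared frame (x = right, y = up, z = forward from the viewer),
# derive qualitative relations plus a metric distance.
import math

def relations(a, b):
    """Qualitative spatial relations of object a relative to object b."""
    dx, dy, dz = (a[i] - b[i] for i in range(3))
    rels = []
    rels.append("right-of" if dx > 0 else "left-of")
    rels.append("above" if dy > 0 else "below")
    rels.append("in-front-of" if dz < 0 else "behind")  # smaller z = closer
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    return rels, dist

cup = (0.4, 0.9, 1.2)    # hypothetical detections (meters)
table = (0.0, 0.7, 1.2)
rels, dist = relations(cup, table)
print(rels, round(dist, 2))   # ['right-of', 'above', 'behind'] 0.45
```

Real systems operate on full 3D bounding boxes and handle frame-of-reference ambiguity (viewer-centric vs. object-centric), but the core computation is this kind of vector comparison.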
1. Niantic: Large Geospatial Models (LGMs)
- URL: https://www.nianticspatial.com/blog/spatial-intelligence-ai-breakthrough
- Description: Niantic (creators of Pokémon GO) is building Large Geospatial Models (LGMs)—the spatial counterpart to LLMs. Trained on billions of real-world images from 10M+ locations, LGMs enable AI to understand space and structures like humans do, inferring what the world looks like from different angles.
- Key Concepts:
- Large Geospatial Models (LGMs)
- Visual Positioning System (VPS)
- Persistent spatial anchors
- Real-world 3D mapping at scale
- "Operating system for the physical world"
- Why It's Groundbreaking: First company building a global-scale spatial AI model from crowdsourced AR data
- Applications: Enterprise AR, robotics navigation, spatial computing, digital twins
- Best For: Understanding the future of spatial AI, LGM architecture, real-world AI systems
2. World Labs (Fei-Fei Li): Spatial Intelligence World Models
- URL: https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence
- Blog Post: "From Words to Worlds: Spatial Intelligence is AI's Next Frontier" by Fei-Fei Li
- Description: Founded by legendary Stanford AI professor Fei-Fei Li (creator of ImageNet), World Labs is building foundational world models for spatial intelligence. The vision: AI that understands the semantically, physically, geometrically, and dynamically complex 3D world.
- Research Focus:
- World models and 3D generation
- Large-scale spatial training data
- New model architectures beyond 1D/2D sequences
- Embodied AI and robotics
- Scientific simulation
- Key Insight: "Building spatially intelligent AI requires world models—generative models capable of understanding, reasoning, generation, and interaction with complex 3D worlds far beyond today's LLMs."
- Why It Matters: Led by ImageNet creator, defining the next decade of AI research
- Best For: Understanding spatial AI vision, world models, future research directions
3. NVIDIA Cosmos: World Models for Physical AI
- URL: https://www.nvidia.com/en-us/glossary/world-models/
- URL: https://www.ibm.com/think/news/cosmos-ai-world-models (IBM's coverage of Cosmos world models)
- Description: NVIDIA's Cosmos platform enables world models that understand 3D dynamics, physics, and spatial properties. Powers Isaac Sim for robot training and autonomous vehicle simulation with realistic environments.
- Key Technologies:
- World models for robotics
- Physics simulation (Isaac Sim)
- Autonomous vehicle training
- Synthetic data generation at scale
- NVIDIA Jetson for edge spatial AI
- Applications: Factory robots, warehouse automation, self-driving cars, industrial robotics
- Open Source: Cosmos includes open-source models and simulation tools
- Best For: Robot simulation, synthetic training data, physics-aware AI, edge deployment
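None of the following is NVIDIA's API. As a toy illustration of the kind of dynamics a physics-aware world model or simulator must capture (predict the next physical state from the current one), here is a point mass under gravity bouncing on a ground plane; the restitution value is an assumption:

```python
# Minimal predict-the-next-state dynamics: semi-implicit Euler
# integration of a falling point mass with a ground-plane collision.
# Real simulators handle rigid bodies, friction, and contact solvers.
G = -9.81          # gravity (m/s^2)
RESTITUTION = 0.6  # fraction of velocity kept on bounce (assumed)

def step(y, vy, dt=0.01):
    """Advance height y (m) and velocity vy (m/s) by one timestep."""
    vy += G * dt
    y += vy * dt
    if y < 0.0:            # ground collision
        y = 0.0
        vy = -vy * RESTITUTION
    return y, vy

y, vy = 1.0, 0.0           # dropped from 1 m at rest
for _ in range(1000):      # simulate 10 seconds
    y, vy = step(y, vy)
print(round(y, 3))         # ball has essentially settled on the ground
```

A learned world model replaces the hand-written `step` with a network trained to make the same prediction from observations.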
4. Stanford University: Spatial AI Research
- URL: https://earth.stanford.edu/geospatial
- Research Groups: Computer Vision Lab, AI Lab, Robotics Lab
- Description: Stanford's cutting-edge research in spatial AI, led by pioneers like Fei-Fei Li, Silvio Savarese, and others. Focuses on 3D scene understanding, embodied AI, and spatial reasoning.
- Key Research Areas:
- 3D scene reconstruction
- Spatial reasoning in language models
- Embodied AI and robotics
- Visual navigation
- Multi-modal spatial learning
- Free Courses:
- CS231A: Computer Vision - From 3D Reconstruction to Recognition
- CS336: Robot Perception and Decision-Making
- Spatial Intelligence seminars
- Publications: Access via Stanford AI Lab website
- Best For: Academic research, PhD-level spatial AI, cutting-edge methods
5. MIT: SLAM and Spatial Computing Research
- URL: https://web.mit.edu/ (Search for spatial AI labs)
- Key Labs: CSAIL, Media Lab, AeroAstro (autonomous systems)
- Description: MIT's interdisciplinary research spanning computer vision, robotics, and spatial computing. Strong focus on embodied AI, SLAM, and autonomous navigation.
- Research Topics:
- Simultaneous Localization and Mapping (SLAM)
- Depth estimation from monocular images
- 3D object detection
- Spatial memory in neural networks
- AR/VR spatial computing
- Free Resources:
- MIT OpenCourseWare: 6.801 Machine Vision
- Spatial AI lectures and papers
- Open datasets (e.g., MIT Places)
- Best For: SLAM techniques, depth estimation, embodied AI research
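As background for the depth-estimation work listed above, the classic rectified-stereo relation Z = f·B/d is worth having in hand: monocular networks learn to predict the depth this geometry would give a calibrated rig. A small sketch (the focal length, baseline, and disparities below are invented):

```python
# Depth from disparity for a rectified stereo pair:
#   Z = f * B / d
# where f is focal length in pixels, B the camera baseline in meters,
# and d the disparity in pixels between the left and right views.
def depth_from_disparity(disparity_px, focal_px=700.0, baseline_m=0.12):
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (point at infinity)")
    return focal_px * baseline_m / disparity_px

for d in (84.0, 42.0, 21.0):
    print(f"disparity {d:5.1f}px -> depth {depth_from_disparity(d):.2f} m")
# halving the disparity doubles the depth: 1.00 m, 2.00 m, 4.00 m
```

The inverse relationship is why stereo depth error grows quadratically with distance, and why long-range perception leans on LiDAR or learned priors instead.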
6. UC Berkeley (BAIR): Robotics and Embodied AI
- URL: https://bair.berkeley.edu/ (Berkeley AI Research)
- URL: https://www.robolabs.org/summeratberkeley (VEX AI Summer Academy)
- Description: World-class research in robotics, computer vision, and spatial intelligence. Home to pioneers in deep RL (Sergey Levine), 3D vision, and embodied AI.
- Key Research:
- Robotic manipulation in 3D space
- Visual foresight (predicting future states)
- Object-centric spatial representations
- Embodied navigation
- Free Courses:
- CS194-26: Intro to Computer Vision and Computational Photography
- CS287: Advanced Robotics
- EE106A: Introduction to Robotics
- Summer Programs: VEX AI Robotics Academy (hands-on spatial AI)
- Best For: Robotic manipulation, visual prediction, academic courses
7. Esri GeoAI: Enterprise Geospatial AI
- URL: https://www.esri.com/en-us/geospatial-artificial-intelligence/overview
- Description: Enterprise geospatial AI platform combining GIS (Geographic Information Systems) with machine learning. Enables spatial analysis at massive scale with real-time monitoring and prediction.
- Key Features:
- AI-powered spatial analytics
- Anomaly detection in geographic data
- Predictive modeling for urban planning
- Real-time location intelligence
- Automated pattern recognition
- Integration with satellite imagery
- Applications: Urban planning, disaster response, supply chain optimization, environmental monitoring
- Free Resources:
- ArcGIS tutorials
- Spatial AI documentation
- Sample datasets
- Best For: Enterprise GIS, urban analytics, location intelligence, practical applications
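As a sketch of the kind of spatial analytics such platforms automate at scale (this is not Esri's API, and the coordinates are invented), here is anomaly detection on GPS fixes using the haversine great-circle distance and a robust center:

```python
# Flag GPS fixes implausibly far from the group's center. The median
# (rather than the mean) is used so the outlier cannot drag the
# reference point toward itself.
import math
from statistics import median

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + \
        math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 6371.0 * 2 * math.asin(math.sqrt(a))

fixes = [(37.7749, -122.4194), (37.7751, -122.4190),
         (37.7747, -122.4199), (40.7128, -74.0060)]  # last one is NYC
center = (median(p[0] for p in fixes), median(p[1] for p in fixes))
outliers = [p for p in fixes if haversine_km(p, center) > 100]
print(outliers)   # only the New York fix is flagged
```

Production systems layer temporal context, density-based clustering, and learned models on top, but distance-to-a-robust-center is the canonical first pass.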
8. Google ARCore Geospatial API
- URL: https://developers.google.com/ar/develop/geospatial
- Description: Google's platform for building AR experiences with spatial understanding. Geospatial API enables global-scale AR anchored to real-world locations using Visual Positioning Service (VPS).
- Key Features:
- Visual Positioning System (VPS)
- Global localization (100+ countries)
- Persistent Cloud Anchors
- Environmental understanding
- Light estimation
- Depth API
- Free Tools: ARCore SDK, Geospatial Creator, extensive documentation
- Applications: AR navigation, location-based experiences, spatial commerce
- Best For: AR developers, mobile spatial apps, global-scale localization
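The actual API lives in the Android and Unity SDKs; the sketch below is only the coordinate math a geospatial anchor relies on, namely converting a target's latitude/longitude into local east/north offsets from the device. A small-area equirectangular approximation is used, and both coordinates are invented:

```python
# Latitude/longitude -> local east/north offsets (meters) from the
# device, via an equirectangular approximation valid over short
# distances. Real systems use full geodetic (e.g., ECEF/ENU) math.
import math

EARTH_R = 6371000.0  # mean Earth radius, meters

def enu_offset(device, target):
    """(east, north) meters from device (lat, lon) to target (lat, lon)."""
    lat0, lon0 = map(math.radians, device)
    lat1, lon1 = map(math.radians, target)
    east = (lon1 - lon0) * math.cos((lat0 + lat1) / 2) * EARTH_R
    north = (lat1 - lat0) * EARTH_R
    return east, north

# Hypothetical anchor roughly 100 m north-east of the device:
device = (48.85837, 2.29448)
anchor = (48.85900, 2.29545)
e, n = enu_offset(device, anchor)
print(f"place content {e:.1f} m east, {n:.1f} m north")
```

VPS exists precisely because raw GPS cannot supply the device pose accurately enough for offsets like these to look anchored at centimeter scale.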
9. Think3D: Interactive 3D Chain-of-Thought Reasoning for VLMs (arXiv Jan 2026)
- URL: https://arxiv.org/abs/2601.13029
- GitHub: https://github.com/zhangzaibin/spagent
- Description: Breakthrough framework enabling Vision-Language Models (VLMs) to reason in 3D space rather than over flat 2D perception. Uses 3D reconstruction (point clouds, camera poses) so agents can actively manipulate space, switch views (ego/global), and perform interactive 3D chain-of-thought reasoning.
- Key Innovation:
- Training-free spatial reasoning (+7.8% on BLINK/MindCube, +4.7% on VSI-Bench)
- Active viewpoint selection via reinforcement learning
- 3D point cloud manipulation
- Ego-centric and global view switching
- Solves visual ambiguity through spatial exploration
- Performance: Significantly improves GPT-4.1 and Gemini 2.5 Pro on spatial benchmarks
- Applications: Spatial VQA, 3D scene understanding, robot navigation, AR/VR
- Research Impact: First to demonstrate training-free 3D reasoning for VLMs
- Best For: Advanced spatial reasoning, VLM enhancement, 3D cognitive systems
- [Tags: spatial-reasoning, 3d-chain-of-thought, vlm, point-clouds, arxiv, 2026]
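The ego/global view switch at the heart of this kind of framework rests on standard rigid-body geometry: a point expressed in the global (world) frame is re-expressed in the camera's ego frame via the inverse extrinsic, p_ego = Rᵀ(p_world − t). A sketch (not the paper's implementation; the pose values are made up):

```python
# World -> ego (camera) frame conversion with plain lists.
# R maps ego axes into world coordinates; t is the camera position
# in the world frame, so the inverse transform is R^T @ (p - t).
def world_to_ego(p, R, t):
    """Apply the inverse rigid transform R^T @ (p - t)."""
    d = [p[i] - t[i] for i in range(3)]
    return [sum(R[i][j] * d[i] for i in range(3)) for j in range(3)]

R = [[0.0, -1.0, 0.0],   # columns = camera axes in world coords:
     [1.0,  0.0, 0.0],   # ego x-axis points along world +y (90° yaw)
     [0.0,  0.0, 1.0]]
t = [1.0, 2.0, 0.0]      # camera position in the world frame
p_world = [1.0, 3.0, 0.0]
print(world_to_ego(p_world, R, t))  # [1.0, 0.0, 0.0]: 1 m along ego x
```

Switching the other way (ego to global) is the forward transform p_world = R·p_ego + t; a reasoning loop that re-renders a point cloud from chosen viewpoints is applying exactly these two maps.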
10. Spa3R: Predictive Spatial Field Modeling for 3D Visual Reasoning (arXiv Feb 2026) ⭐ NEW 🔴 Advanced
- URL: https://arxiv.org/abs/2602.21186
- GitHub: https://github.com/hustvl/Spa3R
- Description: Novel self-supervised framework learning unified, view-invariant spatial representations from unposed multi-view images. Introduces Predictive Spatial Field Modeling (PSFM) paradigm where models synthesize feature fields for arbitrary views, achieving 58.6% SOTA accuracy on VSI-Bench 3D VQA.
- Key Innovation:
- Self-supervised learning from 2D images (no 3D data required)
- View-invariant spatial representations
- Predictive Spatial Field Modeling (PSFM)
- Holistic 3D scene understanding
- Lightweight adapter for VLM integration
- Technical Approach: Learns to predict spatial fields conditioned on compact latent representations, enabling VLMs to reason with global spatial context
- Performance: 58.6% accuracy on 3D VQA (state-of-the-art)
- Significance: Proves spatial intelligence can emerge from 2D vision alone without explicit 3D instruction tuning
- Applications: Vision-language models, 3D scene understanding, spatial VQA
- Best For: Spatial field modeling, VLM grounding, scalable spatial intelligence
- [Tags: spatial-field-modeling, psfm, self-supervised, vlm, sota, arxiv, 2026]
11. SpatialReasoner: Flexible 3D Spatial Reasoning Framework (GitHub 2024) 🟡 Intermediate | 🔴 Advanced
- URL: https://github.com/metason/SpatialReasoner
- Description: Open-source framework for flexible 3D spatial reasoning with 100+ spatial predicates and corresponding relations. Handles fuzzy spatial situations, confidence measures, and semantic processing in 3D for XR, AR, VR, and large world models.
- Key Features:
- XR-focused (real & virtual 3D objects)
- 100+ spatial predicates (distance, orientation, containment, topology)
- Fuzzy logic for imprecise detections
- Confidence handling
- Spatial Reasoner Syntax for 3D queries
- Integration with LLMs and Large World Models (LWM)
- Voice interaction in space
- Applications:
- AR/VR spatial queries
- Object classification by spatial relations
- Spatial rule engines
- Semantic 3D understanding
- Voice-controlled spatial interaction
- Open Source: Fully free, active development
- Best For: 3D spatial logic, XR applications, semantic spatial processing, rule engines
- [Tags: 3d-reasoning, spatial-predicates, xr, fuzzy-logic, open-source, github, 2024]
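A toy fuzzy spatial predicate in the spirit of this framework (not its actual syntax or API; the thresholds and object positions are invented): "near" returns a confidence in [0, 1] that degrades smoothly with distance, so imprecise detections yield graded answers instead of a brittle true/false.

```python
# Fuzzy "near" predicate for two 3D points: confidence 1.0 inside
# full_at meters, 0.0 beyond zero_at, and a linear ramp in between.
import math

def near(a, b, full_at=0.5, zero_at=3.0):
    d = math.dist(a, b)
    if d <= full_at:
        return 1.0
    if d >= zero_at:
        return 0.0
    return (zero_at - d) / (zero_at - full_at)

chair = (0.0, 0.0, 0.0)
lamp = (1.5, 0.0, 1.0)               # hypothetical detected positions
print(round(near(chair, lamp), 2))   # 0.48: "somewhat near"
```

A rule engine then combines such predicate confidences (e.g., with min/max as fuzzy AND/OR) to answer compound queries like "the lamp near and above the chair".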
- Large Geospatial Models (LGMs): The spatial equivalent of Large Language Models. Trained on billions of location-tagged images to understand the 3D structure of the world; can infer hidden information (e.g., what's behind a building) and reason spatially.
- World Models: AI systems that build internal representations of 3D environments, simulate physics, and predict future states, enabling robots to plan actions by mentally simulating outcomes. (See World Models category)
- Visual Positioning System (VPS): Localization technology that uses computer vision to determine precise position and orientation in 3D space, with centimeter-level precision far beyond GPS.
- Simultaneous Localization and Mapping (SLAM): Algorithms that enable robots to build maps of unknown environments while tracking their own position within those maps in real time.
- Embodied AI: Agents that learn through physical interaction with the environment, building spatial understanding through experience (as humans do).
- 3D Chain-of-Thought: Interactive spatial reasoning in which VLMs actively explore 3D scenes through viewpoint manipulation, reconstruction, and progressive hypothesis refinement.
- Predictive Spatial Field Modeling (PSFM): Learning paradigm in which models predict spatial feature fields for unseen viewpoints, enabling view-invariant spatial understanding without explicit 3D supervision.
- Autonomous Vehicles: 3D scene understanding, path planning, obstacle detection
- Robotics: Warehouse automation, surgical robots, manipulation in 3D
- AR/VR: Spatial anchoring, occlusion, realistic interactions
- Smart Cities: Urban planning, traffic optimization, infrastructure monitoring
- Drones: Navigation, mapping, inspection
- Construction: Site monitoring, progress tracking, digital twins
- Healthcare: Surgical planning, spatial anatomy visualization
- Retail: Spatial commerce, virtual try-on
- Gaming: Realistic physics, environmental interaction
- Spatial VQA: Answering questions about 3D scenes and spatial relationships
- World Models - Simulating physical environments
- Computer Vision - Visual perception systems
- Robotics & Embodied AI - Physical AI agents
- Multimodal AI - Multi-sensor fusion
- Autonomous Systems - Self-driving technology
- AR/VR Development - Spatial computing
- Stanford AI Resources - Fei-Fei Li's courses
- MIT AI Resources - SLAM and robotics
- Berkeley AI Resources - Robotic manipulation
Resource Count: 11 platforms, research groups, and cutting-edge papers
Market Size: Spatial AI market projected to reach $300B+ by 2030
Key Players: Niantic, World Labs, NVIDIA, Google, Meta
Research Hubs: Stanford, MIT, Berkeley, CMU
Latest Research: Think3D (Jan 2026), Spa3R (Feb 2026)
Last Updated: February 28, 2026
Beginners:
- Read Fei-Fei Li's "From Words to Worlds" blog post
- Explore Google ARCore tutorials
- Learn basic computer vision (Stanford CS231n)
Intermediate:
- Study Niantic's LGM approach
- Experiment with NVIDIA Isaac Sim
- Learn SLAM basics (MIT courses)
- Explore SpatialReasoner framework
Advanced:
- Research world models (NVIDIA Cosmos)
- Study academic papers from Stanford/MIT
- Implement Think3D or Spa3R frameworks
- Build spatial AI projects with real robots
- Explore cutting-edge arXiv papers (2026)
To add a resource:
✅ Spatial Focus: Must relate to 3D understanding, not just 2D vision
✅ Free Access: Documentation, courses, or tools available at no cost
✅ Reputable Source: Academic institutions, established tech companies, research labs
✅ Active Development: Ongoing research or product updates
Format:
- [Resource Name](URL) - Description emphasizing spatial intelligence applications and unique features.

Sources: Niantic, World Labs, NVIDIA, Stanford, MIT, Berkeley, Esri, Google, arXiv (2024-2026)