Building an AI Companion for Vision Pro
The Vision
I'm building an AI companion for Apple Vision Pro—a 3D avatar I can see, talk to, and interact with across all my virtual experiences.
This isn't a prototype or experiment. This is a production system designed to:
- Help me learn languages through real-world conversation
- Answer questions while I read technical documentation
- Join me in gaming experiences
- Maintain presence and context across different spatial environments
Why Vision Pro
I purchased Vision Pro because I believe spatial computing is the platform of the future. Not in 5 years—now.
The hardware exists. The APIs exist. The only thing missing is the software that makes spatial computing feel essential rather than novel.
The Architecture
This companion requires integrating multiple complex systems:
- 3D Avatar System - RealityKit entities with skeletal animation
- Speech Pipeline - Real-time voice recognition and synthesis
- AI Brain - Context-aware language model integration
- Animation Controller - Lip sync and responsive body language
- Spatial Context - Understanding and adapting to different environments
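To make the integration concrete, here is a rough Swift sketch of how these subsystems might be wired together. Every protocol and type name below is a placeholder invented for illustration, not a settled design: a coordinator owns the avatar entity, routes each transcript from the speech pipeline through the AI brain, and drives the animation controller while the reply is spoken.

```swift
import Foundation
import RealityKit

// Hypothetical seams between the subsystems; all names are placeholders.
protocol SpeechPipeline {
    func startListening(onTranscript: @escaping (String) -> Void)
    func speak(_ text: String) async
}

protocol CompanionBrain {
    func respond(to utterance: String, context: SpatialContext) async -> String
}

protocol AnimationController {
    func playLipSync(for duration: TimeInterval)
    func playIdle()
}

struct SpatialContext {
    var environmentDescription: String   // e.g. "immersive space", "shared window"
}

// Coordinator that ties the avatar, speech, brain, and animation together.
final class CompanionCoordinator {
    let avatar: Entity                   // RealityKit entity with a skeleton
    let speech: SpeechPipeline
    let brain: CompanionBrain
    let animator: AnimationController
    var context: SpatialContext

    init(avatar: Entity, speech: SpeechPipeline, brain: CompanionBrain,
         animator: AnimationController, context: SpatialContext) {
        self.avatar = avatar
        self.speech = speech
        self.brain = brain
        self.animator = animator
        self.context = context
    }

    func start() {
        speech.startListening { [weak self] transcript in
            guard let self else { return }
            Task {
                // Voice in -> AI reasoning -> voice out, with matching animation.
                let reply = await self.brain.respond(to: transcript, context: self.context)
                self.animator.playLipSync(for: 2.0)   // placeholder duration
                await self.speech.speak(reply)
                self.animator.playIdle()
            }
        }
    }
}
```

Building real implementations behind these seams is what the plan below is for.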
The Approach
Rather than rushing to code, I'm starting with fundamentals:
Weeks 1-2: Swift Foundation
Understanding the language, type system, and SwiftUI patterns.
Weeks 2-3: visionOS Mental Model
Windows, Volumes, Immersive Spaces—how spatial computing actually works.
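As a concrete reference point, this is roughly how the three presentation styles are declared in a visionOS app. The scene identifiers and the placeholder sphere are purely illustrative.

```swift
import SwiftUI
import RealityKit

@main
struct CompanionApp: App {
    var body: some Scene {
        // A regular 2D window that coexists with other apps in the Shared Space.
        WindowGroup(id: "controls") {
            Text("Companion Controls")
        }

        // A Volume: a bounded 3D region the avatar could stand inside.
        WindowGroup(id: "avatar-volume") {
            RealityView { content in
                content.add(ModelEntity(mesh: .generateSphere(radius: 0.1)))
            }
        }
        .windowStyle(.volumetric)

        // An Immersive Space: the companion takes over the surroundings.
        ImmersiveSpace(id: "companion-space") {
            RealityView { content in
                content.add(ModelEntity(mesh: .generateSphere(radius: 0.1)))
            }
        }
    }
}
```

An immersive space also has to be opened explicitly (via the `openImmersiveSpace` environment action), which is part of what makes this mental model different from a plain SwiftUI app.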
Weeks 3-5: RealityKit & 3D
Loading models, animations, spatial audio, and entity management.
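A minimal sketch of that in RealityKit, assuming a hypothetical `Companion` USDZ asset in the app bundle that ships with an idle animation:

```swift
import RealityKit

// Load the avatar, start its idle animation, and give it spatial audio.
// "Companion" is a hypothetical asset name, not a real file in this project yet.
func loadAvatar() async throws -> Entity {
    let avatar = try await Entity(named: "Companion")

    // Play the first baked-in animation (e.g. an idle loop), repeating forever.
    if let idle = avatar.availableAnimations.first {
        avatar.playAnimation(idle.repeat())
    }

    // Spatial audio so the voice is localized to the avatar's position.
    avatar.components.set(SpatialAudioComponent())

    return avatar
}
```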
Weeks 5-7: Speech & AI Integration
Connecting voice input to AI reasoning to audio output.
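Below is a rough sketch of that loop using Apple's Speech and AVFoundation frameworks. `queryModel` is a stand-in for whichever language-model backend the companion ends up using, and authorization handling (microphone access and speech-recognition permission) is omitted.

```swift
import Speech
import AVFoundation

// Voice in -> AI -> voice out. Permission requests are omitted for brevity.
final class VoiceLoop {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let audioEngine = AVAudioEngine()
    private let synthesizer = AVSpeechSynthesizer()
    private var recognitionTask: SFSpeechRecognitionTask?

    // Stream microphone audio into the recognizer; hand final transcripts to the AI.
    func startListening() throws {
        let request = SFSpeechAudioBufferRecognitionRequest()
        let input = audioEngine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()

        recognitionTask = recognizer?.recognitionTask(with: request) { [weak self] result, _ in
            guard let result, result.isFinal else { return }
            Task { await self?.respond(to: result.bestTranscription.formattedString) }
        }
    }

    private func respond(to utterance: String) async {
        let reply = await queryModel(utterance)        // hypothetical AI call
        let speech = AVSpeechUtterance(string: reply)
        speech.voice = AVSpeechSynthesisVoice(language: "en-US")
        synthesizer.speak(speech)
    }

    // Placeholder until the real model integration exists.
    private func queryModel(_ prompt: String) async -> String {
        "I heard: \(prompt)"
    }
}
```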
Weeks 7-8: First Milestone
A 3D avatar standing in space, speaking intelligently in response to my voice.
Following Along
I'll document every step of this journey here: the successes, the roadblocks, the architectural decisions, and the lessons learned.
This blog exists to capture the process of building production software for a platform that's still defining itself.
Next post: Swift fundamentals and the visionOS development environment setup.