We build a street‑scale physical AI network that makes cities machine‑readable.
Every street, building, and public space generates continuous signal — movement, occupancy, density, environmental state, context. None of it has been understood at scale, because no system existed that could perceive it, process it, and act on it in real time. Our answer: multimodal AI, running on our own hardware, deployed into the fabric of cities.
Cities have been instrumented for decades. Sensors on every corner. Cameras on every block. Screens on every street. The infrastructure of observation is already there. The devices stay isolated, the data stays siloed — no shared model of what is actually happening in physical space, no system that sees across environments, infers meaning, and exposes that intelligence at scale. The gap is not sensors. The gap is cognition.
On‑device models convert raw street‑level signals — movement, acoustics, environment, context — into structured events at the edge. XOO does not build the sensors that capture those signals; it builds the physical AI layer that turns any sensor mix into decision‑grade context. Those events feed a single intelligence layer — one schema, one context model, one data layer. With every new node the model compounds, and physical space starts to behave like software infrastructure.
On‑device models ingest movement, acoustics, environment, and context from existing infrastructure and emit structured events in milliseconds. Raw data never leaves the node. Only anonymized, structured events hit the network — privacy guaranteed by architecture, not policy.
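As an illustration of what a "structured, anonymized event" could look like on the wire — the schema and all field names here are assumptions for the sketch, not XOO's actual format:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class EdgeEvent:
    """Hypothetical anonymized event emitted by a node.

    No raw frames or audio leave the device -- only aggregate,
    non-identifying measurements tagged with a node ID and timestamp.
    """
    node_id: str       # which node emitted the event
    ts_ms: int         # Unix timestamp in milliseconds
    event_type: str    # e.g. "occupancy", "noise_level"
    value: float       # aggregate measurement, never a raw sample
    confidence: float  # model confidence in [0, 1]

def to_wire(event: EdgeEvent) -> str:
    """Serialize an event for transmission to the cloud layer."""
    return json.dumps(asdict(event))

evt = EdgeEvent("node-042", 1717000000000, "occupancy", 12.0, 0.93)
print(to_wire(evt))
```

The point of the shape, not the names: everything identifying stays on the device, and only the aggregate leaves it.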
Edge events feed a unified time‑series backend. Partners and operators consume structured context — feeds, dashboards, direct integrations — all reading from the same model. Because XOO owns the node, XOO owns the schema. That is not an integration advantage. It is a structural moat.
Data products and programmatic services run directly on the event layer. Each node is economically self‑sustaining from day one. Every deployment expands the network, improves the model, and creates value for all stakeholders simultaneously.
Every XOO node is a purpose‑built multimodal compute unit deployed into public space under visible functions like charging infrastructure, screens, or street furniture. The node does not exist to capture more raw data — it exists to run physical AI on top of the sensors that are already there. Models on‑device fuse visual, acoustic, and environmental input into structured events and decision‑grade context. XOO owns the edge and the inference layer; the underlying sensor mix can change without touching the intelligence.
HARDWARE: Purpose‑built multimodal compute nodes with embedded inference — attached to existing sensors and host functions like charging, media, or monitoring.
PROCESS: On‑device physical AI. Raw sensor data never leaves the node; models fuse multimodal input into structured, anonymized events and decision‑grade context.
CLOUD: Edge events feed time‑series models and real‑time context surfaces, exposing one unified intelligence layer via APIs and dashboards.
COMPOUND: Every node improves model accuracy. Every deployment increases network value for every node already live.
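The CLOUD step above — many nodes feeding one time‑series backend — can be sketched as a rollup from edge events into per‑minute buckets, the kind of unified read a dashboard or API would serve. The function and field names are illustrative assumptions, not XOO's implementation:

```python
from collections import defaultdict

def rollup(events, bucket_ms=60_000):
    """Average edge events into time buckets keyed by
    (node, event_type, bucket) -- one schema, one read path
    for every consumer. Illustrative only."""
    buckets = defaultdict(list)
    for e in events:
        key = (e["node_id"], e["event_type"], e["ts_ms"] // bucket_ms)
        buckets[key].append(e["value"])
    return {k: sum(v) / len(v) for k, v in buckets.items()}

# Hypothetical sample: two nodes reporting into the same backend.
events = [
    {"node_id": "n1", "event_type": "occupancy",   "ts_ms": 0,      "value": 10.0},
    {"node_id": "n1", "event_type": "occupancy",   "ts_ms": 30_000, "value": 20.0},
    {"node_id": "n2", "event_type": "noise_level", "ts_ms": 61_000, "value": 55.0},
]
series = rollup(events)
```

Because every node emits the same schema, adding a node adds rows, not integration work.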
The physical AI layer is built once. Use cases are different reads of the same events — not separate products and not separate deployments. One multimodal node generates decision‑grade signal for every vertical simultaneously. Adding a vertical means updating a model, not installing new hardware.
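The claim above — one event stream, many vertical reads — reduces to filters over the same data. A minimal sketch, with hypothetical event types standing in for real verticals:

```python
def read_for_dooh(events):
    """Ad-tech read: audience/occupancy events for attribution."""
    return [e for e in events if e["event_type"] == "occupancy"]

def read_for_esg(events):
    """ESG read: environmental events from the very same stream."""
    return [e for e in events if e["event_type"] == "noise_level"]

# One shared stream, two vertical views -- no new hardware either way.
stream = [
    {"event_type": "occupancy",   "value": 14.0},
    {"event_type": "noise_level", "value": 62.5},
    {"event_type": "occupancy",   "value": 9.0},
]
```

A new vertical is a new read function over events the nodes already emit.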
Edge hardware is cheap enough, regulation is pushing privacy-first design, outdoor media is going data-driven, and cities can’t afford to build infrastructure the old way. The company that moves now defines the standard – before the market consolidates around someone else’s schema.
Running AI directly on street-level sensors is no longer expensive. The hardware race is over. What remains is the software layer that turns raw sensor data into structured intelligence – and whoever ships that layer first owns the installed base.
Programmatic DOOH is growing fast, but it’s flying blind at street level. Advertisers are buying impressions they can’t verify. A physical AI layer that measures actual movement and attention becomes the attribution standard the industry is waiting for.
GDPR and its successors make centralized tracking architectures slow and legally fragile. Systems that process data on-device and emit only anonymized events can scale in markets where traditional surveillance infrastructure stalls.
Demand for public charging, monitoring, and safer streets is rising. Budgets are not. The only question is how to fund rollouts without constant subsidy.
Most teams can do one of three things: build hardware, train models, or ship platform infrastructure at city scale. XOO does all three — integrated, under one stack.
Scaled mobility hardware across Europe — Bosch Mobility Services, Coup scooter sharing. Owns rollout architecture, partner distribution, and network economics. Knows how physical infrastructure gets deployed and financed at scale — asset leasing, revenue share, operator relationships.
10 years in system engineering and platform development. Builds the event layer that turns edge inference into structured context for partners and operators. Background in sourcing and procurement — engineers for real-world rollout constraints, not ideal conditions.
Builds the product narrative and vertical use cases — Ad Tech, ESG, Safety. Drives go-to-market from pilot to repeatable city rollout. Founded urbanmates, managed Flynn electric scooter. Knows how to turn an infrastructure deployment into a market position.
5 years in computer vision and AI. 3 years in context analytics and graph architecture. Turns multimodal sensor signals into a robust event and context model — the layer that makes one node readable across every vertical simultaneously.




Have questions? We’re happy to help.