The physical world is the largest uncomputed environment on earth.

XOO builds the system that computes it.

We build a street‑scale physical AI network that makes cities machine‑readable.

Every street, building, and public space generates continuous signal: movement, occupancy, density, environmental state, context. None of it has ever been understood at scale, because no system existed that could perceive it, process it, and act on it in real time. XOO is that system: multimodal AI, running on our own hardware, deployed into the fabric of cities.

01 - THE PROBLEM

Perception without intelligence is noise.

None of it thinks.

Cities have been instrumented for decades. Sensors on every corner. Cameras on every block. Screens on every street. The infrastructure of observation is already there. The devices stay isolated, the data stays siloed — no shared model of what is actually happening in physical space, no system that sees across environments, infers meaning, and exposes that intelligence at scale. The gap is not sensors. The gap is cognition.

02 - THE SOLUTION

XOO: Physical AI network for city‑scale inference.

Every node runs multimodal AI. Every event feeds one global model.

On‑device models convert raw street‑level signals — movement, acoustics, environment, context — into structured events at the edge. XOO does not build the sensors that capture those signals; it builds the physical AI layer that turns any sensor mix into decision‑grade context. Those events feed a single intelligence layer — one schema, one context model, one data layer. With every new node the model compounds, and physical space starts to behave like software infrastructure.

EDGE AI

On‑device models process multimodal sensor input

Movement, acoustics, environment, and context signals come from existing infrastructure; the node turns them into structured events in milliseconds. Raw data never leaves the node. Only anonymized, structured events hit the network: privacy guaranteed by architecture, not policy.
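As a rough illustration of what "only structured events leave the node" could look like, here is a minimal sketch. The `StreetEvent` schema, its field names, and the `summarize_window` helper are assumptions made for this example, not XOO's actual event format; the detections list stands in for on-device model output.

```python
from dataclasses import dataclass, asdict
from typing import List


# Hypothetical schema: field names are illustrative, not XOO's actual format.
@dataclass(frozen=True)
class StreetEvent:
    node_id: str
    window_start: int  # Unix seconds, start of the aggregation window
    window_s: int      # window length in seconds
    kind: str          # e.g. "pedestrian_count"
    value: float


def summarize_window(node_id: str, window_start: int, window_s: int,
                     detections: List[dict]) -> StreetEvent:
    """Reduce raw detections to one aggregate count.

    Only the number of detections labelled "person" survives; bounding
    boxes and any other per-detection detail stay on the node.
    """
    persons = sum(1 for d in detections if d.get("label") == "person")
    return StreetEvent(node_id, window_start, window_s,
                       "pedestrian_count", float(persons))


# Stand-in for on-device model output within one 10-second window.
raw = [
    {"label": "person", "bbox": (10, 20, 50, 120)},
    {"label": "person", "bbox": (200, 30, 240, 130)},
    {"label": "bicycle", "bbox": (90, 60, 160, 110)},
]
event = summarize_window("node-001", 1_700_000_000, 10, raw)
print(asdict(event))
```

The point of the sketch is the shape of the boundary: the function's return type has no field that could carry raw sensor data, so nothing downstream can receive it.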

BACKEND

One global intelligence layer

Edge events feed a unified time‑series backend. Partners and operators consume structured context — feeds, dashboards, direct integrations — all reading from the same model. Because XOO owns the node, XOO owns the schema. That is not an integration advantage. It is a structural moat.
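To make "a unified time-series backend" concrete, the sketch below groups edge events into fixed time buckets that every consumer could read from. The event tuple layout, the `bucket_events` name, and the 60-second bucket size are assumptions for illustration only; a production backend would use a real time-series store.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event tuples: (node_id, unix_seconds, kind, value).
Event = Tuple[str, int, str, float]
SeriesKey = Tuple[str, str, int]  # (node_id, kind, bucket_start)


def bucket_events(events: Iterable[Event],
                  bucket_s: int = 60) -> Dict[SeriesKey, float]:
    """Sum event values per (node, kind, bucket start)."""
    series: Dict[SeriesKey, float] = defaultdict(float)
    for node_id, ts, kind, value in events:
        bucket_start = ts - (ts % bucket_s)  # floor to the bucket boundary
        series[(node_id, kind, bucket_start)] += value
    return dict(series)


events = [
    ("node-001", 120, "pedestrian_count", 3.0),
    ("node-001", 150, "pedestrian_count", 2.0),
    ("node-001", 185, "pedestrian_count", 4.0),
    ("node-002", 130, "pedestrian_count", 1.0),
]
series = bucket_events(events)
for key in sorted(series):
    print(key, series[key])
```

Feeds, dashboards, and integrations would then all query the same keyed series rather than keeping their own copies.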

SERVICES

Built to fund rollout

Data products and programmatic services run directly on the event layer. Each node is economically self‑sustaining from day one. Every deployment expands the network, improves the model, and creates value across all stakeholders simultaneously.

03 - THE PLATFORM

XOO builds the nodes.

The compute engine runs inside.

Every XOO node is a purpose‑built multimodal compute unit deployed into public space under visible functions like charging infrastructure, screens, or street furniture. The node does not exist to capture more raw data — it exists to run physical AI on top of the sensors that are already there. Models on‑device fuse visual, acoustic, and environmental input into structured events and decision‑grade context. XOO owns the edge and the inference layer; the underlying sensor mix can change without touching the intelligence.

HARDWARE: Purpose‑built multimodal compute nodes with embedded inference — attached to existing sensors and host functions like charging, media, or monitoring.

PROCESS: On‑device physical AI. Raw sensor data never leaves the node; models fuse multimodal input into structured, anonymized events and decision‑grade context.

CLOUD: Edge events feed time‑series models and real‑time context surfaces, exposing one unified intelligence layer via APIs and dashboards.

COMPOUND: Every node improves model accuracy. Every deployment increases network value for every node already live.

04 - USE CASES

One inference layer. Many markets.

One node. Multiple revenue lines.

The physical AI layer is built once. Use cases are different reads of the same events — not separate products and not separate deployments. One multimodal node generates decision‑grade signal for every vertical simultaneously. Adding a vertical means updating a model, not installing new hardware.
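A small sketch of "different reads of the same events": two verticals query one shared event list with different functions. The event fields, the `pm25_ugm3` kind, and both function names are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional

# Hypothetical events shared by every vertical; field names are illustrative.
events: List[Dict] = [
    {"node_id": "node-001", "kind": "pedestrian_count", "value": 12.0},
    {"node_id": "node-001", "kind": "pm25_ugm3", "value": 9.5},
    {"node_id": "node-002", "kind": "pedestrian_count", "value": 30.0},
    {"node_id": "node-002", "kind": "pm25_ugm3", "value": 14.5},
]


def audience_by_node(evts: List[Dict]) -> Dict[str, float]:
    """Ad-tech read: total pedestrian count per node."""
    totals: Dict[str, float] = {}
    for e in evts:
        if e["kind"] == "pedestrian_count":
            totals[e["node_id"]] = totals.get(e["node_id"], 0.0) + e["value"]
    return totals


def mean_pm25(evts: List[Dict]) -> Optional[float]:
    """Environment read: mean PM2.5 across all nodes, or None if no readings."""
    readings = [e["value"] for e in evts if e["kind"] == "pm25_ugm3"]
    return sum(readings) / len(readings) if readings else None


print(audience_by_node(events))
print(mean_pm25(events))
```

Under this assumption, adding a vertical means adding another read function over the same events, with no change to the nodes that produced them.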

Ad Tech

Media that measures itself

Physical inventory that measures itself. Anonymized, real‑time audience signals turn any screen network into impression‑true, programmatic DOOH — no individual tracking, no new sensors.

City Operations

Street telemetry that becomes city services

Operational intelligence from street telemetry. Live event streams turn multimodal sensor data into anomaly alerts, utilization signals, and predictive maintenance flags — so cities move from reactive firefighting to model‑driven planning, through one interface instead of another data silo.

Mobility

Movement patterns that become demand signals

Movement as infrastructure signal. Pedestrian and vehicle flows become structured time‑series that drive utilization curves, demand forecasts, and location intelligence — every new node sharpening every existing model.

Air and environment

Measurements that become city‑scale surfaces

Physical AI for environmental grids. Dense, multimodal sensor networks become city‑scale air‑quality surfaces structured for ESG compliance, regulatory reporting, and urban planning — all from the same intelligence layer accessed via API.
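The text does not say how point measurements become a city-scale surface. One common technique is inverse-distance weighting (IDW), sketched here purely as an assumption: the `idw` function, the node coordinates, and the readings are all illustrative and not taken from XOO.

```python
from math import hypot
from typing import List, Tuple

# (x, y, value): hypothetical node positions in metres and PM2.5 readings.
Reading = Tuple[float, float, float]


def idw(readings: List[Reading], x: float, y: float,
        power: float = 2.0) -> float:
    """Inverse-distance-weighted estimate of the value at (x, y)."""
    if not readings:
        raise ValueError("need at least one reading")
    num = den = 0.0
    for rx, ry, value in readings:
        d = hypot(x - rx, y - ry)
        if d == 0.0:
            return value  # query point sits exactly on a node
        w = 1.0 / d ** power  # closer nodes get larger weights
        num += w * value
        den += w
    return num / den


readings = [(0.0, 0.0, 10.0), (100.0, 0.0, 20.0)]
midpoint = idw(readings, 50.0, 0.0)
print(round(midpoint, 6))
```

Evaluating `idw` over a grid of points would yield the surface; denser node coverage shortens the distances involved and so tightens the estimate between nodes.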

Security

Anomalies detected without surveillance infrastructure

Anomaly detection as a software update. XOO’s physical AI layer runs on nodes already in the field, turning multimodal signals into drone and threat detection without new hardware, without new sensors, and without building a separate surveillance infrastructure.

Retail and places

Footfall as addressable demand

Physical AI turns flow events into time‑window audience cohorts, so media buyers target against real reach signals instead of modeled estimates. Outdoor inventory finally behaves like performance media.

05 - WHY NOW

Four forces have aligned.

The window is open right now.

Edge hardware is cheap enough, outdoor media is going data-driven, regulation is pushing privacy-first design, and cities can’t afford to build infrastructure the old way. The company that moves now defines the standard, before the market consolidates around someone else’s schema.

1

Edge hardware has crossed the cost threshold

Running AI directly on street-level sensors is no longer expensive. The hardware race is over. What remains is the software layer that turns raw sensor data into structured intelligence – and whoever ships that layer first owns the installed base.

2

Outdoor media needs real audience data

Programmatic DOOH is growing fast, but it’s flying blind at street level. Advertisers are buying impressions they can’t verify. A physical AI layer that measures actual movement and attention becomes the attribution standard the industry is waiting for.

3

Regulation rewards on-device AI

GDPR and its successors make centralized tracking architectures slow and legally fragile. Systems that process data on-device and emit only anonymized events can scale in markets where traditional surveillance infrastructure stalls.

4

Cities need infrastructure that pays for itself

Demand for public charging, monitoring, and safer streets is rising. Budgets are not. The open question is how to fund rollouts without constant subsidy, and nodes that are economically self-sustaining from day one are XOO’s answer.

06 - TEAM

Built for city-scale deployment

Most teams can do one of three things: build hardware, train models, or ship platform infrastructure at city scale. XOO does all three, integrated in one stack.

Mat Schubert
CEO & Co-Founder

Scaled mobility hardware across Europe — Bosch Mobility Services, Coup scooter sharing. Owns rollout architecture, partner distribution, and network economics. Knows how physical infrastructure gets deployed and financed at scale — asset leasing, revenue share, operator relationships.

Tim Schöllhammer
Head of Platform Engineering

10 years in system engineering and platform development. Builds the event layer that turns edge inference into structured context for partners and operators. Background in sourcing and procurement — engineers for real-world rollout constraints, not ideal conditions.

Marc Zimmermann
CPO & Co-Founder

Builds the product narrative and vertical use cases — Ad Tech, ESG, Safety. Drives go-to-market from pilot to repeatable city rollout. Founded urbanmates, managed Flynn electric scooter. Knows how to turn an infrastructure deployment into a market position.

Carlo Fritz
AI and Data Modeling

5 years in computer vision and AI. 3 years in context analytics and graph architecture. Turns multimodal sensor signals into a robust event and context model — the layer that makes one node readable across every vertical simultaneously.

Our Partners

Request Pitch Deck

Have questions? We’re happy to help.