Open Semantic Interchange (OSI)

Definition

Open Semantic Interchange (OSI) is a vendor-neutral, open-source specification for defining and exchanging semantic models — the business definitions of metrics, dimensions, datasets, and relationships — across data analytics, business intelligence (BI), and artificial intelligence (AI) platforms. It uses a declarative YAML format to express how business concepts such as “revenue,” “active customer,” or “conversion rate” are calculated, so that every connected tool and AI agent draws those definitions from a single source of truth rather than redefining them independently.

The initiative was launched on September 23, 2025, led by Snowflake in partnership with Salesforce, dbt Labs, BlackRock, and RelationalAI, along with a coalition of ecosystem vendors. The specification is licensed under Apache 2.0, and the founding organizations have stated they intend to donate the project’s outputs to the Apache Software Foundation so the standard remains community-governed rather than controlled by any single vendor. An initial v0.1 specification was published in January 2026.

How OSI relates to marketing

Marketing organizations run on metrics that are calculated and re-calculated across many disconnected systems: a web analytics platform, a CRM, an email service provider, a marketing automation tool, a customer data platform, an attribution tool, and one or more BI dashboards. Each of these tools often defines the same concept differently. “Marketing-qualified lead,” “conversion,” “customer acquisition cost,” and “lifetime value” can carry different filters, time windows, and aggregation logic depending on which system computes them. The result is a recurring problem known as metric drift, where the same business question returns different numbers depending on the tool used.
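
To make metric drift concrete, here is a minimal sketch in Python. The event records, the two tool roles, and both conversion rules are invented for illustration and do not describe any real platform; the point is only that two reasonable definitions of "conversion" return different counts from the same data.

```python
from datetime import date

# Hypothetical signup events; every record and field name here is made up.
events = [
    {"user": "a", "signup": date(2026, 1, 5), "purchase": date(2026, 1, 9), "test_account": False},
    {"user": "b", "signup": date(2026, 1, 6), "purchase": date(2026, 2, 20), "test_account": False},
    {"user": "c", "signup": date(2026, 1, 7), "purchase": None, "test_account": False},
    {"user": "d", "signup": date(2026, 1, 8), "purchase": date(2026, 1, 10), "test_account": True},
]

def conversions_web_analytics(rows):
    """Assumed rule for tool A: any purchase counts, test accounts included."""
    return sum(1 for r in rows if r["purchase"] is not None)

def conversions_crm(rows, window_days=30):
    """Assumed rule for tool B: purchase within the window, test accounts excluded."""
    return sum(
        1
        for r in rows
        if r["purchase"] is not None
        and not r["test_account"]
        and (r["purchase"] - r["signup"]).days <= window_days
    )

web_count = conversions_web_analytics(events)
crm_count = conversions_crm(events)
print(f"web analytics: {web_count}, CRM: {crm_count}")  # web analytics: 3, CRM: 1
```

Neither function is wrong on its own terms. The drift comes from the unstated differences in filters and time windows, which is the gap a shared definition is meant to close.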

OSI addresses this by letting an organization define each marketing metric once in a portable specification that all downstream tools and AI agents consume. For marketers specifically, OSI is relevant in three ways. First, it aims to ensure that Marketing, Finance, and Sales report the same figures for shared metrics such as revenue and pipeline. Second, it provides AI agents and large language models (LLMs) with governed business context so that AI-generated answers to marketing questions are grounded in approved definitions rather than inferred logic. Third, it reduces vendor lock-in, allowing marketing teams to move business logic between martech and analytics platforms without rebuilding metric definitions each time.

How to “calculate” OSI

OSI is a specification, not a metric, so it is not calculated. Instead, it provides a structured way to encode the calculation logic for the metrics an organization already uses. An OSI semantic model is authored as a YAML file containing a set of defined building blocks. The core classes in the specification are:

Class | Purpose
Semantic Model | The top-level container representing a complete model, including its datasets, relationships, and metrics.
Datasets | Logical business entities (fact and dimension tables) with their fields and structure.
Fields | Row-level attributes used for grouping, filtering, and building metric expressions.
Metrics | Aggregate calculations such as sums, averages, and ratios, which can span multiple datasets.
Dimensions | Categorical attributes that answer where, when, and who.
Relationships | Foreign-key constraints that connect datasets, supporting both simple and composite keys.

A distinguishing feature of OSI is the ai_context field. This is a dedicated section in the YAML where an author writes natural-language instructions describing how an AI agent should use the model — for example, indicating that a model supports time-based analysis and customer segmentation. This field is intended to bridge structured definitions and the natural-language understanding that LLMs require to query data accurately.
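
As a rough sketch of how a consuming tool might use this field, the Python below folds a model's ai_context and metric descriptions into the text handed to an LLM. The dictionary keys, the `build_grounding_prompt` helper, and the prompt layout are all assumptions made for illustration; the OSI specification does not prescribe how agents consume the field.

```python
# Hypothetical, simplified model; key names are illustrative, not the OSI schema.
model = {
    "name": "marketing_performance",
    "ai_context": "Supports time-based analysis and customer segmentation.",
    "metrics": [
        {"name": "conversion_rate", "description": "Orders divided by sessions."},
        {"name": "revenue", "description": "Sum of completed order amounts."},
    ],
}

def build_grounding_prompt(model, question):
    """Combine the model's ai_context and metric list with a user question."""
    metric_lines = [f"- {m['name']}: {m['description']}" for m in model["metrics"]]
    return "\n".join(
        [
            f"Semantic model: {model['name']}",
            f"Guidance: {model['ai_context']}",
            "Approved metrics:",
            *metric_lines,
            f"Question: {question}",
        ]
    )

prompt = build_grounding_prompt(model, "How did conversion rate change last month?")
print(prompt)
```

The agent then sees the author's guidance and the approved metric names before it sees the question, rather than having to infer either.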

A simplified example of an OSI semantic model defines a model name and description, an ai_context instruction block, one or more datasets with their source and primary key, fields with dimension properties (such as marking a date field as a time dimension), and metrics whose expressions are written in a SQL dialect such as ANSI SQL.
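
The shape described above can be sketched as follows. It is written as a Python dictionary rather than YAML so the example runs without a YAML parser; a YAML file would carry the same nesting. Every key name, the table path, and the `missing_parts` check are assumptions chosen to mirror the prose, not copied from the OSI specification, so consult the published spec for the actual schema.

```python
# Illustrative structure only: key names are assumptions, not the OSI schema.
semantic_model = {
    "name": "marketing_orders",
    "description": "Orders attributed to marketing campaigns.",
    "ai_context": "Use for time-based analysis and customer segmentation.",
    "datasets": [
        {
            "name": "orders",
            "source": "analytics.marketing.orders",  # hypothetical table
            "primary_key": ["order_id"],
            "fields": [
                {"name": "order_id", "expression": "order_id"},
                # A date field marked as a time dimension.
                {"name": "order_date", "expression": "order_date",
                 "dimension": {"is_time": True}},
                {"name": "amount", "expression": "amount"},
            ],
        }
    ],
    "metrics": [
        {"name": "revenue", "dialect": "ANSI_SQL", "expression": "SUM(orders.amount)"},
    ],
}

REQUIRED_TOP_LEVEL = ("name", "description", "ai_context", "datasets", "metrics")

def missing_parts(model):
    """Return the names of required parts that are absent or empty."""
    problems = [key for key in REQUIRED_TOP_LEVEL if not model.get(key)]
    for ds in model.get("datasets", []):
        for key in ("source", "primary_key", "fields"):
            if not ds.get(key):
                problems.append(f"datasets.{ds.get('name', '?')}.{key}")
    return problems

print(missing_parts(semantic_model))  # []
```

A check like this is only a stand-in for real schema validation, but it shows how the building blocks nest: fields sit inside datasets, while metrics sit at the model level and reference dataset fields in their expressions.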

How to utilize OSI

Common use cases include the following.

Establishing a single source of truth for shared metrics. An organization authors canonical definitions for metrics like revenue, marketing-qualified leads, and customer lifetime value once, then exposes those definitions to every BI tool and AI agent so reported numbers are consistent across departments.

Grounding AI agents and LLMs in approved business logic. By feeding governed semantic models to AI systems, organizations reduce the risk of AI “hallucinating” metric definitions or producing answers based on conflicting logic. This is increasingly relevant as analytics shifts from human-driven dashboards toward agentic AI that queries data autonomously.

Reducing integration debt. Instead of building and maintaining many custom, point-to-point (N-to-N) integrations between proprietary tools — each with its own definition layer — teams maintain one specification that participating tools can read.

Preserving portability during platform migrations. Because business logic is decoupled from any specific platform, an organization can move semantic models between analytics or martech vendors without re-implementing every metric.

Aligning cross-functional reporting. Marketing, Finance, and Sales teams can reference the same governed definitions, reducing reconciliation meetings and disputes over whose numbers are correct.
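
The integration-debt use case above comes down to simple arithmetic, sketched here in Python. The sketch assumes the worst case, in which every tool needs a one-way converter to every other tool, and compares it with each tool needing only one importer and one exporter for a shared format. Real stacks rarely connect every pair, so treat the numbers as an upper bound rather than a measurement.

```python
def point_to_point_converters(tool_count):
    """One-way converters needed when every tool translates to every other tool."""
    return tool_count * (tool_count - 1)

def shared_spec_adapters(tool_count):
    """Adapters needed when each tool only imports from and exports to one shared format."""
    return 2 * tool_count

for n in (3, 7, 12):
    print(n, point_to_point_converters(n), shared_spec_adapters(n))
# 3 6 6
# 7 42 14
# 12 132 24
```

With three tools the two approaches cost the same under these assumptions. The gap opens as the stack grows, because pairwise converters grow quadratically while shared-format adapters grow linearly.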

OSI is still an early-stage standard. The initial specification was published in early 2026, and broader native platform support and import/export tooling were roadmapped for later phases. Organizations evaluating OSI should treat it as an emerging standard and confirm the current support status of their specific tools.

Comparison to similar approaches

Approach | What it standardizes | Scope | Governance | Relationship to OSI
Open Semantic Interchange (OSI) | Semantic model definitions (metrics, dimensions, datasets, relationships, AI context) | Cross-vendor exchange across analytics, BI, and AI | Open source, Apache 2.0, multi-vendor coalition | The interchange standard itself
Vendor-specific semantic layers (e.g., proprietary BI modeling layers) | Metrics and dimensions within one platform | Single platform | Controlled by individual vendor | OSI aims to interoperate across these rather than replace internal modeling
dbt Semantic Layer / MetricFlow | Metric definitions tied to a transformation framework | Primarily the dbt and connected ecosystem | Open source, vendor-led | dbt Labs is a founding OSI participant; serves as a contributing technology
Data catalogs and metadata management tools | Metadata, lineage, and governance of data assets | Discovery and governance, not query-time calculation | Varies | Complementary; OSI has a working group focused on catalog integration
Customer data platforms (CDPs) | Unified customer profiles and audience definitions | Customer data activation | Vendor-controlled | Different layer; OSI governs metric semantics, not profile unification

Best practices

Define the most widely shared metrics first. Begin with high-traffic, cross-functional metrics such as revenue, conversions, and customer counts, where metric drift causes the most disruption, before modeling niche metrics.

Involve business stakeholders in definitions. Because OSI encodes business logic, marketing, finance, and analytics stakeholders should agree on each definition before it is committed, rather than leaving definitions to engineering interpretation.

Use the ai_context field deliberately. Write clear, specific natural-language instructions so AI agents understand the intended scope and use of each model. Vague context reduces the accuracy of AI-generated answers.

Treat semantic models as version-controlled code. Store OSI YAML files in source control, review changes through pull requests, and document the rationale for definition changes so metric history is auditable.

Confirm tool support before committing. Verify that the specific platforms in your stack support OSI import and export at the version you intend to use, since native support has been rolling out in phases.

Keep models extensible. The specification is designed to be extended; structure models so organization-specific or industry-specific additions do not break compatibility with the core standard.
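
The version-control practice above can be supported by a small review aid, sketched here in Python: a function that compares two versions of a model and reports which metric expressions changed, so a pull request can state the change explicitly. The `metrics`, `name`, and `expression` keys and the sample expressions are assumptions for illustration, not the OSI schema, and `changed_metrics` is a hypothetical helper rather than part of any OSI tooling.

```python
def changed_metrics(old_model, new_model):
    """Return {metric_name: (old_expression, new_expression)} for every difference."""
    old = {m["name"]: m["expression"] for m in old_model.get("metrics", [])}
    new = {m["name"]: m["expression"] for m in new_model.get("metrics", [])}
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }

# Hypothetical before/after versions of a model; expressions are illustrative.
v1 = {"metrics": [{"name": "mql_count", "expression": "COUNT(leads.id)"}]}
v2 = {"metrics": [
    {"name": "mql_count", "expression": "COUNT(DISTINCT leads.id)"},
    {"name": "cac", "expression": "SUM(spend.amount) / COUNT(DISTINCT customers.id)"},
]}

for name, (before, after) in changed_metrics(v1, v2).items():
    print(f"{name}: {before!r} -> {after!r}")
```

A value of None on either side marks a metric that was added or removed, which is usually the kind of change reviewers most want called out.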

Future trends

Several trends are shaping OSI’s trajectory. The first is the planned transfer of governance to the Apache Software Foundation, which is intended to keep the standard vendor-neutral and increase confidence for broad adoption. The second is expansion of the participant ecosystem; the coalition grew substantially after launch to include dozens of analytics, BI, catalog, and data quality vendors, and continued growth would increase the practical value of the standard as more tools read and write the format natively. The third is the shift toward agentic AI: as autonomous AI agents take on more analytical and decision-making tasks, a governed semantic foundation becomes more important for ensuring those agents act on trusted definitions. The fourth is the development of converter and validation tooling through dedicated working groups, which will determine how easily organizations can adopt OSI without manual rework. The degree of real-world impact will ultimately depend on whether major platforms beyond the founding members implement native support.

FAQs

What problem does OSI solve? It addresses semantic fragmentation, where the same business metric is defined differently across tools, causing inconsistent reporting (metric drift), costly manual reconciliation, unreliable AI outputs, and complex point-to-point integrations.

Is OSI a product I can buy? No. OSI is an open-source specification, not a commercial product. It is licensed under Apache 2.0 and freely available, with reference materials hosted on GitHub.

Who created OSI? It was launched in September 2025, led by Snowflake with founding participants including Salesforce, dbt Labs, BlackRock, and RelationalAI, plus a coalition of ecosystem vendors that has continued to expand.

What file format does OSI use? OSI semantic models are authored in a declarative YAML format that defines datasets, fields, metrics, dimensions, relationships, and AI context.

How is OSI different from a data catalog? A data catalog focuses on discovering, governing, and tracking the lineage of data assets. OSI defines the calculation logic and meaning of metrics and how those definitions are exchanged across tools. The two are complementary, and OSI has a working group dedicated to catalog integration.

Does OSI replace my BI tool’s modeling layer? Not necessarily. OSI is designed to enable interoperability across tools so a definition authored once can be consumed by many platforms, rather than to replace each platform’s internal modeling capabilities.

Why does OSI matter for AI and LLMs? AI agents and LLMs can produce confident but inaccurate answers when they infer metric logic. OSI supplies governed definitions and a dedicated ai_context field so AI systems reason from approved business logic.

Is OSI production-ready? An initial specification (v0.1) was published in January 2026, with broader native tool support and converters roadmapped for later phases. Organizations should verify the current support status of their specific tools before relying on it in production.

How does OSI relate to marketing teams specifically? It helps ensure marketing metrics match the figures used by finance and sales, gives marketing-focused AI tools trustworthy context, and reduces lock-in so business logic can move between martech and analytics platforms.

Who governs OSI long term? The founding organizations have stated an intention to donate the project’s outputs to the Apache Software Foundation to keep it community-governed and vendor-neutral.
