Data Products

A Data Product is the only way agents are allowed to read or write data in TraceMem.

It is not a database connection, not a table, and not an API client.

A Data Product is a governed, purpose-bound interface to data that is safe to use inside decision envelopes and safe to remember forever.

Why Data Products Exist

Agents fail in enterprises not because they lack data, but because:

Data access rules are implicit
Privacy constraints are scattered
Schemas change silently
No one knows which version of data rules applied at decision time

Data Products solve this by turning "data access" into a named, versioned, auditable contract.

Agents never touch raw data.
They interact with Data Products.

What a Data Product Is (and Is Not)

A Data Product is:

A logical access boundary
A semantic contract over one or more data sources
A policy attachment point
A purpose-bound interface
A hashable, versioned artifact

A Data Product is not:

A physical database
A schema registry
An ETL pipeline
A copy of the data
A data warehouse table

Where Data Products Sit in the Flow

text

Decision Envelope
    ↓
Data Product read/write
    ↓
Policies evaluated
    ↓
Approvals (if needed)
    ↓
Outcome committed
    ↓
Decision Trace recorded

Every read or write event in a decision trace references a Data Product.

Core Responsibilities

A Data Product defines:

What data is exposed - Exposed schema (subset of source schema)
For what purposes - Allowed purposes (e.g., "order_processing", "support")
Which operation is allowed - Exactly one of: read, insert, update, or delete
Under which restrictions - Data residency, result modes, field-level restrictions
With which policies - Attached policies that apply to all access
In what shape - Schema definition
Under what version - Immutable version with hash

Important: Each data product supports exactly one operation. If you need multiple operations (e.g., both read and insert), create separate data products.

It answers:

"What was the agent allowed to see or change in this decision?"

Data Product Lifecycle

1. Creation (Draft)

Data Products are created by administrators via:

The Admin Dashboard
The Admin API

Agents cannot create or modify Data Products.

Status: draft - Not used by agents yet

2. Publishing

Once published:

A Data Product becomes immutable
It receives a version identifier and hash
It can be used by agents
New decisions automatically use the latest published version

Status: published - Available for agent use

3. Deprecation

When a Data Product is replaced:

Old versions are deprecated (not deleted)
Historical traces remain valid
New decisions use the latest published version

Status: deprecated - Not used for new decisions, but historical traces remain valid
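
The three lifecycle stages can be read as a small state machine. The sketch below is only an illustration of that reading, not TraceMem's implementation: the names Status, ALLOWED_TRANSITIONS, transition, and usable_for_new_decisions are hypothetical, and the rule that a deprecated version cannot be republished is an assumption drawn from the immutability described above.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"
    DEPRECATED = "deprecated"


# Assumed transitions: draft -> published -> deprecated, with no way back.
ALLOWED_TRANSITIONS = {
    Status.DRAFT: {Status.PUBLISHED},
    Status.PUBLISHED: {Status.DEPRECATED},
    Status.DEPRECATED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def usable_for_new_decisions(status: Status) -> bool:
    """Only published versions serve new decisions."""
    return status is Status.PUBLISHED
```

Deprecated versions stay in the table rather than being removed, which mirrors the point that historical traces must keep resolving to the version they used.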

Key Components

Sources

Data Products reference one or more Connectors as data sources:

json

{
  "sources": [
    {
      "connector_id": "postgres-06505000-f1a19e",
      "type": "database",
      "system": "postgres",
      "resource": "public.customers"
    }
  ]
}

Exposed Schema

Only a subset of the source schema is exposed:

json

{
  "exposed_schema": [
    {
      "name": "customer_id",
      "type": "string",
      "classification": "identifier"
    },
    {
      "name": "email",
      "type": "string",
      "classification": "pii"
    },
    {
      "name": "tier",
      "type": "string",
      "classification": "business"
    }
  ]
}
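
To make the idea of "a subset of the source schema" concrete, here is a minimal sketch of projecting a source row onto the exposed fields. The helper project_row and the password_hash column are hypothetical and exist only for illustration; how TraceMem actually applies the exposed schema is not described here.

```python
# The exposed schema from the example above.
EXPOSED_SCHEMA = [
    {"name": "customer_id", "type": "string", "classification": "identifier"},
    {"name": "email", "type": "string", "classification": "pii"},
    {"name": "tier", "type": "string", "classification": "business"},
]


def project_row(source_row: dict, exposed_schema: list) -> dict:
    """Keep only the fields named in the exposed schema."""
    exposed_names = [field["name"] for field in exposed_schema]
    return {name: source_row[name] for name in exposed_names if name in source_row}


source_row = {
    "customer_id": "123",
    "email": "a@example.com",
    "tier": "gold",
    "password_hash": "x",  # hypothetical source column that is not exposed
}

# The unexposed column never reaches the agent.
visible = project_row(source_row, EXPOSED_SCHEMA)
```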

Allowed Purposes

Every access must specify a purpose:

json

{
  "allowed_purposes": [
    "order_processing",
    "support_triage",
    "renewal_context"
  ]
}

Restrictions

Data Products can apply restrictions:

json

{
  "restrictions": {
    "data_residency": "eu",
    "result_mode_default": "summary",
    "allow_raw_values": false,
    "insert_config": {
      "return_created": true
    }
  }
}

Insert Method Configuration:

The insert_config object allows fine-grained control over insert operations:

return_created (boolean, optional): When true, the created object(s) are returned after insert operations. This is useful when you need database-generated IDs, timestamps, or other computed values immediately after insertion. Defaults to false.
allow_custom_primary_key (object, optional): For primary keys with auto-generated defaults (sequences, auto_increment, UUIDs), you can control whether users can provide custom values. Maps item IDs to boolean values. Example: {"item_id_123": true} allows custom values for that primary key.
column_config (object, optional): Per-column configuration controlling:
- required: Whether the column must be provided in insert requests
- allowed: Whether the column can be included in insert requests
- default_behavior: How to handle default values:
  - "user_provided": User must provide the value
  - "db_default": Use the database default (omit column from INSERT)
  - "tracemem_default": Use a fixed default value (specify in tracemem_default_value)
  - "null": Set to NULL (only for nullable columns)

Example:

json

{
  "insert_config": {
    "return_created": true,
    "allow_custom_primary_key": {
      "id": false
    },
    "column_config": {
      "id": {
        "required": false,
        "allowed": false,
        "default_behavior": "db_default"
      },
      "email": {
        "required": true,
        "allowed": true
      },
      "status": {
        "required": false,
        "allowed": true,
        "default_behavior": "tracemem_default",
        "tracemem_default_value": "active"
      }
    }
  }
}
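
The required and allowed flags in column_config suggest a simple pre-insert check. The sketch below shows one way such a check might look. The function validate_insert is hypothetical, and treating a missing allowed flag as true is an assumption, since the default is not stated above.

```python
# The column_config from the example above.
COLUMN_CONFIG = {
    "id": {"required": False, "allowed": False, "default_behavior": "db_default"},
    "email": {"required": True, "allowed": True},
    "status": {
        "required": False,
        "allowed": True,
        "default_behavior": "tracemem_default",
        "tracemem_default_value": "active",
    },
}


def validate_insert(record: dict, column_config: dict) -> list:
    """Return a list of problems; an empty list means the record is acceptable."""
    problems = []
    for column, rules in column_config.items():
        if rules.get("required") and column not in record:
            problems.append(f"missing required column: {column}")
        # Assumption: a column with no explicit "allowed" flag is permitted.
        if not rules.get("allowed", True) and column in record:
            problems.append(f"column not allowed in insert: {column}")
    return problems


ok = validate_insert({"email": "a@example.com"}, COLUMN_CONFIG)
bad = validate_insert({"id": 7}, COLUMN_CONFIG)
```

With this configuration, a record that supplies only email passes, while a record that supplies id and omits email is reported for both problems.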

Allowed Operations

Every Data Product declares which operation is allowed. Each data product supports exactly one operation.

Available operations:

read - Read records (returns at most one record by default; supports an optional allow_multiple parameter)
insert - Create new records (supports optional return_created configuration to return created objects)
update - Update records (single or many based on required columns)
delete - Delete records (single or many based on required columns)

Important: The legacy write operation has been deprecated. Use insert, update, or delete instead.

Example - Read-only product:

json

{
  "allowed_operations": {
    "read": true,
    "insert": false,
    "update": false,
    "delete": false
  }
}

Example - Insert-only product:

json

{
  "allowed_operations": {
    "read": false,
    "insert": true,
    "update": false,
    "delete": false
  }
}

At runtime:

Every data access must specify an operation
TraceMem validates the operation against the product's allowed_operations
Disallowed operations fail closed with an error
Exactly one operation must be enabled per data product

If you need multiple operations: Create separate data products. For example:

customer_data_read - For reading customer information
customer_data_insert - For creating new customers
customer_data_update - For updating existing customers

This separation provides better governance, clearer audit trails, and more granular policy control.
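
The runtime checks described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not TraceMem's validation code: the function names enabled_operation and check_operation are hypothetical, as are the exception types used to signal a fail-closed result.

```python
OPERATIONS = ("read", "insert", "update", "delete")


def enabled_operation(allowed_operations: dict) -> str:
    """Return the single enabled operation, or raise if there is not exactly one."""
    enabled = [op for op in OPERATIONS if allowed_operations.get(op)]
    if len(enabled) != 1:
        raise ValueError(f"expected exactly one enabled operation, got {enabled}")
    return enabled[0]


def check_operation(allowed_operations: dict, requested: str) -> None:
    """Fail closed: anything other than the enabled operation is rejected."""
    if requested != enabled_operation(allowed_operations):
        raise PermissionError(f"operation '{requested}' is not allowed")


read_only = {"read": True, "insert": False, "update": False, "delete": False}
check_operation(read_only, "read")  # passes silently
```

Calling check_operation(read_only, "insert") raises PermissionError, and a configuration with two operations enabled is rejected before any request is considered.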

Attached Policies

Policies can be attached to Data Products:

json

{
  "attached_policies": [
    {
      "policy_id": "pii_access_v1",
      "required": true
    }
  ]
}

Purpose-Bound Access

Every data access, whether a read, insert, update, or delete, must specify a purpose:

python

# Agent reads data with explicit purpose
data = agent.read(
    product="customer_data",
    purpose="order_processing",  # Must be in allowed_purposes
    query={"customer_id": "123"}
)

Why this matters:

GDPR/CCPA compliance
Audit trail shows why data was accessed
Data minimization (only access what you need)
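
A purpose check of this kind could look roughly like the sketch below. The function check_purpose and its exception types are hypothetical; only the purpose names come from the allowed_purposes example earlier on this page.

```python
ALLOWED_PURPOSES = ["order_processing", "support_triage", "renewal_context"]


def check_purpose(purpose, allowed_purposes):
    """Raise unless a purpose is given and is on the product's allow-list."""
    if not purpose:
        raise ValueError("every access must specify a purpose")
    if purpose not in allowed_purposes:
        raise PermissionError(f"purpose '{purpose}' is not allowed for this product")
    return purpose


# An access for a listed purpose is accepted and can be recorded in the trace.
accepted = check_purpose("order_processing", ALLOWED_PURPOSES)
```

Rejecting a missing purpose separately from a disallowed one keeps the two failure modes distinguishable in an audit trail.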

Versioning

Data Products are versioned and immutable once published:

Draft - Can be edited freely
Published - Immutable, receives version number and hash
New Version - Editing creates a new version, old version remains
Deprecated - Old versions can be deprecated, but remain for historical traces

Benefits:

Historical traces remain valid
Policy changes don't break audit trails
Clear evolution of data access rules
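
One common way to give an immutable definition a stable hash is to serialize it canonically and digest the result. The sketch below assumes SHA-256 over key-sorted JSON. The hashing scheme TraceMem actually uses is not specified on this page, and definition_hash is a hypothetical helper.

```python
import hashlib
import json


def definition_hash(definition: dict) -> str:
    """Hash a definition so that key order does not affect the result."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


v1 = {"allowed_purposes": ["order_processing"], "restrictions": {"data_residency": "eu"}}
v1_reordered = {"restrictions": {"data_residency": "eu"}, "allowed_purposes": ["order_processing"]}
v2 = {"allowed_purposes": ["order_processing", "support_triage"], "restrictions": {"data_residency": "eu"}}

same = definition_hash(v1) == definition_hash(v1_reordered)   # same content, same hash
changed = definition_hash(v1) != definition_hash(v2)          # edited content, new hash
```

Under this scheme any edit to a published definition produces a different hash, which is what lets a decision trace pin the exact rules that applied at decision time.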

How Agents Use Data Products

Agents interact with Data Products through:

Agent MCP - decision_read and decision_write tools
SDKs (coming soon) - Language-specific SDKs

Example:

python

# Create decision
decision = agent.create_decision(
    intent="customer.order.create",
    automation_mode="propose"
)

# Read via Data Product (product must have read operation enabled)
customer = agent.read(
    decision_id=decision.id,
    product="customer_data",  # This product only allows read operations
    purpose="order_processing",
    query={"customer_id": "123"}
)

# Insert via Data Product (product must have insert operation enabled)
result = agent.write(
    decision_id=decision.id,
    product="orders",  # This product only allows insert operations
    purpose="order_creation",
    mutation={
        "operation": "insert",
        "records": [{"customer_id": "123", "total": 299.99}]
    }
)

# If the data product has return_created enabled, the created record is returned
if result.get("created_records"):
    created_order = result["created_records"][0]
    order_id = created_order["id"]  # Use the database-generated ID

Note: Each data product supports only one operation. In this example, the customer_data product allows only reads and the orders product allows only inserts. If an agent needs to both read and insert against the same data, create separate products.

Best Practices

Minimal exposed schema - Only expose fields agents need
Specific purposes - Use specific purposes, not generic ones
Version carefully - Test drafts before publishing
Attach policies - Use policies for access control
Document purposes - Make purposes clear and specific

Relationship to Other Concepts

Connectors - Data Products reference Connectors as sources
Policies - Data Products can have attached policies
Decision Envelopes - All data access happens within Decision Envelopes
Decision Traces - Every read/write event references a Data Product
