Data Products
A Data Product is the only way agents are allowed to read or write data in TraceMem.
It is not a database connection, not a table, and not an API client.
A Data Product is a governed, purpose-bound interface to data that is safe to use inside decision envelopes and safe to remember forever.
Why Data Products Exist
Agents fail in enterprises not because they lack data, but because:
- Data access rules are implicit
- Privacy constraints are scattered
- Schemas change silently
- No one knows which version of data rules applied at decision time
Data Products solve this by turning "data access" into a named, versioned, auditable contract.
Agents never touch raw data.
They interact with Data Products.
What a Data Product Is (and Is Not)
A Data Product is:
- A logical access boundary
- A semantic contract over one or more data sources
- A policy attachment point
- A purpose-bound interface
- A hashable, versioned artifact
A Data Product is not:
- A physical database
- A schema registry
- An ETL pipeline
- A copy of the data
- A data warehouse table
Where Data Products Sit in the Flow
Decision Envelope
↓
Data Product read/write
↓
Policies evaluated
↓
Approvals (if needed)
↓
Outcome committed
↓
Decision Trace recorded
Every read or write event in a decision trace references a Data Product.
Core Responsibilities
A Data Product defines:
- What data is exposed - Exposed schema (subset of source schema)
- For what purposes - Allowed purposes (e.g., "order_processing", "support")
- Which operation is allowed - Exactly one of: read, insert, update, or delete
- Under which restrictions - Data residency, result modes, field-level restrictions
- With which policies - Attached policies that apply to all access
- In what shape - Schema definition
- Under what version - Immutable version with hash
Important: Each data product supports exactly one operation. If you need multiple operations (e.g., both read and insert), create separate data products.
It answers:
"What was the agent allowed to see or change in this decision?"
Data Product Lifecycle
1. Creation (Draft)
Data Products are created by administrators via:
- The Admin Dashboard
- The Admin API
Agents cannot create or modify Data Products.
Status: draft - Not used by agents yet
2. Publishing
Once published:
- A Data Product becomes immutable
- It receives a version identifier and hash
- It can be used by agents
- New decisions automatically use the latest published version
Status: published - Available for agent use
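The hashing scheme behind the version hash is not described here. As a minimal sketch, assuming the hash is a SHA-256 digest over a canonical JSON serialization of the definition (the function name `definition_hash` and the canonicalization rules are illustrative assumptions, not TraceMem's actual scheme), the link between immutability and hashing could look like this:

```python
import hashlib
import json


def definition_hash(definition: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON form of the definition.

    Assumption: sorted keys and compact separators make the serialization
    deterministic. TraceMem's real hashing scheme may differ.
    """
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


product_v1 = {
    "allowed_purposes": ["order_processing"],
    "allowed_operations": {"read": True, "insert": False, "update": False, "delete": False},
}
# Same content with a different key order hashes identically.
reordered = {
    "allowed_operations": {"delete": False, "update": False, "insert": False, "read": True},
    "allowed_purposes": ["order_processing"],
}
# Any change to the definition yields a different hash, which would mean a new version.
product_v2 = {**product_v1, "allowed_purposes": ["order_processing", "support_triage"]}

print(definition_hash(product_v1) == definition_hash(reordered))   # True
print(definition_hash(product_v1) == definition_hash(product_v2))  # False
```

Because the digest depends only on content, a trace that stores the hash can later show exactly which definition applied at decision time.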
3. Deprecation
When a Data Product is replaced:
- Old versions are deprecated (not deleted)
- Historical traces remain valid
- New decisions use the latest published version
Status: deprecated - Not used for new decisions, but historical traces remain valid
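The three statuses above suggest a simple resolution rule for new decisions. The sketch below is an illustration only: the `ProductVersion` class, integer version numbers, and the `resolve_for_new_decision` helper are assumptions made for the example, not part of TraceMem.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProductVersion:
    version: int  # assumption: versions are ordered integers
    status: str   # "draft", "published", or "deprecated"


def resolve_for_new_decision(versions: List[ProductVersion]) -> Optional[ProductVersion]:
    """Pick the latest published version; drafts and deprecated versions are skipped."""
    published = [v for v in versions if v.status == "published"]
    return max(published, key=lambda v: v.version, default=None)


history = [
    ProductVersion(1, "deprecated"),  # still referenced by historical traces
    ProductVersion(2, "published"),
    ProductVersion(3, "draft"),       # not visible to agents yet
]
chosen = resolve_for_new_decision(history)
print(chosen.version)  # 2
```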
Key Components
Sources
Data Products reference one or more Connectors as data sources:
{
  "sources": [
    {
      "connector_id": "postgres-06505000-f1a19e",
      "type": "database",
      "system": "postgres",
      "resource": "public.customers"
    }
  ]
}
Exposed Schema
Only a subset of the source schema is exposed:
{
  "exposed_schema": [
    {
      "name": "customer_id",
      "type": "string",
      "classification": "identifier"
    },
    {
      "name": "email",
      "type": "string",
      "classification": "pii"
    },
    {
      "name": "tier",
      "type": "string",
      "classification": "business"
    }
  ]
}
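To make the "subset of the source schema" idea concrete, here is a minimal sketch of projecting a source row onto the exposed fields. The `project_row` helper and the extra source columns (`password_hash`, `internal_notes`) are hypothetical; how TraceMem actually enforces the exposed schema is not described here.

```python
from typing import Dict, List

EXPOSED_SCHEMA = [
    {"name": "customer_id", "type": "string", "classification": "identifier"},
    {"name": "email", "type": "string", "classification": "pii"},
    {"name": "tier", "type": "string", "classification": "business"},
]


def project_row(source_row: Dict[str, object], exposed_schema: List[dict]) -> Dict[str, object]:
    """Keep only the fields named in the exposed schema; drop everything else."""
    exposed_names = {field["name"] for field in exposed_schema}
    return {k: v for k, v in source_row.items() if k in exposed_names}


# Hypothetical source row with columns that are not part of the Data Product.
row = {
    "customer_id": "123",
    "email": "a@example.com",
    "tier": "gold",
    "password_hash": "not-exposed",
    "internal_notes": "not-exposed",
}
print(project_row(row, EXPOSED_SCHEMA))
# {'customer_id': '123', 'email': 'a@example.com', 'tier': 'gold'}
```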
Allowed Purposes
Every access must specify a purpose, and that purpose must be one of the product's allowed purposes:
{
  "allowed_purposes": [
    "order_processing",
    "support_triage",
    "renewal_context"
  ]
}
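A purpose check along these lines could be sketched as follows. The `check_purpose` helper and its choice of exception types are assumptions for illustration; the source only states that a purpose is mandatory and must appear in `allowed_purposes`.

```python
from typing import List, Optional

ALLOWED_PURPOSES = ["order_processing", "support_triage", "renewal_context"]


def check_purpose(purpose: Optional[str], allowed_purposes: List[str]) -> str:
    """Reject access that has no purpose or a purpose outside the allowed list."""
    if not purpose:
        raise ValueError("a purpose is required for every access")
    if purpose not in allowed_purposes:
        raise PermissionError(f"purpose not allowed: {purpose}")
    return purpose


print(check_purpose("order_processing", ALLOWED_PURPOSES))  # order_processing

try:
    check_purpose("marketing", ALLOWED_PURPOSES)
except PermissionError as exc:
    print(exc)  # purpose not allowed: marketing
```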
Restrictions
Data Products can apply restrictions:
{
  "restrictions": {
    "data_residency": "eu",
    "result_mode_default": "summary",
    "allow_raw_values": false,
    "insert_config": {
      "return_created": true
    }
  }
}
Insert Method Configuration:
The insert_config object allows fine-grained control over insert operations:
- return_created (boolean, optional): When true, the created object(s) are returned after insert operations. This is useful when you need database-generated IDs, timestamps, or other computed values immediately after insertion. Defaults to false.
- allow_custom_primary_key (object, optional): For primary keys with auto-generated defaults (sequences, auto_increment, UUIDs), controls whether users can provide custom values. Maps item IDs to boolean values. Example: {"item_id_123": true} allows custom values for that primary key.
- column_config (object, optional): Per-column configuration controlling:
  - required: Whether the column must be provided in insert requests
  - allowed: Whether the column can be included in insert requests
  - default_behavior: How to handle default values:
    - "user_provided": User must provide the value
    - "db_default": Use the database default (omit column from INSERT)
    - "tracemem_default": Use a fixed default value (specify in tracemem_default_value)
    - "null": Set to NULL (only for nullable columns)
Example:
{
  "insert_config": {
    "return_created": true,
    "allow_custom_primary_key": {
      "id": false
    },
    "column_config": {
      "id": {
        "required": false,
        "allowed": false,
        "default_behavior": "db_default"
      },
      "email": {
        "required": true,
        "allowed": true
      },
      "status": {
        "required": false,
        "allowed": true,
        "default_behavior": "tracemem_default",
        "tracemem_default_value": "active"
      }
    }
  }
}
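One way to read the column_config rules is as a small validation step applied to each insert request. The sketch below is an interpretation, not TraceMem's implementation: the `prepare_insert` helper is hypothetical, and two behaviors are assumptions the source does not specify (columns missing from column_config are rejected, and an omitted `default_behavior` is treated as "db_default").

```python
from typing import Dict


def prepare_insert(record: Dict[str, object], column_config: Dict[str, dict]) -> Dict[str, object]:
    """Validate a record against column_config and apply default behaviors.

    Returns the column/value mapping that would go into the INSERT statement.
    """
    # Assumption: columns without any configuration are rejected (fail closed).
    unknown = set(record) - set(column_config)
    if unknown:
        raise ValueError(f"columns not configured: {sorted(unknown)}")

    prepared: Dict[str, object] = {}
    for column, cfg in column_config.items():
        if column in record:
            if not cfg.get("allowed", True):
                raise ValueError(f"column not allowed in insert: {column}")
            prepared[column] = record[column]
            continue
        # Column was not provided by the caller.
        behavior = cfg.get("default_behavior", "db_default")  # assumed fallback
        if cfg.get("required", False) or behavior == "user_provided":
            raise ValueError(f"missing required column: {column}")
        if behavior == "tracemem_default":
            prepared[column] = cfg["tracemem_default_value"]
        elif behavior == "null":
            prepared[column] = None
        # "db_default": omit the column so the database fills it in.
    return prepared


COLUMN_CONFIG = {
    "id": {"required": False, "allowed": False, "default_behavior": "db_default"},
    "email": {"required": True, "allowed": True},
    "status": {
        "required": False,
        "allowed": True,
        "default_behavior": "tracemem_default",
        "tracemem_default_value": "active",
    },
}

print(prepare_insert({"email": "a@example.com"}, COLUMN_CONFIG))
# {'email': 'a@example.com', 'status': 'active'}
```

With this configuration, a request that supplies `id` is rejected, a request without `email` is rejected, and `status` falls back to "active" when omitted.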
Allowed Operations
Every Data Product declares which operation is allowed. Each data product supports exactly one operation.
Available operations:
- read - Read records (supports optional allow_multiple parameter, defaults to limit 1)
- insert - Create new records (supports optional return_created configuration to return created objects)
- update - Update records (single or many based on required columns)
- delete - Delete records (single or many based on required columns)
Important: The legacy write operation has been deprecated. Use insert, update, or delete instead.
Example - Read-only product:
{
  "allowed_operations": {
    "read": true,
    "insert": false,
    "update": false,
    "delete": false
  }
}
Example - Insert-only product:
{
  "allowed_operations": {
    "read": false,
    "insert": true,
    "update": false,
    "delete": false
  }
}
At runtime:
- Every data access must specify an operation
- TraceMem validates the operation against the product's allowed_operations
- Disallowed operations fail closed with an error
- Exactly one operation must be enabled per data product
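The runtime rules above can be sketched as two small checks. The helper names `enabled_operation` and `check_operation` and the exception types are assumptions for illustration; the source only states that exactly one operation is enabled and that disallowed operations fail closed.

```python
from typing import Dict


def enabled_operation(allowed_operations: Dict[str, bool]) -> str:
    """Return the single enabled operation, or raise if the product is misconfigured."""
    enabled = [op for op, on in allowed_operations.items() if on is True]
    if len(enabled) != 1:
        raise ValueError(f"exactly one operation must be enabled, got {enabled}")
    return enabled[0]


def check_operation(allowed_operations: Dict[str, bool], requested: str) -> None:
    """Fail closed: anything other than the one enabled operation is rejected."""
    if requested != enabled_operation(allowed_operations):
        raise PermissionError(f"operation not allowed: {requested}")


READ_ONLY = {"read": True, "insert": False, "update": False, "delete": False}

check_operation(READ_ONLY, "read")  # passes silently
try:
    check_operation(READ_ONLY, "insert")
except PermissionError as exc:
    print(exc)  # operation not allowed: insert
```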
If you need multiple operations: Create separate data products. For example:
- customer_data_read - For reading customer information
- customer_data_insert - For creating new customers
- customer_data_update - For updating existing customers
This separation provides better governance, clearer audit trails, and more granular policy control.
Attached Policies
Policies can be attached to Data Products:
{
  "attached_policies": [
    {
      "policy_id": "pii_access_v1",
      "required": true
    }
  ]
}
Purpose-Bound Access
Every read or write operation must specify a purpose:
# Agent reads data with explicit purpose
data = agent.read(
    product="customer_data",
    purpose="order_processing",  # Must be in allowed_purposes
    query={"customer_id": "123"}
)
Why this matters:
- GDPR/CCPA compliance
- Audit trail shows why data was accessed
- Data minimization (only access what you need)
Versioning
Data Products are versioned and immutable once published:
- Draft - Can be edited freely
- Published - Immutable, receives version number and hash
- New Version - Editing creates a new version, old version remains
- Deprecated - Old versions can be deprecated, but remain for historical traces
Benefits:
- Historical traces remain valid
- Policy changes don't break audit trails
- Clear evolution of data access rules
How Agents Use Data Products
Agents interact with Data Products through:
- Agent MCP - decision_read and decision_write tools
- SDKs (coming soon) - Language-specific SDKs
Example:
# Create decision
decision = agent.create_decision(
    intent="customer.order.create",
    automation_mode="propose"
)
# Read via Data Product (product must have read operation enabled)
customer = agent.read(
    decision_id=decision.id,
    product="customer_data",  # This product only allows read operations
    purpose="order_processing",
    query={"customer_id": "123"}
)
# Insert via Data Product (product must have insert operation enabled)
result = agent.write(
    decision_id=decision.id,
    product="orders",  # This product only allows insert operations
    purpose="order_creation",
    mutation={
        "operation": "insert",
        "records": [{"customer_id": "123", "total": 299.99}]
    }
)
# If the data product has return_created enabled, the created record is returned
if result.get("created_records"):
    created_order = result["created_records"][0]
    order_id = created_order["id"]  # Use the database-generated ID
Note: Each data product supports only one operation. The customer_data product in this example only allows reads, while the orders product only allows inserts. If you need to both read and insert, you would create separate products.
Best Practices
- Minimal exposed schema - Only expose fields agents need
- Specific purposes - Use specific purposes, not generic ones
- Version carefully - Test drafts before publishing
- Attach policies - Use policies for access control
- Document purposes - Make purposes clear and specific
Relationship to Other Concepts
- Connectors - Data Products reference Connectors as sources
- Policies - Data Products can have attached policies
- Decision Envelopes - All data access happens within Decision Envelopes
- Decision Traces - Every read/write event references a Data Product