Operation Log (oplog)

The aaiclick/oplog/ module captures operation provenance inside ClickHouse and has no AI dependencies. Every object created or transformed is recorded for lineage tracing and debugging.


Storage

operation_log (ClickHouse)

Implementation: aaiclick/oplog/models.py — see OPERATION_LOG_DDL, init_oplog_tables()

Append-only audit log. Fields: id (Snowflake), result_table, operation, kwargs (Map), sql_template, task_id, job_id, created_at. ORDER BY (result_table, created_at). Cleaned up by BackgroundWorker._cleanup_expired_jobs() when the owning job expires (see AAICLICK_JOB_TTL_DAYS).

All inputs are named via kwargs (e.g. {"left": ..., "right": ...} for binary ops).
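
To make the row shape concrete, here is a minimal sketch of one operation_log entry as a Python dataclass. The field names come from the list above; the class name OperationLogRow, the Python types, and the sample values are assumptions for illustration, not the contents of OPERATION_LOG_DDL.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class OperationLogRow:
    """Hypothetical in-memory mirror of one operation_log row."""

    id: int                   # a Snowflake ID in the real table
    result_table: str         # table produced by the operation
    operation: str            # e.g. "add", "concat", "copy"
    kwargs: dict[str, str]    # input role -> input table name
    sql_template: str
    task_id: Optional[int]    # assumed type
    job_id: Optional[int]     # assumed type
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# A binary op names both inputs by role rather than by position.
row = OperationLogRow(
    id=1,
    result_table="t_sum",
    operation="add",
    kwargs={"left": "t_a", "right": "t_b"},
    sql_template="<sql template text>",
    task_id=None,
    job_id=None,
)
```

Naming inputs by role keeps unary, binary, and n-ary operations in the same Map column without a positional convention.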

table_registry (SQL)

Implementation: aaiclick/orchestration/lifecycle/db_lifecycle.py — see TableRegistry

Ownership metadata for every ClickHouse data table: table_name (PK) → (job_id, task_id, run_id, created_at). Written once at creation by the lifecycle handler's queued OPLOG_TABLE op; deleted by BackgroundWorker._cleanup_unreferenced_tables() when the table is dropped, and by _cleanup_expired_jobs() / _cleanup_orphaned_resources() for TTL'd rows.

table_registry previously lived in ClickHouse as an append-only MergeTree table. It was moved to SQL because every consumer is a keyed lookup or an owner join during background cleanup, rather than an append-only audit workload.

Initialization

init_oplog_tables(ch_client) creates operation_log on CH idempotently and validates its schema via _validate_schema(). It also performs a one-time copy of any pre-existing CH table_registry rows into SQL and drops the CH side (no-op on fresh installs).


OplogCollector

Implementation: aaiclick/oplog/collector.py — see OplogCollector, get_oplog_collector()

Buffer-based event sink. Collects OperationEvent objects in memory; batch-inserts to operation_log (CH) and table_registry (SQL) on flush(). Accessed via ContextVar _oplog_collector.
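
The buffer-and-flush pattern described above can be sketched as follows. The names OperationEvent, record(), flush(), get_oplog_collector() and _oplog_collector come from this page; the class SketchCollector, its constructor, the method signatures, and the sink callable (standing in for the ClickHouse and SQL writers) are assumptions, not the actual implementation.

```python
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OperationEvent:
    result_table: str
    operation: str
    kwargs: dict[str, str]


class SketchCollector:
    """Buffers events in memory and hands them to a sink in one batch."""

    def __init__(self, sink: Callable[[list[OperationEvent]], None]) -> None:
        self._sink = sink  # hypothetical stand-in for the batch insert
        self._buffer: list[OperationEvent] = []

    def record(self, result_table: str, operation: str,
               kwargs: dict[str, str]) -> None:
        self._buffer.append(OperationEvent(result_table, operation, kwargs))

    def flush(self) -> int:
        """Send all buffered events as one batch; return how many were sent."""
        if not self._buffer:
            return 0
        batch, self._buffer = self._buffer, []
        self._sink(batch)
        return len(batch)


_oplog_collector: ContextVar[Optional[SketchCollector]] = ContextVar(
    "_oplog_collector", default=None
)


def get_oplog_collector() -> Optional[SketchCollector]:
    return _oplog_collector.get()


batches: list[list[OperationEvent]] = []
token = _oplog_collector.set(SketchCollector(batches.append))
collector = get_oplog_collector()
collector.record("t_sum", "add", {"left": "t_a", "right": "t_b"})
collector.record("t_copy", "copy", {"source": "t_sum"})
sent = collector.flush()  # both events leave in a single batch
_oplog_collector.reset(token)
```

Holding the collector in a ContextVar means code running outside an instrumented context sees None and records nothing.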


Instrumentation Points

Implementation: aaiclick/data/operators.py, aaiclick/data/ingest.py, aaiclick/data/object.py, aaiclick/data/data_context.py

Each instrumentation point is a two-line addition: get the collector from the ContextVar, then call record() if the collector is not None.

Location | Operation | kwargs
data_context.create_object() | (none; registry only) |
data_context.create_object_from_value() | "create_from_value" | {}
operators._apply_operator_db() | "add", "sub", etc. | {"left": left.table, "right": right.table}
operators._apply_aggregation() | "sum", "mean", etc. | {"source": source.table}
ingest.concat_objects_db() | "concat" | {"source_0": s0.table, "source_1": s1.table, ...}
ingest.insert_objects_db() | "insert" | {"source": src.table, "target": tgt.table}
object.Object.copy() | "copy" | {"source": self.table}

create_object() only calls record_table() (populates table_registry), not record(), to avoid double-counting — higher-level functions record the operation that produced the table.
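
The pattern can be sketched as below: a low-level create_object() that only registers the table, and a higher-level operation that records what produced it. StubCollector, the add() helper, and the signatures of record() and record_table() are hypothetical; only the "get the collector, call it if not None" shape is taken from this page.

```python
from contextvars import ContextVar
from typing import Optional


class StubCollector:
    """Hypothetical stand-in exposing record() and record_table()."""

    def __init__(self) -> None:
        self.operations: list[tuple[str, str, dict[str, str]]] = []
        self.tables: list[str] = []

    def record(self, result_table: str, operation: str,
               kwargs: dict[str, str]) -> None:
        self.operations.append((result_table, operation, kwargs))

    def record_table(self, table_name: str) -> None:
        self.tables.append(table_name)


_oplog_collector: ContextVar[Optional[StubCollector]] = ContextVar(
    "_oplog_collector", default=None
)


def create_object(table_name: str) -> str:
    # Registry only: no operation is recorded at this level.
    collector = _oplog_collector.get()
    if collector is not None:
        collector.record_table(table_name)
    return table_name


def add(left: str, right: str) -> str:
    result = create_object(f"{left}_plus_{right}")
    # The two-line instrumentation: fetch the collector, record if present.
    collector = _oplog_collector.get()
    if collector is not None:
        collector.record(result, "add", {"left": left, "right": right})
    return result


# With no collector installed, instrumentation is a no-op.
add("t_a", "t_b")

stub = StubCollector()
token = _oplog_collector.set(stub)
add("t_a", "t_b")
_oplog_collector.reset(token)
```

In this sketch one call to add() yields exactly one registry entry and one operation entry, which is the double-counting the split is meant to avoid.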


Graph Queries

Implementation: aaiclick/oplog/lineage.py — see backward_oplog(), forward_oplog(), oplog_subgraph(), OplogGraph.to_prompt_context()

Graph traversal over operation_log. backward_oplog() traces upstream lineage via recursive CTE. OplogGraph.to_prompt_context() formats the graph as plain text for LLM consumption.
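
As an illustration of what upstream tracing computes, the sketch below walks an in-memory oplog breadth-first in Python. It is not the recursive CTE used by backward_oplog(); the OPLOG dictionary, the backward() function, and the simplification that each table has a single producing operation are all assumptions made for the example.

```python
from collections import deque

# Hypothetical oplog: result_table -> (operation, kwargs naming the inputs).
OPLOG = {
    "t_c": ("add", {"left": "t_a", "right": "t_b"}),
    "t_d": ("sum", {"source": "t_c"}),
    "t_e": ("copy", {"source": "t_d"}),
}


def backward(table: str, oplog: dict) -> list[tuple[str, str, dict]]:
    """Return the operations upstream of `table`, nearest first."""
    seen = {table}
    queue = deque([table])
    lineage = []
    while queue:
        current = queue.popleft()
        entry = oplog.get(current)
        if entry is None:
            continue  # root table: no recorded producing operation
        operation, kwargs = entry
        lineage.append((current, operation, kwargs))
        for parent in kwargs.values():
            if parent not in seen:  # guard against revisiting shared inputs
                seen.add(parent)
                queue.append(parent)
    return lineage


steps = backward("t_e", OPLOG)
```

Each loop iteration corresponds to one recursion step of a CTE: take the tables found so far, look up their producing operations, and follow the kwargs values to their inputs.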


Table Lifecycle & Cleanup ✅ IMPLEMENTED

Implementation: aaiclick/oplog/cleanup.py — see lineage_aware_drop(); aaiclick/orchestration/background/background_worker.py — see BackgroundWorker._cleanup_unreferenced_tables() and BackgroundWorker._cleanup_expired_jobs()

All cleanup is job-driven. The per-job preservation_mode gates what cleanup does:

Mode | Cleanup behavior
NONE | Drop unpinned tables as soon as refs fall to zero (default).
FULL | Skip the drop entirely; tables live until the job TTL expires.

BackgroundWorker._cleanup_expired_jobs() deletes all job data (CH tables, oplog entries, SQL metadata) for jobs completed more than AAICLICK_JOB_TTL_DAYS ago.
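
The two rules above reduce to a pair of small predicates. The following is a sketch under stated assumptions: PreservationMode, should_drop_table() and job_expired() are hypothetical names, and how the worker actually tracks reference counts and pins is not shown on this page.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class PreservationMode(Enum):
    NONE = "NONE"
    FULL = "FULL"


def should_drop_table(mode: PreservationMode, ref_count: int,
                      pinned: bool) -> bool:
    """Decide whether an individual table may be dropped right now."""
    if mode is PreservationMode.FULL:
        return False  # FULL defers everything to job TTL expiry
    return ref_count == 0 and not pinned


def job_expired(completed_at: datetime, now: datetime,
                ttl_days: int = 90) -> bool:
    """True once the job completed more than ttl_days ago."""
    return now - completed_at > timedelta(days=ttl_days)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
drop_unreferenced = should_drop_table(PreservationMode.NONE, 0, False)
keep_pinned = should_drop_table(PreservationMode.NONE, 0, True)
keep_full = should_drop_table(PreservationMode.FULL, 0, False)
old_job = job_expired(now - timedelta(days=91), now)
recent_job = job_expired(now - timedelta(days=10), now)
```

Under FULL, only the TTL check ever removes data, so lineage for a job stays queryable for its whole retention window.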


Environment Variables

Variable | Default | Description
AAICLICK_JOB_TTL_DAYS | 90 | Days after job completion before all job data is deleted.
AAICLICK_DEFAULT_PRESERVATION_MODE | NONE | Default preservation mode for jobs that don't pass one explicitly. One of NONE, FULL.
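
For reference, these variables and their defaults could be read as in the sketch below. The load_settings() helper and its validation behavior are assumptions for illustration; the page does not describe how aaiclick itself parses or validates these values.

```python
import os
from typing import Mapping, Optional

VALID_MODES = {"NONE", "FULL"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> tuple[int, str]:
    """Return (job_ttl_days, default_preservation_mode) with documented defaults."""
    env = os.environ if env is None else env
    ttl_days = int(env.get("AAICLICK_JOB_TTL_DAYS", "90"))
    mode = env.get("AAICLICK_DEFAULT_PRESERVATION_MODE", "NONE")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown preservation mode: {mode!r}")
    return ttl_days, mode


defaults = load_settings({})
overridden = load_settings({
    "AAICLICK_JOB_TTL_DAYS": "30",
    "AAICLICK_DEFAULT_PRESERVATION_MODE": "FULL",
})
```

Passing an explicit mapping instead of reading os.environ directly keeps the helper easy to test.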