Ledger Audit and Tamper Detection
MESHFLOW_MOCK=1 python3 hands_on/11_ledger_audit.py
Lesson 11: Ledger Audit And Tamper Detection
Lesson Goal
By the end of this lesson, you should be able to:
- Explain why a tamper-evident ledger is essential for governed AI.
- Describe how SHA-256 hash chaining makes tampering detectable.
- Use the ReplayLedger API to read, verify, diff, and export runs.
- Travel back to any step in a run and fork a new run from that point.
- Anonymize a run to comply with GDPR right-to-erasure requirements.
Estimated time: 40 to 55 minutes.
1. Why Audit Matters In AI Systems
An AI workflow can produce wrong, biased, or harmful output. When that happens, you need to answer three questions:
- What exactly did the system do?
- Why did it do that?
- Has anyone tampered with the record since the run completed?
A standard application log answers the first two questions but not the third. Logs can be edited silently. In regulated environments — healthcare, finance, legal — silent editing is unacceptable. You need a ledger that makes tampering visible.
2. SHA-256 Hash Chaining
MeshFlow records every workflow step as a StepRecord. After writing each record, it computes a SHA-256 hash of the record content combined with the hash of the previous record. This creates a chain:
step_0: SHA-256(content_0) = hash_0
step_1: SHA-256(content_1 + hash_0) = hash_1
step_2: SHA-256(content_2 + hash_1) = hash_2
...
step_N: SHA-256(content_N + hash_{N-1}) = hash_N
To verify the chain, the verifier recomputes every hash from scratch and compares it with the stored value. If any record was modified, even by a single character, the recomputed hash no longer matches and the chain breaks at that step. The verifier reports the first step whose hash does not match.
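The chaining scheme above can be sketched with Python's standard hashlib module. This is a minimal illustration under stated assumptions, not MeshFlow's implementation: records are plain strings, the first record is hashed without a predecessor, and the helper names chain_hashes and first_mismatch are hypothetical.

```python
import hashlib


def chain_hashes(contents):
    """Return one hex digest per record; each digest covers the record
    content plus the previous digest (empty string for the first record)."""
    hashes = []
    prev = ""
    for content in contents:
        digest = hashlib.sha256((content + prev).encode("utf-8")).hexdigest()
        hashes.append(digest)
        prev = digest
    return hashes


def first_mismatch(contents, stored_hashes):
    """Recompute the chain and return the index of the first step whose
    stored hash differs, or None if every compared step matches."""
    recomputed = chain_hashes(contents)
    for i, (expected, found) in enumerate(zip(recomputed, stored_hashes)):
        if expected != found:
            return i
    return None


records = ["plan", "retrieve", "answer"]
stored = chain_hashes(records)

tampered = ["plan", "retrieve!", "answer"]  # one character changed in step 1
print(first_mismatch(records, stored))   # None
print(first_mismatch(tampered, stored))  # 1
```

Because every digest feeds into the next one, changing step 1 also invalidates every later hash. The sketch only compares steps that are present on both sides, so records removed from the end of a run would go unnoticed unless the final hash is also kept somewhere else.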
What SHA-256 hash chaining proves:
- Completeness: no steps were deleted from the middle.
- Integrity: no step content was changed after it was written.
- Order: steps cannot be reordered without breaking the chain.
What it does NOT prove:
- The original record was correct (garbage in, garbage out).
- The system was running the right code version.
- Access was properly controlled before the record was written.
3. The ReplayLedger API
from meshflow.core.ledger import ReplayLedger
ledger = ReplayLedger("my_pipeline.db")
# List all run IDs stored in this ledger
run_ids = ledger.list_runs()
# Read all step records for a run
steps = ledger.get_run(run_id)
# Aggregated metrics for a run
summary = ledger.run_summary(run_id)
# summary contains: total_cost_usd, total_tokens, total_carbon_g, duration_s, step_count
# Verify the hash chain: returns True, or raises TamperDetectedError on tamper (see section 7)
ok = ledger.verify_chain(run_id)
# Export to JSON string
json_text = ledger.export_run(run_id)
# Export to CSV string
csv_text = ledger.export_run_csv(run_id)
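The calls above can be combined into a small reporting helper. The following is a sketch, not part of MeshFlow: audit_report is a hypothetical name, it assumes run_summary returns a dict-like object with the keys listed above, and it catches a generic Exception because the import location of the tamper exception is not shown in this lesson.

```python
def audit_report(ledger):
    """Collect verification status and summary figures for every run.

    `ledger` is any object exposing list_runs(), verify_chain(run_id) and
    run_summary(run_id) as described in this section.
    """
    report = []
    for run_id in ledger.list_runs():
        try:
            ok = bool(ledger.verify_chain(run_id))
            error = None
        except Exception as exc:  # assumption: exact exception type not known here
            ok, error = False, str(exc)
        summary = ledger.run_summary(run_id)  # assumption: dict-like result
        report.append({
            "run_id": run_id,
            "chain_ok": ok,
            "error": error,
            "step_count": summary["step_count"],
            "total_cost_usd": summary["total_cost_usd"],
        })
    return report
```

With a real ledger this would be called as audit_report(ReplayLedger("my_pipeline.db")); any run whose chain_ok is False deserves a closer look before its data is used.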
4. Time-Travel And Forking
The ledger stores every intermediate state. You can reconstruct the workflow context at any step — this is called time-travel:
# Load the state as it was after step 3
state_at_step_3 = ledger.load_state(run_id, step=3)
# Fork: start a new run from step 3 with a different policy
new_run_id = ledger.fork(run_id, step=3, new_run_id="fork_run_001")
Time-travel is useful for:
- Debugging: reproduce the exact state that caused a failure.
- Counterfactual analysis: what would have happened with a different policy?
- Incremental correction: fix a bad step without re-running the whole pipeline.
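To make the idea concrete, here is a toy in-memory model of time-travel and forking. It is a conceptual sketch only: the runs dictionary, the "state" key, and these stand-alone load_state and fork functions are assumptions for illustration and do not reflect how ReplayLedger stores data internally.

```python
import copy


def load_state(steps, step):
    """Return a copy of the context snapshot stored after `step`."""
    return copy.deepcopy(steps[step]["state"])


def fork(runs, run_id, step, new_run_id):
    """Create a new run that shares the history of `run_id` up to `step`."""
    runs[new_run_id] = copy.deepcopy(runs[run_id][: step + 1])
    return new_run_id


runs = {
    "run_001": [
        {"state": {"query": "refund policy"}},
        {"state": {"query": "refund policy", "docs": ["policy.pdf"]}},
        {"state": {"query": "refund policy", "docs": ["policy.pdf"], "answer": "30 days"}},
    ]
}

# Fork after step 1 and continue with a different final step.
fork(runs, "run_001", step=1, new_run_id="fork_run_001")
state = load_state(runs["fork_run_001"], step=1)
state["answer"] = "14 days"
runs["fork_run_001"].append({"state": state})

print(load_state(runs["fork_run_001"], step=2)["answer"])  # 14 days
print(load_state(runs["run_001"], step=2)["answer"])       # 30 days
```

The deep copies matter: the fork must not share mutable state with the original run, otherwise continuing the fork would silently rewrite history.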
5. Comparing Runs With diff
After changing a prompt or policy, you can compare two runs step by step:
delta = ledger.diff(run_id_before, run_id_after)
# Returns a list of per-step differences in output, cost, tokens, and carbon
This is particularly useful for regression testing: run a workflow before and after a change, then confirm the outputs changed only where expected.
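A step-by-step comparison of this kind can be sketched in a few lines. The function below is an illustration, not the ledger's own diff: diff_runs is a hypothetical name, steps are modeled as plain dicts, and the field names output, cost_usd, and tokens are assumptions.

```python
def diff_runs(steps_a, steps_b, fields=("output", "cost_usd", "tokens")):
    """Compare two runs position by position and return only the differences."""
    delta = []
    for index in range(max(len(steps_a), len(steps_b))):
        a = steps_a[index] if index < len(steps_a) else None
        b = steps_b[index] if index < len(steps_b) else None
        if a is None or b is None:
            # The step exists in only one of the two runs.
            delta.append({"step": index, "change": "added" if a is None else "removed"})
            continue
        changed = {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}
        if changed:
            delta.append({"step": index, "change": "modified", "fields": changed})
    return delta


before = [{"output": "hello", "tokens": 12}, {"output": "30 days", "tokens": 40}]
after = [{"output": "hello", "tokens": 12}, {"output": "14 days", "tokens": 38}]

for entry in diff_runs(before, after):
    print(entry)
```

In a regression test, the list of changed step indices can be asserted directly, for example that only step 1 differs after a prompt change.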
6. GDPR Anonymization
Under GDPR's right to erasure, you may need to remove personally identifiable information (PII) from audit records without breaking the chain. MeshFlow provides anonymize_run():
ledger.anonymize_run(run_id)
# Overwrites PII fields with [REDACTED] markers
# Recomputes the chain hashes so verify_chain still passes
Important: anonymization is a destructive operation. The original PII cannot be recovered after anonymization. Run it only when legally required.
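The two effects described above, redaction followed by re-hashing, can be illustrated with a toy chain. This sketch is not MeshFlow's anonymize_run: the PII field names, the JSON serialization, and the helpers rehash and anonymize are assumptions made for the example.

```python
import hashlib
import json

PII_FIELDS = ("user_name", "email")  # hypothetical PII field names


def rehash(steps):
    """Recompute each step's hash over its content plus the previous hash."""
    prev = ""
    for step in steps:
        payload = {k: v for k, v in step.items() if k != "hash"}
        content = json.dumps(payload, sort_keys=True)
        step["hash"] = hashlib.sha256((content + prev).encode("utf-8")).hexdigest()
        prev = step["hash"]
    return steps


def anonymize(steps):
    """Overwrite PII fields in place, then rebuild the chain."""
    for step in steps:
        for field in PII_FIELDS:
            if field in step:
                step[field] = "[REDACTED]"
    return rehash(steps)


steps = rehash([
    {"output": "ticket opened", "email": "ada@example.com"},
    {"output": "ticket closed"},
])
old_head = steps[-1]["hash"]

anonymize(steps)
print(steps[0]["email"])              # [REDACTED]
print(steps[-1]["hash"] == old_head)  # False: the chain was rebuilt
```

Since the rebuilt chain ends in a different hash, any copy of the final hash recorded before anonymization will no longer match and would need to be recorded again afterwards.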
7. Tamper Detection In Practice
When you call verify_chain, the ledger walks every step in the run and recomputes each hash. If it finds a mismatch it raises an exception identifying the tampered step:
TamperDetectedError: step 4 hash mismatch
expected: a3f2b1...
found: d7e9c3...
In production, run verify_chain on a schedule (for example, after every batch completes) so you detect tampering quickly. Store the chain hash externally (for example, in a read-only object store) for an additional independent verification point.
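An external verification point can be as simple as a separate file that maps each run ID to its final chain hash. The sketch below assumes you already have that hash as a string (how to obtain it from the ledger is not covered here); save_anchor, matches_anchor, and the JSON file layout are hypothetical, and the hash values are placeholders.

```python
import json
import os
import tempfile


def save_anchor(path, run_id, head_hash):
    """Record the final chain hash of a run in a separate JSON file."""
    anchors = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            anchors = json.load(fh)
    anchors[run_id] = head_hash
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(anchors, fh, indent=2)


def matches_anchor(path, run_id, head_hash):
    """Return True if the current head hash equals the anchored one."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh).get(run_id) == head_hash


anchor_file = os.path.join(tempfile.mkdtemp(), "anchors.json")
save_anchor(anchor_file, "run_001", "head-hash-at-batch-end")

print(matches_anchor(anchor_file, "run_001", "head-hash-at-batch-end"))  # True
print(matches_anchor(anchor_file, "run_001", "some-other-hash"))         # False
```

In a real deployment the anchor file would live where the ledger's writers cannot modify it, which is what makes it an independent check rather than a second copy of the same data.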
8. Hands-On Lab
Run the ledger audit demo:
MESHFLOW_MOCK=1 python3 hands_on/11_ledger_audit.py
Observe:
- How many runs are listed by ledger.list_runs()
- The step count and cost summary from run_summary
- The verification result from verify_chain
- The JSON export structure from export_run
- What the diff shows between two runs
Then open one of the audit JSON files in the repository root:
cat audit_run_480c.json | python3 -m json.tool | head -40
Identify the step_records array and read the hash field on each record.
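If you prefer to inspect the file from Python, the same check might look like the sketch below. It assumes only what the text above states, a step_records array whose entries carry a hash field; list_hashes is a hypothetical helper and the sample document is made up for the demonstration.

```python
import json


def list_hashes(audit_json_text):
    """Return (position, hash) pairs for every record in an audit export."""
    data = json.loads(audit_json_text)
    return [(i, rec.get("hash")) for i, rec in enumerate(data["step_records"])]


# Made-up sample with the assumed structure; real files contain more fields.
sample = json.dumps({
    "run_id": "demo",
    "step_records": [{"hash": "h0"}, {"hash": "h1"}],
})

for position, step_hash in list_hashes(sample):
    print(position, step_hash)
```

For a real file, pass its contents instead of the sample, for example by reading audit_run_480c.json with open(...).read().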
9. Summary
The ReplayLedger records every workflow step with a SHA-256 hash chain that makes tampering visible. You can read, verify, export, diff, time-travel, fork, and anonymize runs. In regulated environments, the ledger is not optional — it is the proof that the system did what it claims to have done.
Key operations:
- get_run → read all steps
- verify_chain → detect tampering
- load_state → reconstruct context at any step
- fork → branch from a past state
- diff → compare two runs
- anonymize_run → GDPR-compliant redaction
Exercises
Exercise 1: Run the Script and Read the Output
Goal: Familiarize yourself with the full output of the ledger audit hands-on script.
Instructions:
- Open a terminal and navigate to the meshflow_tutorial project root.
- Run the hands-on script:
python hands_on/11_ledger_audit.py
- Read through every line of output carefully. The script creates several agent runs, writes them to the ledger, and then calls various ledger API methods.
- Answer the following questions in a short notepad or comment block:
  - How many runs were created during the script execution?
  - What was the SHA-256 hash of the first step in the first run? (Look for a field like hash or step_hash in the printed output.)
  - Did verify_chain() return True or False on the first check?
  - Which run ID was used for the time-travel load_state demo?
Expected output: The script should print a chain-verification result of True, a run summary table, and at least one exported JSON block. No Python tracebacks should appear.
Exercise 2: Open an Audit JSON File and Read the Hash Field
Goal: Understand the raw structure of a ledger record on disk.
Instructions:
- After running the script from Exercise 1, locate the exported audit JSON file. The script writes it to a path printed in the output (look for a line like Exported run to: ...).
- Open the file in any text editor, or pretty-print it with python -m json.tool <filename>.
- Find and record:
  - The top-level run_id field.
  - The steps array. How many steps are present?
  - The hash field on the first step. The first step has no predecessor, so its hash covers only its own content.
  - The hash field on the second step. This is the SHA-256 of the second step's content combined with the first step's hash (see section 2), so it incorporates the first step's hash.
- Manually verify the chain: combine the second step's content with the first step's hash using the concatenation formula from section 2, hash the result with SHA-256, and compare it with the second step's stored hash. The exact serialization of the step content matters, so a mismatch here may mean your serialization differs rather than that the record was altered.
Expected output: A clear view of the nested hash values and a conceptual understanding that each hash depends on all prior hashes.
Exercise 3: Call verify_chain and Then Manually Edit the DB to Test Tamper Detection
Goal: Experience how hash chaining detects tampering.
Instructions:
- Run the script and note a run_id that verify_chain() confirms as valid (True).
- Locate the ledger SQLite database file (the script prints its path, or check ~/.meshflow/ledger.db by default).
- Open the database with the SQLite CLI:
sqlite3 ~/.meshflow/ledger.db
- Inspect the steps table:
SELECT * FROM steps LIMIT 5;
- Pick any step and update a field. For example, change the output column of step 1 for your chosen run:
UPDATE steps SET output = '{"tampered": true}' WHERE run_id = '<your_run_id>' AND step_index = 1;
- Exit SQLite (.quit) and run verify_chain(run_id) again in a Python script or REPL:
from meshflow.core.ledger import ReplayLedger
ledger = ReplayLedger()
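try:
    print(ledger.verify_chain("<your_run_id>"))
except Exception as exc:  # section 7 describes a TamperDetectedError here
    print("Tamper detected:", exc)
- Confirm that verification now fails. Section 7 describes verify_chain raising TamperDetectedError on a mismatch; if your version returns a boolean instead, the result is False. Record which step index is reported as the first tampered step.
Expected output: verify_chain reports a failure (TamperDetectedError or False) and identifies step 1 (or the step you edited) as the integrity violation.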
Clean-up: Restore the original value or re-run the hands-on script to generate a fresh run.
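If you would rather rehearse the tamper test without touching a real ledger, the same experiment can be reproduced against a throwaway in-memory SQLite database. This is a self-contained sketch: the four-column steps table, the hashing of the output column alone, and the helpers step_hash and verify are simplifying assumptions, not MeshFlow's actual schema or verifier.

```python
import hashlib
import sqlite3


def step_hash(output, prev_hash):
    """Hash a step's output together with the previous step's hash."""
    return hashlib.sha256((output + prev_hash).encode("utf-8")).hexdigest()


def verify(conn, run_id):
    """Return the first step_index whose stored hash is wrong, or None."""
    prev = ""
    rows = conn.execute(
        "SELECT step_index, output, hash FROM steps "
        "WHERE run_id = ? ORDER BY step_index",
        (run_id,),
    )
    for step_index, output, stored in rows:
        if step_hash(output, prev) != stored:
            return step_index
        prev = stored
    return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE steps (run_id TEXT, step_index INTEGER, output TEXT, hash TEXT)")
prev = ""
for i, output in enumerate(["plan", "retrieve", "answer"]):
    prev = step_hash(output, prev)
    conn.execute("INSERT INTO steps VALUES (?, ?, ?, ?)", ("demo", i, output, prev))

print(verify(conn, "demo"))  # None

# Same kind of edit as the UPDATE statement in this exercise.
conn.execute(
    "UPDATE steps SET output = ? WHERE run_id = ? AND step_index = ?",
    ('{"tampered": true}', "demo", 1),
)
print(verify(conn, "demo"))  # 1
```

The edit changes only the output column and leaves the stored hash alone, which is exactly why the recomputed hash for step 1 no longer matches.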
Exercise 4: Compare Two Runs with diff
Goal: Use diff to understand how two runs diverged.
Instructions:
- Run 11_ledger_audit.py twice (or find two existing runs in your ledger with list_runs()).
- Record the run_id values of both runs. They should be runs of the same workflow but may have different inputs or outputs.
- In a Python REPL or short script, call:
from meshflow.core.ledger import ReplayLedger
ledger = ReplayLedger()
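runs = ledger.list_runs(limit=5)
# If list_runs() returns plain run IDs (as described in section 3), use runs[0] and runs[1] instead.
run_a = runs[0]["run_id"]
run_b = runs[1]["run_id"]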
delta = ledger.diff(run_a, run_b)
print(delta)
- Examine the diff output. Identify:
  - Which steps are present in run A but not run B (or vice versa).
  - Which steps have the same name but different outputs.
  - Any changes in timing or token counts.
- Write two to three sentences explaining what the diff tells you about how the workflow behaved differently between the two runs.
Expected output: A structured diff object (dict or dataclass) showing added, removed, and changed steps between the two runs.
Exercise 5: Anonymize a Run and Verify the Chain Still Passes
Goal: Confirm that GDPR anonymization does not break ledger integrity.
Instructions:
- Pick a run ID from your ledger (use list_runs() to find one).
- Call anonymize_run:
from meshflow.core.ledger import ReplayLedger
ledger = ReplayLedger()
run_id = "<your_run_id>"
ledger.anonymize_run(run_id)
- Immediately call verify_chain on the same run:
result = ledger.verify_chain(run_id)
print("Chain valid after anonymization:", result)
- Inspect the exported JSON with export_run(run_id) to confirm that PII fields (names, emails, IP addresses, or any fields marked as personal data) have been replaced with placeholder values (e.g., "[REDACTED]" or null).
- Repeat the tamper test from Exercise 3: manually edit a non-anonymized field, then call verify_chain once more to confirm that tampering is still detectable after anonymization.
Expected output: verify_chain returns True after anonymization and reports a failure (TamperDetectedError or False) after the manual tamper. The export shows redacted PII fields while preserving structural fields like step_index, agent_id, and timestamp.
Reflection question: Why does anonymizing PII not invalidate the hash chain? Recall from section 6 that anonymize_run recomputes the chain hashes after redaction, and consider what that implies for a chain hash you stored externally before anonymizing.