VSO-SCORE-001 | Version 1.0 | December 2025

AI Decision Auditability Benchmark v1.0

Aligned with VeritasChain Protocol (VCP)

Measure Algorithmic Trading Transparency with a Common Score

A vendor-neutral, standards-aligned benchmark for evaluating auditability — usable with or without VCP implementations.

  • Quantifies third-party, independent verifiability
  • 10 criteria covering tamper detection, sequence fixation, and decision provenance
  • Includes an Evidence Pack template for audit submission

Why Now

The current state of algorithmic trading auditability

Black-Box Decisions

AI-driven trading decisions are opaque. When regulators ask "why?", there's no auditable answer.

Logs Exist, But...

Logs are recorded, but authenticity and sequence cannot be proven. Timestamps can be disputed.

Evidence Quality Gap

Audits fail at the final stage: evidence quality. Manual gathering takes days, and formats are inconsistent.

Reference Implementation

Auditability Benchmark — Reference Implementation

A local-only, audit-safe reference implementation for running the AI Decision Auditability Benchmark and exporting regulator-ready evidence.

VAP Scorecard Explorer is a reference implementation of the AI Decision Auditability Benchmark (10 criteria / 20 points).

It allows audit and assurance teams to:

  • Run a consistent, repeatable assessment
  • Record scoring rationale and evidence notes
  • Export an Audit Evidence Pack (ZIP / PDF)

Privacy & Security

All processing runs locally. No network communication. No external APIs. No analytics.


The benchmark specification and scoring criteria are published openly as the canonical reference.

What It Is

A diagnostic score, not an implementation proposal

Purpose

This is not a technology adoption proposal.

This benchmark enables organizations to diagnose their auditability against an industry-standard measure. Results serve directly as quality evidence for external audits and regulatory compliance.

  • Self-assessment tool for internal teams
  • Third-party evaluation framework
  • Vendor-neutral, technology-agnostic

Note: This benchmark does not provide certification or endorsement. It offers an independent, evidence-based assessment framework.

Maximum Points: 20
Criteria: 10
Points each: 0 / 1 / 2
PoC Time: ~3 hours

10 Evaluation Criteria

Ordered by audit relevance: evidence-centric criteria first, technical implementation details later.

#1 Third-Party Verifiability (0 / 1 / 2)

"Can an external party independently verify the audit trail?"

0: No external verification possible
1: Partial verification with vendor assistance
2: Full independent verification using standard tools

#2 Tamper Evidence (0 / 1 / 2)

"Can unauthorized modifications be detected?"

0: No tamper detection; silent modification possible
1: Basic checksums with gaps
2: Cryptographic integrity (hash chains, Merkle trees)
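
For illustration, a minimal hash-chain sketch in Python showing how a score-2 system detects a modified field. The record structure and genesis value are illustrative, not mandated by the benchmark:

    import hashlib
    import json

    def record_hash(record: dict, prev_hash: str) -> str:
        """Hash the canonical JSON of a record together with the previous hash."""
        payload = json.dumps(record, sort_keys=True) + prev_hash
        return hashlib.sha256(payload.encode()).hexdigest()

    def build_chain(records: list) -> list:
        """Return the hash chain for an ordered list of records."""
        hashes, prev = [], "0" * 64  # illustrative genesis value
        for rec in records:
            prev = record_hash(rec, prev)
            hashes.append(prev)
        return hashes

    def verify_chain(records: list, hashes: list):
        """Return the index of the first tampered record, or None if intact."""
        prev = "0" * 64
        for i, rec in enumerate(records):
            prev = record_hash(rec, prev)
            if prev != hashes[i]:
                return i
        return None

    records = [{"seq": 1, "event": "DECISION"}, {"seq": 2, "event": "ORDER"}]
    chain = build_chain(records)
    records[0]["event"] = "CANCELLED"          # tamper with history
    assert verify_chain(records, chain) == 0   # modification detected at index 0

Because each hash folds in its predecessor, a single changed field invalidates every subsequent link, and the first mismatch pinpoints the modification location.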

#3 Sequence Fixation (0 / 1 / 2)

"Is Decision → Order → Execution order immutable?"

0: Events can be reordered post-hoc
1: Timestamps exist but no cryptographic binding
2: Monotonic sequencing with cryptographic linkage
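
A sketch of score-2 sequence fixation under the same assumptions: each event embeds its predecessor's hash plus a monotonic sequence number, so a backdated insertion breaks verification. Field names such as `prev_hash` are hypothetical:

    import hashlib
    import json

    GENESIS = "0" * 64

    def link(event: dict, prev_hash: str) -> dict:
        """Bind an event to its predecessor by embedding the previous hash."""
        event = {**event, "prev_hash": prev_hash}
        body = json.dumps(event, sort_keys=True).encode()
        return {**event, "hash": hashlib.sha256(body).hexdigest()}

    def verify_order(chain: list) -> bool:
        """Check both cryptographic linkage and monotonic sequence numbers."""
        prev, last_seq = GENESIS, 0
        for ev in chain:
            body = {k: v for k, v in ev.items() if k != "hash"}
            digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if ev["prev_hash"] != prev or ev["hash"] != digest or ev["seq"] <= last_seq:
                return False
            prev, last_seq = ev["hash"], ev["seq"]
        return True

    chain, prev = [], GENESIS
    for seq, kind in [(1, "DECISION"), (2, "ORDER"), (3, "EXECUTION")]:
        ev = link({"seq": seq, "event": kind}, prev)
        chain.append(ev)
        prev = ev["hash"]

    assert verify_order(chain)
    chain.insert(1, link({"seq": 1, "event": "BACKDATED"}, GENESIS))  # attempt insertion
    assert not verify_order(chain)  # linkage breaks; a score-0 system would accept this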

#4 Decision Provenance (0 / 1 / 2)

"Can inputs, conditions, and rationale be traced?"

0: Only outcomes recorded
1: Some inputs logged but incomplete
2: Full provenance: data, parameters, model state, logic
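
A sketch of what a score-2 provenance record might capture, expressed as a Python dataclass. The field names are illustrative, not a required schema:

    from dataclasses import dataclass, asdict
    import json

    @dataclass
    class DecisionRecord:
        """Illustrative score-2 provenance record: every element needed to
        reconstruct why the decision was made, not just what it was."""
        decision_id: str
        inputs: dict       # market data snapshot identifiers, feature values
        parameters: dict   # strategy parameters in force at decision time
        model_state: dict  # model version, weights hash, config hash
        logic: str         # rule or code path that produced the decision
        outcome: str

    record = DecisionRecord(
        decision_id="D-2025-0142",
        inputs={"feed_snapshot": "md_20250612T093000Z", "mid_price": 101.25},
        parameters={"max_position": 500, "signal_threshold": 0.7},
        model_state={"model_version": "v3.2.1", "weights_sha256": "<sha256 of weights file>"},
        logic="signal 0.82 > threshold 0.70 -> BUY",
        outcome="BUY 200 @ 101.26",
    )
    print(json.dumps(asdict(record), indent=2))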

#5 Responsibility Boundaries (0 / 1 / 2)

"Who approved, modified, or overrode each action?"

0: No attribution; generic accounts
1: Username logged but no signature
2: Digital signatures on all approvals/overrides
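
A minimal sketch of score-2 attribution, assuming the third-party `cryptography` package (pip install cryptography); key distribution and storage (HSM/KMS) are out of scope here:

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature
    import json

    approver_key = Ed25519PrivateKey.generate()  # in practice: an HSM or KMS key
    action = json.dumps(
        {"action": "OVERRIDE", "order_id": "O-9913", "approver": "jdoe"},
        sort_keys=True,
    ).encode()

    signature = approver_key.sign(action)  # signed at approval time

    # Any auditor holding the public key can verify who approved what.
    public_key = approver_key.public_key()
    try:
        public_key.verify(signature, action)
        print("approval verified")
    except InvalidSignature:
        print("attribution cannot be verified")

Unlike a logged username, a signature cannot be added or altered after the fact without the approver's private key.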

#6 Audit Submission Readiness (0 / 1 / 2)

"Can evidence be exported for regulatory review?"

0: Manual gathering required; takes days
1: Partial export; separate extraction needed
2: One-click export; complete package in <5 min

#7 Retention & Durability (0 / 1 / 2)

"Are records retained for required periods (e.g., 7 years)?"

0: No policy; data may be lost
1: Policy exists but incomplete enforcement
2: Enforced retention with redundancy & integrity checks

#8 Timestamp Reliability (0 / 1 / 2)

"Are timestamps synchronized to a trusted source?"

0: Local system clocks only
1: NTP sync but no drift monitoring
2: PTP or RFC 3161 with documented accuracy

#9 Cryptographic Strength (0 / 1 / 2)

"Do algorithms meet current security standards?"

0: Deprecated algorithms (MD5, SHA-1)
1: Adequate algorithms but no key management
2: Strong algorithms (Ed25519, SHA-256+) with key lifecycle management

#10 Cryptographic Agility (0 / 1 / 2)

"Can the system migrate to new algorithms?"

0: Hard-coded algorithms; migration breaks verification
1: Algorithm identifiers exist but migration is untested
2: Documented and verified PQC migration path
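
One plausible way to reach score 2, sketched in Python: tag every sealed record with an algorithm identifier and dispatch through a registry, so adding a post-quantum successor is a registry entry rather than a format change. The `alg`/`digest` field names are hypothetical:

    import hashlib

    # Registry of supported digest algorithms, keyed by an identifier stored
    # alongside every record.
    DIGESTS = {
        "sha256": hashlib.sha256,
        "sha3_256": hashlib.sha3_256,
    }

    def seal(payload: bytes, alg: str = "sha256") -> dict:
        """Seal a payload, recording which algorithm produced the digest."""
        return {"alg": alg, "digest": DIGESTS[alg](payload).hexdigest()}

    def verify(payload: bytes, sealed: dict) -> bool:
        """Old records verify under the algorithm they were sealed with."""
        return DIGESTS[sealed["alg"]](payload).hexdigest() == sealed["digest"]

    old = seal(b"legacy record", "sha256")
    new = seal(b"new record", "sha3_256")
    assert verify(b"legacy record", old) and verify(b"new record", new)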

3-Hour PoC Assessment

Minimum viable test procedure for all 10 criteria

Total Time: ~3 hours

Step 1: Export & Verify (30 min)

Export sample audit log (10-100 records). Give it to someone unfamiliar with your system.

Rule: No phone calls, no vendor support, no internal tools allowed.

Step 2: Tamper Test (20 min)

Modify one field in one historical record. Run integrity check.

Pass: Automatic detection with alert; modification location identified.

Step 3: Sequence Check (15 min)

Find a Decision → Order → Execution chain. Verify cryptographic binding.

Test: Try to insert a backdated event. If possible, score 0.

Step 4: Provenance & Attribution (35 min)

Pick a random decision from last week. Reconstruct: inputs, parameters, logic, approver.

Target: Full context retrievable in <10 min = Score 2.

Step 5: Audit Export (30 min)

Simulate: "Regulator requests all activity for Account X, Date Y."

Target: One-click export; complete package in <5 minutes = Score 2.
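
A sketch of what the one-click target might look like, assuming records carry `account` and `date` fields (hypothetical names): filter, hash, and package in a single call:

    import hashlib
    import json
    import zipfile

    def export_evidence(records: list, account: str, date: str,
                        out_path: str = "evidence_pack.zip") -> None:
        """Filter records for one account/date and package them with a hash index."""
        matched = [r for r in records if r["account"] == account and r["date"] == date]
        body = json.dumps(matched, sort_keys=True, indent=2).encode()
        index = {"file": "records.json",
                 "sha256": hashlib.sha256(body).hexdigest(),
                 "record_count": len(matched)}
        with zipfile.ZipFile(out_path, "w") as zf:
            zf.writestr("records.json", body)
            zf.writestr("index.json", json.dumps(index, indent=2))

    export_evidence(
        [{"account": "X", "date": "2025-06-12", "event": "ORDER"}],
        account="X", date="2025-06-12",
    )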

Step 6: Technical Review (50 min)

Review retention policy, time source, cryptographic algorithms, migration plan.

Covers: Criteria #7-10 (Retention, Timestamp, Crypto Strength, Agility)

Evidence Pack Template

Third-party submission template for audit and regulatory review

CONFIDENTIAL | Third-Party Submission Template | Version 1.0

Template Contents

  • Overall Score: /20 points with assessment level
  • Score Breakdown: All 10 criteria with individual scores
  • Evidence Index: Filename + SHA-256 hash for each item
  • Attestation: Assessor signature and date

Evidence Index Sample

#1 audit_log_2025-01.json
SHA-256: a7f3c9d2...
#2 tamper_test_results.pdf
SHA-256: b8e4d1f5...
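
A minimal sketch that produces the index format above from a list of evidence files; because only filenames and SHA-256 hashes are published, integrity can be proven without disclosing file contents:

    import hashlib
    from pathlib import Path

    def evidence_index(paths: list) -> str:
        """Emit an Evidence Index entry (name + SHA-256) per evidence file."""
        lines = []
        for i, p in enumerate(paths, start=1):
            digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
            lines.append(f"#{i} {Path(p).name}\nSHA-256: {digest}")
        return "\n".join(lines)

    # Using the sample files named above:
    print(evidence_index(["audit_log_2025-01.json", "tamper_test_results.pdf"]))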

EU AI Act Regulatory Mapping

Alignment with Regulation (EU) 2024/1689 for high-risk AI systems

EU AI Act Article | Requirement              | Benchmark Coverage
Article 12        | Record-keeping / Logging | ✓ Direct (Criteria 1-7)
Article 13        | Transparency             | ◐ Partial (Criteria 4, 5)
Article 14        | Human Oversight          | ◐ Partial (Criterion 5)
Article 17        | Quality Management       | ✓ Supported (Criteria 6, 7)

MiFID II / RTS 25 Synergy: Criterion #8 (Timestamp Reliability) also addresses RTS 25 clock synchronization requirements (±100μs for HFT, ±1ms for others).
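
A rough drift sanity check, assuming the third-party `ntplib` package (pip install ntplib). Note that NTP can only demonstrate millisecond-level accuracy; the ±100μs HFT bound in RTS 25 requires PTP with hardware timestamping:

    import ntplib

    client = ntplib.NTPClient()
    response = client.request("pool.ntp.org", version=3)
    offset_ms = response.offset * 1000  # seconds -> milliseconds
    print(f"local clock offset vs NTP: {offset_ms:+.3f} ms")
    if abs(offset_ms) > 1.0:
        print("WARNING: drift exceeds the ±1 ms RTS 25 bound for non-HFT systems")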

Who Is This For

Industry stakeholders who benefit from standardized auditability measurement

Audit / Assurance

Set a common baseline for audit engagements. Compare systems objectively.

  • Standardized assessment criteria
  • Evidence Pack for submissions
  • Cross-organization comparison

RegTech Vendors

Demonstrate your product's auditability with quantifiable metrics.

  • Marketing with concrete scores
  • Product differentiation
  • Regulatory compliance proof

Brokers / Venues

Turn transparency into a competitive advantage. Speed up audit submissions.

  • Client trust differentiator
  • Faster regulatory responses
  • Reduced audit costs

What Your Score Means

Interpretation guide for assessment results

16-20: Strong Auditability
Ready for external audit and regulatory review. Continue maintaining best practices.

11-15: Moderate Auditability
Address gaps in 0-score areas before external audit. Focus on quick wins first.

6-10: Limited Auditability
Significant improvements needed. Prioritize evidence-centric criteria #1-6.

0-5: Inadequate
Fundamental gaps require immediate attention. Consider system redesign.

Downloads & Resources

All benchmark documents and resources

FAQ

Frequently asked questions about the benchmark

Do I need to adopt VCP to use this benchmark?

No. This benchmark is a measurement tool for auditability, usable regardless of technology choice. However, achieving scores close to 20 typically requires cryptographic integrity mechanisms—which VCP provides as one option.

Can audit firms use this for client assessments?

Yes. The benchmark is licensed under CC BY 4.0. Audit firms can use it for client engagements with attribution. The Evidence Pack provides a standardized submission format.

Does confidential data leave our organization?

Not necessarily. The benchmark is designed for internal self-assessment. For third-party submissions, the Evidence Pack uses SHA-256 hashes to prove file integrity without exposing actual content. You control what gets shared.

What should I submit for audit?

Use the Evidence Pack template: overall score, 10-criteria breakdown, Evidence Index (filename + SHA-256 hash), and assessor attestation. The hash-based index proves evidence authenticity without requiring full data disclosure.

What score is considered "good enough"?

16-20 points indicates strong auditability and readiness for external audit. 11-15 is moderate—address 0-score items first. 10 or below requires significant improvement before regulatory engagement.

Is there certification available?

This benchmark is for self-assessment and third-party evaluation. For formal certification, see the VC-Certified program which uses VCP compliance as its basis.

Published by VeritasChain Standards Organization (VSO)
as part of the VCP standards ecosystem.