TesseraMonthly Joint Reading · April 2026

For Acme Corp (DEMO) · Period 2026-04-01 to 2026-04-30

acme-pilot saved $63,000 in April 2026.

Total measured savings this period
$63,000
reduction of 34.6% versus ratified baseline
You keep · 80.0%
$50,400

Direct cost reduction retained by acme-pilot, net of the Performance Fee.

Tessera earned · 20.0%
$12,600

Performance Fee, debited from prepaid balance against measured savings per §4-§5 of the Tessera Terms of Service.

Joint baseline
182k
re-priced at this month's volume
Actual cost paid
133k
across all in-scope workloads
Reduction
34.6%
below baseline · this period
Performance Fee · 20%
12.6k
invoiced at Day 60 · per §4.3

Spend trend

Joint baseline (gray) vs actual paid cost (olive) over the last six readings

$0$46k$91k$137k$182kDEC 25JAN 26FEB 26MAR 26APR 26BASELINEACTUAL

The gap between bars is the measured Ongoing Savings.

Per-workload breakdown

5 workloads this period · indicator reflects reduction versus its anchored baseline

StatusWorkloadProvider · modelRequestsBaselineActualSavedReduction
Optimized
product-search-summarizeOpenAI · gpt-4o-2024-08-06280,00042.0k24.0k18.0k42.9%
Optimized
support-classifierAnthropic · claude-opus-4-7-202645,00058.0k32.0k26.0k44.8%
Partial
doc-extractionGoogle · gemini-2.5-pro95,00050.0k38.0k12.0k24.0%
Partial
realtime-coding-assistAnthropic · claude-sonnet-4-x120,00032.0k25.0k$7,00021.9%
Excluded
code-review-bot · deepseek-chatno ratified anchor covers this period28,000$014.0k$00.0%
Optimized · 30%+ reductionPartial · 10-30%Action needed · <10%Excluded · no ratified anchor

What to fix next

Open optimization opportunities ranked by projected monthly savings · plus recent implementations for accountability

Implemented · Implemented
Opus → Sonnet swap on support-classifier (90% of intents are clear-cut)
Model routing · 2 engineering-days · reversible within day · quality risk low

Support classifier evaluation set (200 golden examples) shows Sonnet-4.x preserves 96% accuracy on the 90% of cases where the intent is unambiguous. RouteLLM router handles the 10% edge cases on Opus. Implemented Mar 2026.

Projected · per month
24.0k
range 20.0k28.0k
Recommended · In progress
Route 70% of product-search traffic to gpt-4o-mini
Model routing · reversible within hours · quality risk medium

Promptfoo eval shows gpt-4o-mini matches summarization quality on 92% of test set. Route by complexity heuristic.

Projected · per month
$2,400
range $1,800$3,200
Lower priority · In progress
Enable prompt caching on support-classifier system prompt
Caching · reversible within hours · quality risk low

System prompt 2.4k tokens stable across 87% of calls. Enable Anthropic prompt cache (5-min TTL) → ~50% input cost reduction.

Projected · per month
$510
range $380$640
Implemented · Implemented
Move code-review-bot to DeepSeek Batch API
Batch eligibility · reversible within day · quality risk low

Daily 450 reviews are non-realtime (overnight digest). Batch API = 50% cost reduction with same model.

Implemented 2026-04-16 · counted toward this period's measured savings

Projected · per month
$33
range $28$38
Critical · In progress
Enable Google Batch API on doc-extraction workload
Batch eligibility · 2 engineering-days · reversible within hours · quality risk low

Document extraction workload (Gemini-2.5-Pro · 95k requests/month) currently runs synchronously. Workload is overnight-tolerant per existing SLA. Google batch API runs at 50% list price. One Python service to wrap the existing pipeline; switching back is reversible within hours.

Projected · per month
$8,000
range $6,00010.0k
Critical · In progress
Anthropic prompt cache on support-classifier (system prompt 1,840 tokens, current cache hit 18%)
Caching · 1 engineering-day · reversible within hours · quality risk low

Support classifier system prompt + intent taxonomy is 1,840 tokens, identical across all 45k monthly requests. Prompt caching not configured — current hit rate from random page-cache only. Adding cache_control: { type: ephemeral } on the static prefix lifts hit rate to ~85%, dropping prefix cost by 9x.

Projected · per month
$6,500
range $5,500$7,500
Recommended · In progress
LLMLingua-2 compression on doc-extraction inputs (avg 1,800 input tokens · low semantic density)
Output compression · 3 engineering-days · reversible within day · quality risk medium

Document extraction input averages 1,800 tokens with substantial boilerplate and structural padding. LLMLingua-2 (Microsoft) compresses 30-40% with under 2% measured quality loss on extraction tasks per their published benchmarks. Inference layer added before existing extraction call.

Projected · per month
$3,200
range $2,400$4,000
Recommended · In progress
RouteLLM threshold tuning on realtime-coding-assist (currently routing 78% to Sonnet; eval suggests 92% safe)
Model routing · 5 engineering-days · reversible within week · quality risk medium

Realtime coding assist currently routes 78% of requests to Sonnet, 22% to Opus based on rough heuristic. Re-running RouteLLM scoring against your golden set suggests 92% safely route to Sonnet with no measurable quality drop on diff acceptance rate.

Projected · per month
$2,800
range $2,200$3,400

Savings trajectory

Measured savings delivered each closing period

$0$32k$63k$25.0kDEC 25$41.0kJAN 26$52.0kFEB 26$58.0kMAR 26$63.0kAPR 26

Methodology in four points

  1. iOngoing Savings = Joint Baseline cost (anchor blended cost × actual request volume) − actual paid cost, summed across in-scope workloads.
  2. iiWorkloads without a ratified anchor covering the period are excluded from savings — their actual cost is shown for transparency only.
  3. iiiProvider price moves, unrelated workload shrinkage, and seasonal volume effects are excluded per §6 of the Tessera Terms of Service.
  4. ivIf period savings are zero or negative, Performance Fee is zero (drift floor, §3.7); the Monitoring Fee remains due.

Audit trail · countersignature

Reading version
v1
Computed at
2026-05-01 02:14:00 UTC
Drift floor
Not triggered

Acknowledgement of this Reading constitutes acceptance of the Ongoing Savings figure and the Performance Fee derived from it. Disputes are governed by §18 of the Tessera Terms of Service — fifteen calendar days from issuance, in writing, with disputed portion withheld from balance debit and undisputed portion debited normally.

Fintechagency OÜ d.b.a. Tessera

Name · title · date

Acme Corp (DEMO)

Name · title · date