Duyetbot Agent

Benchmarks

P50 latency and token costs, drawn from Vitest E2E tests and Phase 1 data: 5s end-to-end including routing, a 500ms batch window, and 75 tokens per query.

TL;DR: P50 end-to-end response time is 5s, at 75 tokens per query. Batching saves 55%. Figures come from Vitest E2E tests and production logs.


Latency Table

Figures come from the timings in PLAN.md and from apps/telegram-bot/src/__tests__/e2e/performance.test.ts.

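| Phase            | P50   | P95  | Notes           |
|------------------|-------|------|-----------------|
| Webhook -> DO    | 6ms   | 20ms | Fire-and-forget |
| Batch Alarm      | 500ms | 1s   | Window          |
| Routing          | 300ms | 2s   | Hybrid classify |
| LLM Simple       | 2s    | 5s   | Direct          |
| LLM Orchestrator | 4s    | 10s  | Plan+workers    |
| E2E Total        | 5s    | 12s  | ✅ Prod         |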
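
As a rough cross-check, the per-phase P50 figures from the latency table can be summed along each path. This is only a sketch: percentiles of individual phases do not strictly add up to the end-to-end percentile, and the names `phaseP50Ms` and `pathTotalMs` are illustrative, not taken from the codebase.

```typescript
// P50 figures in milliseconds, copied from the latency table.
const phaseP50Ms = {
  webhookToDO: 6,
  batchAlarm: 500,
  routing: 300,
  llmSimple: 2000,
  llmOrchestrator: 4000,
} as const;

type Phase = keyof typeof phaseP50Ms;

// Sum the P50 of every phase on a path (an approximation, not a true E2E P50).
function pathTotalMs(phases: Phase[]): number {
  return phases.reduce((total, phase) => total + phaseP50Ms[phase], 0);
}

const simplePathMs = pathTotalMs(["webhookToDO", "batchAlarm", "routing", "llmSimple"]);
const orchestratorPathMs = pathTotalMs(["webhookToDO", "batchAlarm", "routing", "llmOrchestrator"]);

console.log(simplePathMs); // 2806
console.log(orchestratorPathMs); // 4806
```

Under this approximation the orchestrator path lands at about 4.8s, which is in line with the 5s E2E total reported in the table.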

Token Table

Baseline: 100 queries per day.

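| Agent          | Tokens/Query | % Queries | Total/Day   |
|----------------|--------------|-----------|-------------|
| Pattern/Simple | 75           | 80%       | 6k          |
| LLM Classify   | 300          | 20%       | 1.5k        |
| Orchestrator   | 1500         | 10%       | 1.5k        |
| Avg            | 75           | -         | 7.5k vs 30k |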
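
The daily total in the Avg row follows from the baseline and the average tokens per query. The worked calculation below is a sketch using only those figures. The variable names are chosen here for illustration, and the 30k value is used as the comparison total exactly as the table states it.

```typescript
// Figures from the text: 100 queries/day, 75 tokens/query on average,
// and the 30k tokens/day comparison figure from the Avg row.
const queriesPerDay = 100;
const avgTokensPerQuery = 75;
const comparisonTokensPerDay = 30_000;

// Daily token total at the average cost per query.
const tokensPerDay = queriesPerDay * avgTokensPerQuery;

// Relative reduction against the comparison figure, in percent.
const reductionPct = (1 - tokensPerDay / comparisonTokensPerDay) * 100;

console.log(tokensPerDay); // 7500
console.log(reductionPct); // 75
```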

Perf Flow

    +----------------+
    |   Webhook T0   |
    +-------+--------+
            |
            v
    +----------------+
    |  Alarm T500ms  |
    +-------+--------+
            |
            v
    +----------------+
    | Classify 300ms |
    +-------+--------+
            |
            v
    +----------------+
    |    Simple?     |
    +---+--------+---+
        |        |
    Yes |        | No
        v        v
     +-----+  +-----+
     | LLM |  |Orch |
     | 2s  |  | 4s  |
     +--+--+  +--+--+
        |        |
        +---+----+
            |
            v
    +----------------+
    |   Edit Resp    |
    |     T5s OK     |
    +----------------+

See the E2E performance tests for reference.

Vitest Snippet

A hypothetical sketch of how the latency metric is captured.

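// apps/telegram-bot/src/__tests__/e2e/performance.test.ts
import { expect, it } from 'vitest';
import { sendMessage } from './helpers'; // hypothetical helper module; point this at the real one

it('P50 E2E latency', async () => {
  const start = performance.now();
  await sendMessage('hi'); // triggers the full flow
  const latency = performance.now() - start;
  // Single-sample check against a 6s budget (P50 target is 5s)
  expect(latency).toBeLessThan(6000);
}, 15_000); // raise Vitest's default 5s test timeout so the 6s assertion can be reached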

Run: bun vitest e2e/performance.
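
A single timed request cannot by itself establish a P50 or P95. One way to get there is to collect several latency samples and take a nearest-rank percentile. The helper below is a minimal sketch of that idea. `percentile` is a name introduced here, not a function from the repository, and the sample latencies are made up purely for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Made-up latencies in ms, for illustration only (not measured data).
const latenciesMs = [4200, 5100, 4800, 12000, 5000, 4700, 5300, 4900, 6100, 5050];

const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);

console.log(p50); // 5000
console.log(p95); // 12000
```

In a test, the samples could come from timing a loop of requests, with the assertion made on `p50` rather than on one measurement.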

Savings Quiz

Q: How much does batching 3 messages save?

A: 55% (1 call vs 3) ✅
B: 0%
C: 75%
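
To see why a burst of messages collapses into one call, the sketch below groups arrival times into 500ms windows. It assumes a window opens at the first message and that everything arriving before it closes joins the same batch. The real bot's alarm handling may differ, and `batchByWindow` is a hypothetical helper, not part of the codebase.

```typescript
// Group message arrival times (ms) into batches. A window opens at the first
// message of a batch; a message arriving windowMs or later opens a new one.
function batchByWindow(arrivalsMs: number[], windowMs: number): number[][] {
  const sorted = [...arrivalsMs].sort((a, b) => a - b);
  const batches: number[][] = [];
  let windowStart = -Infinity;
  for (const t of sorted) {
    if (batches.length === 0 || t - windowStart >= windowMs) {
      batches.push([t]); // start a new batch, i.e. one more downstream call
      windowStart = t;
    } else {
      batches[batches.length - 1].push(t); // joins the open window
    }
  }
  return batches;
}

const burst = batchByWindow([0, 120, 430], 500);
const spread = batchByWindow([0, 700, 1500], 500);

console.log(burst.length); // 1
console.log(spread.length); // 3
```

Three messages inside one window produce a single batch, while the same three spread over time produce three.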

Deploy, then run wrangler tail to watch the logs and benchmark your own queries.
