Sensei

Built for product managers and developers.

Your AI shouldn't be a black box

PMs waste hours manually reviewing traces to find what's broken. We automatically detect and cluster your top problems so you know exactly what to fix first.

  • 100ms overhead target
  • 200ms read P95
  • 1 page: drop‑in SDK
  • Early Access: General Availability soon

Core Features

Everything you need to understand and improve your AI conversations

Health score

A per-conversation score from 0 to 100 that shows whether users got what they needed.

Problem detection

Auto‑flags loops, nonsense, and frustration with links to conversations.

Problem clustering

Groups failures by type and ranks by frequency so you know what to fix first.

Usage patterns

Segments conversations by intent to see which use cases are broken.
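
To make "loop" concrete, here is a minimal sketch of one way a looping conversation could be flagged: the assistant sends the same reply several times. This is an illustrative heuristic, not Sensei's detection logic; the `Message` shape, the `looksLikeLoop` name, and the threshold of 3 are all assumptions.

```typescript
// Illustrative heuristic only; not Sensei's actual detection logic.
type Role = 'user' | 'assistant'

interface Message {
  role: Role
  content: string
}

// Normalize text so case and spacing differences do not hide a repeat.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, ' ').trim()
}

// Returns true when the assistant sends the same normalized reply
// at least `threshold` times in one conversation (assumed default: 3).
function looksLikeLoop(messages: Message[], threshold = 3): boolean {
  const counts = new Map<string, number>()
  for (const message of messages) {
    if (message.role !== 'assistant') continue
    const key = normalize(message.content)
    const seen = (counts.get(key) ?? 0) + 1
    if (seen >= threshold) return true
    counts.set(key, seen)
  }
  return false
}
```

A production detector would likely also need to catch near-duplicates and paraphrases, which exact matching misses.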

How It Works

From SDK to insights in 3 steps

1

SDK captures conversations

Wrap your AI calls with our SDK. It tracks messages, metadata, and user interactions automatically.

2

AI detects and clusters problems

Our system automatically flags failures, groups them by type, and ranks by frequency.

3

Dashboard shows what to fix

See your top problems with example traces. Know exactly what to prioritize and how to fix it.
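
Step 2 above (group failures by type, rank by frequency) can be sketched in a few lines. This is a simplified illustration, not Sensei's pipeline: the `FlaggedConversation` and `ProblemCluster` shapes and the `rankProblems` function are hypothetical, and it assumes each flagged conversation already carries a problem type.

```typescript
// Hypothetical shapes; field names are assumptions for illustration.
interface FlaggedConversation {
  conversationId: string
  problemType: string // e.g. 'loop', 'nonsense', 'frustration'
}

interface ProblemCluster {
  problemType: string
  count: number
  exampleConversationIds: string[]
}

// Group flagged conversations by problem type, keep a few example IDs
// per group, and return the groups ordered from most to least frequent.
function rankProblems(
  flagged: FlaggedConversation[],
  maxExamples = 3
): ProblemCluster[] {
  const clusters = new Map<string, ProblemCluster>()
  for (const item of flagged) {
    let cluster = clusters.get(item.problemType)
    if (!cluster) {
      cluster = { problemType: item.problemType, count: 0, exampleConversationIds: [] }
      clusters.set(item.problemType, cluster)
    }
    cluster.count += 1
    if (cluster.exampleConversationIds.length < maxExamples) {
      cluster.exampleConversationIds.push(item.conversationId)
    }
  }
  // Most frequent first; ties are broken alphabetically for stable output.
  return Array.from(clusters.values()).sort(
    (a, b) => b.count - a.count || a.problemType.localeCompare(b.problemType)
  )
}
```

The first element of the returned array is the problem to look at first, and its example IDs are the traces to open.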

Drop‑in SDK

Wrap your AI calls and track conversations with a single import. Retries, batching, and flush‑on‑unload are built in.

  • One‑time SDK init with API key
  • Track messages and metadata
  • Optional auto‑wrapper for OpenAI/Anthropic
import { Sensei } from '@sensei/sdk'

// Initialize once with your API key
const sensei = new Sensei({
  apiKey: process.env.SENSEI_API_KEY,
  baseUrl: 'https://api.sensei.com'
})

// Send one conversation's messages and metadata
await sensei.track({
  conversationId: 'conv_123',
  messages: [
    { role: 'user', content: 'How do I reset my password?', timestamp: Date.now() },
    { role: 'assistant', content: 'Here are the steps…', timestamp: Date.now() }
  ],
  metadata: { userId: 'user_456' }
})
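
For readers curious what "retries, batching, and flush‑on‑unload" means in practice, the sketch below shows the general pattern. It is not the `@sensei/sdk` implementation: `BatchQueue`, `SendFn`, and the default sizes are hypothetical, and the sender is synchronous here to keep the example short, whereas a real SDK would send asynchronously and back off between attempts.

```typescript
// Hypothetical sketch of the batching + retry pattern; not @sensei/sdk code.
type SendFn<T> = (batch: T[]) => boolean // returns true when delivery succeeds

class BatchQueue<T> {
  private buffer: T[] = []
  private readonly send: SendFn<T>
  private readonly maxBatchSize: number
  private readonly maxAttempts: number

  constructor(send: SendFn<T>, maxBatchSize = 10, maxAttempts = 3) {
    this.send = send
    this.maxBatchSize = maxBatchSize
    this.maxAttempts = maxAttempts
  }

  // Buffer an event and flush automatically once the batch is full.
  add(event: T): void {
    this.buffer.push(event)
    if (this.buffer.length >= this.maxBatchSize) this.flush()
  }

  // Send everything buffered, retrying up to maxAttempts times.
  // Returns true if the batch was delivered or there was nothing to send.
  flush(): boolean {
    if (this.buffer.length === 0) return true
    const batch = this.buffer
    this.buffer = []
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      if (this.send(batch)) return true
    }
    // Keep undelivered events so a later flush can try again.
    this.buffer = batch.concat(this.buffer)
    return false
  }
}
```

In a browser, flush‑on‑unload would amount to calling `flush()` from a `pagehide` or `visibilitychange` listener so buffered events are sent before the page goes away.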

Frequently Asked Questions

How is this different from Langfuse or Arize?

Langfuse and Arize give you infrastructure to run evals and view traces. We give you insights: we automatically detect what's broken and cluster problems by frequency, so you know what to fix first.

Do I need to write custom evals?

No. We automatically detect common failure patterns like loops, nonsense responses, and user frustration. Our AI judges conversation quality without requiring custom evaluation code.

How does the SDK work?

It's a drop-in wrapper around your AI calls: one import, and conversations are tracked automatically. Retries, batching, and flush-on-unload are built in. It adds about 100ms of overhead.

Can I see the actual conversations?

Yes. Every problem cluster links to example traces so you can verify our categorization and understand the specific failure patterns.

What's the pricing?

Free during early access. We'll announce pricing before general availability. Expect usage-based pricing similar to other developer tools.

When will it be available?

We're onboarding teams in batches during early access. Join the waitlist and we'll reach out within 1-2 weeks.

Get early access

We’re onboarding teams in batches. Get started free and we’ll reach out.

We’ll never spam you. Unsubscribe any time.