Changelog ⎮ pyannoteAI Speaker Diarization & Platform Updates

pyannoteAI Changelog

Get the latest updates on our API, AI models and more.

Jan 15, 2026

STT Orchestration now supports OpenAI Whisper Large V3 Turbo

You can now orchestrate pyannoteAI Precision‑2 diarization models with the OpenAI Whisper Large V3 Turbo STT model, combining best‑in‑class speaker attribution with state‑of‑the‑art multilingual transcription.

What it does:

Aligns speaker diarization and transcription into a unified workflow powered by the Whisper Large V3 Turbo model.

Technical details:

Connects to OSS transcription model: OpenAI Whisper Large V3 Turbo
Integrates seamlessly with pyannoteAI Precision‑2 diarization models
Returns structured speaker‑attributed transcripts with timestamps and text

Use case:

Leverage Whisper’s support for 99 languages to cover broader use cases and deployment scenarios.

Generate clean, speaker‑attributed transcripts in a single API call.
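As a rough illustration of working with speaker-attributed output, the sketch below merges consecutive segments from the same speaker into readable turns. The field names (start, end, speaker, text) are assumptions made for this example and may not match the actual response schema; check the technical documentation for the real format.

```python
from typing import TypedDict


class Segment(TypedDict):
    # Hypothetical field names; the real response schema may differ.
    start: float
    end: float
    speaker: str
    text: str


def format_transcript(segments: list[Segment]) -> str:
    """Merge consecutive same-speaker segments and render one line per turn."""
    turns: list[Segment] = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["end"] = seg["end"]
            turns[-1]["text"] += " " + seg["text"]
        else:
            # Copy the segment so the caller's data is not mutated.
            turns.append({**seg})
    return "\n".join(
        f"[{t['start']:.1f}-{t['end']:.1f}] {t['speaker']}: {t['text']}"
        for t in turns
    )
```

Merging by speaker keeps the text readable for downstream summarization while preserving the start and end time of each turn.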

👉 Explore technical documentation and tutorials: docs.pyannote.ai/tutorials/speech-to-text-diarization

Dec 11, 2025

STT orchestration for speaker-attributed transcription

STT orchestration is now available. This feature aligns diarization and transcription in a unified workflow.

What it does:

STT orchestration coordinates pyannoteAI diarization with transcription services. Instead of running diarization and transcription separately and then reconciling the outputs manually, you make one API call and receive speaker-attributed transcripts.

Technical details:

Supports Precision-2 pyannoteAI diarization models
Connects to OSS transcription models: Nemo Parakeet
Returns structured output: start/end timestamps, speaker IDs, transcribed text
Reduces timestamp reconciliation errors and ambiguous segments
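For context, the manual reconciliation step that orchestration replaces can be sketched roughly as follows: each transcribed word is assigned to the speaker turn it overlaps most in time. This is a generic maximum-overlap heuristic written for illustration, not pyannoteAI's actual alignment method, and the tuple layouts are assumptions.

```python
from typing import Optional

Word = tuple[float, float, str]   # (start, end, text), hypothetical layout
Turn = tuple[float, float, str]   # (start, end, speaker), hypothetical layout


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    words: list[Word], turns: list[Turn]
) -> list[tuple[float, float, Optional[str], str]]:
    """Label each word with the speaker whose turn overlaps it the most."""
    labelled = []
    for w_start, w_end, text in words:
        best = max(
            turns,
            key=lambda t: overlap(w_start, w_end, t[0], t[1]),
            default=None,
        )
        if best is None or overlap(w_start, w_end, best[0], best[1]) == 0.0:
            speaker = None  # ambiguous: the word falls outside every turn
        else:
            speaker = best[2]
        labelled.append((w_start, w_end, speaker, text))
    return labelled
```

Words that straddle a turn boundary or fall in a gap are exactly the ambiguous segments that a heuristic like this handles poorly.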

Use case:

Clean speaker-attributed transcripts for downstream summarization or analytics.

Benchmarks show improved tcpWER (time-constrained minimum-permutation word error rate) compared to typical STT providers.

Read our blog post for more details about the feature and its usage.
Explore the technical documentation and tutorials.
The tutorial notebooks shown in the video are now available on our GitHub.

Oct 6, 2025

New pricing plans

We are introducing new pricing plans designed to fit exactly where you are in your journey with speech processing.

New Pricing Plans

Developer Plan: €19/month, up to 125 hours of premium processing, access to Precision-2 & Community-1
Starter Plan: €99/month, up to 825 hours of premium processing, 3 concurrent jobs & seats
Enterprise Plan: Custom pricing with self-hosted deployment options
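To compare the two fixed-price plans, the effective cost per hour can be worked out from the figures above. The sketch assumes the full monthly quota is used, which is the best case given the "up to" wording; actual cost per hour is higher at lower usage.

```python
# Monthly price in EUR and included premium hours, taken from the plans above.
PLANS = {
    "Developer": (19, 125),
    "Starter": (99, 825),
}


def cost_per_hour(monthly_eur: float, included_hours: float) -> float:
    """Effective EUR per processed hour, assuming the whole quota is used."""
    return monthly_eur / included_hours


for name, (price, hours) in PLANS.items():
    print(f"{name}: EUR {cost_per_hour(price, hours):.3f} per hour")
```

At full usage this works out to about €0.152 per hour on the Developer Plan and €0.12 per hour on the Starter Plan.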

Host the open-source version on our servers

Open-source models are now accessible via a hosted API (zero setup required)
No setup or hosting headaches, direct API access
No upfront infrastructure investment or ongoing DevOps

👉 Read our blog post to deploy the OSS model via API

Sep 29, 2025

Community-1 now available

Our latest open-source diarization model, Community-1, is now available in pyannote.audio 4.0!

Our most significant open-source update brings you faster, more accurate diarization with seamless integration, and more:

Cleaner speaker assignment: Exclusive single-speaker mode eliminates overlapping conflicts.
Faster training pipeline: Metadata caching and optimized data loaders reduce training time by 40% while maintaining quality.
Smarter STT reconciliation: Streamlined timestamp reconciliation means fewer integration headaches.
Better accuracy: Significant improvements over OSS 3.1, especially with noisy, real-world audio conditions.
Premium path: Seamless upgrade to pyannoteAI Premium models when you need production-grade accuracy.

How to upgrade: pip install --upgrade pyannote.audio. New features are opt-in; existing code keeps working.

Note: Python 3.10+ required for this version.
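Before upgrading, it can help to confirm that the environment meets the Python requirement and to see which pyannote.audio version is currently installed. The following is a minimal sketch using only the standard library; the helper names are illustrative.

```python
import sys
from importlib import metadata
from typing import Optional

MIN_PYTHON = (3, 10)  # minimum version stated in the note above


def python_ok(version_info=sys.version_info) -> bool:
    """True if the given interpreter version satisfies the 3.10+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package: str = "pyannote.audio") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("pyannote.audio installed:", installed_version())
```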

👉 Read the full blog post for more details about the model and its capabilities

Sep 2, 2025

Precision-2 now available in API

Our latest diarization model Precision-2 is now available in our API!

It delivers +14% better accuracy than Precision-1 and +28% better accuracy than pyannote.audio OSS 3.1, setting a new standard in speaker diarization. Highlights:

Sharper results: +5% better segmentation & cross-talk handling vs Precision-1.
Smarter speaker counts: correctly predicts the number of speakers in 70% of cases on our hardest benchmark (vs 50% with Precision-1).
More control: set min/max speakers, use exclusive mode for easy STT alignment, and leverage per-segment confidence vectors for faster QA.

How to use: set the new model parameter to precision-2 in your diarization and identification API requests. Learn more in the docs.
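A request that selects the model might be built as in the sketch below. Only the model parameter and its precision-2 value come from this entry; the endpoint URL, the "url" field for the audio file, and the bearer-token header are assumptions for illustration and should be checked against the API docs. The function only constructs the request and does not send it.

```python
import json
import urllib.request

# Assumption: illustrative endpoint; confirm the real URL in the API docs.
API_URL = "https://api.pyannote.ai/v1/diarize"


def build_diarization_request(
    audio_url: str, api_key: str, model: str = "precision-2"
) -> urllib.request.Request:
    """Build (but do not send) a POST request that selects the diarization model."""
    payload = {
        "url": audio_url,  # assumed field name for the audio location
        "model": model,    # parameter named in this changelog entry
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the resulting object to urllib.request.urlopen would submit the job; during the transition period, the same helper could be called with model="precision-1" to compare results.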

Progressive switch: until October 3, you can switch between Precision-1 and Precision-2. After that, Precision-1 will be deprecated and all your API requests will use Precision-2 automatically.

No cost changes: enjoy the improved model at the same price.

👉 Read the full blog post for more details about the model and its capabilities
