Changelog

pyannoteAI Changelog

Get the latest updates on our API, AI models and more.

Dec 11, 2025

STT orchestration is now available. This feature aligns diarization and transcription in a unified workflow.

What it does:

STT orchestration combines pyannoteAI diarization with transcription services. Instead of running diarization and transcription separately and then reconciling the outputs manually, you make one API call and receive speaker-attributed transcripts.
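For context, the manual reconciliation step that this feature replaces is often a maximum-overlap assignment: each transcribed word goes to the speaker turn it overlaps most. The sketch below illustrates that generic approach only; it is not pyannoteAI's implementation, and the field names and sample data are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Duration (in seconds) shared by two time intervals, 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Attach to each word the speaker of the turn it overlaps most."""
    attributed = []
    for word in words:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            shared = overlap(word["start"], word["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_speaker, best_overlap = turn["speaker"], shared
        # Words that overlap no turn keep speaker=None (an ambiguous segment).
        attributed.append({**word, "speaker": best_speaker})
    return attributed


# Hypothetical diarization turns and word-level transcription output.
turns = [
    {"start": 0.0, "end": 2.0, "speaker": "SPEAKER_00"},
    {"start": 1.5, "end": 4.0, "speaker": "SPEAKER_01"},
]
words = [
    {"start": 0.25, "end": 0.75, "text": "hello"},
    {"start": 1.25, "end": 1.75, "text": "there"},
    {"start": 2.5, "end": 3.0, "text": "hi"},
]

attributed = assign_speakers(words, turns)
for word in attributed:
    print(word["speaker"], word["text"])
```

Overlapping turns and slightly misaligned word timestamps are where this kind of heuristic tends to produce wrong or ambiguous assignments.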

Technical details:

  • Supports Precision-2 pyannoteAI diarization models

  • Connects to OSS transcription models: NeMo Parakeet

  • Returns structured output: start/end timestamps, speaker IDs, transcribed text

  • Reduces timestamp reconciliation errors and ambiguous segments
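As a rough illustration of consuming such output, the sketch below merges consecutive segments from the same speaker into readable transcript lines. The segment layout (keys `start`, `end`, `speaker`, `text`) and the sample data are assumptions made for this example; check the technical documentation for the actual response schema.

```python
def merge_turns(segments):
    """Merge consecutive segments spoken by the same speaker."""
    merged = []
    for segment in segments:
        if merged and merged[-1]["speaker"] == segment["speaker"]:
            # Same speaker keeps talking: extend the previous turn.
            merged[-1]["end"] = segment["end"]
            merged[-1]["text"] += " " + segment["text"]
        else:
            merged.append(dict(segment))  # copy so the input stays untouched
    return merged


def format_transcript(segments):
    """Render one line per merged speaker turn."""
    return [
        f"[{t['start']:.1f}s - {t['end']:.1f}s] {t['speaker']}: {t['text']}"
        for t in merge_turns(segments)
    ]


# Hypothetical speaker-attributed segments.
segments = [
    {"start": 0.0, "end": 1.5, "speaker": "SPEAKER_00", "text": "Hello everyone."},
    {"start": 1.5, "end": 3.0, "speaker": "SPEAKER_00", "text": "Thanks for joining."},
    {"start": 3.2, "end": 4.0, "speaker": "SPEAKER_01", "text": "Happy to be here."},
]

lines = format_transcript(segments)
print("\n".join(lines))
```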

Use case:

  • Clean speaker-attributed transcripts for downstream summarization or analytics.

Benchmarks show improved tcpWER (time-constrained minimum-permutation word error rate) compared to typical STT providers.

Read our blog post for more details about the feature and its usage.
Explore the technical documentation and tutorials.
The tutorial notebooks shown in the video are now available on our GitHub.

Oct 6, 2025

We are introducing new pricing plans designed to fit exactly where you are in your journey with speech processing.

New Pricing Plans

  • Developer Plan: €19/month, up to 125 hours of premium processing, access to Precision-2 & Community-1

  • Starter Plan: €99/month, up to 825 hours of premium processing, 3 concurrent jobs & seats

  • Enterprise Plan: Custom pricing with self-hosted deployment options
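To compare the two fixed-price plans, the effective cost per hour can be worked out from the figures above. This is a simple arithmetic sketch that assumes the full monthly allowance is used; actual cost per hour is higher at lower usage, and any overage pricing is not covered here.

```python
# Monthly price (EUR) and included premium hours, taken from the plan list above.
plans = {
    "Developer": {"price_eur": 19, "hours": 125},
    "Starter": {"price_eur": 99, "hours": 825},
}

# Effective price per hour when the whole allowance is consumed.
per_hour = {name: p["price_eur"] / p["hours"] for name, p in plans.items()}

for name, rate in per_hour.items():
    print(f"{name}: €{rate:.3f} per hour")
# Developer: €0.152 per hour
# Starter: €0.120 per hour
```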

Host the open-source version on our servers

  • Open-source models are now accessible via a hosted API (zero setup required)

  • No setup or hosting headaches, direct API access

  • No upfront infrastructure investment or ongoing DevOps

👉 Read our blog post to deploy the OSS model via API

Sep 29, 2025

Our latest open-source diarization model, Community-1, is now available in pyannote.audio 4.0!

Our most significant open-source update brings you faster, more accurate diarization with seamless integration, and more:

  • Cleaner speaker assignment: Exclusive single-speaker mode eliminates overlapping conflicts.

  • Faster training pipeline: Metadata caching and optimized data loaders reduce training time by 40% while maintaining quality.

  • Smarter STT reconciliation: Streamlined timestamp reconciliation means fewer integration headaches.

  • Better accuracy: Significant improvements over OSS 3.1, especially in noisy, real-world audio conditions.

  • Premium path: Seamless upgrade to pyannoteAI Premium models when you need production-grade accuracy.

How to upgrade: run pip install --upgrade pyannote.audio. New features are opt-in; existing code keeps working.

Note: Python 3.10+ required for this version.
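Before upgrading, it may help to confirm that the environment meets these requirements. The following is a minimal preflight sketch using only the standard library; the helper names are hypothetical and not part of pyannote.audio.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum Python version stated in the note above


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version satisfies the 3.10+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_major_version(package="pyannote.audio"):
    """Return the installed major version of a package, or None if it is absent."""
    try:
        return int(metadata.version(package).split(".")[0])
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("pyannote.audio major version:", installed_major_version())
```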

👉 Read the full blog post for more details about the model and its capabilities

Sep 2, 2025

Our latest diarization model Precision-2 is now available in our API!

It delivers 14% better accuracy than Precision-1 and 28% better accuracy than pyannote.audio OSS 3.1, setting a new standard in speaker diarization, and more:

  • Sharper results: 5% better segmentation & cross-talk handling vs Precision-1.

  • Smarter speaker counts: correctly predicts the number of speakers on 70% of our hardest benchmark (vs 50% with Precision-1).

  • More control: set min/max speakers, use exclusive mode for easy STT alignment, and leverage per-segment confidence vectors for faster QA.

How to use: set the new model parameter to precision-2 in your diarization and identification API requests. Learn more in the docs.
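A request selecting the new model might be built as follows. This is a minimal sketch under stated assumptions: the endpoint URL and every JSON field name other than `model` (`url`, `minSpeakers`, `maxSpeakers`, `exclusive`) should be verified against the docs, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the API reference before use.
API_URL = "https://api.pyannote.ai/v1/diarize"


def build_diarization_request(api_key, audio_url, min_speakers=None,
                              max_speakers=None, exclusive=False):
    """Build (but do not send) a diarization request that selects precision-2."""
    payload = {"url": audio_url, "model": "precision-2"}
    # Optional controls; field names are assumptions for illustration.
    if min_speakers is not None:
        payload["minSpeakers"] = min_speakers
    if max_speakers is not None:
        payload["maxSpeakers"] = max_speakers
    if exclusive:
        payload["exclusive"] = True
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_diarization_request(
    "YOUR_API_KEY",
    "https://example.com/meeting.wav",  # placeholder audio URL
    min_speakers=2,
    max_speakers=4,
    exclusive=True,
)
body = json.loads(request.data)
print(body)
```

Passing the resulting object to `urllib.request.urlopen` would submit the job, given a valid API key.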

Progressive switch: until October 3, you can switch between Precision-1 and Precision-2. After that, Precision-1 will be deprecated and all your API requests will use Precision-2 automatically.

No cost changes: enjoy the improved model at the same price.

👉 Read the full blog post for more details about the model and its capabilities

Speaker Intelligence Platform for developers

Detect, segment, label and separate speakers in any language.

Make the most of conversational speech with AI
