whisperlivekit
by
quentinfuxa

Description: Simultaneous speech-to-text model


Summary Information

Updated 59 minutes ago
Added to GitGenius on September 5th, 2025
Created on December 19th, 2024
Open Issues/Pull Requests: 26 (+0)
Number of forks: 965
Total Stargazers: 9,729 (+0)
Total Subscribers: 60 (+0)
Detailed Description

WhisperLiveKit combines OpenAI's Whisper speech-to-text model with the real-time capabilities of LiveKit, a WebRTC-based platform for building live audio/video applications. It provides live, low-latency transcription of audio streams in a multi-participant setting, making it well suited to live captioning, real-time meeting notes, and accessibility features for online events. The repository provides the components needed to integrate Whisper's transcription directly into a LiveKit room.

At its core, the project leverages Whisper.cpp, a C++ port of OpenAI's Whisper model optimized for running on CPUs and GPUs. This matters for performance, since real-time inference with a model the size of Whisper requires significant computational resources. WhisperLiveKit does not rely on the OpenAI API directly, which avoids per-request costs and gives more control over the transcription process; instead, it downloads a quantized Whisper model and runs it locally, enabling offline operation and lower latency. The project supports several model sizes (tiny, base, small, medium, large), offering a trade-off between accuracy and speed.
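To make the accuracy/speed trade-off concrete, the sketch below pairs each checkpoint name with its published parameter count and picks the largest model that fits a compute budget. The parameter counts are the sizes OpenAI publishes for the Whisper checkpoints; the helper itself is illustrative and not part of WhisperLiveKit.

```python
# Approximate parameter counts (in millions) per Whisper checkpoint,
# ordered smallest to largest.
WHISPER_SIZES = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "large": 1550,
}

def pick_model(max_params_m: int) -> str:
    """Return the largest (most accurate) checkpoint within the budget."""
    candidates = [name for name, params in WHISPER_SIZES.items()
                  if params <= max_params_m]
    if not candidates:
        raise ValueError("no Whisper checkpoint fits the parameter budget")
    # Dicts preserve insertion order, so the last candidate is the largest.
    return candidates[-1]

print(pick_model(300))  # a mid-range budget selects "small"
```

Quantization shifts this trade-off further: an int8-quantized model of a given size runs faster and uses less memory than its float32 counterpart, at a small accuracy cost.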

The architecture is cleverly designed. Audio streams from LiveKit participants are captured and sent to a dedicated "transcriber" service. This service, built with FastAPI, receives the audio chunks, preprocesses them for Whisper, performs the transcription, and publishes the resulting text back to the LiveKit room via data channels. This separation of concerns, with LiveKit handling real-time audio/video transport and the FastAPI service handling transcription, keeps the system modular and scalable. The use of data channels ensures that transcriptions reach all participants in the room with minimal delay.
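The "preprocesses them for Whisper" step typically means converting raw 16-bit PCM audio into normalized floating-point samples, which is what Whisper-family models consume. A minimal stdlib sketch of that conversion, with an illustrative function name not taken from the repository:

```python
import struct

def pcm16_to_float(chunk: bytes) -> list[float]:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0]."""
    count = len(chunk) // 2
    samples = struct.unpack(f"<{count}h", chunk[: count * 2])
    return [s / 32768.0 for s in samples]

# A two-sample chunk: full-scale negative (-32768) followed by silence (0).
chunk = struct.pack("<2h", -32768, 0)
print(pcm16_to_float(chunk))  # [-1.0, 0.0]
```

In a real deployment this conversion (plus resampling to 16 kHz mono) would run on each chunk arriving from LiveKit before it is queued for the model.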

Key components include a LiveKit client application (an example is provided in React) that joins a room and sends/receives audio, a FastAPI server that hosts the Whisper transcriber, and Docker Compose files for easy deployment. The repository also includes configuration options for customizing the Whisper model, audio parameters (sample rate, channels), and LiveKit room settings. A significant feature is support for speaker diarization, which attempts to identify *who* is speaking at any given time, making the transcriptions considerably more useful in multi-participant settings.
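The tunables listed above can be grouped into a single settings object. The field names and defaults below are assumptions for the sketch, not the repository's actual configuration schema:

```python
from dataclasses import dataclass

@dataclass
class TranscriberConfig:
    # Field names and defaults are illustrative, not WhisperLiveKit's own.
    model_size: str = "small"   # tiny | base | small | medium | large
    sample_rate: int = 16_000   # Whisper expects 16 kHz audio
    channels: int = 1           # mono input
    room_name: str = "default"  # LiveKit room to join
    diarization: bool = False   # enable speaker identification

cfg = TranscriberConfig(model_size="base", diarization=True)
print(cfg.model_size, cfg.sample_rate, cfg.diarization)
```

Centralizing these values in one object makes it straightforward to populate them from environment variables or a Docker Compose file at deploy time.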

The project is still under active development, but it demonstrates a compelling use case for combining cutting-edge speech-to-text technology with real-time communication platforms. It is a valuable resource for developers looking to add live transcription to their LiveKit-based applications, offering a flexible and cost-effective alternative to relying solely on cloud-based transcription services. The documentation, while still evolving, provides a good starting point for understanding the architecture and deploying the system. Future development will likely focus on improving diarization accuracy, optimizing performance for different hardware configurations, and expanding the range of supported Whisper models.

