joinly MCP Server

An MCP server that enables AI agents to join and actively participate in video calls on Google Meet, Zoom, and Microsoft Teams through browser automation, providing live transcripts, voice interaction, and chat capabilities.


Installation

Docker

docker pull ghcr.io/joinly-ai/joinly:latest
docker run --env-file .env ghcr.io/joinly-ai/joinly:latest --client <MeetingURL>

External Client

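# Start joinly as an MCP server on port 8000
docker run -p 8000:8000 ghcr.io/joinly-ai/joinly:latest
# In a second terminal, connect the external client and join the meeting
uvx joinly-client --env-file .env <MeetingURL>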

GPU Support

docker pull ghcr.io/joinly-ai/joinly:latest-cuda
docker run --gpus all --env-file .env -p 8000:8000 ghcr.io/joinly-ai/joinly:latest-cuda

Configuration

Environment Variables

# .env
# for OpenAI LLM
JOINLY_LLM_MODEL=gpt-4o
JOINLY_LLM_PROVIDER=openai
OPENAI_API_KEY=your-openai-api-key
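
Before starting the container, it can help to confirm that the .env file parses the way you expect. The following is a minimal sketch, not part of joinly: `parse_env` and `SAMPLE` are hypothetical names, and the parser assumes plain KEY=VALUE lines with `#` comments, which is a simplification of how Docker reads an env file.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line; ignore it
        values[key.strip()] = value.strip()
    return values


# Same content as the .env example above.
SAMPLE = """\
# .env
# for OpenAI LLM
JOINLY_LLM_MODEL=gpt-4o
JOINLY_LLM_PROVIDER=openai
OPENAI_API_KEY=your-openai-api-key
"""

if __name__ == "__main__":
    for key, value in parse_env(SAMPLE).items():
        print(f"{key}={value}")
```

To check a real file, pass its contents instead of `SAMPLE`, for example `parse_env(open(".env").read())`.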

MCP Server Configuration

{
    "mcpServers": {
        "localServer": {
            "command": "npx",
            "args": ["-y", "package@0.1.0"]
        },
        "remoteServer": {
            "url": "http://mcp.example.com",
            "auth": "oauth"
        }
    }
}
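
The entries above are placeholders (`package@0.1.0`, `mcp.example.com`), so a quick structural check before use may be worthwhile. This sketch assumes each server entry carries either a `command` field (launched as a local process) or a `url` field (reached over the network), as the two examples suggest. `classify_servers` and the labels `stdio` and `remote` are hypothetical names chosen for illustration.

```python
import json
from typing import Dict

# Same structure as the configuration shown above.
CONFIG = """
{
    "mcpServers": {
        "localServer": {
            "command": "npx",
            "args": ["-y", "package@0.1.0"]
        },
        "remoteServer": {
            "url": "http://mcp.example.com",
            "auth": "oauth"
        }
    }
}
"""


def classify_servers(config_text: str) -> Dict[str, str]:
    """Label each entry 'stdio' (has "command") or 'remote' (has "url")."""
    servers = json.loads(config_text).get("mcpServers", {})
    kinds: Dict[str, str] = {}
    for name, entry in servers.items():
        if "command" in entry:
            kinds[name] = "stdio"
        elif "url" in entry:
            kinds[name] = "remote"
        else:
            raise ValueError(f"{name}: expected a 'command' or 'url' field")
    return kinds


if __name__ == "__main__":
    print(classify_servers(CONFIG))
```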

Available Tools

Tool                Description
join_meeting        Join a meeting with URL, participant name, and optional password
leave_meeting       Leave the current meeting
speak_text          Speak text using TTS (requires text parameter)
send_chat_message   Send a message to chat (requires message parameter)
mute_yourself       Mute your microphone
unmute_yourself     Unmute your microphone
get_chat_history    Get chat history of current meeting in JSON format
get_participants    Get participants of current meeting in JSON format
get_transcript      Get transcript of current meeting in JSON format, optionally filtered by minutes
get_video_snapshot  Get an image from the current meeting, for example, to view the current screen share
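
MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. The sketch below only builds the request payloads for two of the tools listed above; it does not open a connection. `tool_call` is a hypothetical helper, and the argument names `text` and `message` are taken from the tool descriptions. Argument names for the other tools are not given here, so they are left out.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP `tools/call` request as a JSON-serializable dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    speak = tool_call("speak_text", {"text": "Hello, everyone."})
    chat = tool_call("send_chat_message", {"message": "Notes will follow."})
    print(json.dumps(speak, indent=2))
    print(json.dumps(chat, indent=2))
```

In practice an MCP client library would build and send these messages for you; the payload shape is shown only to make the tool interface concrete.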

Capabilities

  • Live Interaction: Enables agents to perform tasks and respond in real-time via voice or chat during meetings
  • Conversation Flow: Built-in logic that ensures natural conversations, handling interruptions and multi-speaker interaction
  • Cross-Platform: Join Google Meet, Zoom, and Microsoft Teams (or any platform accessible via browser)
  • Bring-Your-Own-LLM: Works with all LLM providers (also locally with Ollama)
  • Choose-Your-Preferred-TTS/STT: Modular design supports multiple services - Whisper/Deepgram for STT and Kokoro/ElevenLabs/Deepgram for TTS
  • 100% open-source, self-hosted, and privacy-first

Environment Variables

Required

  • JOINLY_LLM_MODEL - LLM model to use (e.g., gpt-4o)
  • JOINLY_LLM_PROVIDER - LLM provider (openai, anthropic, ollama)

Optional

  • OPENAI_API_KEY - OpenAI API key for GPT models (needed when JOINLY_LLM_PROVIDER is openai)
  • ELEVENLABS_API_KEY - ElevenLabs API key for TTS
  • DEEPGRAM_API_KEY - Deepgram API key for STT/TTS
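
The required and optional variables above can be checked before launch. This is a hedged sketch: `missing_settings` is a hypothetical helper, and the pairing of the `openai` provider with `OPENAI_API_KEY` is an assumption based on the .env example. Keys for other providers are not listed in this document, so they are not checked.

```python
import os
from typing import Dict, List

REQUIRED = ("JOINLY_LLM_MODEL", "JOINLY_LLM_PROVIDER")

# Assumption: the openai provider needs OPENAI_API_KEY.
PROVIDER_KEYS = {"openai": "OPENAI_API_KEY"}


def missing_settings(env: Dict[str, str]) -> List[str]:
    """Return the names of settings that are absent or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    provider_key = PROVIDER_KEYS.get(env.get("JOINLY_LLM_PROVIDER", ""))
    if provider_key and not env.get(provider_key):
        missing.append(provider_key)
    return missing


if __name__ == "__main__":
    problems = missing_settings(dict(os.environ))
    if problems:
        print("Missing settings:", ", ".join(problems))
    else:
        print("All checked settings are present.")
```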

Usage Examples

  • Join a meeting and answer questions with access to the latest news from the internet
  • Create GitHub issues during meetings based on discussions
  • Edit Notion page content in real time during meetings
  • Retrieve meeting transcripts and participant information
  • Send chat messages and voice responses during video calls

Notes

Provides a live transcript resource (transcript://live) that you can subscribe to for real-time updates. Supports GPU acceleration with CUDA for improved transcription quality. Can run as a standalone client or as an MCP server with external clients connecting to it.
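
Subscribing to the live transcript follows the standard MCP resource flow: the client sends `resources/subscribe` for the URI, the server sends `notifications/resources/updated` when the resource changes, and the client then fetches the new content with `resources/read`. The sketch below builds those messages only; the helper names are hypothetical, and the shape of the transcript content itself is not specified here.

```python
from typing import Any, Dict

TRANSCRIPT_URI = "transcript://live"


def subscribe_request(uri: str, request_id: int) -> Dict[str, Any]:
    """Build an MCP `resources/subscribe` request for one resource URI."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/subscribe",
        "params": {"uri": uri},
    }


def read_request(uri: str, request_id: int) -> Dict[str, Any]:
    """Build the `resources/read` request sent after an update arrives."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


def is_update_for(message: Dict[str, Any], uri: str) -> bool:
    """True if `message` is an update notification for `uri`."""
    return (
        message.get("method") == "notifications/resources/updated"
        and message.get("params", {}).get("uri") == uri
    )


if __name__ == "__main__":
    print(subscribe_request(TRANSCRIPT_URI, 1))
    # Example of the notification a server would send on a change.
    incoming = {
        "jsonrpc": "2.0",
        "method": "notifications/resources/updated",
        "params": {"uri": TRANSCRIPT_URI},
    }
    if is_update_for(incoming, TRANSCRIPT_URI):
        print(read_request(TRANSCRIPT_URI, 2))
```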
