Voice Streaming

Stream audio directly into voice channels via WebSocket. Music bots, TTS, live audio relay — no WebRTC needed.

How It Works

Argon bots don't use WebRTC. Instead, they stream audio directly to a WebSocket ingress endpoint. The server handles mixing and distribution to all participants in the voice channel.

Your Bot → (Opus over WebSocket) → SFU Ingress → (WebRTC) → Listeners

Step-by-Step Flow

Request a Stream Token

Call the StreamToken endpoint with the target space and channel:

Request

POST /api/bot/IVoice/v1/StreamToken
Authorization: Bot YOUR_TOKEN
Content-Type: application/json

{
  "spaceId": "550e8400-e29b-41d4-a716-446655440000",
  "channelId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
}

Response

{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "ingressUrl": "ws://ingress.argon.gl:12880",
  "roomName": "550e8400-.../6ba7b810-..."
}

Connect to the WebSocket Ingress

Open a WebSocket connection to the ingressUrl with your token and audio configuration as query parameters:

// Example connection URL with the most common parameters
ws://ingress.argon.gl:12880?token=eyJhbGci...&stereo=false&frame_duration_ms=20&track_name=audio&track_source=microphone

Connection Parameters

Parameter	Type	Default	Description
token	string	—	Required. The JWT token from `StreamToken`.
stereo	boolean	false	Set to `true` for stereo audio (2 channels). Mono by default.
channels	number	1	Alternative to `stereo`. Set to `2` for stereo.
frame_duration_ms	number	20	Opus frame duration in milliseconds. Valid values: `10`, `20`, `40`, `60`.
track_name	string	""	Display name for the audio track (visible to other participants).
track_source	string	""	Track source type, e.g. `microphone`, `screen_share_audio`.
metadata	string	""	Arbitrary metadata string attached to the track participant.

Music bot (stereo, 20ms frames)

ws://ingress.argon.gl:12880?token=TOKEN&stereo=true&frame_duration_ms=20&track_name=music&track_source=microphone
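
The connection URL can also be assembled programmatically rather than by hand. The following is a minimal sketch: `buildIngressUrl` and `IngressOptions` are hypothetical names, not part of any Argon SDK, and the only rules enforced are the ones listed in the parameter table above.

```typescript
// Hypothetical helper: option names mirror the query parameters in the table above.
interface IngressOptions {
  token: string;
  stereo?: boolean;
  frameDurationMs?: number;
  trackName?: string;
  trackSource?: string;
  metadata?: string;
}

const VALID_FRAME_DURATIONS = [10, 20, 40, 60];

function buildIngressUrl(ingressUrl: string, opts: IngressOptions): string {
  const frameDurationMs = opts.frameDurationMs ?? 20;
  if (!VALID_FRAME_DURATIONS.includes(frameDurationMs)) {
    throw new RangeError(
      `frame_duration_ms must be one of ${VALID_FRAME_DURATIONS.join(", ")}`,
    );
  }

  // URLSearchParams percent-encodes values, so tokens and metadata are safe to pass as-is.
  const url = new URL(ingressUrl);
  url.searchParams.set("token", opts.token);
  url.searchParams.set("stereo", String(opts.stereo ?? false));
  url.searchParams.set("frame_duration_ms", String(frameDurationMs));
  if (opts.trackName) url.searchParams.set("track_name", opts.trackName);
  if (opts.trackSource) url.searchParams.set("track_source", opts.trackSource);
  if (opts.metadata) url.searchParams.set("metadata", opts.metadata);
  return url.toString();
}
```

Note that `URL` normalizes an empty path to `/`, so the result looks like `ws://ingress.argon.gl:12880/?token=...`. Optional parameters are omitted when unset so the server-side defaults apply.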

Stream Opus Audio Frames

Send binary WebSocket frames containing Opus-encoded audio data. Each frame should be a single Opus packet.

Audio Requirements

Parameter	Value
Codec	Opus
Sample Rate	48,000 Hz
Channels	Mono (1) or Stereo (2) — configured via `stereo` or `channels` query param
Frame Duration	10, 20 (default), 40, or 60 ms — configured via `frame_duration_ms` query param
Transport	Binary WebSocket frames (one Opus packet per frame)
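
The sample rate and frame duration together fix how many samples each packet carries: at 48,000 Hz, a 20 ms frame holds 960 samples per channel. The sketch below works through that arithmetic and shows one way to pace packets at the frame rate. This page does not say whether the ingress buffers or paces incoming audio, so sending in real time is an assumption here, and `samplesPerFrame`, `sendDelayMs`, and `sendPaced` are hypothetical helper names.

```typescript
const SAMPLE_RATE_HZ = 48_000;

// Samples per channel in one Opus frame of the given duration.
function samplesPerFrame(frameDurationMs: number): number {
  return (SAMPLE_RATE_HZ * frameDurationMs) / 1000;
}

// How long to wait before sending packet `index`. Deadlines are computed from
// the stream start rather than from the previous send, so timer jitter does
// not accumulate into drift.
function sendDelayMs(
  index: number,
  frameDurationMs: number,
  startedAtMs: number,
  nowMs: number,
): number {
  const dueAtMs = startedAtMs + index * frameDurationMs;
  return Math.max(0, dueAtMs - nowMs);
}

// Sends one packet per frame interval through the supplied `send` callback.
async function sendPaced(
  packets: Iterable<Uint8Array>,
  frameDurationMs: number,
  send: (packet: Uint8Array) => void,
): Promise<void> {
  const startedAtMs = Date.now();
  let index = 0;
  for (const packet of packets) {
    const delay = sendDelayMs(index, frameDurationMs, startedAtMs, Date.now());
    if (delay > 0) await new Promise((resolve) => setTimeout(resolve, delay));
    send(packet);
    index++;
  }
}
```

With a connected socket `ws`, usage would look like `await sendPaced(packets, 20, (p) => ws.send(p))`.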

Do not send raw PCM. The ingress expects Opus-encoded packets. Encode the audio before sending, using a tool such as opusenc or ffmpeg, or your language's Opus bindings.
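
One detail worth knowing when encoding with ffmpeg or opusenc: both write an Ogg Opus stream, in which packets are wrapped in Ogg pages, and bytes read from a pipe do not line up with packet boundaries. If the ingress needs one bare Opus packet per WebSocket message, as the requirements above state, the packets have to be extracted first. The sketch below is one way to do that under those assumptions. `OggPacketReader` and `isOpusHeader` are hypothetical names, and the reader skips CRC verification and resynchronization for brevity.

```typescript
// Join byte segments into one contiguous array.
function concat(parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce((n, p) => n + p.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}

// The first two packets of an Ogg Opus stream are metadata, not audio.
function isOpusHeader(packet: Uint8Array): boolean {
  if (packet.length < 8) return false;
  const magic = String.fromCharCode(...packet.subarray(0, 8));
  return magic === "OpusHead" || magic === "OpusTags";
}

// Incremental Ogg demuxer: feed arbitrary byte chunks, get whole packets back.
class OggPacketReader {
  private buf: Uint8Array = new Uint8Array(0);
  private partial: Uint8Array[] = []; // segments of a packet still in progress

  push(chunk: Uint8Array): Uint8Array[] {
    this.buf = concat([this.buf, chunk]);
    const packets: Uint8Array[] = [];

    // An Ogg page is a 27-byte header, a segment (lacing) table, then the body.
    while (this.buf.length >= 27) {
      const b = this.buf;
      if (b[0] !== 0x4f || b[1] !== 0x67 || b[2] !== 0x67 || b[3] !== 0x53) {
        throw new Error("Lost Ogg sync: expected 'OggS' capture pattern");
      }
      const segmentCount = b[26];
      const headerLen = 27 + segmentCount;
      if (b.length < headerLen) break; // wait for the rest of the header

      let bodyLen = 0;
      for (let i = 0; i < segmentCount; i++) bodyLen += b[27 + i];
      if (b.length < headerLen + bodyLen) break; // wait for the rest of the page

      let offset = headerLen;
      for (let i = 0; i < segmentCount; i++) {
        const len = b[27 + i];
        this.partial.push(b.subarray(offset, offset + len));
        offset += len;
        // A lacing value below 255 marks the end of a packet; a run of 255s
        // continues it, possibly onto the next page.
        if (len < 255) {
          packets.push(concat(this.partial));
          this.partial = [];
        }
      }
      this.buf = b.subarray(offset);
    }
    return packets;
  }
}
```

In a streaming loop, each chunk from the encoder's stdout would go through `reader.push(chunk)`, and every returned packet for which `isOpusHeader` is false would be sent as its own binary message.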

Room & Token Details

Property	Details
Room Name Format	{spaceId}/{channelId}
Token Type	LiveKit JWT (HS256)
Token TTL	2 hours — request a new one before it expires
Bot Permissions	Publish audio/video, subscribe to tracks, send data, update own metadata
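
Because the token expires after 2 hours, a long-running bot needs to know when to request a replacement. The sketch below reads the expiry from the token itself. It assumes the JWT carries the standard `exp` claim (seconds since the Unix epoch), which this page does not explicitly confirm. `jwtExpiryMs` and `refreshDelayMs` are hypothetical helpers, and the payload is decoded without verifying the signature, which is enough for scheduling but not for trust decisions.

```typescript
// Returns the token's expiry as a millisecond timestamp, or undefined if the
// payload has no numeric `exp` claim.
function jwtExpiryMs(token: string): number | undefined {
  const payloadPart = token.split(".")[1];
  if (!payloadPart) return undefined;

  // JWT segments are base64url; map to the standard alphabet before decoding.
  const base64 = payloadPart.replace(/-/g, "+").replace(/_/g, "/");
  const payload = JSON.parse(atob(base64)) as { exp?: unknown };
  return typeof payload.exp === "number" ? payload.exp * 1000 : undefined;
}

// Milliseconds to wait before requesting a new token, leaving a safety margin
// ahead of the actual expiry.
function refreshDelayMs(
  expiresAtMs: number,
  nowMs: number,
  marginMs: number = 5 * 60_000,
): number {
  return Math.max(0, expiresAtMs - marginMs - nowMs);
}
```

If `jwtExpiryMs` returns `undefined`, a reasonable fallback is to schedule the refresh relative to the documented 2-hour TTL, measured from the time the token was requested.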

Prerequisites & Validation

The StreamToken endpoint validates the following before issuing a token:

The bot must be a member of the space — returns 403 otherwise.
The channel must exist — returns 404 if not found.
The channel must be a voice channel — returns 400 for text channels.
Rate limited to 5 requests per minute.
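
The validation rules above map naturally onto client-side error handling. The following is a sketch, not an official client: `requestStreamToken` and `describeStreamTokenError` are hypothetical names, the base URL is taken from the code example on this page, and the status code returned when the rate limit is hit is not documented here, so it falls through to the generic case.

```typescript
interface StreamTokenResponse {
  token: string;
  ingressUrl: string;
  roomName: string;
}

// Translate the documented failure statuses into readable messages.
function describeStreamTokenError(status: number): string {
  switch (status) {
    case 400:
      return "the channel is not a voice channel";
    case 403:
      return "the bot is not a member of the space";
    case 404:
      return "the channel was not found";
    default:
      return `unexpected status ${status}`;
  }
}

// `fetchImpl` is injectable so the function can be exercised without a network.
async function requestStreamToken(
  botToken: string,
  spaceId: string,
  channelId: string,
  fetchImpl: typeof fetch = fetch,
): Promise<StreamTokenResponse> {
  const resp = await fetchImpl("https://api.argon.gl/api/bot/IVoice/v1/StreamToken", {
    method: "POST",
    headers: {
      Authorization: `Bot ${botToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ spaceId, channelId }),
  });
  if (!resp.ok) {
    throw new Error(`StreamToken failed: ${describeStreamTokenError(resp.status)}`);
  }
  return (await resp.json()) as StreamTokenResponse;
}
```

Given the limit of 5 requests per minute, it is worth caching the returned token and reusing it until it is close to expiry instead of requesting a new one per connection.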

Code Example

TypeScript (Bun / Node.js)

import { spawn } from "child_process";

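// 1. Get stream token
const resp = await fetch("https://api.argon.gl/api/bot/IVoice/v1/StreamToken", {
  method: "POST",
  headers: {
    "Authorization": "Bot YOUR_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ spaceId: "...", channelId: "..." }),
});
// Fail early on 400/403/404 instead of trying to parse an error body as a token
if (!resp.ok) throw new Error(`StreamToken request failed with status ${resp.status}`);
const { token, ingressUrl } = await resp.json();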

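// 2. Encode audio to Opus with ffmpeg (stereo, 48kHz)
// Note: ffmpeg's "opus" muxer writes an Ogg Opus stream, and stdout chunks do not
// line up with Opus packet boundaries. If the ingress requires one bare Opus packet
// per message, split the stream into packets before sending.
const ffmpeg = spawn("ffmpeg", [
  "-i", "music.mp3",
  "-f", "opus", "-ar", "48000", "-ac", "2",
  "-frame_duration", "20", "pipe:1",
]);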

// 3. Stream to ingress (stereo, 20ms frames); the token is URL-encoded for the query string
const wsUrl = `${ingressUrl}?token=${encodeURIComponent(token)}&stereo=true&frame_duration_ms=20&track_name=music`;
const ws = new WebSocket(wsUrl);

ws.addEventListener("open", () => {
  ffmpeg.stdout.on("data", (chunk) => {
    ws.send(chunk);
  });
});

ffmpeg.on("close", () => ws.close());

C# (.NET)

using System.Diagnostics;
using System.Net.Http.Json;
using System.Net.WebSockets;
using System.Text.Json; // JsonElement

// 1. Get stream token
using var http = new HttpClient();
http.DefaultRequestHeaders.Add("Authorization", "Bot YOUR_TOKEN");

var resp = await http.PostAsJsonAsync(
    "https://api.argon.gl/api/bot/IVoice/v1/StreamToken",
    new { spaceId = "...", channelId = "..." });
resp.EnsureSuccessStatusCode(); // throws on 400/403/404
var data = await resp.Content.ReadFromJsonAsync<JsonElement>();
var token = data.GetProperty("token").GetString()!;
var url = data.GetProperty("ingressUrl").GetString()!;

// 2. Encode audio with ffmpeg
var ffmpeg = Process.Start(new ProcessStartInfo {
    FileName = "ffmpeg",
    Arguments = "-i music.mp3 -f opus -ar 48000 -ac 2 -frame_duration 20 pipe:1",
    RedirectStandardOutput = true,
    UseShellExecute = false
})!;

// 3. Stream to ingress (the token is escaped for use in a query string)
using var ws = new ClientWebSocket();
await ws.ConnectAsync(
    new Uri($"{url}?token={Uri.EscapeDataString(token)}&stereo=true&frame_duration_ms=20&track_name=music"),
    CancellationToken.None);

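// 960 is only a read buffer size; reads from the pipe do not align with Opus packets
var buf = new byte[960];
int read;
while ((read = await ffmpeg.StandardOutput.BaseStream.ReadAsync(buf)) > 0)
    await ws.SendAsync(buf.AsMemory(0, read), WebSocketMessageType.Binary, true, CancellationToken.None);

// Close the connection cleanly once ffmpeg has finished
await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "done", CancellationToken.None);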

Use Cases

Music Bot

Stream audio from local files or your own media library. Encode to Opus and push to the channel.

TTS Bot

Convert text to speech (e.g., via Google TTS or Coqui), encode to Opus, and stream the result.

Live Audio Relay

Relay audio from an external source (radio stream, microphone, podcast feed) into the channel.

Audio Notifications

Play short audio clips for alerts, announcements, or sound effects triggered by events.

Next Steps

→ IVoice API Reference: endpoint details and types
→ Real-time Events: listen for VoiceJoin/VoiceLeave events
→ API Reference: all interfaces