Voice Streaming

Stream audio directly into voice channels via WebSocket. Music bots, TTS, live audio relay — no WebRTC needed.

How It Works

Argon bots don't use WebRTC. Instead, they stream audio directly to a WebSocket ingress endpoint. The server handles mixing and distribution to all participants in the voice channel.

[Diagram: Your Bot → (Opus over WebSocket) → SFU Ingress → (WebRTC) → Listeners]

Step-by-Step Flow

1. Request a Stream Token

Call the StreamToken endpoint with the target space and channel:

Request
POST /api/bot/IVoice/v1/StreamToken
Authorization: Bot YOUR_TOKEN
Content-Type: application/json

{
  "spaceId": "550e8400-e29b-41d4-a716-446655440000",
  "channelId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
}

Response
{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "ingressUrl": "ws://ingress.argon.gl:12880",
  "roomName": "550e8400-.../6ba7b810-..."
}
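
The response body can be checked before its fields are used. The following is a minimal sketch, assuming the three fields shown above are always present and are strings; `StreamTokenResponse` and `parseStreamTokenResponse` are hypothetical names, not part of any Argon SDK.

```typescript
// Hypothetical type mirroring the response fields shown above.
interface StreamTokenResponse {
  token: string;
  ingressUrl: string;
  roomName: string;
}

// Validate an untyped JSON body and narrow it to StreamTokenResponse.
// Throws if a field is missing or is not a non-empty string.
function parseStreamTokenResponse(body: unknown): StreamTokenResponse {
  if (typeof body !== "object" || body === null) {
    throw new Error("StreamToken response is not a JSON object");
  }
  const record = body as Record<string, unknown>;
  const read = (key: string): string => {
    const value = record[key];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`StreamToken response is missing "${key}"`);
    }
    return value;
  };
  return {
    token: read("token"),
    ingressUrl: read("ingressUrl"),
    roomName: read("roomName"),
  };
}
```

Failing early here gives a clearer error than a WebSocket connection attempt against an undefined URL.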

2. Connect to the WebSocket Ingress

Open a WebSocket connection to the ingressUrl with your token and audio configuration as query parameters:

// Full connection URL with all parameters
ws://ingress.argon.gl:12880?token=eyJhbGci...&stereo=false&frame_duration_ms=20&track_name=audio&track_source=microphone

Connection Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| token | string | (none) | Required. The JWT token from StreamToken. |
| stereo | boolean | false | Set to true for stereo audio (2 channels). Mono by default. |
| channels | number | 1 | Alternative to stereo. Set to 2 for stereo. |
| frame_duration_ms | number | 20 | Opus frame duration in milliseconds. Valid values: 10, 20, 40, 60. |
| track_name | string | "" | Display name for the audio track (visible to other participants). |
| track_source | string | "" | Track source type, e.g. microphone, screen_share_audio. |
| metadata | string | "" | Arbitrary metadata string attached to the track participant. |

Music bot (stereo, 20 ms frames):
ws://ingress.argon.gl:12880?token=TOKEN&stereo=true&frame_duration_ms=20&track_name=music&track_source=microphone
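
Building the connection URL by hand is easy to get wrong once values need escaping. The sketch below assembles the query string from the parameters listed above and rejects frame durations outside the documented set. `IngressOptions` and `buildIngressUrl` are hypothetical helper names introduced for illustration.

```typescript
// Hypothetical options object; each field maps to a query parameter above.
interface IngressOptions {
  token: string;
  stereo?: boolean;
  frameDurationMs?: number;
  trackName?: string;
  trackSource?: string;
  metadata?: string;
}

// Frame durations the parameter table lists as valid.
const VALID_FRAME_DURATIONS = [10, 20, 40, 60];

function buildIngressUrl(ingressUrl: string, opts: IngressOptions): string {
  const frameDurationMs = opts.frameDurationMs ?? 20;
  if (VALID_FRAME_DURATIONS.indexOf(frameDurationMs) === -1) {
    throw new RangeError(
      `frame_duration_ms must be one of ${VALID_FRAME_DURATIONS.join(", ")}`,
    );
  }
  // URLSearchParams percent-encodes values and keeps insertion order.
  const params = new URLSearchParams();
  params.set("token", opts.token);
  params.set("stereo", String(opts.stereo ?? false));
  params.set("frame_duration_ms", String(frameDurationMs));
  if (opts.trackName) params.set("track_name", opts.trackName);
  if (opts.trackSource) params.set("track_source", opts.trackSource);
  if (opts.metadata) params.set("metadata", opts.metadata);
  return `${ingressUrl}?${params.toString()}`;
}
```

For example, `buildIngressUrl("ws://ingress.argon.gl:12880", { token: "TOKEN", stereo: true, trackName: "music", trackSource: "microphone" })` produces the music bot URL shown above.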

3. Stream Opus Audio Frames

Send binary WebSocket frames containing Opus-encoded audio data. Each frame should be a single Opus packet.

Audio Requirements

| Parameter | Value |
|---|---|
| Codec | Opus |
| Sample Rate | 48,000 Hz |
| Channels | Mono (1) or Stereo (2), configured via the stereo or channels query parameter |
| Frame Duration | 10, 20 (default), 40, or 60 ms, configured via the frame_duration_ms query parameter |
| Transport | Binary WebSocket frames (one Opus packet per frame) |

Do not send raw PCM. The ingress expects Opus-encoded packets. Encode the audio before sending, using a tool such as opusenc or ffmpeg, or your language's Opus bindings.
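
The audio requirements above fix how much audio each Opus packet carries. The sketch below works out the per-frame numbers at 48,000 Hz. It assumes the encoder is fed 16-bit interleaved PCM, which is common for Opus bindings but is not stated here, and it assumes packets should be sent at roughly real-time pace. All function names are illustrative.

```typescript
const SAMPLE_RATE_HZ = 48000;

// Samples per channel contained in one Opus frame of the given duration.
function samplesPerFrame(frameDurationMs: number): number {
  return (SAMPLE_RATE_HZ * frameDurationMs) / 1000;
}

// Bytes of 16-bit interleaved PCM the encoder consumes per frame
// (assumption: 2 bytes per sample).
function pcmBytesPerFrame(frameDurationMs: number, channels: 1 | 2): number {
  return samplesPerFrame(frameDurationMs) * channels * 2;
}

// Offset from stream start, in milliseconds, at which the packet with the
// given index is due when sending at real-time pace (assumption).
function sendOffsetMs(packetIndex: number, frameDurationMs: number): number {
  return packetIndex * frameDurationMs;
}
```

At the default 20 ms frame duration this gives 960 samples per channel, so a stereo encoder consumes 3,840 bytes of 16-bit PCM per packet.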

Room & Token Details

| Property | Details |
|---|---|
| Room Name Format | {spaceId}/{channelId} |
| Token Type | LiveKit JWT (HS256) |
| Token TTL | 2 hours; request a new one before it expires |
| Bot Permissions | Publish audio/video, subscribe to tracks, send data, update own metadata |
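
Because the token expires, a long-running bot needs to know when to request a replacement. The sketch below reads the expiry from the token itself, assuming it carries the standard JWT `exp` claim (seconds since the Unix epoch). It does not verify the signature, and the five-minute safety margin is an arbitrary illustrative default. `jwtExpiryMs` and `refreshDelayMs` are hypothetical helpers.

```typescript
// Read the `exp` claim from a JWT and return it in milliseconds.
// This only decodes the payload; it does not verify the signature.
function jwtExpiryMs(token: string): number {
  const payloadPart = token.split(".")[1];
  if (!payloadPart) throw new Error("not a JWT");
  // Convert base64url to standard base64 and restore padding.
  const b64 = payloadPart.replace(/-/g, "+").replace(/_/g, "/");
  const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
  const payload = JSON.parse(atob(padded)) as { exp?: unknown };
  if (typeof payload.exp !== "number") throw new Error("JWT has no exp claim");
  return payload.exp * 1000;
}

// Milliseconds to wait before requesting a new token, leaving a safety
// margin before the current one expires. Never negative.
function refreshDelayMs(
  token: string,
  nowMs: number,
  marginMs: number = 5 * 60 * 1000,
): number {
  return Math.max(0, jwtExpiryMs(token) - nowMs - marginMs);
}
```

A bot could pass the result to `setTimeout` together with `Date.now()` as `nowMs`. If the token turns out not to include `exp`, a timer based on the documented 2-hour TTL would be the fallback.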

Prerequisites & Validation

The StreamToken endpoint validates the following before issuing a token:

  • The bot must be a member of the space — returns 403 otherwise.
  • The channel must exist — returns 404 if not found.
  • The channel must be a voice channel — returns 400 for text channels.
  • Rate limited to 5 requests per minute.
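
A client can handle the checks above explicitly. The sketch below maps the documented status codes to messages and keeps a bot under five requests per minute on its own side. Two assumptions are made: that rate-limited requests return HTTP 429, which is not stated here, and that a sliding 60-second window is an acceptable approximation of the server's accounting. Both names are hypothetical.

```typescript
// Translate a StreamToken failure status into a readable message.
function describeStreamTokenError(status: number): string {
  switch (status) {
    case 400: return "Target channel is not a voice channel";
    case 403: return "Bot is not a member of the space";
    case 404: return "Channel not found";
    case 429: return "Rate limit exceeded"; // assumed status code
    default: return `Unexpected status ${status}`;
  }
}

// Client-side sliding-window limiter: at most `limit` calls per `windowMs`.
class SlidingWindowLimiter {
  private stamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 5, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if one more request is allowed at nowMs.
  tryAcquire(nowMs: number): boolean {
    this.stamps = this.stamps.filter((t) => nowMs - t < this.windowMs);
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(nowMs);
    return true;
  }
}
```

Calling `tryAcquire(Date.now())` before each StreamToken request lets the bot back off locally instead of spending a request on a rejection.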

Code Example

TypeScript (Bun / Node.js)

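import { spawn } from "node:child_process";

// 1. Get stream token
const resp = await fetch("https://api.argon.gl/api/bot/IVoice/v1/StreamToken", {
  method: "POST",
  headers: {
    "Authorization": "Bot YOUR_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ spaceId: "...", channelId: "..." }),
});
if (!resp.ok) throw new Error(`StreamToken failed: HTTP ${resp.status}`);
const { token, ingressUrl } = await resp.json();

// 2. Encode audio to Opus with ffmpeg (stereo, 48 kHz, 20 ms frames)
// Note: "-f opus" writes an Ogg Opus stream, and stdout chunks are arbitrary
// byte ranges that do not align with Opus packet boundaries. If the ingress
// requires one Opus packet per message, split the stream into packets first.
const ffmpeg = spawn("ffmpeg", [
  "-i", "music.mp3",
  "-c:a", "libopus", "-ar", "48000", "-ac", "2",
  "-frame_duration", "20",
  "-f", "opus", "pipe:1",
]);

// 3. Stream to ingress (stereo, 20 ms frames)
const wsUrl = `${ingressUrl}?token=${encodeURIComponent(token)}&stereo=true&frame_duration_ms=20&track_name=music`;
const ws = new WebSocket(wsUrl);

// Start forwarding encoder output only once the socket is open
ws.addEventListener("open", () => {
  ffmpeg.stdout.on("data", (chunk) => {
    ws.send(chunk);
  });
});

// Close the socket when ffmpeg exits
ffmpeg.on("close", () => ws.close());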
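
ffmpeg's `-f opus` output is an Ogg Opus stream, and the chunks a pipe delivers do not line up with Opus packets. To send one Opus packet per WebSocket message, as the audio requirements describe, the Ogg pages can be parsed and the packets extracted. The sketch below does this incrementally. `OggOpusDemuxer` and `concat` are hypothetical names. It does not verify page checksums, and it assumes the first two packets are the OpusHead and OpusTags headers, as the Ogg Opus mapping specifies. Whether the ingress also accepts Ogg bytes directly is not stated here.

```typescript
// Join byte arrays into one contiguous Uint8Array.
function concat(parts: Uint8Array[]): Uint8Array {
  let total = 0;
  for (const p of parts) total += p.length;
  const out = new Uint8Array(total);
  let at = 0;
  for (const p of parts) {
    out.set(p, at);
    at += p.length;
  }
  return out;
}

// Incremental Ogg page parser that yields the audio packets of an Ogg Opus stream.
class OggOpusDemuxer {
  private pending: Uint8Array = new Uint8Array(0); // bytes not yet forming a whole page
  private partial: Uint8Array[] = [];              // segments of an unfinished packet
  private headersSeen = 0;                         // header packets skipped so far

  // Feed an arbitrary chunk; returns every audio packet completed by it.
  push(chunk: Uint8Array): Uint8Array[] {
    this.pending = concat([this.pending, chunk]);
    const packets: Uint8Array[] = [];
    for (;;) {
      const buf = this.pending;
      if (buf.length < 27) break; // the fixed page header is 27 bytes
      if (buf[0] !== 0x4f || buf[1] !== 0x67 || buf[2] !== 0x67 || buf[3] !== 0x53) {
        throw new Error("lost Ogg sync: expected 'OggS' capture pattern");
      }
      const segmentCount = buf[26];
      const headerLen = 27 + segmentCount;
      if (buf.length < headerLen) break; // segment table incomplete
      let bodyLen = 0;
      for (let i = 0; i < segmentCount; i++) bodyLen += buf[27 + i];
      if (buf.length < headerLen + bodyLen) break; // page body incomplete
      let offset = headerLen;
      for (let i = 0; i < segmentCount; i++) {
        const lacing = buf[27 + i];
        this.partial.push(buf.slice(offset, offset + lacing));
        offset += lacing;
        if (lacing < 255) {
          // A lacing value below 255 terminates the current packet.
          const packet = concat(this.partial);
          this.partial = [];
          if (this.headersSeen < 2) this.headersSeen++; // skip OpusHead, OpusTags
          else packets.push(packet);
        }
      }
      this.pending = buf.slice(offset); // keep any bytes of the next page
    }
    return packets;
  }
}
```

In the example above, the `data` handler would call `demuxer.push(chunk)` and send each returned packet with its own `ws.send(packet)` instead of forwarding the raw chunk.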

C# (.NET)

using System.Diagnostics;
using System.Net.Http.Json;
using System.Net.WebSockets;
using System.Text.Json;

// 1. Get stream token
using var http = new HttpClient();
http.DefaultRequestHeaders.Add("Authorization", "Bot YOUR_TOKEN");

var resp = await http.PostAsJsonAsync(
    "https://api.argon.gl/api/bot/IVoice/v1/StreamToken",
    new { spaceId = "...", channelId = "..." });
resp.EnsureSuccessStatusCode();
var data = await resp.Content.ReadFromJsonAsync<JsonElement>();
var token = data.GetProperty("token").GetString()!;
var url = data.GetProperty("ingressUrl").GetString()!;

// 2. Encode audio with ffmpeg
var ffmpeg = Process.Start(new ProcessStartInfo {
    FileName = "ffmpeg",
    Arguments = "-i music.mp3 -f opus -ar 48000 -ac 2 -frame_duration 20 pipe:1",
    RedirectStandardOutput = true,
    UseShellExecute = false
})!;

// 3. Stream to ingress
using var ws = new ClientWebSocket();
await ws.ConnectAsync(
    new Uri($"{url}?token={token}&stereo=true&frame_duration_ms=20&track_name=music"),
    CancellationToken.None);

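// Note: "-f opus" produces an Ogg Opus stream, and fixed-size reads do not align
// with Opus packet boundaries. If the ingress requires one Opus packet per
// message, split the stream into packets before sending.
var buf = new byte[960];
int read;
while ((read = await ffmpeg.StandardOutput.BaseStream.ReadAsync(buf)) > 0)
    await ws.SendAsync(buf.AsMemory(0, read), WebSocketMessageType.Binary, true, CancellationToken.None);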

Use Cases

Music Bot

Stream audio from local files or your own media library. Encode to Opus and push to the channel.

TTS Bot

Convert text to speech (e.g., via Google TTS or Coqui), encode to Opus, and stream the result.

Live Audio Relay

Relay audio from an external source (radio stream, microphone, podcast feed) into the channel.

Audio Notifications

Play short audio clips for alerts, announcements, or sound effects triggered by events.

Next Steps