
Web Developer | HumanBit main

full-time
Posted on August 18, 2025

Job Description

VoiceBot Web Implementation Doc

Server–Client Event Handling

This document explains how the VoiceBot component communicates with the server over a WebSocket using structured, event-based messages. It covers:

● Client-sent events (start, media)
● Server-sent events (media, mark, clear)
● Audio processing flow

Event Flow Summary

[Client]
 └─ Connect to WebSocket
 └─ Send 'start' event
 └─ Start recording microphone
 └─ Encode PCM (16-bit mono, S16LE) → Base64
 └─ Send 'media' chunks

[Server]
 └─ Receive 'media' chunks
 └─ Process + respond with 'media' (Base64)
 └─ Optionally send 'mark' or 'clear' events

[Client]
 └─ Receive events
 └─ 'media': decode + play
 └─ 'clear': stop playback

Client-Sent Events

1. start Event

Sent once after the WebSocket connection is established. Tells the server to begin a voice session.

    {
      "event": "start",
      "start": {
        "call_sid": "test",   // placeholder or session identifier (for server)
        "stream_sid": "test"  // stream identifier (for client)
      }
    }

Sent from: initializeWebSocket
Sent after: a 2-second delay following connection

2. media Event

Sent continuously while the microphone is recording: one message every 200 ms, each carrying 200 ms of audio.

    {
      "event": "media",
      "media": {
        "chunk": "inbound",          // direction of audio
        "payload": "<base64_PCM>",   // audio chunk (16-bit mono PCM encoded to Base64)
        "timestamp": 1721047849.123  // Unix timestamp
      }
    }

Server-Sent Events

1. media Event

Sent by the server as a response audio chunk, to be played by the client.

    {
      "event": "media",
      "media": {
        "payload": "<base64_PCM>"
      }
    }

Handled in: ws.onmessage
Decoded and played in: playAudioResponse()

2. clear Event

Instructs the client to stop audio playback and clear its existing buffer. The server sends this event when it detects a voice-based interruption; it expects the client to stop playing audio and to discard any older audio still held in the audio buffer.

    { "event": "clear" }

3. mark Event (Optional)

Typically used as metadata or a checkpoint from the server. The client plays the audio it has received up to the mark and returns the mark to the server once that playback has finished.

    {
      "event": "mark",
      "mark": {
        "type": "info",
        "value": "partial_result"
      }
    }

Timeout: the client times out if the WebSocket does not open within 10 seconds.

Developer Notes

● Make sure the server is expecting start and media messages.
● Send media events only after the client has sent the start event; any media event sent before start will be ignored or may fail.
● Audio must be 16-bit linear PCM, sampled at 8000 Hz.
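As an illustration of the client-sent events described above, the following TypeScript sketch builds a start message and turns microphone samples into a media message. The function and constant names (floatToS16LE, bytesToBase64, buildStartEvent, buildMediaEvent, SAMPLES_PER_CHUNK) are hypothetical and do not come from the VoiceBot source. It assumes the microphone delivers float samples in [-1, 1] that have already been resampled to 8000 Hz, and that the timestamp is expressed in seconds, as the example value suggests.

```typescript
// Assumed constants, derived from the notes above: 8000 Hz, 16-bit mono, 200 ms chunks.
const SAMPLE_RATE_HZ = 8000;
const CHUNK_MS = 200;
// 8000 samples/s * 0.2 s = 1600 samples, i.e. 3200 bytes of S16LE per media event.
const SAMPLES_PER_CHUNK = (SAMPLE_RATE_HZ * CHUNK_MS) / 1000;

interface StartEvent {
  event: "start";
  start: { call_sid: string; stream_sid: string };
}

interface MediaEvent {
  event: "media";
  media: { chunk: "inbound"; payload: string; timestamp: number };
}

// Convert float samples in [-1, 1] to signed 16-bit little-endian PCM bytes.
function floatToS16LE(samples: Float32Array): Uint8Array {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true); // true = little-endian
  }
  return new Uint8Array(view.buffer);
}

// Base64-encode raw bytes using the standard btoa global.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Build the one-off start message; the identifiers are placeholders.
function buildStartEvent(callSid: string, streamSid: string): StartEvent {
  return { event: "start", start: { call_sid: callSid, stream_sid: streamSid } };
}

// Build one media message from a chunk of samples.
// Assumption: timestamp is Unix time in seconds with a fractional part.
function buildMediaEvent(samples: Float32Array, nowMs: number = Date.now()): MediaEvent {
  return {
    event: "media",
    media: {
      chunk: "inbound",
      payload: bytesToBase64(floatToS16LE(samples)),
      timestamp: nowMs / 1000,
    },
  };
}
```

With an open socket, a chunk would then be sent as `ws.send(JSON.stringify(buildMediaEvent(chunk)))`, after `ws.send(JSON.stringify(buildStartEvent("test", "test")))` has gone out first.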
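The server-sent events can be handled with a small dispatch routine. The sketch below is one possible shape for it, not the actual VoiceBot code: PlaybackQueue, base64ToFloat32 and the injected send callback are hypothetical names. It assumes a mark is echoed back to the server unchanged once all audio queued before it has been handed to the player, and that a clear event discards pending marks together with pending audio; the document does not specify either detail.

```typescript
interface MarkEvent {
  event: "mark";
  mark: { type: string; value: string };
}

type ServerEvent =
  | { event: "media"; media: { payload: string } }
  | { event: "clear" }
  | MarkEvent;

// Decode a Base64 S16LE payload into float samples in [-1, 1).
function base64ToFloat32(payload: string): Float32Array {
  const binary = atob(payload);
  const view = new DataView(new ArrayBuffer(binary.length));
  for (let i = 0; i < binary.length; i++) {
    view.setUint8(i, binary.charCodeAt(i));
  }
  const samples = new Float32Array(Math.floor(binary.length / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true) / 0x8000; // true = little-endian
  }
  return samples;
}

// Holds decoded audio and marks in arrival order.
class PlaybackQueue {
  private items: Array<Float32Array | MarkEvent> = [];
  private send: (message: string) => void;

  constructor(send: (message: string) => void) {
    this.send = send;
  }

  // Intended to be called from ws.onmessage with the raw message text.
  handleMessage(raw: string): void {
    const msg = JSON.parse(raw) as ServerEvent;
    switch (msg.event) {
      case "media":
        this.items.push(base64ToFloat32(msg.media.payload));
        break;
      case "mark":
        this.items.push(msg);
        break;
      case "clear":
        // Assumption: drop everything queued, including marks.
        this.items.length = 0;
        break;
    }
  }

  // Call when the previous chunk has finished playing.
  // Marks reached on the way are returned to the server (assumed echo format).
  next(): Float32Array | undefined {
    while (this.items.length > 0) {
      const item = this.items.shift()!;
      if (item instanceof Float32Array) {
        return item;
      }
      this.send(JSON.stringify(item));
    }
    return undefined;
  }
}
```

In a browser, the chunk returned by `next()` would be copied into an audio buffer created at 8000 Hz and played; a real client would also need to stop the chunk that is currently playing when clear arrives, which this queue alone does not do.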
