Qubax exposes two audio endpoints: one for generating spoken audio from text (Text-to-Speech) and one for transcribing uploaded audio into text (Speech-to-Text). Both are OpenAI-compatible and accept your qbx_live_... key as a Bearer token.
Authorization: Bearer qbx_live_...

Generate spoken audio from input text. The response body is the raw audio bytes in the requested format (no JSON wrapper).

POST https://api.qubax.ai/v1/audio/speech

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | TTS model to use (e.g. tts-1). |
| input | string | Yes | The text to synthesize. Max 4,096 characters. |
| voice | string | No | Voice preset: alloy, echo, fable, onyx, nova, shimmer. Default alloy. |
| response_format | string | No | Output audio format: mp3, opus, aac, flac, wav, pcm. Default mp3. |

from pathlib import Path

from openai import OpenAI

client = OpenAI(
    api_key="qbx_live_...",
    base_url="https://api.qubax.ai/v1",
)

response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Hello world, this is Qubax text-to-speech.",
)

# The SDK exposes the raw bytes via .content
Path("speech.mp3").write_bytes(response.content)
print("Wrote speech.mp3")

curl https://api.qubax.ai/v1/audio/speech \
  -H "Authorization: Bearer qbx_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world, this is Qubax text-to-speech.",
    "voice": "alloy"
  }' \
  -o speech.mp3

Because the response is raw audio, point your HTTP client's output straight at a file (-o speech.mp3 in cURL, or write response.content to disk in Python).
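
The limits in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch, not part of any Qubax SDK: `build_speech_payload` is a hypothetical helper name, and the allowed values are copied from the table above.

```python
# Hypothetical helper: checks Text-to-Speech parameters against the limits
# listed in the parameter table and builds the JSON request body.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096


def build_speech_payload(text, model="tts-1", voice="alloy", response_format="mp3"):
    """Return a dict for POST /v1/audio/speech, or raise ValueError."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(
            f"input is {len(text)} characters; the limit is {MAX_INPUT_CHARS}"
        )
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }


payload = build_speech_payload("Hello world.", voice="nova", response_format="wav")
print(payload)
```

With the OpenAI Python SDK, the resulting dict can be unpacked into the call, as in `client.audio.speech.create(**payload)`. When a non-default `response_format` is requested, the output filename should use the matching extension.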
Transcribe an audio file into text. This endpoint accepts multipart/form-data with a file upload and returns a JSON object containing the transcript.

POST https://api.qubax.ai/v1/audio/transcriptions

| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | The audio file to transcribe. Max 25 MB. |
| model | string | Yes | Transcription model to use (e.g. whisper-1). |

| Supported formats |
|---|
| wav, mp3, mp4, m4a, ogg, webm, flac |

File uploads are capped at 25 MB. For longer recordings, split the file before uploading.

from openai import OpenAI
client = OpenAI(
    api_key="qbx_live_...",
    base_url="https://api.qubax.ai/v1",
)

# Open in binary mode so the SDK can send the file as multipart/form-data
with open("audio.mp3", "rb") as audio_file:
    result = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(result.text)

curl https://api.qubax.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer qbx_live_..." \
  -F "file=@audio.mp3" \
  -F "model=whisper-1"
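
The upload limits described above (25 MB, seven supported formats) can be checked locally before the file is sent. This is a rough sketch under stated assumptions: `check_upload` is a hypothetical helper, "25 MB" is interpreted as 25 × 1024 × 1024 bytes, and the format is inferred from the file extension alone, which may not match how the server identifies formats.

```python
from pathlib import Path

# Limits taken from the tables above; the helper name is hypothetical.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" means 25 MiB
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".mp4", ".m4a", ".ogg", ".webm", ".flac"}


def check_upload(path):
    """Return a list of problems likely to get the upload rejected."""
    path = Path(path)
    problems = []
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append("file does not exist")
    elif path.stat().st_size > MAX_UPLOAD_BYTES:
        size = path.stat().st_size
        problems.append(f"file is {size} bytes; the limit is {MAX_UPLOAD_BYTES}")
    return problems


# An empty list means no problem was found by these local checks.
print(check_upload("audio.mp3"))
```

Returning a list instead of raising on the first failure lets a caller report every problem at once, for example both a wrong extension and an oversized file.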
A successful transcription returns a JSON object with the full text:

{
  "text": "The quick brown fox jumps over the lazy dog."
}

Pass response_format: "verbose_json" in the form data to receive segments with per-phrase timestamps, or "srt" / "vtt" for caption-ready text.
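
As an illustration of working with per-phrase timestamps, the sketch below prints one line per segment. It assumes the verbose_json payload follows the OpenAI shape, with a top-level `segments` list whose items carry `start` and `end` (in seconds) and `text`. Those field names are an assumption, not confirmed by this page, and the sample payload is hand-written, not real API output.

```python
def format_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segment_lines(payload):
    """Yield one 'start --> end  text' line per segment in the payload."""
    # Assumed field names: "segments", "start", "end", "text".
    for segment in payload.get("segments", []):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        yield f"{start} --> {end}  {segment['text'].strip()}"


# Hand-written sample for illustration only; not real API output.
sample = {
    "text": "The quick brown fox jumps over the lazy dog.",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " The quick brown fox"},
        {"start": 1.5, "end": 3.25, "text": " jumps over the lazy dog."},
    ],
}

for line in segment_lines(sample):
    print(line)
```

If caption files are the goal, requesting "srt" or "vtt" directly is likely simpler than formatting segments by hand.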