Python SDK

Qubax is fully OpenAI-compatible, so you can use the official openai Python SDK. Point the client at the Qubax base URL, supply your API key, and every chat completion, streaming, and embedding call works unchanged.

Install

Install the OpenAI Python package from PyPI.

Shell
pip install openai

Configure

Create a client with the Qubax base URL and your API key. Your key starts with qbx_live_.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.qubax.ai/v1",
    api_key="qbx_live_your_api_key_here",
)

Note: You can also set the key and base URL via the environment variables OPENAI_API_KEY and OPENAI_BASE_URL.
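
As a minimal sketch of the environment-variable route, the helper below reads OPENAI_API_KEY and OPENAI_BASE_URL, falls back to the Qubax base URL shown above, and rejects keys that lack the qbx_live_ prefix. The names qubax_settings and make_client are hypothetical and not part of the openai package. The SDK picks up both variables on its own when OpenAI() is called with no arguments, so the helper only adds an early sanity check.

```python
import os

QUBAX_BASE_URL = "https://api.qubax.ai/v1"  # base URL from the Configure section


def qubax_settings(env=None):
    """Return client keyword arguments built from environment variables.

    Hypothetical helper for illustration; not part of the openai package.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key.startswith("qbx_live_"):
        raise ValueError("OPENAI_API_KEY is missing or does not start with qbx_live_")
    return {
        "api_key": api_key,
        "base_url": env.get("OPENAI_BASE_URL", QUBAX_BASE_URL),
    }


def make_client(env=None):
    # Imported lazily so qubax_settings can be used without the SDK installed.
    from openai import OpenAI

    return OpenAI(**qubax_settings(env))
```

With both variables exported in the shell, make_client() returns a client equivalent to the explicit configuration above.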

Chat completion

A standard non-streaming chat completion request.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.qubax.ai/v1",
    api_key="qbx_live_your_api_key_here",
)

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum entanglement in one sentence."},
    ],
)

print(response.choices[0].message.content)

Streaming

Stream tokens as they are generated by setting stream=True.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.qubax.ai/v1",
    api_key="qbx_live_your_api_key_here",
)

stream = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Write a haiku about the ocean."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        print(delta, end="", flush=True)
print()
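
The loop above prints each delta as it arrives. When the full reply is also needed afterwards, the deltas can be accumulated instead. The sketch below assumes only the chunk shape used above (choices[0].delta.content). collect_text and fake_chunk are hypothetical names, and the check for an empty choices list is a defensive assumption rather than documented Qubax behavior.

```python
from types import SimpleNamespace


def collect_text(chunks):
    """Join the text deltas from an iterable of streamed chat chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # defensive: skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta is not None:
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    # Stand-in with the same attribute shape as the chunks in the loop above.
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


demo = [fake_chunk("Waves "), fake_chunk(None), fake_chunk("fold.")]
print(collect_text(demo))  # prints: Waves fold.
```

Passing the stream object returned by client.chat.completions.create(..., stream=True) in place of demo should yield the complete reply as one string.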

Async

Use AsyncOpenAI for non-blocking workloads such as servers or concurrent fan-out.

Python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://api.qubax.ai/v1",
    api_key="qbx_live_your_api_key_here",
)

async def main():
    stream = await client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": "Say hello in three languages."}],
        stream=True,
    )

    async for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta is not None:
            print(delta, end="", flush=True)
    print()

# Call main() to create the coroutine object that asyncio.run expects.
asyncio.run(main())
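
For the concurrent fan-out case mentioned above, one possible pattern is asyncio.gather combined with a semaphore that caps the number of requests in flight. This is a sketch, not a Qubax recommendation: ask, fan_out, and the default limit of 4 are hypothetical choices, and client is assumed to be an AsyncOpenAI instance configured as shown earlier.

```python
import asyncio


async def ask(client, prompt, model="gpt-5"):
    """Send one prompt and return the reply text."""
    response = await client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


async def fan_out(client, prompts, limit=4):
    """Run prompts concurrently, at most `limit` at a time, preserving order."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(prompt):
        async with semaphore:  # wait here while `limit` requests are in flight
            return await ask(client, prompt)

    # gather returns results in the same order as the input prompts.
    return await asyncio.gather(*(bounded(p) for p in prompts))
```

A call such as asyncio.run(fan_out(client, ["Say hello.", "Say goodbye."])) would return the replies as a list in prompt order.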

Error handling

Catch the standard OpenAI exception types. Insufficient credits raise a RateLimitError; an invalid key or model raises AuthenticationError or BadRequestError.

Python
from openai import OpenAI, APIError, RateLimitError, AuthenticationError

client = OpenAI(
    base_url="https://api.qubax.ai/v1",
    api_key="qbx_live_your_api_key_here",
)

try:
    response = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

except AuthenticationError:
    print("Invalid API key. Check that it starts with qbx_live_.")

except RateLimitError:
    print("Out of credits or rate limited. Top up at app.qubax.ai.")

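except APIError as e:
    # Connection-level errors carry no HTTP status, so read status_code defensively.
    status = getattr(e, "status_code", "n/a")
    print(f"Qubax API error ({status}): {e.message}")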
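
When a RateLimitError reflects temporary throttling rather than an empty balance, retrying after a pause may succeed. The sketch below shows a generic exponential backoff wrapper. with_backoff and its parameters are hypothetical, and it assumes the wrapped call is safe to repeat. Retrying does not help when the account is out of credits. The SDK also retries some failures on its own through the client's max_retries option.

```python
import random
import time


def with_backoff(call, retry_on, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `call()`; on `retry_on` errors wait and retry, doubling the delay."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

With the client above, usage might look like with_backoff(lambda: client.chat.completions.create(model="gpt-5", messages=[{"role": "user", "content": "Hello!"}]), retry_on=RateLimitError).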