OpenAI-Compatible Inference

Prepaid Llama API on owned GPUs

Both Hemispheres provides an OpenAI-compatible inference endpoint for builders who want prepaid credits, simple billing, and a direct support channel rather than a heavyweight cloud procurement process.

Who It Is For

Useful for teams that need a second provider or a faster buying path

Indie builders

Ship quickly with a familiar `/v1/models` and `/v1/chat/completions` interface and a prepaid balance instead of a long vendor setup.

Internal tooling teams

Use the API for prototypes, assistants, or reasoning-heavy workflows without committing immediately to a larger contract.

Agencies and consultants

Add a backup LLM provider or a dedicated workload lane when your existing stack is too rigid or too slow to support a client deadline.

API Example

Drop into an OpenAI-style client or use curl

```shell
curl https://probqa.com/v1/chat/completions \
  -H "Authorization: Bearer cgs_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "messages": [
      {"role": "system", "content": "Be concise and useful."},
      {"role": "user", "content": "Draft a launch plan for a GPU API business."}
    ],
    "max_tokens": 180
  }'
```
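The same request works from any HTTP client. Here is a minimal Python sketch, using only the standard library, that builds the request the curl command above sends. The base URL, key prefix, and model name are taken from this page; the helper name `build_chat_request` is just for illustration.

```python
import json
import urllib.request

API_KEY = "cgs_live_your_key"   # placeholder key, as in the curl example
BASE_URL = "https://probqa.com/v1"

def build_chat_request(model, messages, max_tokens=180):
    """Build the same POST /v1/chat/completions request the curl example sends."""
    payload = json.dumps(
        {"model": model, "messages": messages, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    [
        {"role": "system", "content": "Be concise and useful."},
        {"role": "user", "content": "Draft a launch plan for a GPU API business."},
    ],
)
```

Sending it is one call, `urllib.request.urlopen(req)`; parse the JSON body and read `choices[0]["message"]["content"]`, as with any OpenAI-compatible API.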

Why Buyers Choose It

Transparent usage tracking, prepaid balances, and a direct line to the operators who run the GPUs

Next Step

Get trial credits and run a real completion

Create an account, top up only when you are ready, and test the browser playground or the API directly. If you need a custom rollout path, contact support below.
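A natural first call once you have trial credits is listing the models your key can use. The sketch below assumes the endpoint returns the standard OpenAI-style list shape (`{"object": "list", "data": [...]}`); the URL and key placeholder come from this page.

```python
import json
import urllib.request

API_KEY = "cgs_live_your_key"  # trial-credit key placeholder from this page

# GET /v1/models with the same bearer-token auth as chat completions.
req = urllib.request.Request(
    "https://probqa.com/v1/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)

def model_ids(raw_body):
    """Pull model ids out of an OpenAI-style /v1/models response body."""
    return [m["id"] for m in json.loads(raw_body)["data"]]
```

To run it: `with urllib.request.urlopen(req) as resp: print(model_ids(resp.read()))`.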