Benchmark

LLM latency baseline on March 30, 2026

This page records a short public request against the OpenAI-compatible chat endpoint on `probqa.com`. The purpose is to show the live request path, token accounting, and end-to-end latency for a small completion, not to claim maximum throughput.

Request

Public API call used for the baseline

curl https://probqa.com/v1/chat/completions \
  -H "Authorization: Bearer cgs_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "messages": [
      {"role": "system", "content": "Be concise and useful."},
      {"role": "user", "content": "Give a one-sentence description of an online SAT solver."}
    ],
    "max_tokens": 96
  }'
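If you would rather drive the same request from Python, here is a minimal stdlib-only sketch. The base URL, model, headers, and placeholder key are taken from the curl example above; the timing helper is an assumption about how you might capture end-to-end latency yourself, not a description of how the numbers on this page were measured.

```python
import json
import time
import urllib.request

BASE_URL = "https://probqa.com/v1/chat/completions"  # same endpoint as the curl above
API_KEY = "cgs_live_your_key"  # placeholder key, as in the curl example


def build_payload(user_prompt, max_tokens=96):
    """Mirror the request body from the curl example."""
    return {
        "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
        "messages": [
            {"role": "system", "content": "Be concise and useful."},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": max_tokens,
    }


def timed_completion(payload):
    """POST the payload and return (parsed response JSON, end-to-end seconds)."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body, time.perf_counter() - start
```

With a real key, `timed_completion(build_payload("Give a one-sentence description of an online SAT solver."))` reproduces the request above and reports wall-clock latency for the full round trip.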

Result

Measured output

Transport

HTTP status: `200`
Response size: `778 bytes`

Usage

Prompt tokens: `32`
Completion tokens: `38`
Total tokens: `70`
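These figures are read directly from the `usage` object in the response body. Assuming the endpoint follows the OpenAI chat-completions response schema (as the curl example implies), a small helper to pull them out might look like:

```python
def summarize_usage(response):
    """Extract token accounting from an OpenAI-compatible chat response.

    Expects the standard `usage` object with prompt_tokens,
    completion_tokens, and total_tokens fields.
    """
    u = response["usage"]
    # Sanity check: total should equal prompt + completion.
    assert u["prompt_tokens"] + u["completion_tokens"] == u["total_tokens"]
    return u["prompt_tokens"], u["completion_tokens"], u["total_tokens"]
```

For the response recorded on this page, this returns `(32, 38, 70)`.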

Preview

“An online SAT (Satisfiability) solver is a web-based tool that takes a Boolean satisfiability problem as input and returns a solution or determines that the problem is unsatisfiable.”

Interpretation

Why this matters to buyers

The request completed with HTTP `200`, and the 38-token completion came in well under the 96-token budget, with token accounting reported in the standard `usage` object. Because the endpoint accepts the unmodified OpenAI chat-completions request shape, existing clients and SDKs should only need a base URL and API key change to run the same workload, with no custom client code.

Next Step

Use the same curl shape on your own workload

Trial credits let you test the exact OpenAI-compatible surface with your own prompts, your own system messages, and your own token budgets.