Pricing

Pay only for what you use.

No subscriptions. No seats. No minimums. Get an API key, add credits, and pay per million tokens — starting at $0.40 in / $0.80 out per 1M for Zoraxe. Up to 25× cheaper than frontier US vendors.

Base feeNone
UsagePer 1M tokens, by model
MinimumsNone
BillingPrepaid credits · USD
How pricing works

Simple. Pure consumption.

Add credits to your account, make API calls, pay only for what you use. Priced per million tokens — separately for input and output. No subscriptions, no seats, no idle charges.

1 · Get an API key

Sign up, add prepaid credits, and receive your private API key. Point your SDK at api.zoraxe.ai/v1 and you're live.

2 · Call any model

Choose from 10 models — Zoraxe, GLM, DeepSeek, Qwen, Kimi. Switch per request. Input and output billed separately at the per-model rates in the table below.

3 · Pay as you go

Usage is deducted from your credit balance in real time. Set per-key spend caps to stay in control. Top up anytime. Volume discounts kick in automatically at scale.

Typical spend: $5–$50/month for light prototyping, $100–$500/month for active production apps. Need unlimited usage on your own infrastructure? See enterprise pricing ↓

Per-model usage rates · USD per 1M tokens

Every model, on one private endpoint.

All models run on the same sovereign Zoraxe infrastructure — your data never leaves. Pricing is per million tokens, charged separately for input (your prompt) and output (the response). No base fee on top.

Model Alias Input Output
— Zoraxe native models
— GLM family
GLM 5.1
Long context (1M tokens), strong reasoning.
glm-5.1 $1.68/1M $5.28/1M
GLM 5
Flagship GLM generation; prior version of 5.1.
glm-5 $1.68/1M $5.28/1M
GLM 4.7
Mid-tier GLM; excellent quality-to-price.
glm-4.7 $0.72/1M $2.64/1M
— DeepSeek family
DeepSeek V3.1
Strong generalist with competitive throughput.
deepseek-v3.1 $0.67/1M $2.02/1M
DeepSeek V3.2
Latest DeepSeek; sharper reasoning, same pricing.
deepseek-v3.2 $0.67/1M $2.02/1M
— Qwen & Kimi
Qwen3 Plus
High-quality multilingual with cheap input.
qwen3-plus $0.60/1M $3.60/1M
Kimi K2.5
Moonshot custom variant; solid generalist.
kimi-k2.5-custom $0.72/1M $3.60/1M
Kimi K2.6
Latest Kimi; stronger reasoning and tool-use.
kimi-k2.6-custom $1.14/1M $4.80/1M

All prices in USD per 1,000,000 tokens. Input tokens are counted from your prompt (system + user messages + function schemas). Output tokens are counted from the model's response. Embeddings and fine-tuning priced separately — contact us.

Enterprise

Private deployments: flat platform fee.

If you need Zoraxe deployed in your own VPC or on-prem — with the API, Chat, and Code running on your infrastructure and your hardware — pricing moves to a flat annual platform fee plus an implementation engagement. No per-token markup. You own the GPU capacity.

  • Flat annual platform fee (site license)
  • Implementation engagement (2–6 weeks)
  • Managed support with SLA & runbooks
  • Air-gapped and sovereign-region available
  • BAA, DPA, SOC 2 report, security questionnaires

What's included

Unlimited API usage
On your own GPU capacity
✓ Included
Chat + Code + Automate
All products, all users
✓ Included
Integrations & MCP
120+ native connectors
✓ Included
Managed upgrades
On your maintenance schedule
✓ Included
99.9% SLA
With support tiers
✓ Included
FAQ

Common pricing questions.

What's a token, in plain English?+
A token is roughly 3-4 characters of English. One million tokens is approximately 750,000 words — about a 3,000-page book. For Zoraxe at $0.40 input / $0.80 output per million tokens, a typical back-and-forth conversation costs fractions of a cent.
Why are Zoraxe and Zoraxe Coder so much cheaper than everything else?+
Zoraxe runs on open-weight models we've fine-tuned and self-host on our own GPU capacity. We don't pay third-party licensing fees, and we don't mark up API costs. You get the savings.
What do I get access to with an API key?+
Your API key unlocks Zoraxe Chat (the web app), Zoraxe Coding (the backend for Cursor, VS Code, JetBrains), and the full OpenAI-compatible API. All 10 models in the table above are available on the same endpoint. Standard rate limits (10 RPM · 100K TPM) and email support included.
Is there a minimum spend or commitment?+
No. Pure pay-as-you-go — add credits, use them, top up when needed. No minimum spend, no monthly fee, no annual contract. If you don't make a single API call, you pay nothing.
Do unused credits expire?+
No expiry on prepaid credits. They sit in your balance until you use them. You can set per-key spend caps and daily limits to stay in full control of your costs.
Are there volume discounts?+
Yes. At ~$500/month in usage, discounted rates apply automatically. Above ~$5K/month, a private deployment with flat platform pricing is usually more cost-effective — talk to us.
Do you charge for embeddings or function calls?+
Embeddings are billed at $0.05 per 1M tokens on our Zoraxe embedding model. Function calling is billed as part of the regular chat completion — your function schema counts as input tokens, the tool-call response counts as output.
Can we run this in our own VPC?+
Yes — see the enterprise section above. Private deployments move to a flat annual platform fee. All models above are available on your own infrastructure. Air-gapped deployments available for defense and regulated customers.
How do I pay?+
Prepaid credits via credit card for self-serve customers. Monthly invoicing with net-30 terms for organizations on annual contracts. Payments in USD, CAD, or EUR.
Get started

Start building. Pay only for what you use.

Get an API key, add credits, and point your SDK at our endpoint. You'll be making calls in minutes — no subscriptions, no surprises on the bill.