Billing Description
Understanding QCode.cc's billing methods, pricing data sources, and fee calculation logic
This document provides a detailed introduction to QCode.cc's billing methods and pricing logic, helping you understand how costs are calculated.
Billing Principles¶
QCode.cc bills based on token usage. Each time you call an AI model, the cost consists of two parts:
- Input Tokens: The content you send to the model, including prompts, context, file content, etc.
- Output Tokens: The response content generated by the model.
What is a Token? A token is the basic unit of text processed by the model. In English, 1 token is approximately 4 characters or ¾ of a word; in Chinese, 1 character typically corresponds to 1-2 tokens.
Cost calculation formula:
Total Cost = Input Tokens × Input Unit Price + Output Tokens × Output Unit Price
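The formula can be sketched in a few lines of Python. The unit prices below are the claude-sonnet figures from the pricing table in this document; the function name and structure are illustrative, not part of any QCode.cc API.

```python
# Per-million-token prices (USD), taken from the claude-sonnet row
# of the pricing table in this document.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

def total_cost(input_tokens: int, output_tokens: int) -> float:
    """Total Cost = Input Tokens x Input Unit Price + Output Tokens x Output Unit Price."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

print(round(total_cost(5_000, 2_000), 3))  # 0.045
```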
Pricing Data Source¶
Why not use official prices directly?¶
Prices publicly listed by model providers like Anthropic and OpenAI are for API calls, but providers do not fully disclose their internal token-counting rules (for example: whether system prompts are included, how tool-use tokens are counted, how cache hits are priced, etc.). This causes discrepancies between estimates based on official prices and actual bills.
LiteLLM Open Source Pricing Table¶
To ensure transparency and fairness in billing, QCode.cc uses the model pricing table maintained by LiteLLM, a widely recognized open-source project in the industry, as the billing benchmark:
Data source address: github.com/BerriAI/litellm/model_prices_and_context_window.json
Why choose LiteLLM?
- Industry Standard: LiteLLM is one of the most popular LLM API proxy gateways, used by thousands of companies and developers.
- Community Maintained: Pricing data is continuously maintained and verified by the open-source community, ensuring accuracy.
- Comprehensive Coverage: Covers all models from major providers like Anthropic, OpenAI, and Google.
- Open and Transparent: All data is publicly available on GitHub, accessible for anyone to view and verify.
- Timely Updates: When model providers adjust prices, the community updates pricing data promptly.
Main Model Pricing Reference¶
Pricing for commonly used models (unit: USD / million tokens):
Claude Series (Anthropic)¶
| Model | Input Price | Output Price | Cache Write | Cache Read |
|---|---|---|---|---|
| claude-opus-4-6 | $5.00 | $25.00 | $6.25 | $0.50 |
| claude-sonnet-4-6 | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-opus-4-5-20251101 | $5.00 | $25.00 | $6.25 | $0.50 |
| claude-sonnet-4-5-20250929 | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-haiku-4-5-20251001 | $1.00 | $5.00 | $1.25 | $0.10 |
GPT / Codex Series (OpenAI)¶
| Model | Input Price | Output Price | Cache Read |
|---|---|---|---|
| gpt-5.4 | $2.00 | $16.00 | $0.20 |
| gpt-5.4-pro (gpt-5.4 Pro) | $2.00 | $16.00 | $0.20 |
| gpt-5.4-codex | $2.00 | $16.00 | $0.20 |
| gpt-5.3-codex-spark | $1.75 | $14.00 | $0.175 |
| gpt-5.3-codex | $1.75 | $14.00 | $0.175 |
Note: The above prices come from the LiteLLM pricing table and may change with provider price adjustments. For the latest prices, please refer to the LiteLLM data source. The gpt-5.4 series is the latest generation model.
About Cache Pricing¶
Some models (such as the Claude series) support Prompt Caching, which caches repeated context content. Cache-related pricing:
- Cache Write: The cost of writing content to cache for the first time, usually slightly higher than regular input prices.
- Cache Read: The cost when hitting the cache, usually around 10% of regular input prices.
The caching mechanism can significantly reduce usage costs in scenarios with repeated context.
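To make the savings concrete, here is a small sketch of cache-aware cost calculation using the claude-sonnet prices from the table above. The `PRICES` dict keys and the `cost` helper are illustrative names, not part of any real billing API, and the token counts are made up.

```python
# Per-million-token prices (USD) for claude-sonnet, from the table above.
PRICES = {"input": 3.00, "output": 15.00, "cache_write": 3.75, "cache_read": 0.30}

def cost(tokens: dict) -> float:
    """Sum each token count (keyed like PRICES) times its per-million price."""
    return sum(tokens.get(k, 0) * p for k, p in PRICES.items()) / 1_000_000

# First call: a 10k-token context is written to cache, plus fresh input/output.
first = cost({"cache_write": 10_000, "input": 1_000, "output": 500})
# Follow-up call: the same 10k tokens hit the cache at the cheap read rate.
second = cost({"cache_read": 10_000, "input": 1_000, "output": 500})
print(first, second)  # 0.048 0.0135
```

The follow-up call costs roughly a third of the first one, because the 10k cached tokens are billed at 10% of the regular input price.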
Billing Examples¶
Assume you use claude-sonnet-4-5-20250929 for a code Q&A session:
| Item | Quantity | Unit Price | Cost |
|---|---|---|---|
| Input tokens | 5,000 | $3.00 / million | $0.015 |
| Output tokens | 2,000 | $15.00 / million | $0.030 |
| Total | 7,000 | | $0.045 |
In actual usage, a complete interaction with Claude Code usually involves multiple API calls (analyzing code, generating solutions, executing operations, etc.), so the actual cost will be higher than a single call.
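A session's total is simply the sum of its per-call costs. The sketch below illustrates this with invented token counts for a hypothetical three-call interaction, again using the claude-sonnet per-million prices.

```python
# Per-million-token prices (USD) for claude-sonnet, from the table above.
IN_PRICE, OUT_PRICE = 3.00, 15.00

# Hypothetical (input_tokens, output_tokens) for each call in one session.
calls = [
    (4_000, 800),    # analyze code
    (6_500, 1_200),  # generate a solution
    (2_000, 300),    # apply the edit
]

# The session is billed as the sum of its individual API calls.
session_cost = sum(i * IN_PRICE + o * OUT_PRICE for i, o in calls) / 1_000_000
print(f"${session_cost:.4f}")  # $0.0720
```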
Price Update Mechanism¶
- QCode.cc regularly syncs with the latest pricing data from LiteLLM.
- When a model provider announces price adjustments, the LiteLLM community updates the data source promptly, and we sync accordingly.
- Price updates do not affect historical charges that have already been incurred; they only affect new usage after the update.
How to View Usage¶
Dashboard (Recommended)¶
Log in to the QCode.cc Console and navigate to the "Usage Statistics" page to view:
- Model Call Details: Model, token count, and cost for each call.
- Cost Summary: Daily and monthly cost statistics.
- Plan Consumption Progress: Current subscription plan quota usage.
Viewing in CLI¶
Use the /cost command in Claude Code to quickly view the usage overview for the current session:
/cost
Tip: The cost shown by /cost is an approximate value. For accurate data, please refer to the Dashboard.
FAQ¶
Is your price the same as the official price?¶
Our token unit prices come directly from the LiteLLM open-source pricing table and are consistent with the API prices publicly listed by each provider. The main difference lies in the token counting method: providers' internal token counting rules are not fully disclosed, so there may be minor discrepancies with the /cost estimate in the CLI.
How often is pricing data updated?¶
We sync with the LiteLLM data source regularly. Updates are typically completed within a few days after a provider announces price adjustments.
How can I verify the prices myself?¶
You can directly view the LiteLLM pricing data source:
- Search for the model name you use (e.g., claude-sonnet-4-5-20250929).
- Check the input_cost_per_token and output_cost_per_token fields.
- Multiply the per-token price by 1,000,000 to get the price per million tokens.
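The verification steps above can be sketched offline. In practice you would download model_prices_and_context_window.json from the LiteLLM repository; the sample entry below just mirrors its per-token field names so the conversion is reproducible without a network call.

```python
import json

# Offline sketch: a sample entry shaped like LiteLLM's pricing JSON
# (per-token prices, which we convert to per-million-token prices).
sample = json.loads("""{
  "claude-sonnet-4-5-20250929": {
    "input_cost_per_token": 3e-06,
    "output_cost_per_token": 1.5e-05
  }
}""")

entry = sample["claude-sonnet-4-5-20250929"]
per_million_in = entry["input_cost_per_token"] * 1_000_000
per_million_out = entry["output_cost_per_token"] * 1_000_000
print(round(per_million_in, 2), round(per_million_out, 2))  # 3.0 15.0
```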
Why choose a third-party pricing table instead of custom prices?¶
Choosing an open, transparent third-party data source ensures fairness. LiteLLM's pricing table is maintained by the community, and anyone can review and verify it, avoiding pricing disputes.