Billing Description
Understanding QCode.cc's billing methods, pricing data sources, and fee calculation logic
This document provides a detailed introduction to QCode.cc's billing methods and pricing logic, helping you understand how costs are calculated.
Billing Principles¶
QCode.cc bills based on token usage. Each time you call an AI model, the cost consists of two parts:
- Input Tokens: The content you send to the model, including prompts, context, file content, etc.
- Output Tokens: The response content generated by the model.
What is a Token? A token is the basic unit of text processed by the model. In English, 1 token is approximately 4 characters or ¾ of a word; in Chinese, 1 character typically corresponds to 1-2 tokens.
Cost calculation formula:
Total Cost = Input Tokens × Input Unit Price + Output Tokens × Output Unit Price
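The formula can be sketched in a few lines of Python. The unit prices below are the claude-sonnet figures from the pricing table in this document; the function name and structure are illustrative, not part of any QCode.cc API.

```python
# Per-million-token prices (USD), taken from the claude-sonnet row
# of the pricing table in this document.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

def total_cost(input_tokens: int, output_tokens: int) -> float:
    """Total Cost = Input Tokens x Input Unit Price + Output Tokens x Output Unit Price."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

print(round(total_cost(5_000, 2_000), 3))  # 0.045
```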
Pricing Data Source¶
Why not use official prices directly?¶
Prices publicly listed by model providers like Anthropic and OpenAI are for API calls, but providers do not fully disclose their internal token-counting rules (for example: whether system prompts are included, how tool-use tokens are counted, how cache hits are priced, etc.). This causes discrepancies between estimates based on official prices and actual bills.
LiteLLM Open Source Pricing Table¶
To ensure transparency and fairness in billing, QCode.cc uses the model pricing table maintained by LiteLLM, a widely recognized open-source project in the industry, as the billing benchmark:
Data source address: github.com/BerriAI/litellm/model_prices_and_context_window.json
Why choose LiteLLM?
- Industry Standard: LiteLLM is one of the most popular LLM API proxy gateways, used by thousands of companies and developers.
- Community Maintained: Pricing data is continuously maintained and verified by the open-source community, ensuring accuracy.
- Comprehensive Coverage: Covers all models from major providers like Anthropic, OpenAI, and Google.
- Open and Transparent: All data is publicly available on GitHub, accessible for anyone to view and verify.
- Timely Updates: When model providers adjust prices, the community updates pricing data promptly.
Main Model Pricing Reference¶
Pricing for commonly used models (unit: USD / million tokens):
Claude Series (Anthropic)¶
| Model | Input Price | Output Price | Cache Write | Cache Read |
|---|---|---|---|---|
| claude-opus-4-6 | $5.00 | $25.00 | $6.25 | $0.50 |
| claude-sonnet-4-6 | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-opus-4-5-20251101 | $5.00 | $25.00 | $6.25 | $0.50 |
| claude-sonnet-4-5-20250929 | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-haiku-4-5-20251001 | $1.00 | $5.00 | $1.25 | $0.10 |
GPT / Codex Series (OpenAI)¶
| Model | Input Price | Output Price | Cache Read |
|---|---|---|---|
| gpt-5.4 | $2.00 | $16.00 | $0.20 |
| gpt-5.4-pro (gpt-5.4 Pro) | $2.00 | $16.00 | $0.20 |
| gpt-5.4-codex | $2.00 | $16.00 | $0.20 |
| gpt-5.3-codex-spark | $1.75 | $14.00 | $0.175 |
| gpt-5.3-codex | $1.75 | $14.00 | $0.175 |
Note: The above prices come from the LiteLLM pricing table and may change with provider price adjustments. For the latest prices, please refer to the LiteLLM data source. The gpt-5.4 series is the latest generation model.
About Cache Pricing¶
Some models (such as the Claude series) support Prompt Caching, which caches repeated context content. Cache-related pricing:
- Cache Write: The cost of writing content to cache for the first time, usually slightly higher than regular input prices.
- Cache Read: The cost when hitting the cache, usually around 10% of regular input prices.
The caching mechanism can significantly reduce usage costs in scenarios with repeated context.
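To make the savings concrete, here is a small sketch of cache-aware cost calculation using the claude-sonnet prices from the table above. The `PRICES` dict keys and the `cost` helper are illustrative names, not part of any real billing API, and the token counts are made up.

```python
# Per-million-token prices (USD) for claude-sonnet, from the table above.
PRICES = {"input": 3.00, "output": 15.00, "cache_write": 3.75, "cache_read": 0.30}

def cost(tokens: dict) -> float:
    """Sum each token count (keyed like PRICES) times its per-million price."""
    return sum(tokens.get(k, 0) * p for k, p in PRICES.items()) / 1_000_000

# First call: a 10k-token context is written to cache, plus fresh input/output.
first = cost({"cache_write": 10_000, "input": 1_000, "output": 500})
# Follow-up call: the same 10k tokens hit the cache at the cheap read rate.
second = cost({"cache_read": 10_000, "input": 1_000, "output": 500})
print(first, second)  # 0.048 0.0135
```

The follow-up call costs roughly a third of the first one, because the 10k cached tokens are billed at 10% of the regular input price.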
Billing Examples¶
Assume you use claude-sonnet-4-5-20250929 for a code Q&A session:
| Item | Quantity | Unit Price | Cost |
|---|---|---|---|
| Input tokens | 5,000 | $3.00 / million | $0.015 |
| Output tokens | 2,000 | $15.00 / million | $0.030 |
| Total | 7,000 | | $0.045 |
In actual usage, a complete interaction with Claude Code usually involves multiple API calls (analyzing code, generating solutions, executing operations, etc.), so the actual cost will be higher than a single call.
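A session's total is simply the sum of its per-call costs. The sketch below illustrates this with invented token counts for a hypothetical three-call interaction, again using the claude-sonnet per-million prices.

```python
# Per-million-token prices (USD) for claude-sonnet, from the table above.
IN_PRICE, OUT_PRICE = 3.00, 15.00

# Hypothetical (input_tokens, output_tokens) for each call in one session.
calls = [
    (4_000, 800),    # analyze code
    (6_500, 1_200),  # generate a solution
    (2_000, 300),    # apply the edit
]

# The session is billed as the sum of its individual API calls.
session_cost = sum(i * IN_PRICE + o * OUT_PRICE for i, o in calls) / 1_000_000
print(f"${session_cost:.4f}")  # $0.0720
```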
Price Update Mechanism¶
- QCode.cc regularly syncs with the latest pricing data from LiteLLM.
- When a model provider announces price adjustments, the LiteLLM community updates the data source promptly, and we sync accordingly.
- Price updates do not affect historical charges that have already been incurred; they only affect new usage after the update.
How to View Usage¶
Dashboard (Recommended)¶
Log in to the QCode.cc Console and navigate to the "Usage Statistics" page to view:
- Model Call Details: Model, token count, and cost for each call.
- Cost Summary: Daily and monthly cost statistics.
- Plan Consumption Progress: Current subscription plan quota usage.
Viewing in CLI¶
Use the /cost command in Claude Code to quickly view the usage overview for the current session:
/cost
Tip: The cost shown by /cost is an approximate value. For accurate data, please refer to the Dashboard.
FAQ¶
Is your price the same as the official price?¶
Our token unit prices come directly from the LiteLLM open-source pricing table and are consistent with the API prices publicly listed by each provider. The main difference lies in the token counting method: providers' internal token counting rules are not fully disclosed, so there may be minor discrepancies with the /cost estimate in the CLI.
How often is pricing data updated?¶
We sync with the LiteLLM data source regularly. Updates are typically completed within a few days after a provider announces price adjustments.
How can I verify the prices myself?¶
You can directly view the LiteLLM pricing data source:
- Search for the model name you use (e.g., claude-sonnet-4-5-20250929).
- Check the input_cost_per_token and output_cost_per_token fields.
- Multiply the per-token price by 1,000,000 to get the price per million tokens.
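The verification steps above can be sketched offline. In practice you would download model_prices_and_context_window.json from the LiteLLM repository; the sample entry below just mirrors its per-token field names so the conversion is reproducible without a network call.

```python
import json

# Offline sketch: a sample entry shaped like LiteLLM's pricing JSON
# (per-token prices, which we convert to per-million-token prices).
sample = json.loads("""{
  "claude-sonnet-4-5-20250929": {
    "input_cost_per_token": 3e-06,
    "output_cost_per_token": 1.5e-05
  }
}""")

entry = sample["claude-sonnet-4-5-20250929"]
per_million_in = entry["input_cost_per_token"] * 1_000_000
per_million_out = entry["output_cost_per_token"] * 1_000_000
print(round(per_million_in, 2), round(per_million_out, 2))  # 3.0 15.0
```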
Why choose a third-party pricing table instead of custom prices?¶
Choosing an open, transparent third-party data source ensures fairness. LiteLLM's pricing table is maintained by the community, and anyone can review and verify it, avoiding pricing disputes.