Billing Description

Understanding QCode.cc's billing methods, pricing data sources, and fee calculation logic

This document provides a detailed introduction to QCode.cc's billing methods and pricing logic, helping you understand how costs are calculated.

Billing Principles

QCode.cc bills based on token usage. Each time you call an AI model, the cost consists of two parts:

  • Input Tokens: The content you send to the model, including prompts, context, file content, etc.

  • Output Tokens: The response content generated by the model.

What is a Token? A token is the basic unit of text processed by the model. In English, 1 token is approximately 4 characters or ¾ of a word; in Chinese, 1 character typically corresponds to 1-2 tokens.
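The "4 characters per token" rule of thumb above can be turned into a rough estimator. This is only a ballpark sketch for English text; real tokenizers from Anthropic or OpenAI will produce different counts, and Chinese text needs the 1-2 tokens-per-character rule instead.

```python
def estimate_tokens(text: str) -> int:
    """Rough English token estimate using the ~4-characters-per-token rule of thumb.

    Real model tokenizers give different counts; use this only for ballpark
    cost estimation, never for exact billing.
    """
    return max(1, round(len(text) / 4))

print(estimate_tokens("Explain this function"))  # 5 (21 chars / 4, rounded)
```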

Cost calculation formula:

Total Cost = Input Tokens Γ— Input Unit Price + Output Tokens Γ— Output Unit Price
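The formula above can be sketched directly in code. Prices are expressed in USD per million tokens, matching the pricing tables later in this document:

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Total cost = input tokens x input unit price + output tokens x output unit price.

    Unit prices are in USD per million tokens.
    """
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Using the claude-sonnet-4-5-20250929 prices listed below ($3.00 in / $15.00 out):
print(call_cost(5_000, 2_000, 3.00, 15.00))  # 0.045
```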

Pricing Data Source

Why not use official prices directly?

Prices publicly listed by model providers such as Anthropic and OpenAI apply to API calls, but the providers do not disclose their complete internal token-counting rules (for example: whether system prompts are included, how tool-use tokens are counted, or how cache hits are priced). As a result, estimates based on official prices can diverge from actual bills.

LiteLLM Open Source Pricing Table

To ensure transparency and fairness in billing, QCode.cc uses the model pricing table maintained by LiteLLM, a widely recognized open-source project in the industry, as the billing benchmark:

Data source address: github.com/BerriAI/litellm/model_prices_and_context_window.json

Why choose LiteLLM?

  • Industry Standard: LiteLLM is one of the most popular LLM API proxy gateways, used by thousands of companies and developers.

  • Community Maintained: Pricing data is continuously maintained and verified by the open-source community, ensuring accuracy.

  • Comprehensive Coverage: Covers all models from major providers like Anthropic, OpenAI, and Google.

  • Open and Transparent: All data is publicly available on GitHub, accessible for anyone to view and verify.

  • Timely Updates: When model providers adjust prices, the community updates pricing data promptly.

Main Model Pricing Reference

Pricing for commonly used models (unit: USD / million tokens):

Claude Series (Anthropic)

Model | Input Price | Output Price | Cache Write | Cache Read
claude-opus-4-6 | $5.00 | $25.00 | $6.25 | $0.50
claude-sonnet-4-6 | $3.00 | $15.00 | $3.75 | $0.30
claude-opus-4-5-20251101 | $5.00 | $25.00 | $6.25 | $0.50
claude-sonnet-4-5-20250929 | $3.00 | $15.00 | $3.75 | $0.30
claude-haiku-4-5-20251001 | $1.00 | $5.00 | $1.25 | $0.10

GPT / Codex Series (OpenAI)

Model | Input Price | Output Price | Cache Read
gpt-5.4 | $2.00 | $16.00 | $0.20
gpt-5.4-pro (gpt-5.4 Pro) | $2.00 | $16.00 | $0.20
gpt-5.4-codex | $2.00 | $16.00 | $0.20
gpt-5.3-codex-spark | $1.75 | $14.00 | $0.175
gpt-5.3-codex | $1.75 | $14.00 | $0.175

Note: The above prices come from the LiteLLM pricing table and may change with provider price adjustments. For the latest prices, please refer to the LiteLLM data source. The gpt-5.4 series is the latest generation model.

About Cache Pricing

Some models (such as the Claude series) support Prompt Caching, which caches repeated context content. Cache-related pricing:

  • Cache Write: The cost of writing content to cache for the first time, usually slightly higher than regular input prices.

  • Cache Read: The cost when hitting the cache, usually around 10% of regular input prices.

The caching mechanism can significantly reduce usage costs in scenarios with repeated context.
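To see why caching pays off with repeated context, here is a simplified comparison using the claude-sonnet-4-5 prices from the table above. It assumes one cache write on the first call and cache reads on every later call, and ignores details such as cache expiry and the non-cached portion of each request:

```python
# claude-sonnet-4-5 prices from this document, USD per million tokens.
INPUT, CACHE_WRITE, CACHE_READ = 3.00, 3.75, 0.30

context_tokens = 50_000  # the same context re-sent on every call
calls = 10

without_cache = calls * context_tokens * INPUT / 1e6
with_cache = (context_tokens * CACHE_WRITE            # first call writes the cache
              + (calls - 1) * context_tokens * CACHE_READ) / 1e6  # later calls hit it

print(f"without cache: ${without_cache:.4f}")  # $1.5000
print(f"with cache:    ${with_cache:.4f}")     # $0.3225
```

In this illustrative scenario, caching cuts the context cost by roughly 78%, which is why it matters so much for long multi-turn coding sessions.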

Billing Examples

Assume you use claude-sonnet-4-5-20250929 for a code Q&A session:

Item | Quantity | Unit Price | Cost
Input tokens | 5,000 | $3.00 / million | $0.015
Output tokens | 2,000 | $15.00 / million | $0.030
Total | | | $0.045

In actual usage, a complete interaction with Claude Code usually involves multiple API calls (analyzing code, generating solutions, executing operations, etc.), so the actual cost will be higher than a single call.
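A multi-call session is simply the sum of per-call costs. The sketch below uses hypothetical token counts for the three steps mentioned above (the numbers are illustrative, not measured), again with the claude-sonnet-4-5 prices:

```python
# Hypothetical per-call token counts for one Claude Code interaction.
calls = [
    {"in": 5_000, "out": 2_000},  # analyze code
    {"in": 8_000, "out": 1_500},  # generate solution
    {"in": 3_000, "out": 500},    # execute operations
]
IN_PRICE, OUT_PRICE = 3.00, 15.00  # claude-sonnet-4-5, USD per million tokens

total = sum((c["in"] * IN_PRICE + c["out"] * OUT_PRICE) / 1e6 for c in calls)
print(f"${total:.3f}")  # $0.108
```

Even with modest token counts, the session total is more than double the single-call example above.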

Price Update Mechanism

  • QCode.cc regularly syncs with the latest pricing data from LiteLLM.

  • When a model provider announces price adjustments, the LiteLLM community updates the data source promptly, and we sync accordingly.

  • Price updates do not affect historical charges that have already been incurred; they only affect new usage after the update.

How to View Usage

Log in to the QCode.cc Console and navigate to the "Usage Statistics" page to view:

  • Model Call Details: Model, token count, and cost for each call.

  • Cost Summary: Daily and monthly cost statistics.

  • Plan Consumption Progress: Current subscription plan quota usage.

Viewing in CLI

Use the /cost command in Claude Code to quickly view the usage overview for the current session:

/cost

Tip: The cost shown by /cost is an approximate value. For accurate data, please refer to the Dashboard.

FAQ

Is your price the same as the official price?

Our token unit prices come directly from the LiteLLM open-source pricing table and are consistent with the API prices publicly listed by each provider. The main difference lies in the token counting methodβ€”providers' internal token counting rules are not fully disclosed, so there may be minor discrepancies with the /cost estimate in the CLI.

How often is pricing data updated?

We sync with the LiteLLM data source regularly. Updates are typically completed within a few days after a provider announces price adjustments.

How can I verify the prices myself?

You can directly view the LiteLLM pricing data source:

  1. Visit model_prices_and_context_window.json.

  2. Search for the model name you use (e.g., claude-sonnet-4-5-20250929).

  3. Check the input_cost_per_token and output_cost_per_token fields.

  4. Multiply the per-token price by 1,000,000 to get the price per million tokens.
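Steps 3 and 4 can be sketched in a few lines. The JSON excerpt below is illustrative (hand-copied to match the prices in this document); to verify live prices, fetch the real model_prices_and_context_window.json from the BerriAI/litellm repository:

```python
import json

# Illustrative excerpt mirroring LiteLLM's pricing file format.
pricing_json = """
{
  "claude-sonnet-4-5-20250929": {
    "input_cost_per_token": 3e-06,
    "output_cost_per_token": 1.5e-05
  }
}
"""

entry = json.loads(pricing_json)["claude-sonnet-4-5-20250929"]

# Per-token price x 1,000,000 = price per million tokens.
print(round(entry["input_cost_per_token"] * 1_000_000, 6))   # 3.0  -> $3.00 / million
print(round(entry["output_cost_per_token"] * 1_000_000, 6))  # 15.0 -> $15.00 / million
```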

Why choose a third-party pricing table instead of custom prices?

Choosing an open, transparent third-party data source ensures fairness. LiteLLM's pricing table is maintained by the community, and anyone can review and verify it, avoiding pricing disputes.
