Image Input (Vision)

Feed images to Claude Code: paste, drag-and-drop, or reference a file path so the model can read screenshots, mockups, architecture diagrams, and charts. Powered by QCode.cc vision models — one API Key works across every endpoint.

Image Input (Vision)

Claude Code doesn't just read code — it can see images. Hand it a screenshot, a design mockup, an error screen, or an architecture diagram, and a vision-capable model will literally "read" what's in the picture, then act on it together with your prompt — turning a mockup into a UI, locating a bug from an error screenshot, or explaining an architecture diagram or data chart.

This page is about image input: you provide the picture, the model understands it. That is a different thing from "image generation."

⚠️ Don't confuse this with image generation This page is about feeding images to the model so it can understand them (vision input). If you want the AI to generate / edit images (posters, illustrations, UI mockups, cutout/background swaps), that's a separate capability — use gpt-image-2. See gpt-image-2 Image Generation. In one line: reading images = this page; drawing images = image-2.


Three ways to provide an image

Inside a Claude Code session, there are three ways to give an image to the model:

Method How Best for
Paste Copy an image, then press Ctrl+V into the input box Images grabbed by a screenshot tool, anything in your clipboard
Drag-and-drop Drag an image file straight into the terminal window Image files you already have in a file manager
Reference a path Write the image's file path in your prompt Images already in the project, batch/scripted processing

On macOS, a screenshot goes to the clipboard by default, so Ctrl+V works right away; if the screenshot was saved as a file, drag-and-drop or a path is more reliable. Paste and drag support varies by terminal / OS — if it isn't recognized, fall back to "reference a file path," which is the most reliable option.

Reference-a-path examples

> Look at the mockup ./design/dashboard.png and implement it with React + Tailwind
> What's causing this error screenshot @screenshots/build-error.png, and how do I fix it?
> Explain the system architecture in docs/architecture.png  how do the components interact?

Referencing a path is reproducible: put it in a script or a prompt template and it points to the exact same image every time, with no manual pasting.


Typical use cases

1. Mockup / screenshot → UI code

Give Claude Code a mockup or a screenshot of an existing page and have it reproduce runnable frontend code.

> Here's the mockup @mockups/pricing-page.png. Implement it with Next.js + Tailwind.
  Match the layout, spacing, and colors as closely as possible, with sensible component splits.

The model reads the layout, copy, and colors from the image and generates the matching component code. Fidelity depends on how clear the image is and on the constraints in your prompt (framework, styling system, naming conventions, etc.).

2. Error screenshot → locate and fix

Those red errors in your IDE, browser console, or terminal — just screenshot them instead of retyping a long stack trace.

> @screenshots/ts-error.png How do I fix this TypeScript error? The relevant code is in src/api/client.ts

The model reads the error message and stack from the screenshot and, combined with the code file you point to, suggests a fix.

3. Architecture diagram / data chart → understand and explain

Hand it an architecture diagram, sequence diagram, flowchart, monitoring dashboard, or data chart, and have it explain, extract data, or write code / docs from it.

> Explain the data-flow diagram @diagrams/data-flow.png and draft an API doc based on it
> Read this monitoring screenshot @screenshots/latency.png  what's the p95 latency range? Any abnormal spikes?

Which models can see images

Vision is a property of the model — not every model can read images. Vision-capable models available via QCode.cc:

Model Notes
claude-opus-4-8 Flagship; strong vision + reasoning, first pick for complex mockups / charts
claude-sonnet-4-6 Balanced choice; great value for everyday image reading / UI reproduction
GPT-5.x (gpt-5.5 / gpt-5.4 / gpt-5.4-mini) OpenAI-family vision models with strong multimodal understanding

Lightweight text- / code-only models may not support image input. Feed an image to a non-vision model and the image is ignored or errors out — pick one of the models above. For switching models, see Model Selection.

QCode.cc's advantage: one cr_ API Key covers every protocol and model. The same key works with Claude vision models and GPT-5.x alike, switching on demand without changing accounts. For endpoints and protocol paths, see Endpoints & API Formats (users in China should prefer asia.qcode.cc).


Practical tips

  • Keep images clear: too low a resolution or blurred text means the model can't read it accurately. Use originals for mockups and error screenshots; don't over-compress.
  • Don't pile on too many images at once: more and larger images burn more tokens and make it harder to stay on point. Focus on one or two key images at a time.
  • Spell out your intent in text: state next to the image what you want it to do ("reproduce in React," "find the cause of the error," "extract the table data") — far better than dropping an image alone.
  • Redact sensitive info first: mask keys, tokens, internal addresses, user data, etc. in the screenshot before sending it.
  • Use paths for batch scenarios: to process images in scripts / automation, "reference a file path" is the most controllable approach; combined with the headless mode in Automation & CI/CD you can batch-process.

With a single QCode.cc API Key, enjoy Claude vision models and GPT-5.x multimodal power, with quota shared across every endpoint. See plans & pricing →

Related Documents

Use QCode with 9router
Add QCode.cc as a custom provider in 9router, a local multi-provider router, for cross-provider fallback and unified management
gpt-image-2 Image Generation and Editing
OpenAI-compatible gpt-image-2 text-to-image + image-edit API: drop in by switching base_url, multi-region endpoints, unified billing with your QCode key
Adaptive Thinking Configuration Guide
Adaptive thinking on Claude Opus 4.8 / Sonnet 4.6 / Haiku 4.5: thinking + effort parameters, differences from the legacy budget_tokens, and quality vs. cost trade-offs
🚀
Get Started with QCode — Claude Code & Codex
One plan for both Claude Code and Codex, Asia-Pacific low latency
View Pricing Plans → Create Account
Team of 3+?
Enterprise: dedicated domain + sub-key management + ban protection, from ¥250/person/mo
Learn Enterprise →