
Vision

The Vision feature lets you send images alongside your text messages to vision-capable AI models. The model can analyze, describe, and answer questions about the images you provide.

Supported Providers

Not all models support vision. The following providers and models can process images:

Provider         Vision Models
Anthropic        Claude Sonnet 4, Claude Opus 4, Claude Haiku 3.5, and other Claude 3+ models
OpenAI           GPT-4o, GPT-4o mini, GPT-4 Turbo, o1, o3
xAI              Grok 2 Vision
Google Gemini    Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash
OpenRouter       Any vision-capable model available through OpenRouter
Info

The model registry indicates which models support vision via the "vision" capability tag. If a model does not support vision, the image will be ignored or cause an error.
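
As a rough illustration of that capability check, the sketch below assumes the registry is a mapping from model ID to a set of capability tags. The registry layout, the model IDs, and the supports_vision helper are hypothetical; only the "vision" tag name comes from the text above.

```python
# Hypothetical registry: model ID -> set of capability tags.
# The layout and the model IDs are assumptions for illustration only.
MODEL_REGISTRY: dict[str, set[str]] = {
    "example-vision-model": {"text", "vision"},
    "example-text-model": {"text"},
}


def supports_vision(model_id: str) -> bool:
    """Return True if the model is registered with the "vision" tag."""
    # Unknown models are treated as not vision-capable.
    return "vision" in MODEL_REGISTRY.get(model_id, set())
```

Checking the tag before attaching an image avoids the case where the image is ignored or the request fails.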

How to Send Images

There are three ways to attach an image to your message:

1. Paste from Clipboard (Ctrl+V / Cmd+V)

Copy an image from any source (screenshot tool, web browser, image editor) and paste it directly into the message input area. The image is detected automatically and appears as a thumbnail preview.

2. Upload Button

Click the camera icon button next to the Send button. A file picker opens where you can select an image from your device.

3. Drag and Drop

Drag an image file from your file manager and drop it onto the message input area.

Image Preview

Once an image is attached, a thumbnail preview appears above the input area. You can:

  • See what image is queued for sending
  • Click the X button to remove the image before sending
  • Type your text message alongside the image
Tip

You can attach an image and send it with no text. Just paste or upload the image and hit Enter. The model will analyze the image and describe what it sees.

Sending the Message

When you click Send (or press Enter), both your text and the attached image are sent together as a single message. The image is encoded as a base64 data URL and included in the API request.
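
To make the encoding step concrete, here is a minimal sketch of how raw image bytes become a base64 data URL. The to_data_url helper is a hypothetical name, not part of the platform; the data URL layout itself (data:<MIME type>;base64,<payload>) is standard.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Every PNG file begins with this fixed 8-byte signature.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_url(png_signature, "image/png"))
# prints: data:image/png;base64,iVBORw0KGgo=
```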

After sending, the image preview is cleared automatically. The user message in the chat history shows your text; the image data is stored internally with the message, but the chat view displays the message as text.

Image Format Support

The following image formats are supported:

  • JPEG (.jpg, .jpeg)
  • PNG (.png)
  • GIF (.gif)
  • WebP (.webp)
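
The extensions above correspond to standard MIME types. The sketch below shows one way to map a file name to its MIME type and reject unsupported formats; the SUPPORTED_IMAGE_TYPES table and the mime_type_for helper are hypothetical names, and the platform's actual validation logic may differ.

```python
from pathlib import Path
from typing import Optional

# Extensions from the list above mapped to their standard MIME types.
SUPPORTED_IMAGE_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def mime_type_for(filename: str) -> Optional[str]:
    """Return the MIME type for a supported image file, else None."""
    # The lookup is case-insensitive, so "photo.JPG" is accepted.
    return SUPPORTED_IMAGE_TYPES.get(Path(filename).suffix.lower())
```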
Warning

Large images increase API costs because they consume more tokens. Most providers have image size limits. Images are sent as base64-encoded data, so a 1 MB image adds roughly 1.3 MB to the request payload. Consider resizing very large images before sending.
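
The "roughly 1.3 MB" figure follows from how base64 works: every 3 input bytes become 4 output characters, which is an overhead of about 33%. A small sketch of the arithmetic (the base64_size helper is a hypothetical name):

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    # Input is processed in 3-byte groups; each group yields 4 characters.
    return 4 * ((raw_bytes + 2) // 3)


one_mb = 1_000_000
print(base64_size(one_mb))  # prints: 1333336

# Cross-check the formula against the real encoder.
assert base64_size(one_mb) == len(base64.b64encode(bytes(one_mb)))
```

This counts only the encoded image; the surrounding JSON and the data URL prefix add a few more bytes to the request.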

Provider-Specific Formatting

The platform automatically formats image data according to each provider's API requirements:

  • Anthropic uses the image content block format with source.type: "base64" and the image's MIME type
  • OpenAI, xAI, OpenRouter, and Gemini use the image_url content block format with a data URL

You do not need to handle this -- it is automatic based on the selected provider.
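
For reference, the two content block shapes look roughly like this. This is a sketch only: the block layouts follow the publicly documented Anthropic Messages and OpenAI Chat Completions formats, but the function names and the "anthropic" provider identifier are hypothetical, and the platform's internal implementation is not shown in this document.

```python
def anthropic_image_block(b64_data: str, mime_type: str) -> dict:
    """Image content block with a base64 source and explicit MIME type."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": mime_type,
            "data": b64_data,
        },
    }


def image_url_block(b64_data: str, mime_type: str) -> dict:
    """image_url content block carrying the image as a data URL."""
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{b64_data}"},
    }


def image_block(provider: str, b64_data: str, mime_type: str) -> dict:
    """Pick the block format for a provider (identifier is hypothetical)."""
    if provider == "anthropic":
        return anthropic_image_block(b64_data, mime_type)
    # Per the list above, the remaining providers share the image_url format.
    return image_url_block(b64_data, mime_type)
```

In both cases the block is placed in the message's content array next to the text block, so the text and the image travel as a single message.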

Multiple Images

You can send one image per message. To discuss multiple images, send them in separate messages. The model retains context from previous messages, so you can say "compare this image to the one I sent earlier."

Enable/Disable Vision

Vision is enabled by default. You can toggle it in Settings > Capabilities. When disabled, the image upload button and paste handling are deactivated.

Use Cases

  • Screenshot analysis -- paste a screenshot and ask "What error is shown here?"
  • Document reading -- photograph a document and ask the model to extract text or summarize
  • Code review -- share a screenshot of code and ask for improvements
  • Design feedback -- upload a mockup and get design suggestions
  • Math problems -- photograph a math problem and ask for a solution
  • Data visualization -- share a chart and ask for interpretation