Vision
The Vision feature lets you send images alongside your text messages to vision-capable AI models. The model can analyze, describe, and answer questions about the images you provide.
Supported Providers
Not all models support vision. The following providers and models can process images:
| Provider | Vision Models |
|---|---|
| Anthropic | Claude Sonnet 4, Claude Opus 4, Claude Haiku 3.5, and other Claude 3+ models |
| OpenAI | GPT-4o, GPT-4o mini, GPT-4 Turbo, o1, o3 |
| xAI | Grok 2 Vision |
| Google Gemini | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash |
| OpenRouter | Any vision-capable model available through OpenRouter |
The model registry indicates which models support vision via the "vision" capability tag. If a model does not support vision, an attached image will either be ignored or cause an API error, depending on the provider.
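A capability check along these lines can be sketched as follows. The registry shape and model IDs here are illustrative, not the platform's actual schema:

```python
# Illustrative registry: each model carries a list of capability tags.
# These entries and the dict structure are assumptions for the sketch.
MODEL_REGISTRY = {
    "claude-sonnet-4": {"capabilities": ["text", "vision"]},
    "gpt-4o": {"capabilities": ["text", "vision"]},
    "text-only-model": {"capabilities": ["text"]},
}

def supports_vision(model_id: str) -> bool:
    """Return True if the model carries the "vision" capability tag."""
    entry = MODEL_REGISTRY.get(model_id, {})
    return "vision" in entry.get("capabilities", [])
```

A client could call `supports_vision(model_id)` before enabling the image upload button for the selected model.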
How to Send Images
There are three ways to attach an image to your message:
1. Paste from Clipboard (Ctrl+V / Cmd+V)
Copy an image from any source (screenshot tool, web browser, image editor) and paste it directly into the message input area. The image is detected automatically and appears as a thumbnail preview.
2. Upload Button
Click the camera icon button next to the Send button. A file picker opens where you can select an image from your device.
3. Drag and Drop
Drag an image file from your file manager and drop it onto the message input area.
Image Preview
Once an image is attached, a thumbnail preview appears above the input area. You can:
- See what image is queued for sending
- Click the X button to remove the image before sending
- Type your text message alongside the image
You can attach an image and send it with no text. Just paste or upload the image and press Enter. The model will analyze the image and describe what it sees.
Sending the Message
When you click Send (or press Enter), both your text and the attached image are sent together as a single message. The image is encoded as a base64 data URL and included in the API request.
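The base64 data URL encoding mentioned above can be sketched like this (the helper name is hypothetical; the `data:` URL format itself is standard):

```python
import base64

def to_data_url(data: bytes, mime: str) -> str:
    """Encode raw image bytes as a base64 data URL, e.g.
    data:image/png;base64,iVBORw... (hypothetical helper)."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{payload}"

# Encoding the 4-byte PNG magic number as a small demonstration:
print(to_data_url(b"\x89PNG", "image/png"))
# -> data:image/png;base64,iVBORw==
```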
After sending, the image preview is cleared automatically. The user message in the chat history shows your text; the image data is stored with the message internally but is not rendered in the chat view.
Image Format Support
The following image formats are supported:
- JPEG (.jpg, .jpeg)
- PNG (.png)
- GIF (.gif)
- WebP (.webp)
Large images increase API costs because they consume more tokens. Most providers have image size limits. Images are sent as base64-encoded data, so a 1 MB image adds roughly 1.3 MB to the request payload. Consider resizing very large images before sending.
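The roughly 1.3x payload growth comes from base64's expansion ratio: every 3 input bytes become 4 output characters. A quick sketch of the arithmetic:

```python
import base64

def base64_size(n_bytes: int) -> int:
    """Encoded length of n raw bytes: 4 output chars per 3 input bytes,
    rounded up to a whole 4-character group (padding included)."""
    return 4 * ((n_bytes + 2) // 3)

raw = 1_000_000  # ~1 MB image
print(base64_size(raw))  # 1333336 characters, a ~1.33x expansion
```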
Provider-Specific Formatting
The platform automatically formats image data according to each provider's API requirements:
- Anthropic uses the `image` content block format with `source.type: "base64"` and the image's MIME type
- OpenAI, xAI, OpenRouter, and Gemini use the `image_url` content block format with a data URL
You do not need to handle this -- it is automatic based on the selected provider.
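The two content block shapes can be sketched roughly as follows. The field names follow the providers' public chat APIs; the dispatch function itself is an illustrative sketch, not the platform's actual code:

```python
def format_image_block(provider: str, data_url: str, mime: str) -> dict:
    """Build the image part of a message for a given provider
    (hypothetical helper; field names follow the public APIs)."""
    if provider == "anthropic":
        # Anthropic's image block takes a raw base64 payload plus the
        # MIME type, not a full data URL
        b64 = data_url.split(",", 1)[1]
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": mime, "data": b64},
        }
    # OpenAI-style APIs (also used here for xAI, OpenRouter, Gemini)
    # take the full data URL inside an image_url block
    return {"type": "image_url", "image_url": {"url": data_url}}
```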
Multiple Images
You can send one image per message. To discuss multiple images, send them in separate messages. The model retains context from previous messages, so you can say "compare this image to the one I sent earlier."
Enable/Disable Vision
Vision is enabled by default. You can toggle it in Settings > Capabilities. When disabled, the image upload button and paste handling are deactivated.
Use Cases
- Screenshot analysis -- paste a screenshot and ask "What error is shown here?"
- Document reading -- photograph a document and ask the model to extract text or summarize
- Code review -- share a screenshot of code and ask for improvements
- Design feedback -- upload a mockup and get design suggestions
- Math problems -- photograph a math problem and ask for a solution
- Data visualization -- share a chart and ask for interpretation