
Veo 3.1 Lite Image-to-Video: Turn Product Photos Into Clips in Under a Minute
How to use Veo 3.1 Lite's image-to-video mode to create product demos, social media content, and brand videos from still photos — with real examples and workflow tips.
What you'll learn
- ✅ How image-to-video works in Veo 3.1 Lite vs text-only generation
- ✅ Which types of product photos work best (and which don't)
- ✅ First frame and last frame technique for controlled motion
- ✅ Prompt templates for product demos, fashion, food, and social hooks
- ✅ Full workflow: photo → video → ready to post
Why Image-to-Video Changes the Workflow
Text-to-video is powerful, but it's probabilistic — you describe what you want and the model interprets it. Image-to-video is different: you provide the exact visual starting point, and the model animates from there.
For product work, this matters. Your product has a specific shape, color, material, and branding. Text prompts can't guarantee those details. An image can.
Veo 3.1 Lite supports image-to-video at 720p and 1080p, in both 16:9 and 9:16 formats, for 4s, 6s, or 8s durations. At 20 credits for 8 seconds on NanoBanana, it's cheap enough to run 5–10 variations on a single product shot and pick the best one.
How Image-to-Video Works in Veo 3.1 Lite
You provide:
- One reference image — the first frame of the video
- A text prompt — describes the motion, camera, and audio
- Duration and aspect ratio — 4s/6s/8s, 16:9 or 9:16
The model generates video that starts from your image and animates outward. The image sets the visual identity; the prompt directs what happens next.
The key insight: the image handles "what it looks like," the prompt handles "what it does." Split the work that way and you get consistent, directed output.
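One way to keep that split explicit is to model a generation request as plain data. The sketch below is hypothetical: `ImageToVideoRequest` and its field names are invented for illustration and are not NanoBanana's or Google's API. Only the allowed values (4, 6, or 8 seconds; 16:9 or 9:16; 720p or 1080p) come from this article.

```python
from dataclasses import dataclass

# Allowed values as listed in this article.
ALLOWED_DURATIONS = (4, 6, 8)
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16")
ALLOWED_RESOLUTIONS = ("720p", "1080p")


@dataclass(frozen=True)
class ImageToVideoRequest:
    """Hypothetical container for one image-to-video job (not a real API type)."""

    image_path: str              # the first frame: "what it looks like"
    prompt: str                  # motion, camera, audio: "what it does"
    duration_s: int = 8
    aspect_ratio: str = "16:9"
    resolution: str = "1080p"

    def __post_init__(self) -> None:
        # Reject settings the article does not list as supported.
        if not self.prompt.strip():
            raise ValueError("prompt must describe motion, camera, or audio")
        if self.duration_s not in ALLOWED_DURATIONS:
            raise ValueError(f"duration_s must be one of {ALLOWED_DURATIONS}")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")


request = ImageToVideoRequest(
    image_path="bottle.png",  # placeholder file name
    prompt="The camera slowly dollies in toward the product.",
    duration_s=6,
    aspect_ratio="9:16",
)
print(request.duration_s, request.aspect_ratio, request.resolution)
```

Validating up front means a typo such as a 5-second duration fails before any credits are spent.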

What Makes a Good Input Image
Not all product photos work equally well. Here's what the model handles reliably versus what causes problems:
| Image Type | Works Well | Avoid |
|---|---|---|
| Clean product on solid/simple background | ✅ | |
| Single hero product, centered | ✅ | |
| High contrast, clear edges | ✅ | |
| Multiple SKUs in one frame | | ❌ Confuses motion focus |
| Heavy text/watermarks over product | | ❌ Text artifacts in motion |
| Low-res or heavily compressed images | | ❌ Blurry output |
| Extreme wide shots with small product | | ❌ Product loses detail |
Best practice: Use the cleanest version of your product photo — the same one you'd use for an e-commerce listing. Remove backgrounds if possible. The cleaner the input, the more control you have over the output.
The First Frame / Last Frame Technique
In Veo 3.1 Lite you can set only the first frame, so your product image becomes the opening shot. For controlled transitions, where the video must start at point A and end at point B, you can also set a last frame.
Use cases:
- Unboxing reveal: First frame = closed box. Last frame = open box with product visible.
- Before/after: First frame = problem state. Last frame = solved state.
- Rotate and settle: First frame = product at angle. Last frame = front-facing hero position.
This technique gives you cinematic control without complex prompting. The model interpolates the motion between your two anchors.
Prompt Templates by Use Case
These prompts are structured for image-to-video. The image provides the visual baseline — the prompt directs motion and feel.
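The templates in this section share a line-per-concern shape: an optional format line, the motion, the setting and lighting, the camera, an SFX line, and an optional duration. A minimal sketch of a helper that assembles prompts in that shape follows; `build_prompt` and its parameter names are illustrative, not part of any Veo or NanoBanana tooling.

```python
from typing import Optional


def build_prompt(
    motion: str,
    setting: str,
    camera: Optional[str] = None,
    sfx: str = "silence",
    duration_s: Optional[int] = None,
    vertical: bool = False,
) -> str:
    """Assemble a prompt with one line per concern, in a fixed order."""
    lines = []
    if vertical:
        lines.append("Vertical 9:16 format.")
    lines.append(motion)           # what moves
    lines.append(setting)          # environment and lighting
    if camera:
        lines.append(camera)       # camera behavior or lens
    lines.append(f"SFX: {sfx}.")   # audio direction
    if duration_s is not None:
        lines.append(f"Duration: {duration_s} seconds.")
    return "\n".join(lines)


# Rebuilds the first hero-shot template from its parts.
print(
    build_prompt(
        motion="The camera slowly dollies in toward the product.",
        setting="Soft studio lighting, clean background.",
        camera="No movement except the camera push.",
        sfx="silence",
        duration_s=6,
    )
)
```

Keeping each concern on its own line makes it easy to change one element, such as the camera move, while holding the rest constant.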
Product: Hero Shot with Camera Move
The camera slowly dollies in toward the product.
Soft studio lighting, clean background.
No movement except the camera push.
SFX: silence.
Duration: 6 seconds.

The product rotates slowly 45 degrees clockwise, revealing its side profile.
Tabletop surface, warm side lighting catching texture details.
Camera static, 85mm lens.
SFX: subtle ambient studio hum.

Product: Lifestyle / In-Use
A hand reaches in from the right and picks up the product naturally.
Kitchen counter environment, warm afternoon light through a window.
Handheld camera feel, slight movement.
SFX: ambient kitchen sounds, soft handling noise.

The product is poured/opened/used in the natural way it's intended.
Close-up, 85mm. Soft natural light.
Focus shifts to the key moment of use.
SFX: the sound of the product being used.

Fashion / Apparel
Vertical 9:16 format.
The garment moves gently as if in a light breeze.
Model is still; only fabric has motion.
Outdoor natural light, overcast sky for diffused shadows.
SFX: wind, distant ambient sound.

Vertical 9:16 format.
A close-up of the fabric texture. Camera pulls back slowly to reveal the full garment.
Shallow depth of field, 85mm.
SFX: silence.

Food & Beverage
Steam rises gently from the dish/drink.
Overhead camera, static.
Warm practical lighting, dark background for contrast.
SFX: ambient café or kitchen sound, very low.

Close-up. The liquid pours slowly into frame from above, filling the glass.
Camera static, 85mm. Black background, single side light.
SFX: the sound of liquid pouring, ice clinking.

Social Hook (Vertical, 0–4 seconds)
Vertical 9:16. Close-up.
The product spins once and comes to a stop facing the camera.
Bright, clean background. Quick, energetic motion.
SFX: a short whoosh sound as it spins, then stops.
Duration: 4 seconds.

Vertical 9:16. Medium shot.
The product drops into frame from above and lands with a satisfying impact.
High-contrast background. Slight slow-motion on the impact.
SFX: a clean thud as it lands.
Duration: 4 seconds.

Full Workflow: Photo to Posted Video
Prepare your image
Use a high-res product photo with a clean background. Ideally this is a PNG or JPG at least 1000px on the short side, the same file you would use as your standard e-commerce hero image.
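These input guidelines are easy to check before uploading. The sketch below is an assumption-laden helper, not a requirement enforced by the generator: `check_input_image` is a made-up name, and it takes the pixel dimensions as arguments so it stays standard-library only. In practice you could read them with an image library, for example Pillow's `Image.open(path).size`.

```python
from pathlib import Path
from typing import List

# Guidelines taken from this article: PNG or JPG, 1000px+ on the short side.
ACCEPTED_SUFFIXES = {".png", ".jpg", ".jpeg"}
MIN_SHORT_SIDE_PX = 1000


def check_input_image(filename: str, width_px: int, height_px: int) -> List[str]:
    """Return a list of guideline problems; an empty list means the image passes."""
    problems = []
    if Path(filename).suffix.lower() not in ACCEPTED_SUFFIXES:
        problems.append("use a PNG or JPG file")
    if min(width_px, height_px) < MIN_SHORT_SIDE_PX:
        problems.append(f"short side is under {MIN_SHORT_SIDE_PX}px")
    return problems


print(check_input_image("hero.png", 1200, 1600))   # passes both checks
print(check_input_image("hero.webp", 800, 2000))   # fails both checks
```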
Choose your format
For Instagram/TikTok/Shorts: 9:16 vertical, 6s. For website embeds or YouTube: 16:9, 8s. For quick social hooks: 9:16, 4s.
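If you generate for several channels, the pairings above can live in a small lookup table. This is only a sketch: the keys and the `preset_for` helper are illustrative labels, while the aspect ratios and durations are the ones recommended in this step.

```python
from typing import Tuple

# (aspect_ratio, duration_s) per destination, following the recommendations above.
FORMAT_PRESETS = {
    "instagram": ("9:16", 6),
    "tiktok": ("9:16", 6),
    "shorts": ("9:16", 6),
    "website": ("16:9", 8),
    "youtube": ("16:9", 8),
    "social_hook": ("9:16", 4),
}


def preset_for(destination: str) -> Tuple[str, int]:
    """Look up the suggested aspect ratio and duration for a destination."""
    key = destination.strip().lower()
    if key not in FORMAT_PRESETS:
        raise ValueError(f"no preset for {destination!r}")
    return FORMAT_PRESETS[key]


aspect_ratio, duration_s = preset_for("TikTok")
print(aspect_ratio, duration_s)
```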
Upload to the generator
Go to Veo 3.1 Lite on NanoBanana, switch to Image-to-Video mode, and upload your product photo.
Add your prompt
Copy one of the templates above, or write your own. Remember: the image handles appearance — your prompt only needs to direct motion, camera, and audio.
Generate and compare
Run 2–3 variations with the same image but slightly different prompts (e.g., dolly in vs. static + rotate). At 20 credits per 8s clip, 3 variations = 60 credits.
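A comparison run like this can be planned as data before spending any credits. In the sketch below, the job dictionaries and the `plan_variations` name are hypothetical (no real API is called), the file name is a placeholder, and the only price used is the 20 credits per 8-second clip quoted in this article.

```python
from typing import Dict, List

# Assumption: 20 credits per 8-second clip, the price quoted in this article.
CREDITS_PER_8S_CLIP = 20


def plan_variations(image_path: str, prompts: Dict[str, str]) -> List[dict]:
    """Pair one input image with several labeled prompts (hypothetical job dicts)."""
    return [
        {"image": image_path, "label": label, "prompt": prompt, "duration_s": 8}
        for label, prompt in prompts.items()
    ]


prompts = {
    "dolly_in": "The camera slowly dollies in toward the product. No movement except the camera push.",
    "static_rotate": "Camera static. The product rotates slowly 45 degrees clockwise.",
    "pull_back": "Close-up on the product. Camera pulls back slowly to reveal it in full.",
}
jobs = plan_variations("product.png", prompts)
total_credits = len(jobs) * CREDITS_PER_8S_CLIP
print(f"{len(jobs)} variations, {total_credits} credits")  # 3 variations, 60 credits
```

Labeling each variation makes it easier to tell afterwards which prompt change produced which clip.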
Download and post
No post-processing needed for social. For product pages or ads, you may want to trim or loop the clip in a basic video editor.
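If you prefer the command line to a video editor, ffmpeg can do both jobs. This swaps in a different tool from the one the article describes, and it assumes ffmpeg is installed. The sketch only builds the argument lists (file names are placeholders); pass a list to `subprocess.run(cmd, check=True)` to execute it.

```python
import shlex
from typing import List


def trim_command(src: str, dst: str, seconds: float) -> List[str]:
    """ffmpeg arguments that keep only the first `seconds` of the clip, without re-encoding."""
    return ["ffmpeg", "-y", "-i", src, "-t", str(seconds), "-c", "copy", dst]


def loop_command(src: str, dst: str, extra_plays: int) -> List[str]:
    """ffmpeg arguments that repeat the clip.

    -stream_loop is an input option (it must come before -i) and counts
    the extra plays after the first, so 2 means the clip plays 3 times.
    """
    return ["ffmpeg", "-y", "-stream_loop", str(extra_plays), "-i", src, "-c", "copy", dst]


print(shlex.join(trim_command("clip.mp4", "clip_4s.mp4", 4)))
print(shlex.join(loop_command("clip.mp4", "clip_x3.mp4", 2)))
```

Because `-c copy` avoids re-encoding, the cut may snap to the nearest packet boundary rather than the exact timestamp.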
Common Issues and Fixes
The product looks distorted after a second or two
The model is over-animating. Reduce motion in your prompt: add "camera static" or "minimal movement, only [specific element] moves".
Background changes unexpectedly
Your background has too much detail and the model is reinterpreting it. Reshoot on a simpler background, or add "background unchanged, only product moves" to your prompt.
The video looks like a slideshow, not smooth motion
Prompt for continuous motion with phrases such as "smooth continuous camera move" or "fluid 360 rotation". Avoid start/stop action descriptions.
Portrait image shows black bars in 9:16 output
Crop or pad your input image to 9:16 before uploading. Mismatched aspect ratios cause the model to letterbox.
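The crop itself is simple arithmetic. The sketch below computes the largest centered 9:16 region for a given image size; `center_crop_box` is an illustrative helper, and it returns a `(left, top, right, bottom)` tuple, which is the box layout that Pillow's `Image.crop` accepts if you use that library to apply it.

```python
from typing import Tuple


def center_crop_box(
    width: int, height: int, ratio_w: int = 9, ratio_h: int = 16
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered ratio_w:ratio_h crop."""
    if width * ratio_h > height * ratio_w:
        # Image is too wide for the target ratio: keep full height, trim the sides.
        new_width = height * ratio_w // ratio_h
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    # Image is too tall (or already matches): keep full width, trim top and bottom.
    new_height = width * ratio_h // ratio_w
    top = (height - new_height) // 2
    return (0, top, width, top + new_height)


print(center_crop_box(1200, 1600))  # (150, 0, 1050, 1600)
print(center_crop_box(1080, 1920))  # (0, 0, 1080, 1920)
```

A 1200x1600 portrait photo is 3:4, so it loses 150px from each side to reach 900x1600, which is exactly 9:16.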
What Veo 3.1 Lite Can't Do (For Product Work)
- No 4K: the maximum is 1080p. Fine for web and social; not suitable for large-format print or digital signage.
- No extension: you can't extend a generated clip beyond 8 seconds in the Lite tier.
- No multi-product comparison: animating two products interacting is unreliable. Generate them separately and edit them together.
- No text overlay: don't rely on the model to add readable text, prices, or callouts. Add those in post.
Try It: Free First Generation
NanoBanana's Veo 3.1 Lite generator supports image-to-video with the same prompt interface. Upload your product photo, paste a prompt from above, pick your format, and generate.
→ Try Veo 3.1 Lite Image-to-Video
20 credits for 8 seconds. Half the cost of Veo 3.1.
Disclosure
Video examples use footage from the Veo 3.1 model family. Workflow recommendations are based on practical testing of image-to-video generation. Results vary by input image quality and prompt specificity.