How to Write Effective Text Prompts to Generate AI Videos

A well-crafted prompt is key to dictating the content of the video produced by an AI model, whether you're generating AI videos from text or images. In this article, we present a formula for writing AI video generation prompts that can help you achieve optimal video output.

Structures of AI Video Generation Prompts

Text to Video Prompt Structure

Prompt = Subject + Action + Scene + (Camera Movement + Lighting + Style)

Subject: What or who is the focus of the video? It can be a person, an animal, a plant, an object, and so on. Describe the subject in detail, covering elements like:

Appearance (e.g., build, hairstyle, clothing, accessories)
Facial features, expressions, and emotions
Body postures...

Action: What is the subject doing? This is the core of your prompt, as it drives the video’s storyline. Ensure that the action is clear and concise.

Scene: Where is the action taking place? This includes the foreground, background, and any other elements that set the scene.

Camera Movement: The type of camera shot, angle, and movement, which adds to the narrative and visual appeal. Useful camera techniques include:

Zoom in/out
Focus in/out
Move left/right/forward/down
Orbit around the subject/360-degree view
Aerial shot
Wide angle
Close-up
Handheld camera/subtle shake
Tracking shot (following the subject)

You can combine camera movements, for example: move down and zoom out, an aerial shot that zooms in, or a handheld camera that follows a moving object.

Lighting: The lighting in the video can significantly impact its mood and depth. Descriptions of lighting should enhance the atmosphere and emotion of the video, such as warm light, morning light, spotlight on the subject, and backlighting.

Style: Sets the tone and style of the video. This can include the visual style, emotional tone, and overall mood, for example anime or American comics.
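The text-to-video formula can be sketched as a small helper that joins the parts in order. This is only an illustration: the function name build_text_prompt and its parameters are hypothetical and not part of any video tool's API, and the comma-separated layout of the optional parts is an assumption rather than a required format.

```python
from typing import Optional


def build_text_prompt(
    subject: str,
    action: str,
    scene: str,
    camera_movement: Optional[str] = None,
    lighting: Optional[str] = None,
    style: Optional[str] = None,
) -> str:
    """Assemble a prompt as Subject + Action + Scene + optional extras."""
    # The three required parts form the core sentence.
    core = " ".join(part.strip() for part in (subject, action, scene))
    # Optional parts are appended only when they are provided.
    extras = [part.strip() for part in (camera_movement, lighting, style) if part]
    return ", ".join([core] + extras) + "."


example = build_text_prompt(
    subject="A cheerful corgi wearing sunglasses",
    action="lounges on a bright orange flotation device",
    scene="in a sparkling blue ocean",
    camera_movement="slow zoom in",
    lighting="warm morning light",
)
print(example)
```

Keeping the parts as separate arguments makes it easy to swap one element, such as the lighting, while leaving the rest of the prompt unchanged.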

Image to Video Prompt Structure

Single-Action Prompt Structure

Prompt = Subject + Action + Background + Background Movement + Camera Movement

Multi-Action Prompt Structure

Prompt = Subject 1 + Action 1 + Action 2

Prompt = Subject 1 + Action 1 + Subject 2 + Action 2 ...

Subject: Just like in Text-to-Video, the subject represents the main focus, and its appearance should be described in detail.

Action: Describes the motion of the subject within the scene. Since it's an image-to-video prompt, you might describe the subtle movement that turns a still image into a short, dynamic video.

Background: Describes the surrounding environment, which can help create a more immersive scene.

Background Movement: Refers to the dynamic elements or subtle shifts in the environment that help bring the scene to life.

Camera Movement: You can combine several of the camera techniques listed in the Text to Video section for added dynamism.
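The single-action and multi-action structures can be sketched in the same way. The helper names below (single_action_prompt, multi_action_prompt) are hypothetical, and writing each part as its own short sentence, with actions chained by "then", is an assumption made for readability rather than a format any model requires.

```python
from typing import List, Sequence, Tuple


def _sentence(text: str) -> str:
    """Trim the text and make sure it ends with exactly one period."""
    return text.strip().rstrip(".") + "."


def single_action_prompt(
    subject: str,
    action: str,
    background: str,
    background_movement: str = "",
    camera_movement: str = "",
) -> str:
    """Subject + Action + Background + Background Movement + Camera Movement."""
    main = f"{subject.strip()} {action.strip()} {background.strip()}"
    parts = [main, background_movement, camera_movement]
    # Empty optional parts are skipped.
    return " ".join(_sentence(p) for p in parts if p.strip())


def multi_action_prompt(subjects: Sequence[Tuple[str, List[str]]]) -> str:
    """Each subject is paired with one or more actions, described in order."""
    sentences = []
    for subject, actions in subjects:
        sentences.append(_sentence(f"{subject.strip()} {', then '.join(actions)}"))
    return " ".join(sentences)


single = single_action_prompt(
    "A skier in an orange ski suit",
    "glides down",
    "a snowy mountain slope",
    background_movement="Snow sprays around",
    camera_movement="Tracking shot following the skier",
)
multi = multi_action_prompt([
    ("The bird", ["looks around alertly", "takes off and flies away"]),
    ("The leaves", ["sway in the breeze"]),
])
print(single)
print(multi)
```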

Advanced Features

Start and End Frames

Available with FlexClip Pro and Kling 1.6 PRO.

You can upload two images as the start and end frames to specify how the video should begin and end.

Multiple Subject Reference

Available with Kling 1.6 PRO Subject Reference.

You can upload 1-4 images and select the subjects (people, animals, objects, or other elements) that you want in the generated video. The AI will create the video based on these image references and the prompt.

With this feature, you can generate videos with consistent characters. You can put a character in different clothes and different scenes for immersive storytelling, or create interactions between characters.

High-Quality Video Examples

Text to Video

Prompt 1: "A serene landscape featuring a large, lush tree with drooping branches, situated on a small island in a calm lake. The scene is illuminated by soft sunlight, with fluffy clouds in a bright blue sky and green foliage surrounding the water. The leaves gradually change from vibrant green to withered yellow."

Prompt 2: "A cheerful corgi wearing sunglasses lounges on a bright orange flotation device, floating in a sparkling blue ocean. The corgi gently bobs up and down with the movement of the waves under a sunny sky with fluffy white clouds."

Prompt 3: "A young woman with long blonde hair, wearing a white t-shirt and blue jeans, walking through a sunny park with green trees in the background, carrying a brown shoulder bag, followed by a smooth tracking shot that moves alongside her as she walks."

Prompt 4: "A vibrant bird sits on a branch, displaying striking blue and green feathers, surrounded by lush green leaves. The background is softly blurred, with sunlight filtering through the trees. The bird looks around alertly to the left and right, then suddenly takes off and flies away."

Image to Video

Prompt 1: "A cute, fluffy kitten wearing a navy blue captain's hat, steering a wooden boat on sparkling blue ocean waters."

Prompt 2: "A motorcyclist in black gear rides a sleek orange and white motorcycle along a winding road through a picturesque autumn landscape."

Prompt 3: "The glass perfume bottle floats on the water, surrounded by blooming daisies, gently swaying with the waves."

Prompt 4: "A skier with orange ski suit gliding down a snowy mountain slope, with snow spraying around."

Prompt 5: "Two horses are drinking water and eating grass beside a clear lake. Beside the lake are lush green grass and majestic snow-capped mountains. Clouds are rolling and gradually covering the snow-capped mountains."

Tips for Effective Prompts

Use simple words and sentence structures. Avoid overly complex or abstract language. Simple and concise prompts tend to yield the most accurate results. Break down your prompt into smaller chunks to help AI better understand the task.
Include explicit keywords like "switch to [new shot]" to indicate a transition between shots. If the scene changes, be sure to describe the new scene in detail.
Movement should follow physical principles. It's best to describe movements that are likely to occur in the scene.
Make sure the prompt matches the actual content of the image. For example, don't describe "a man" if the image clearly shows a woman, or say "in a grassland" when the background is actually an urban setting.
Avoid specifying exact numbers in your prompts. AI models may struggle with numerical consistency.
Use cultural keywords for specific styles. Incorporate cultural terms like "Oriental mood," "Chinese," or "Mediterranean" if you're aiming for a particular aesthetic or cultural theme.
Use split-screen scenarios effectively. For split-screen videos, be specific in describing the scenes in each section.
For the start and end frames feature, choose two similar images with the same theme to get a smoother transition.
To highlight the frequency and intensity of actions or to emphasize the subject's characteristics, use adverbs such as "quickly," "intensely," or "frequently" to convey the dynamics more effectively.
Aside from Google Veo 3, other AI video models generate videos without sound. To bring your video to life with sound effects, just turn on the AI SFX option before generating.
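A couple of these tips can be checked automatically before you submit a prompt. The sketch below is a rough illustration only: check_prompt is a hypothetical helper, the 30-word sentence limit is an assumed threshold, and the digit check catches only numerals (not number words such as "two"), so treat its output as hints rather than rules.

```python
import re
from typing import List

# Assumed threshold for a "simple" sentence; adjust to taste.
MAX_WORDS_PER_SENTENCE = 30


def check_prompt(prompt: str) -> List[str]:
    """Return a list of warnings based on two of the tips above."""
    warnings = []
    # Tip: avoid exact numbers (only digits are detected here).
    if re.search(r"\d", prompt):
        warnings.append("Contains digits: exact numbers may not be followed consistently.")
    # Tip: use simple sentence structures (approximated by sentence length).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]
    for sentence in sentences:
        word_count = len(sentence.split())
        if word_count > MAX_WORDS_PER_SENTENCE:
            warnings.append(f"Long sentence ({word_count} words): consider splitting it.")
    return warnings


print(check_prompt("7 birds sit on a branch."))
print(check_prompt("A vibrant bird sits on a branch."))
```

Note that a deliberate camera term such as "360-degree view" would also trigger the digit warning, so the result needs a human check.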

By mastering the art of writing effective video prompts, you can significantly improve the quality and relevance of the AI-generated content. Whether you are working with text-to-video or image-to-video prompts, following these guidelines and examples will help you get the results you're looking for.