How to create a long video on Grok
Many people mistakenly believe that the free version of Grok AI can only create short, disjointed clips in which characters change appearance from one image to the next. However, if you know how to combine the tools and apply the right workflow, you can create a 60-second or even longer cinematic short film.
The best thing about this method is its ability to maintain character consistency – the same face, hairstyle, and outfit throughout every scene. Below is a detailed step-by-step guide to help you achieve that.
A 6-step process for creating character-consistent videos with Grok AI.
Below is the video I created from 6 scenes using Grok.
Guide to creating long videos with Grok
Step 1: Create a script and a scene creation prompt.
First, open ChatGPT, Claude, or any other AI you find capable of scripting, and request a short story with 5 scenes (for example, suspense or drama).
Here's an example of a prompt asking AI to write a script:
Requirement: Create a short story with 5 suspenseful scenes to use as a basis for an animated video.
- The only location: A dilapidated, rusty old bus stop next to a vast cornfield.
- Time: Sunset, with long shadows and a reddish glow.
- Characters:
  - Lam: A 13-year-old boy, wearing a hoodie and carrying a backpack.
  - Milo: A small, lively brown dog, either a Poodle or a mixed-breed dog.
  - The mysterious figure: A character wearing a long, black cloak that covers their head (appears from Scene 2).
- 5-scene structure (the AI should describe each scene in detail):
  - Scene 1: Lam and Milo are sitting at the bus stop. Everything is normal; Lam is looking at his phone, and Milo is lying at his feet.
  - Scene 2: Milo suddenly stands up, growling as he stares towards the cornfield behind the bus stop. From the shadows of the corn stalks, a figure in a black cloak emerges, standing silently and watching them.
  - Scene 3: The mysterious figure begins to approach the bus stop slowly and silently. Lam notices something unusual, puts away his phone, and grips Milo's leash tightly.
  - Scene 4: The cloaked figure stops under the dim lights of the bus stop, just a few meters from Lam, and extends a thin hand from beneath the cloak. Lam trembles and says, "What do you want?"
  - Scene 5: The bus stop lights flicker on and off repeatedly. Milo barks loudly. When the lights come back on, the mysterious figure has vanished, leaving only an old toy of Milo's on the ground. Lam looks around, breathing heavily: "Milo, run!"
Next, clearly define the setting (use only one location for consistency) and design the characters (for example, a 13-year-old boy wearing a hoodie and backpack, and his dog Milo, as in the prompt above).
If you wish, you can ask the AI to create a separate image of your character so that the character looks as consistent as possible throughout the video. Request more scenes if you want to create a longer video.
If possible, also ask ChatGPT to write a Cinematic Image Prompt (a prompt for generating a highly detailed image that depicts the main character's appearance and clothing).
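If you plan to reuse the same request for other stories, the script prompt above can be assembled from structured fields. The following is a minimal sketch, not part of the original workflow: the `Character` class and the `build_script_request` function are hypothetical names introduced here for illustration, and the output is simply text to paste into ChatGPT or Claude.

```python
from dataclasses import dataclass


@dataclass
class Character:
    """One character in the story (hypothetical helper type)."""
    name: str
    description: str


def build_script_request(location, time_of_day, characters, scenes):
    """Assemble a script-request prompt as plain text from structured fields."""
    lines = [
        f"Requirement: Create a short story with {len(scenes)} suspenseful "
        "scenes to use as a basis for an animated video.",
        f"- The only location: {location}",
        f"- Time: {time_of_day}",
        "- Characters:",
    ]
    # One indented bullet per character keeps the descriptions easy to reuse.
    lines += [f"  - {c.name}: {c.description}" for c in characters]
    lines.append(f"- {len(scenes)}-scene structure:")
    lines += [f"  - Scene {i}: {text}" for i, text in enumerate(scenes, start=1)]
    return "\n".join(lines)


prompt = build_script_request(
    location="A dilapidated, rusty old bus stop next to a vast cornfield.",
    time_of_day="Sunset, with long shadows and a reddish glow.",
    characters=[
        Character("Lam", "A 13-year-old boy, wearing a hoodie and carrying a backpack."),
        Character("Milo", "A small, lively brown dog."),
    ],
    scenes=[
        "Lam and Milo are sitting at the bus stop.",
        "Milo stands up and growls at the cornfield.",
    ],
)
print(prompt)
```

Keeping the character descriptions in one place makes it easier to paste exactly the same wording into every later prompt, which supports the consistency goal of this guide.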
Step 2: Create a reference image in Grok
This is the "backbone" scene that will shape the character's face for the entire video.
Log in to the Grok AI dashboard.
Click on Imagine and paste the character description from ChatGPT into the text box. If you have already created a character image, select it.
Adjust the aspect ratio to 16:9 (standard YouTube video/landscape).
Click Generate. Save your favorite image as a reference.
Step 3: Create Video Prompts for Each Scene
Go back to ChatGPT and ask the tool to write Video Animation Prompts for all 5 scenes based on the storyline. Make sure the Prompt contains:
- Camera movement (dolly in, dolly out, pan, etc.).
- The character's specific actions.
- The surrounding environment.
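To check that every video prompt covers the three elements listed above, you could compose each prompt from separate fields. This is only an illustrative sketch: `build_video_prompt` is a hypothetical helper, and the sentence layout it produces is an assumption, not a format that Grok requires.

```python
def build_video_prompt(scene_number, camera, action, environment):
    """Compose one video prompt and refuse to build it if an element is missing."""
    parts = {"camera": camera, "action": action, "environment": environment}
    # Trim whitespace and trailing periods so the pieces join cleanly.
    cleaned = {name: text.strip().rstrip(".") for name, text in parts.items()}
    missing = [name for name, text in cleaned.items() if not text]
    if missing:
        raise ValueError(f"Scene {scene_number} is missing: {', '.join(missing)}")
    return (
        f"Scene {scene_number}. Camera: {cleaned['camera']}. "
        f"Action: {cleaned['action']}. "
        f"Environment: {cleaned['environment']}."
    )


print(build_video_prompt(
    1,
    camera="slow dolly in",
    action="Lam looks at his phone while Milo lies at his feet",
    environment="rusty bus stop beside a cornfield at sunset",
))
```

The check raises an error for an incomplete scene, so a missing camera movement or environment is caught before you spend a generation on it.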
Step 4: The secret to maintaining character consistency (Most important)
This is the core technique that determines the success or failure of a video:
Paste the Video Prompt for Scene 1 into Grok AI's Video section (remember to set the aspect ratio to 16:9). Press Generate and download the first clip.
Open the downloaded clip and drag the timeline to the last frame where the main character is most clearly visible (face, hairstyle, clothing). Avoid frames where the character is facing away or is obscured.
Here you have two options. The first is Save Video Frame, which saves the frame to your computer; you then upload that frame when you create the next scene. Use this option when you don't want to create the next video immediately.
Alternatively, if you want to create the next video right away, select "Copy video frame", paste the copied frame into the video creation box, and then paste the prompt for the next scene. Since I chose "Save frame" here, for the next scene I have to select "Add image" > "Edit image".
Return to Grok AI and paste the copied frame into the description box to use it as a Reference Image for the next scene. If you saved the frame instead, select the add icon and upload it.
Once you've loaded or pasted the frame along with the prompt for the next scene, select Create Video.
Then repeat the process for each remaining scene: save or copy a frame from the previous scene > paste or upload that frame as the image for the next scene > paste the next scene's prompt and create the video.
In summary, the process is: create a script > get the video creation prompts > create the main character's image > create video 1 > save a frame from video 1 > paste or upload that frame and create scene 2.
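If you have already downloaded a clip and prefer to grab the frame offline, a tool such as ffmpeg can save a still from near the end of the file. The sketch below is an alternative to Grok's built-in frame options, not part of the original method. It assumes ffmpeg is installed and on your PATH; the function names and file names are hypothetical. You should still open the saved image and confirm that the character's face, hairstyle, and clothing are clearly visible.

```python
import subprocess


def last_frame_command(clip_path, frame_path, seconds_before_end=0.5):
    """Build an ffmpeg command that saves one frame from near the end of a clip."""
    if seconds_before_end <= 0:
        raise ValueError("seconds_before_end must be positive")
    return [
        "ffmpeg", "-y",                       # overwrite the output if it exists
        "-sseof", f"-{seconds_before_end}",   # seek relative to the end of the file
        "-i", clip_path,
        "-frames:v", "1",                     # write a single video frame
        "-update", "1",                       # treat the output as one image file
        frame_path,
    ]


def save_reference_frame(clip_path, frame_path, seconds_before_end=0.5):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(
        last_frame_command(clip_path, frame_path, seconds_before_end),
        check=True,
    )


# Only print the command here; call save_reference_frame() once the clip exists.
print(" ".join(last_frame_command("scene1.mp4", "scene1_reference.png")))
```

If the character is turned away in the saved frame, increase `seconds_before_end` to pick an earlier moment in the clip.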
Step 5: Edit and finalize the video in CapCut
- Open CapCut and add the voiceover audio file first.
- Drag and drop video clips from Grok onto the Timeline one by one.
- Adjust the timing (you can increase/decrease the clip speed slightly) so that the visuals match the dialogue.
- Add transition effects. Suggestion: Use a black fade effect between scenes to create a cinematic, mysterious feel.
- Export the video in high resolution and you're done!
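For the timing step above, the speed to set on a clip follows from simple arithmetic: speed = clip length / length of the matching voiceover segment. The sketch below illustrates this; `speed_factor` is a hypothetical helper and the durations are made-up example values, since actual clip lengths depend on what Grok generates.

```python
def speed_factor(clip_seconds, target_seconds):
    """Speed multiplier that makes a clip last exactly target_seconds.

    Values below 1.0 slow the clip down; values above 1.0 speed it up.
    """
    if clip_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    return clip_seconds / target_seconds


# Example values only: (clip length, voiceover segment length) in seconds.
scenes = [(6.0, 7.5), (6.0, 5.0), (6.0, 6.0)]
for number, (clip, voice) in enumerate(scenes, start=1):
    print(f"Scene {number}: set speed to {speed_factor(clip, voice):.2f}x")
```

A factor far from 1.0 tends to make motion look unnatural, so if a clip needs a large change it may be better to trim it or regenerate the scene instead.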
Key features of creating cinematic videos with Grok AI
- Completely free: No premium paid account is required to harness Grok's video creation power.
- Cinematic quality: Grok responds well to terms describing camera angles, lighting, and smooth motion.
- Character control: Solves the biggest "pain point" for AI video creators, namely characters becoming distorted or changing appearance from scene to scene.
Conclusion
With the clever combination of ChatGPT, Grok AI's image/video creation features, and CapCut, you can completely master the art of AI-powered storytelling. The "frame splicing" technique is the golden key to eliminating character distortion, unlocking the opportunity to produce high-quality content at zero cost. Start experimenting with your first script today!