I review student work every week and the same problems show up. Tools change, models update, but these seven mistakes haven't gone away. If your AI animations still look obviously AI, or your videos aren't getting traction, you're probably making at least three of these. Probably more.
Here's what to look for, and the fix for each.
Mistake 1: Using too many style modifiers in the prompt
"Cinematic, dramatic, atmospheric, moody, hyperrealistic, ultra-detailed, 8K, cinematic lighting, golden hour, depth of field..." This is the AI animation equivalent of yelling at the model.
The model can't weigh all of those directions at once. It latches onto the strongest signals and ignores the rest, which is why the output looks generic: every prompt written this way is asking for the same five things. Three modifiers max. Pick the three that matter most for this specific shot.
Fix: Write your prompt as if you're directing one shot of a film. Subject, action, mood. Stop there.
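The three-modifier cap is easy to enforce mechanically. Here's a minimal sketch — the function name, structure, and example prompt are mine, not any model's API:

```python
def build_prompt(subject, action, mood, modifiers=()):
    """Compose a one-shot prompt: subject, action, mood,
    plus at most three style modifiers."""
    if len(modifiers) > 3:
        raise ValueError("cap style modifiers at three per shot")
    return ", ".join([subject, action, mood, *modifiers])

# One clear shot, three deliberate modifiers, nothing else
prompt = build_prompt(
    "a knight drawing a sword",
    "rain streaking across the blade",
    "tense",
    modifiers=("close-up", "shallow depth of field", "golden hour"),
)
```

The point isn't the helper function; it's the hard limit. If you can't fit a modifier in the top three, it wasn't earning its place.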
Mistake 2: Bad audio
The single biggest "tells AI" giveaway isn't the visuals — it's the audio. Specifically: tinny ElevenLabs voice with no compression, music that's too loud and never ducks for the voice, no ambient sound at all.
Your ear is more trained than your eye. People register bad audio in 0.5 seconds and lose trust in the entire video.
Fix:
- Run your voice through a free compressor (CapCut has one built in).
- Duck music to -12 dB whenever the voice is playing.
- Add ambient sound under every scene — a low room tone, distant wind, faint reverb.
- Your voice should peak between -6 and -3 dB, not maxed out.
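To make "ducking" concrete, here's a toy sketch in Python. It applies a hard gate on raw amplitude samples with no smoothing, so it illustrates the math only — your editor's sidechain tool does this properly. The function name and threshold are assumptions of mine:

```python
def duck_music(music, voice, duck_db=-12.0, threshold=0.01):
    """Lower music gain wherever the voice track is active.
    duck_db is the reduction applied under speech; a real editor
    would also smooth the gain change (attack/release)."""
    gain = 10 ** (duck_db / 20)  # -12 dB as a linear multiplier (~0.25)
    return [m * gain if abs(v) > threshold else m
            for m, v in zip(music, voice)]

# Music stays at full level where the voice is silent,
# and drops ~12 dB under speech
ducked = duck_music([1.0, 1.0, 1.0], [0.0, 0.5, 0.0])
```

Note the scale: -12 dB is roughly a quarter of the original amplitude, which is why properly ducked music feels like it "steps back" rather than disappearing.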
Mistake 3: Switching styles between shots
Shot 1 looks anime. Shot 2 looks photoreal. Shot 3 is suddenly painterly. Even if each shot is technically good, the video falls apart because there's no visual consistency.
This happens when creators get excited and try multiple looks within a single video. It always looks worse than committing to one style.
Fix: Pick a style at the start of every project and don't deviate. If you want to try a different style, save it for the next video — don't blend mid-piece.
Mistake 4: Generic, lifeless first second
Your hook frame is a slow zoom on a face. Or a wide establishing shot of a landscape. Or a generic "atmospheric" image with no clear subject. Viewers swipe in 0.8 seconds.
Your first frame has to make someone freeze their thumb. That happens through specificity, not spectacle.
"Specific" beats "epic" every single time. A close-up on a sword being drawn beats a wide shot of an army. A face mid-emotion beats a beautiful landscape. Detail beats scope.
Fix: Look at your first frame. If it could be the first frame of any AI video on TikTok, it's wrong. Make it impossible to confuse with anything else.
Mistake 5: No story, just vibes
This is the failure mode I see most often. The video has nice shots, decent audio, consistent style. And nothing happens. There's no question being asked, no contradiction, no payoff. Just images.
Vibes are not a hook. The fact that you used AI to make something pretty doesn't carry the video on its own anymore — that novelty died in 2024.
Fix: Before you generate anything, write the video as one sentence. "This video tells the story of [X] who [Y] and the result was [Z]." If you can't write that sentence, the video doesn't have a story yet. Don't shoot until it does.
Mistake 6: Ignoring the cliffhanger
The last 2 seconds end with "Thanks for watching!" or trail off into music. You just told the algorithm "this video is over, recommend something else."
The last 2 seconds of every short determine whether someone watches it twice (the algorithm's strongest engagement signal) or whether they comment. Both of those depend on an unresolved moment, not a wrap-up.
Fix: End on one of these:
- An unresolved question ("But what happened to the boy?")
- A contradiction reveal ("And the strangest part? He survived.")
- A visual punctuation that recontextualizes the whole video
- A cliffhanger setting up part 2
Mistake 7: Captions you can't read
Tiny captions. White on white background. Captions that don't sync to the voice. Captions in a script font that looks elegant but is unreadable on a 6-inch phone screen at arm's length.
85% of mobile viewers watch with sound off. If they can't read your captions, they leave at 2 seconds.
Fix:
- Hard-burn captions (don't rely on platform auto-captions).
- Use a bold sans-serif (Inter, Helvetica, Bebas, JetBrains Mono).
- White text with a 2-3px black stroke, or text on a solid colored block.
- One short phrase per caption (3-6 words).
- Sync captions exactly to the spoken word, not lagging behind.
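Both the phrasing rule and the sync rule can be scripted. Below is a small sketch that chunks a transcript into SRT captions of at most six words each; `per_word` is an assumed average speaking rate, so swap in real word timestamps from your TTS or transcription tool when you have them:

```python
def to_srt(words, start=0.0, per_word=0.35, max_words=6):
    """Emit SRT caption blocks of at most max_words words each,
    timed by an assumed per-word speaking rate."""
    def ts(t):  # seconds -> HH:MM:SS,mmm
        h, rem = divmod(t, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02}:{int(m):02}:{int(s):02},{round((s % 1) * 1000):03}"
    blocks, t = [], start
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        end = t + per_word * len(chunk)
        blocks.append(f"{i // max_words + 1}\n{ts(t)} --> {ts(end)}\n{' '.join(chunk)}\n")
        t = end
    return "\n".join(blocks)

print(to_srt("the strangest part is he survived anyway".split()))
```

Once you have the .srt file, hard-burning is commonly done with ffmpeg's `subtitles` filter (e.g. `ffmpeg -i clip.mp4 -vf "subtitles=caps.srt" out.mp4`), which also accepts a `force_style` option for overriding font and outline.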
The diagnostic checklist
Run a recent video through this list. Score each from 1-10:
- Does the prompt use 3 or fewer style modifiers? ____
- Is the audio properly mixed and ducked? ____
- Is the visual style consistent across every shot? ____
- Is the first frame specific and unmistakable? ____
- Can you describe the story in one sentence? ____
- Does the video end with a cliffhanger or unresolved moment? ____
- Are captions hard-burned, bold, sans-serif, and synced? ____
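If you want to make this scoring habitual, the checklist drops straight into a script. A minimal sketch — the names and cutoff logic are mine:

```python
CHECKLIST = (
    "3 or fewer style modifiers",
    "audio mixed and ducked",
    "consistent visual style",
    "specific first frame",
    "one-sentence story",
    "cliffhanger ending",
    "readable, synced captions",
)

def weakest_links(scores, cutoff=6):
    """Pair 1-10 scores with checklist items; return what to fix first."""
    if len(scores) != len(CHECKLIST):
        raise ValueError("one score per checklist item")
    flagged = [(item, s) for item, s in zip(CHECKLIST, scores) if s < cutoff]
    return sorted(flagged, key=lambda pair: pair[1])  # worst score first

# Example: audio (4) and story (5) are what this video needs next
print(weakest_links([8, 4, 7, 9, 5, 8, 10]))
```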
If anything scored under 6, that's what to focus on in your next video. The improvement compounds: fix one of these per project, and within 5 videos you'll be operating at a different level.
The meta-mistake
The biggest mistake of all isn't on this list — it's not finishing videos.
Most people producing AI animations spend 20 hours on shot 1, 8 hours on shot 2, then quit because shot 3 isn't perfect. They never publish, never learn what the algorithm rewards, never get the data feedback loop running.
Ship rough videos. Improve from feedback. Twenty published videos teach you more than two perfect ones.
Want me to review your work?
The academy includes weekly community feedback rounds where I and other students review each other's videos against this exact checklist. The fastest way to spot what's broken is to have a fresh set of eyes on it.
Join the academy / $29.99/mo →