Automated Short-Form Video Production: The Complete Technical Pipeline from HTML Templates to FFmpeg

The Raw Efficiency of Short-Form Video

A 15-second video conveys far more information than a 200-word text block. On Instagram Reels, YouTube Shorts, and TikTok, short-form videos have the highest reach and engagement rates of any content format.

The problem: producing a short-form video is incredibly time-consuming.

Manually editing a 15-second video in Premiere Pro or CapCut, from concept to completion, typically takes 30-60 minutes. What if you need to post 3-5 per day?

Our solution: automate the production with code.

System Architecture Overview

Ultra Lab's automated short-form video production system has three stages:

HTML Animation Templates -> Playwright Capture -> FFmpeg Compositing

Stage 1: HTML Animation Templates

Use web technologies (HTML + CSS + JavaScript) to create every frame of the video animation.

Why HTML instead of After Effects?

  • Programmable: Text, numbers, and colors can all be controlled with variables
  • Templatized: One template can be used with hundreds of different content variations
  • No Adobe license needed: Built on open web standards and free tooling, zero cost
  • Version controlled: Templates are code -- manageable with Git

A typical template structure:

<div class="video-container" style="width:1080px; height:1920px;">
  <div class="background-animation">...</div>
  <div class="text-layer">
    <h1 class="hook-text">Did you know?</h1>
    <p class="main-content">90% of people don't know this...</p>
  </div>
  <div class="cta-layer">
    <span>Follow @ultralab.tw</span>
  </div>
</div>
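To illustrate how one template can serve many content variations, here is a minimal Python sketch. It assumes the template carries placeholders named `$hook`, `$body`, and `$handle`; those names are hypothetical, since the original template shows literal text in those positions, and the background layer is omitted for brevity.

```python
from html import escape
from string import Template

# Hypothetical placeholder names ($hook, $body, $handle); the original
# template shows literal text in these positions.
TEMPLATE = Template("""\
<div class="video-container" style="width:1080px; height:1920px;">
  <div class="text-layer">
    <h1 class="hook-text">$hook</h1>
    <p class="main-content">$body</p>
  </div>
  <div class="cta-layer">
    <span>Follow $handle</span>
  </div>
</div>
""")


def render_template(hook: str, body: str, handle: str = "@ultralab.tw") -> str:
    """Fill the template, HTML-escaping the copy so it cannot break the markup."""
    return TEMPLATE.substitute(
        hook=escape(hook), body=escape(body), handle=escape(handle)
    )
```

Calling `render_template("Did you know?", "90% of people don't know this...")` returns the markup with the copy filled in, ready to be written to a file that the capture stage can open.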

CSS animations handle all entrance, emphasis, and transition effects:

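.hook-text {
  /* slideUp over 0.6s, starting after a 0.5s delay */
  animation: slideUp 0.6s ease-out 0.5s both;
}
.main-content {
  /* fadeIn over 0.8s, starting after a 1.5s delay */
  animation: fadeIn 0.8s ease-out 1.5s both;
}
/* The keyframes are not shown in the original; these definitions are illustrative assumptions. */
@keyframes slideUp { from { opacity: 0; transform: translateY(40px); } to { opacity: 1; transform: none; } }
@keyframes fadeIn { from { opacity: 0; } to { opacity: 1; } }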

Stage 2: Playwright Capture

Playwright is a browser automation library that can drive browsers in headless mode. We use it to:

  1. Open the HTML template page
  2. Wait for animations to complete
  3. Capture screenshots frame by frame (30 FPS = 30 images per second)
  4. Output as an image sequence

Why Playwright over Puppeteer?

  • Supports more browser engines (Chromium, Firefox, and WebKit)
  • More accurate CSS animation rendering
  • Built-in waiting mechanisms, less prone to dropped frames

Each frame is a 1080x1920 PNG image. A 15-second video at 30 FPS produces 450 images (15 x 30).
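The capture stage could be sketched roughly as below, using Playwright's Python API. This is not necessarily how the original system does it: instead of letting the animations play in real time, the sketch pauses every animation on the page and seeks it to each frame's timestamp through the browser's Web Animations API, which is one way to avoid dropped frames. The function names are hypothetical, and it assumes Playwright and its Chromium build are installed.

```python
from pathlib import Path


def frame_times_ms(duration_s: float, fps: int = 30) -> list[float]:
    """Timestamp in milliseconds of every frame (15 s at 30 FPS gives 450 entries)."""
    total = round(duration_s * fps)
    return [i * 1000.0 / fps for i in range(total)]


def capture_frames(url: str, out_dir: str, duration_s: float, fps: int = 30) -> int:
    """Seek the page's animations to each timestamp and save one PNG per frame."""
    # Imported here so the timing helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    times = frame_times_ms(duration_s, fps)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        # Viewport matches the 1080x1920 template size.
        page = browser.new_page(viewport={"width": 1080, "height": 1920})
        page.goto(url)
        for i, t in enumerate(times):
            # Pause every animation and jump it to time t (Web Animations API).
            page.evaluate(
                "t => document.getAnimations().forEach("
                "a => { a.pause(); a.currentTime = t; })",
                t,
            )
            page.screenshot(path=str(Path(out_dir) / f"frame_{i:04d}.png"))
        browser.close()
    return len(times)
```

For a 15-second template, `capture_frames("file:///path/to/template.html", "frames", 15)` would write `frame_0000.png` through `frame_0449.png`; the URL here is a placeholder.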

Stage 3: FFmpeg Compositing

FFmpeg is the Swiss Army knife of audio/video processing. We use it to composite the image sequence into the final video:

ffmpeg -framerate 30 -i frame_%04d.png \
  -i background_music.mp3 \
  -c:v libx264 -pix_fmt yuv420p \
  -shortest output.mp4
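For batch use, the same command can be assembled in code. The sketch below builds the argument list in Python and runs it with `subprocess`; the function names are hypothetical, and the `-y` (overwrite output) and `-c:a aac` (explicit audio codec) flags are additions not present in the command above.

```python
import subprocess


def build_ffmpeg_cmd(frames_pattern: str, music_path: str,
                     output_path: str, fps: int = 30) -> list[str]:
    """Return the FFmpeg argument list for one video."""
    return [
        "ffmpeg", "-y",                       # -y: overwrite output (assumption)
        "-framerate", str(fps), "-i", frames_pattern,
        "-i", music_path,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac",                        # explicit audio codec (assumption)
        "-shortest", output_path,             # stop at the shorter input
    ]


def encode(frames_pattern: str, music_path: str, output_path: str) -> None:
    """Run FFmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(
        build_ffmpeg_cmd(frames_pattern, music_path, output_path), check=True
    )
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.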

During this stage, we also add:

  • Background music: Automatically selected from a preset music library
  • Sound effects: Notification sounds when text appears
  • Subtitle tracks: Auto-generated SRT subtitles
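As one possible way to auto-generate the SRT file, the sketch below turns a list of timed cues into SRT text. The cue format `(start_seconds, end_seconds, text)` and the function names are assumptions for illustration; in practice the start times could mirror the CSS animation delays in the template.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def build_srt(cues: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) cues."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

For example, `build_srt([(0.5, 1.5, "Did you know?")])` produces a single cue running from `00:00:00,500` to `00:00:01,500`. The resulting file can then be passed to FFmpeg as an extra input; for MP4 output the subtitle stream is typically encoded with `-c:s mov_text`.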

Three Psychological Trigger Templates

We've designed three categories of proven short-form video templates, each targeting a different psychological trigger:

Fear Type

Open with alarming data or facts to trigger "Do I have this problem too?" anxiety.

Examples:

  • "90% of people won't have enough retirement savings"
  • "Your password may have already been leaked"

Efficiency Type

Show a quick way to solve a problem, making viewers think "It's that simple?"

Examples:

  • "3 steps to automate your IG posting"
  • "This tool saves me 2 hours every day"

Greed Type

Showcase potential gains or opportunities to trigger "I want that too" desire.

Examples:

  • "This side hustle earns $1,500/month"
  • "A single SaaS tool generating $30,000/year in revenue"

Each category has 3-5 visual variations, totaling 9-15 templates that rotate to prevent viewer fatigue.
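The rotation itself can be very simple. A minimal sketch, assuming each category maps to a non-empty list of template names (all names below are hypothetical), alternates categories and steps through each category's variations in turn:

```python
from itertools import cycle, islice


def rotation(templates_by_category: dict[str, list[str]]):
    """Yield (category, template) pairs, alternating categories and variations."""
    # One endless iterator per category; each list must be non-empty.
    variations = {cat: cycle(names) for cat, names in templates_by_category.items()}
    for cat in cycle(templates_by_category):
        yield cat, next(variations[cat])


# Hypothetical template names for illustration only.
TEMPLATES = {
    "fear": ["fear_a", "fear_b"],
    "efficiency": ["efficiency_a"],
    "greed": ["greed_a", "greed_b"],
}

# Pick templates for the next four videos.
next_four = list(islice(rotation(TEMPLATES), 4))
```

With this ordering, the same variation only comes back after every other variation in its category has been used.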

Batch Production Workflow

The complete workflow in practice:

  1. Once per week: Set the week's topics and content direction
  2. AI auto-generates: Gemini creates copy based on template type
  3. Auto-template insertion: Code injects the copy into HTML templates
  4. Batch capture: Playwright captures each template sequentially
  5. Batch compositing: FFmpeg batch-processes all videos
  6. Scheduled publishing: Videos automatically enter the publishing queue

Batch-processing 20 videos takes approximately 15-20 minutes (depending on machine performance).
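Steps 3 through 5 of the workflow could be tied together with a small driver like the sketch below. The `VideoJob` fields and the `render`, `capture`, and `encode` callables are hypothetical stand-ins for the template, Playwright, and FFmpeg stages; the sketch only shows the sequencing and how a single failed video need not stop the batch.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class VideoJob:
    slug: str       # used for folder and file names
    template: str   # template identifier
    copy: dict      # text to inject, e.g. {"hook": ..., "body": ...}


def run_batch(jobs, work_dir, render, capture, encode):
    """Run template insertion, capture, and compositing for each job in order.

    Failures are collected per job instead of aborting the whole batch.
    """
    outputs, failures = [], []
    for job in jobs:
        job_dir = Path(work_dir) / job.slug
        try:
            html_path = render(job, job_dir)          # step 3: inject copy
            frames_dir = capture(html_path, job_dir)  # step 4: capture frames
            video = encode(frames_dir, job_dir / f"{job.slug}.mp4")  # step 5
            outputs.append(video)
        except Exception as exc:
            failures.append((job.slug, exc))
    return outputs, failures
```

Passing the three stages in as functions keeps the driver easy to test with stubs before any browser or encoder is involved.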

Cost Structure

Item                          Cost
HTML template development     One-time (included in service)
Playwright + FFmpeg           Open-source, free
AI copy generation            NT$300-500/month
Server / local compute        Existing hardware is sufficient
Background music licensing    Free asset libraries
Monthly total                 NT$300-500

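Outsourcing a single short-form video costs NT$1,000-3,000. At 3-5 videos per day (roughly 90-150 per month), that comes to NT$90,000-450,000 per month, so at that volume the automated system's NT$300-500 in monthly running costs is well under 1/100th of the outsourced cost.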

Who Is This For?

  • Brand owners: Need consistent short-form video output but don't have an editing team
  • Content creators: One person managing short-form video across multiple platforms
  • Marketing agencies: Batch-producing short-form videos for clients
  • E-commerce sellers: Product showcases, promotional countdowns, unboxing videos

Conclusion

Automated short-form video production doesn't require After Effects skills or expensive software licenses. With the open-source combination of HTML + Playwright + FFmpeg, you can build a high-efficiency short-form video production pipeline.

Want to learn more about the technical details, or ready to start using our system? Free consultation -- we reply within 24 hours.
