AI tech debt · architecture · template engine · LLM · Gemini · engineering decisions

The Real Fix for AI Tech Debt: Don't Use Less AI — Limit Its Scope

49 min read

TL;DR

Our AI website generator UltraSite used to have Gemini generate complete HTML from scratch (400-700 lines). Quality was a coin flip. Sometimes stunning, sometimes broken.

After reading an article about AI creating a new kind of tech debt, we didn't "use less AI." We redrew the responsibility boundary:

  • AI handles: Content strategy (taglines, about copy, blog titles, color decisions)
  • Humans handle: HTML structure, CSS animations, visual quality

Result: 5-10x fewer tokens, 3-4x faster generation, quality went from random to consistent.


The Trigger: An Article That Made Us Stop and Think

Harsh's article on Dev.to identifies three types of AI tech debt:

  1. Cognitive Debt — Developers use AI to write code but don't understand why it's structured that way
  2. Verification Debt — Tests pass ≠ actually correct. Green CI creates false confidence
  3. Architectural Debt — AI prefers repetition over abstraction, scattering slightly different implementations everywhere

One developer's quote captured it perfectly:

"I used to be a craftsman... and now I feel like I am a factory manager at IKEA."

The numbers he cites are sobering:

  • AI writes 41% of all new commercial code in 2026
  • Experienced developers see a 19% productivity decrease with AI tools
  • Fortune 50 companies saw a 10x increase in security vulnerabilities in 6 months

My first reaction wasn't "we have this problem." It was "we literally just fixed this three days ago."


UltraSite v1: A Living Textbook of AI Tech Debt

UltraSite is one of our products — paste your Threads URL, get a personal website in 30 seconds.

v1 architecture:

User pastes Threads URL
    ↓
Jina Reader fetches markdown
    ↓
367-line mega prompt → Gemini
    ↓
Gemini outputs 400-700 lines of complete HTML
    ↓
Post-processing: inject GSAP animation JS
    ↓
Render in iframe

That 367-line prompt specified everything:

  • Tailwind CSS CDN configuration
  • GSAP + Lenis + ScrollTrigger animation system
  • Glassmorphism card effects
  • Parallax orbs with configurable speeds
  • Custom cursor with mix-blend-mode
  • SVG noise texture overlay
  • 5 complete page sections with exact layout specs
  • Typography hierarchy
  • Color derivation logic from bio content

Every single generation, Gemini had to reinvent the wheel — rewrite identical CSS, reassemble identical HTML structure, re-decide identical font settings.

This perfectly matches Harsh's three debt types

Cognitive debt: We couldn't always explain why Gemini made certain layout decisions. Why did this orb end up top-right instead of bottom-left? Why text-7xl instead of text-8xl? Answer: because LLMs are stochastic.

Verification debt: Looks okay in the iframe? Ship it. But did anyone test responsive? Are animations firing? Is color contrast accessible? Every generation was a gamble.

Architectural debt: The 367-line prompt itself was a massive blob of tech debt. Every modification meant hunting for the right spot in 367 lines of natural language. And Gemini routinely ignored half the rules anyway.


The Epiphany: AI's Problem Isn't "Too Much" — It's "Wrong Scope"

Harsh suggests "treat AI like a brilliant junior developer." We went further:

AI should do what it's genuinely good at. Nothing else.

What AI (LLMs) actually excel at:

  • ✅ Understanding context and voice
  • ✅ Extracting core themes from large text
  • ✅ Making brand positioning and copy decisions
  • ✅ Determining what color palette "feels right" for a person

What AI is bad at:

  • ❌ Producing consistent HTML structure
  • ❌ Using correct Tailwind CSS class names (frequently invents non-existent classes)
  • ❌ Maintaining visual quality consistency
  • ❌ Remembering every rule in a 367-line instruction set

The answer was clear: remove HTML/CSS from AI's responsibility entirely.


UltraSite v2: Template + JSON Architecture

New architecture:

User pastes Threads URL + selects template style
    ↓
Jina Reader fetches markdown (unchanged)
    ↓
80-line prompt → Gemini
    ↓
Gemini outputs ~50 lines of JSON (pure content, no HTML)
    ↓
Template engine: JSON + hand-crafted template → complete HTML
    ↓
Post-processing: inject animation JS (unchanged)
    ↓
User can switch templates instantly (no AI re-call needed)

Gemini now outputs only this:

{
  "roleLabel": "Founder",
  "tagline": ["Deep", "Insight", "Builder"],
  "subtitle": "From finance to gaming, from thinking to action",
  "aboutHeading": "Some people's excuses are more active than their hands.",
  "aboutParagraphs": ["...", "..."],
  "identityCards": [
    { "emoji": "📈", "title": "Finance Pro", "description": "..." }
  ],
  "blogArticles": [
    { "title": "Night Views & Life Lessons", "content": "...", "expanded": "..." }
  ],
  "connectHeading": "Public feed for ideas. Private channel for real talk.",
  "colorTheme": "violet"
}

Zero HTML. AI only makes content decisions, never touches structure.
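The template engine that consumes this JSON can stay tiny. Here is a minimal sketch of the {{slot}} substitution approach (function and slot names are illustrative; the production engine is ~100 lines and also handles repeated sections like identity cards):

```javascript
// Replace {{slot}} markers in a hand-crafted HTML template with JSON fields.
function renderTemplate(template, slots) {
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => {
    const value = slots[key];
    if (value === undefined) return '';  // missing field -> empty string, never broken HTML
    return Array.isArray(value) ? value.join(' ') : String(value);
  });
}

const hero = '<h1>{{tagline}}</h1><p>{{subtitle}}</p>';
const html = renderTemplate(hero, {
  tagline: ['Deep', 'Insight', 'Builder'],
  subtitle: 'From finance to gaming, from thinking to action',
});
// html === '<h1>Deep Insight Builder</h1><p>From finance to gaming, from thinking to action</p>'
```

Switching templates is just re-running this function with a different template string against the same cached JSON, which is why no AI re-call is needed.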

Templates are hand-crafted by humans

We built three HTML templates, each thoroughly tested for:

  • Complete responsive behavior
  • Correct GSAP animation class hooks
  • Color system with CSS variable integration
  • All edge cases (no profile photo? Gradient circle with initials)
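The "no profile photo" fallback, for instance, is pure deterministic code rather than an AI decision. A hypothetical sketch of deriving the initials shown in the gradient circle:

```javascript
// Derive up to two initials for the gradient-circle avatar fallback.
// Name and exact behavior are illustrative, not the production helper.
function initialsFor(displayName) {
  const words = displayName.trim().split(/\s+/).filter(Boolean);
  return words.slice(0, 2).map(w => w[0].toUpperCase()).join('');
}
// initialsFor('ada lovelace') -> 'AL'
```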
| Template | Style | Signature |
| --- | --- | --- |
| Midnight Glass | Glassmorphism | Frosted cards, parallax orbs, violet gradients |
| Neon Terminal | Hacker aesthetic | Pure black, scanlines, monospace, hard borders |
| Soft Brutalism | Editorial | Oversized type, thick color borders, offset shadows |

All three share the same animation system (GSAP + Lenis). Only the visual style differs.


The Numbers

| Metric | v1 (mega prompt) | v2 (template + JSON) |
| --- | --- | --- |
| Prompt length | 367 lines | ~80 lines |
| Gemini output | 8,000-15,000 tokens | 800-1,500 tokens |
| Generation time (Gemini) | 4-8 seconds | 1-2 seconds |
| HTML quality consistency | Random | Stable |
| Vercel timeout risk | High (10s limit) | Safe |
| Template switching | ❌ Full regeneration | ✅ Instant |
| New Vercel functions | — | +0 (stays at 11/12) |

Token consumption dropped 5-10x. This isn't just about cost — it makes the entire system predictable.


This Pattern Generalizes

What we learned:

1. AI decides, systems execute

Ask AI "what's this person's brand positioning?" not "build me a website with brand positioning."

The first is AI's sweet spot. The second asks AI to simultaneously be a strategist, designer, and frontend engineer — with no quality guarantee on any of the three.

2. Structured output > free-form text

responseMimeType: 'application/json' is the single most effective Gemini config we've used.

Forced JSON output means:

  • No "Here's the HTML I generated for you:" preamble
  • Every field can be validated with fallbacks
  • Output format is 100% predictable
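"Every field can be validated with fallbacks" looks roughly like this in practice. A minimal sketch with hypothetical fallback values (only three fields shown; field names mirror the JSON payload above):

```javascript
// Per-field validation with fallbacks for the Gemini JSON output.
// Fallback values here are illustrative, not the production defaults.
function withFallbacks(raw) {
  const data = raw && typeof raw === 'object' ? raw : {};
  return {
    roleLabel: typeof data.roleLabel === 'string' ? data.roleLabel : 'Creator',
    tagline: Array.isArray(data.tagline) && data.tagline.length > 0
      ? data.tagline.map(String)
      : ['Builder'],
    colorTheme: typeof data.colorTheme === 'string' ? data.colorTheme : 'violet',
  };
}
```

Even when Gemini drops or mangles a field, the template engine always receives a complete, well-typed object.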

3. Hand-craft the irreplaceable parts

Templates are UltraSite's moat. Anyone can ask Gemini to generate a website. Not everyone can hand-craft templates with GSAP parallax animations, glassmorphism effects, and custom cursors in a dark theme.

The AI-generated part (copy) is replaceable. The human-built part (templates) is not.

4. Give users a sense of control

v1 was a black box — input URL, get output, don't like it? Regenerate and pray.

v2 lets users choose templates, see brand analysis, switch styles. Same AI output, but users feel ownership over the result.


Responding to Harsh's Advice

Harsh says "treat AI like a brilliant junior developer." I'd refine that:

Treat AI like a creative director with impeccable taste but unreliable hands.

Let them set the direction: brand positioning, copy tone, color strategy. But don't let them write production code. A dedicated system (the template engine) turns their ideas into reliable output.

This isn't "using less AI." Our AI usage didn't decrease — every generation still calls Gemini. But AI's scope of responsibility is strictly bounded.

The most precise line from Harsh's article:

"At what point did we stop building software and start just generating it?"

Our answer: We do both. AI generates content. Humans build systems.


Implementation Details (For Engineers)

If you're building something similar, here's our tech stack:

  • Gemini 2.5 Flash + responseMimeType: 'application/json' + thinkingBudget: 0
  • Template format: HTML strings + {{slot}} markers (custom ~100-line engine, no Mustache/Handlebars)
  • Animations: GSAP 3 + ScrollTrigger + Lenis (injected into all templates as fixed JS)
  • CDN: Tailwind CSS CDN + Google Fonts (generated HTML is a standalone single file, no build step)
  • Images: Server-side download + base64 embedding (bypasses Threads CDN referrer restrictions)
  • Vercel: All new files use _ prefix (don't count as serverless functions), staying at 11/12 limit
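The first bullet condenses into the Gemini request config. A hedged sketch, assuming the `@google/genai` JavaScript SDK (verify field names against the current SDK docs; the surface changes between versions):

```javascript
// Generation config for forced-JSON, no-thinking output on Gemini 2.5 Flash.
// Field names assume the @google/genai SDK; check current docs before use.
const generationConfig = {
  responseMimeType: 'application/json',   // raw JSON, no prose preamble
  thinkingConfig: { thinkingBudget: 0 },  // skip the thinking phase for latency
};

// The call itself would look roughly like (not runnable without an API key):
//
//   import { GoogleGenAI } from '@google/genai';
//   const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
//   const res = await ai.models.generateContent({
//     model: 'gemini-2.5-flash',
//     contents: prompt,
//     config: generationConfig,
//   });
//   const content = JSON.parse(res.text);
```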

Conclusion

AI tech debt is real. But the fix isn't going back to hand-writing everything.

The fix is drawing a line: this side is AI's responsibility, that side is the system's responsibility. What AI handles should be small, verifiable, and have fallbacks. What the system handles should be stable, tested, and human-built.

Generate fast. Understand everything. But most importantly — build systems you can control.


Try UltraSite v2 → ultralab.tw/create

Scan your AI system for security vulnerabilities → ultralab.tw/probe

Weekly AI Automation Playbook

No fluff — just templates, SOPs, and technical breakdowns you can use right away.

Join the Solo Lab Community

Free resource packs, daily build logs, and AI agents you can talk to. A community for solo devs who build with AI.

Need Technical Help?

Free consultation — reply within 24 hours.