Content Cascade Engine: Write One Blog Post, Auto-Generate 5 Social Posts
The Worst Part of Blogging Isn't Writing — It's Everything After
If you create content regularly, you know the drill.
You spend two hours writing a solid technical blog post. You hit save. You feel accomplished.
Then you remember: "I still need to turn this into social media posts."
So you open Threads, stare at the blank compose box, and start thinking: What's the core hook? How do I split this into multiple posts? Each one needs a different angle but they should still be cohesive...
Another hour gone.
Final output: 6 pieces of content (1 blog + 5 social posts). Time spent: 3 hours. One-third of that time was spent rephrasing things you already wrote.
That's not creation. That's manual labor. And manual labor should be automated.
Content Cascade Engine: Automatic Content Multiplication
In late March, I built a system called Content Cascade Engine.
The concept is dead simple: you write a blog post, the system automatically splits it into social media posts.
┌─────────────────────────────────────────────────┐
│ Content Cascade │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Blog MD │───▶│ Ollama │───▶│MindThread│ │
│ │(new post)│ │ultralab:7b│ │ API │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │ │ │ │
│ Daily 07:00 Split into Auto-publish │
│ scan for new 3-5 posts with to Threads │
│ blog posts different hooks @ultralab.tw │
│ │
│ Cost: $0 Model: local Manual: 0 min │
└─────────────────────────────────────────────────┘
It runs on WSL2, triggered by a systemd timer every morning at 7 AM. The whole pipeline finishes in about 45 seconds.
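For reference, the schedule is just a pair of systemd user units. A minimal sketch (unit names and the script path are illustrative, not the production files):

```ini
# ~/.config/systemd/user/content-cascade.service
[Unit]
Description=Content Cascade: split new blog posts into social posts

[Service]
Type=oneshot
ExecStart=%h/bin/content-cascade.sh

# ~/.config/systemd/user/content-cascade.timer
[Unit]
Description=Run Content Cascade daily at 07:00

[Timer]
OnCalendar=*-*-* 07:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `systemctl --user enable --now content-cascade.timer`. `Persistent=true` matters on WSL2: if the machine was off at 07:00, the run fires as soon as systemd comes back up.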
Full Pipeline Breakdown
Step 1: Detect New Posts
# content-cascade.sh core logic (simplified)
BLOG_DIR="/mnt/c/Users/User/UltraLab/content/blog"
POSTED_LOG="$HOME/.openclaw/data/cascade-posted.json"
# Find .md files newer than the last run
NEW_POSTS=$(find "$BLOG_DIR" -name "*.md" -newer "$POSTED_LOG" -type f)
if [ -z "$NEW_POSTS" ]; then
echo "[cascade] No new posts found. Exiting."
exit 0
fi
The system checks every candidate against a JSON log of already-cascaded posts, so each post is cascaded exactly once. Note that `find -newer` will flag an edited old post (editing bumps its mtime), but the log check filters it out before anything is published.
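The log check isn't shown in the simplified script above. A sketch of what it can look like, assuming the log is a flat JSON array of filenames (the real log format may differ):

```shell
# Dedup against the posted log (sketch — assumes a JSON array of filenames)
POSTED_LOG="${POSTED_LOG:-$HOME/.openclaw/data/cascade-posted.json}"

# Returns 0 if the file was already cascaded
already_posted() {
  local slug="$1"
  jq -e --arg slug "$slug" 'index($slug) != null' "$POSTED_LOG" > /dev/null
}

# Appends a filename to the log atomically (write to temp, then rename)
mark_posted() {
  local slug="$1" tmp
  tmp=$(mktemp)
  jq --arg slug "$slug" '. + [$slug]' "$POSTED_LOG" > "$tmp" && mv "$tmp" "$POSTED_LOG"
}
```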
Step 2: Extract Content
Read the Markdown file, strip the frontmatter, and feed the raw text to Ollama.
# Extract body (strip YAML frontmatter; assumes the file opens with a --- block)
CONTENT=$(sed '1{/^---$/!q;};1,/^---$/d' "$POST_FILE")
# Pull title and tags out of the frontmatter
TITLE=$(grep -m1 '^title:' "$POST_FILE" | sed 's/title: *"\(.*\)"/\1/')
TAGS=$(grep -m1 '^tags:' "$POST_FILE")
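For context, the extraction assumes the standard Hugo/Jekyll-style layout. This sample is illustrative, not an actual post:

```markdown
---
title: "How I Manage 5 Products Solo: My Coordinator Architecture"
tags: [solodev, ai-agents]
date: 2025-03-28
---

Blog body starts here. Everything above the closing --- is dropped;
TITLE and TAGS come from the frontmatter lines.
```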
Step 3: Ollama Splitting (The Core)
This is the heart of the system. The prompt design determines output quality.
PROMPT=$(cat <<'PROMPT_END'
You are Ultra Lab's social content strategist.
## Task
Split the following blog post into 3-5 Threads posts.
## Rules
1. Each post must stand alone — readers who haven't seen the blog must understand it
2. Each post: 200-280 characters for English (or ~200-280 Chinese characters)
3. Each post must open with a DIFFERENT hook type:
- Contrarian take ("Everyone says X, but actually...")
- Personal experience ("Last week I spent 3 hours on...")
- Data-driven ("We tested for 30 days. The results...")
- Question lead ("Have you ever wondered why...")
- Practical tip ("This method saves me 1 hour daily:")
4. Tone: developer-casual, not marketing-speak. Use "I" and "you" freely
5. End each post with a CTA or thought-provoking question
6. Minimal emoji — max 1-2 per post
7. Include 1-2 relevant hashtags
## Output format
Separate each post with ---. Output only the post content, no numbering or explanations.
## Blog title
{TITLE}
## Blog content
{CONTENT}
PROMPT_END
)
A few design decisions worth highlighting:
Diverse hook types — Forcing a different opening style on each post keeps them from sounding like they came from the same mold. Without this rule, the model starts four out of five posts with "Did you know..."
Precise length — 200-280 characters is the Threads sweet spot: too short lacks depth, too long loses readers. Across 300+ published posts, the 200-280 character range earned a 34% higher read-through rate than posts over 300 characters.
Standalone requirement — The most critical rule. Social posts can't assume the reader has seen your blog; each post must carry its own context.
Tone control — "Developer-casual, not marketing-speak" does more work than a paragraph of instructions, and the model follows this directive surprisingly well.
Step 4: Ollama Inference
# Fill in the prompt placeholders, then let jq handle JSON escaping
FULL_PROMPT="${PROMPT//\{TITLE\}/$TITLE}"
FULL_PROMPT="${FULL_PROMPT//\{CONTENT\}/$CONTENT}"
RESPONSE=$(jq -n --arg prompt "$FULL_PROMPT" '{
    model: "ultralab:7b",
    prompt: $prompt,
    stream: false,
    options: { temperature: 0.75, top_p: 0.9, num_predict: 2048 }
  }' \
  | curl -s http://localhost:11434/api/generate --data-binary @- \
  | jq -r '.response')
We use our own fine-tuned ultralab:7b model (based on Qwen2.5:7b), running on an RTX 3060 Ti at about 13.2 tok/s. Generating 5 posts takes roughly 30-40 seconds.
Temperature 0.75 was found through iteration. 0.5 is too conservative — the hooks start sounding alike. 1.0 is too wild — occasionally goes off-topic. 0.75 balances diversity with accuracy.
Step 5: Publish via MindThread API
# Split response into individual posts on "---" delimiter lines.
# (IFS only splits on single characters, so swap the delimiter first.)
DELIM=$'\x1e'
IFS="$DELIM" read -r -d '' -a POSTS <<< "${RESPONSE//$'\n---\n'/$DELIM}" || true
for post in "${POSTS[@]}"; do
  CLEAN_POST=$(echo "$post" | sed '/^$/d' | head -20)
  [ -z "$CLEAN_POST" ] && continue
  # Publish to Threads via MindThread API (jq handles JSON escaping)
  jq -n --arg content "$CLEAN_POST" '{
      accountId: "vll91c1mmN8ZowrvfNuc",
      content: $content,
      source: "cascade"
    }' \
    | curl -s -X POST "http://localhost:18789/api/mindthread/post" \
        -H "Content-Type: application/json" \
        --data-binary @-
  sleep 300  # 5 min between posts to avoid spam detection
done
MindThread is our own Threads automation platform managing 27 accounts. Content Cascade feeds directly into MindThread's API for scheduled publishing to the @ultralab.tw account.
Real Example: One Blog Post to Five Social Posts
Let's use last week's article "How I Manage 5 Products Solo: My Coordinator Architecture" as an example.
Source Article
A 2,400-word technical post about how a solo developer uses AI Agents and Claude Code to manage 5 product lines simultaneously.
Cascade Output
Post 1 (Contrarian hook):
Running 5 products solo is suicide in the normal world. But what if your "team" is 4 AI agents + Claude Code?
I currently manage UltraLab, MindThread, Ultra Advisor, UltraTrader, and OpenClaw. Spread across 3 machines, 4 languages, 2 Firebase projects.
The key isn't how much time you have. It's whether you've automated the "deciding what to do" part too.
My system scans all 5 products every morning, ranks priorities, and I just execute.
The biggest enemy of a solo company isn't capability — it's decision paralysis. #solodev #AIautomation
Post 2 (Data hook):
I did the math. If each of my 5 products needed a 3-person team, I'd need 15 people.
What I actually have: myself + 4 AI agents + 34 systemd timers.
Last 30 days: UltraLab landed 3 clients, MindThread hit 3.3M views, Ultra Advisor published 63 SEO articles. All running simultaneously.
The secret isn't "do everything yourself." It's "build systems so you don't have to."
How many hours of your day are spent on repetitive decisions? #BuildInPublic
Post 3 (Practical tip hook):
This saves me 2 hours every day: offload context-switching to AI.
My Coordinator architecture: every morning, agents auto-aggregate git logs, error reports, and user feedback from all 5 products. They produce a single "Today's Action List."
I don't spend 30 minutes remembering "where did I leave off on that UltraTrader bug?" The system just tells me.
The human brain shouldn't be used for remembering. It should be used for judging. #productivity
Post 4 (Question hook):
Ever wonder why big companies can run 10 products while you're dying at 2?
It's not because they have more people. It's because they have systems.
The difference: enterprise systems are Jira + Slack + weekly standups. Solo dev systems can be AI agents + cron jobs + zero meetings.
I automated monitoring, reporting, and scheduling across all 5 products. Weekly reports are agent-written. Daily digests are systemd-triggered.
A solo company doesn't mean doing everything alone. It means building a system where one person is enough. #SaaS
Post 5 (Personal experience hook):
Last Wednesday morning, Ollama's memory spiked to 2GB and got killed by systemd.
Same time: 3 MindThread accounts had reach drops. An Ultra Advisor user reported a mobile layout bug.
Three fires, simultaneously. Old me would've panicked for 30 minutes.
But the system had already prioritized: Ollama (blocks all agents) > client bug > reach drop (can wait).
You don't build systems for the good days. You build them for the days everything breaks at once. #devlife
Quality Comparison: Handwritten vs Cascade
I ran a 30-day A/B test:
| Metric | Handwritten | Cascade | Delta |
|---|---|---|---|
| Avg. reach | 1,420 | 1,285 | -9.5% |
| Engagement rate | 4.2% | 3.8% | -0.4 pp |
| Read-through rate | 67% | 71% | +4 pp |
| Production time | 12 min/post | 0 min/post | -100% |
| Weekly output | 5-8 posts | 15-25 posts | +200% |
The verdict: Cascade posts are ~10% lower quality per-post, but 3x the volume.
Total reach = quality x quantity. Even at ~10% less reach per post, tripling the volume still lands you at roughly 2.7x the total reach.
One unexpected finding: Cascade posts actually have a higher read-through rate than handwritten ones. My theory: the prompt enforces a strict 200-280 character limit, which keeps every post in the sweet spot. When I write manually, I sometimes run past 400 characters and the read-through rate drops.
Systems are more disciplined than humans.
30-Day Data
Actual production numbers since Content Cascade went live:
Blog posts written: 12
Cascade social posts: 47
Avg. splits per article: 3.9
Auto-publish success rate: 97.9% (1 failed on API timeout, manually resent)
Ollama inference failures: 0
Daily compute cost: $0 (local GPU)
Total content output: 59 pieces (12 + 47)
If all handwritten: ~42 hours
Actual time spent: ~24 hours (blog only)
Time saved: 18 hours/month
18 hours per month freed from content repurposing. That's 18 hours I can spend writing more blog posts, building product, or sleeping.
Why Ollama Instead of Gemini / Claude API
This is the most common question I get. I have API access to both Gemini and Claude. So why does Content Cascade use local Ollama?
1. Cost
Cascade runs once daily, processes 1-3 articles, each roughly 2000 tokens input + 1500 tokens output.
| Option | Monthly Cost | Annual Cost |
|---|---|---|
| Ollama (local) | $0 | $0 |
| Gemini Flash | ~$0.30 | ~$3.60 |
| Claude Haiku | ~$1.80 | ~$21.60 |
| GPT-4o mini | ~$1.20 | ~$14.40 |
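The monthly figures follow from the volume above. A quick sanity-check helper, with per-million-token rates left as parameters since published prices change:

```shell
# Back-of-envelope monthly API cost:
# ~2 articles/day, 2000 tokens in + 1500 tokens out, 30 days.
# Arguments: input price and output price per 1M tokens (USD).
monthly_cost() {
  local in_price_per_m="$1" out_price_per_m="$2"
  awk -v ip="$in_price_per_m" -v op="$out_price_per_m" \
    'BEGIN { printf "%.2f\n", 30 * 2 * (2000 * ip + 1500 * op) / 1000000 }'
}

# Example: at $1/M for both input and output, the whole month costs $0.21
monthly_cost 1.0 1.0
```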
Gemini Flash is cheap. But $0 is cheaper. And the GPU is already running other agent tasks — the marginal cost is literally zero.
2. Privacy
Blog content is technically public, so privacy isn't a hard requirement. But I have a principle: data that doesn't need to leave your machine shouldn't leave your machine. Build the habit.
3. Reliability
APIs go down. Gemini had a 3-hour outage in early March — right during my cascade window. Ollama runs on my own hardware. As long as there's power, it's available.
4. Latency
Ollama runs locally. Zero network latency. The entire cascade pipeline completes in 45 seconds. Going through APIs would take 2-3 minutes (network + rate limits).
5. Quality Is Sufficient
Social media posts don't require Claude Opus-level reasoning. Qwen2.5 7B handles this task perfectly well. I tested Claude Sonnet vs Ollama outputs — the quality gap is about 5-8%. Social media readers won't notice the difference.
Using a frontier model for commodity tasks is waste. You don't drive a Ferrari to the grocery store.
Advanced: Self-Correcting Cascade
The system isn't perfect. The most common failure mode: Ollama occasionally produces malformed output — missing --- delimiters, or posts running past 300 characters.
I added a post-processing validation layer:
validate_post() {
  local post="$1"
  local char_count
  char_count=$(printf '%s' "$post" | wc -m)
  # Length check — truncate by characters, not bytes, so multi-byte
  # (Chinese) characters never get cut in half
  if [ "$char_count" -gt 320 ]; then
    echo "[cascade] Post too long ($char_count chars), truncating..." >&2
    post="${post:0:280}..."
  fi
  # Empty check (log to stderr so callers capturing stdout get only the post)
  if [ "$char_count" -lt 50 ]; then
    echo "[cascade] Post too short, skipping." >&2
    return 1
  fi
  printf '%s\n' "$post"
}
After adding this layer, the publish success rate jumped from 91% to 97.9%.
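The remaining failure in 30 days was an API timeout. A retry wrapper would catch that class of error too — a sketch, not the production code:

```shell
# Run a command with up to 3 attempts and exponential backoff.
# Usage: with_retry curl -s --fail -X POST "$API_URL" --data-binary "$PAYLOAD"
with_retry() {
  local attempt=1 max=3 delay=2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "[cascade] Giving up after $max attempts: $*" >&2
      return 1
    fi
    echo "[cascade] Attempt $attempt failed, retrying in ${delay}s..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```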
You Can Build This Too
The core concept behind Content Cascade is straightforward:
- A content source (blog, YouTube, podcast)
- An LLM (Ollama, Gemini, ChatGPT — anything works)
- A distribution channel (Threads API, Twitter API, IG API)
- A script to connect them
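The glue script can be this small. A hedged sketch, where `generate` and `publish` are stand-ins for whatever LLM and channel API you pick:

```shell
# Minimal generic cascade: ask the LLM for "---"-separated posts,
# split, and hand each one to the publisher.
# generate() and publish() are hypothetical wrappers you supply.
cascade() {
  local article="$1" response
  response=$(generate "Split this into 3-5 standalone posts, separated by ---:
$article")
  # Split on "---" delimiter lines (IFS is single-char, so swap the
  # delimiter for an unused control character first)
  local delim=$'\x1e'
  local -a posts
  IFS="$delim" read -r -d '' -a posts <<< "${response//$'\n---\n'/$delim}" || true
  local post
  for post in "${posts[@]}"; do
    post=$(printf '%s' "$post" | sed '/^[[:space:]]*$/d')  # drop blank lines
    [ -n "$post" ] && publish "$post"
  done
}
```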
If you'd rather not build it yourself, our UltraGrowth managed service includes Content Cascade:
- You write (or we write for you)
- The system auto-splits into social posts
- Auto-scheduled publishing across platforms
- Monthly performance reports
A blog post's value shouldn't stop at the blog. Let it cascade.
Technical Specs at a Glance
| Component | Spec |
|---|---|
| Schedule | systemd timer, daily 07:00 CST |
| LLM | Ollama ultralab:7b (Qwen2.5 7B base) |
| GPU | NVIDIA RTX 3060 Ti 8GB |
| Inference speed | ~13.2 tok/s |
| Processing time | ~45 sec/article |
| Publish target | Threads @ultralab.tw via MindThread API |
| Split count | 3-5 posts/article |
| Success rate | 97.9% |
| Monthly cost | $0 |
Write once, publish five times. That's what cascade means.