Case Study 01
Metra AI — Production SaaS for Telegram Content Automation
Solo-built end-to-end SaaS platform with multi-agent LLM orchestration.
From architecture to deployment in 3 months.
The business problem
Telegram channel owners and content teams spend enormous amounts of time creating posts manually.
Standard solutions either don't fit Telegram's specific UX (Buffer, Hootsuite are built for Instagram/Twitter),
or rely on raw ChatGPT output that produces generic, low-quality content requiring extensive manual editing.
The core pain: content teams spend 60–80% of their time on production instead of strategy,
and quality suffers because AI-generated content typically lacks brand voice, channel-specific context,
and real-time relevance.
What I built
A full SaaS platform that automates the entire Telegram content workflow:
- AI-powered post generation that maintains brand voice and channel lore
- Scheduling system with reusable weekly presets
- Real-time data integration for news-driven content
- Integrated Telegram CRM for managing leads without exposing owner account credentials
- Multi-account, multi-operator infrastructure for agencies running multiple channels
Architecture: why multi-agent instead of single LLM call
The core technical innovation is a multi-stage LLM orchestration pipeline rather than
relying on single API calls. This was a deliberate architectural choice based on a key insight:
LLMs perform poorly when given too many simultaneous constraints. A single 3000-token
prompt asking for "rewrite this post in voice X, with lore Y, in format Z, with rules
A/B/C" produces inconsistent results because attention dilutes across requirements.
Solution: decompose post generation into specialized stages, each with a single focused responsibility.
Standard posting pipeline
- Structure parser — extracts post anatomy and preserves links (which premium LLMs strip out by default)
- Content rewriter — handles each paragraph in isolated calls, preserving structural integrity
- Style enhancer — adds emojis and formatting based on channel persona
- Auto-rules validator — applies channel-specific rules (e.g., removing punctuation, enforcing length limits)
Extended posting pipeline (generates from scratch)
- Archetype selector — chooses post structure based on content type and length parameters
- Block-by-block generator — writes each section with focused context
- Style and formatting application
- Auto-rules validation
This architecture eliminates the typical AI failures — hallucinations mid-text, structural
drift, lore over-application, format breakage — that single-shot LLM calls produce.
Key technical innovations
1. Prompt sanitization layer
The image generation model initially refused to generate humans due to safety filters.
Rather than switching models, I built a sanitization layer that rewrites user prompts
to pass filters while preserving generation intent. This avoids the need to host more
expensive alternative models.
2. Lore compression and translation
User-provided channel lore (often 3000+ tokens of unstructured text) is compressed and
translated to the channel's posting language at upload time. The AI receives a structured,
language-matched summary instead of raw lore, dramatically improving content relevance
while reducing token costs.
3. Lore as context, not constraint
Discovered through iteration that AI over-applies lore when given it as direct instruction.
Architected lore as soft context that influences but doesn't dominate generation — producing
more natural content.
4. Premium LLM selection strategy
Chose specific LLMs for specific reasons:
- Lower censorship for legitimate edge cases
- Better real-time news integration via Perplexity-style APIs
- Cost optimization at scale
Infrastructure
- Multiple Ubuntu servers in production (main service, CRM, staging)
- 16 Docker containers orchestrated with appropriate separation of concerns
- Backend as single source of truth — all client requests flow through backend, never directly to AI providers or DB
- Monitoring stack: Prometheus, Grafana, Sentry for error tracking, Uptime Kuma for service health
- Encryption: All sensitive data (phone numbers, messages, passwords) encrypted with proper salt+pepper and GPU-resistant hashing. Decryption keys held off-server.
- Security: 2FA, JWT rotation, session fingerprinting, domain proxying
Outcome
3 months
Solo development to launch
16
Docker containers in production
Week 1
First paying users acquired
25/day
Automated posts per channel on paid tier
Launched to production within the 3-month development window. Multi-account CRM allows
agencies to manage operators without exposing channel owner credentials. Active early-stage
traction with growing user base.
Tech stack
| Backend | FastAPI · Python · Celery |
| Frontend | React · TypeScript · Next.js |
| Database | PostgreSQL · Redis |
| Infrastructure | Docker · Nginx · Multiple Ubuntu servers |
| Monitoring | Prometheus · Grafana · Sentry · Uptime Kuma |
| AI / LLM | Multi-provider stack (proprietary + open source) |
What this demonstrates
- Ability to architect and build production SaaS solo
- Deep understanding of LLM limitations and how to architect around them
- Production-grade security and infrastructure design
- Pragmatic cost optimization at architecture level
- End-to-end product thinking: business problem → technical solution → deployment → operations