Tarjam Docs
Tarjam is a real-time multilingual translation platform for Islamic talks, khutbas, and lectures. The Imam speaks, and every listener reads a live translation in their chosen language, at any audience size and at near-zero marginal cost per listener.
What's in these docs
| Section | Description |
|---|---|
| V2 Architecture | Decoupled Edge-Polling Architecture — FastAPI, TanStack, Cloudflare R2/KV, 100k+ listeners |
| JSON Payload Schema | Worker output format — segment types, CDN URL pattern, polling contract |
| HLS Architecture Decision | Why HLS over WebRTC at broadcast scale — four-layer overview and cost breakdown |
| HLS Egress Implementation | LiveKit Egress → Cloudflare R2 → CDN — step-by-step implementation |
System at a Glance (V2)
Imam mic
│ WebRTC (LiveKit)
▼
Python AI Worker (one per language track per masjid)
├─ STT → Deepgram (Language ID, Arabic/English code-switching)
├─ LLM → Gemini / GPT-4o-mini + Theological Context RAG
└─ Output → Structured JSON (text, term, quran, hadith, dua segments)
   writes to: Cloudflare R2 / KV → masjid123_en.json (every 1 second)
│
▼
Cloudflare CDN (1s edge cache TTL)
└─ serves 100,000 listeners at zero marginal egress cost
│ JSON over HTTP
▼
Listener UI (TanStack Start PWA)
├─ polls CDN URL every 1.5 seconds (TanStack Query)
└─ renders segments: tappable term chips, Quran cards, dua indicators
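The Output step in the diagram can be sketched in Python as follows. This is a minimal illustration, not the worker's actual code: the five segment types and the `masjid123_en.json` naming come from the diagram, while the `Segment` class, the `content` and `seq` fields, and the helper names are hypothetical. The JSON Payload Schema page is the authority on the real field names.

```python
import json
from dataclasses import asdict, dataclass

# Segment types named in the diagram above.
SEGMENT_TYPES = ("text", "term", "quran", "hadith", "dua")


@dataclass
class Segment:
    type: str     # one of SEGMENT_TYPES
    content: str  # hypothetical field name for the translated text

    def __post_init__(self) -> None:
        # Reject anything outside the five documented segment types.
        if self.type not in SEGMENT_TYPES:
            raise ValueError(f"unknown segment type: {self.type!r}")


def object_key(masjid_id: str, lang: str) -> str:
    """Key of the single file a worker overwrites, e.g. masjid123_en.json."""
    return f"{masjid_id}_{lang}.json"


def build_payload(seq: int, segments: list) -> str:
    """Serialize one snapshot; `seq` is an assumed counter, not a documented field."""
    body = {"seq": seq, "segments": [asdict(s) for s in segments]}
    return json.dumps(body, ensure_ascii=False)


snapshot = build_payload(
    seq=42,
    segments=[
        Segment("text", "Praise be to God, Lord of the worlds."),
        Segment("term", "taqwa"),
    ],
)
print(object_key("masjid123", "en"))
print(snapshot)
```

Validating the segment type at construction time keeps malformed LLM output from reaching the file that every listener reads.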
Key Invariants
- Listeners never touch LiveKit or WebRTC. They consume plain JSON over HTTP — works on any browser, any connection.
- The Worker is the single source of truth. It overwrites one file per language per masjid, once per second. No pub/sub, no sockets, no fan-out servers.
- Scale is Cloudflare's problem. With a 1s edge cache TTL, 100,000 listeners hitting the same URL cost about one R2 read per second per edge location, regardless of audience size. Compute stays constant.
- The Control Plane (FastAPI) never sees listener traffic. Database load is insulated from audience size.
- Structured JSON output. The LLM produces typed segments (text, term, quran, hadith, dua) in one call, so the frontend renders rich UI without additional logic.
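The scaling invariant can be checked with a little arithmetic. The sketch below uses the figures from this page (100,000 listeners, a 1.5 second poll interval, a 1 second edge TTL). The function names and the number of edge locations are assumptions, and the origin-read figure is an upper bound under a simplified model in which each edge location refetches the file at most once per TTL.

```python
def listener_requests_per_second(listeners: int, poll_interval_s: float) -> float:
    """Requests per second arriving at the CDN edge from polling listeners."""
    return listeners / poll_interval_s


def max_origin_reads_per_second(edge_locations: int, ttl_s: float) -> float:
    """Upper bound on origin reads: one refetch per edge location per TTL window."""
    return edge_locations / ttl_s


for audience in (100, 100_000):
    edge = listener_requests_per_second(audience, poll_interval_s=1.5)
    # edge_locations=1 is an illustrative assumption, not a measured value.
    origin = max_origin_reads_per_second(edge_locations=1, ttl_s=1.0)
    print(f"{audience} listeners: {edge:.0f} req/s at the edge, "
          f"<= {origin:.0f} origin read/s")
```

Edge traffic grows linearly with the audience (about 66,667 requests per second at 100,000 listeners), while the origin bound depends only on the TTL and the number of edge locations. That is the sense in which compute stays constant.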