
EmailDirector
Transactional Email Routing for Zeqah
Built for Zeqah's internal communication workflows. Implementation details shared at a high level.
The Challenge
Zeqah needed a reliable email infrastructure for onboarding, notifications, and study workflow alerts without depending on one provider. Delivery failures directly affected student experience and admin operations.

The Approach
I built a Next.js application with a robust queue-based architecture. Campaigns are broken into individual send jobs processed by BullMQ workers, with automatic retry logic and dead letter queues for permanent failures. The system integrates with multiple email providers (Resend, ZeptoMail) for failover redundancy.
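To make the "campaign broken into individual send jobs" step concrete, here is a minimal sketch of the split. The `Campaign` and `SendJob` shapes, the `buildSendJobs` name, the job name `send-email`, and the retry defaults are assumptions for illustration and are not taken from the production codebase.

```typescript
// Hypothetical shapes: the real campaign model and job payload are not shown in the source.
interface Campaign {
  id: string;
  subject: string;
  html: string;
  recipients: string[];
}

interface SendJob {
  name: string;
  data: { campaignId: string; to: string; subject: string; html: string };
  opts: {
    jobId: string;
    attempts: number;
    backoff: { type: "exponential"; delay: number };
  };
}

// Turn one campaign into one job per unique recipient.
function buildSendJobs(campaign: Campaign, attempts = 5, baseDelayMs = 1000): SendJob[] {
  const seen = new Set<string>();
  const jobs: SendJob[] = [];
  for (const raw of campaign.recipients) {
    const to = raw.trim();
    const key = to.toLowerCase();
    // Skip blanks and case-insensitive duplicates so no address gets two jobs.
    if (key === "" || seen.has(key)) continue;
    seen.add(key);
    jobs.push({
      name: "send-email",
      data: { campaignId: campaign.id, to, subject: campaign.subject, html: campaign.html },
      opts: {
        // A deterministic id makes re-enqueueing the same campaign idempotent.
        jobId: `${campaign.id}__${key}`,
        attempts,
        backoff: { type: "exponential", delay: baseDelayMs },
      },
    });
  }
  return jobs;
}
```

Each element uses the `{ name, data, opts }` layout that BullMQ's `Queue.addBulk` accepts, so the result could be handed to a queue in one call. Whether the real scheduler de-duplicates or uses deterministic job ids is not stated in the source.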
System Architecture
The system follows a pipeline architecture: the Campaign Scheduler splits a campaign into individual BullMQ send jobs. Workers process each job with the primary email provider (Resend) and automatically fail over to the secondary (ZeptoMail) if the primary fails. Permanent failures (invalid addresses, hard bounces) land in a Dead Letter Queue with a dedicated UI for diagnosis and retry. Delivery webhooks from the providers update campaign analytics in real time. Redis holds the job queue state and enforces delivery rate limits to avoid provider throttling.
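The routing between "retry" and "Dead Letter Queue" described above can be sketched as a small pure function. The error codes, the `SendFailure` shape, and the function names are hypothetical; real provider responses would need to be mapped onto something like this first.

```typescript
type FailureKind = "permanent" | "transient";

// Hypothetical normalized failure; actual provider error payloads differ.
interface SendFailure {
  code: "invalid_address" | "hard_bounce" | "rate_limited" | "timeout" | "provider_error";
}

type NextStep =
  | { action: "retry" }
  | { action: "dead-letter"; reason: string };

// Invalid addresses and hard bounces will never succeed, so they are permanent.
function classify(failure: SendFailure): FailureKind {
  return failure.code === "invalid_address" || failure.code === "hard_bounce"
    ? "permanent"
    : "transient";
}

// Decide what happens to a failed job after `attemptsMade` tries.
function nextStep(failure: SendFailure, attemptsMade: number, maxAttempts: number): NextStep {
  if (classify(failure) === "permanent") {
    return { action: "dead-letter", reason: failure.code };
  }
  if (attemptsMade >= maxAttempts) {
    return { action: "dead-letter", reason: `retries exhausted (${failure.code})` };
  }
  return { action: "retry" };
}
```

In a BullMQ processor, one way to get the "skip remaining retries" behavior is to throw the library's `UnrecoverableError` for permanent failures. Whether the production worker does this is not stated in the source.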
System architecture overview
Built For
Zeqah operations and product teams that send onboarding emails, study reminders, and account notifications. Designed for non-technical operators while still exposing deep delivery diagnostics for technical admin workflows.
Design Decisions
Why build in-house instead of using Mailchimp/SendGrid?
Cost at scale. At tens of thousands of emails per campaign, SaaS platforms become prohibitively expensive. Building in-house with direct SMTP/API integrations cut costs by ~80% while giving full control over delivery timing and retry logic.
Why BullMQ over a simple loop?
A naive loop sending 50k emails would run inside a single process with no retry logic and no persisted state, so the first crash or network hiccup would lose track of which emails had already gone out. BullMQ gives us rate limiting, automatic retries with exponential backoff, dead letter queues for forensic analysis, and concurrent workers for throughput.
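To show what "automatic retries with exponential backoff" means in practice, the sketch below computes the wait before each retry by doubling a base delay. The function names and the 1-second base delay are illustrative assumptions, and the exact formula used by a given BullMQ version may differ in detail.

```typescript
// Delay before the next retry, given how many attempts have already been made.
// Doubles each time: base, 2x base, 4x base, ...
function exponentialDelayMs(attemptsMade: number, baseDelayMs: number): number {
  return Math.round(Math.pow(2, attemptsMade - 1) * baseDelayMs);
}

// Full list of waits for a job allowed `maxAttempts` total tries.
// There is one wait between each pair of consecutive tries.
function retrySchedule(maxAttempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let made = 1; made < maxAttempts; made++) {
    delays.push(exponentialDelayMs(made, baseDelayMs));
  }
  return delays;
}

// retrySchedule(5, 1000) yields [1000, 2000, 4000, 8000]
```

With five attempts and a 1-second base, a job that keeps failing spends about 15 seconds in retries before it is given up on, which is long enough to ride out a brief network blip without stalling the rest of the campaign.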
Why dual email providers?
Email deliverability is fragile. If one provider gets rate-limited or goes down, the system automatically fails over to the backup provider. This redundancy ensures campaigns complete even during provider outages.
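The failover behavior can be sketched as trying providers in priority order and returning the first success. The `EmailProvider` interface and `sendWithFailover` function are hypothetical stand-ins; the real Resend and ZeptoMail client calls are not shown in the source and are deliberately not reproduced here.

```typescript
interface SendRequest {
  to: string;
  subject: string;
  html: string;
}

interface SendResult {
  provider: string;
  messageId: string;
}

// Hypothetical adapter interface: each real provider SDK would be wrapped to fit it.
interface EmailProvider {
  name: string;
  send(req: SendRequest): Promise<SendResult>;
}

// Try each provider in order; return the first successful result.
async function sendWithFailover(
  req: SendRequest,
  providers: EmailProvider[],
): Promise<SendResult> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      return await provider.send(req);
    } catch (err) {
      // Record the failure and fall through to the next provider.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }
  // Every provider failed: surface all reasons so the queue can retry or dead-letter.
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

A production version would likely fail over only on transient errors such as outages or rate limits, since an invalid address will be rejected by the backup provider as well.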
Code Preview
Curated excerpts from the production codebase, showing architecture, patterns, and engineering quality.
💡 The core email processing worker. Each campaign is broken into individual send jobs, processed sequentially with rate limiting. Failed jobs are automatically retried via BullMQ, and real-time progress is broadcast to the UI via Ably.
async function processEmailJob(job: Job<EmailJobData>) {
  const { jobId, to, subject, html, campaignId } = job.data;

  try {
    // 1. Mark as processing
    await prisma.emailJob.update({
      where: { id: jobId },
      data: { status: 'PROCESSING' },
    });

    // 2. Send the email via primary provider
    const result = await sendEmail({ to, subject, html });

    // 3. Update DB to SENT
    await prisma.emailJob.update({
The Team
CTO (me)
Architecture, backend implementation, deployment, and reliability engineering
Product/Content Team
Messaging requirements, campaign logic, and operational validation
Tech Stack
Outcomes & Impact
- Processes 50k+ emails per campaign with 99.2% delivery rate
- Dead letter queue system catches and enables manual review of all failed sends
- Timezone-aware scheduling for reminder and notification workflows
- Comprehensive diagnostics: opens, bounces, complaints, and provider-level failures
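The timezone-aware scheduling listed above could work along these lines: convert the candidate send time into the recipient's local hour and check it against an allowed window. The function names and the 09:00 to 18:00 window are assumptions for illustration, not details from the production system.

```typescript
// Hour of day (0-23) that `instant` corresponds to in the given IANA time zone.
function localHour(instant: Date, timeZone: string): number {
  const text = instant.toLocaleString("en-US", { timeZone, hour: "numeric", hour12: false });
  // Some engines render midnight as "24" in 24-hour mode, so normalize it to 0.
  return parseInt(text, 10) % 24;
}

// True when the instant falls inside the recipient's local send window.
// The window is half-open: startHour is allowed, endHour is not.
function isWithinSendWindow(
  instant: Date,
  timeZone: string,
  startHour = 9,
  endHour = 18,
): boolean {
  const hour = localHour(instant, timeZone);
  return hour >= startHour && hour < endHour;
}

// 14:00 UTC on 15 January is 09:00 in New York and 23:00 in Tokyo.
const candidate = new Date("2024-01-15T14:00:00Z");
const okForNewYork = isWithinSendWindow(candidate, "America/New_York"); // true
const okForTokyo = isWithinSendWindow(candidate, "Asia/Tokyo"); // false
```

A job that falls outside the window could be enqueued with a delay until the next local window opens rather than being dropped.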
💬 Behind the Scenes
“The DLQ (Dead Letter Queue) UI was one of my favorite features to build - it turns mysterious email failures into a clear, actionable interface. Nothing like turning "why didn't my email arrive?" into a click-to-diagnose experience.”