
Hydra (AIDirector)
Intelligent AI API Gateway with Automatic Failover
Self-initiated core platform used across products. Technical architecture shared; source code is proprietary.
The Challenge
I needed one reliable AI gateway that could serve multiple products (Zeqah and ISEvolutions workloads) without each app managing provider instability independently.
The Approach
I designed a 3-step architecture: Client SDK → Vercel (token validation) → Cloudflare Worker (AI generation on the free tier where possible) → Vercel (cache result). Fallback chains automatically switch between Gemini and OpenRouter when one fails.
System Architecture
A 3-step request flow designed to minimize costs and maximize reliability: (1) Client SDK sends request to Vercel Edge Function for token validation. (2) Validated requests are forwarded to a Cloudflare Worker (free tier) for actual AI generation via Gemini, with automatic failover to OpenRouter if Gemini fails. The gateway explicitly routes heavy Document Intelligence, PDF, and image extraction (OCR) tasks to optimal vision models. (3) Results are cached in Redis via a Vercel function before being returned to the client. The SDK is published on npm for easy integration. HMAC request signing prevents API key theft between the Vercel and Cloudflare layers. PPP (Purchasing Power Parity) pricing is calculated server-side based on the user's country, making AI accessible in 50+ markets.
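The Gemini → OpenRouter fallback chain in step (2) can be sketched as below. This is a hypothetical illustration of the pattern, not the real Worker code — the provider call shapes and error handling are assumptions:

```typescript
// Hypothetical sketch of the provider fallback chain. Provider names come
// from the architecture above; the function shapes are illustrative.
type ProviderCall = (prompt: string) => Promise<string>;

export async function generateWithFailover(
  prompt: string,
  providers: Array<{ name: string; call: ProviderCall }>
): Promise<string> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      // The first provider that succeeds wins; later ones are never called.
      return await provider.call(prompt);
    } catch (err) {
      // Record the failure and fall through to the next provider.
      errors.push(`${provider.name}: ${(err as Error).message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

The ordering encodes priority: Gemini first (cheapest/primary), OpenRouter as the fallback head of the Hydra.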
System architecture overview
Built For
Developers and startups building AI-powered products who need a reliable, affordable AI API without managing provider relationships directly. Particularly developers in emerging markets where AI API costs can be prohibitive - PPP pricing makes the platform accessible to builders in 50+ countries.
Design Decisions
Why the 3-step architecture instead of a simple proxy?
Cloudflare Workers have a generous free tier and bill on CPU time rather than wall-clock time, so a Worker that spends seconds waiting on an upstream AI API costs almost nothing. By doing the heavy AI API calls there, we eliminated server costs for generation entirely. Vercel handles auth and caching (which it's better at), while the Worker handles the actual AI calls (long-running, I/O-bound requests that would be expensive to hold open on Vercel).
Why HMAC authentication?
API keys can be stolen from client-side code. HMAC request signing ensures that even if someone inspects network requests, they can't forge new ones - every request is cryptographically signed with a timestamp to prevent replay attacks.
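A minimal sketch of the client-side half of this scheme, assuming an HMAC-SHA256 over the request contents plus a timestamp — the payload layout and field names here are illustrative, not the SDK's real wire format:

```typescript
import { createHmac } from "node:crypto";

// Hypothetical client-side signer. The payload layout (method, path,
// timestamp, body) is an illustrative assumption; the real format is
// part of the proprietary SDK.
export function signRequest(
  secretKey: string,
  method: string,
  path: string,
  body: string,
  timestamp: number = Date.now()
): { timestamp: number; signature: string } {
  // Binding the signature to the request contents AND a timestamp means a
  // captured request cannot be replayed outside the tolerance window.
  const payload = `${method}\n${path}\n${timestamp}\n${body}`;
  const signature = createHmac("sha256", secretKey)
    .update(payload)
    .digest("hex");
  return { timestamp, signature };
}
```

The server recomputes the same HMAC and compares it in constant time, as in the verification excerpt further down the page.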
Why PPP (Purchasing Power Parity) pricing?
A developer in Nigeria shouldn't pay the same as one in San Francisco. PPP pricing adjusts costs based on the user's country, making the platform accessible globally while maintaining revenue targets.
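The server-side lookup reduces to a country-to-multiplier table. The multiplier values below are made-up examples to show the shape of the calculation — the real table and revenue model are proprietary:

```typescript
// Illustrative PPP lookup. Multiplier values are invented examples only.
const PPP_MULTIPLIERS: Record<string, number> = {
  US: 1.0,
  NG: 0.35, // e.g. Nigeria pays 35% of the US base price
  IN: 0.3,
  BR: 0.5,
};

export function localPrice(baseUsd: number, countryCode: string): number {
  // Unknown countries fall back to the full base price.
  const multiplier = PPP_MULTIPLIERS[countryCode] ?? 1.0;
  // Round to cents.
  return Math.round(baseUsd * multiplier * 100) / 100;
}
```

Because the country is resolved server-side (e.g. from the request's geolocation), clients can't spoof a cheaper region by editing client code.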
Code Preview
Curated excerpts from the actual production codebase — demonstrating architecture, patterns, and engineering quality.
💡 Cryptographic request signing system that prevents API key theft and replay attacks. The server never sees the full secret key — it verifies signatures using the stored hash. Timing-safe comparison prevents side-channel attacks.
export function verifyHmacSignature(
  secretKeyHash: string,
  signature: string,
  config: HmacSignatureConfig,
  toleranceMs: number = 5 * 60 * 1000 // 5 minutes default
): { valid: boolean; error?: string } {
  // Check timestamp freshness → prevents replay attacks
  const now = Date.now();
  const timeDiff = Math.abs(now - config.timestamp);

  if (timeDiff > toleranceMs) {
    return {
      valid: false,
      error: `Request timestamp expired. Diff: ${timeDiff}ms`,
    };
  }
  // … excerpt truncated: the signature is then recomputed and checked
  // with a timing-safe comparison against the stored key hash.

The Team
Founder & Architect (me)
System design, SDK, failover architecture, caching, and operations
Product Integrators
Integrated Hydra into Zeqah and ISEvolutions product workflows
Tech Stack
Outcomes & Impact
- ~50% cache hit rate reduces AI API calls (and costs) by half
- Automatic failover keeps the service available during single-provider outages
- Client SDK published on npm for easy integration
- PPP pricing enables developers in 50+ countries to use the platform affordably
- Robust multi-modal pipeline supporting high-throughput document intelligence and OCR parsing tasks
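The cache-aside pattern behind the ~50% hit rate can be sketched as follows. A Map stands in for Redis here, and the key scheme is an illustrative assumption, not the production one:

```typescript
import { createHash } from "node:crypto";

// Cache-aside sketch of the Redis caching layer described above.
// An in-memory Map stands in for Redis; the key scheme is illustrative.
const cache = new Map<string, string>();

function cacheKey(model: string, prompt: string): string {
  // Hash the prompt so arbitrarily long inputs yield fixed-size keys.
  return `hydra:${model}:${createHash("sha256").update(prompt).digest("hex")}`;
}

export async function cachedGenerate(
  model: string,
  prompt: string,
  generate: (p: string) => Promise<string>
): Promise<{ result: string; hit: boolean }> {
  const key = cacheKey(model, prompt);
  const cached = cache.get(key);
  if (cached !== undefined) return { result: cached, hit: true }; // hit: no provider call
  const result = await generate(prompt); // miss: call the (failover) provider chain
  cache.set(key, result);
  return { result, hit: false };
}
```

Every cache hit is a provider request that never happens, which is where the cost halving comes from.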
💬 Behind the Scenes
“The name "Hydra" isn't just cool - it's the architecture. Cut off one head (provider goes down), and another takes its place. We tested this by intentionally breaking providers during load tests. The failover was seamless.”