Sandhya Honnappa

<architecture>

A front-end portfolio shows pixels. An architect's shows systems, decisions, and tradeoffs. Each study below is problem → approach → decisions → outcome.

JLL GPT

Key Architect & Lead Engineer

Enterprise multi-provider GenAI platform serving 45,000+ employees globally.

  • Azure OpenAI
  • AWS Bedrock
  • Google Gemini
  • Baidu Ernie
  • Azure Cosmos DB
  • AKS
  • LiteLLM
  • Qdrant
  • SignalR
  • APIM
  • Front Door

<problem />

JLL needed one centralised, governed AI surface for internal employees: free of lock-in to a single LLM vendor, and resilient to provider outages, cost spikes, and regional data constraints.

<approach />

  • Designed a multi-provider routing strategy (Strategy Pattern + ChatBackendServiceLocator) enabling seamless LLM hot-swapping across Azure OpenAI, AWS Bedrock, Gemini, and Baidu Ernie.
  • Architected the backend with Clean/Onion Architecture, CQRS, and JWT/Okta auth on Azure Cosmos DB; built real-time persistent chat via SignalR with conversation management, file uploads, and RAG integration.
  • Designed the Azure networking topology: Front Door → APIM → Container Apps → Private Endpoints, with rate-limiting and idempotency protections.
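
To make the routing idea concrete, here is a minimal Python sketch of a strategy interface plus a locator that resolves a backend by key and falls back in a configured order. Only the name ChatBackendServiceLocator comes from the platform; the ChatBackend protocol, the EchoBackend stand-in, the provider keys, and the fallback order are assumptions for illustration, and the sketch makes no claim about the production implementation or its language.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class ChatBackend(Protocol):
    """Strategy interface: every provider exposes the same call."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoBackend:
    """Hypothetical stand-in for a real provider client (no network calls)."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class ChatBackendServiceLocator:
    """Resolves a backend by key, falling back in the configured order."""

    def __init__(self, backends: dict[str, ChatBackend], order: list[str]) -> None:
        self._backends = backends
        self._order = order

    def resolve(self, preferred: Optional[str] = None) -> ChatBackend:
        # Try the caller's preference first, then the configured fallback order.
        keys = ([preferred] if preferred else []) + self._order
        for key in keys:
            backend = self._backends.get(key)
            if backend is not None:
                return backend
        raise LookupError("no chat backend configured")


# Hypothetical configuration: swapping providers means editing this mapping.
locator = ChatBackendServiceLocator(
    backends={
        "azure-openai": EchoBackend("azure-openai"),
        "bedrock": EchoBackend("bedrock"),
    },
    order=["azure-openai", "bedrock"],
)

# "gemini" is not configured here, so the locator falls back to the first entry.
reply = locator.resolve("gemini").complete("hello")
```

Callers depend only on the interface, so adding a provider means registering one more entry rather than adding another branch at every call site.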

<decisions />

  • Strategy Pattern over per-provider branching — so adding/swapping a provider is configuration, not a code change.
  • Private Endpoints + APIM policies to keep enterprise data inside the network boundary.
  • A dedicated AI gateway (LiteLLM) in front of every provider for caching, cost governance, and a uniform interface.

<outcome />

  • Single governed AI platform adopted by 45,000+ employees worldwide.
  • Provider failover and vendor flexibility at enterprise scale.
  • Real-time, persistent, RAG-enabled chat as the standard JLL AI experience.

The Cosmos DB Incident

Incident Lead

Cosmos DB grew 30 GB → 320 GB in 48 hours. Diagnosed, contained, and hardened.

  • Azure Cosmos DB
  • APIM
  • Azure Blob Storage
  • Idempotency

<problem />

In production, Cosmos DB storage exploded from 30 GB to 320 GB within 48 hours — a cascading duplicate-upload bug threatening sustained data growth and cost runaway.

<approach />

  • Traced the cascade to duplicate uploads re-entering the write path and amplifying.
  • Contained it with APIM policies and idempotency guards to stop duplicate writes at the edge.
  • Migrated large payloads out of Cosmos into Blob Storage, sizing each store to its job.
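
The idempotency guard can be sketched in a few lines of Python. This is an in-process illustration only: the real guards sat in APIM policies at the edge, and the key derivation used here (user id plus a SHA-256 of the payload) is an assumption, not the production scheme.

```python
import hashlib


class IdempotentUploadStore:
    """Accepts each idempotency key once; repeats return the first result."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}  # idempotency key -> document id
        self.writes = 0  # number of writes that actually reached storage

    @staticmethod
    def key_for(user_id: str, payload: bytes) -> str:
        # Hypothetical key: same user + same bytes means same upload.
        digest = hashlib.sha256(payload).hexdigest()
        return f"{user_id}:{digest}"

    def upload(self, user_id: str, payload: bytes) -> str:
        key = self.key_for(user_id, payload)
        if key in self._seen:
            # Duplicate: return the original id without writing again.
            return self._seen[key]
        self.writes += 1
        doc_id = f"doc-{self.writes}"
        self._seen[key] = doc_id
        return doc_id


store = IdempotentUploadStore()
first = store.upload("u1", b"quarterly-report")
retry = store.upload("u1", b"quarterly-report")  # e.g. a client retry
```

Because a retry resolves to the same document id, a duplicate request cannot re-enter the write path and amplify.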

<decisions />

  • Stop the bleeding at the gateway (APIM) first, then fix the data model — containment before cleanup.
  • Idempotency as a first-class protection, not an afterthought.
  • Right storage for the right data: documents in Cosmos, blobs in Blob Storage.
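
A rough sketch of the "right storage for the right data" split: small payloads stay inline in the document, large ones go to blob storage and the document keeps only a pointer. The two dictionaries stand in for Cosmos DB and Blob Storage, and the 64 KiB threshold and field names (inline, blobRef, size) are hypothetical.

```python
import hashlib

INLINE_LIMIT_BYTES = 64 * 1024  # hypothetical threshold, not from the incident

blob_store: dict[str, bytes] = {}      # stand-in for Blob Storage
document_store: dict[str, dict] = {}   # stand-in for the document database


def save_attachment(doc_id: str, payload: bytes) -> dict:
    """Store small payloads inline; offload large ones and keep a reference."""
    if len(payload) <= INLINE_LIMIT_BYTES:
        doc = {"id": doc_id, "inline": payload.decode("utf-8", errors="replace")}
    else:
        # Content-addressed blob name, so identical payloads share one blob.
        blob_name = hashlib.sha256(payload).hexdigest()
        blob_store[blob_name] = payload
        doc = {"id": doc_id, "blobRef": blob_name, "size": len(payload)}
    document_store[doc_id] = doc
    return doc


small = save_attachment("a", b"short note")
large = save_attachment("b", b"x" * (INLINE_LIMIT_BYTES + 1))
```

The document store then grows with the number of records rather than with the bytes uploaded.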

<outcome />

  • Prevented sustained data growth and cost runaway.
  • Idempotency and rate-limiting became standing protections across the platform.

LiteLLM AI Gateway on AKS

Architect & Owner

Production Python AI gateway with semantic caching and cost governance across two Azure regions.

  • LiteLLM
  • Python
  • AKS
  • Qdrant
  • Azure (2 regions)

<problem />

Multiple LLM providers and rising token spend needed a single control point for routing, caching, and cost governance.

<approach />

  • Deployed LiteLLM on AKS across two Azure regions for resilience.
  • Added exact-match and semantic caching via Qdrant to cut redundant LLM calls.
  • Centralised cost governance and a uniform provider interface behind the gateway.

<decisions />

  • Semantic caching (not just exact-match) to catch near-duplicate prompts.
  • Gateway-owned cost controls so governance lives in infrastructure, not scattered in app code.
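
The difference between the two cache tiers can be shown with a small, dependency-free Python sketch. It is not LiteLLM's or Qdrant's API: the bag-of-words embed function is a toy replacement for a real embedding model, the linear scan replaces a vector index, and the 0.8 similarity threshold is an assumed value.

```python
import math
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real setup would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TwoTierCache:
    """Exact-match lookup first, then nearest neighbour above a threshold."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # hypothetical similarity cut-off
        self._exact: dict[str, str] = {}
        self._vectors: list[tuple[Counter, str]] = []

    def put(self, prompt: str, response: str) -> None:
        self._exact[prompt] = response
        self._vectors.append((embed(prompt), response))

    def get(self, prompt: str) -> Optional[str]:
        if prompt in self._exact:  # tier 1: identical prompt
            return self._exact[prompt]
        query = embed(prompt)      # tier 2: near-duplicate prompt
        best_score, best_response = 0.0, None
        for vector, response in self._vectors:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = TwoTierCache()
cache.put("what is the capital of france", "Paris")
near_hit = cache.get("what is the capital of france please")
miss = cache.get("how do I reset my password")
```

An exact-match cache alone would miss the second prompt; the similarity tier catches it, while unrelated prompts still fall through to the provider.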

<outcome />

  • Reduced LLM cost and latency by answering repeated and near-duplicate prompts from cache rather than a provider call.
  • One operational surface for every provider integration.

MCP Gateway & Agentic AI Marketplace

Architect

Node/Express MCP Gateway mapping tool/resource/prompt primitives onto JLL GPT's strategy pattern.

  • MCP
  • Node.js
  • Express
  • JSON-RPC

<problem />

Agentic tool-use needed a standard, extensible way to expose JLL GPT capabilities to AI agents.

<approach />

  • Built an MCP Gateway on Node/Express mapping MCP tool/resource/prompt primitives to JLL GPT's multi-provider strategy pattern.
  • Authored ADRs, Claude Code setup guides, and CLAUDE.md slash-command patterns for the team.
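
The core of such a gateway is a JSON-RPC dispatcher for MCP methods. The sketch below handles the protocol's tools/list and tools/call methods; it is written in Python to match the other examples here even though the gateway itself runs on Node/Express, and the "chat" tool and its echo behaviour are hypothetical placeholders for a call into the platform's provider routing.

```python
import json
from typing import Callable, Optional

# Hypothetical tool registry: tool name -> handler taking the call arguments.
TOOLS: dict[str, Callable[[dict], str]] = {
    "chat": lambda args: f"echo: {args.get('prompt', '')}",
}


def _error(request_id: Optional[int], code: int, message: str) -> str:
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    )


def handle(raw: str) -> str:
    """Dispatch one JSON-RPC request to the matching MCP method."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        result = {
            "tools": [{"name": name, "inputSchema": {"type": "object"}} for name in TOOLS]
        }
    elif method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(request_id, -32602, "Unknown tool")
        text = tool(params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # Standard JSON-RPC code for an unsupported method.
        return _error(request_id, -32601, "Method not found")

    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


response = json.loads(
    handle(
        json.dumps(
            {
                "jsonrpc": "2.0",
                "id": 1,
                "method": "tools/call",
                "params": {"name": "chat", "arguments": {"prompt": "hi"}},
            }
        )
    )
)
```

Adding a new agentic capability then amounts to registering another entry in the tool registry.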

<decisions />

  • Adopt MCP as the integration contract so new agentic tools plug in without bespoke wiring.
  • Document architecture decisions (ADRs) so the platform stays legible as it grows.

<outcome />

  • Extensible agentic integrations on a standard protocol.
  • Reusable patterns and docs that let other engineers extend the platform.

Nudge — AI Habit Tracker

Solo Builder (Personal Project)

Full-stack Python AI app demonstrating FastAPI + pgvector + LiteLLM + MCP patterns.

  • FastAPI
  • PostgreSQL
  • pgvector
  • LiteLLM
  • MCP
  • Railway

<problem />

I wanted a real, working personal tool that doubles as an end-to-end demonstration of production Python AI backend patterns.

<approach />

  • Multi-repo architecture: nudge-mcp-server, nudge-agent, nudge-pwa.
  • JWT auth, RAG-ready vector storage in pgvector, and an MCP server for agentic habit logging.
  • Hosted on Railway with managed Postgres.

<decisions />

  • Use the exact stack I'd recommend at work (FastAPI, pgvector, LiteLLM, MCP) so the project is also a reference implementation.
  • Multi-repo split to keep the agent, server, and PWA independently deployable.

<outcome />

  • A functioning personal habit tracker.
  • A portfolio demonstration of Python AI backend patterns.

</architecture>

<Contact/>