Ummah Speaks
AI Islamic companion — emotional intent, sub-2s latency
Next.js 15 · TypeScript · Groq API · Llama 3 · Tailwind CSS 4
the problem
Most AI chatbots respond to the literal text you send. If you type 'I'm struggling', they respond to the word 'struggling'. Emotional intent is different — the same words mean different things depending on context, exhaustion level, and what you're not saying. For an Islamic companion specifically, this distinction matters: a person venting needs a different response than a person seeking guidance.
the approach
A serverless Next.js app that classifies emotional intent before generating a response. The Groq API runs Llama 3 inference. In-memory rate limiting prevents weekend traffic from becoming a billing surprise. All conversation history is stored in localStorage — deliberately. No database, no user data, no GDPR problem.
key decisions
↳ Emotional intent classification first
Before generating any response, the app classifies the user's message into one of several intent categories (venting, seeking guidance, expressing gratitude, etc.). The response prompt changes based on the classification. This is what makes a reply feel like the app understood you, not just your words.
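A minimal sketch of the classify-then-respond routing. The intent labels, prompt text, and function names are illustrative assumptions, not the app's actual prompts; the point is that a cheap first classification call picks the system prompt for the real response.

```typescript
// Illustrative intent categories; the app's real taxonomy may differ.
type Intent = "venting" | "seeking_guidance" | "gratitude" | "general";

// Placeholder system prompts, one per intent (assumed wording).
const SYSTEM_PROMPTS: Record<Intent, string> = {
  venting: "Listen first. Validate the feeling before offering anything.",
  seeking_guidance: "Offer gentle, grounded guidance; avoid lecturing.",
  gratitude: "Reflect the gratitude back; keep it warm and brief.",
  general: "Respond helpfully and kindly.",
};

// Map a raw classifier label (from the first LLM call) onto a known
// intent, falling back to "general" on anything unexpected.
function resolveIntent(label: string): Intent {
  const normalized = label.trim().toLowerCase().replace(/\s+/g, "_");
  return (normalized in SYSTEM_PROMPTS ? normalized : "general") as Intent;
}

// The system prompt the second (response-generating) call would use.
function promptFor(label: string): string {
  return SYSTEM_PROMPTS[resolveIntent(label)];
}
```

The fallback to "general" matters: classifiers occasionally return labels outside the expected set, and an unknown label should degrade to a safe default rather than crash the request.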
↳ In-memory rate limiting, no Redis
A proper rate limiter would use Redis. This app uses an in-memory Map keyed by IP. Limits reset on restart, and on serverless each instance keeps its own map. For a free-tier serverless app without a persistent database, this is the right tradeoff: it handles real traffic without adding infrastructure costs.
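A sketch of the Map-based limiter, here as a fixed-window counter keyed by IP. The window size and request limit are illustrative assumptions, and `now` is injectable only to make the logic testable.

```typescript
// Assumed limits: 10 requests per IP per 60-second window.
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 10;

type Bucket = { count: number; windowStart: number };

// Lives only in this instance's memory: no Redis, no persistence.
const buckets = new Map<string, Bucket>();

// Returns true if the request may proceed, false if it should get a 429.
function isAllowed(ip: string, now: number = Date.now()): boolean {
  const bucket = buckets.get(ip);
  // No bucket yet, or the old window expired: start a fresh window.
  if (!bucket || now - bucket.windowStart >= WINDOW_MS) {
    buckets.set(ip, { count: 1, windowStart: now });
    return true;
  }
  if (bucket.count >= MAX_REQUESTS) return false;
  bucket.count += 1;
  return true;
}
```

In a Next.js route handler this would gate the Groq call: check `isAllowed(ip)` first and return a 429 response when it's false, so over-limit traffic never reaches the paid API.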
↳ localStorage for conversation history
Every message stays on the user's device. No backend storage. This is a privacy decision first — some conversations shouldn't be in a database — and a simplicity decision second. No auth, no user management, no compliance headache.
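A sketch of device-only history under that constraint. The key name and message shape are assumptions; `StorageLike` matches the subset of the Web Storage API used, so in the browser you'd pass `window.localStorage` directly.

```typescript
// Minimal slice of the Web Storage API this sketch depends on.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type Message = { role: "user" | "assistant"; content: string; at: number };

// Assumed storage key; the app's real key is unknown.
const HISTORY_KEY = "ummah-speaks:history";

function loadHistory(storage: StorageLike): Message[] {
  const raw = storage.getItem(HISTORY_KEY);
  if (!raw) return [];
  try {
    return JSON.parse(raw) as Message[];
  } catch {
    return []; // corrupted entry: start fresh rather than crash
  }
}

function appendMessage(storage: StorageLike, msg: Message): Message[] {
  const history = [...loadHistory(storage), msg];
  storage.setItem(HISTORY_KEY, JSON.stringify(history));
  return history;
}
```

Because the server never sees `HISTORY_KEY`, each request must carry whatever context the model needs; the client sends recent messages along with the new one, and nothing is retained after the response is streamed back.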
by the numbers
Response latency: < 2 seconds
Server-side storage: zero
External database: none
what i actually learned
Privacy-first isn't a compromise — it's a feature. Telling users their conversations never leave their device is a selling point, not a limitation. And in-memory rate limiting is underrated: it handles real production traffic on a free tier without adding any infrastructure.
sub-2s latency · emotional intent · zero server-side storage