
Ummah Speaks

AI Islamic companion — emotional intent, sub-2s latency

Next.js 15 · TypeScript · Groq API · Llama 3 · Tailwind CSS 4

the problem

Most AI chatbots respond to the literal text you send. If you type 'I'm struggling', they respond to the word 'struggling'. Emotional intent is different — the same words mean different things depending on context, exhaustion level, and what you're not saying. For an Islamic companion specifically, this distinction matters: a person venting needs a different response than a person seeking guidance.

the approach

A serverless Next.js app that classifies emotional intent before generating a response. The Groq API runs Llama 3 inference. In-memory rate limiting prevents weekend traffic from becoming a billing surprise. All conversation history is stored in localStorage — deliberately. No database, no user data, no GDPR problem.
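A sketch of the request pipeline (the file path and helper names are illustrative; the rate limiter and classifier are sketched under key decisions below):

```ts
// app/api/chat/route.ts (illustrative path)
// rateLimit() and classifyIntent() are sketched in the sections below;
// generateReply() stands in for the intent-aware Groq completion call.
declare function rateLimit(ip: string): boolean;
declare function classifyIntent(message: string): Promise<string>;
declare function generateReply(message: string, intent: string): Promise<string>;

export async function POST(req: Request) {
  const ip = req.headers.get("x-forwarded-for") ?? "unknown";

  // Gate cheaply first: reject over-limit IPs before spending any tokens.
  if (!rateLimit(ip)) {
    return Response.json({ error: "Too many requests" }, { status: 429 });
  }

  const { message } = await req.json();

  // Classify emotional intent, then generate a reply shaped by that intent.
  const intent = await classifyIntent(message);
  const reply = await generateReply(message, intent);

  return Response.json({ reply });
}
```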

key decisions

Emotional intent classification first

Before generating any response, the app classifies the user's message into one of several intent categories (venting, seeking guidance, expressing gratitude, etc.). The response prompt changes based on the classification. This is what makes a response feel like it actually understood you.
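A minimal sketch of that step, assuming the groq-sdk client and one of Groq's Llama 3 model ids. The intent labels and prompt text here are illustrative, not the app's actual prompts:

```ts
import Groq from "groq-sdk";

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });

type Intent = "venting" | "seeking_guidance" | "gratitude" | "general";

// Illustrative mapping: the classified intent picks the system prompt
// used for the actual reply.
const RESPONSE_PROMPTS: Record<Intent, string> = {
  venting: "Listen with empathy. Acknowledge the feeling before anything else.",
  seeking_guidance: "Offer gentle, grounded guidance rooted in Islamic teaching.",
  gratitude: "Share in the gratitude; keep the response warm and brief.",
  general: "Respond as a kind, thoughtful companion.",
};

export async function classifyIntent(message: string): Promise<Intent> {
  const completion = await groq.chat.completions.create({
    model: "llama3-8b-8192", // assumed model id; any Groq Llama 3 model works
    temperature: 0,          // classification should be deterministic
    messages: [
      {
        role: "system",
        content:
          "Classify the user's message as exactly one of: venting, " +
          "seeking_guidance, gratitude, general. Reply with the label only.",
      },
      { role: "user", content: message },
    ],
  });

  const label = completion.choices[0]?.message?.content?.trim() ?? "";
  const valid: Intent[] = ["venting", "seeking_guidance", "gratitude", "general"];
  return valid.includes(label as Intent) ? (label as Intent) : "general";
}
```

Falling back to "general" when the model returns anything unexpected keeps a misclassification from breaking the response path.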

In-memory rate limiting, no Redis

A proper rate limiter would use Redis. This app uses an in-memory Map keyed by IP, and the counts reset on server restart. For a free-tier serverless app without a persistent database, this is the right tradeoff — it handles real traffic without adding infrastructure costs.
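A sketch of the idea, with assumed window and cap values:

```ts
// Module-scoped Map: it lives as long as the server process, so counts
// survive across requests but vanish on restart. That's the accepted tradeoff.
const hits = new Map<string, { count: number; windowStart: number }>();

const WINDOW_MS = 60_000; // assumed: 1-minute fixed window
const MAX_HITS = 10;      // assumed: 10 requests per window per IP

export function rateLimit(ip: string): boolean {
  const now = Date.now();
  const entry = hits.get(ip);

  // New IP, or the window expired: start a fresh window.
  if (!entry || now - entry.windowStart > WINDOW_MS) {
    hits.set(ip, { count: 1, windowStart: now });
    return true;
  }

  if (entry.count >= MAX_HITS) return false; // over the cap: reject
  entry.count += 1;
  return true;
}
```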

localStorage for conversation history

Every message stays on the user's device. No backend storage. This is a privacy decision first — some conversations shouldn't be in a database — and a simplicity decision second. No auth, no user management, no compliance headache.
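A sketch of the client-side persistence as a React hook; the storage key and Message shape are illustrative:

```ts
"use client";
import { useEffect, useState } from "react";

type Message = { role: "user" | "assistant"; content: string };

const STORAGE_KEY = "ummah-speaks:history"; // illustrative key

export function useConversationHistory() {
  const [messages, setMessages] = useState<Message[]>([]);

  // Hydrate once on mount; localStorage only exists in the browser,
  // so this never runs during server rendering.
  useEffect(() => {
    const saved = localStorage.getItem(STORAGE_KEY);
    if (saved) setMessages(JSON.parse(saved) as Message[]);
  }, []);

  // Persist every change. Nothing is ever sent to a server.
  useEffect(() => {
    localStorage.setItem(STORAGE_KEY, JSON.stringify(messages));
  }, [messages]);

  return [messages, setMessages] as const;
}
```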

by the numbers

Response latency

< 2 seconds

Server-side storage

zero

External database

none

what i actually learned

Privacy-first isn't a compromise — it's a feature. Telling users their conversations never leave their device is a selling point, not a limitation. And in-memory rate limiting is underrated: it handles real production traffic on a free tier without adding any infrastructure.

sub-2s latency · emotional intent · zero server-side storage