This page is the public companion to the system: curated knowledge, hybrid retrieval on Supabase, structured answers with citations, and guardrails you would expect in a serious AI product — not a demo prompt taped to a chat widget.
The Q&A Assistant is a conversational layer on top of an explicitly governed knowledge base. Sources are curated in the repo (markdown and structured data), modeled as records and chunks, synced to Supabase, and retrieved with a hybrid pipeline (vectors + full-text + aliases + intent-aware scoring). Responses are generated with Anthropic Claude, with structured output, citations, and deterministic fallbacks when parsing or the API fails.
Each user message travels through a single server pipeline (runGroundedChat) with timing breakdowns for profiling (session, intent, history, retrieval, LLM, normalize, logging); a simplified sketch follows the steps below.
Session & validation
Ensures a stable session id (client-supplied or generated), loads recent history for follow-up questions, and validates input size with Zod.
Intent, language, entity hint
Rule-based intent classification, language detection, and optional entity resolution (canonical keys from titles, slugs, aliases).
Safety gates
Requests pointing at excluded internal files get a refusal and a blocked-event log; handoff rules decide when to suggest contacting you.
Hybrid retrieval
Expanded query → embedding → Supabase RPC qa_assistant_hybrid_candidates → composite scoring → dedupe by canonical key → chunk budget per response mode.
LLM + structure
Claude receives grounding prompts and evidence; output is normalized to a structured payload (lead, bullets, sections, citations).
Persistence
Messages and query logs stored; optional feedback and lead handoff endpoints rate-limited separately.
Response to client
JSON includes rendered content, citations, structured UI hints, and timing metadata for debugging.
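To make the flow concrete, here is a minimal sketch of the stage-timing loop; aside from the stage names, every function and type below is illustrative, not the actual runGroundedChat implementation.

```ts
// Hypothetical sketch of per-stage timing in a runGroundedChat-style pipeline.
type Timings = Record<string, number>;

async function timed<T>(timings: Timings, stage: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    timings[stage] = Math.round(performance.now() - start);
  }
}

// Stubbed stages so the sketch runs standalone; the real pipeline resolves
// sessions, classifies intent, runs hybrid retrieval, and calls Claude.
const resolveSession = async (id?: string) => ({ id: id ?? crypto.randomUUID() });
const classifyIntent = async (q: string) => (q.includes("project") ? "project" : "general");
const retrieveChunks = async (_q: string, _intent: string) => [{ summary: "stub summary" }];
const generateAnswer = async (_q: string, chunks: { summary: string }[]) => ({
  lead: chunks[0]?.summary ?? "",
  bullets: [] as string[],
  citations: [] as string[],
});

export async function runGroundedChatSketch(message: string, sessionId?: string) {
  const timings: Timings = {};
  const session = await timed(timings, "session", () => resolveSession(sessionId));
  const intent = await timed(timings, "intent", () => classifyIntent(message));
  const chunks = await timed(timings, "retrieval", () => retrieveChunks(message, intent));
  const answer = await timed(timings, "llm", () => generateAnswer(message, chunks));
  return { sessionId: session.id, answer, timings }; // timings feed the response metadata
}
```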
The corpus is built in curated-records.ts: approved paths (e.g. about-me/*.md, project slices) and an explicit exclusion list for private or internal interview analyses. Each fact becomes a record with entity_type, canonical_key, visibility, assistant_summary, evidence text, tags, and query_aliases for retrieval.
Chunks split long material for embedding and full-text search; each chunk carries enriched embed_text (contextualized text rather than raw dumps), metadata for narrative ordering and recruiter evidence, and answer_roles so retrieval can prefer chunks suited to the current intent.
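As a rough illustration of these shapes, the record and chunk fields named above could be typed like this; anything beyond the fields mentioned in the text is an assumption.

```ts
// Illustrative record/chunk shapes. Field names from the text are real
// concepts; exact types and the inline comments are assumptions.
type Visibility = "public" | "internal";

interface CuratedRecord {
  entity_type: string;        // e.g. "project", "experience"
  canonical_key: string;      // stable key, used later for deduplication
  visibility: Visibility;     // excluded/internal material never ships
  assistant_summary: string;  // short summary the assistant can quote
  evidence: string;           // longer grounding text
  tags: string[];
  query_aliases: string[];    // alternate phrasings that should still match
}

interface Chunk {
  record_key: string;         // canonical_key of the parent record
  embed_text: string;         // enriched text used for embeddings and FTS
  order: number;              // narrative ordering within the record
  answer_roles: string[];     // which intents this chunk answers well
}
```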
Query and chunk vectors use the OpenAI Embeddings API (configurable model, default text-embedding-3-small, 1536 dimensions). Embeddings are version-tagged in storage; an in-memory cache reduces duplicate calls. If the API key is missing, vector similarity degrades gracefully while lexical and alias signals remain.
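A minimal sketch of that behavior, assuming the official openai package and an in-process Map as the cache:

```ts
import OpenAI from "openai";

// Cached embedding helper that degrades gracefully: with no API key it
// returns null and retrieval falls back to lexical and alias signals.
const cache = new Map<string, number[]>();
const model = process.env.QA_ASSISTANT_EMBEDDING_MODEL ?? "text-embedding-3-small";

export async function embed(text: string): Promise<number[] | null> {
  if (!process.env.OPENAI_API_KEY) return null; // skip vector similarity
  const cached = cache.get(text);
  if (cached) return cached;
  const client = new OpenAI(); // reads OPENAI_API_KEY from the environment
  const res = await client.embeddings.create({ model, input: text });
  const vector = res.data[0].embedding; // 1536 dims for text-embedding-3-small
  cache.set(text, vector);
  return vector;
}
```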
Tables include qa_assistant_records, qa_assistant_chunks, qa_assistant_sessions, qa_assistant_messages, qa_assistant_query_logs, qa_assistant_feedback, qa_assistant_handoffs, qa_assistant_blocked_events, qa_assistant_eval_runs, and admin override state. The RPC qa_assistant_hybrid_candidates returns vector, FTS, alias, and title-match scores for candidate chunks filtered by visibility and entity type.
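Calling the RPC from the server might look like the sketch below; the RPC name comes from the schema above, while its parameter names, the env-var spelling for the service-role key, and the returned columns are assumptions.

```ts
import { createClient } from "@supabase/supabase-js";

// Server-side only: the service-role key must never reach the browser.
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY! // assumed variable name
);

export async function hybridCandidates(queryText: string, queryEmbedding: number[] | null) {
  const { data, error } = await supabase.rpc("qa_assistant_hybrid_candidates", {
    query_text: queryText,           // hypothetical parameter names
    query_embedding: queryEmbedding,
    visibility: "public",
  });
  if (error) throw error;
  return data; // per-candidate vector, FTS, alias, and title-match scores
}
```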
Retrieval combines: (1) Supabase hybrid candidates when available, (2) local cosine similarity against stored chunk embeddings, (3) composite scoreChunk weighting titles, summaries, content, tags, intent alignment, leadership / AI-tool boosts, and penalties for common confusions (e.g. spoken languages vs programming languages). Results are deduplicated by canonical entity, optionally forced to include an entity anchor, then trimmed via applyChunkBudgetAndOrdering for modes like narrative, recruiter fit, or project focus.
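In spirit, the composite scoring and dedupe step reduces to something like this; the weights, boost factor, and signal list are illustrative, not the real scoreChunk values.

```ts
// Illustrative composite scoring + canonical-key dedupe.
interface Candidate {
  canonical_key: string;
  vectorScore: number;  // from embeddings (0 when vectors are unavailable)
  ftsScore: number;     // full-text search rank
  aliasScore: number;   // query_aliases match
  titleScore: number;   // title match
  intentMatch: boolean; // answer_roles aligned with the classified intent
}

function scoreChunk(c: Candidate): number {
  let score =
    0.45 * c.vectorScore + 0.25 * c.ftsScore + 0.15 * c.aliasScore + 0.15 * c.titleScore;
  if (c.intentMatch) score *= 1.2; // boost chunks suited to the current intent
  return score;
}

function dedupeByCanonicalKey(candidates: Candidate[]): Candidate[] {
  const best = new Map<string, Candidate>();
  for (const c of candidates) {
    const prev = best.get(c.canonical_key);
    if (!prev || scoreChunk(c) > scoreChunk(prev)) best.set(c.canonical_key, c);
  }
  return [...best.values()].sort((a, b) => scoreChunk(b) - scoreChunk(a));
}
```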
Anthropic models are selected via environment variables (defaults: a Sonnet-class main model, with a Haiku-class model on fast paths where used). System prompts enforce grounding to the provided evidence. The client receives structured payloads for rich rendering (bullets, sections, project/fit cards). If the provider call or JSON parsing fails, a deterministic fallback builds an answer from chunk summaries so the UI never silently hallucinates a full narrative.
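The parse-or-fallback behavior, sketched with hypothetical payload fields:

```ts
// If structured output cannot be parsed, build a deterministic answer from
// chunk summaries rather than rendering nothing (or an ungrounded narrative).
interface StructuredAnswer {
  lead: string;
  bullets: string[];
  citations: string[];
}

export function normalizeOrFallback(
  raw: string,
  chunks: { canonical_key: string; summary: string }[]
): StructuredAnswer {
  try {
    const parsed = JSON.parse(raw) as Partial<StructuredAnswer>;
    if (typeof parsed.lead === "string") {
      return { lead: parsed.lead, bullets: parsed.bullets ?? [], citations: parsed.citations ?? [] };
    }
  } catch {
    // fall through to the deterministic path
  }
  return {
    lead: "Here is what the knowledge base says:",
    bullets: chunks.map((c) => c.summary),
    citations: chunks.map((c) => c.canonical_key),
  };
}
```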
POST /api/qa-assistant/chat is rate limited (per IP + session). Handoff submissions use stricter limits, honeypot fields, and Cloudflare Turnstile when configured. Admin routes are protected via Supabase auth and an allowlist of admin emails. Service-role keys never reach the browser.
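A fixed-window limiter keyed by IP + session is the simplest version of this idea; the window, limit, and in-memory store below are illustrative (a multi-instance deployment would need shared storage).

```ts
// Minimal fixed-window rate limiter keyed by IP + session.
const WINDOW_MS = 60_000;   // illustrative window
const MAX_REQUESTS = 20;    // illustrative per-window limit
const hits = new Map<string, { count: number; windowStart: number }>();

export function allowRequest(ip: string, sessionId: string): boolean {
  const key = `${ip}:${sessionId}`;
  const now = Date.now();
  const entry = hits.get(key);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    hits.set(key, { count: 1, windowStart: now });
    return true;
  }
  entry.count += 1;
  return entry.count <= MAX_REQUESTS;
}
```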
Full-page chat at /qa-assistant and a floating widget share the same API, differentiated by an entrypoint flag for analytics. Citations, follow-ups, structured layouts, feedback thumbs, and a lead handoff flow are integrated in the UI layer.
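Both surfaces can call the endpoint the same way, differing only in the flag; the body field names here are assumptions.

```ts
// Shared client call for the full page and the floating widget.
export async function sendMessage(message: string, entrypoint: "page" | "widget") {
  const res = await fetch("/api/qa-assistant/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message, entrypoint }), // entrypoint drives analytics
  });
  if (!res.ok) throw new Error(`chat request failed: ${res.status}`);
  return res.json(); // rendered content, citations, UI hints, timings
}
```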
Seed scenarios live in docs; scripts under scripts/ run factual and product audits, with optional persistence to qa_assistant_eval_runs. This supports regression checks when the corpus or prompts change.
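A scenario check can be as small as the sketch below; the shape and assertion style are assumptions, not the actual audit scripts.

```ts
// Tiny factual-audit sketch: assert a grounded answer mentions key facts.
interface EvalScenario {
  question: string;
  mustMention: string[]; // substrings expected in the answer
}

export function runScenario(scenario: EvalScenario, answerText: string) {
  const missing = scenario.mustMention.filter(
    (needle) => !answerText.toLowerCase().includes(needle.toLowerCase())
  );
  return { passed: missing.length === 0, missing };
}
```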
Typical production variables (names only — set values in your host):
| Variable | Purpose |
|---|---|
| ANTHROPIC_API_KEY | Claude API access |
| OPENAI_API_KEY | Embeddings API |
| NEXT_PUBLIC_SUPABASE_URL / ANON / SERVICE_ROLE keys | Database + server sync |
| QA_ASSISTANT_ADMIN_EMAILS | Admin UI allowlist |
| QA_ASSISTANT_TURNSTILE_SECRET_KEY | Handoff captcha verification |
| QA_ASSISTANT_EMBEDDING_MODEL / ANTHROPIC_MODEL | Optional model overrides |
| NEXT_PUBLIC_APP_URL | Absolute URLs for emails and callbacks |
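Since Zod is already in the stack for input validation, the same approach can verify these at boot; the optionality choices and exact key names below are assumptions.

```ts
import { z } from "zod";

// Fail fast on misconfiguration instead of at the first request.
const envSchema = z.object({
  ANTHROPIC_API_KEY: z.string().min(1),
  OPENAI_API_KEY: z.string().min(1).optional(),     // vectors degrade without it
  NEXT_PUBLIC_SUPABASE_URL: z.string().url(),
  QA_ASSISTANT_ADMIN_EMAILS: z.string().optional(), // comma-separated allowlist
  QA_ASSISTANT_EMBEDDING_MODEL: z.string().optional(),
  NEXT_PUBLIC_APP_URL: z.string().url(),
});

export const env = envSchema.parse(process.env);
```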
The best proof is interactive: ask about background, projects, stack, or availability — and inspect how citations line up with the architecture above.
Go to Q&A Assistant