Featured project

AndresAI — Portfolio Chatbot

A retrieval-augmented, model-agnostic agent that answers questions about my career, stack, and projects

Screenshots

AndresAI chat interface with the assistant side drawer: The cyberpunk-styled chat UI with suggested prompt chips for Stack, Career, Open Source, and Learning, and the 'About this assistant' side drawer explaining the RAG pipeline and the live tech stack.
AndresAI admin Ops Console dashboard: The /admin Ops Console, with live KPI cards for users, messages, KB entries and average latency, a 24h live counter strip, P50/P95/P99 response-time bars, and an all-time messages-per-day throughput chart.
AndresAI admin conversations view with a user detail modal: The Conversations view with a user-detail modal open over it, showing message counts, average latency, IP and platform, plus the full conversation history streamed from the database.

Technologies used

Frontend

  • Next.js 16
  • React 19
  • TypeScript
  • Tailwind CSS v4
  • shadcn/ui
  • Vercel

Backend & Infrastructure

  • Model-agnostic LLMs
  • Pydantic AI
  • FastAPI
  • Python
  • PostgreSQL
  • pgvector
  • SQLModel
  • OpenAI Embeddings
  • RAG
  • Redis
  • Caddy
  • Docker
  • Logfire
  • Sentry

Project overview

AndresAI is a portfolio chatbot that answers questions about my career, stack, hobbies, and projects using a retrieval-augmented, model-agnostic agent. It's split into two decoupled services: a Next.js 16 frontend on Vercel that serves both the public chat and a private admin dashboard, and a FastAPI backend that runs the AI agent, owns the database, and exposes the streaming chat endpoint.

The agent is built on Pydantic AI and is model-agnostic: it can run on Claude, OpenAI, or any other provider the framework supports. It exposes a single retrieval tool over a pgvector knowledge base embedded with OpenAI's text-embedding-3-small. Both the system prompt and the retrievable facts live in PostgreSQL and can be edited from the admin dashboard without a redeploy, so adding a new project or hobby makes it answerable in chat right away.
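
To make the retrieval step concrete, here is a minimal sketch of a category-filtered cosine-distance search. It is an in-memory stand-in, not the production code: the real lookup runs inside PostgreSQL via pgvector, and the names KnowledgeEntry and search_entries, along with the 3-dimensional toy vectors, are assumptions made for illustration (the real embeddings are 1536-dimensional).

```python
import math
from dataclasses import dataclass


@dataclass
class KnowledgeEntry:
    """Hypothetical stand-in for one knowledge-base row."""
    category: str
    text: str
    embedding: list[float]


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def search_entries(
    entries: list[KnowledgeEntry],
    category: str,
    query_embedding: list[float],
    limit: int = 3,
) -> list[KnowledgeEntry]:
    """Keep only entries in the category, then rank by distance to the query."""
    candidates = [e for e in entries if e.category == category]
    candidates.sort(key=lambda e: cosine_distance(e.embedding, query_embedding))
    return candidates[:limit]


# Toy data: in the real system the query text would be embedded first.
entries = [
    KnowledgeEntry("stack", "Uses FastAPI", [1.0, 0.0, 0.0]),
    KnowledgeEntry("stack", "Uses Next.js", [0.0, 1.0, 0.0]),
    KnowledgeEntry("hobbies", "Plays chess", [1.0, 0.0, 0.0]),
]
hits = search_entries(entries, "stack", [0.9, 0.1, 0.0], limit=1)
```

Filtering by category before ranking keeps an entry from another category out of the results even when its vector is just as close to the query, as with the "hobbies" entry above.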

The admin dashboard adds a WebSocket pub/sub channel that pushes live counters the moment a new message lands, alongside realtime KPIs, latency percentiles, and full CRUD over every entity the agent reads from.
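
The pub/sub idea behind the live counters can be sketched in a few lines. This is an in-process illustration under stated assumptions, not the deployed code: CounterHub and its methods are hypothetical names, and each asyncio.Queue stands in for one connected dashboard WebSocket.

```python
import asyncio


class CounterHub:
    """Hypothetical in-process pub/sub hub; each subscriber gets its own queue."""

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()
        self.message_count = 0

    def subscribe(self) -> asyncio.Queue:
        """Register a new listener (one per open dashboard connection)."""
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        """Drop a listener, for example when its connection closes."""
        self._subscribers.discard(queue)

    def record_message(self) -> None:
        """Bump the counter and fan the new value out to every subscriber."""
        self.message_count += 1
        event = {"type": "messages", "count": self.message_count}
        for queue in self._subscribers:
            queue.put_nowait(event)


async def demo() -> tuple[list[dict], int]:
    hub = CounterHub()
    dashboard_a = hub.subscribe()
    dashboard_b = hub.subscribe()
    hub.record_message()
    hub.record_message()
    hub.unsubscribe(dashboard_b)  # dashboard B disconnects
    hub.record_message()
    received_a = [await dashboard_a.get() for _ in range(3)]
    return received_a, dashboard_b.qsize()
```

In a real WebSocket handler, each connection would subscribe on open, forward queued events to the socket in a loop, and unsubscribe on close.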

Key features

  • Token-streaming chat: The server echoes the user's prompt first so the UI updates within tens of milliseconds, then streams the assistant's response token-by-token. A Stop button aborts the in-flight stream mid-token via AbortController.

  • Model-agnostic Pydantic AI agent: A single, well-scoped tool — search_knowledge_base(category, query) — lets the agent decide when to retrieve. The system prompt is fetched from the database on each conversation, so persona changes ship without a deploy, and the underlying LLM provider (Claude, OpenAI, …) can be swapped without touching the agent's code.

  • RAG with pgvector: OpenAI text-embedding-3-small (1536-dim) indexed in the same PostgreSQL instance via pgvector. Cosine-distance similarity search filtered by category, with embeddings auto-regenerated whenever an admin edits a knowledge-base entry.

  • Realtime admin dashboard: Next.js admin at /admin with live KPI cards, throughput and latency charts, and full CRUD over users, conversations, messages, knowledge base, and agent contexts. Authenticated via an httpOnly JWT cookie and backed by a WebSocket pub/sub for live counters.

  • Persistent per-browser history: Each browser gets a UUID stored in localStorage; the conversation reloads on revisit. The server owns all message IDs and timestamps, so there's no optimistic UI or client-side dedupe.

  • Containerized, self-hosted backend: FastAPI, PostgreSQL + pgvector, Redis (for fastapi-limiter rate limits), and Caddy with automatic HTTPS, all orchestrated by Docker Compose. Logfire and Sentry trace every request, model call, and SQL query.
