// Solidifying Core Engineering Foundations
// “Fixing the Cracks in the Foundation”
- Setup: Python, Jupyter, uv, virtual envs, Git/GitHub workflow
- Local LLM setup
- FastAPI AI endpoint skeleton
- Docker intro and running services locally
- UX: Clear inputs/outputs for an AI endpoint
- Tracing a first call (LangSmith or equivalent)
Mini-project: Build a simple chat app + an automation flow (n8n)
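The “FastAPI AI endpoint skeleton” and “clear inputs/outputs” bullets boil down to one contract: validate the request, return a predictable envelope. A minimal framework-free sketch of that contract (field names like `ok`/`reply` are illustrative, not a required schema; in the course this function would sit behind a FastAPI route):

```python
def chat_endpoint(payload):
    """Validate the request body and return a structured reply envelope.

    Every response has the same shape -- ok/error/reply -- so the client
    UI never has to guess what came back.
    """
    if not isinstance(payload, dict) or not isinstance(payload.get("message"), str):
        return {"ok": False, "error": "missing or non-string 'message' field", "reply": None}
    if not payload["message"].strip():
        return {"ok": False, "error": "'message' must be non-empty", "reply": None}
    # A real endpoint would call the local LLM here; the skeleton just echoes.
    return {"ok": True, "error": None, "reply": f"echo: {payload['message']}"}
```

The point of the fixed envelope is that error states are first-class outputs, not exceptions the frontend has to reverse-engineer.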
// Prompting, Structured Outputs, and Calling AI Services
// “From Order to Chaos”
- Calling hosted LLMs (OpenAI-compatible), structured outputs
- Frontier & multi-provider overview (compare GPT / Claude / Gemini / Grok / etc.)
- Structured JSON & Chaining
- Prompt patterns (system/user, few-shot, tool-calling)
- UX: Display raw model response + validation errors
- Tracing and comparing two prompt variants
- Error handling, retries, timeouts
Project 1: Task Assistant (JSON + showing steps)
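The structured-JSON and “display raw response + validation errors” bullets combine naturally: parse the model's text as JSON, check it against an expected schema, and hand back *both* the data and the errors so the UI can show what went wrong. A minimal sketch (the `task`/`priority`/`steps` schema is a made-up example, not part of the syllabus):

```python
import json

# Hypothetical schema for the Task Assistant's structured output.
REQUIRED = {"task": str, "priority": int, "steps": list}

def parse_structured(raw: str):
    """Try to parse a raw model reply into the expected schema.

    Returns (data, errors). On failure data is None and errors lists the
    problems, so the UI can render the raw response next to the errors.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc}"]
    errors = []
    for field, expected_type in REQUIRED.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(data[field]).__name__}"
            )
    return (None, errors) if errors else (data, [])
```

In practice you would retry the model call with the error list appended to the prompt, which ties this directly into the retries bullet above.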
Student Evaluation Committee - EVALUATION & NOMINATION
// Retrieval-Augmented Generation with Transparent UX
// “Patterns that Make the Code Dance”
- RAG pipeline: chunking, embeddings, vector DB
- Build a vector store with Chroma or FAISS
- Visualize embeddings with t-SNE
- Retrieval => Generation wiring
- UX: Show retrieved sources, confidence / “no result” state
- Logging RAG vs non-RAG runs
- Handling RAG failure modes
- Intro to Gradio
Project 2: Knowledge assistant with Source panel
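Before embeddings and a vector DB enter the picture, the chunking step of the RAG pipeline is plain string processing. A minimal stand-in (real pipelines split on tokens or sentences rather than characters, and the sizes here are arbitrary):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping fixed-size character windows.

    The overlap keeps content that straddles a chunk boundary retrievable
    from at least one chunk -- the reason most RAG chunkers overlap at all.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
    return chunks
```

Each chunk would then be embedded and stored in Chroma or FAISS, keeping its source metadata so the “show retrieved sources” panel in Project 2 has something to display.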
// Tool-Chaining, LangGraph/LangChain, and Action Traces
// “The School”
- How tool calling really works (Tracing and Inspecting)
- Agents with LangChain/LangGraph (multi-tool, planning/reactive)
- Tool whitelisting and safe inputs
- UX: Agent timeline (step-by-step), error surface (Gradio)
- Tracing multi-step runs and tool calls
- Simple routing and Multi-model conversations
- Deep Research / Claude Code / Agent Mode
Project 3: Agent that calls at least two tools + triggers an external automation via webhook
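The “tool whitelisting and safe inputs” bullet is mostly a dispatch problem: the model proposes a tool call, and only registered tools may run. A sketch assuming the common `{"name": ..., "args": {...}}` shape that tool-calling APIs emit after JSON decoding (the tools themselves are toy examples):

```python
# The whitelist: only these callables can ever be invoked by the model.
ALLOWED_TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def dispatch(tool_call):
    """Execute a model-proposed tool call only if the tool is whitelisted.

    Unknown tools and bad arguments return an explicit error envelope
    instead of raising, so the agent timeline can surface the failure.
    """
    name = tool_call.get("name")
    if name not in ALLOWED_TOOLS:
        return {"ok": False, "error": f"tool '{name}' is not whitelisted"}
    try:
        result = ALLOWED_TOOLS[name](**tool_call.get("args", {}))
    except TypeError as exc:
        return {"ok": False, "error": f"bad arguments: {exc}"}
    return {"ok": True, "result": result}
```

LangChain/LangGraph wrap this loop with planning and retries, but the safety property lives in this one lookup: the model chooses *which* whitelisted tool runs, never *what code* runs.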
// APIs, Containers, CI/CD, and Operable AI Endpoints
Student Evaluation Committee - EVALUATION & NOMINATION
// “Developing for the Pocket-Sized World”
- Turning LLM/RAG/agent into microservices (FastAPI)
- Docker packaging and reproducible environments
- CI/CD for AI services: Smoke prompts / RAG regression suite
- UX: Service status + last successful run
- Observability: Logging Prod vs Local runs (OTel, LangSmith, etc.)
Project 4: Containerized RAG service
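The “smoke prompts” bullet in the CI/CD line can be as small as a list of (prompt, expected substring) pairs run against the deployed service. A sketch where `model` is any `str -> str` callable (a thin client for the container in CI, a stub in unit tests); the prompts and checks are toy examples:

```python
SMOKE_PROMPTS = [
    # (prompt, substring the answer must contain). Toy checks -- real
    # suites use golden datasets and looser semantic assertions.
    ("What is 2+2? Answer with the digit only.", "4"),
    ("Name the capital of France.", "Paris"),
]

def run_smoke(model):
    """Run every smoke prompt through `model` and collect failures.

    An empty return means the gate passes; CI fails the build otherwise.
    """
    failures = []
    for prompt, expected in SMOKE_PROMPTS:
        answer = model(prompt)
        if expected.lower() not in answer.lower():
            failures.append((prompt, answer))
    return failures
```

Wiring this into the pipeline after `docker build` gives a cheap regression gate before anything ships.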
// Quality Gates, Cost/Latency Tracking, and Run Comparisons
// “The Pulse of Real-Time Web”
- The Chinchilla Scaling Law
- Cost and latency tracking (basic model “tiering”)
- LLM evaluation: golden datasets, regression on prompts
- Benchmarks & Leaderboards
- UX: Show latency and cost per request
- Running eval suites and comparing runs over time
- Using LLM-as-judge and retrieval metrics (MRR/nDCG) for RAG evaluation
Project 5: Eval harness for an earlier agent
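Of the retrieval metrics named above, MRR is the simplest to hand-roll and makes a good first component of the eval harness. A sketch (nDCG follows the same per-query loop with a log-discounted gain instead of 1/rank):

```python
def mean_reciprocal_rank(runs):
    """MRR over retrieval runs.

    runs: list of (ranked_doc_ids, relevant_doc_id) pairs.
    Each query contributes 1/rank of the first relevant document,
    or 0 if the relevant document was not retrieved at all.
    """
    if not runs:
        return 0.0
    total = 0.0
    for ranked, relevant in runs:
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id == relevant:
                total += 1.0 / rank
                break
    return total / len(runs)
```

Logging this number per run, next to latency and cost, is exactly what makes “comparing runs over time” possible.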
Student Evaluation Committee - EVALUATION & NOMINATION
// Baselines, Model Choice, and When Not to Use an LLM
// “The Final Showdown”
- Supervised ML (Metrics, Baselines, Overfitting): End-to-end baseline project
- When to pick classic ML vs LLM: Use baseline to compare
- UX: Explain “why this path” (LLM vs ML) to the user
- Hugging Face Datasets & Classic Pipelines
- Logging model choice and final outcome
Project 6: Train a small ML model
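The baseline bullet deserves one concrete instance, because it anchors the “when not to use an LLM” argument: any model, classic or LLM, must beat the dumbest possible predictor. A sketch of the majority-class baseline plus accuracy:

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a classifier that always predicts the most common training label.

    This is the floor every ML or LLM approach has to beat to justify
    its cost -- the baseline the module uses for comparison.
    """
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _example: majority

def accuracy(model, examples, labels):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in zip(examples, labels)) / len(labels)
```

On a heavily imbalanced dataset this baseline can score deceptively high, which is also why the module pairs it with metrics beyond accuracy.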
// Hybrid Search, Tenants, and Business Document Integrations
// “Voices from The Experts”
- Advanced RAG: hybrid/metadata/multi-tenant retrieval
- GraphRAG-style thinking and re-ranking
- UX: Show user/org context used for retrieval
- With vs without frameworks
- Enterprise Connectors & Ingestion (GDrive, Notion, S3)
- Tagging runs by user and analyzing failures
Project 7: Build an “Org knowledge agent” + event-driven ingestion with n8n
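Hybrid, metadata-filtered, multi-tenant retrieval reduces to: hard-filter by tenant first, then blend a keyword score with a vector score. A sketch where `vector_score` on each doc stands in for a real embedding similarity and the field names are illustrative:

```python
def hybrid_search(query_terms, docs, tenant, alpha=0.5):
    """Rank docs for one tenant by a keyword/vector score blend.

    Tenant filtering is a hard precondition, never a score penalty --
    documents from other orgs must be unreachable, not merely down-ranked.
    """
    scored = []
    for doc in docs:
        if doc["tenant"] != tenant:          # hard multi-tenant filter
            continue
        words = set(doc["text"].lower().split())
        keyword = len(set(query_terms) & words) / max(len(query_terms), 1)
        score = alpha * keyword + (1 - alpha) * doc["vector_score"]
        scored.append((score, doc["id"]))
    return [doc_id for _score, doc_id in sorted(scored, reverse=True)]
```

Tagging each query with the tenant and user (the last bullet above) is what later lets you analyze failures per org rather than in aggregate.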
// Prompt-Injection Defenses, Policies, and UX for Denied Actions
// “Voices from The Experts”
- Red Teaming labs: Prompt injection and jailbreak defenses
- Secure tool calling, role-based/action-based policies
- AI “Firewall” pattern
- UX: Explicit rejection/blocked-action states
- Safety Dashboards (number of blocked actions, jailbreak attempts, etc.)
- Logging blocked/denied runs with reasons
Project 8: Create a safe tool-calling agent
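The “explicit rejection state + logged reason” pairing in the bullets above can be sketched with a few crude regex heuristics (real defenses layer many signals: classifiers, output filters, tool policies; these two patterns are toy examples for the lab):

```python
import re

INJECTION_PATTERNS = [
    # Deliberately naive heuristics for the red-teaming lab.
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your |the )?system prompt", re.I),
]

def screen_input(user_text, log):
    """Return (allowed, user_facing_message).

    Blocked requests get an explicit rejection message and are appended
    to `log` with the matched reason -- feeding both the denied-action UX
    and the safety dashboard counts.
    """
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_text):
            log.append({"blocked": True, "reason": pattern.pattern, "text": user_text})
            return False, "Request blocked: possible prompt injection."
    return True, None
```

The design point is that a block is never silent: the user sees a state, and the dashboard sees a logged reason.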
// ASR/TTS, Image/Text Fusion, and Explainable Interactions
// “Voices from The Experts”
- Automatic Speech Recognition/Text-to-Speech (Whisper or OpenAI Audio)
- Multimodal RAG patterns (RAG over transcripts + documents)
- Logging raw audio/text and comparing outputs
- DALL-E 3 vs Stable Diffusion / FLUX
- UX: Show transcription + final answer + steps
Project 9: Build a voice/helpdesk agent + scheduled automations via n8n
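Multimodal RAG over transcripts plus documents mostly means building one corpus where every entry carries its modality and source. A sketch assuming Whisper-style transcript segments with a file name and start offset (the field names are illustrative):

```python
def build_corpus(transcripts, documents):
    """Merge ASR transcript segments and document chunks into one corpus.

    Each entry keeps its modality and a source tag, so the helpdesk UI
    can show the transcription, the final answer, and where each
    retrieved piece came from.
    """
    corpus = []
    for seg in transcripts:
        corpus.append({
            "text": seg["text"],
            "modality": "audio",
            "source": f"{seg['file']}@{seg['start_s']}s",
        })
    for doc in documents:
        corpus.append({
            "text": doc["text"],
            "modality": "document",
            "source": doc["name"],
        })
    return corpus
```

From here the entries are embedded and retrieved exactly like a text-only RAG corpus; only the provenance display differs.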
// Cloud Deploy, Monitoring, and Operating Agentic Apps
// “Voices from The Experts”
- Dataset curation sprint (data cleaning + JSONL) ⇒ upload to the Hugging Face Hub
- Fine-tuning pipeline (Validate, Launch, Track Loss)
- Deploying to cloud targets (Render/ECS/GCP)
- Queues/background jobs for long tasks
- Monitoring and alerting for AI services
- Fine-tuning vs just better RAG or prompts
- UX: Surface service health/degradation to the user
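The JSONL curation step benefits from a validation pass before anything is uploaded or a fine-tune is launched. A sketch for chat-style records (the `messages`/`role`/`content` shape is the common chat fine-tuning format; adjust to whatever format the chosen trainer expects):

```python
import json

def validate_jsonl(lines):
    """Validate chat-style fine-tuning records, one JSON object per line.

    Each record needs a non-empty 'messages' list whose entries all carry
    'role' and 'content'. Returns (n_valid, errors) so a curation sprint
    can fix bad lines instead of burning a training run on them.
    """
    valid, errors = 0, []
    for lineno, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            errors.append(f"line {lineno}: invalid JSON")
            continue
        messages = record.get("messages")
        if not isinstance(messages, list) or not messages:
            errors.append(f"line {lineno}: 'messages' missing or empty")
        elif any(not isinstance(m, dict) or "role" not in m or "content" not in m
                 for m in messages):
            errors.append(f"line {lineno}: malformed message entry")
        else:
            valid += 1
    return valid, errors
```

Running this as a pre-upload gate is the cheapest insurance in the whole fine-tuning pipeline.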
Final Project: Packaging
Final Student Evaluation Committee - EVALUATION & NOMINATION