
AI Agent Engineering Handbook

"The best AI engineers understand both the models and the engineering."

This knowledge base builds a complete technical path from LLM fundamentals to production AI agent systems.

The Core Formula

Agent = Model (Brain) + Prompt (Instruction) + Memory (RAG/Context) + Tools (MCP) + Planning (Architecture)
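This formula can be read in code as a composition of small interfaces. The sketch below is illustrative only; none of the names come from a specific framework, and planning (the control loop) is omitted for brevity:

```java
import java.util.Map;

// Illustrative interfaces for the components of the formula.
interface Model  { String complete(String prompt); }          // the "brain"
interface Memory { String retrieve(String query); }           // RAG / context
interface Tool   { String invoke(String args); }              // MCP-style tools

public class Agent {
    private final Model model;
    private final String systemPrompt;                        // the "instruction"
    private final Memory memory;
    private final Map<String, Tool> tools;                    // looked up by name

    public Agent(Model model, String systemPrompt, Memory memory, Map<String, Tool> tools) {
        this.model = model;
        this.systemPrompt = systemPrompt;
        this.memory = memory;
        this.tools = tools;
    }

    // One turn: ground the request with memory, then ask the model.
    // Tool dispatch and the planning loop are deliberately left out here.
    public String respond(String userInput) {
        String context = memory.retrieve(userInput);
        return model.complete(systemPrompt + "\nContext: " + context + "\nUser: " + userInput);
    }
}
```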

1. System Architecture Overview

How the seven modules logically depend on each other:

  • LLM Foundational provides the computational and reasoning foundation
  • Prompt & Context are the medium through which we interact with models
  • RAG supplies models with static knowledge
  • MCP supplies models with dynamic tool access
  • Agents orchestrate and coordinate all of the above
  • Ops & Security runs through the entire lifecycle

2. Module Synopsis

| ID | Module | One-Liner Definition | Key Technologies & Keywords |
|----|--------|----------------------|-----------------------------|
| 01 | LLM Foundational | Understanding the "brain": its mechanism, training pipeline, and physical limitations | Transformer, Attention, Pre-training, RLHF, Tokenization, Inference Params (Temp/Top-P) |
| 02 | Prompt Engineering | Writing "instruction code" to elicit reasoning and standardize output format | Chain-of-Thought (CoT), Few-shot, ReAct, XML/JSON Output, Persona |
| 03 | RAG | Augmenting models with an external "library" to curb hallucinations and inject private data | Vector DB, Embeddings, Chunking, Hybrid Search, Grounding, Self-Querying |
| 04 | Agents | Evolving from "chat" to "action" with planning, reflection, and tool use | Orchestration, Loop Control, Reflection, Router, Multi-Agent (Supervisor/Hierarchical) |
| 05 | MCP | Model Context Protocol: a standardized connector (the "USB-C" of AI) decoupling models from tools | Host/Client/Server, Resources, Tools, Prompts, JSON-RPC, Stdio/SSE |
| 06 | Context Engineering | Managing the model's "attention" window and long/short-term memory to prevent overload | KV Cache, Context Window, Short/Long-term Memory, Information Compression |
| 07 | AgentOps & Security | Converting demos to production applications with safety, observability, and evaluation | Eval (LLM-as-a-Judge), Prompt Injection, Docker Deployment, Tracing |

3. Learning Paths

Choose your path based on your development goals.

The Builder Path (Practical Developer)

Goal: Quickly build a Java AI Agent that can access the web and query databases.

Recommended Order:

  1. 05 MCP: First understand how to write a tool (Server)
  2. 04 Agents: Learn how to make the model call this tool
  3. 02 Prompt: Optimize instructions for more accurate calls
  4. 07 Ops: Deploy to Docker (refer to Brave Search case)

Focus: Rapid iteration, working code, production deployment

The Architect Path (Architect/Researcher)

Goal: Design complex enterprise multi-agent systems.

Recommended Order:

  1. 01 Foundational: Understand model capability boundaries
  2. 04 Agents: Design multi-agent collaboration patterns
  3. 06 Context: Design memory systems to support long workflows
  4. 03 RAG: Plan enterprise knowledge base integration

Focus: System design, scalability patterns, architectural trade-offs


4. Quick References

Essential resources for common tasks, so you don't have to deep-dive into the full documentation.

Standard Agent System Prompt Template

See Template Guide

MCP Server Standard Code Structure (Java/Spring)

See Java Implementation Guide

RAG Chunking Strategy Cheat Sheet

See RAG Optimization Guide

Inference Parameter Cheat Sheet

| Parameter | Conservative | Creative | Coding |
|-----------|--------------|----------|--------|
| Temperature | 0.0 - 0.3 | 0.7 - 1.0 | 0.1 - 0.2 |
| Top-P | 0.9 | 0.95 | 0.9 |
| Max Tokens | 1024 | 2048 | 4096 |
| Frequency Penalty | 0.0 | 0.3 | 0.0 |
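The preset values above can be bundled into a request-body helper. A sketch, picking one value from each range; the field names follow the common OpenAI-style request body, so check your provider's API before relying on them:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InferencePresets {
    // Values taken from the cheat sheet ranges above.
    public static Map<String, Object> preset(String mode) {
        Map<String, Object> p = new LinkedHashMap<>();
        switch (mode) {
            case "conservative" -> { p.put("temperature", 0.2); p.put("top_p", 0.9);  p.put("max_tokens", 1024); p.put("frequency_penalty", 0.0); }
            case "creative"     -> { p.put("temperature", 0.9); p.put("top_p", 0.95); p.put("max_tokens", 2048); p.put("frequency_penalty", 0.3); }
            case "coding"       -> { p.put("temperature", 0.1); p.put("top_p", 0.9);  p.put("max_tokens", 4096); p.put("frequency_penalty", 0.0); }
            default -> throw new IllegalArgumentException("unknown mode: " + mode);
        }
        return p;
    }
}
```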

5. Navigation Guide

Core Modules

  • LLM Foundational - Transformer architecture, training, inference, limitations
  • Prompt Engineering - CoT, few-shot, ReAct patterns, output formatting
  • RAG - Vector databases, embeddings, retrieval strategies, grounding
  • Agents - Orchestration, multi-agent systems, planning, reflection
  • MCP - Protocol specification, server implementation, tools, resources
  • Context Engineering - Context windows, memory systems, optimization
  • AgentOps & Security - Deployment, monitoring, safety, incident response

6. Key Concepts at a Glance

Token Economics

  • 1 token ~= 0.75 words (English) ~= 4 characters
  • Context window = maximum tokens per request (varies by model)
  • KV cache = cached previous tokens for faster generation
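The "4 characters per token" rule of thumb gives a quick budgeting helper. A sketch only; real tokenizers (BPE-based) will produce different counts, so never use this for hard limits:

```java
public class TokenEstimate {
    // Rough heuristic: ~4 characters per English token.
    // Good enough for cost/context budgeting, not for enforcement.
    public static int estimateTokens(String text) {
        return (int) Math.ceil(text.length() / 4.0);
    }
}
```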

The RAG Pipeline

Query -> Embedding -> Vector Search -> Context Assembly -> LLM -> Response
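The pipeline above can be sketched end to end. The embedding step is assumed to happen elsewhere (a real system calls an embedding model); retrieval here is a brute-force cosine-similarity scan, where a production system would use a vector database:

```java
import java.util.Comparator;
import java.util.List;

public class RagPipeline {
    public record Doc(String text, double[] embedding) {}

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Vector search: the document closest to the query embedding.
    public static Doc retrieve(double[] queryEmbedding, List<Doc> corpus) {
        return corpus.stream()
            .max(Comparator.comparingDouble((Doc d) -> cosine(queryEmbedding, d.embedding())))
            .orElseThrow();
    }

    // Context assembly: ground the prompt in the retrieved text.
    public static String assemblePrompt(String query, Doc retrieved) {
        return "Answer using ONLY this context:\n" + retrieved.text() + "\n\nQuestion: " + query;
    }
}
```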

Agent Decision Loop

Observe -> Reason -> Act -> Observe -> Reason -> Act ...
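A minimal sketch of this loop, with a step budget so it cannot run forever. The `reason` and `act` functions stand in for a model call and a tool call, and the `FINISH:` convention is illustrative:

```java
import java.util.function.Function;

public class AgentLoop {
    // Observe -> Reason -> Act, bounded by maxSteps (a common production safeguard).
    public static String run(String observation,
                             Function<String, String> reason,
                             Function<String, String> act,
                             int maxSteps) {
        for (int step = 0; step < maxSteps; step++) {
            String thought = reason.apply(observation);        // Reason
            if (thought.startsWith("FINISH:")) {
                return thought.substring("FINISH:".length()).trim();
            }
            observation = act.apply(thought);                  // Act; result becomes the next observation
        }
        return "stopped: step budget exhausted";
    }
}
```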

MCP Connection Model

Host (App) <-> Client (Protocol) <-> Server (Tool/Data)
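On the wire, Client and Server speak JSON-RPC 2.0. A hand-built sketch of a `tools/call` request just to show the envelope shape; a real client would use a JSON library or the official MCP SDK rather than string concatenation:

```java
public class McpRequest {
    // Minimal JSON-RPC 2.0 envelope for an MCP tool invocation.
    // Caller supplies the tool's arguments as an already-encoded JSON object.
    public static String toolCall(int id, String toolName, String argsJson) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
             + ",\"method\":\"tools/call\",\"params\":{\"name\":\"" + toolName
             + "\",\"arguments\":" + argsJson + "}}";
    }
}
```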

7. Common Patterns

Pattern 1: ReAct Agent

Thought: [Analyze the situation]
Action: [Call a tool]
Observation: [Review result]
Thought: [Plan next step]
Action: [Continue or finish]
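An agent runtime has to pull the `Action:` line out of model output shaped like the trace above. A minimal sketch using a regular expression:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReActParser {
    // Matches each "Action: ..." line in a multi-line ReAct trace.
    private static final Pattern ACTION = Pattern.compile("^Action:\\s*(.+)$", Pattern.MULTILINE);

    // Returns the most recent Action in the model's output,
    // or null if the model produced no action this turn.
    public static String lastAction(String modelOutput) {
        Matcher m = ACTION.matcher(modelOutput);
        String last = null;
        while (m.find()) last = m.group(1).trim();
        return last;
    }
}
```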

Pattern 2: Router Agent

Classify Query -> Route to Specialist Agent -> Aggregate Results
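A sketch of the routing step. The keyword-based classifier stands in for what is usually a cheap LLM classification call, and the specialist names are illustrative:

```java
import java.util.Map;
import java.util.function.Function;

public class Router {
    // Classify the query, then dispatch to the matching specialist agent.
    public static String route(String query, Map<String, Function<String, String>> specialists) {
        String lower = query.toLowerCase();
        String key = lower.contains("sql")  ? "database"
                   : lower.contains("http") ? "web"
                   : "general";
        return specialists.getOrDefault(key, q -> "no specialist for: " + q).apply(query);
    }
}
```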

Pattern 3: Hierarchical Agents

Supervisor Agent -> Worker Agents -> Report Back -> Synthesize

8. Production Checklist

Before deploying to production:

  • All tools have proper error handling
  • Sensitive operations require human approval
  • Comprehensive audit logging enabled
  • Kill switches implemented and tested
  • Rate limiting configured
  • Cost controls in place
  • Monitoring dashboards active
  • Incident response procedures documented
  • Security review completed
  • Load testing performed
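As one concrete example from this checklist, rate limiting can start as a simple token bucket. This sketch takes elapsed time as a parameter to stay deterministic and testable; production code would read the clock and handle concurrency:

```java
public class TokenBucket {
    private final double capacity;          // maximum burst size
    private final double refillPerSecond;   // sustained request rate
    private double tokens;

    public TokenBucket(double capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;             // start full
    }

    // Refill for the elapsed interval, then try to spend one token.
    public boolean tryAcquire(double elapsedSeconds) {
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        if (tokens >= 1.0) { tokens -= 1.0; return true; }
        return false;
    }
}
```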

Get Started

New to AI engineering? Start with LLM Foundational to understand how models work, then move to Prompt Engineering to learn effective prompting patterns.

For Java Developers

If you're building AI applications with Spring Boot, check out MCP for standardized tool integration and AgentOps for production deployment patterns.