AI Agent Engineering Handbook

"The best AI engineers understand both the models and the engineering."

This knowledge base covers the full technical path from LLM fundamentals to production AI agent systems.

The Core Formula​

Agent = Model (Brain) + Prompt (Instruction) + Memory (RAG/Context) + Tools (MCP) + Planning (Architecture)

1. System Architecture Overview​

[Diagram: how the 7 modules logically depend on each other]

How the modules connect:

  • LLM Foundational provides the compute and reasoning foundation
  • Prompt & Context are the medium for interacting with models
  • RAG supplies static knowledge to models
  • MCP supplies dynamic tool access to models
  • Agents orchestrate and coordinate all of the components above
  • Ops & Security runs through the entire lifecycle

2. Module Synopsis​

| ID | Module | One-Liner Definition | Key Technologies & Keywords |
| --- | --- | --- | --- |
| 01 | LLM Foundational | Understanding the "brain" mechanism, training pipeline, and physical limitations | Transformer, Attention, Pre-training, RLHF, Tokenization, Inference Params (Temp/Top-P) |
| 02 | Prompt Engineering | Writing "instruction code" to elicit reasoning and standardize output format | Chain-of-Thought (CoT), Few-shot, ReAct, XML/JSON Output, Persona |
| 03 | RAG | Augmenting models with an external "library" to reduce hallucinations and inject private data | Vector DB, Embeddings, Chunking, Hybrid Search, Grounding, Self-Querying |
| 04 | Agents | Evolving from "chat" to "action" with planning, reflection, and tool use | Orchestration, Loop Control, Reflection, Router, Multi-Agent (Supervisor/Hierarchical) |
| 05 | MCP | Model Context Protocol: a standardized connection for AI (like USB-C) that decouples models from tools | Host/Client/Server, Resources, Tools, Prompts, JSON-RPC, Stdio/SSE |
| 06 | Context Engineering | Managing the model's "attention" window and long/short-term memory to prevent overload | KV Cache, Context Window, Short/Long-term Memory, Information Compression |
| 07 | AgentOps & Security | Converting demos to production applications with safety, observability, and evaluation | Eval (LLM-as-a-Judge), Prompt Injection, Docker Deployment, Tracing |

3. Learning Paths​

Choose your path based on your development goals.

The Builder Path (Practical Developer)​

Goal: Quickly build a Java AI Agent that can access the web and query databases.

Recommended Order:

  1. 05 MCP: First understand how to write a tool (Server)
  2. 04 Agents: Learn how to make the model call this tool
  3. 02 Prompt: Optimize instructions for more accurate calls
  4. 07 Ops: Deploy with Docker (see the Brave Search case)

Focus: Rapid iteration, working code, production deployment

The Architect Path (Architect/Researcher)​

Goal: Design complex enterprise multi-agent systems.

Recommended Order:

  1. 01 Foundational: Understand model capability boundaries
  2. 04 Agents: Design multi-agent collaboration patterns
  3. 06 Context: Design memory systems to support long workflows
  4. 03 RAG: Plan enterprise knowledge base integration

Focus: System design, scalability patterns, architectural trade-offs


4. Quick References​

Essential resources for common tasks, so you can avoid deep-diving into documentation.

Standard Agent System Prompt Template​

See Template Guide

MCP Server Standard Code Structure (Java/Spring)​

See Java Implementation Guide

RAG Chunking Strategy Cheat Sheet​

See RAG Optimization Guide

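Typical inference parameter settings by use case:

| Parameter | Conservative | Creative | Coding |
| --- | --- | --- | --- |
| Temperature | 0.0 - 0.3 | 0.7 - 1.0 | 0.1 - 0.2 |
| Top-P | 0.9 | 0.95 | 0.9 |
| Max Tokens | 1024 | 2048 | 4096 |
| Frequency Penalty | 0.0 | 0.3 | 0.0 |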
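
The parameter rows above can be kept in code as named presets. This is a minimal sketch, not a recommended configuration: the `PRESETS` dictionary and `sampling_params` helper are hypothetical names, and picking the midpoint of each temperature range is an assumption made only so the function returns a single value.

```python
# Presets transcribed from the parameter table above.
# Temperature is stored as a (low, high) range, as in the table.
PRESETS = {
    "conservative": {"temperature": (0.0, 0.3), "top_p": 0.9,
                     "max_tokens": 1024, "frequency_penalty": 0.0},
    "creative": {"temperature": (0.7, 1.0), "top_p": 0.95,
                 "max_tokens": 2048, "frequency_penalty": 0.3},
    "coding": {"temperature": (0.1, 0.2), "top_p": 0.9,
               "max_tokens": 4096, "frequency_penalty": 0.0},
}


def sampling_params(profile: str) -> dict:
    """Return one concrete parameter set for a profile.

    Assumption of this sketch: use the midpoint of the temperature range.
    """
    preset = PRESETS[profile]
    low, high = preset["temperature"]
    params = dict(preset)  # copy so the shared preset is not mutated
    params["temperature"] = round((low + high) / 2, 2)
    return params
```

Keeping the ranges in the preset and resolving them in one place makes it easy to swap the midpoint rule for something else, such as always taking the lower bound.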

5. Navigation Guide​

Core Modules​

  • LLM Foundational - Transformer architecture, training, inference, limitations
  • Prompt Engineering - CoT, few-shot, ReAct patterns, output formatting
  • RAG - Vector databases, embeddings, retrieval strategies, grounding
  • Agents - Orchestration, multi-agent systems, planning, reflection
  • MCP - Protocol specification, server implementation, tools, resources
  • Context Engineering - Context windows, memory systems, optimization
  • AgentOps & Security - Deployment, monitoring, safety, incident response

6. Key Concepts at a Glance​

Token Economics​

  • 1 token ~= 0.75 words (English) ~= 4 characters
  • Context window = maximum tokens the model can handle in one request, prompt and output combined (varies by model)
  • KV cache = cached attention keys and values for previous tokens, so generation does not recompute them
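
The rules of thumb above are enough for a rough budget check before sending a request. The sketch below is an approximation only: `estimate_tokens` and `fits_context` are hypothetical helpers, and a real tokenizer for the target model will give different counts.

```python
import math

CHARS_PER_TOKEN = 4     # rule of thumb from the list above
WORDS_PER_TOKEN = 0.75  # rule of thumb from the list above (English)


def estimate_tokens(text: str) -> int:
    """Rough estimate: the larger of the character-based and word-based guesses."""
    by_chars = math.ceil(len(text) / CHARS_PER_TOKEN)
    by_words = math.ceil(len(text.split()) / WORDS_PER_TOKEN)
    return max(by_chars, by_words)


def fits_context(prompt: str, context_window: int, reserved_for_output: int) -> bool:
    """Check that the prompt plus the reserved output budget fits in the window."""
    return estimate_tokens(prompt) + reserved_for_output <= context_window
```

Taking the larger of the two guesses errs on the side of overestimating, which is the safer direction when the goal is to avoid exceeding the window.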

The RAG Pipeline​

Query -> Embedding -> Vector Search -> Context Assembly -> LLM -> Response
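
The pipeline stages can be sketched end to end with toy components. Everything here is a stand-in chosen for illustration: `embed` is a bag-of-words counter rather than a real embedding model, the "vector search" is a linear scan with cosine similarity rather than a vector database, and `llm` is any callable that takes a prompt string.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Vector search: rank chunks by similarity to the query, keep the top k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Context assembly: put the retrieved chunks in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def answer(query: str, chunks: list[str], llm, k: int = 2) -> str:
    """Query -> Embedding -> Vector Search -> Context Assembly -> LLM -> Response."""
    return llm(build_prompt(query, retrieve(query, chunks, k)))
```

Passing `llm=lambda prompt: prompt` is a quick way to inspect exactly what context the model would receive.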

Agent Decision Loop​

Observe -> Reason -> Act -> Observe -> Reason -> Act ...
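
In code, the loop needs an explicit stopping rule so it cannot run forever. The sketch below assumes three hypothetical callables (`observe`, `reason`, `act`) and a `"finish"` sentinel; in a real agent, `reason` would be a model call and `act` a tool invocation.

```python
def run_agent(observe, reason, act, max_steps: int = 5) -> list:
    """Run Observe -> Reason -> Act until the reasoner finishes or the budget ends.

    Returns the history of (observation, decision, result) tuples.
    """
    history = []
    for _ in range(max_steps):
        observation = observe()                  # Observe: read the environment
        decision = reason(observation, history)  # Reason: choose the next action
        if decision == "finish":
            break
        result = act(decision)                   # Act: execute the chosen action
        history.append((observation, decision, result))
    return history
```

The `max_steps` budget is the loop-control piece: even if the reasoner never emits `"finish"`, the agent stops after a bounded number of actions.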

MCP Connection Model​

Host (App) <-> Client (Protocol) <-> Server (Tool/Data)
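
MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and arguments. The sketch below shows only that message shape with an in-process stand-in for the server: the `add` tool and both helper functions are hypothetical, and the initialization handshake and the actual transport (stdio or HTTP) are omitted.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Client side: build a JSON-RPC 2.0 request for the tools/call method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# Hypothetical tool registry for this sketch; each tool returns text.
TOOLS = {"add": lambda args: str(args["a"] + args["b"])}


def _error(request_id, code: int, message: str) -> str:
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "error": {"code": code, "message": message}})


def handle_request(raw: str) -> str:
    """Server side (simplified): dispatch a tools/call request to a tool."""
    req = json.loads(raw)
    if req.get("method") != "tools/call":
        return _error(req.get("id"), -32601, "Method not found")
    params = req["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        return _error(req["id"], -32602, f"Unknown tool: {params['name']}")
    text = tool(params.get("arguments", {}))
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    })
```

The response echoes the request `id`, which is how the client matches results to outstanding calls when several are in flight.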

7. Common Patterns​

Pattern 1: ReAct Agent​

Thought: [Analyze the situation]
Action: [Call a tool]
Observation: [Review result]
Thought: [Plan next step]
Action: [Continue or finish]
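
A driver for this pattern parses each model turn, runs the requested tool, and appends the observation to the transcript. This is a minimal sketch under stated assumptions: the `Action: tool[input]` and `Final Answer:` line formats are conventions chosen for this example, and `llm` is any callable that maps the transcript so far to the next Thought/Action text.

```python
import re

# Assumed output conventions for this sketch.
ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)
FINAL_RE = re.compile(r"^Final Answer:\s*(.+)$", re.MULTILINE)


def react(question: str, llm, tools: dict, max_turns: int = 5):
    """Alternate model turns and tool observations until a final answer appears.

    Returns the final answer string, or None if the turn budget runs out.
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        step = llm(transcript)
        transcript += step + "\n"
        final = FINAL_RE.search(step)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(step)
        if action is None:
            # Feed the problem back so the model can correct its format.
            transcript += "Observation: no valid action found\n"
            continue
        name, arg = action.groups()
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        transcript += f"Observation: {observation}\n"
    return None
```

Reporting malformed actions and unknown tools as observations, rather than raising, gives the model a chance to recover within the same loop.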

Pattern 2: Router Agent​

Classify Query -> Route to Specialist Agent -> Aggregate Results
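
The three stages can be sketched with a keyword classifier standing in for an LLM-based one. The `KEYWORDS` table, the label names, and the `" | "` aggregation format are all assumptions of this example; `specialists` maps each label to a callable that answers the query.

```python
# Hypothetical routing table: label -> trigger keywords.
KEYWORDS = {
    "database": ("sql", "database", "table"),
    "web": ("search", "web", "news"),
}


def classify(query: str) -> list[str]:
    """Classify: return every matching label, or ["general"] if none match."""
    q = query.lower()
    labels = [label for label, words in KEYWORDS.items()
              if any(w in q for w in words)]
    return labels or ["general"]


def route(query: str, specialists: dict) -> str:
    """Route to each matching specialist, then aggregate their answers."""
    results = {label: specialists[label](query) for label in classify(query)}
    return " | ".join(f"{label}: {text}" for label, text in results.items())
```

Allowing more than one label per query is what makes the aggregation step meaningful; with a single-label classifier it reduces to passing one answer through.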

Pattern 3: Hierarchical Agents​

Supervisor Agent -> Worker Agents -> Report Back -> Synthesize
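
One way to sketch this pattern is a supervisor function that fans subtasks out to workers and merges their reports. The `plan`, `workers`, and `synthesize` parameters are hypothetical callables supplied by the caller; running workers on a thread pool is a design choice of this example, not something the pattern requires.

```python
from concurrent.futures import ThreadPoolExecutor


def supervise(task: str, plan, workers: dict, synthesize):
    """Supervisor -> Workers -> Report Back -> Synthesize.

    plan(task) returns a list of (worker_name, subtask) pairs.
    workers maps a worker name to a callable that handles one subtask.
    synthesize(task, reports) merges the (worker_name, report) pairs.
    """
    subtasks = plan(task)
    with ThreadPoolExecutor() as pool:
        # Fan out: each worker handles its subtask concurrently.
        futures = [(name, pool.submit(workers[name], sub)) for name, sub in subtasks]
        # Report back: collect results in the order the plan listed them.
        reports = [(name, future.result()) for name, future in futures]
    return synthesize(task, reports)
```

Collecting reports in plan order keeps the synthesis step deterministic even though the workers may finish in any order.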

8. Production Checklist​

Before deploying to production:

  • All tools have proper error handling
  • Sensitive operations require human approval
  • Comprehensive audit logging enabled
  • Kill switches implemented and tested
  • Rate limiting configured
  • Cost controls in place
  • Monitoring dashboards active
  • Incident response procedures documented
  • Security review completed
  • Load testing performed
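
Several checklist items (error handling, human approval, audit logging, kill switch, rate limiting) can be enforced at a single choke point around tool calls. The `ToolGuard` class below is a hypothetical sketch of that idea with in-memory state only; a production system would persist the audit log and share limits across processes.

```python
import time


class ToolGuard:
    """Wrap tool calls with a kill switch, rate limit, approval gate, and audit log."""

    def __init__(self, max_calls_per_minute, sensitive, approve, clock=time.monotonic):
        self.max_calls = max_calls_per_minute
        self.sensitive = set(sensitive)  # tool names that need human approval
        self.approve = approve           # callable(name, args) -> bool
        self.clock = clock               # injectable for testing
        self.killed = False
        self.calls = []                  # timestamps of recent executed calls
        self.audit_log = []              # (timestamp, tool name, outcome)

    def kill(self):
        """Kill switch: block every subsequent call."""
        self.killed = True

    def call(self, name, fn, *args):
        now = self.clock()
        # Keep only calls from the last 60 seconds for the rate limit.
        self.calls = [t for t in self.calls if now - t < 60]
        result = None
        if self.killed:
            outcome = "blocked: kill switch"
        elif len(self.calls) >= self.max_calls:
            outcome = "blocked: rate limit"
        elif name in self.sensitive and not self.approve(name, args):
            outcome = "blocked: approval denied"
        else:
            self.calls.append(now)
            try:
                result = fn(*args)
                outcome = "ok"
            except Exception as exc:  # tool errors are recorded, not propagated
                outcome = f"error: {exc}"
        self.audit_log.append((now, name, outcome))
        return outcome, result
```

Because blocked and failed calls are logged alongside successful ones, the audit log doubles as the data source for the monitoring and incident-response items in the checklist.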

Get Started

New to AI engineering? Start with LLM Foundational to understand how models work, then move to Prompt Engineering to learn effective prompting patterns.

For Java Developers

If you're building AI applications with Spring Boot, check out MCP for standardized tool integration and AgentOps for production deployment patterns.