Architecting a Modern Data Stack for AI Agents
PUT GUARDRAILS IN PLACE AT EVERY LEVEL
Large language models (LLMs) can go badly wrong in a lot of ways, and guardrails are essential. A guardrail is any mechanism in an agentic AI system that prevents errors and misleading hallucinations, protects users and organizations, ensures compliance with regulations, and blocks gross ethical violations. That's a lot of important work for something with such a nebulous definition.
Essentially, you need to think of as many ways as you can that the AI agent could go horribly wrong, then build logic or checks into the workflow to prevent each of them. Without guardrails, AIs have done everything from sending private data out in a mass email to spouting racial epithets to ignoring all job applicants who are female.
AI has no ethics, conscience, or understanding of right and wrong, and it can't distinguish between what's real and what's made up. Guardrails at every level, from data preprocessing to output checking, are not optional. Neither is thorough testing before pushing an agent to production, because Murphy's Law still applies, along with Finagle's Law and the Douglas Adams corollary: you might have thought of everything that could possibly go wrong, but many things that could not possibly go wrong probably will, and at the worst possible time. Security is also a big part of AI guardrails.
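As a deliberately minimal illustration of an output-level check, the sketch below screens a model reply for obvious PII patterns before it reaches the user. The patterns and the refusal message are assumptions for the example; production guardrail frameworks do far more than this.

```python
import re

# Patterns for two obvious PII leaks; real guardrails go much further.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.\w+"),               # email addresses
    re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),  # US-style phone numbers
]

def guard_output(reply: str) -> str:
    """Refuse to pass along model output that leaks contact details."""
    for pattern in PII_PATTERNS:
        if pattern.search(reply):
            # Block the whole response rather than risk leaking private data.
            return "I can't share that information."
    return reply

# The guardrail sits between the model and the user:
print(guard_output("Contact Jane at jane.doe@example.com"))  # blocked
print(guard_output("The meeting is at 3 p.m. on Tuesday."))  # passes
```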
People would call anyone who tested an important application only in production an idiot. Don't make the mistake of testing an AI agent that way either. Just because the agent is checking itself doesn't mean a human shouldn't double-check it. Every agent needs testing, even reflective agents that check other agents' output. NVIDIA offers some help with NeMo Guardrails, and there's a lot of information on GitHub under guardrails-ai.
NOW ARCHITECT THE ACTUAL AGENTIC WORKFLOW
Once the foundation is solid, you can build your house on it. Many of your choices about the agent's workflow will depend on the use case: what does it need to do?
People think first of simple GenAI usage, such as having an LLM transcribe or summarize a meeting. Agentic AI handles more complex tasks that require multiple steps. Often, the LLM breaks down the task, and that breakdown becomes the workflow.
Equally often, there's a router with coded-in logic that, upon receiving the initial request or the output of a completed action, directs the work to the appropriate next step in the flow.
If simple code works well as a router, then consider doing it with code. Simplicity is often a winner for long-term stability and predictability. Just because this is an AI workflow doesn't mean every single step has to be done by AI.
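A deterministic router can be as plain as a dispatch table. In this sketch the intents and handler functions are hypothetical placeholders; the point is that nothing about routing requires a model call.

```python
# A deterministic router: a plain dispatch table, no model call, fully
# predictable. The intents and handlers are hypothetical placeholders.
def summarize(request: str) -> str:
    return f"summary of: {request}"

def translate(request: str) -> str:
    return f"translation of: {request}"

ROUTES = {
    "summarize": summarize,
    "translate": translate,
}

def route(intent: str, request: str) -> str:
    handler = ROUTES.get(intent)
    if handler is None:
        raise ValueError(f"no route for intent {intent!r}")
    return handler(request)

print(route("summarize", "the quarterly sales meeting"))
```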
This works best when an AI agent is built for a specific type of task, so you know in advance what the workflow should be. When building that router, always consider whether actions can be done in parallel. If you need three pieces of information and a map to answer a prompt, there's no reason the router can't send out three data searches and a map application tool trigger at the same time, then fire a final LLM step that combines the results into a coherent reply.
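Here is a minimal sketch of that fan-out pattern using Python's asyncio. The search, map, and combining functions are stand-ins for real tool and model calls, not actual APIs.

```python
import asyncio

# Stand-ins for real tool calls: data searches, a map service, and a
# final LLM call that combines the pieces.
async def search(topic: str) -> str:
    await asyncio.sleep(0.1)  # simulate network latency
    return f"facts about {topic}"

async def fetch_map(place: str) -> str:
    await asyncio.sleep(0.1)
    return f"map of {place}"

async def combine_with_llm(parts: list[str]) -> str:
    return " | ".join(parts)  # a real agent would prompt an LLM here

async def answer(topics: list[str], place: str) -> str:
    # Fire all independent steps at once instead of one after another.
    results = await asyncio.gather(
        *(search(t) for t in topics),
        fetch_map(place),
    )
    return await combine_with_llm(list(results))

print(asyncio.run(answer(["traffic", "weather", "events"], "downtown")))
```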
MAKE THE LLM MODULE INTERCHANGEABLE
One of the most enduring principles in data architecture of any kind is DSOFU: Don’t screw over future you. One important way to be kind to your future self is to make everything in your architecture as modular and interchangeable as you can. That way, future you can pull out one outdated component and replace it with another without rebuilding everything connected to it.
This is especially true of LLMs, which need changing more often than your car's oil. GenAI models are constantly being retrained and updated, and during development one model might perform better than another at the task your agent is built to do. You can only find that out by swapping models and testing them, which is a lot easier if the models are interchangeable from the beginning.
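One common way to get that interchangeability, sketched below with hypothetical adapter classes, is to hide every model behind one narrow interface so the rest of the agent never imports a vendor SDK directly.

```python
from typing import Protocol

class LLMClient(Protocol):
    """The one narrow interface the rest of the agent depends on."""
    def complete(self, prompt: str) -> str: ...

# Hypothetical adapters; each would wrap one vendor's SDK behind the
# same complete() method.
class ModelA:
    def complete(self, prompt: str) -> str:
        return f"ModelA answer to: {prompt}"

class ModelB:
    def complete(self, prompt: str) -> str:
        return f"ModelB answer to: {prompt}"

def run_agent(llm: LLMClient, task: str) -> str:
    # The workflow never touches a vendor SDK directly, so future you
    # can swap models without rebuilding everything connected to them.
    return llm.complete(f"Plan and execute: {task}")

# Swapping models is now a one-line change:
print(run_agent(ModelA(), "summarize the meeting"))
print(run_agent(ModelB(), "summarize the meeting"))
```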