Building AI Agents: 9 Lessons We Learned Since 2023
Discover nine lessons we have learned from building AI agents, from fine-tuning models to seamless integration, scalability, and human-AI collaboration.
The AI landscape is evolving faster than ever, and building AI Agents for business has emerged as one of the most transformative areas in artificial intelligence.
At Multimodal, we’ve worked on numerous client projects, helping enterprises deploy AI Agents designed to handle more complex tasks, automate workflows, and deliver tangible impact. Along the way, we’ve learned some key lessons that shape how we handle agent development and ensure our products deliver value.
Hands-on experience and real-world challenges drive our approach, sometimes diverging from popular industry opinions.
Here are nine lessons we’ve learned about building AI Agents, grounded in real-world scenarios, customer feedback, and practical experience.
Lesson 1: AI Must Be Governed by Proper Guardrails
In high-stakes environments like banking and insurance, proper guardrails for AI systems are non-negotiable. These industries have sensitive workflows where the risk and impact of failure are significant.
To address these challenges, we’ve made a point of prioritizing three core AI Agent features: explainability, confidence scores, and self-learning abilities.
Why Explainability, Confidence Scores, and Self-Learning Matter
For sensitive workflows, features like explainability and confidence scores are vital. It's important to understand several things:
Where did the answer come from?
How did the AI Agent arrive at a particular decision?
How confident is the AI Agent about the decision?
While explainability shows you why the AI made a decision, confidence scores tell you how sure it is about it.
Self-learning capabilities, on the other hand, enable the AI to improve over time by learning from new data, feedback, and interactions.
All three features work together to ensure the AI Agent operates reliably, adapts to new challenges, and builds trust in high-stakes environments.
As Ankur Patel, Founder & CEO @ Multimodal, explains:
“Explainability, confidence scores, and self-learning capabilities ensure that human intervention can improve AI Agents when needed.”
These capabilities are non-negotiable in regulated industries like banking and insurance, where failure can have significant consequences. This complexity is where we excel.
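To make this concrete, here is a minimal sketch of how a decision payload can carry its confidence score and its provenance together, so a reviewer can audit both at a glance. The field names and example values are illustrative assumptions, not our production schema.

```python
from dataclasses import dataclass, field

@dataclass
class AgentDecision:
    """One agent output, packaged so a reviewer can audit it."""
    answer: str                                        # what the agent decided
    confidence: float                                  # 0.0-1.0, how sure the agent is
    sources: list[str] = field(default_factory=list)   # documents/records the answer came from
    reasoning: str = ""                                # short explanation of how it got there

def summarize(decision: AgentDecision) -> str:
    """Render an audit-friendly summary for logs or a review UI."""
    cited = ", ".join(decision.sources) or "no sources recorded"
    return (
        f"Answer: {decision.answer}\n"
        f"Confidence: {decision.confidence:.0%}\n"
        f"Based on: {cited}\n"
        f"Why: {decision.reasoning}"
    )

# Example: a hypothetical underwriting-style decision with its provenance attached
decision = AgentDecision(
    answer="Approve policy renewal",
    confidence=0.87,
    sources=["claims_history_2023.pdf", "underwriting_guidelines_v4.docx"],
    reasoning="No open claims in 24 months; premium within guideline band.",
)
print(summarize(decision))
```

Packaging every answer this way is what makes human intervention practical: a reviewer can see at a glance where the answer came from and how much to trust it.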
Balance Reasoning with Context-Specific Knowledge
The key is to balance reasoning with grounding in the right data.
This ensures that AI Agents align with the specific needs of a company’s departments, creating a balance between logic and domain expertise. By fine-tuning AI Agents on customer-specific workflows and internal raw data, we ensure their reasoning is both accurate and actionable. And that brings us to our second lesson.
Lesson 2: Context is Everything
The right decision requires the right context at the right time. AI Agents need to reason and problem-solve effectively, but their capabilities can fall short without a foundation of context-specific knowledge.
The quality and relevance of the data an AI Agent uses can make the difference between success and failure. Training data must reflect the business's specific operational context.
However, more data doesn’t equal good data, so it is essential that you prepare your data properly. Without the right data, even the most advanced AI tools can fail.
AI Agents must be trained on internal data to ensure they make decisions that align with company policies, industry regulations, and customer expectations. Fine-tuning ensures decisions are accurate and grounded in the business context.
Key Qualities of “Right Context”
AI Agents must operate on relevant, complete, and up-to-date data to make accurate decisions, as the quick validation sketch after this list illustrates:
Relevance: Data must directly support the task. Irrelevant or unrelated data can lead to inefficiencies or errors.
Completeness: Missing data creates gaps that undermine decision-making. AI Agents must have access to all required pieces of information to function optimally.
Timeliness: Outdated data reduces effectiveness and reliability, especially in fast-changing industries. Agents need access to the latest data to ensure decisions align with current realities.
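As a rough illustration of those three checks, the sketch below validates a record before an agent is allowed to use it. The required fields, the 30-day freshness threshold, and the example record are assumptions made for the sake of the example.

```python
from datetime import datetime, timedelta

# Illustrative thresholds; real values depend on the workflow and industry.
REQUIRED_FIELDS = {"customer_id", "policy_number", "last_updated"}
MAX_AGE = timedelta(days=30)

def check_record(record: dict, task_fields: set[str]) -> list[str]:
    """Return a list of data-quality problems; an empty list means the record is usable."""
    problems = []

    # Relevance: the record should actually carry fields this task needs.
    if not task_fields & record.keys():
        problems.append("no task-relevant fields present")

    # Completeness: every required field must be filled in.
    missing = REQUIRED_FIELDS - {k for k, v in record.items() if v not in (None, "")}
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")

    # Timeliness: stale records should be refreshed before the agent uses them.
    updated = record.get("last_updated")
    if updated and datetime.now() - updated > MAX_AGE:
        problems.append("record is older than 30 days")

    return problems

record = {
    "customer_id": "C-1042",
    "policy_number": "",
    "last_updated": datetime(2023, 1, 15),
}
# Flags the empty policy_number and the stale timestamp.
print(check_record(record, task_fields={"policy_number", "claim_amount"}))
```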
By balancing reasoning capabilities with high-quality, context-specific, and AI-ready data, we ensure our AI Agents can deliver actionable and trustworthy outputs.
Lesson 3: Fine-Tuning Improves, Not Worsens, How an AI Agent Performs
There’s a notion that fine-tuning AI models for specific tasks can limit their ability to reason independently. This is true to an extent.
But in practice, you need a combination: narrowly fine-tuned LLMs that are very good at specific tasks, plus models that orchestrate between those specialists and can be trained a bit more broadly.
Essentially, agent workflows consist of a mixture of broad and specific LLMs. We will touch upon this more in the lesson below.
Our experience has shown that building an AI Agent is not about picking an off-the-shelf solution. AI Agents are built, not bought. You want to focus on crafting a tailored system that meets your business's unique needs. Fine-tuning has consistently elevated the relevance and accuracy of our agents.
And so our cornerstone approach is to fine-tune models with internal data. Unlike the one-size-fits-all mentality, we focus on creating agents fine-tuned to client-specific workflows.
Fine-tuning is a foundation for building AI Agents that are both intelligent and relevant.
By adapting models to align with client workflows and unique training data, fine-tuning allows agents to perform more effectively in real-world scenarios. Whether automating insurance underwriting or optimizing supply chains, customization has proven to be a competitive advantage. This has been particularly critical in high-stakes environments where agents must rely on specific operational contexts.
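How the training data gets assembled varies by stack, but as a minimal sketch, here is one way internal records (a hypothetical underwriting example) could be converted into instruction-style pairs and written to JSONL, a format most fine-tuning pipelines can ingest.

```python
import json

# Hypothetical internal records: past underwriting decisions with the reviewer's outcome.
internal_records = [
    {
        "applicant_summary": "Small retail business, 3 prior claims, requesting $1M coverage.",
        "reviewer_decision": "Refer to senior underwriter: claims frequency above threshold.",
    },
    {
        "applicant_summary": "Consulting firm, no prior claims, requesting $250K coverage.",
        "reviewer_decision": "Approve at standard rate.",
    },
]

def to_training_example(record: dict) -> dict:
    """Turn one internal record into an instruction/response pair for fine-tuning."""
    return {
        "instruction": "Review the application summary and recommend an underwriting action.",
        "input": record["applicant_summary"],
        "output": record["reviewer_decision"],
    }

# Write JSONL so a downstream fine-tuning job can consume it line by line.
with open("underwriting_finetune.jsonl", "w") as f:
    for record in internal_records:
        f.write(json.dumps(to_training_example(record)) + "\n")
```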
Lesson 4: Balance Both Broad and Specialized LLMs
We have found that combining broad and specialized LLMs is key to building robust AI Agents. Specialized LLMs handle niche tasks, like fraud detection or customer inquiries, while broader models support overarching orchestration and reasoning.
Ankur often emphasizes the need for a mix of broad and specific LLMs to manage agentic workflows. This hybrid approach allows us to build custom agents designed for flexibility and depth, particularly in complex, multi-step processes.
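A simple routing pattern captures the idea: a broad model classifies the request, and a narrowly fine-tuned specialist handles it. The model names and the call_model helper below are placeholders for whatever inference API your stack exposes; this is a sketch of the pattern, not a specific vendor integration.

```python
# Hypothetical model registry: specialists for niche tasks, one broad orchestrator.
SPECIALISTS = {
    "fraud_check": "fraud-detection-finetune",
    "customer_inquiry": "support-finetune",
}
BROAD_MODEL = "general-orchestrator"

def call_model(model_name: str, prompt: str) -> str:
    """Placeholder for a real LLM call (a hosted API, vLLM, an internal endpoint, etc.)."""
    raise NotImplementedError

def handle_request(user_request: str) -> str:
    # Step 1: the broad model decides which specialist should take the task.
    routing_prompt = (
        "Classify this request as 'fraud_check' or 'customer_inquiry', "
        f"answering with the label only:\n{user_request}"
    )
    task = call_model(BROAD_MODEL, routing_prompt).strip()

    # Step 2: the narrowly fine-tuned specialist produces the actual answer.
    specialist = SPECIALISTS.get(task, BROAD_MODEL)  # fall back to the broad model
    return call_model(specialist, user_request)
```

In production, the routing step can itself be audited with the same confidence scores and explanations described in Lesson 1.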
Models Are Constantly Improving—And So Must Our Approach to Fine-Tuning
AI models are continuously advancing, and staying ahead requires careful consideration of fine-tuning strategies.
While it’s true that foundational models like GPT-3.5 and GPT-4 have shown remarkable leaps in capabilities, the challenge lies in leveraging these improvements effectively for specific tasks.
As Ankur explained:
“The agent does get better over time as you're doing fine-tuning. So that is really critical.”
As models grow more powerful, they require nuanced approaches to align with specific tasks. They also need to minimize hallucinations and biases, and adapt to increasingly complex workflows and data environments.
Lesson 5: Seamless Integration Is Essential for Performance
One of the most critical lessons we’ve learned is the importance of integrating AI Agents seamlessly into enterprise systems. From CRMs and internal databases to document management tools, the systems where data resides often dictate the agent’s utility.
Seamless integration with tools and systems ensures AI Agents can execute tasks efficiently.
Effective integration ensures that AI Agents can gather data and act across diverse platforms. AI Agents can pull data from sources like emails, CRMs, and SharePoint, then analyze it and synthesize insights for decision-making.
For example, in banking, agents might need to retrieve context from email threads or internal knowledge bases and execute actions in specialized workbenches like FIS.
This ability to 'latch onto' various workbenches ensures that agents access diverse sources of context and take the right actions in the appropriate platforms. Without this capability, workflows would fragment, leading to inefficiencies or missed opportunities for accurate decision-making.
We design interfaces that integrate directly into CRMs, internal databases, and other enterprise tools. This interaction between the AI Agent and computer systems is critical—not just for accessing data but also for executing actions and delivering contextually relevant responses across different user interactions and diverse platforms.
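One way to think about these integrations is as a thin connector interface the agent can fan out across. The sketch below uses hypothetical class names and canned responses; a real deployment would call each system's API with proper authentication.

```python
from abc import ABC, abstractmethod

class Connector(ABC):
    """Common interface so the agent can pull context from any enterprise system."""

    @abstractmethod
    def fetch(self, query: str) -> list[str]:
        """Return text snippets relevant to the query."""

class CRMConnector(Connector):
    def fetch(self, query: str) -> list[str]:
        # In a real deployment this would call the CRM's API.
        return [f"CRM note matching '{query}'"]

class EmailConnector(Connector):
    def fetch(self, query: str) -> list[str]:
        # Likewise, a real version would search a mailbox or an email archive.
        return [f"Email thread mentioning '{query}'"]

def gather_context(query: str, connectors: list[Connector]) -> list[str]:
    """Fan out across all integrated systems and merge results for the agent."""
    context = []
    for connector in connectors:
        context.extend(connector.fetch(query))
    return context

print(gather_context("policy 88123 renewal", [CRMConnector(), EmailConnector()]))
```

Adding a new system then means writing one more connector, not reworking the agent itself.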
Lesson 6: The Competitive Advantage Lies in Both Agents and Infrastructure
There’s a growing belief that infrastructure—security, data connectors, and user interface—matters more than the agent. While infrastructure is undeniably critical, the AI Agent’s capabilities remain central to its effectiveness.
For us, the agent is where the magic happens, particularly when it’s fine-tuned for a client’s specific needs. Paired with a robust AI infrastructure, our agents deliver seamless performance and continuously improve through self-learning and feedback loops.
Lesson 7: Open Source Frameworks Are a Starting Point, Not the Finish Line
Open-source frameworks like LangChain or LlamaIndex provide valuable starting points. They offer pre-built tools and libraries that speed up initial development. Developers frequently use them for experimentation or prototyping.
However, they often fall short in production environments due to limitations in scalability, flexibility, and the ability to meet particular business needs. In our experience, custom solutions are often necessary to handle the complexities of real-world tasks.
For example, open-source frameworks may lack advanced integration capabilities with proprietary systems or struggle to handle enterprise environments' unique workflows and security requirements. They can also present debugging challenges, as the abstraction layers they introduce can make it harder to trace errors or customize behaviors for edge cases.
“We use open-source frameworks as a starting point,” Ankur said, but to meet the complexity of real-world needs, you often need to design your own libraries and layers.
While we frequently build on top of open-source tools, adding layers of customization ensures our agents are robust, scalable, and tailored to client needs. This approach combines the agility of open-source development with the reliability of proprietary enhancements.
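As a small example of that layering, the sketch below wraps any framework call with retries and audit logging without depending on a specific framework's API. The function name, retry policy, and logger are illustrative; the point is that the custom layer sits around the open-source component rather than replacing it.

```python
import logging
import time

logger = logging.getLogger("agent.audit")

def with_audit_and_retry(component_call, max_attempts: int = 3, backoff_s: float = 1.0):
    """Wrap any framework call with retry and audit logging.

    `component_call` can be a LangChain chain invocation, a LlamaIndex query,
    or any other callable; this layer does not assume a specific framework API.
    """
    def wrapped(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                result = component_call(*args, **kwargs)
                logger.info("call succeeded on attempt %d", attempt)
                return result
            except Exception:
                logger.exception("call failed on attempt %d", attempt)
                if attempt == max_attempts:
                    raise
                time.sleep(backoff_s * attempt)
    return wrapped
```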
Lesson 8: Adaptability and Human-In-The-Loop Are Non-Negotiable
AI Agents don’t operate in isolation—they work alongside human operators. Effective integration includes creating seamless interfaces between AI and humans, allowing operators to step in when agents encounter challenges. User feedback enhances learning and creates a cycle of continuous improvement.
Another important and overlooked issue is change management. It ensures that both human operators and AI systems work together smoothly when workflows evolve or new challenges emerge. This includes:
establishing clear protocols for introducing updates to AI agents,
integrating new security measures,
managing the handoff between human and machine decision-making.
For example, when introducing new AI capabilities, organizations should prepare employees through training and provide guidelines for how and when human intervention should occur. By planning for these transitions, businesses can minimize disruptions and maintain trust in their systems.
Human interaction is as critical as computer interfaces, especially in workflows requiring adaptability. Seamless integration between AI systems and human operators allows businesses to effectively manage challenges and maintain workflow continuity.
That is why we’ve developed scalable AI Agents that seamlessly integrate human intervention into workflows and adapt over time.
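A minimal sketch of that loop might look like the following: decisions below a confidence threshold go to a person, and the reviewer's correction is logged so it can feed future fine-tuning. The threshold and the in-memory queue and feedback log are stand-ins for a real case-management system and training store.

```python
REVIEW_THRESHOLD = 0.80     # illustrative; tune per workflow and risk tolerance
human_review_queue = []     # stand-in for a real task queue or case-management system
feedback_log = []           # corrections feed later fine-tuning (self-learning)

def route_decision(answer: str, confidence: float) -> str:
    """Auto-complete confident decisions; escalate uncertain ones to a person."""
    if confidence >= REVIEW_THRESHOLD:
        return f"AUTO: {answer}"
    human_review_queue.append({"answer": answer, "confidence": confidence})
    return "ESCALATED: waiting for human review"

def record_human_feedback(original_answer: str, corrected_answer: str) -> None:
    """Store the reviewer's correction so the agent can learn from it later."""
    feedback_log.append({"agent": original_answer, "human": corrected_answer})

print(route_decision("Approve claim C-1042", confidence=0.93))
print(route_decision("Deny claim C-1077", confidence=0.61))
record_human_feedback("Deny claim C-1077", "Request additional documentation")
```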
As we’ve mentioned, to ensure scalability from day one, our platform builds three capabilities into its foundation:
confidence scores,
explainability,
and self-learning capabilities.
The lessons we’ve learned as we build AI Agents highlight the complexity and opportunity of this field. Whether fine-tuning models, integrating multimodal capabilities, or balancing human-AI collaboration, the key is creating flexible, reliable systems tailored to specific business needs.
The best advice we can give to enterprises considering implementing agentic AI is to start early and experiment with a build-for-scale mindset.
The Real Lessons of Building AI Agents
AI Agents have immense potential to transform industries by automating routine tasks, improving decision-making, and delivering personalized interactions. But the key to leveraging their full capabilities lies in thoughtful design, seamless integration, and adaptability to real-world challenges.
The field of AI Agents is complex, but the lessons we’ve learned so far highlight the importance of balance between:
reasoning and domain expertise,
generalization and specialization,
and automation and human collaboration.
As Ankur summed up, “We are seeing our customers get massive ROI. This is the killer application for generative AI in business.” That ROI comes, of course, when agentic AI is designed and deployed with scalability, security, and adaptability in mind.
Unlock AI Agents’ Full Potential To Transform Your Business Operations
If you’re looking to build an AI Agent tailored to your business, our experience can help you confidently navigate this journey. Ready to see our AI Agents live? Schedule a free 30-minute call with our experts to get started.