What are LLM Agents? Why are they considered the future of AI?

The world of AI and LLMs is very dynamic. Never before in the history of technology have so many large companies and startups invested so much time and resources in AI research.

It seems every day we are getting a new LLM or a new update on LLMs.

The latest is LLM Agents.

So, what are LLM Agents and why should we care about them?

Let us take this scenario: you want to reply to an email in a polite and professional tone.

We all know ChatGPT can generate that reply for you. However, there is a problem: it cannot read the email from your inbox.

We have to paste the actual email for ChatGPT’s reference, right?

Now, consider a scenario where you just tell an LLM that you have received an email and ask it to create a polite and professional reply.

After this instruction, the LLM accesses your inbox, reads that email, creates a response and stores it as a draft for your review.

Cool, isn’t it? Well, that’s what LLM Agents can do.

The above example shows only the basic functionality of an LLM agent. Let’s take another case, where the person has asked in the email: what are large language models?

To answer this question, the agent decides that it has to access Google. So, it will query Google with a search term, get the answer and, based on this answer, draft a reply.

To define them: LLM agents are systems that can process complex tasks, interact with software applications in an external environment and provide assistance to their users.
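The email scenario above can be sketched in a few lines of Python. This is only a toy illustration: every function here (read_latest_email, web_search, needs_search, draft_reply) is a hypothetical stand-in that returns canned text, not a real mail, search or LLM API. In a real agent, the LLM itself would make the decision that needs_search fakes with a simple check.

```python
# Toy sketch: every function below is a hypothetical stand-in, not a real API.

def read_latest_email() -> str:
    """Stand-in for an inbox tool; a real agent would call a mail API."""
    return "Hi, could you explain what large language models are?"

def web_search(query: str) -> str:
    """Stand-in for a search tool; returns a canned answer."""
    return "Large language models are neural networks trained on large text corpora."

def needs_search(email: str) -> bool:
    """Stand-in for the LLM's decision; here it only checks for a question mark."""
    return "?" in email

def draft_reply(email: str, facts: str = "") -> str:
    """Stand-in for the LLM writing a polite reply."""
    body = facts if facts else "Thank you for your email."
    return f"Dear sender,\n\n{body}\n\nBest regards"

def run_email_agent() -> dict:
    email = read_latest_email()           # step 1: read the inbox
    tools_used = ["read_latest_email"]
    facts = ""
    if needs_search(email):               # step 2: decide whether a tool is needed
        facts = web_search(email)
        tools_used.append("web_search")
    draft = draft_reply(email, facts)     # step 3: write a draft for the user to review
    return {"draft": draft, "tools_used": tools_used}

result = run_email_agent()
print(result["draft"])
```

The point of the sketch is the control flow: the agent reads, decides whether it needs outside information, fetches it, and only then writes the draft.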

Check out this LLM Agent that can answer current affairs questions by accessing Google – https://www.aimletc.com/gk-tool-connect-chatgpt-with-google-search/

Image Source – https://gathnex.medium.com/how-to-create-your-own-llm-agent-from-scratch-a-step-by-step-guide-14b763e5b3b8

What changed from early Transformers to the latest LLM Agents?

In simple terms, with Transformers, AI models started understanding natural language.

Transformers were then trained on huge amounts of data. They acquired world knowledge and became Large Language Models that could generate responses.

LLMs were then fine-tuned on conversational data using RLHF (reinforcement learning from human feedback), and they acquired the ability to generate human-like responses.

Take this free course, if you are not familiar with LLMs – https://www.aimletc.com/free-course-introduction-to-large-language-models/

And that is how we got ChatGPT.

However, LLMs on their own still had limited ability to reason through multi-step problems, and they worked in isolation.

What if they could interact with other software applications and reason through a problem? They could then do complex tasks and become more helpful.

Image Source – https://developer.nvidia.com/blog/building-your-first-llm-agent-application/

What can LLM agents do?

Let us take another example. You are planning to visit Paris and want to build a 7-day itinerary.

You can obviously go to ChatGPT or any other LLM and it can help you with that.

However, is a 7-day itinerary enough to plan a trip? What if you also want some more information, such as:

What is the weather like when you want to visit?
Hotels that are within a certain budget?
Food options within a certain distance from the hotel?

To answer these questions, LLMs need to access multiple other tools like a weather app, hotel booking site, Google, etc.

LLMs also need reasoning capabilities to break this complex problem down into multiple steps and then perform the steps one by one, each taking the output of the previous steps as input.

Now that’s what LLM Agents can do.

They first reason through the problem and decide what to do, then decide which tool (weather app, hotel booking site, Wikipedia, a math tool, Google, etc.) to use, and then access it to help the user.

In our case, the LLM agent will first access the weather app and based on the weather decide if the trip should be planned or not.

If yes, it will access the hotel booking site at its disposal and shortlist a few hotels within the specified budget.

Now, the agent will store this information in its memory and then access Google to find eating options near each hotel.

Based on its research, the agent can generate a complete itinerary with hotel and food options.
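The Paris workflow above can be sketched as a chain of tool calls, where each step reads what the earlier steps stored in memory. All tools below are hypothetical stubs that return made-up data (the hotel names, prices and forecast are invented for illustration), and the fixed order of steps is an assumption. A real agent would let the LLM decide the order.

```python
# Toy sketch of the trip-planning example; all tools return made-up data.

def check_weather(city: str) -> str:
    """Hypothetical weather tool with a fixed forecast."""
    return "sunny"

def find_hotels(city: str, budget: int) -> list:
    """Hypothetical hotel tool; filters an invented list by nightly price."""
    hotels = [
        {"name": "Hotel A", "price": 120},
        {"name": "Hotel B", "price": 250},
        {"name": "Hotel C", "price": 90},
    ]
    return [h for h in hotels if h["price"] <= budget]

def find_food_near(hotel_name: str) -> list:
    """Hypothetical search tool for eating options near a hotel."""
    return [f"Cafe near {hotel_name}"]

def plan_trip(city: str, budget: int):
    memory = {}                                        # the agent's working memory
    memory["weather"] = check_weather(city)            # step 1: weather tool
    if memory["weather"] == "stormy":
        return None                                    # decide not to plan the trip
    memory["hotels"] = find_hotels(city, budget)       # step 2: hotel tool
    memory["food"] = {                                 # step 3: uses the output of step 2
        h["name"]: find_food_near(h["name"]) for h in memory["hotels"]
    }
    return memory

plan = plan_trip("Paris", 150)
print(plan)
```

Note how step 3 cannot run until step 2 has stored the shortlisted hotels. That dependency between steps is why the agent needs memory.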

How do LLM Agents do what they do?

Image Source: https://developer.nvidia.com/blog/introduction-to-llm-agents/

LLM Agents have 3 elements:

Planning Module

To solve a complex problem, the agent needs to divide it into smaller, manageable tasks and design a workflow for them. LLM agents do this with the help of their planning module.

Tools

Take the example of a plumber, who has multiple tools to get the job done. Similarly, LLM agents have many tools at their disposal, such as Wikipedia, Google and a math tool, which they can access to get their job done.

Agent Core

It is like a central functioning unit that does all the decision-making. It oversees the logic, manages the end objective or goal and provides the instructions.
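One way to picture how the three elements fit together is the minimal sketch below. The class names PlanningModule and AgentCore, the TOOLS dictionary and the fixed two-step plan are all assumptions made for illustration. In a real agent the plan would come from an LLM, and the tools would call real services.

```python
from typing import Callable, Dict, List

class PlanningModule:
    """Breaks a goal into an ordered list of tool names (hard-coded here)."""
    def plan(self, goal: str) -> List[str]:
        return ["search", "math"]

# Tools: plain functions the agent can call by name (hypothetical stubs).
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"search results for '{query}'",
    "math": lambda query: str(2 + 2),
}

class AgentCore:
    """Central unit: holds the goal, follows the plan, and calls the tools."""
    def __init__(self, planner: PlanningModule, tools: Dict[str, Callable[[str], str]]):
        self.planner = planner
        self.tools = tools

    def run(self, goal: str) -> List[str]:
        outputs = []
        for tool_name in self.planner.plan(goal):   # ask the planner what to do
            tool = self.tools[tool_name]            # pick the matching tool
            outputs.append(tool(goal))              # run it and collect the output
        return outputs

agent = AgentCore(PlanningModule(), TOOLS)
print(agent.run("population of Paris"))
```

Keeping the planner and the tools separate from the core means either can be swapped out without touching the decision loop.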

Pretty cool stuff, right?

So, do agents interact only with other software applications? Or can agents interact with other agents as well?

Multi Agent System

The answer to the previous question is Yes!

Agents can interact with other agents to perform tasks together, and this is known as a Multi Agent System.

There are 2 ways in which Agents can interact with each other:

Cooperative
Adversarial

As the names suggest, in a cooperative environment agents collaborate with each other, while in an adversarial environment agents work on game theory concepts and their interaction includes debate and argumentation.
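The two interaction styles can be contrasted with a toy sketch. The agents here (writer, reviewer, debater) are hypothetical functions that return fixed strings. In practice each would be backed by an LLM, and an adversarial setup would usually also need some way to judge the debate, which is left out here.

```python
# Toy sketch: each "agent" is a plain function standing in for an LLM-backed agent.

def writer(task: str) -> str:
    return f"draft for {task}"

def reviewer(draft: str) -> str:
    return draft + " (reviewed)"

def cooperative(task: str) -> str:
    """Cooperative: one agent builds on the other's output."""
    return reviewer(writer(task))

def debater(stance: str, topic: str, round_no: int) -> str:
    return f"{stance} argument {round_no} on {topic}"

def adversarial(topic: str, rounds: int = 2) -> list:
    """Adversarial: two agents take opposing sides and argue in turns."""
    transcript = []
    for round_no in range(1, rounds + 1):
        transcript.append(debater("pro", topic, round_no))
        transcript.append(debater("con", topic, round_no))
    return transcript

print(cooperative("email reply"))
print(adversarial("AGI"))
```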

In the next article, I will cover Multi Agent System in detail, how to evaluate LLM Agents and other exciting stuff about LLM Agents.

Check out the cool list of notable LLM agents – https://www.promptingguide.ai/research/llm-agents#notable-llm-based-agents

Conclusion

LLM Agents are considered a step towards achieving AGI. Whether we will achieve AGI is still debatable, but LLM Agents are indeed useful because of their reasoning ability, memory and interaction with other applications.

In case you are looking to learn AI + LLM in a very simple language in a live online class from an instructor, check out the details here
