What Can AI Agents Actually Do Today?
Concrete examples of what AI agents can do right now — from booking travel to writing reports to managing code. No hype, just real capabilities and honest limitations.
The short version
AI agents are not just a concept — they are working in real businesses today. This article walks through concrete examples of what agents can actually do right now: booking travel, managing email, writing reports, helping with code, and conducting research. For each example, we cover what works, what does not, and where human oversight is still essential.
A quick reminder: what makes it an agent?
As we covered in What Is an AI Agent and How Is It Different from a Chatbot?, an agent is software that can take real-world actions — not just generate text. It connects to external services, breaks down goals into steps, and executes those steps with varying degrees of independence.
The examples below are all things agents can do today, in early 2026. Not things that might be possible one day — things that are working now.
Example 1: Managing email
What the agent does: Reads your inbox, identifies emails that need a response, drafts replies based on your past writing style and the context of the conversation, and flags anything urgent.
What works well: Agents are genuinely good at triaging email. They can sort messages by priority, summarise long threads into a few sentences, and draft responses to routine inquiries (scheduling requests, status updates, acknowledgements). For someone who receives dozens of emails a day, this can save an hour or more.
Where it falls short: Agents can misjudge tone. A message that sounds routine might actually be a sensitive client complaint. Agents also struggle with emails that require real judgement — negotiations, bad news, or anything politically nuanced.
Human oversight needed: Review drafted responses before sending. Never let an agent send emails on your behalf without a review step, especially for external communications.
Example 2: Booking travel
What the agent does: Searches for flights and hotels based on your preferences (budget, timing, airline loyalty programs), compares options, and books the best match. It can also monitor prices and rebook if a better deal appears.
What works well: Agents excel at the tedious comparison work that makes travel planning time-consuming. They can check multiple airlines, apply your preferences consistently, and handle the booking process across different platforms.
Where it falls short: Agents can miss nuances — a layover in a city where you need a transit visa, a hotel that is technically close to the conference but across a river with no bridge. They also do not understand personal preferences that you have not explicitly stated.
Human oversight needed: Review and approve bookings before payment is processed. Set spending limits so the agent cannot book a first-class flight when you meant economy. Tools like Multicorn Shield let you set per-action and daily spending caps for exactly this situation.
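To make the idea of spending caps concrete, here is a minimal sketch of how a per-action and daily cap might work. This is an illustrative toy, not Multicorn Shield's actual API; all names (`SpendingGuard`, `approve`) are made up for the example.

```python
# Hypothetical sketch of per-action and daily spending caps for a booking
# agent. Every purchase must fit under both limits before it is allowed.

from datetime import date

class SpendingGuard:
    def __init__(self, per_action_cap: float, daily_cap: float):
        self.per_action_cap = per_action_cap
        self.daily_cap = daily_cap
        self._spent_today = 0.0
        self._day = date.today()

    def approve(self, amount: float) -> bool:
        """Return True only if the purchase fits both caps."""
        if date.today() != self._day:      # reset the running total each day
            self._day = date.today()
            self._spent_today = 0.0
        if amount > self.per_action_cap:   # single booking is too expensive
            return False
        if self._spent_today + amount > self.daily_cap:
            return False                   # would blow the daily budget
        self._spent_today += amount
        return True

guard = SpendingGuard(per_action_cap=500.0, daily_cap=800.0)
print(guard.approve(450.0))  # True: within both caps
print(guard.approve(400.0))  # False: would push the daily total past 800
print(guard.approve(800.0))  # False: exceeds the per-action cap
```

The useful property is that the agent never sees the card details or the caps; it simply asks for approval and gets a yes or no.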
Example 3: Writing and sending reports
What the agent does: Pulls data from connected sources (spreadsheets, databases, project management tools), analyses trends, writes a formatted report, and distributes it to a specified list of recipients on a schedule.
What works well: Agents handle the mechanical parts of reporting reliably — gathering data, formatting tables, generating summaries, and emailing the result on time. Weekly status reports, monthly sales summaries, and daily metrics digests are all well-suited to agent automation.
Where it falls short: Agents summarise data but do not understand context the way a human does. A 15% drop in sales might be alarming, or it might be expected because of a seasonal pattern the agent does not know about. The written analysis can be superficially correct but miss what actually matters.
Human oversight needed: Review the analysis and conclusions before distribution. Use the agent for the first draft and the data gathering, but add your own interpretation before sharing with stakeholders.
Example 4: Helping with code
What the agent does: Writes code based on specifications, reviews existing code for bugs and improvements, generates tests, updates documentation, and handles repetitive refactoring tasks.
What works well: Coding agents are among the most mature agent categories. They are genuinely useful for generating boilerplate code, writing unit tests, explaining unfamiliar codebases, and handling mechanical tasks like renaming a variable across dozens of files.
Where it falls short: Agents can introduce subtle bugs that pass initial review, especially in complex logic or edge cases. They sometimes generate code that works but does not follow the team's conventions. They can also hallucinate API methods that do not exist (see AI hallucinations).
Human oversight needed: Always review agent-generated code before merging it. Run tests. Check that the solution actually handles edge cases. Treat agent-generated code the same way you would treat code from a new team member — helpful, but requiring review.
Example 5: Research and analysis
What the agent does: Searches the web, reads documents, compiles findings, compares sources, and produces a structured summary. Some agents can monitor specific topics over time and alert you when something relevant is published.
What works well: Agents are excellent at the initial phase of research — casting a wide net, reading quickly, and organising what they find into a structured format. Tasks like "Find the five most recent studies on remote work productivity and summarise each in three sentences" are well within their capabilities.
Where it falls short: Agents can struggle to evaluate source quality. They may treat a blog post with the same weight as a peer-reviewed study. They can also miss context that a human researcher would catch — like knowing that a particular source has a known bias or that a study was later retracted.
Human oversight needed: Use the agent for the initial gathering and organisation. Evaluate the sources yourself before drawing conclusions or making decisions based on the research.
The pattern across all examples
You may have noticed a consistent pattern. In every example:
- The agent handles the mechanical, time-consuming parts well — searching, gathering, formatting, drafting, scheduling.
- The agent struggles with judgement, nuance, and context — tone, quality assessment, edge cases, political sensitivity.
- Human oversight is needed before any action that is hard to undo — sending an email, completing a purchase, publishing a report, merging code.
This pattern will not change overnight. Agents are getting more capable, but the need for human-in-the-loop oversight is not going away. The best approach is to use agents for what they do well and build in checkpoints where humans review before irreversible actions happen.
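The checkpoint idea above can be sketched in a few lines: reversible work runs freely, while anything hard to undo is held until a person confirms it. The names here (`Action`, `run_with_review`, the `IRREVERSIBLE` set) are hypothetical, chosen only to illustrate the pattern.

```python
# Minimal human-in-the-loop sketch: the agent executes reversible actions
# immediately, but anything irreversible waits for explicit human approval.

from dataclasses import dataclass
from typing import Callable

# The irreversible actions from the examples above.
IRREVERSIBLE = {"send_email", "make_payment", "publish_report", "merge_code"}

@dataclass
class Action:
    kind: str
    description: str
    execute: Callable[[], None]

def run_with_review(action: Action, human_approves: Callable[[Action], bool]) -> str:
    if action.kind in IRREVERSIBLE and not human_approves(action):
        return "held for review"   # parked until a person signs off
    action.execute()
    return "executed"

# Drafting is reversible and runs at once; sending waits for approval.
draft = Action("draft_email", "Draft a reply to the status inquiry", lambda: None)
send = Action("send_email", "Send the drafted reply", lambda: None)
print(run_with_review(draft, human_approves=lambda a: False))  # executed
print(run_with_review(send, human_approves=lambda a: False))   # held for review
```

The same gate works for purchases, report distribution, and code merges: only the `IRREVERSIBLE` list changes.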
Why permissions matter more than ever
Every example above involves an agent that has access to real services — email, calendars, payment systems, code repositories. The more services an agent connects to, the more damage it can do if something goes wrong.
This is why Multicorn Shield exists. It provides the control layer between your agents and the services they connect to: consent before access, granular permissions, spending limits, activity logs, and instant revocation. Whether you are using agents for email, travel, reports, coding, or research, Shield gives you the tools to stay in control.
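As a rough illustration of what a control layer does, here is a toy version with granular permissions, an activity log, and instant revocation. To be clear, this is not Multicorn Shield's real implementation or API; it is a simplified sketch of the architecture the paragraph describes.

```python
# Illustrative control layer between agents and services: each agent holds
# an explicit set of granted actions, every request is logged, and all
# permissions can be revoked in one call.

class ControlLayer:
    def __init__(self):
        self.permissions: dict[str, set[str]] = {}          # agent -> allowed actions
        self.activity_log: list[tuple[str, str, str]] = []  # (agent, action, outcome)

    def grant(self, agent: str, action: str) -> None:
        self.permissions.setdefault(agent, set()).add(action)

    def revoke_all(self, agent: str) -> None:
        """Instant revocation: the agent loses every permission at once."""
        self.permissions.pop(agent, None)

    def request(self, agent: str, action: str) -> bool:
        allowed = action in self.permissions.get(agent, set())
        self.activity_log.append((agent, action, "allowed" if allowed else "denied"))
        return allowed

layer = ControlLayer()
layer.grant("travel-agent", "search_flights")
print(layer.request("travel-agent", "search_flights"))  # True: explicitly granted
print(layer.request("travel-agent", "charge_card"))     # False: never granted
layer.revoke_all("travel-agent")
print(layer.request("travel-agent", "search_flights"))  # False: revoked
```

Note that denials are logged too, so you can see what an agent *tried* to do, not just what it did.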
Key takeaways
- AI agents are working in real businesses today for email management, travel booking, report writing, coding assistance, and research.
- Agents excel at mechanical, time-consuming tasks and struggle with nuance and judgement.
- Human oversight is essential before any agent action that is difficult to undo.
- The more capable agents become, the more important permission controls and activity logging become.
- Build agents into your workflow for what they do well, and keep humans in the loop for what they do not.