Copilot Studio turns to AI-powered workflows

By Simon Bisson

Last time I looked at Copilot Studio, it was a way of extending the original Power Virtual Agents tools to incorporate generative AI and support more general conversational interactions. By using Azure OpenAI tools to work with additional data sources, Copilot Studio became a much more flexible tool with improved language understanding capabilities.

At Build 2024, Microsoft charted a new path for Power Platform’s AI capabilities, aligning them with the platform’s low-code and no-code development tools and adding support for Power Automate flows and connectors. It’s a big shift for the Power Platform, but one that takes advantage of Microsoft’s adoption of generative AI as a tool for building and running autonomous agents.

The new preview of Copilot Studio thus amounts to a complete refactoring of Power Platform’s AI strategy, evolving beyond chatbots to AI-orchestrated workflows. While chatbots are still supported, there is now much more to build in Copilot Studio’s web-based, no-code development canvas.

Putting the robot into RPA

At the heart of the new Copilot Studio is an improved understanding of how models like GPT-4 can work with structured interface descriptions, like those used by OpenAPI, to dynamically generate requests and to parse and format responses using natural language. Here, instead of using OpenAPI, generative AI is being used to orchestrate existing and new Power Platform connectors, allowing you to converse with your agent and see its responses in any supported chat client.

There’s a lot to like in this approach. Working with long transactions has always been a problem, and the semantic memory tools at the heart of AI-driven workflows are a promising solution, especially when they’re used to keep the human in the loop.

The most important aspect of this redesign is the planned ability to use a trigger to run a flow that encompasses a series of different AI-driven tasks. Instead of being one-shot chat tools, copilots are now a way to manage long-running transactions, modifying steps based on the last set of results.

How this works is straightforward. Let’s say you’re triggering a set of actions based on an incoming email. You can use Copilot Studio to map out a workflow, launched by an event in the Microsoft Graph. This might involve pulling in the details of the sender from Dynamics 365, automatically generating a response based on the incoming mail and the sender’s CRM interaction history, and sending a message to Teams, detailing the actions taken and listing possible follow-ups that require human intervention.
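
A minimal Python sketch of that email-triggered sequence may help make the shape of the workflow concrete. Everything in it is a hypothetical stand-in: `lookup_sender`, `draft_reply`, and `notify_team` are illustrative functions rather than Copilot Studio, Microsoft Graph, or Dynamics 365 APIs, the CRM is an in-memory dictionary, and the drafting step is a plain template where a real agent would call a generative model.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    sender: str
    subject: str
    body: str


@dataclass
class WorkflowState:
    """Information carried from one step of the workflow to the next."""
    email: Email
    history: list[str] = field(default_factory=list)
    reply: str = ""
    follow_ups: list[str] = field(default_factory=list)


# Hypothetical in-memory stand-in for CRM interaction history.
CRM = {"ana@example.com": ["2024-03-02: asked about renewal pricing"]}


def lookup_sender(state: WorkflowState) -> WorkflowState:
    # Step 1: pull the sender's interaction history (empty if unknown).
    state.history = CRM.get(state.email.sender, [])
    return state


def draft_reply(state: WorkflowState) -> WorkflowState:
    # Step 2: a template placeholder for the generative drafting step.
    context = state.history[-1] if state.history else "no prior contact"
    state.reply = f"Re: {state.email.subject} (context: {context})"
    if not state.history:
        # Keep a human in the loop when there is no history to ground on.
        state.follow_ups.append("Unknown sender: confirm identity manually")
    return state


def notify_team(state: WorkflowState) -> str:
    # Step 3: summarize the actions taken and any pending manual follow-ups.
    return f"Replied to {state.email.sender}; follow-ups: {len(state.follow_ups)}"


def run_workflow(email: Email) -> tuple[WorkflowState, str]:
    # The incoming email is the trigger; state flows through each step in turn.
    state = WorkflowState(email=email)
    for step in (lookup_sender, draft_reply):
        state = step(state)
    return state, notify_team(state)
```

The point of the sketch is the shared state object: each step reads what earlier steps produced, which is what distinguishes a long-running workflow from a one-shot chat exchange.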

That’s a very different set of actions from those managed by the first generation of the Power Platform’s AI assistant. It’s now a way of working with those long-running, almost ad hoc workflows that require managing information between operations as well as a mix of automated and manual actions. By using an agent to manage this, not only can we send natural language responses based on application data, we can also route notifications and interactions to the right person.

Building agents in Copilot Studio

This requires integration with Microsoft’s various clouds, especially the Microsoft Graph and the Power Platform’s Dataverse. As a result, the latest generation of Copilot Studio leans into the flow design metaphor of Power Automate. Instead of building chat apps (or rather, as well as building chat apps), we’re now using AI to manage and control workflows. In fact, what we’re doing is using Copilot Studio with existing Power Automate flows, so you can build AI into existing business processes.

Flows are treated as one of a set of available action types: conversational, connector, flow, and prompt. Conversational actions are one of the more interesting new features, working like ChatGPT plug-ins or skills in Semantic Kernel. They behave like a Copilot Studio topic, but instead of connecting to content, they allow your Copilot Studio agent to access APIs and external data. You can even hook them up to custom code and business logic, mixing traditional enterprise software development techniques with no-code AI.

Grounding with connectors and real data

One of the more important new features is the Copilot connector. Much like the connectors used in Power Apps and Power Automate, these link your application to external data and APIs. Tools like this are more important in an AI-based application, especially one using generative AI, as they provide the necessary grounding to reduce the risk of out-of-control outputs.

Usefully, Copilot Studio can use existing Power Platform connectors, extending what Microsoft describes as its “knowledge.” This is a set of information sources that includes the existing chatbot tools and Microsoft information sources like Dataverse and Fabric. Dataverse can also be used to prepare data from other enterprise sources for use in retrieval-augmented generation (RAG) outputs. There are limits on what you can use, with only two Dataverse sources per application (and only fifteen tables in each source). Custom data from line-of-business applications is imported as JSON, ready for use.
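
If you are planning which data to expose, those limits are easy to check up front. The Python sketch below simply encodes the two numbers quoted above; the function name and the dictionary layout (source name mapped to a list of table names) are assumptions made for illustration and do not reflect any Copilot Studio configuration format.

```python
# Limits as quoted in the text above; they may change as the preview evolves.
MAX_DATAVERSE_SOURCES = 2
MAX_TABLES_PER_SOURCE = 15


def check_dataverse_limits(sources: dict[str, list[str]]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the plan fits.

    `sources` is a hypothetical planning structure: source name -> table names.
    """
    problems = []
    if len(sources) > MAX_DATAVERSE_SOURCES:
        problems.append(
            f"{len(sources)} Dataverse sources planned; limit is {MAX_DATAVERSE_SOURCES}"
        )
    for name, tables in sources.items():
        if len(tables) > MAX_TABLES_PER_SOURCE:
            problems.append(
                f"{name}: {len(tables)} tables planned; limit is {MAX_TABLES_PER_SOURCE}"
            )
    return problems
```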

That may not seem like much data, but you’re not using Copilot Studio to build and run full-scale autonomous applications; those really require working with frameworks like Azure AI Studio’s Prompt Flow.

Adding a connector to an agent in Copilot Studio is simple enough. Start by adding knowledge to your application, choosing an enterprise connection. These connections inherit the permissions of the user, ensuring that users get results without breaking security boundaries. This approach is essential if you’re building AI applications for regulated industries.
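
The idea of inheriting the user’s permissions can be sketched as filtering before retrieval: only content the user could already see is eligible to ground a response. The Python below is an illustration of that principle under simple assumptions, not a description of how Copilot Studio implements it. The `Document` and `User` types, the group-based access model, and the keyword match standing in for real retrieval are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


@dataclass(frozen=True)
class User:
    name: str
    groups: frozenset  # groups the signed-in user belongs to


def retrieve_for_user(user: User, documents: list, query: str) -> list:
    """Return documents matching the query that this user is allowed to read."""
    terms = set(query.lower().split())
    # Apply the user's permissions first, so restricted content never
    # reaches the matching step, let alone the generated answer.
    visible = [d for d in documents if d.allowed_groups & user.groups]
    # Naive keyword match as a placeholder for real retrieval.
    return [d for d in visible if terms & set(d.text.lower().split())]
```

Filtering before matching, rather than after, means a query can never reveal that restricted content exists.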

AI-powered workflow with conversational actions

Things get more interesting when you start to use conversational actions in your applications. This is where the underlying agent starts exhibiting autonomous behaviors, by parsing a user’s request and using it to construct an orchestration across a known set of actions, connections, and components, before using generative AI to assemble a natural language response.

Here the user’s request is an orchestration prompt that is used to start the interaction. In a future release the underlying system will use its knowledge of the APIs it is using to request additional information, as necessary. For now, however, you’re limited to a useful, if basic, way of adding a natural language extension to an existing AI application that you’ve already built and tested in Copilot Studio.

All you need to do is edit your application and add an extension or an action, choosing a conversational action. You will then need to set some basic configurations before editing the action. A trigger is a prompt that defines the action, describing what it’s used for. This is used to determine when and how that action is invoked.
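
To see why a well-written trigger description matters, consider a toy version of the selection step in Python. This is a sketch under loud assumptions: the `Action` type and `choose_action` function are invented for illustration, and counting shared keywords is a deliberately crude stand-in for the language-model matching that the real orchestration performs.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Common words ignored when comparing a request with a trigger description.
STOPWORDS = {"a", "an", "the", "of", "for", "is", "my", "what", "up", "to"}


@dataclass
class Action:
    name: str
    trigger: str  # natural language description of what the action is for
    run: Callable[[str], str]


def _keywords(text: str) -> set:
    words = {w.strip(".,?!").lower() for w in text.split()}
    return words - STOPWORDS


def choose_action(request: str, actions: list) -> Optional[Action]:
    """Pick the action whose trigger shares the most keywords with the request.

    Returns None when nothing overlaps, so the caller can fall back to a
    default response instead of invoking an unrelated action.
    """
    request_words = _keywords(request)
    best, best_score = None, 0
    for action in actions:
        score = len(request_words & _keywords(action.trigger))
        if score > best_score:
            best, best_score = action, score
    return best
```

Even in this simplified form, a vague trigger such as "Handles requests" would never be selected, which is the practical reason to describe each action precisely.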

Once you have the trigger in place, you can then build the action. This is a process flow that has no UI. Microsoft’s editing tool won’t show any user interaction components, ensuring that the process runs inside your copilot and doesn’t interrupt its flow. Once the action is published, you can add it to the Microsoft 365 Copilot catalog, where it’s treated as a plug-in and activated as part of a user conversation with the copilot.

The cost of the upgraded Copilot Studio is surprisingly low. As it’s a background service, it’s not licensed per user, but uses per-message pricing, with 25,000 messages for $200 per month. A message is a request that triggers a response, with a message that requires generative AI operations counting as two standard messages. It’s not clear how you can purchase additional capacity at this point. There is an alternative $30-per-user option for use with Microsoft 365 only.
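
The arithmetic is worth working through, because the two-for-one weighting of generative messages changes how far the allowance stretches. The Python sketch below uses only the figures quoted above; the function names are illustrative, and it makes no assumption about overage or additional capacity, since that is not yet clear.

```python
MONTHLY_ALLOWANCE = 25_000   # messages included per month
MONTHLY_PRICE_USD = 200      # price for that allowance
GENERATIVE_WEIGHT = 2        # one generative AI message counts as two standard ones

# Effective cost of a single standard message within the allowance.
COST_PER_STANDARD_USD = MONTHLY_PRICE_USD / MONTHLY_ALLOWANCE


def billed_messages(standard: int, generative: int) -> int:
    """Convert a month's traffic into billable standard-message units."""
    return standard + GENERATIVE_WEIGHT * generative


def fits_in_allowance(standard: int, generative: int) -> bool:
    """True if the month's traffic stays within the included allowance."""
    return billed_messages(standard, generative) <= MONTHLY_ALLOWANCE
```

For example, 10,000 standard messages plus 7,500 generative messages bill as 10,000 + 2 × 7,500 = 25,000 units, exactly filling the allowance. At $200 for 25,000 messages, a standard message works out to $0.008 and a generative one to $0.016.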

The initial release of Copilot Studio gave us an uncomplicated way to build chatbots, infusing existing technologies with generative AI. This new update, now in preview, goes much further, linking modern AI tools to process automation, offering the promise of no-code development of autonomous agents. Mixing familiar techniques with AI-powered orchestration allows the current generation of AI tools to do what they do best: working with well-defined, semantically rich APIs, and delivering results in a human-friendly format.