
Control tool choice

When you use a language model with Arcade, you can control how it selects and uses the available tools with the tool_choice parameter:

from openai import OpenAI

# Point the client at your Arcade Engine; the URL and key below are placeholders
client = OpenAI(
    base_url="https://api.arcade.dev/v1",
    api_key="<ARCADE_API_KEY>",
)
user_id = "user@example.com"  # identifies the end user to Arcade for tool authorization

response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Star the ArcadeAI/arcade-ai repo on GitHub",
        },
    ],
    model="gpt-4o",
    user=user_id,
    tools=[
        "GitHub.SetStarred",
        "GitHub.CountStargazers",
    ],
    tool_choice="generate",
)

Arcade extends the OpenAI tool_choice parameter to accept these options:

  • execute: Arcade runs the tool the model selects and returns the tool’s output directly to the client
  • generate: Arcade runs the tool and the model generates a response based on the tool’s output

Additionally, these options from OpenAI’s tool_choice parameter are supported, but are not commonly used:

  • none: Prevents the model from calling any tools; it generates a message instead
  • auto: Lets the model choose between generating a message or calling one or more tools, but does not execute any tool
  • required: Ensures the model selects at least one tool, but does not execute it

For backwards compatibility, auto and required only predict the tool choice, but do not run the tool. These options behave the same with or without Arcade.
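
As a sketch (assuming the standard OpenAI response shape), a call with tool_choice="auto" only returns the predicted tool call, which your own code must then handle:

response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Star the ArcadeAI/arcade-ai repo on GitHub"},
    ],
    model="gpt-4o",
    user=user_id,
    tools=["GitHub.SetStarred"],
    tool_choice="auto",
)

# With "auto" (or "required"), the tool call is only predicted, never executed.
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(tool_call.function.name)       # the tool the model chose
    print(tool_call.function.arguments)  # JSON-encoded arguments for that tool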

Tool calling patterns with Arcade

Whether to use execute or generate depends on how you want to use the tool’s output.

tool_choice: execute

The execute option lets the model act as if it were running tools directly: Arcade handles the tool execution behind the scenes and returns the results to the client.

Flow Overview:

  1. Client Request: The client calls the AI model via the Arcade Engine.
  2. Tool Definition: The Engine adds tool definitions to the request.
  3. Model Prediction: The model predicts which tool to use and its arguments.
  4. Tool Execution: The Engine sends the arguments to the appropriate Worker.
  5. Result Return: The Worker executes the tool and returns results to the Engine.
  6. Client Response: The Engine sends the results back to the client.

Example: Sending a Slack Message

Imagine a user wants to send a Slack message:

  • User Input: “Send a Slack message to John saying ‘Meeting at 3 PM’”
  • Model Prediction: Use the Slack.SendDmToUser tool with arguments:
    • user_name: "john"
    • message: "Meeting at 3 PM"
  • Tool Execution: The Engine forwards these arguments to the Worker that hosts the Slack toolkit, and the Worker sends the message.
  • Response: The client receives the tool’s return value.

This process happens seamlessly, with the client only seeing the initial request and final response.
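
For example, the Slack flow above might look like the following sketch, using the Slack.SendDmToUser tool named in this example:

response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Send a Slack message to John saying 'Meeting at 3 PM'",
        },
    ],
    model="gpt-4o",
    user=user_id,
    tools=["Slack.SendDmToUser"],
    tool_choice="execute",
)

# With "execute", the Engine has already run the tool; the response carries
# the tool's return value rather than a generated message.
print(response.choices[0].message.content)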

tool_choice: generate

The generate option works like execute but adds a step where the Engine asks the model to create a response based on the tool’s results. This provides more refined output that incorporates the tool’s data.

Flow Overview:

  1. Client Request: The client calls the AI model via the Arcade Engine.
  2. Tool Definition: The Engine adds tool definitions to the request.
  3. Model Prediction: The model predicts which tool to use and its arguments.
  4. Tool Execution: The Engine sends the arguments to the appropriate Worker.
  5. Intermediate Results: The Worker executes the tool and returns results to the Engine.
  6. Response Generation: The Engine sends a second request to the model with the tool’s results.
  7. Final Response: The model generates a response incorporating the tool’s output, and the Engine returns it to the client.

Example: Checking Calendar Availability

Suppose a user wants to know their availability for the next day:

  • User Input: “What’s my availability for tomorrow?”
  • Model Prediction: Use the Google.ListEvents tool for the specified date.
  • Tool Execution: The Engine requests the Worker hosting the Calendar toolkit to retrieve events for tomorrow.
  • Results: The Worker returns calendar data (e.g., three meetings scheduled).
  • LLM Response Generation: The Engine provides the calendar data to the LLM.
  • Response: The model generates: “You have 3 meetings tomorrow. You’re free from 9-10 AM, 12-2 PM, and after 4 PM.”
  • Client Receives: The summarized availability information.

By leveraging generate, you receive responses that are both informative and contextually rich.
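
Put together, the calendar flow above might look like this sketch, using the Google.ListEvents tool named in the example:

response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What's my availability for tomorrow?"},
    ],
    model="gpt-4o",
    user=user_id,
    tools=["Google.ListEvents"],
    tool_choice="generate",
)

# With "generate", the Engine runs the tool, then asks the model to summarize its output.
print(response.choices[0].message.content)
# e.g. "You have 3 meetings tomorrow. You're free from 9-10 AM, 12-2 PM, and after 4 PM."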