Agent

Overview

The Agent provides autonomous, multi-step task execution using AI. It can navigate websites, interact with elements, extract data, and complete complex workflows.

Creating an Agent

Create an agent using stagehand.agent():

const agent = stagehand.agent(config?);

config

AgentConfig

model

string | AgentModelConfig

Model to use for agent reasoning.

Format: "provider/model" (e.g., "openai/gpt-4o")

executionModel

string | AgentModelConfig

Model for tool execution (observe/act calls).

Inherits from model if not specified.

systemPrompt

string

Custom system prompt for the agent

mode

'dom' | 'hybrid' | 'cua'

Tool mode:

dom: Semantic tools (act, fillForm) - default
hybrid: Coordinate-based tools (click, type) alongside semantic tools such as act
cua: Computer Use Agent (screenshot-based)

Default: "dom"

stream

boolean

Enable streaming mode.

Default: false

integrations

Client[] | string[]

MCP (Model Context Protocol) integrations

tools

ToolSet

Custom tools for the agent

returns

NonStreamingAgentInstance | StreamingAgentInstance

Agent instance with execute() method
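
To make the "provider/model" string format concrete, here is a minimal sketch of a parser. `parseModelId` and `ParsedModelId` are hypothetical names used for illustration only; they are not part of Stagehand, and the sketch assumes the provider is everything before the first slash.

```typescript
// Hypothetical helper (not part of Stagehand): split a "provider/model"
// string into its two parts. Assumes the provider precedes the first "/".
interface ParsedModelId {
  provider: string;
  model: string;
}

function parseModelId(id: string): ParsedModelId {
  const slash = id.indexOf("/");
  // Reject strings with no slash, an empty provider, or an empty model name.
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "provider/model", got "${id}"`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

const parsedModel = parseModelId("openai/gpt-4o");
console.log(parsedModel.provider, parsedModel.model);
```

A check like this can catch a missing provider prefix before the string is passed as the model option.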

execute()

Execute a task with the agent.

Non-streaming Mode

const result = await agent.execute(instructionOrOptions);

instructionOrOptions

string | AgentExecuteOptions

required

Task instruction or options object

AgentExecuteOptions properties:

instruction

string

required

Task description in natural language

maxSteps

number

Maximum number of steps.

Default: 100

page

Page | PlaywrightPage | PuppeteerPage

Page to operate on (defaults to active page)

highlightCursor

boolean

Show cursor overlay during execution

messages

ModelMessage[]

Previous conversation messages to continue from

signal

AbortSignal

Signal to cancel execution

excludeTools

string[]

Tool names to exclude (not supported in CUA mode).

Available tools:

DOM mode: act, fillForm, ariaTree, extract, goto, scroll, keys, navback, screenshot, think, wait, done, search
Hybrid mode: click, type, dragAndDrop, clickAndHold, fillFormVision, act, ariaTree, extract, goto, scroll, keys, navback, screenshot, think, wait, done, search
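
Because the available tool names differ by mode, it can help to check an excludeTools list before passing it to execute(). The sketch below copies the two lists above into constants; `unknownExcludedTools` is a hypothetical helper, not a Stagehand API, and the lists would need updating if the library changes them.

```typescript
// Tool names copied from the lists above. CUA mode is omitted because
// excludeTools is not supported there.
type ToolMode = "dom" | "hybrid";

const DOM_TOOLS: readonly string[] = [
  "act", "fillForm", "ariaTree", "extract", "goto", "scroll", "keys",
  "navback", "screenshot", "think", "wait", "done", "search",
];

const HYBRID_TOOLS: readonly string[] = [
  "click", "type", "dragAndDrop", "clickAndHold", "fillFormVision",
  "act", "ariaTree", "extract", "goto", "scroll", "keys",
  "navback", "screenshot", "think", "wait", "done", "search",
];

const TOOLS_BY_MODE: Record<ToolMode, readonly string[]> = {
  dom: DOM_TOOLS,
  hybrid: HYBRID_TOOLS,
};

// Hypothetical helper: return the names that are not valid for the given mode.
function unknownExcludedTools(mode: ToolMode, excludeTools: string[]): string[] {
  const known = new Set(TOOLS_BY_MODE[mode]);
  return excludeTools.filter((name) => !known.has(name));
}

// "click" is reported here because it is only listed for hybrid mode.
console.log(unknownExcludedTools("dom", ["screenshot", "click"]));
```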

output

StagehandZodObject

Zod schema for structured output

variables

Variables

Variables for form filling and typing.

Requires experimental: true

Format: { name: value } or { name: { value, description } }
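
The two accepted shapes can be treated uniformly by normalizing the short form into the long one. This is a sketch under assumptions: `normalizeVariables` and the local types are hypothetical, and values are assumed to be strings, which the page does not state explicitly.

```typescript
// Local types for illustration only; Stagehand's own Variables type may differ.
interface RichVariable {
  value: string;
  description?: string;
}
type VariablesInput = Record<string, string | RichVariable>;

// Hypothetical helper: convert { name: value } entries into
// { name: { value } } so every entry has the same shape.
function normalizeVariables(vars: VariablesInput): Record<string, RichVariable> {
  const out: Record<string, RichVariable> = {};
  for (const [name, entry] of Object.entries(vars)) {
    out[name] = typeof entry === "string" ? { value: entry } : entry;
  }
  return out;
}

const normalizedExample = normalizeVariables({
  username: "user@example.com",
  password: { value: "secret123", description: "Account password" },
});
console.log(Object.keys(normalizedExample));
```

The long form is worth using when a description helps the agent decide which field a value belongs in.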

callbacks

AgentExecuteCallbacks

onStepFinish

GenerateTextOnStepFinishCallback

Called after each LLM step completes

prepareStep

PrepareStepFunction

Called before each step to modify settings

onSafetyConfirmation

SafetyConfirmationHandler

Handle safety checks (CUA mode only)

returns

Promise<AgentResult>

AgentResult properties:

success

boolean

required

Whether the task completed successfully

message

string

required

Result message

actions

AgentAction[]

required

Array of actions taken

completed

boolean

required

Whether the agent finished naturally

metadata

Record<string, unknown>

Additional metadata

usage

object

Token usage statistics

input_tokens

number

output_tokens

number

reasoning_tokens

number

cached_input_tokens

number

inference_time_ms

number
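
When several tasks run in one session, the usage fields can be summed to track total cost. A minimal sketch follows, assuming the five fields listed above; `AgentUsage` is a local interface written for this example and `totalUsage` is a hypothetical helper. Missing fields and missing usage objects count as zero.

```typescript
// Local mirror of the usage fields listed above (illustrative only).
interface AgentUsage {
  input_tokens: number;
  output_tokens: number;
  reasoning_tokens: number;
  cached_input_tokens: number;
  inference_time_ms: number;
}

// Hypothetical helper: add up usage across results, treating absent
// usage objects or absent fields as zero.
function totalUsage(usages: Array<Partial<AgentUsage> | undefined>): AgentUsage {
  const total: AgentUsage = {
    input_tokens: 0,
    output_tokens: 0,
    reasoning_tokens: 0,
    cached_input_tokens: 0,
    inference_time_ms: 0,
  };
  for (const usage of usages) {
    if (!usage) continue;
    total.input_tokens += usage.input_tokens ?? 0;
    total.output_tokens += usage.output_tokens ?? 0;
    total.reasoning_tokens += usage.reasoning_tokens ?? 0;
    total.cached_input_tokens += usage.cached_input_tokens ?? 0;
    total.inference_time_ms += usage.inference_time_ms ?? 0;
  }
  return total;
}
```

Typical use would be `totalUsage(results.map((r) => r.usage))`, where `results` is an array of values returned by execute().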

messages

ModelMessage[]

Conversation messages (for continuation)

output

Record<string, unknown>

Structured output data (if output schema provided)
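
The messages field on the result pairs with the messages option on execute(), so a follow-up task can continue the earlier conversation. The sketch below shows one way to chain instructions. `AgentLike`, `AgentResultLike`, and `runInSequence` are hypothetical names, typed structurally so the example does not depend on Stagehand's exports. It assumes result.messages holds the full conversation so far.

```typescript
// Minimal structural types for illustration; the real AgentResult has more fields.
interface AgentResultLike {
  message: string;
  messages?: unknown[];
}
interface AgentLike {
  execute(options: { instruction: string; messages?: unknown[] }): Promise<AgentResultLike>;
}

// Hypothetical helper: run instructions one after another, passing each
// result's messages into the next call so the agent keeps its context.
async function runInSequence(
  agent: AgentLike,
  instructions: string[],
): Promise<AgentResultLike[]> {
  const results: AgentResultLike[] = [];
  let history: unknown[] | undefined;
  for (const instruction of instructions) {
    const result = await agent.execute(
      history ? { instruction, messages: history } : { instruction },
    );
    results.push(result);
    // Keep the previous history if a result carries no messages.
    history = result.messages ?? history;
  }
  return results;
}
```

With a real agent this might be called as `await runInSequence(agent, ["Open the pricing page", "Summarize the cheapest plan"])`.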

Streaming Mode

When stream: true is set in AgentConfig:

const streamResult = await agent.execute(options);

// Access the text stream
for await (const chunk of streamResult.textStream) {
  console.log(chunk);
}

// Get final result
const result = await streamResult.result;

options.callbacks

AgentStreamCallbacks

onStepFinish

StreamTextOnStepFinishCallback

prepareStep

PrepareStepFunction

onChunk

StreamTextOnChunkCallback

Called for each stream chunk

onFinish

StreamTextOnFinishCallback

Called when stream finishes

onError

StreamTextOnErrorCallback

Called on error

onAbort

Function

Called when aborted

returns

Promise<AgentStreamResult>

Stream result with textStream and result properties
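
If only the complete text is needed, the stream can be drained into a single string. This is a sketch, assuming textStream is an async iterable of string chunks as the loop above suggests; `collectText` is a hypothetical helper, not part of Stagehand.

```typescript
// Hypothetical helper: concatenate every chunk of an async text stream.
// Assumes each chunk is a plain string.
async function collectText(textStream: AsyncIterable<string>): Promise<string> {
  let text = "";
  for await (const chunk of textStream) {
    text += chunk;
  }
  return text;
}

// Stand-in stream for demonstration; a real call would pass streamResult.textStream.
async function* demoStream(): AsyncGenerator<string> {
  yield "Searching";
  yield "...";
}

collectText(demoStream()).then((text) => console.log(text));
```

Consuming the stream this way gives up incremental display, so it mainly suits logging or tests.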

Examples

Basic Agent

import { Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({ env: "LOCAL" });
await stagehand.init();

const page = await stagehand.context.newPage();
await page.goto("https://github.com");

// Create and execute agent
const agent = stagehand.agent();
const result = await agent.execute(
  "Find the most starred TypeScript repository"
);

console.log(result.message);
console.log(`Steps taken: ${result.actions.length}`);

await stagehand.close();

Agent with Structured Output

import { z } from "zod";

// Assumes an initialized `stagehand` instance (see Basic Agent)
const agent = stagehand.agent();

const result = await agent.execute({
  instruction: "Find flight information from NYC to LA",
  output: z.object({
    price: z.string().describe("Flight price"),
    airline: z.string().describe("Airline name"),
    departureTime: z.string(),
  }),
});

console.log(result.output);
// { price: "$299", airline: "Delta", departureTime: "8:00 AM" }

Agent with Variables

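// Assumes an initialized `stagehand` instance (see Basic Agent).
// The variables option requires experimental: true.
const agent = stagehand.agent();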

const result = await agent.execute({
  instruction: "Fill out the login form and submit",
  variables: {
    username: {
      value: "user@example.com",
      description: "Login email",
    },
    password: {
      value: "secret123",
      description: "Account password",
    },
  },
});

Streaming Agent

const agent = stagehand.agent({ stream: true });

const streamResult = await agent.execute({
  instruction: "Search for restaurants near me",
  callbacks: {
    onChunk: async (chunk) => {
      process.stdout.write(chunk.text);
    },
    onStepFinish: async (step) => {
      console.log("\nStep completed:", step.toolCalls);
    },
  },
});

const result = await streamResult.result;
console.log("\nFinal result:", result.message);

Agent with Abort Signal

const controller = new AbortController();

// Abort after 30 seconds
setTimeout(() => controller.abort(), 30000);

const agent = stagehand.agent();

const result = await agent.execute({
  instruction: "Complete this long task",
  signal: controller.signal,
});

if (!result.completed) {
  console.log("Task was aborted");
}

CUA Mode Agent

// Computer Use Agent mode with screenshot-based interaction
const agent = stagehand.agent({
  mode: "cua",
  model: "anthropic/claude-opus-4-5",
});

const result = await agent.execute({
  instruction: "Navigate to settings and enable dark mode",
  callbacks: {
    onSafetyConfirmation: async (checks) => {
      console.log("Safety checks:", checks);
      // Return { acknowledged: true } to proceed
      return { acknowledged: true };
    },
  },
});
