Stagehand comes with 4 primary APIs that can be directly used as tools:

  • goto(): navigate to a specific URL
  • observe(): observe the current page
  • extract(): extract data from the current page
  • act(): take action on the current page

These methods can be used as tools in AgentKit, enabling agents to browse the web autonomously.

Below is an example of a simple search agent that uses Stagehand to search the web:

import { createAgent, createTool } from "@inngest/agent-kit";
import { z } from "zod";
import { getStagehand, stringToZodSchema } from "./utils.js";

const webSearchAgent = createAgent({
  name: "web_search_agent",
  description: "I am a web search agent.",
  system: `You are a web search agent.
  `,
  tools: [
    createTool({
      name: "navigate",
      description: "Navigate to a given URL",
      parameters: z.object({
        url: z.string().describe("the URL to navigate to"),
      }),
      handler: async ({ url }, { step, network }) => {
        return await step?.run("navigate", async () => {
          const stagehand = await getStagehand(
            network?.state.kv.get("browserbaseSessionID")!
          );
          await stagehand.page.goto(url);
          return `Navigated to ${url}.`;
        });
      },
    }),
    createTool({
      name: "extract",
      description: "Extract data from the page",
      parameters: z.object({
        instruction: z
          .string()
          .describe("Instructions for what data to extract from the page"),
        schema: z
          .string()
          .describe(
            "A string representing the properties and types of data to extract, for example: '{ name: string, age: number }'"
          ),
      }),
      handler: async ({ instruction, schema }, { step, network }) => {
        return await step?.run("extract", async () => {
          const stagehand = await getStagehand(
            network?.state.kv.get("browserbaseSessionID")!
          );
          const zodSchema = stringToZodSchema(schema);
          return await stagehand.page.extract({
            instruction,
            schema: zodSchema,
          });
        });
      },
    }),
    createTool({
      name: "act",
      description: "Perform an action on the page",
      parameters: z.object({
        action: z
          .string()
          .describe("The action to perform (e.g. 'click the login button')"),
      }),
      handler: async ({ action }, { step, network }) => {
        return await step?.run("act", async () => {
          const stagehand = await getStagehand(
            network?.state.kv.get("browserbaseSessionID")!
          );
          return await stagehand.page.act({ action });
        });
      },
    }),
    createTool({
      name: "observe",
      description: "Observe the page",
      parameters: z.object({
        instruction: z
          .string()
          .describe("Specific instruction for what to observe on the page"),
      }),
      handler: async ({ instruction }, { step, network }) => {
        return await step?.run("observe", async () => {
          const stagehand = await getStagehand(
            network?.state.kv.get("browserbaseSessionID")!
          );
          return await stagehand.page.observe({ instruction });
        });
      },
    }),
  ],
});

These 4 AgentKit tools using Stagehand enable the Web Search Agent to browse the web autonomously.

The getStagehand() helper function is used to retrieve the persisted instance created for the network execution.

For a complete example of a simple search agent using Stagehand, check out the Simple Search Agent using Stagehand example.

Was this page helpful?