
Overview

The @onkernel/ai-sdk package provides Vercel AI SDK-compatible tools for browser automation powered by Kernel. It exposes a Playwright execution tool that lets LLMs browse the web, interact with websites, and perform automation tasks from natural language instructions. With this tool, AI agents can execute Playwright code on Kernel’s remote browsers from within your AI-powered applications.

Installation

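Install the package along with its peer dependencies:
npm install @onkernel/ai-sdk zod
npm install ai @onkernel/sdk
The examples on this page also import @ai-sdk/openai and dotenv. Install them as well if you want to run the examples as written:
npm install @ai-sdk/openai dotenv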

Prerequisites

Before using the AI SDK tool, you’ll need:
  1. Kernel API Key - Obtain from the Kernel Dashboard or the Vercel Marketplace integration
  2. AI Model Provider - An API key for your chosen LLM provider (OpenAI, Anthropic, etc.)
  3. Kernel Browser Session - A running browser session created via the Kernel SDK
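
A small startup check can catch missing credentials before any browser session is created. The sketch below assumes the Kernel key lives in KERNEL_API_KEY (as in the examples on this page) and that the OpenAI provider is used, which reads OPENAI_API_KEY by default. The `missingEnv` helper is hypothetical and not part of either SDK.

```typescript
// Hypothetical helper: returns the names of required variables that are unset or blank.
type Env = Record<string, string | undefined>;

function missingEnv(env: Env, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Assumed variable names; swap OPENAI_API_KEY for your provider's variable.
const REQUIRED_VARS = ['KERNEL_API_KEY', 'OPENAI_API_KEY'];

// Example with an in-memory environment; in an application, pass process.env instead.
const missing = missingEnv({ KERNEL_API_KEY: 'example-key' }, REQUIRED_VARS);
console.log(missing); // [ 'OPENAI_API_KEY' ]
```

Failing fast on a non-empty `missing` list avoids starting a remote browser that the script cannot use.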

How It Works

The playwrightExecuteTool function creates a Vercel AI SDK tool. In a typical run:
  1. The LLM receives natural language instructions
  2. The LLM translates those instructions into Playwright code and passes it to the tool
  3. The tool executes the code on a Kernel remote browser
  4. The tool returns the results to the LLM
This enables AI agents to autonomously browse websites, extract data, and perform complex automation tasks.

Usage with generateText()

The simplest way to use the AI SDK tool is with Vercel’s generateText() function:
import 'dotenv/config';
import { openai } from '@ai-sdk/openai';
import { playwrightExecuteTool } from '@onkernel/ai-sdk';
import { Kernel } from '@onkernel/sdk';
import { generateText } from 'ai';

// 1) Create Kernel client and start a browser session
const kernel = new Kernel({
  apiKey: process.env.KERNEL_API_KEY,
});

const browser = await kernel.browsers.create({});
const sessionId = browser.session_id;

console.log('Browser session started:', sessionId);
console.log('Browser session URL:', browser.browser_live_view_url);

// 2) Create the Playwright execution tool
const playwrightTool = playwrightExecuteTool({
  client: kernel,
  sessionId
});

// 3) Use with Vercel AI SDK
const result = await generateText({
  model: openai('gpt-5.1'),
  prompt: 'Open news.ycombinator.com and get the title of the top news story.',
  tools: {
    playwright_execute: playwrightTool,
  },
});

// toolResults is empty if the model answered without calling the tool
console.log('Result:', result.toolResults[0]?.output);

// 4) Clean up
await kernel.browsers.deleteByID(sessionId);
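
If generateText() throws, the script above never reaches the cleanup step and the browser session stays open. One way to guard against that is a try/finally wrapper. The sketch below is a generic pattern, not part of either SDK: `withBrowserSession` is a hypothetical helper, and the `SessionClient` interface only mirrors the two calls used on this page (`browsers.create` and `browsers.deleteByID`). A stub client stands in for Kernel so the example runs without credentials.

```typescript
// Minimal structural type covering only the calls used in the example above.
interface SessionClient {
  browsers: {
    create(params: Record<string, unknown>): Promise<{ session_id: string }>;
    deleteByID(id: string): Promise<unknown>;
  };
}

// Hypothetical helper: creates a session, runs the task, and always deletes the session.
async function withBrowserSession<T>(
  client: SessionClient,
  task: (sessionId: string) => Promise<T>,
): Promise<T> {
  const browser = await client.browsers.create({});
  try {
    return await task(browser.session_id);
  } finally {
    // Runs whether the task resolved or threw.
    await client.browsers.deleteByID(browser.session_id);
  }
}

// Stub client for demonstration only; it records which sessions were deleted.
const deleted: string[] = [];
const stubClient: SessionClient = {
  browsers: {
    create: async () => ({ session_id: 'session-123' }),
    deleteByID: async (id) => {
      deleted.push(id);
    },
  },
};

// A failing task still triggers cleanup; the error is surfaced to the caller.
const demo = withBrowserSession(stubClient, async () => {
  throw new Error('task failed');
}).catch((err: Error) => err.message);
```

In practice the task callback would build the tool with playwrightExecuteTool and call generateText(), with a real Kernel client passed in place of the stub (assuming its types line up with this shape).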

Usage with Agent()

For more complex, multi-step automation tasks, use the Vercel AI SDK’s Agent() class. Agents can autonomously plan and execute a series of actions to accomplish a goal:
import 'dotenv/config';
import { openai } from '@ai-sdk/openai';
import { playwrightExecuteTool } from '@onkernel/ai-sdk';
import { Kernel } from '@onkernel/sdk';
import { Experimental_Agent as Agent, stepCountIs } from 'ai';

const kernel = new Kernel({
  apiKey: process.env.KERNEL_API_KEY,
});

const browser = await kernel.browsers.create({});
const sessionId = browser.session_id;

console.log('Browser session started:', sessionId);
console.log('Browser session URL:', browser.browser_live_view_url);

// Initialize the AI agent with GPT-5.1
const agent = new Agent({
  model: openai('gpt-5.1'),
  tools: {
    playwright_execute: playwrightExecuteTool({
      client: kernel,
      sessionId
    }),
  },
  stopWhen: stepCountIs(20), // Maximum 20 steps
  system: `You are a browser automation expert. You help users execute tasks in their browser using Playwright.`,
});

// Execute the agent with the user's task
const result = await agent.generate({
  prompt: 'Go to news.ycombinator.com and get the titles of the top 3 posts.',
});

console.log('Agent response:', result.text);

// Clean up the browser session
await kernel.browsers.deleteByID(sessionId);

Tool Parameters

The playwrightExecuteTool function accepts the following parameters:
function playwrightExecuteTool(options: {
  client: Kernel;     // Kernel SDK client instance
  sessionId: string;  // Existing browser session ID
}): Tool;

Tool Input Schema

The generated tool accepts the following input from the LLM:
{
  code: string;          // Required: JavaScript/TypeScript code to execute
  timeout_sec?: number;  // Optional: Execution timeout in seconds (default: 60)
}
Under the hood, the tool calls:
client.browsers.playwright.execute(sessionId, { code, timeout_sec })
So any code you can run through the SDK's playwright.execute method can also be run via the tool.
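
Because the tool is a thin wrapper, the same call can be made directly, which is handy for testing a snippet before handing it to an LLM. The sketch below mirrors the call shown above against a stub client so it runs without credentials. `PlaywrightClient`, `runCode`, and the stub's return value are hypothetical, the explicit 60-second default simply restates the schema comment rather than the package's actual implementation, and the Playwright snippet assumes a `page` object is in scope when the code runs remotely.

```typescript
// Input shape taken from the Tool Input Schema above.
type ExecuteInput = { code: string; timeout_sec?: number };

// Structural type covering only the call shown above.
interface PlaywrightClient {
  browsers: {
    playwright: {
      execute(sessionId: string, input: ExecuteInput): Promise<unknown>;
    };
  };
}

// Hypothetical wrapper: applies the documented default timeout and forwards the call.
async function runCode(
  client: PlaywrightClient,
  sessionId: string,
  input: ExecuteInput,
): Promise<unknown> {
  const timeout_sec = input.timeout_sec ?? 60;
  return client.browsers.playwright.execute(sessionId, { code: input.code, timeout_sec });
}

// Stub client for demonstration only; it records the arguments it receives.
const calls: Array<{ sessionId: string; input: ExecuteInput }> = [];
const stub: PlaywrightClient = {
  browsers: {
    playwright: {
      execute: async (sessionId, input) => {
        calls.push({ sessionId, input });
        return { ok: true }; // placeholder, not the real response shape
      },
    },
  },
};

// Assumes `page` is available to the remotely executed code.
const pending = runCode(stub, 'session-123', { code: 'return await page.title();' });
```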

Additional Resources