OpenSteer makes it easy to extract structured data from web pages using natural language descriptions and typed schemas.

Complete Example

import { Opensteer } from "opensteer";

async function run() {
  const opensteer = new Opensteer({
    name: "product-extraction",
    model: "gpt-5.1",
  });

  await opensteer.launch({ headless: false });

  try {
    await opensteer.goto(
      "https://kbdfans.com/search?type=product%2Cquery&options%5Bprefix%5D=last&q=tactile+switches",
    );

    console.log("Starting extraction...");
    const data = await opensteer.extract({
      description:
        "Extract the main product cards with title, price, image url, and url",
      schema: {
        products: [
          {
            title: "",
            price: "",
            imageUrl: "",
            url: "",
          },
        ],
      },
    });

    console.log(data);
  } finally {
    await opensteer.close();
  }
}

run().catch((err) => {
  console.error(err);
  process.exit(1);
});

Extraction Workflow

1. Configure the Model

const opensteer = new Opensteer({
  name: "product-extraction",
  model: "gpt-5.1",
});
Specify the model to use for extraction. If you omit it, OpenSteer uses gpt-5.1. You can choose from:
  • gpt-5.1 (default)
  • gpt-5-mini
  • Any model supported by your provider
You can also set the model via environment variable:
OPENSTEER_MODEL=gpt-5-mini
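
If you would rather make the model choice explicit in your own code, a small resolver can help. The sketch below is only an illustration: `resolveModel` is a hypothetical helper, not part of OpenSteer, and the precedence it applies (explicit argument, then OPENSTEER_MODEL, then gpt-5.1) is an assumption made for this example rather than documented OpenSteer behavior.

```typescript
// Hypothetical helper, not part of OpenSteer.
type Env = Record<string, string | undefined>;

const DEFAULT_MODEL = "gpt-5.1";

// Assumed precedence for this sketch: explicit value, then the
// OPENSTEER_MODEL variable, then the default. Blank values are ignored.
function resolveModel(env: Env, explicit?: string): string {
  const fromEnv = env["OPENSTEER_MODEL"]?.trim();
  return explicit?.trim() || fromEnv || DEFAULT_MODEL;
}
```

In a Node script you could pass `process.env` as `env` and hand the result to the `model` option shown above.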

2. Navigate to the Target Page

await opensteer.goto(
  "https://kbdfans.com/search?type=product%2Cquery&options%5Bprefix%5D=last&q=tactile+switches",
);
Navigate to the page containing the data you want to extract.

3. Define Your Schema

schema: {
  products: [
    {
      title: "",
      price: "",
      imageUrl: "",
      url: "",
    },
  ],
}
Define the structure of the data you want to extract. The schema:
  • Uses empty strings as type placeholders for string fields
  • Supports arrays with [{ ... }] notation
  • Can include nested objects
  • Guides the LLM to extract data in the exact format you need
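
Because the schema is a plain template of placeholders, the same object can double as a runtime shape check on whatever comes back. The following is a minimal sketch under that assumption; `matchesTemplate` and the `Template` type are hypothetical names introduced here, not part of OpenSteer, and the check only covers the patterns described above (string placeholders, nested objects, and arrays whose first element describes each item).

```typescript
// Hypothetical helper, not part of OpenSteer.
type Template = string | Template[] | { [key: string]: Template };

function matchesTemplate(value: unknown, template: Template): boolean {
  // A "" placeholder means "expect a string here".
  if (typeof template === "string") {
    return typeof value === "string";
  }
  // [itemTemplate] means "expect an array whose items match itemTemplate".
  if (Array.isArray(template)) {
    const itemTemplate = template[0];
    return (
      Array.isArray(value) &&
      (itemTemplate === undefined ||
        value.every((item) => matchesTemplate(item, itemTemplate)))
    );
  }
  // Otherwise the template is an object: every template key must match.
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return Object.entries(template).every(([key, sub]) =>
    matchesTemplate(record[key], sub),
  );
}

const productSchema = {
  products: [{ title: "", price: "", imageUrl: "", url: "" }],
};
```

Calling `matchesTemplate(data, productSchema)` after an extraction gives a quick way to reject results with missing or mistyped fields before using them.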

4. Extract with Description

const data = await opensteer.extract({
  description:
    "Extract the main product cards with title, price, image url, and url",
  schema: {
    products: [
      {
        title: "",
        price: "",
        imageUrl: "",
        url: "",
      },
    ],
  },
});
The description parameter tells the LLM:
  • What to look for on the page
  • Which elements to focus on
  • Any specific instructions about the extraction
The LLM returns data matching your schema structure:
{
  "products": [
    {
      "title": "Gateron Yellow Switches",
      "price": "$3.50",
      "imageUrl": "https://...",
      "url": "https://..."
    },
    {
      "title": "Durock T1 Tactile Switches",
      "price": "$6.00",
      "imageUrl": "https://...",
      "url": "https://..."
    }
  ]
}
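
Since every field in this schema is a string, values such as prices usually need a small post-processing step before you can sort or compare them. The sketch below shows one way to do that; `Product`, `parsePrice`, and `withNumericPrices` are hypothetical names for this example, and the parsing assumes prices formatted like the sample output (an optional currency symbol, optional thousands separators, and a decimal point).

```typescript
// Shape of one item in the sample output above.
interface Product {
  title: string;
  price: string;
  imageUrl: string;
  url: string;
}

// Hypothetical helper: pulls the first number out of a price string.
// Returns null when the string contains no digits (for example "Sold out").
function parsePrice(price: string): number | null {
  const match = price.replace(/,/g, "").match(/\d+(?:\.\d+)?/);
  return match ? Number(match[0]) : null;
}

// Adds a numeric `priceValue` next to the original string price.
function withNumericPrices(products: Product[]) {
  return products.map((product) => ({
    ...product,
    priceValue: parsePrice(product.price),
  }));
}
```

Keeping the original `price` string alongside the parsed number makes it easy to spot items where the parsing assumption did not hold.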

Advanced Schema Patterns

Single Object

const data = await opensteer.extract({
  description: "Extract the hero section information",
  schema: {
    title: "",
    subtitle: "",
    ctaText: "",
    ctaHref: "",
  },
});

Nested Objects

const data = await opensteer.extract({
  description: "Extract article with author details",
  schema: {
    title: "",
    content: "",
    author: {
      name: "",
      bio: "",
      avatar: "",
    },
  },
});

Arrays of Primitives

const data = await opensteer.extract({
  description: "Extract all category names",
  schema: {
    categories: [""],
  },
});

Best Practices

For AI agent workflows, always take an extraction snapshot first:
await opensteer.snapshot({ mode: "extraction" });
const data = await opensteer.extract({ ... });
This provides the LLM with optimized HTML for better extraction results.

Clear descriptions lead to better extraction:
// Good
description: "Extract the main product cards with title, price, and image"

// Less specific
description: "Extract products"

Your schema should reflect the actual structure of the page: use an array for repeated items and a plain object for a single element.

Always wrap extraction in try/catch/finally so that resources are closed even when extraction fails:
try {
  const data = await opensteer.extract({ ... });
  console.log(data);
} catch (error) {
  console.error("Extraction failed:", error);
} finally {
  await opensteer.close();
}

Running the Example

Make sure you have an API key configured for your model provider:
# For OpenAI
export OPENAI_API_KEY=your_key_here

# For Anthropic
export ANTHROPIC_API_KEY=your_key_here
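
A missing key typically only surfaces once the first model call fails, so it can be worth checking up front. This is a minimal sketch, assuming you know which of the variables above your chosen provider needs; `missingKeys` is a hypothetical helper, not part of OpenSteer.

```typescript
// Hypothetical helper, not part of OpenSteer.
// Returns the names from `required` that are unset or blank in `env`.
function missingKeys(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((name) => !env[name]?.trim());
}
```

In a Node script you could call `missingKeys(process.env, ["OPENAI_API_KEY"])` before launching and exit early if the returned list is not empty.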
Run the example:
node data-extraction.js

Next Steps

Form Filling

Learn how to fill out and submit forms

AI Integration

Build AI agents with OpenSteer