How It Works

Search first. Every product comes from a real web page.

Authentic Search Experience

This demo shows how Exa's search API can power a full product discovery flow. Every result comes from a real web search — no pre-built catalogue, no cached database. You type a query, Exa searches the web in real time, and the products you see are sourced directly from live product pages across the internet.

1. Searching the web

Your query is sent to Exa's product search pipeline. The search is restricted to the product category (category: "product") and passed through a neural filter to find real product pages. Marketplaces such as Amazon and eBay are excluded by domain, so direct brand and retailer sites surface instead.

2. Extracting product details

For each page found, Exa extracts structured data — product images, prices, and currencies — using AI-generated summaries with a JSON schema. This turns unstructured web pages into clean product cards.

3. Ranking by quality

Products are scored on data completeness (whether a usable image is available and whether a price was extracted) and on source quality. Only the top-scoring results are shown, so you see the best matches rather than just the first ones found.
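The ranking step could be sketched roughly as follows. This is a minimal illustration, not the demo's actual scoring code: the ProductCard shape, the MARKETPLACES list, the integer weights, and the function names are all assumptions chosen to mirror the three signals named above.

```typescript
// Hypothetical card shape; the demo's real type is not shown in this page.
interface ProductCard {
  title: string;
  url: string;
  imageUrl?: string;
  price?: number;
  currency?: string;
}

// Assumed list of lower-quality sources, used only for this illustration.
const MARKETPLACES = ["amazon.com", "ebay.com"];

// Integer weights (illustrative) avoid floating-point rounding in comparisons.
function completenessScore(card: ProductCard): number {
  let score = 0;
  if (card.imageUrl) score += 4; // image availability
  if (typeof card.price === "number" && card.price > 0) score += 3; // price extracted
  if (card.currency) score += 1; // currency extracted
  const host = new URL(card.url).hostname.replace(/^www\./, "");
  const isMarketplace = MARKETPLACES.some(
    (domain) => host === domain || host.endsWith("." + domain),
  );
  if (!isMarketplace) score += 2; // source quality
  return score;
}

// Sort a copy by descending score and keep the top `limit` cards.
function rankProducts(cards: ProductCard[], limit: number): ProductCard[] {
  return [...cards]
    .sort((a, b) => completenessScore(b) - completenessScore(a))
    .slice(0, limit);
}
```

Because the sort works on a copy, the original search order is preserved for any later use, and cards with equal scores keep their relative order.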

Exa Search Call

The core API call uses category: "product" with richImageLinks for intelligent image selection, plus a summary schema for structured price extraction.

const results = await exa.search(query, {
  type: "auto",
  numResults: 20,
  category: "product",
  flags: ["use_super_product_neural_filter"],
  contents: {
    extras: { richImageLinks: 5, imageLinks: 1 },
    summary: {
      query: "Product details for: " + query,
      schema: PRODUCT_SUMMARY_SCHEMA,
    },
  },
  excludeDomains: [
    "amazon.com", "ebay.com", "walmart.com",
    "alibaba.com", "aliexpress.com", // ...
  ],
});

Structured Summary Schema

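The summary schema enforces a JSON response with typed fields. The AI extracts price as a number and currency as an ISO 4217 code, which makes prices easy to format consistently. Only description is required, so price and currency may be absent when a page does not state them.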

{
  "type": "object",
  "properties": {
    "description": {
      "type": "string",
      "description": "A concise one-sentence description"
    },
    "price": {
      "type": "number",
      "description": "Numeric price without currency symbol"
    },
    "currency": {
      "type": "string",
      "description": "ISO 4217 currency code (USD, EUR, GBP)"
    }
  },
  "required": ["description"],
  "additionalProperties": false
}
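As a sketch of how a client might consume a summary shaped by this schema, the snippet below validates the fields and formats the price with the standard Intl.NumberFormat API. It assumes the summary arrives as a JSON string; the ProductSummary type and both function names are hypothetical and not part of Exa's SDK.

```typescript
// Mirrors the schema above: only `description` is guaranteed.
interface ProductSummary {
  description: string;
  price?: number;
  currency?: string;
}

// Parse and validate a raw summary string; returns null if it does not fit.
function parseProductSummary(raw: string): ProductSummary | null {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    const record = data as Record<string, unknown>;
    if (typeof record.description !== "string") return null;
    const summary: ProductSummary = { description: record.description };
    if (typeof record.price === "number" && Number.isFinite(record.price)) {
      summary.price = record.price;
    }
    // Accept only three-letter uppercase codes, the ISO 4217 format.
    if (typeof record.currency === "string" && /^[A-Z]{3}$/.test(record.currency)) {
      summary.currency = record.currency;
    }
    return summary;
  } catch {
    return null; // not valid JSON
  }
}

// Format the price for display, or return null when price or currency is missing.
function formatPrice(summary: ProductSummary, locale = "en-US"): string | null {
  if (summary.price === undefined || summary.currency === undefined) return null;
  try {
    return new Intl.NumberFormat(locale, {
      style: "currency",
      currency: summary.currency,
    }).format(summary.price);
  } catch {
    return null; // Intl throws a RangeError for malformed currency codes
  }
}

const parsed = parseProductSummary(
  '{"description":"Handheld console","price":349.99,"currency":"USD"}',
);
console.log(parsed ? formatPrice(parsed) : null); // prints "$349.99"
```

Returning null rather than throwing lets a card fall back to showing no price instead of failing to render.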

Image Filter

A product page's og:image is often a brand logo, banner, or stale asset. To avoid that, the demo requests multiple image candidates for every result via richImageLinks (URLs + alt text). The candidates are probed in parallel, and any that return a 404 or trip hotlink protection are dropped. An LLM then picks, from the survivors, the candidate that most clearly depicts the actual product. Cards with no usable image are removed.

// 1. Ask Exa for richer image candidates per result
contents: { extras: { richImageLinks: 5, imageLinks: 1 } }

// 2. Collect { url, alt } candidates, including og:image
const candidates = collectImageCandidates(result);

// 3. Probe each for reachability (matches browser referer + UA,
//    validates image bytes, catches soft-404s)
const reachable = await probeAll(candidates, { referer: origin });

// 4. One batched LLM call picks the best product photo per card,
//    biased against logos / banners / placeholders
const picked = await pickProductImagesWithLLM(openai, [
  { title: "Nintendo Switch OLED", candidates: reachable },
  // ...
]);
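The implementation of probeAll is not shown here. As one possible illustration of the "validates image bytes" part of step 3, the sketch below checks the leading bytes of a response against well-known file signatures, which is how an HTML error page served with a 200 status (a soft-404) can be told apart from a real image. The function names are hypothetical, and only JPEG, PNG, GIF, and WebP are covered.

```typescript
type ImageKind = "jpeg" | "png" | "gif" | "webp";

// True if `bytes` contains `signature` starting at `offset`.
function hasSignature(bytes: Uint8Array, signature: number[], offset = 0): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((value, i) => bytes[offset + i] === value);
}

// Identify an image by its magic number; returns null for anything else,
// including HTML error pages.
function sniffImageKind(bytes: Uint8Array): ImageKind | null {
  if (hasSignature(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (hasSignature(bytes, [0x47, 0x49, 0x46, 0x38])) return "gif"; // "GIF8"
  if (
    hasSignature(bytes, [0x52, 0x49, 0x46, 0x46]) && // "RIFF"
    hasSignature(bytes, [0x57, 0x45, 0x42, 0x50], 8) // "WEBP" at byte 8
  ) {
    return "webp";
  }
  return null;
}
```

Only the first twelve bytes are needed, so a probe built this way could stop reading a response early instead of downloading the whole file.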

Real-Time Streaming

Results stream back via Server-Sent Events so you see progress in real time — sources appearing as they're found, products rendered as they're extracted.

// Server streams events as search progresses
data: {"type":"plan","steps":[...]}
data: {"type":"step-start","stepIndex":0}
data: {"type":"source","url":"...","title":"...","domain":"..."}
data: {"type":"step-complete","stepIndex":0,"sourcesFound":18}
data: {"type":"product","product":{...}}
data: {"type":"done","totalProducts":12}
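On the client side, a stream like the one above might be consumed along these lines. This is a sketch under assumptions: the StreamEvent type is inferred from the sample events, parseSseChunk is a hypothetical helper, and each event is assumed to fit on a single data: line of JSON (the SSE format also allows multi-line data, which this does not handle).

```typescript
// Event shapes inferred from the sample stream; nested payloads are left opaque.
type StreamEvent =
  | { type: "plan"; steps: unknown[] }
  | { type: "step-start"; stepIndex: number }
  | { type: "source"; url: string; title: string; domain: string }
  | { type: "step-complete"; stepIndex: number; sourcesFound: number }
  | { type: "product"; product: unknown }
  | { type: "done"; totalProducts: number };

// Parse all complete lines in `buffer`; return the trailing partial line as
// `rest` so the caller can prepend it to the next network chunk.
function parseSseChunk(buffer: string): { events: StreamEvent[]; rest: string } {
  const events: StreamEvent[] = [];
  const lines = buffer.split(/\r?\n/);
  const rest = lines.pop() ?? ""; // the last piece may be an incomplete line
  for (const line of lines) {
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;
    try {
      events.push(JSON.parse(payload) as StreamEvent);
    } catch {
      // Skip lines that are not valid JSON rather than aborting the stream.
    }
  }
  return { events, rest };
}
```

A caller would keep the returned rest, append the next chunk to it, and call parseSseChunk again, so events split across network chunks are not lost.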