Exa

Company Enrichment Tutorial

Build a complete enrichment pipeline for your CRM or database using the Exa Answer API.

Architecture Overview

The complete data flow: your middleware calls Exa Answer to get structured company data, then syncs the results to your CRM or database.

[Diagram: Your Database (CRM / DB, source of truth), Your Middleware (orchestration layer on your server), Exa Answer API (search + extract in one call, structured JSON output)]

Pipeline Summary

1. Initial Backfill

Query records → Call Exa Answer for each → Get structured data → Update database

2. Weekly Refresh

Cron triggers → Re-query all companies → Fresh data from Exa → Push updates to database

3. New Records

Daily cron queries new accounts → Exa Answer enriches → Auto-sync on ingest


Exa Answer API Format

Exa Answer combines web search and LLM extraction in a single API call. Ask a question about any company and get back structured data in your exact schema—no separate LLM step required.

Category Search Types

Use category: "company" for company searches and category: "research paper" for research content. Exa positions its neural search as especially strong for these use cases.

Schema Tips

Use required for must-have fields and enum for fields with fixed options. Add description to guide extraction.
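
As a rough illustration of these tips, a schema for this tutorial might look like the sketch below. It assumes the schema is written as a JSON Schema object. The industry, employeeCount, and description fields mirror the ones used in the CRM update step, while companyType and its enum values are hypothetical additions for the example.

```javascript
// Illustrative schema only; adjust the fields to what your CRM actually stores.
const companySchema = {
  type: 'object',
  // Must-have fields go in `required`.
  required: ['industry', 'description'],
  properties: {
    industry: {
      type: 'string',
      description: 'Primary industry the company operates in',
    },
    employeeCount: {
      type: 'integer',
      description: 'Approximate number of employees',
    },
    description: {
      type: 'string',
      description: 'One or two sentence summary of what the company does',
    },
    // Hypothetical field showing `enum` for a fixed set of options.
    companyType: {
      type: 'string',
      enum: ['public', 'private', 'nonprofit'],
      description: 'Ownership structure of the company',
    },
  },
};
```

Short, concrete description strings tend to matter most for ambiguous fields such as employee counts, where the web often reports several different numbers.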

Phase 1: Initial Backfill

One-time enrichment of your existing company records

1. Query your company records

Query your database to get the company names and domains you want to enrich, along with the record IDs for syncing data back. This example uses a generic CRM client, but the same approach works with any database.

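    // Query all accounts with websites
    const result = await crm.query(`
      SELECT Id, Name, Website
      FROM Account
      WHERE Website != null
    `);

    // Keep the record ID so enriched data can be synced back later
    const accounts = result.records.map(acc => ({
      id: acc.Id,
      name: acc.Name,
      // new URL() throws unless Website includes a scheme (e.g. https://example.com)
      domain: new URL(acc.Website).hostname,
    }));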
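
CRM website fields are often entered by hand, so some values may lack a scheme or be unparseable, and new URL() throws on those. One way to guard against that is a small normalizer like the sketch below. The name toDomain is hypothetical, and stripping a leading "www." is an assumption about how you want domains keyed.

```javascript
// Hypothetical helper: turn a CRM Website value into a bare hostname, or null.
function toDomain(website) {
  if (typeof website !== 'string' || website.trim() === '') return null;
  const trimmed = website.trim();
  // Prepend a scheme when it is missing so the URL constructor can parse the value.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const withScheme = hasScheme ? trimmed : `https://${trimmed}`;
  try {
    const { hostname } = new URL(withScheme);
    return hostname.replace(/^www\./i, '') || null;
  } catch {
    // Unparseable value: return null so the caller can skip this record.
    return null;
  }
}
```

Records for which toDomain returns null can be filtered out before enrichment instead of aborting the whole backfill.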

2. Enrich with Exa Answer

For each company, call Exa Answer with your structured schema. This single API call searches the web and extracts exactly the fields you need—no separate LLM step required.

    import Exa from 'exa-js';

    const exa = new Exa(process.env.EXA_API_KEY);

    // companySchema is the schema describing the fields to extract (see Schema Tips)
    async function enrichCompany(companyName, domain) {
      const response = await exa.answer(
        `What is the company information for ${companyName} (${domain})?`,
        { schema: companySchema }
      );
      return { data: response.answer, sources: response.citations };
    }
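
Because every enrichment is a network call, transient failures such as timeouts or rate limits are worth planning for. The sketch below shows one generic way to retry with exponential backoff. withRetry and its options are hypothetical names, not part of exa-js, and the retry counts and delays are placeholder values.

```javascript
// Resolve after `ms` milliseconds.
function defaultSleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Hypothetical helper: retry an async function with exponential backoff.
// `sleep` is injectable so the delay can be stubbed in tests.
async function withRetry(fn, { retries = 3, baseDelayMs = 500, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // out of attempts
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

It could wrap the existing function as withRetry(() => enrichCompany(name, domain)), leaving enrichCompany itself unchanged.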

3. Update your database with enriched data

Push the structured data back to your database, mapping fields to your schema.

    async function updateCRMAccount(accountId, enrichedData) {
      await crm.sobjects.Account.update({
        Id: accountId,
        Industry: enrichedData.industry,
        NumberOfEmployees: enrichedData.employeeCount,
        Description: enrichedData.description,
        Last_Enriched__c: new Date().toISOString(),
      });
    }
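
If an answer comes back with some fields missing, writing them directly could overwrite good CRM values with empty ones. A possible safeguard is to build the update payload separately and skip missing values, as in this sketch. FIELD_MAP and buildAccountUpdate are hypothetical names; the field names are taken from the update call above.

```javascript
// Hypothetical mapping from enrichment field names to CRM field names.
const FIELD_MAP = {
  industry: 'Industry',
  employeeCount: 'NumberOfEmployees',
  description: 'Description',
};

// Build the update payload, leaving out fields the enrichment did not return.
function buildAccountUpdate(accountId, enrichedData, now = new Date()) {
  const update = { Id: accountId };
  for (const [source, target] of Object.entries(FIELD_MAP)) {
    const value = enrichedData?.[source];
    // Skip missing values so an incomplete answer never blanks out existing data.
    if (value !== undefined && value !== null && value !== '') {
      update[target] = value;
    }
  }
  update.Last_Enriched__c = now.toISOString();
  return update;
}
```

Keeping the mapping in one object also makes it easier to add a field: extend the schema, then add one line to FIELD_MAP.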

4. Run the complete backfill pipeline

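Orchestrate all the steps. The loop below processes accounts one at a time, which is the simplest form of concurrency control, and logs failures so that one bad record does not stop the run.

    // Process accounts sequentially; a failed account is logged and skipped
    for (const account of accounts) {
      try {
        const { data } = await enrichCompany(account.name, account.domain);
        await updateCRMAccount(account.id, data);
      } catch (err) {
        console.error(`Enrichment failed for ${account.name}:`, err);
      }
    }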
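
For larger backfills, running a few enrichments in parallel can shorten the run while still respecting rate limits. The sketch below is one way to cap the number of in-flight calls without extra dependencies. mapWithConcurrency is a hypothetical helper, and the right limit depends on your Exa plan and CRM API quotas.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
// Failures are captured per item so one bad record does not abort the batch.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  async function runLane() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index], index) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, runLane));
  return results;
}
```

With this helper, the worker would call enrichCompany and updateCRMAccount for one account, and the returned array shows which accounts failed and need a retry.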

Initial Backfill Complete

Your existing accounts are now enriched with fresh data from the web. Next: set up scheduled refresh to keep data current.

Phase 2: Weekly Refresh

Scheduled re-enrichment to keep data fresh

1. Set up a cron job

Schedule a weekly job to trigger the refresh process. Monday morning is a common choice.

    import cron from 'node-cron';

    // Run every Monday at 6 AM
    cron.schedule('0 6 * * 1', async () => {
      // Refresh logic here
    });
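
If a run can take longer than expected, or the job is also triggered manually, two refreshes may overlap and enrich the same records twice. A minimal in-process guard is sketched below. skipIfRunning is a hypothetical helper, and it only protects a single Node.js process, not several servers running the same schedule.

```javascript
// Hypothetical helper: wrap a job so a new trigger is skipped while a run is active.
function skipIfRunning(job) {
  let running = false;
  return async (...args) => {
    if (running) return { skipped: true };
    running = true;
    try {
      return { skipped: false, result: await job(...args) };
    } finally {
      // Always release the flag, even when the job throws.
      running = false;
    }
  };
}
```

The wrapped function can be passed to cron.schedule in place of the inline callback, for example skipIfRunning(refreshAllAccounts), where refreshAllAccounts stands in for your own refresh function.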

2. Query accounts prioritized by staleness

Fetch accounts from your CRM, ordered by when they were last enriched so the stalest data gets refreshed first.

    const accounts = await crm.query(
      'SELECT Id, Name, Website FROM Account ORDER BY Last_Enriched__c ASC'
    );
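
When the account list is large, you may prefer to refresh only records older than some threshold and cap the batch size per run. The sketch below does that filtering in the middleware. selectStale and its options are hypothetical, and it assumes Last_Enriched__c holds an ISO timestamp or is empty for records that were never enriched.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: pick the stalest records, oldest first.
// Records that were never enriched are treated as the oldest.
function selectStale(records, { now = new Date(), maxAgeDays = 7, limit = Infinity } = {}) {
  const cutoff = now.getTime() - maxAgeDays * DAY_MS;
  const lastEnriched = r => (r.Last_Enriched__c ? Date.parse(r.Last_Enriched__c) : 0);
  return records
    .filter(r => lastEnriched(r) <= cutoff)
    .sort((a, b) => lastEnriched(a) - lastEnriched(b))
    .slice(0, limit);
}
```

Doing the same filtering in the query itself would transfer less data; the in-memory version is shown because it does not depend on a particular query language.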

3. Re-enrich with Exa Answer

Call Exa Answer for each account to get fresh data from the web.

    for (const account of accounts.records) {
      const { data } = await enrichCompany(account.Name, account.Website);
      // data contains fresh structured company info
    }

4. Push updates to your database

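Update each record with the fresh enrichment data. This call belongs inside the loop from the previous step, right after enrichCompany returns.

    // Inside the for loop from step 3
    await updateCRMAccount(account.Id, data);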
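
Most weekly refreshes will find that little has changed, so it can be worth comparing fresh data with the stored record and writing only the differences. The sketch below shows one simple comparison. changedFields is a hypothetical helper, and both arguments are assumed to use the same CRM field names.

```javascript
// Hypothetical helper: return only the fields whose fresh value differs from the stored one.
function changedFields(current, fresh) {
  const changes = {};
  for (const [key, value] of Object.entries(fresh)) {
    // A missing fresh value carries no new information, so leave the stored value alone.
    if (value === undefined || value === null) continue;
    // JSON comparison handles strings, numbers, and simple nested values alike.
    if (JSON.stringify(current[key]) !== JSON.stringify(value)) {
      changes[key] = value;
    }
  }
  return changes;
}
```

An empty result means the record is unchanged, in which case only the Last_Enriched__c timestamp needs updating.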

Weekly Refresh Complete

Your accounts stay current with fresh data from the web every week.

Phase 3: New Record Ingestion

Automatically enrich new records as they're added to your database

1. Set up a daily cron job

Schedule a daily job to detect and enrich new records added in the last 24 hours.

    import cron from 'node-cron';

    // Run daily at 7 AM
    cron.schedule('0 7 * * *', async () => {
      // New account enrichment logic here
    });

2. Query new records

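Fetch recently created accounts that haven't been enriched yet. Because the job runs at 7 AM, filtering on today's date alone would miss accounts created later the previous day, so the query looks back to the start of yesterday and uses Last_Enriched__c to avoid enriching a record twice.

    const newAccounts = await crm.query(
      'SELECT Id, Name, Website FROM Account ' +
      'WHERE CreatedDate >= YESTERDAY AND Last_Enriched__c = null'
    );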

3. Enrich and sync each new record

Call Exa Answer for each new account and push the enriched data back.

    for (const account of newAccounts.records) {
      const { data } = await enrichCompany(account.Name, account.Website);
      await updateCRMAccount(account.Id, data);
    }

4. Alternative: Real-time webhook

For instant enrichment, set up a webhook that triggers when a new account is created.

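    // Webhook endpoint for real-time enrichment
    app.post('/webhook/new-account', async (req, res) => {
      // Acknowledge immediately, then enrich in the background
      res.status(200).send('OK');
      try {
        const { data } = await enrichCompany(req.body.name, req.body.website);
        await updateCRMAccount(req.body.id, data);
      } catch (err) {
        // The response is already sent, so log instead of throwing
        console.error('Webhook enrichment failed:', err);
      }
    });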
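
Webhook bodies come from outside your server, so it is sensible to check them before spending an API call. The sketch below validates the three fields the handler above reads. parseNewAccountPayload is a hypothetical helper, and the payload shape (id, name, website) is taken from that handler rather than from any specific CRM's webhook format.

```javascript
// Hypothetical validator for the webhook body used by the handler above.
function parseNewAccountPayload(body) {
  const errors = [];
  for (const field of ['id', 'name', 'website']) {
    if (typeof body?.[field] !== 'string' || body[field].trim() === '') {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    account: {
      id: body.id.trim(),
      name: body.name.trim(),
      website: body.website.trim(),
    },
  };
}
```

In the handler, a failed check could return a 400 response before any enrichment starts. Verifying that the request really came from your CRM, for example with a shared secret, is a separate concern not covered here.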

New Record Ingestion Complete

New accounts are automatically enriched—no manual intervention needed.

That's it!

You now have a complete enrichment pipeline: initial backfill for existing records, weekly refresh to keep data current, and automatic enrichment for new accounts. The Exa Answer API handles all the web search and data extraction in a single call.