Architecture Overview
The complete data flow: your middleware calls Exa Answer to get structured company data, then syncs it to your CRM or database.
Pipeline Summary
Initial backfill: Query records → Call Exa Answer for each → Get structured data → Update database
Weekly refresh: Cron triggers → Re-query all companies → Fresh data from Exa → Push updates to database
New record ingestion: Daily cron queries new accounts → Exa Answer enriches → Auto-sync on ingest
Exa Answer API Format
Exa Answer combines web search and LLM extraction in a single API call. Ask a question about any company and get back structured data in your exact schema—no separate LLM step required.
Category Search Types
Use category: "company" for company searches and category: "research paper" for research content. Exa's neural search is particularly strong for these use cases.
Schema Tips
Use required for must-have fields and enum for fields with fixed options. Add description to guide extraction.
Phase 1: Initial Backfill
One-time enrichment of your existing company records
Query your company records
Query your database to get the company names and domains you want to enrich, along with the record IDs for syncing data back. This example uses a generic CRM, but works with any database.
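// Query all accounts with websites
const result = await crm.query(`
  SELECT Id, Name, Website FROM Account WHERE Website != null
`);
const accounts = result.records.map(acc => ({
  id: acc.Id,
  name: acc.Name,
  // new URL() throws on values without a scheme, so add one when missing
  domain: new URL(/^https?:\/\//i.test(acc.Website) ? acc.Website : `https://${acc.Website}`).hostname,
}));
Enrich with Exa Answer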
For each company, call Exa Answer with your structured schema. This single API call searches the web and extracts exactly the fields you need—no separate LLM step required.
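The companySchema passed to the call is not defined on this page. A minimal sketch of what it might look like, assuming Exa Answer accepts a standard JSON Schema object: the property names match the fields written back to the CRM later in this guide, while the enum values and descriptions are placeholders to adapt.

```javascript
// Hypothetical schema: field names mirror the update step in this guide;
// the enum values and descriptions are illustrative assumptions.
const companySchema = {
  type: 'object',
  properties: {
    industry: {
      type: 'string',
      // enum restricts the answer to a fixed set of options
      enum: ['Software', 'Finance', 'Healthcare', 'Manufacturing', 'Retail', 'Other'],
      description: 'Primary industry the company operates in',
    },
    employeeCount: {
      type: 'integer',
      description: 'Approximate number of employees',
    },
    description: {
      type: 'string',
      description: 'One or two sentence summary of what the company does',
    },
  },
  // required marks the must-have fields
  required: ['industry', 'description'],
};
```

Keeping employeeCount out of required is a deliberate choice in this sketch: headcount is often not published, and an optional field lets the rest of the record come back without it.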
import Exa from 'exa-js';
const exa = new Exa(process.env.EXA_API_KEY);
// companySchema is the JSON Schema describing the fields you want extracted
async function enrichCompany(companyName, domain) {
  const response = await exa.answer(
    `What is the company information for ${companyName} (${domain})?`,
    { schema: companySchema }
  );
  // answer holds the structured data; citations list the pages it came from
  return { data: response.answer, sources: response.citations };
}
Update your database with enriched data
Push the structured data back to your database, mapping fields to your schema.
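One thing to watch when mapping fields: a lookup that returns no value for a field would overwrite whatever the CRM already holds. A minimal sketch of a mapping helper that drops empty values, assuming the CRM field names used in this guide; toCrmFields is a hypothetical name, not part of any SDK.

```javascript
// Hypothetical helper: map enriched data to CRM field names,
// skipping missing values so existing CRM data is not overwritten.
function toCrmFields(enrichedData, now = new Date()) {
  const mapping = {
    industry: 'Industry',
    employeeCount: 'NumberOfEmployees',
    description: 'Description',
  };
  const fields = {};
  for (const [source, target] of Object.entries(mapping)) {
    const value = enrichedData?.[source];
    if (value !== undefined && value !== null && value !== '') {
      fields[target] = value;
    }
  }
  // Record when this enrichment ran, even if only some fields were found
  fields.Last_Enriched__c = now.toISOString();
  return fields;
}
```

The result can be spread into the update payload next to the record Id, for example { Id: accountId, ...toCrmFields(enrichedData) }.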
async function updateCRMAccount(accountId, enrichedData) {
  await crm.sobjects.Account.update({
    Id: accountId,
    Industry: enrichedData.industry,
    NumberOfEmployees: enrichedData.employeeCount,
    Description: enrichedData.description,
    Last_Enriched__c: new Date().toISOString(),
  });
}
Run the complete backfill pipeline
Orchestrate all the steps with concurrency control and error handling.
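// Process accounts in small batches (5 is an arbitrary limit); failures are logged, not fatal
const BATCH_SIZE = 5;
for (let i = 0; i < accounts.length; i += BATCH_SIZE) {
  await Promise.all(accounts.slice(i, i + BATCH_SIZE).map(async (account) => {
    try {
      const { data } = await enrichCompany(account.name, account.domain);
      await updateCRMAccount(account.id, data);
    } catch (err) { console.error(`Failed to enrich ${account.name}:`, err); }
  }));
}
Initial Backfill Complete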
Your existing accounts are now enriched with fresh data from the web. Next: set up scheduled refresh to keep data current.
Phase 2: Weekly Refresh
Scheduled re-enrichment to keep data fresh
Set up a cron job
Schedule a weekly job to trigger the refresh process. Monday morning is a common choice.
import cron from 'node-cron';
// Run every Monday at 6 AM
cron.schedule('0 6 * * 1', async () => {
  // Refresh logic here
});
Query accounts prioritized by staleness
Fetch accounts from your CRM, ordered by when they were last enriched so the stalest data gets refreshed first.
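Where the data store cannot sort on the enrichment timestamp, the same stalest-first order can be applied in application code. A minimal sketch, assuming Last_Enriched__c holds an ISO 8601 string, or is empty for records that were never enriched; sortByStaleness is a hypothetical helper.

```javascript
// Hypothetical helper: never-enriched records first, then oldest enrichment first.
function sortByStaleness(records) {
  // Treat a missing timestamp as the epoch so it sorts ahead of any real date
  const enrichedAt = (r) => (r.Last_Enriched__c ? Date.parse(r.Last_Enriched__c) : 0);
  // Copy before sorting so the caller's array is left untouched
  return [...records].sort((a, b) => enrichedAt(a) - enrichedAt(b));
}
```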
const accounts = await crm.query(
  'SELECT Id, Name, Website FROM Account ORDER BY Last_Enriched__c ASC'
);
Re-enrich with Exa Answer
Call Exa Answer for each account to get fresh data from the web.
for (const account of accounts.records) {
  const { data } = await enrichCompany(account.Name, account.Website);
  // data contains fresh structured company info
}
Push updates to your database
Update each record with the fresh enrichment data.
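Taken together, the query, enrich, and update steps of this phase could be wired into one function for the cron callback to call. A sketch under assumptions: runRefresh is a hypothetical name, and its three dependencies are passed in so the function can be exercised with stubs. In this guide they would correspond to the staleness query, enrichCompany, and updateCRMAccount.

```javascript
// Hypothetical orchestration of the weekly refresh.
// queryAccounts() is assumed to resolve to { records: [...] }, like crm.query in this guide.
async function runRefresh({ queryAccounts, enrich, update }) {
  const summary = { updated: 0, failed: [] };
  const { records } = await queryAccounts();
  for (const account of records) {
    try {
      const { data } = await enrich(account.Name, account.Website);
      await update(account.Id, data);
      summary.updated += 1;
    } catch (err) {
      // Keep going: one failed lookup should not stop the whole run
      summary.failed.push({ id: account.Id, error: String(err) });
    }
  }
  return summary;
}
```

Returning a summary rather than throwing gives the scheduled job something to log, and the failed list can seed a retry on the next run.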
// Inside the loop from the previous step, once enrichCompany has resolved
await updateCRMAccount(account.Id, data);
Weekly Refresh Complete
Your accounts stay current with fresh data from the web every week.
Phase 3: New Record Ingestion
Automatically enrich new records as they're added to your database
Set up a daily cron job
Schedule a daily job to detect and enrich new records added in the last 24 hours.
import cron from 'node-cron';
// Run daily at 7 AM
cron.schedule('0 7 * * *', async () => {
  // New account enrichment logic here
});
Query new records
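Fetch accounts created since the start of yesterday that haven't been enriched yet.
const newAccounts = await crm.query(
  // SOQL-style date literal covering yesterday and today, so accounts created after the 7 AM run are not missed
  'SELECT Id, Name, Website FROM Account WHERE CreatedDate = LAST_N_DAYS:1 AND Last_Enriched__c = null'
);
Enrich and sync each new record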
Call Exa Answer for each new account and push the enriched data back.
for (const account of newAccounts.records) {
  const { data } = await enrichCompany(account.Name, account.Website);
  await updateCRMAccount(account.Id, data);
}
Alternative: Real-time webhook
For instant enrichment, set up a webhook that triggers when a new account is created.
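// Webhook endpoint for real-time enrichment
app.post('/webhook/new-account', async (req, res) => {
  // Acknowledge immediately so the sender does not time out or retry
  res.status(200).send('OK');
  try {
    const { data } = await enrichCompany(req.body.name, req.body.website);
    await updateCRMAccount(req.body.id, data);
  } catch (err) {
    // The response is already sent, so log the failure instead of rethrowing
    console.error('Enrichment failed for account', req.body.id, err);
  }
});
New Record Ingestion Complete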
New accounts are automatically enriched, with no manual intervention needed.
That's it!
You now have a complete enrichment pipeline: initial backfill for existing records, weekly refresh to keep data current, and automatic enrichment for new accounts. The Exa Answer API handles all the web search and data extraction in a single call.