Data reconciliation. It's a task that sounds simple on the surface—comparing lists to find what's changed—but can quickly spiral into a complex, resource-intensive nightmare. Whether you're synchronizing user databases, de-duplicating marketing leads, or verifying a data migration, you've likely spent hours writing boilerplate code to loop, compare, and filter arrays.
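This is tedious, error-prone, and it doesn't scale. A nested-loop comparison that finishes instantly on 1,000 records can bring your server to its knees at 1,000,000, because the number of comparisons grows with the product of the two list sizes.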
Enter lists.do: a powerful, developer-first API for list management and operations. It's "Lists as a Service"—you send us your arrays, and we perform complex sorting, filtering, and set theory operations on our optimized cloud infrastructure. You get a clean, accurate result without bogging down your own resources.
Let's explore five common data reconciliation workflows you can automate today, saving you hours of development time.
The Scenario: You just hosted a successful webinar and have a CSV of 5,000 new leads. You need to add them to your master customer relationship management (CRM) list, but you want to avoid creating duplicate entries for people who are already customers.
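The Old Way: You'd likely write a script to fetch all existing customer emails, load the 5,000 new emails into memory, and then loop through the new list, checking if each email already exists in your master list. If each check is a linear scan, 5,000 new leads against 50,000 existing contacts can mean up to 250 million comparisons, and both lists have to sit in your application's memory the whole time.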
The lists.do Way: Use the diff operation to find the unique items. The source is your new lead list, and the exclude list is your master CRM list. The API returns a clean list of only the truly new leads.
import { createClient } from '@do/sdk';

const lists = createClient('lists.do');

// Find new webinar leads not already in our CRM
async function getNewSignups() {
  const newWebinarLeads = ['user1@example.com', 'user50@example.com', /* ...5000 more */];
  const existingCrmContacts = ['user1@example.com', 'user2@example.com', /* ...50000 more */];

  // Keep only the leads that do not appear in the CRM list
  const { data: newLeads } = await lists.diff({
    source: newWebinarLeads,
    exclude: existingCrmContacts
  });

  console.log(newLeads);
  // Output: ['user50@example.com'] and any other truly new leads
  // Now you can safely import `newLeads` into your CRM
}

// Surface request failures instead of leaving the promise rejection unhandled
getNewSignups().catch(console.error);
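One practical caveat about "truly new" leads: a plain string comparison treats `User1@Example.com` and `user1@example.com` as two different entries. Nothing above says whether lists.do normalizes values, so the sketch below assumes it does not and cleans a list locally before it is sent. `normalizeEmails` is a hypothetical helper written for this example, not part of the SDK.

```typescript
// Hypothetical helper (not part of any SDK): trims, lowercases, and
// de-duplicates email addresses while keeping first-seen order.
// Lowercasing the whole address is a common simplification.
function normalizeEmails(emails: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const raw of emails) {
    const email = raw.trim().toLowerCase();
    // Skip blank rows and anything we have already kept
    if (email.length === 0 || seen.has(email)) continue;
    seen.add(email);
    result.push(email);
  }
  return result;
}

const rawLeads = [' User1@Example.com', 'user50@example.com', 'USER50@example.com ', ''];
const cleanedLeads = normalizeEmails(rawLeads);
console.log(cleanedLeads);
// Output: ['user1@example.com', 'user50@example.com']
```

Running both the new lead list and the master list through the same normalization keeps the comparison consistent on both sides.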
The Scenario: You maintain a "golden source" list of users who should have admin privileges in your application (e.g., in a separate IAM system or a simple config file). You need to regularly audit your application's database to ensure its list of admins matches the golden source perfectly.
The Old Way: Fetch two lists and write two separate loops: one to find users who have access but shouldn't, and another to find users who need access but don't have it. This is a classic set theory problem that's cumbersome to implement manually.
The lists.do Way: Perform two diff operations to find both discrepancies. This gives you two actionable lists: one for revocation and one for granting access.
// Conceptual code for permission synchronization
const goldenSourceAdmins = ['admin1@corp.com', 'admin2@corp.com'];
const currentDbAdmins = ['admin1@corp.com', 'rogue_admin@corp.com'];

// Find admins who need their access revoked
const { data: usersToRevoke } = await lists.diff({
  source: currentDbAdmins,
  exclude: goldenSourceAdmins
});
// Result: ['rogue_admin@corp.com']

// Find users who need admin access granted
const { data: usersToGrant } = await lists.diff({
  source: goldenSourceAdmins,
  exclude: currentDbAdmins
});
// Result: ['admin2@corp.com']
The Scenario: Your subscription service needs to identify which customers canceled their service this month. You have a list of active subscriber IDs from last month and a list from this month.
The Old Way: Fetch last month's list. For each user, check if they exist in this month's list. If not, add them to a "churned" list. This becomes very slow as your user base grows.
The lists.do Way: A simple diff operation gives you an immediate answer. By subtracting this month's active users from last month's, you are left with precisely the users who have churned.
// Conceptual code for finding churned users
const subscribersLastMonth = ['sub_1', 'sub_2', 'sub_3', 'sub_4'];
const subscribersThisMonth = ['sub_1', 'sub_3', 'sub_5']; // sub_2 and sub_4 churned, sub_5 is new

const { data: churnedUsers } = await lists.diff({
  source: subscribersLastMonth,
  exclude: subscribersThisMonth
});

console.log(churnedUsers);
// Output: ['sub_2', 'sub_4']
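To see why the manual loop slows down as the user base grows, the sketch below runs the same churn check two ways on your own machine: once with a linear scan per subscriber, and once with a `Set` lookup. It is a local illustration only and says nothing about how lists.do implements diff. The function names and the comparison counter are made up for this example.

```typescript
// Naive approach: for each of last month's IDs, scan this month's list.
// Worst case is lastMonth.length * thisMonth.length comparisons.
function churnedNaive(
  lastMonth: string[],
  thisMonth: string[]
): { churned: string[]; comparisons: number } {
  let comparisons = 0;
  const churned: string[] = [];
  for (const id of lastMonth) {
    let found = false;
    for (const other of thisMonth) {
      comparisons++;
      if (id === other) {
        found = true;
        break;
      }
    }
    if (!found) churned.push(id);
  }
  return { churned, comparisons };
}

// Set-based approach: one pass to build the set, one pass to filter.
function churnedWithSet(lastMonth: string[], thisMonth: string[]): string[] {
  const active = new Set(thisMonth);
  return lastMonth.filter((id) => !active.has(id));
}

const lastMonthIds = ['sub_1', 'sub_2', 'sub_3', 'sub_4'];
const thisMonthIds = ['sub_1', 'sub_3', 'sub_5'];

const naiveResult = churnedNaive(lastMonthIds, thisMonthIds);
console.log(naiveResult.churned, naiveResult.comparisons);
// Output: ['sub_2', 'sub_4'] 9

const setResult = churnedWithSet(lastMonthIds, thisMonthIds);
console.log(setResult);
// Output: ['sub_2', 'sub_4']
```

Nine comparisons for four subscribers is harmless, but the naive count grows with the product of the two list sizes, while the `Set` version grows with their sum.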
The Scenario: Your company runs two separate products, "PhotoTool" and "VideoTool." You want to run a marketing campaign targeting customers who use both products, offering them a discounted "Creative Suite" bundle.
The Old Way: Pull the user list for PhotoTool. Pull the user list for VideoTool. Write a nested loop or use hash maps to find the users present in both lists. This is a classic array intersection problem.
The lists.do Way: Use the intersection operation. It’s designed specifically for this set theory task, providing a highly optimized way to find the common elements between two or more lists.
// Conceptual code for finding users of multiple products
const photoToolUsers = ['userA@example.com', 'userB@example.com'];
const videoToolUsers = ['userB@example.com', 'userC@example.com'];
const { data: powerUsers } = await lists.intersection({
  lists: [photoToolUsers, videoToolUsers]
});
console.log(powerUsers);
// Output: ['userB@example.com']
// Now you have your target list for the "Creative Suite" campaign!
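For reference, here is a local sketch of what an intersection over two or more lists means. It rests on two assumptions the text above does not confirm: the result keeps the order of the first list, and each value appears once. The real API may differ on both points. `intersectAll` is a name chosen for this example.

```typescript
// Local reference sketch: values present in every one of the given lists.
function intersectAll<T>(lists: T[][]): T[] {
  const first = lists[0];
  if (first === undefined) return [];

  // Build one Set per remaining list for constant-time membership checks
  const restSets = lists.slice(1).map((list) => new Set(list));

  const seen = new Set<T>();
  const result: T[] = [];
  for (const item of first) {
    if (seen.has(item)) continue; // report each value only once
    if (restSets.every((set) => set.has(item))) {
      seen.add(item);
      result.push(item);
    }
  }
  return result;
}

const commonUsers = intersectAll([
  ['userA@example.com', 'userB@example.com', 'userD@example.com'],
  ['userB@example.com', 'userC@example.com', 'userD@example.com'],
  ['userD@example.com', 'userB@example.com'],
]);
console.log(commonUsers);
// Output: ['userB@example.com', 'userD@example.com']
```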
The Scenario: You're migrating a user database with 10 million records from a legacy MySQL server to a new PostgreSQL instance. You need to be certain that every single user record made it across safely.
The Old Way: Trying to load 10 million IDs from both databases into your application's memory is a recipe for disaster. You'd have to write a complex, paginated streaming script and be very careful about memory management.
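The lists.do Way: This is where our service truly shines. You can stream the user IDs from both the source and destination databases directly to our API, and we handle the massive scale on our infrastructure. Perform a diff in both directions: source minus destination finds records that never arrived, and destination minus source finds records that should not be there. If both diffs return an empty list, every ID exists on both sides and the two databases hold the same set of users. Note that this confirms the IDs match; comparing the contents of each record is a separate check. This offloads the entire computational burden from your servers and application code.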
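The two-direction check can be sketched as follows. How IDs are streamed to the API is not shown in this article, so the sketch takes plain arrays and accepts the diff call as a parameter with the same shape as the `lists.diff` calls used earlier. `DiffFn`, `localDiff`, `MigrationReport`, and `verifyMigration` are hypothetical names, and `localDiff` is an in-memory stand-in so the example runs offline.

```typescript
// Shape of the diff call used earlier in this article; injected so the
// sketch can run offline with a local stand-in.
type DiffFn = (args: { source: string[]; exclude: string[] }) => Promise<{ data: string[] }>;

// Local stand-in. Assumed semantics: items in `source` that do not
// appear in `exclude`.
const localDiff: DiffFn = async ({ source, exclude }) => {
  const excluded = new Set(exclude);
  return { data: source.filter((id) => !excluded.has(id)) };
};

interface MigrationReport {
  missingInDestination: string[]; // in the source, never arrived
  unexpectedInDestination: string[]; // in the destination, not in the source
  idsMatch: boolean;
}

async function verifyMigration(
  sourceIds: string[],
  destinationIds: string[],
  diff: DiffFn = localDiff
): Promise<MigrationReport> {
  // The two diffs are independent, so they can run concurrently
  const [missing, unexpected] = await Promise.all([
    diff({ source: sourceIds, exclude: destinationIds }),
    diff({ source: destinationIds, exclude: sourceIds }),
  ]);
  return {
    missingInDestination: missing.data,
    unexpectedInDestination: unexpected.data,
    idsMatch: missing.data.length === 0 && unexpected.data.length === 0,
  };
}

verifyMigration(['1', '2', '3'], ['1', '3', '4']).then((report) => {
  console.log(report);
  // Output: { missingInDestination: ['2'], unexpectedInDestination: ['4'], idsMatch: false }
});
```

In a real run you would pass a function that wraps the actual API call in place of `localDiff`, and treat a non-empty `missingInDestination` as the list of records to re-migrate.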
For each of these workflows, you could write the data manipulation logic yourself. But why would you?
Ready to automate your data reconciliation tasks? Check out lists.do and turn hours of frustrating coding into a single, clean API call.