
Improving Lead Deduplication Workflow with Clay and Salesforce Integration

Hi Clay, I’m reaching out regarding a critical issue in my current workflow involving Clay’s integration with Salesforce.

The Issue

Whenever new leads are created in Salesforce through Clay, I need to ensure that they are deduplicated against existing Salesforce Contacts and Leads. Currently, I am facing challenges with managing this process effectively:

  1. Current Workflow:

  • I rely on exporting all Salesforce Contacts and Leads as a CSV.

  • I then deduplicate the data manually or with external scripts, matching new leads across fields like:

    • Full Name (fuzzy matching).

    • Email (including domain-level matching).

    • Website (domain-level matching).

    • LinkedIn URL (unique ID matching).

    • Company Name (fuzzy matching).

  • The results are then returned to Salesforce with additional columns for:

    • Match confidence (e.g., "Sure" or "Unsure").

    • Matched record ID.

    • Matched field(s) and matched value(s).

  2. The Problem:

  • This process is slow, unreliable, and heavily dependent on external tools or developers.

  • With growing data volumes, it has become unsustainable, especially as I need a scalable, automated solution.

What I Need Help With

I’d like support in identifying how Clay can address this issue within the current integration. Specifically:

  1. Deduplication Workflow:

  • Is there a way for Clay to automatically deduplicate new leads by comparing them against Salesforce Contacts and Leads during the creation process?

  2. Matching Logic:

  • Full Name: Fuzzy or phonetic matching (e.g., Jewish Soundex).

  • Email: Fuzzy matching for full email and domain-level matching (e.g., valenciarc.com vs. valenciarealty.capital).

  • Website: Domain-level comparison.

  • LinkedIn URL: Match based on unique IDs.

  • Company Name: Fuzzy matching for name variations.
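
The matching logic listed above could be sketched roughly as follows in plain JavaScript. This is only an illustration: the helper names (`levenshtein`, `similarity`, `linkedinId`, `emailDomain`) are hypothetical and are not Clay or Salesforce functions, and edit distance is just one possible choice of fuzzy measure.

```javascript
// Hypothetical helpers for illustration; none of these are Clay or Salesforce APIs.

// Levenshtein edit distance, computed row by row with a single array.
function levenshtein(a, b) {
  const prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = prev[0]; // value of cell (i-1, j-1)
    prev[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = prev[j]; // value of cell (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      prev[j] = Math.min(above + 1, prev[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after trimming and lowercasing.
function similarity(a, b) {
  const x = (a || "").trim().toLowerCase();
  const y = (b || "").trim().toLowerCase();
  if (!x || !y) return 0; // never match on blank values
  return 1 - levenshtein(x, y) / Math.max(x.length, y.length);
}

// Slug after /in/ or /company/ in a LinkedIn URL, or null if there is none.
function linkedinId(url) {
  const m = /linkedin\.com\/(?:in|company)\/([^\/?#]+)/i.exec(url || "");
  return m ? m[1].toLowerCase() : null;
}

// Part of an email address after the last "@", lowercased; null if no "@".
function emailDomain(email) {
  const at = (email || "").lastIndexOf("@");
  return at === -1 ? null : email.slice(at + 1).trim().toLowerCase();
}

console.log(similarity("Jon Smith", "John Smith"));
console.log(linkedinId("https://www.linkedin.com/in/Jane-Doe-123/?trk=x"));
console.log(similarity(emailDomain("a@valenciarc.com"), emailDomain("b@valenciarealty.capital")));
```

Any cut-off for treating two values as "the same" (for example, a similarity of 0.85 or higher) would be an assumption to tune against real data.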

  3. Output to Salesforce:

  • I need Clay to populate fields in the new lead or a related record with:

    • Matched record ID (Lead or Contact ID).

    • Confidence score (High, Medium, Low).

    • Matched field(s) and matched value(s).

  4. Automated Workflow:

  • If Clay can handle this process directly, it would eliminate the need for manual intervention or external tools, which would significantly improve accuracy and save time.

Additional Context

Previously, I relied on a Python script set up by a developer to handle this deduplication. However, the setup was unreliable, and I’ve spent weeks troubleshooting without success. I’m looking for a solution that leverages Clay’s capabilities and doesn’t require me to rely on external scripts or manual processes.

Can you assist me in configuring this deduplication workflow or suggest how I can achieve this through Clay’s existing features? Happy to speak with any developer referrals you may have as well.

Thank you for your support. I look forward to hearing your recommendations.

  • Channeled (APP)

    You can implement a deduplication workflow in Clay using a combination of Lookup Record and Create Record actions. Here's how:

    1. First, use the Lookup Record action to check for existing records:
      • Select your Salesforce account and specify the Objects (Leads and Contacts)
      • Define search parameters using your matching fields (Email, Name, Company)
      • You can toggle between exact match or contains search depending on your needs

    2. After the lookup:
      • Create columns in your Clay table to store the lookup results
      • Use these results to determine if a record is a duplicate
      • Configure conditional runs based on the lookup results

    3. For creating new records:
      • By default, Clay won't create duplicate records in Salesforce
      • If needed, you can enable "Duplicate Rule Override" in the Create Record action to bypass Salesforce's duplicate rules

    Note: For advanced fuzzy matching or complex deduplication logic beyond exact/contains matches, you may need to implement additional tools or custom solutions, as these specific matching capabilities aren't native to Clay's Salesforce integration.

    Sources:
      • Salesforce integration overview | Documentation | Clay University
      • Salesforce Integration Guide
      • FAQ: Will Clay create duplicate records in Salesforce?
      • Lookup Record with Salesforce

  • Channeled (APP)

    Looks like you're still looking for help. We're escalating this over to our support team and someone should be in touch soon. In the meantime, feel free to include any other information that may be helpful in solving your issue!

  • Muhammad S.

    You can build this without a developer's help as well. It's just logic that you need to implement within your Clay table: actively look up records in Salesforce and make sure the record being created is new and not already present.

  • Stephen S.

    Muhammad S. thanks for your reply. I did try this and unchecked the "exact match" checkbox. Unfortunately, the results were very inaccurate, and the exact match functionality doesn't work for my dataset.

  • Bo (.

    Hey Stephen - I would use a dedupe feature within Salesforce instead; that would yield better results, since a lot of them are made for this. Once the database is clean, you can then correctly use a lookup and/or any of our enrichments for your purposes. If you only want to use Clay, you could upload all your data and then proceed to matching and cleaning it, but that sounds like it might be too complicated. Some options would be:

      • Salesforce Duplicate Management: Use built-in matching and duplicate rules to identify and manage duplicates within Salesforce.
      • Cloudingo: A third-party tool designed to find and merge duplicate records in Salesforce.
      • DemandTools: Offers deduplication features to identify and merge duplicates across your Salesforce data.
      • Duplicate Check by Plauti: A Salesforce-native application that helps find, merge, and prevent duplicate records.
      • DupeCatcher: A Salesforce app that flags duplicates as they’re entered or imported, allowing real-time interception.

  • Stephen S.

    Thanks, Bo (.. We are having challenges with the native Salesforce dedupe capability (i.e., we can't indicate the duplicate record on the prospective import record). We are having to build a custom solution that has the fuzzy matching capability we need. Surprisingly, most of the options you mentioned don't have those capabilities (such as Soundex fuzzy matching). Cloudingo may, but it is cheaper to build our own.
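
For readers unfamiliar with Soundex, here is a minimal sketch of the classic American Soundex code in JavaScript. Note that this is the standard variant, not the Daitch-Mokotoff ("Jewish") Soundex mentioned earlier in the thread, and the function name `soundex` is just an illustrative choice. Two names are treated as a phonetic match when their codes are equal.

```javascript
// Classic American Soundex: first letter plus three digits, e.g. "R163".
// Non-ASCII letters (such as umlauts) are dropped by this simple sketch.
function soundex(name) {
  const codes = {
    b: 1, f: 1, p: 1, v: 1,
    c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,
    d: 3, t: 3,
    l: 4,
    m: 5, n: 5,
    r: 6,
  };
  const letters = (name || "").toLowerCase().replace(/[^a-z]/g, "");
  if (!letters) return "";

  let out = letters[0].toUpperCase();
  let last = codes[letters[0]] || 0; // code of the previous coded letter
  for (const ch of letters.slice(1)) {
    if (ch === "h" || ch === "w") continue; // h and w do not separate equal codes
    const code = codes[ch] || 0; // vowels and y get 0 and reset the run
    if (code !== 0 && code !== last) out += code;
    last = code;
    if (out.length === 4) break;
  }
  return out.padEnd(4, "0");
}

console.log(soundex("Robert"), soundex("Rupert"));   // R163 R163
console.log(soundex("Smith"), soundex("Smyth"));     // S530 S530
console.log(soundex("Ashcraft"));                    // A261
```

Soundex is coarse, so it is probably best used as one signal among several rather than as the sole match criterion.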

  • Bo (.

    I see - There are also a lot of other tools like Duplicate Check by Plauti, Apsona Dedupe and Match, and Datagroomr, and there are a lot of Python libraries for this too. Might be worth checking GitHub or other package repositories for those.

  • Maximilian J.

    Stephen S. We are currently running into a similar issue in our HubSpot data. There are a lot of duplicates that are difficult to identify due to differences in subsidiary/holding names, domains, and more. What I did was create a table that pulls in all CRM companies/contacts. I used extensive (JavaScript) formulas in two columns to completely normalize the name and domain (this removes all prefixes and suffixes, entity names such as the German GmbH, AG, etc., and terms such as "Group", "Holding", etc., and makes everything lowercase). Subsequently, I run one "equals" lookup for each, the name and the domain, within the same table, and if either one is true (sum of results >2), it is very likely a duplicate. To make it more sensitive, you can add another lookup that uses the contains operator; however, this is often too sensitive, especially for firms with very short names or domains such as "EFF Group".
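
Outside of Clay, the same "equals lookup within the same table" idea can be approximated by counting how often each normalized key occurs. The sketch below assumes each row already carries hypothetical `normName` and `normDomain` values, and it reads "sum of results >2" as: every row matches itself once per lookup, so any additional hit on either key signals a likely duplicate.

```javascript
// Hypothetical sketch: count occurrences of each normalized key, then flag
// rows whose name or domain key appears on more than one row.
function flagDuplicates(rows) {
  const nameCounts = new Map();
  const domainCounts = new Map();
  const bump = (map, key) => {
    if (key) map.set(key, (map.get(key) || 0) + 1); // skip blank keys
  };
  for (const row of rows) {
    bump(nameCounts, row.normName);
    bump(domainCounts, row.normDomain);
  }
  return rows.map((row) => {
    const nameHits = nameCounts.get(row.normName) || 0;
    const domainHits = domainCounts.get(row.normDomain) || 0;
    // A row always matches itself, so more than one hit means another row shares the key.
    return { ...row, nameHits, domainHits, likelyDuplicate: nameHits > 1 || domainHits > 1 };
  });
}

// Made-up sample rows.
const sampleRows = [
  { id: 1, normName: "acme", normDomain: "acme.de" },
  { id: 2, normName: "acme", normDomain: "acme-holding.com" },
  { id: 3, normName: "eff", normDomain: "eff.de" },
];
console.log(flagDuplicates(sampleRows).map((r) => [r.id, r.likelyDuplicate]));
```

Checking each count separately, rather than summing them, keeps the flag correct for rows where one of the two keys is blank.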

  • Maximilian J.

    This is my current name normalization formula for reference 🙂 It is adjusted for (mostly) German B2B operations. Just use claude.ai or any other GenAI tool to create/modify the formula, works pretty well!

    {{Company Name}}?.toLowerCase()
      .replace(
        /\s+(gmbh(?: \+? &? co\.? ?kg?| \+? co\.? ?kg?)?|ag(?!\s+[a-zäöüß])|se(?: \+? &? co\.? ?kg?)?|\bkg|(?<!^)europa|(?<!^)europe|(?<!^)germany|(?<!^)deutschland|(?<!^)holland|(?<!^)netherlands|(?<!^)group|(?<!^)gruppe|holding|ug|ohg|\+? &? co\.?)\b.*/gi,
        ""
      )
      .trim()
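
To try the formula above outside of Clay, it can be wrapped in an ordinary JavaScript function. In this sketch the `{{Company Name}}` column reference is replaced by a plain parameter, and the function name `normalizeName` is a hypothetical choice; the regular expression itself is copied unchanged. It relies on lookbehind support, which current Node.js versions have.

```javascript
// Plain-JavaScript wrapper around the formula shared above.
// The {{Company Name}} column reference becomes the companyName parameter.
function normalizeName(companyName) {
  return companyName?.toLowerCase()
    .replace(
      /\s+(gmbh(?: \+? &? co\.? ?kg?| \+? co\.? ?kg?)?|ag(?!\s+[a-zäöüß])|se(?: \+? &? co\.? ?kg?)?|\bkg|(?<!^)europa|(?<!^)europe|(?<!^)germany|(?<!^)deutschland|(?<!^)holland|(?<!^)netherlands|(?<!^)group|(?<!^)gruppe|holding|ug|ohg|\+? &? co\.?)\b.*/gi,
      ""
    )
    .trim();
}

console.log(normalizeName("Siemens AG"));           // siemens
console.log(normalizeName("Volkswagen Group"));     // volkswagen
console.log(normalizeName("Müller GmbH & Co. KG")); // müller
```

Because the pattern removes everything from the first matched legal-form or group term onward, it is worth running a handful of real names through it before relying on it.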

  • Stephen S.

    Maximilian J. This is great. Thank you for sharing! I can see how this could work. Do any duplicates end up slipping by?

  • Stephen S.

    I could see scenarios where the structured normalization might not pick up on variances in domain name vs website, for example.
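
One part of that gap, a website URL versus a bare domain or an email address, can be narrowed by reducing all three to the same host string before comparing. The sketch below is an assumption-laden illustration (the function name `normalizeDomain` is hypothetical). It would not catch genuinely different domains for the same company, such as valenciarc.com vs. valenciarealty.capital, which still needs fuzzy matching or a manual mapping.

```javascript
// Reduce a website URL, bare domain, or email address to a comparable host.
function normalizeDomain(value) {
  if (!value) return "";
  let s = String(value).trim().toLowerCase();
  s = s.replace(/^[a-z][a-z0-9+.-]*:\/\//, ""); // strip scheme such as https://
  s = s.split(/[\/?#]/)[0];                     // strip path, query, fragment
  const at = s.lastIndexOf("@");
  if (at !== -1) s = s.slice(at + 1);           // email address: keep the domain part
  s = s.replace(/:\d+$/, "");                   // strip port
  s = s.replace(/^www\./, "");                  // strip leading www.
  return s;
}

console.log(normalizeDomain("https://www.Acme.com/about?x=1")); // acme.com
console.log(normalizeDomain("jane@acme.com"));                  // acme.com
console.log(normalizeDomain("acme.com:8080/"));                 // acme.com
```

The path is stripped before the "@" check so that URLs such as `https://medium.com/@user` are not mistaken for email addresses.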

  • Channeled (APP)

    Hey there - just wanted to check in here to see if you needed anything else! Feel free to reply back here if you do.

  • Channeled (APP)

    We haven't heard back from you in a bit, so we're going to go ahead and close things out here - feel free to let us know if you still need something!
