I’d like to have consistent UUIDs for people and companies. When I use different enrichment sources (e.g., through a waterfall), each source seems to return a different UUID format. What’s the best way to standardize UUIDs so I can have both coverage (maximizing the number of enriched rows) and consistency (using a cohesive UUID format across all records)? Is there a method to do this?
Our normalization tools can help standardize data formats across different sources. You can access these functions in the enrichment panel under the "Tools" section, where you'll find pre-built normalization functions. These tools are credit-free since they parse existing data and execute code rather than reaching out to external providers. The normalization functions can standardize text whitespace, phone numbers, and locations to create uniformity across your data. However, the available normalization tools focus on company names, text formatting, phone numbers, and locations rather than UUID standardization specifically. Are you looking to create a consistent identifier format for matching records across different enrichment sources, or do you need to standardize existing UUID formats that different providers are returning?
Yes I am looking to create a consistent identifier format for matching records across different enrichment sources
Use Clay's "corner pieces" as consistent identifiers across enrichment sources. For people records: Use Personal LinkedIn Profile URL and Full Name as your primary identifiers. For company records: Use Company Domain and Company LinkedIn Profile URL as your primary identifiers. These corner pieces remain consistent regardless of which enrichment provider you use in your waterfall, giving you both coverage and consistency for record matching.
LinkedIn profiles can change so I would like to create a unique UUID for each person or company if possible
Clay doesn't have a built-in UUID generation feature for creating unique identifiers across enrichment sources. Our system relies on corner pieces like LinkedIn URLs and company domains as consistent identifiers, but these can change as you mentioned. You could create your own UUID system using a formula column that combines multiple stable data points (like email domain + first name + last name for people, or domain + company name for companies) and then applies a hash function to generate a consistent identifier. What specific data points do you have available in your table that tend to remain stable across your enrichment sources?
I am currently using PeopleDataLabs to enrich organizations and using the PDL IDs as UUIDs, but when PDL can’t find an organization and I have to fall back to other enrichment sources, I get a different ID format. I am trying to standardize UUID formats if possible. For people, I plan on using PDL, the Clay enrichment source, and Apollo.
Our normalization tools can standardize text formats, phone numbers, and locations, but don't include UUID standardization specifically. These tools are credit-free since they parse existing data rather than reaching out to external providers. For your UUID standardization challenge across PDL, Clay, and Apollo enrichment sources, you'll need to create a custom solution using a formula column. You can combine stable data points (like company domain + company name) and apply a hash function to generate consistent identifiers when different sources return varying ID formats. Since each provider (PDL, Apollo) returns their own unique ID format, there's no built-in way to standardize these across sources - you'll need to implement your own UUID generation logic using our formula columns.
Can you show me how to build that formula?
Here's how to build a UUID formula in Clay:
1. Add a new "formula" column to your table
2. Use our AI formula generator to create custom code that combines stable data points
3. Prompt the AI with something like: "Create a UUID by combining company domain + company name and applying a hash function"
4. The AI will generate the appropriate code to give you a consistent identifier
For example, you could combine:
- For companies: Domain + Company Name
- For people: Email domain + First Name + Last Name
The formula will hash these stable data points into a consistent UUID format, regardless of which enrichment source (PDL, Apollo, Clay) you're using.
This formula is not working: require("crypto").createHash("sha256").update(({{website}}||"").toString().toLowerCase()).digest("hex")
The require("crypto") syntax doesn't work in Clay's formula environment. Try using our AI formula generator instead:
1. Add a new formula column
2. Use the AI formula generator with a prompt like: "Create a hash of the website field using SHA256"
3. Let the AI generate the proper code for Clay's environment
The AI will create the appropriate syntax that works within Clay's formula system. Our AI formulas are credit-free and designed to handle custom code generation for data formatting and manipulation.
The AI is also returning require("crypto")
Clay's AI formula generator may not support Node.js crypto functions like require("crypto"). Try a simpler approach:
1. Add a formula column
2. Prompt: "Create a simple hash of the website field using built-in functions"
3. Or try: "Combine website + company name into a unique identifier"
The AI should generate code that works within Clay's environment without requiring external libraries.
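For reference, here is a minimal sketch of what a dependency-free hash could look like in plain JavaScript. It assumes the formula environment accepts standard JavaScript without require, which is not confirmed in this conversation. The helper names (normalizeDomain, fnv1aHex, companyId) and the "co_" prefix are hypothetical, not Clay built-ins. It swaps SHA-256 for 32-bit FNV-1a, a simple non-cryptographic hash, so collisions become likely once a table reaches tens of thousands of rows.

```javascript
// Hypothetical helpers for illustration; none of these are Clay built-ins.

// Normalize a website so "https://www.Example.com/about" and "example.com" match.
function normalizeDomain(value) {
  return String(value || "")
    .trim()
    .toLowerCase()
    .replace(/^https?:\/\//, "") // drop the scheme
    .replace(/^www\./, "")       // drop a leading "www."
    .replace(/[\/?#].*$/, "");   // drop path, query string, and fragment
}

// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
// This matches standard FNV-1a only for ASCII input.
function fnv1aHex(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Build a company identifier; returns "" when no website is available.
function companyId(website) {
  const domain = normalizeDomain(website);
  return domain ? "co_" + fnv1aHex(domain) : "";
}
```

In a formula column, the input would presumably be the {{website}} reference used in the formula above. Because the domain is normalized before hashing, different providers returning "https://www.example.com/" and "example.com" would map to the same identifier.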
Hi Nathen, thank you for reaching out! I don't believe creating a unique identifier for each data point is possible in Clay. You can potentially use formulas to create an ID based on a hash function, but that requires passing a value into the hash function that is unique for each row. Were you able to figure out a simpler formula that could do the task?
No, I wasn’t. I tried both suggestions and none of the formulas generated results. I guess the simplest way would just be to concatenate multiple fields, but I would ideally like to find a more elegant solution.
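The concatenation approach mentioned here can be sketched in plain JavaScript as follows. This is only an illustration, assuming the fields are available as strings. The helper names (normalizePart, compositeKey) and the "|" delimiter are hypothetical choices, not Clay features.

```javascript
// Hypothetical helpers for illustration; not Clay built-ins.

// Trim, lowercase, and collapse internal whitespace so that
// " Jane  Doe " and "jane doe" produce the same key part.
function normalizePart(value) {
  return String(value || "")
    .trim()
    .toLowerCase()
    .replace(/\s+/g, " ");
}

// Join normalized parts with a delimiter that should not occur in the data,
// so ["ab", "c"] and ["a", "bc"] do not collapse into the same key.
function compositeKey(parts) {
  return parts.map(normalizePart).join("|");
}

// Example: a person key built from email domain, first name, and last name.
const personKey = compositeKey(["Acme.com", " Jane ", "Doe"]);
console.log(personKey); // prints "acme.com|jane|doe"
```

The key is only as stable as its inputs: if a person changes employer, the email domain changes and so does the key.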
