Hi everyone! How can I make a backup of the Clay workbook? I ask this, because I would like to secure my data in case of accidental deletion of a column within the workbook. Also, I want to use geographic data to find large buildings within Germany to then auto-search them in Google Maps to scrape their details. What integrations are there for this? I have some non-Clay sources from the government, but I want to know what I can do in Clay.
I don't see specific backup features for Clay workbooks in our documentation. For data protection, you may need to export your data regularly. For finding buildings in Germany, we have a Google Maps integration that can search local businesses by location. You can: • Set Germany as your search location • Define a search radius and specific areas • Pull business information including addresses and details. We also integrate with 75+ data enrichment tools and offer AI web scraping to extract information from websites. You can also use our HTTP API feature to connect external tools you have API access to, including your government data sources, if they provide API access.
Cool. Can I search or filter those buildings by square meter? How does that work?
Sure, no problem!
Hey! There's no native backup feature in Clay, but exporting your table as a CSV is the best way to safeguard your data in case of accidental changes or deletions. To prevent unnecessary enrichments while editing, it's also a good idea to turn off auto-update before making big changes; this helps you avoid retriggering credits by mistake. For your project on large buildings in Germany, you can use the Google Maps integration in Clay to search and pull details. You can also use Claygent to search online for specific building data or details. If you're trying to get information like square meters, Claygent can usually surface that. Another option is to take a screenshot of a building's perimeter on Google Maps and use AI Analyze Image to extract surface-related info. Let me know if you have more questions.
Thanks for your response! I have a question concerning the waterfall logic in Clay. I have tried to get it to work in this workbook, but I am not sure why it did not work or what I should do differently. Second, I am trying to create one workbook as the one we actively use and another as a backup. I already did a CSV export, but when I imported it, it only had 5000 rows and not the original 16827. Why not? https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/tables/t_0szaay9t77oqp6dPbvM/views/gv_0sz6ugahV3D6yAkmBS3 https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/all-tables
Hey, happy to walk you through this. For the work email waterfall, here's how it works: when triggered, it checks the first provider. If that provider returns an email, it's sent to a verification provider. If the email is marked invalid, the next provider is triggered. Each provider works independently, so some might return the same email since they don't share results with each other. For the CSV import, Clay has a limit of 5,000 rows per import, which is likely why you're only seeing a portion of your 16,827 rows. To bring everything in, you'll need to split your CSV into smaller files before uploading. Could you also share the original workbook and upload your CSV to Google Sheets? That'll help us compare and double-check the import. Let me know if you have more questions.
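(Editor's note: since the 5,000-row cap applies per import, the splitting step has to happen outside Clay. Here is a minimal stdlib-only Python sketch; the chunk size and the `_partN` file naming are illustrative choices, not anything Clay prescribes.)

```python
import csv
from pathlib import Path


def split_csv(src, max_rows=5000, out_dir="."):
    """Split src into numbered CSV chunks of at most max_rows data rows,
    repeating the header row in every chunk. Returns the chunk paths."""
    src = Path(src)
    out_dir = Path(out_dir)
    chunks = []
    with src.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows, part = [], 1

        def flush():
            nonlocal rows, part
            if not rows:
                return
            dest = out_dir / f"{src.stem}_part{part}.csv"
            with dest.open("w", newline="", encoding="utf-8") as out:
                writer = csv.writer(out)
                writer.writerow(header)   # keep the header in each chunk
                writer.writerows(rows)
            chunks.append(dest)
            rows, part = [], part + 1

        for row in reader:
            rows.append(row)
            if len(rows) == max_rows:
                flush()
        flush()                           # write any leftover rows
    return chunks
```

A 16,827-row file would come out as four files (5,000 + 5,000 + 5,000 + 1,827), each importable on its own.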
Sure! Here is the original, with 16827 rows: https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/tables/t_0sz6ugaAdPnxkdmdM5g/views/gv_0sz6ugahV3D6yAkmBS3 Here is the Google Sheet: https://docs.google.com/spreadsheets/d/1k_ZBKrBIVkAokWCMMIPcwlrlKnJps7jb0T0C6lB46D8/edit?usp=sharing As for the waterfall logic: What I am trying to create is waterfall logic in Clay that first sees if www.domain/impressum.de exists and scrapes it if it does. If that site does not exist, Clay then checks if www.domain/kontakt.de exists and scrapes it if it does. Then www.domain/contact.de Then www.domain/ueberuns.de And so on. I tried to first create a waterfall with Clay's HTTP API to check which sites of the above listed ones actually exist and then scrape the ones that exist, but that did not work
Hey, thanks for sharing both links. I tested it on my end and everything seems to be working correctly when added. See here. For your goal, rather than manually checking and scraping multiple versions of the same page, I'd recommend using an enrichment like Find Emails Associated with Domain (from providers like Prosp.io, Snov.io, or Hunter). These automatically return emails from across key pages without needing custom logic, and it's much more efficient on credits. If you still want to stick with your original approach, you can do it by: ** Creating separate columns for each URL variant (/impressum, /kontakt, etc.) ** Using a formula column to check if each URL exists ** Then using Scrape Website, set to "only run if" that specific page is valid. Alternatively, you can also try using Claygent to find the most relevant page, like the Impressum, directly; it might give you better flexibility. Let me know if you want help setting that up.
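(Editor's note: the check-then-scrape logic described above can be sketched outside Clay for reference. This is an illustration, not Clay's implementation: `CANDIDATE_PATHS` is an assumed list of page variants, and the `exists` callback stands in for whatever existence check, formula column, or HTTP API call you wire up in the table.)

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

# Assumed priority order of German contact-page variants.
CANDIDATE_PATHS = ["/impressum", "/kontakt", "/contact", "/ueberuns"]


def default_exists(url, timeout=5):
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        req = Request(url, method="HEAD", headers={"User-Agent": "Mozilla/5.0"})
        with urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (HTTPError, URLError, ValueError):
        return False


def first_existing_page(domain, paths=CANDIDATE_PATHS, exists=default_exists):
    """Walk the candidate paths in priority order and return the first
    URL that exists, or None. `exists` is injectable for testing."""
    for path in paths:
        url = f"https://{domain}{path}"
        if exists(url):
            return url
    return None
```

Only the first page that passes the check would then be handed to the scrape step, which mirrors the "only run if" conditional on the Scrape Website column.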
Ok, thanks for telling me how to make it more efficient! The end result is that I would like 1. the telephone number, 2. the first and last name of the person in charge 3. the email, preferably the private email. The reason I started by scraping the impressum page is because I can hypothetically get all of that in one scrape, but is there a way to do that more efficiently? Is there an enrichment for first and last name too? Or some workaround?
Hi, jumping in for Bo here!
While scraping the Impressum page can work in some cases, there's a more reliable and efficient workflow in Clay that can help you get validated contact information, including the full name, email, and phone number of decision-makers.
Here's how I'd recommend approaching it: 1. Start by using a Claygent column to find the best domain for each company (as Bo mentioned). This gives you a clean, usable domain that unlocks many enrichments. 2. Run a "Find Contacts at Company" search using that domain. In the job title field, you can specify the seniority or department you're targeting (e.g., "Founder", "Head of Marketing", etc.). 3. Once the contacts are found, you'll see initial information like name and title. You can then run an "Enrich Person" or "Enrich Profile" action on those contacts to get more detailed data, including their first and last name, email, and phone number. 4. These enrichments typically return validated contact data, making them a strong alternative to scraping. Here's a quick demo of what that might look like:
This workflow might give you better results; however, I totally understand if you'd prefer to use the scraping technique instead, and I'd recommend following Bo's approach above for what that would look like. Let me know how you'd like to progress, and I'm happy to help you create a solution that best suits you!
Hello, and thanks for your reply. I am not sure I understand your method. I already have a column of companies. Here is the workbook: https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/tables/t_0szfyi9esn5Ucne2Hm3/views/gv_0szfyi9PsonfgqfE2u6 I'm currently trying to set up an enrichment workflow using my own list of companies and domains, which I've uploaded into Clay. However, I'm running into a few issues:
Wrong preview data: When I try to configure enrichment actions (like EnrichPerson or EnrichProfile), Clay shows me preview data for random companies, not the ones from my actual worksheet. I want the enrichments to strictly use the domains and company names from my uploaded list, but that doesn't seem to be happening right now.
EnrichPerson / EnrichProfile understanding: I've also tried using the EnrichPerson and EnrichProfile actions, but I'm not entirely sure how they work or whether I've set them up correctly. Could you help me understand:
What inputs these actions require to function properly
The difference between them
How to best use them to get contact-level data for people at a given company
General UI clarity: I'm still getting used to the Clay interface and would appreciate a short explanation of how:
Workbooks, views, and columns relate to each other
Data is stored or segmented across different parts of Clay
I've attached a screenshot to show my current setup; I hope that helps clarify what I'm trying to achieve.
Hi - sorry for the confusion there and hope to help you out with your workflows. 1) I am looking into what's happening here with the team and will get back to you when we've got a solution for this. For your other two questions, happy to help clarify things here for you.
Understanding EnrichPerson vs. EnrichProfile
These actions are actually the same: "EnrichProfile" refers to Clay's "Enrich Person from Profile" action. It pulls detailed data from LinkedIn profiles, like job title, company, and more.
Inputs These Actions Work Best With: ** LinkedIn URL: most reliable for enrichment ** Email: work or personal, depending on the provider ** Full name + company domain: a great fallback when LinkedIn isn't available ** Phone number: sometimes usable, but provider-dependent. To get contact-level data for people at a company, you can: 1. Use "Find Contacts at Company" to discover leads 2. Then enrich them using "Enrich Person from Profile" 3. Optionally, set up a waterfall (multiple providers in sequence) to maximize coverage. Clay UI: How Workbooks, Views, and Columns Fit Together ** Workbooks = project folders that contain your related tables (like Excel files with tabs) ** Tables = where your actual data lives (companies, contacts, etc.) ** Views = ways to filter and organize how you see a table (e.g., only fully enriched rows) ** Columns = individual data fields (inputs, enrichment results, formulas). Everything is connected, so you can link company and contact data and build smooth workflows. This video is a helpful visual explanation of how workbooks are structured: https://www.clay.com/changelog/clay-workbooks
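(Editor's note: to make the waterfall in step 3 concrete, here is a minimal conceptual sketch in Python. The provider callables and the validator are placeholders, not real Clay or provider APIs.)

```python
def email_waterfall(providers, is_valid, domain):
    """Mimic Clay's waterfall: ask each provider in turn and stop at the
    first email that passes verification. `providers` is a list of
    (name, callable) pairs; `is_valid` stands in for a verification step."""
    for name, provider in providers:
        email = provider(domain)
        if email and is_valid(email):
            return name, email
    return None, None  # no provider produced a valid email
```

The point of the pattern is that later (often pricier) providers only spend credits when earlier ones fail verification, which is exactly why ordering the sequence from cheapest to most expensive matters.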
Let me know if that makes sense!
Clay S., can you help me fix the workflow that you recommended I set up? If there is a way to do what I want more efficiently, I would like to know about it. Can you help me fix the problem with the preview data? We could also do a Slack Huddle or some other arrangement, if that would make our communication more efficient.
Hi Bo, I am trying to set up the workflow that Clay Support recommended to me: https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/tables/t_0szfyi9esn5Ucne2Hm3/views/gv_0szfyi9PsonfgqfE2u6
I looked for other guides, like this, but that relies on LinkedIn, and many people in the cleaning industry do not have LinkedIn. I made a video to explain where I am at right now: https://www.loom.com/share/9464fac8d23141ecac03fa40cb5022cf Bo, Clay S.
Bo, I found this: https://community.clay.com/x/support/mmdb58wfmocw/data-sources-for-clays-find-companies-and-find-peo?utm_source=chatgpt.com Here is the data: https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/tables/t_0szfyi9esn5Ucne2Hm3/views/gv_0szfyi9PsonfgqfE2u6 I made a video to explain where I am at right now: https://www.loom.com/share/9464fac8d23141ecac03fa40cb5022cf Clay S. From what I understand, Clay's "Find Contacts at Company" enrichment relies on external data providers like Clearbit, Apollo, HitHorizons, etc. We're working almost exclusively with German SMEs, and I've noticed that domains like zuzusauber.de often return "Company Not Found." Can you confirm whether this is because those domains simply aren't present in the provider databases, and if so, that our current workflow (scraping Impressum/contact pages + using GPT to extract names, emails, and phone numbers) is indeed the most effective and cost-efficient approach for this type of data? This is a key decision point for us, since it will impact how we scale the process and optimize credit usage. Appreciate your guidance!
Hi - thanks for your patience here and for sending across the video to help us understand where you are at. Apologies for the confusion in setting up this workflow. You are right that the domains you are attempting to find are not in our provider databases, which is why you are unable to get significant results. Apologies that this did not work as expected. I also tried to run a workflow to find emails by domain, but this did not output the best results either. Your current workflow (scraping Impressum/contact pages + using GPT to extract names, emails, and phone numbers) seems to be the most cost-efficient approach for the data that you have. Let us know if you have further questions.
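(Editor's note: as a small illustration of the scrape-plus-extract workflow endorsed above, here is a rough Python sketch that pulls candidate emails and phone numbers out of scraped page text with regular expressions. The phone pattern is an assumption tuned loosely to German numbers; in practice, GPT extraction, as the poster is already doing, handles names and messy layouts far better than regexes.)

```python
import re

# Common email shape; good enough for Impressum pages, not a full RFC parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Loose assumption for German numbers: +49 or 0 prefix, then digits with
# optional spaces, slashes, parentheses, or hyphens.
PHONE_RE = re.compile(r"(?:\+49|0)[\d /()-]{6,}\d")


def extract_contacts(text):
    """Pull candidate emails and phone numbers out of raw page text."""
    emails = sorted(set(EMAIL_RE.findall(text)))
    phones = sorted(set(m.strip() for m in PHONE_RE.findall(text)))
    return {"emails": emails, "phones": phones}
```

A regex pass like this is essentially free per row, so it can serve as a cheap first filter before spending AI credits on rows where it finds nothing.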
Hi, Clay S.! Here is where I am at in the other workbook: https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/tables/t_0sz6ugaAdPnxkdmdM5g/views/gv_0sz6ugahV3D6yAkmBS3 https://www.loom.com/share/118808f64a81441487220e3da8085de1
Thanks for the context here. I'd like to ask a follow-up: you mentioned you used the "Research Bot" to try to find the Impressum page before, but it didn't work as expected? Can I ask whether you used Claygent or OpenAI to do the search?
I used Claygent. This was the "Impressum" page it could not find: https://www.zulj.de/35.html
Understood - thanks for sharing that. I can't see the Claygent column you were using in your current table, but there may be a way to improve the prompt to help you find better results.
Hi! This is my situation so far with the agent: https://www.loom.com/share/b2135c97510542c79fbf340ce087c439 This is the sheet: https://app.clay.com/workspaces/670253/workbooks/wb_0sz6ugaxRpWDQC9QBfm/tables/t_0sz6ugaAdPnxkdmdM5g/views/gv_0sz6ugahV3D6yAkmBS3
Thanks for sharing the video for context and walking through your prompt. I noticed you're currently using ChatGPT for the search of the information page, not Claygent. In this case, Claygent might actually be a better fit since it's more effective at web scraping. It does use 3 credits per row (compared to 1 credit for ChatGPT), but if you're open to testing it on one of your rows, it could be a great option for finding harder-to-locate pages. Just a heads-up: even with Claygent, AI might not always find the exact result you're looking for, so trying it out on a single row first is a good way to gauge whether it's worth applying more broadly. It could also be a good way to check whether Claygent would give you a better output for the domain you pointed out that was incorrectly generated in Row 9.
Hope that helps clarify things, and let me know if you need help setting this up. If this is not the direction you'd like to go in, then I believe your original solution makes the most sense for the logic here and is optimized to use credits effectively.