A client wants to feed the scoring system I have built in Clay with data that lives in Postgres. We only want to get leads into the Clay table that have taken some trigger action (e.g. clicked a link in an email). The volume is upwards of 10k per month. But their Postgres instance is behind a VPN, and there is no easy way to bypass this without creating a security vulnerability. Keeping this brief: can someone guide me on how best to approach this? What do you need to know?
I think the ChatGPT answer is not correct, but posting it here:

1. Set Up Regular Database Exports to S3 or GCS
• Automated Database Exports: Use a script, database tool, or ETL solution to regularly export your data to files (like CSV, JSON, or Parquet) and save them in an S3 or GCS bucket. For example, you could run a cron job to export the data every night (a sketch of such a script follows this list).
• Naming Conventions: Set up a clear naming convention for these files (like data-export-YYYY-MM-DD.csv) so that you can easily identify and retrieve the latest file.

2. Connect Clay to Your S3 or GCS Account
• Configure the Integration:
  • In Clay, navigate to Integrations and select Amazon S3 or Google Cloud Storage.
  • Input your credentials (AWS Access Key and Secret for S3, or a GCS service account JSON key).
  • Specify the correct permissions to allow Clay to read files from the bucket. Clay will need permissions like s3:GetObject for S3, or similar access permissions in GCS.

3. Set Up an Automated Pull Schedule in Clay
• Define Frequency: Set a schedule for Clay to check the S3 or GCS bucket. This can be every hour, daily, or weekly, depending on how frequently your data changes.
• File Selection: If you're storing multiple versions of the data, use a pattern or specific file name to tell Clay which file to pull. For example, you could specify a pattern like data-export-*.csv to always pull the latest version.

4. Parse and Process the Data in Clay
• File Parsing:
  • Once the file is pulled, specify how Clay should parse it. Clay can read common formats like CSV, JSON, and Parquet.
  • For CSV files, ensure that headers are correctly set so Clay knows how to map each column.
• Data Transformation:
  • Set up data processing steps in Clay to filter, transform, or map data as needed for your workflows.
  • For example, if the data has timestamps or IDs you want to filter by, you can configure these in Clay's processing settings.

5. Trigger Workflows Based on the New Data
• Automate Actions: After data import, Clay can trigger specific workflows, such as sending notifications, updating dashboards, or moving data to another system.
• Error Handling: Add checks to handle any errors in data parsing, such as malformed files or connectivity issues. Clay can be configured to alert you if a data pull fails.
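To make step 1 concrete, here is a minimal sketch of a nightly export script in Python, assuming psycopg2 and boto3 are installed; the `leads` and `email_events` tables, their columns, the connection string, and the bucket name are all hypothetical placeholders to adapt to the client's actual schema:

```python
#!/usr/bin/env python3
"""Nightly export of triggered leads from Postgres to S3 (run via cron)."""
import csv
import io
from datetime import date

import boto3
import psycopg2

BUCKET = "my-clay-exports"  # hypothetical bucket name

# Hypothetical schema: a leads table joined to an email_events table that
# records trigger actions such as link clicks in the last 24 hours.
QUERY = """
    SELECT l.id, l.email, l.company, e.clicked_at
    FROM leads AS l
    JOIN email_events AS e ON e.lead_id = l.id
    WHERE e.event_type = 'link_click'
      AND e.clicked_at >= NOW() - INTERVAL '1 day'
"""

def main() -> None:
    conn = psycopg2.connect("dbname=crm user=etl")  # placeholder DSN
    buf = io.StringIO()
    try:
        with conn.cursor() as cur:
            cur.execute(QUERY)
            writer = csv.writer(buf)
            writer.writerow([col[0] for col in cur.description])  # header row
            writer.writerows(cur.fetchall())
    finally:
        conn.close()

    # Follow the data-export-YYYY-MM-DD.csv naming convention from step 1.
    key = f"data-export-{date.today().isoformat()}.csv"
    boto3.client("s3").put_object(
        Bucket=BUCKET, Key=key, Body=buf.getvalue().encode("utf-8")
    )

if __name__ == "__main__":
    main()
```

Scheduled with a crontab entry like `0 2 * * * /usr/bin/python3 export_leads.py`, this runs on the database side of the VPN, so only the exported CSV, not a live database connection, is ever exposed to Clay.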
Hey there Andreas, thanks for reaching out. If you are worried about creating a security vulnerability by connecting Clay to their database, a safe bet here would be to see if they can give you the dataset as a CSV, then upload the CSV(s) to your table and let the scoring system run that way. This may be time consuming: if a CSV file is too large, Clay won't be able to bring it in, so multiple CSVs may be needed (a rough splitting sketch follows). However, it would mean you don't have to deal with their VPN or with connecting to the Postgres database.
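If the export does turn out to be too large for a single upload, here is a rough sketch of splitting it into smaller CSVs while repeating the header row in each part; the 10,000-row chunk size and the file names are assumptions for illustration, not a limit confirmed by Clay:

```python
import csv

CHUNK_ROWS = 10_000  # assumed chunk size; tune to whatever Clay accepts

def split_csv(path: str) -> None:
    """Split one large CSV into numbered parts, repeating the header in each."""
    with open(path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # assumes the file has a header row
        part, out, writer = 0, None, None
        for i, row in enumerate(reader):
            if i % CHUNK_ROWS == 0:  # start a new part file
                if out:
                    out.close()
                part += 1
                out = open(f"{path}.part{part}.csv", "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
            writer.writerow(row)
        if out:
            out.close()

split_csv("data-export-2024-11-01.csv")  # hypothetical export file name
```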