Is there a way to automate scraping company job boards (multiple sources) into Clay, like an RSS feed?
Yes, we have a few ways to automate scraping job board sources into Clay:
1. RSS Feed Integration: We offer a native RSS Feed source that allows you to pull feed items from URLs into a Clay table.
2. Webhook Integration: You can use third-party scraping tools like Browse AI to scrape job boards and send data to Clay via webhook.
3. Claygent AI Web Scraper: Our AI-powered web scraper can retrieve unstructured data from various websites, including job boards.
To set up an RSS feed, use our "write to table" feature to import feed items as rows. For more complex scraping needs, Claygent or a webhook integration with a dedicated scraping tool might be more suitable.
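As a minimal sketch of the webhook option: a scraper shapes each job listing as JSON and POSTs it to the webhook URL Clay generates for the table. The URL below is a placeholder and the field names are arbitrary choices, since Clay maps incoming JSON keys to table columns.

```python
# Sketch: push one scraped job listing to a Clay table via webhook.
# The URL is a placeholder; Clay generates a unique one when you add a
# Webhook source to a table. Field names here are arbitrary examples.
import json
import urllib.request

CLAY_WEBHOOK_URL = "https://example.com/clay-webhook"  # placeholder

def build_payload(title, company, job_url):
    """Serialize one job listing as the JSON body for a webhook row."""
    return json.dumps({"title": title, "company": company, "job_url": job_url})

def post_job(title, company, job_url, url=CLAY_WEBHOOK_URL):
    """Send one row to the webhook (network call; use a real URL)."""
    req = urllib.request.Request(
        url,
        data=build_payload(title, company, job_url).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)
```

Whatever keys you send in the first payload become the columns Clay offers to map, so it pays to keep the payload shape stable across runs.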
You can try using Apify actors for LinkedIn, Indeed, and Google Jobs.
You can use Apify actors for LinkedIn, Indeed, and Google Jobs to automate scraping into Clay through our Apify integration. Here's how:
1. Connect your Apify account to Clay in Settings > Connections.
2. In your Clay workbook, click "+ Add" and search for "Apify", or use "Add enrichment" on an existing table.
3. Select "Run Apify Actor" and choose the specific actor for LinkedIn, Indeed, or Google Jobs.
4. Configure the input data for the actor in JSON format, specifying your search criteria.
5. Use our "write to table" feature to import the scraped data as rows in your Clay table.
This process lets you automate job board scraping and integrate the data directly into your Clay workflows.
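The actor input in step 4 is plain JSON. As a sketch, it can be built programmatically before pasting it into Clay; the field names below are illustrative assumptions, since each actor documents its own input schema.

```python
# Sketch: build the JSON run input for a job-board Apify actor.
# Field names (queries, location, maxItems) are hypothetical examples --
# check the specific actor's documentation for its real input schema.
import json

def build_run_input(queries, location, max_items=100):
    """Assemble the actor's run input as a dict, ready to serialize."""
    return {"queries": queries, "location": location, "maxItems": max_items}

print(json.dumps(build_run_input(["data engineer"], "Remote"), indent=2))
```

Keeping the input in code makes it easy to regenerate the same JSON when you schedule repeated runs.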
Yeah, but do you know any way to scrape company job boards? Company pages, not LinkedIn or Indeed.
To scrape company job boards or company pages and integrate the data into Clay, you can use our Clay Chrome Extension. Here's how:
1. Install the Clay Chrome Extension from the Chrome Web Store.
2. Navigate to the company's job board or page you want to scrape.
3. Click the Clay Chrome Extension icon and select "Select Data to Add from Page".
4. Choose "Select a List" for job boards with multiple listings, or "Select a Single Attribute" for individual job pages.
5. Map the attributes you want to extract (e.g., job title, description, location).
6. Name and save your recipe.
7. Use our "write to table" feature to import the scraped data into your Clay table.
For pages with similar structures, you can create a URL pattern to apply the same recipe across multiple pages. This method lets you gather job data from various company websites and integrate it directly into your Clay workflows.
How do I make that into an RSS feed? Also, the list page only shows job titles; each click opens the full job description. How do I capture that too for all the links?
To create an RSS feed from scraped company job boards and capture both titles and job descriptions, follow these steps:
1. Use our Clay Chrome Extension to scrape the job board:
• Create a recipe for the list page to capture job titles.
• Create another recipe for individual job pages to capture descriptions.
2. Use the "Scrape Website with Clay" feature to extract data from both the list page and the individual job pages.
3. Use our "write to table" feature to import the scraped data into a Clay table.
4. Set up an RSS feed using a third-party tool that can convert your Clay table data into RSS format.
5. Use our "Pull RSS Feed" feature to integrate this newly created RSS feed back into Clay for further processing or automation.
This approach captures both the titles from the first page and the detailed job descriptions from the individual links.
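Step 4 doesn't strictly need a paid converter. As a minimal sketch, scraped rows can be turned into an RSS 2.0 feed with Python's standard library; the `title`/`url`/`description` keys assume one possible scraped schema.

```python
# Sketch: convert scraped job rows into a minimal RSS 2.0 feed that
# "Pull RSS Feed" can consume once it is hosted somewhere. The row keys
# (title, url, description) are assumptions about your scraped schema.
import xml.etree.ElementTree as ET

def jobs_to_rss(jobs, feed_title="Company Jobs",
                feed_link="https://example.com/jobs"):
    """Build an RSS 2.0 document (as a string) from a list of job dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Scraped job listings"
    for job in jobs:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = job["title"]
        ET.SubElement(item, "link").text = job["url"]
        ET.SubElement(item, "description").text = job.get("description", "")
    return ET.tostring(rss, encoding="unicode")

feed = jobs_to_rss([{"title": "Data Engineer",
                     "url": "https://example.com/jobs/1",
                     "description": "Build pipelines."}])
print(feed)
```

The resulting XML still has to be served from a stable URL (any static host works) before Clay can poll it as a feed.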
Use Claygent to find the career/job pages of companies, use a scraper like Firecrawl to scrape the data from those pages, then use AI to clean that jobs data.
Here's how you can use Claygent and other tools to find and process job data:
1. Use Claygent to find career/job pages:
• In your Clay workspace, go to the enrichment panel and select Claygent under AI tools.
• Create a prompt asking Claygent to find career/job pages for your list of companies.
2. Scrape data with Firecrawl:
• Use Firecrawl or a similar tool to scrape the job data from the pages Claygent found.
3. Import the scraped data into Clay:
• Use our "write to table" feature to import the scraped data into a Clay table.
4. Clean the job data with AI:
• Use Clay's AI capabilities to clean and format the data.
• Use our standardization tools to ensure consistency in fields like job titles or company names.
• Use the AI Formula Generator to create custom cleaning rules based on your specific needs.
This process combines Claygent's AI web scraping, external tools, and Clay's data cleaning features for efficient job data processing.
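Much of the cleaning in step 4 can be done deterministically before any AI pass. A sketch of that idea: normalise whitespace and casing in titles, then dedupe on a (title, company) key; the row keys are assumptions about your scraped schema.

```python
# Sketch: a deterministic cleaning pass over scraped job rows before
# (or instead of) an AI cleanup step. Row keys (title, company) are
# assumptions about the scraped schema.
def clean_jobs(rows):
    """Normalise titles and drop duplicate (title, company) rows."""
    seen, out = set(), []
    for row in rows:
        # Collapse runs of whitespace, then apply title case.
        title = " ".join(row.get("title", "").split()).title()
        key = (title, row.get("company", "").lower())
        if title and key not in seen:
            seen.add(key)
            out.append({**row, "title": title})
    return out

cleaned = clean_jobs([
    {"title": "  data   engineer ", "company": "Acme"},
    {"title": "Data Engineer", "company": "acme"},  # duplicate after cleanup
])
print(cleaned)
```

Pushing the mechanical fixes into code keeps the AI step focused on the genuinely fuzzy cases, like mapping odd titles onto a standard taxonomy.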
Hi Sunny,
There are multiple tools that can create RSS feeds or notify you of website updates, like Feedly. However, I'd recommend creating a Google search operator based on career pages and setting up Google Alerts for those searches. When companies post new job listings, they typically become available on search engines, allowing you to track them through Google search alerts without needing to manually scrape multiple sources.
1. Create a Google search operator
2. Create a Google Alert (RSS)
3. Add it in Clay - Monitor
Let me know if you need more specific guidance on setting up those search operators!
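A sketch of step 1: composing the operator string programmatically so the same pattern can be reused per role or per company domain. The `inurl:` patterns are common career-page conventions, not a guarantee of matching every site.

```python
# Sketch: build a Google search operator targeting career pages.
# The inurl: patterns are common conventions (careers/jobs paths),
# not an exhaustive or guaranteed match for every company site.
def careers_search_operator(role, domains=None):
    """Compose a search operator for job listings for a given role."""
    quoted_role = f'"{role}"'
    inurl = "(inurl:careers OR inurl:jobs)"
    if domains:
        sites = " OR ".join(f"site:{d}" for d in domains)
        return f"{quoted_role} {inurl} ({sites})"
    return f"{quoted_role} {inurl}"

query = careers_search_operator("data engineer", ["example.com"])
print(query)
```

The resulting string goes straight into a Google Alert, whose feed Clay can then monitor via "Pull RSS Feed".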
Thanks Bo. I tried using Apify for now, making changes to the Page Function for each directory, and will hopefully schedule the runs and dedupe the results. I will also try your method out to extract more data. Will update the thread if I face issues. Thanks for helping.
Perfect - Keep us posted! This method is actually free so might be worth trying too!!! :)