
Request for Help Scraping Data from a Website Using Clay

How can I give a website URL to Clay and ask it to scrape data about the companies and people on it? Can you please help?

  • Channeled (APP)

    You can scrape website data using our Chrome Extension. Here's how to get started:

    1. Install the Clay Chrome Extension:
       • Go to the Chrome Web Store
       • Search for "Clay"
       • Click "Add to Chrome" to install
    2. Once installed:
       • Click the Clay Extension icon in your toolbar
       • Select the "Select Data" option
       • Navigate to the website you want to scrape

    The scraper can extract various types of data, including:
    • Body text
    • Links
    • Emails
    • Phone numbers
    • Keywords
    • Social links
    • Images

    Our native scraper is specifically designed to handle company websites and can easily extract common business information. It's particularly effective for static pages like company homepages and contact pages.

    Sources: Scrape Website with Clay · Get transcript of any YouTube video in seconds | Clay · Deep Dive: Clay Scrapers [Limitless Research] - Clay University

  • Channeled (APP)

    Hi Noor S.! This thread was recently closed by our Support team. If you have a moment, please share your feedback.

  • Noor S.

    I've added the extension but I still can't get what I want

  • Channeled (APP)

    Let me help you understand how to effectively use the Chrome Extension to scrape data. There are two main ways to extract data:

    1. Using Auto-detect:
       • The extension automatically tries to identify lists on your page
       • You'll see detected items ready for scraping
    2. Creating Custom Recipes (if auto-detect doesn't work):
       • Click "Select Data to Add from Page"
       • Choose either "Select a List" for pages with multiple items or "Select a Single Attribute" for individual pages
       • Click on the items you want to scrape
       • Select specific attributes (like company name, location, description)
       • Save your recipe

    You'll see a preview of your scraped data in a table format before finalizing. Once you're satisfied with the selected data, you can export it as a CSV, copy it to your clipboard, or add it directly to your Clay workspace.

    Sources: Deep Dive: Clay Chrome Extension [Limitless Research] - Clay University · Clay Chrome extension | Documentation | Clay University

  • Noor S.

    Here is a link: https://resources.hse.gov.uk/convictions/breach/breach_list.asp?ST=B&EO=%3D&SN=P&SF=HRE&SV=565 In simple words, I need to get the list of companies that were fined after an HSE breach.

  • Noor S.

    Even with auto-detect, I can't do it.

  • Noor S.

    Can you help me prepare a list?

  • Channeled (APP)

    Here's how you can create a list of companies with HSE breaches using the Clay Chrome Extension:

    1. Navigate to your HSE breaches webpage
    2. Click the Clay Extension icon and select "Select Data to Add from Page"
    3. Choose the "Select a List" option
    4. Click on the first few company entries to help the extension detect the pattern
    5. Add the specific attributes you want to capture by selecting:
       • Company names
       • Fine amounts
       • Breach descriptions
       • Dates
       • Any other relevant information

    The extension will create a preview of your scraped data in a table format. Once you're satisfied with the selected data, you can save it to your Clay workspace.

    If the auto-detect feature doesn't work effectively, you can create a custom recipe:

    1. Click "Select Data"
    2. Create a new recipe with a specific name (e.g., "HSE Breaches")
    3. Set the URL pattern
    4. Manually select the data points you want to capture

    Sources: Clay Chrome extension | Documentation | Clay University · Deep Dive: Clay Chrome Extension [Limitless Research] - Clay University

  • Channeled (APP)

    Looks like you're still looking for help. We're escalating this over to our support team and someone should be in touch soon. In the meantime, feel free to include any other information that may be helpful in solving your issue!

  • Tanvi R.

    Hey there, thanks for reaching out! Our old Chrome extension for scraping websites is now deprecated, so unless you already have it installed, you won't be able to use it. Instead, you can achieve this with Clay's Scrape Website enrichment and a formula column. I'll create a quick tutorial showing you what steps to take and send it to you!
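
A rough sketch of the first half of that workflow, for anyone who wants to see what the Scrape Website step amounts to outside of Clay: the snippet below fetches the same HSE breach-list page and pulls out its visible body text, roughly the bodyText the enrichment passes along to later columns. This is an illustrative approximation using standard Python libraries, not Clay's actual enrichment.

```python
# Illustrative only: fetch the HSE breach-list page and extract its visible
# body text, similar in spirit to Clay's "Scrape Website" bodyText output.
# Requires: pip install requests beautifulsoup4
import requests
from bs4 import BeautifulSoup

URL = (
    "https://resources.hse.gov.uk/convictions/breach/breach_list.asp"
    "?ST=B&EO=%3D&SN=P&SF=HRE&SV=565"
)

response = requests.get(URL, timeout=30)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")
body_text = soup.get_text(separator="\n", strip=True)

print(body_text[:2000])  # preview the raw text before cleaning it up
```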

  • Nicole N.

    Can I get that too? I've been trying to use the Chrome extension and can't get it to work.

  • Noor S.

    Tanvi R. Thanks, can you share the video link, please?

  • Tanvi R.

    Sure thing!

  • Tanvi R.

    Here's a tutorial for you on how I was able to scrape these pages: https://www.loom.com/share/2f3f6f2abe2b4b8082559217c0da3356?sid=f0e5b4d2-f641-4e07-867b-d3930b694e1c

  • Tanvi R.

    This is the AI prompt I used:

    There are 10 rows of data I scraped from a website. I need your help to clean up this data and return it as a JSON array. Here is the website data: bodyText

    Go through the full bodyText and pull out the following fields:
    Case/Breach
    Defendant's Name
    Hearing Date
    Result
    Fine
    Act or Regulation

    Here is an example:
    Case/Breach: 47803630/01
    Defendant's Name: All Clear Services Limited
    Hearing Date: 13/11/2024
    Result: Guilty-Fine
    Fine: 18,000.00
    Act or Regulation: Work at Height Regulations 2005 / 4 / 1

    Return 10 arrays with the above fields
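
If the scraped bodyText happens to come back as labeled "Field: value" lines like the example in that prompt, the cleanup could also be done deterministically instead of with an AI column. The sketch below assumes that layout, which may not match the real page output exactly; the AI prompt above is the more forgiving route when the text is messy.

```python
# Sketch under an assumed layout: parse labeled "Field: value" lines from the
# scraped bodyText into a JSON array with the same fields the prompt asks for.
import json
import re

FIELDS = [
    "Case/Breach",
    "Defendant's Name",
    "Hearing Date",
    "Result",
    "Fine",
    "Act or Regulation",
]

def parse_breaches(body_text: str) -> list[dict]:
    records, current = [], {}
    for line in body_text.splitlines():
        match = re.match(r"\s*([^:]+):\s*(.+)", line)
        if not match:
            continue
        label, value = match.group(1).strip(), match.group(2).strip()
        if label not in FIELDS:
            continue
        if label == FIELDS[0] and current:
            records.append(current)  # first field repeats => start a new record
            current = {}
        current[label] = value
    if current:
        records.append(current)
    return records

# Example using the record shown in the prompt above.
sample = """Case/Breach: 47803630/01
Defendant's Name: All Clear Services Limited
Hearing Date: 13/11/2024
Result: Guilty-Fine
Fine: 18,000.00
Act or Regulation: Work at Height Regulations 2005 / 4 / 1"""

print(json.dumps(parse_breaches(sample), indent=2))
```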

  • Channeled (APP)

    Hey there - just wanted to check in here to see if you needed anything else! Feel free to reply back here if you do.

  • Channeled (APP)

    We haven't heard back from you in a bit, so we're going to go ahead and close things out here - feel free to let us know if you still need something!