Extracting Data from Websites with Claygent: Seeking Prompt Help

Hey everyone, I'm trying to extract data from a website using Claygent, but I'm struggling to find the right prompt. The pages are always built the same way, so in theory all the information can be pulled from the same div class in the HTML. The target website looks like this. The expected output would be the website, email, and phone number from the right-hand side (see screenshot). Can somebody help me with the right prompt? (I'll be using 4o or 4o mini.) 🙏

  • Avatar of Oriol S.
    Oriol S.

    what's your current prompt?

  • Avatar of Julian
    Julian

    Extract the following three pieces of information from the given webpage:
    1. Website URL
    2. Email Address
    3. Telephone Number

    Rules:
      • Focus only on information near or within sections labeled with “Kontakt”, “Contact”, or near the company name/description.
      • Ignore any values found in the footer, header, or social media links (e.g., Twitter, LinkedIn).
      • Do not use any contact information from sections that include “Follow Us”, “Newsletter”, or generic terms like “Support”.
      • Prefer values located within the main body of the page rather than sidebars or peripheral elements.

    Return the output in this format:
      • Website: [URL]
      • Email: [Email Address]
      • Telefon: [Text]

  • Avatar of Julian
    Julian

    Actually, there might be a different issue: the content on the target URL is behind an age check, so Claygent might not be able to access the data?

  • Avatar of Oriol S.
    Oriol S.

    What's the current output with this?

  • Avatar of Oriol S.
    Oriol S.

    Yes, if it has an age check it will not be able to proceed. The same happens with websites requiring logins.

  • Avatar of Julian
    Julian

    it does work for some rows tho - super weird

  • Avatar of Julian
    Julian

    Is there a way to have Clay click the ‘over 18’ CTA?

  • Avatar of Oriol S.
    Oriol S.

    Maybe the website doesn't show the age check prompt for the rows that got filled

  • Avatar of Oriol S.
    Oriol S.

    So it could access the info

  • Avatar of Oriol S.
    Oriol S.

    Try the same prompt on other websites without an age check

  • Avatar of Oriol S.
    Oriol S.

    If it works, you know it's that. If it doesn't, you gotta refine your prompt

  • Avatar of Julian
    Julian

    It's always the same website: the exhibitor detail page from a conference site. So it will always be the same URL except for the exhibitor ID.

  • Avatar of Oriol S.
    Oriol S.

    Got you. Try on a different website without an age check prompt at all.

  • Avatar of Oriol S.
    Oriol S.

    If Claygent doesn't give you a specific error, it might be because the age check prompt was triggered that time

  • Avatar of Oriol S.
    Oriol S.

    But it usually either returns something, or gives out an error

  • Avatar of Julian
    Julian

    Yeah, my prompts usually work on different websites.

  • Avatar of Julian
    Julian

    Is there any way to bypass it for this table?

  • Avatar of Oriol S.
    Oriol S.

    Does the website URL change after the age prompt? If so, you can write a quick formula that gives you the post-age-prompt URL and start scraping from there

  • Avatar of Julian
    Julian

    Unfortunately not, they use cookies for that.

  • Avatar of Julian
    Julian

    Can I train Claygent to accept the popup and get the cookie?

  • Avatar of Oriol S.
    Oriol S.

    Not that I'm aware of. Claygent simply scrapes what it can see on the website, with the added benefit of using AI so it can run logic
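
Editor's sketch: since the thread establishes that the age check is stored in a cookie and that Claygent cannot set it, one option is to fetch the page outside Clay and send the cookie yourself. The snippet below is only a minimal illustration of attaching a cookie to a request with Python's standard library. The URL, the cookie name `age_verified`, and its value `1` are all hypothetical; the real ones would have to be read from the browser's developer tools after accepting the popup, and the site may check more than one cookie.

```python
from urllib.request import Request


def build_request(url: str, cookie_name: str, cookie_value: str) -> Request:
    """Build a GET request that carries a single cookie in its headers."""
    req = Request(url)
    # The Cookie header holds "name=value" pairs, as a browser would send them.
    req.add_header("Cookie", f"{cookie_name}={cookie_value}")
    return req


# Hypothetical URL and cookie; replace with the values seen in the browser.
req = build_request("https://example.com/exhibitors/1234", "age_verified", "1")

# Passing `req` to urllib.request.urlopen would perform the fetch.
# It is not called here, so the sketch runs without network access.
```

Whether this gets past a given age gate depends entirely on how that site implements the check.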

  • Avatar of Tanvi R.
    Tanvi R.

    Hi Julian, thanks for reaching out! Appreciate you Oriol for jumping in here and sharing this great Claygent prompt! How is that working for you now? Another solution you can try here is also using the "Run Zenrows Scrape" enrichment with the auto-parse setting enabled. This will automatically parse the website scrape for website urls, emails, and telephone numbers. Feel free to give this a try for the rows that don't work with Claygent and let me know if this resolves your issues!
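
Editor's sketch: to make concrete what "parsing a scrape for website URLs, emails, and telephone numbers" can look like, here is a rough local approximation using regular expressions. It is not how Zenrows auto-parse is implemented, and the patterns, the `extract_contacts` helper, and the sample text are illustrative assumptions. The regexes are deliberately loose and will miss or over-match some real-world formats.

```python
import re

# Loose, illustrative patterns; real contact data is messier than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
# A leading "+" is optional; digits may be separated by spaces or ( ) / . -
PHONE_RE = re.compile(r"\+?\d[\d ()/.-]{6,}\d")


def extract_contacts(text: str) -> dict:
    """Return de-duplicated URLs, emails, and phone numbers found in text."""

    def unique(matches):
        # dict.fromkeys keeps first-seen order while dropping duplicates.
        return list(dict.fromkeys(m.strip() for m in matches))

    return {
        "websites": unique(URL_RE.findall(text)),
        "emails": unique(EMAIL_RE.findall(text)),
        "phones": unique(PHONE_RE.findall(text)),
    }


# Made-up sample resembling the text of a "Kontakt" block.
sample = """
Kontakt
Example GmbH
Website: https://www.example.com
E-Mail: info@example.com
Telefon: +49 30 1234567
"""

result = extract_contacts(sample)
print(result)
```

Running the extraction only on the text of the contact block, rather than the whole page, would avoid picking up footer or social media values, which is the same constraint the Claygent prompt above tries to express in words.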

  • Avatar of Alberto
    Alberto

    Zenrows is a good option, but is there a way to scrape only certain types of links on a page (ones that match a certain URL pattern)?

  • Avatar of Owen C.
    Owen C.

    Hi Alberto! Yes, Zenrows does provide a way to scrape specific types of links on a page, especially when combined with tools like BeautifulSoup. You can find more detailed information on how to do this in the Zenrows documentation. Let me know if you need further clarification or help! 😊
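
Editor's sketch: filtering links by URL pattern can be illustrated on already-scraped HTML. The example below uses Python's standard-library `html.parser` instead of BeautifulSoup so that it runs without extra installs; the same idea applies to either. The `LinkCollector` class, the `links_matching` helper, the sample HTML, and the `/exhibitors/<id>` pattern are all hypothetical.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in the fed HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def links_matching(html: str, pattern: str) -> list:
    """Return hrefs from html that match the given regular expression."""
    collector = LinkCollector()
    collector.feed(html)
    regex = re.compile(pattern)
    return [href for href in collector.links if regex.search(href)]


# Made-up page fragment with a mix of wanted and unwanted links.
html = """
<a href="/exhibitors/1234">Exhibitor A</a>
<a href="/exhibitors/5678">Exhibitor B</a>
<a href="https://twitter.com/example">Twitter</a>
<a href="/newsletter">Newsletter</a>
"""

# Keep only links that look like exhibitor detail pages.
matches = links_matching(html, r"^/exhibitors/\d+$")
print(matches)
```

Anchoring the pattern with `^` and `$` keeps it from matching longer paths that merely contain the same prefix.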

  • Avatar of Channeled
    Channeled

    We haven't heard back from you in a bit, so we're going to go ahead and close things out here - feel free to let us know if you still need something!
