Hey guys, I'm trying to extract some data from a commercial real estate website. The first row's URL worked fine with the Google Chrome plugin, but the other rows are returning null data.
👋 Hey there! Our support team has got your message - we'll be back in touch within 24 hours (often sooner!). If you haven't already, please include the URL of your table in this thread so that we can help you as quickly as possible!
Any update on this? Looks like it's hit or miss sometimes.
Hey Rahul, thanks for reaching out! Every website has specific rules for scraping and crawling, which are usually found in the website's /robots.txt file. For your website, I checked Crexi's robots.txt. The key point is that they don't allow scraping or crawling tools to visit their site. Tools can still ignore these restrictions, but there's a good chance the requests will fail, which may be why some rows come back null. One thing I've tried is adding a delay to the scraper: I increased it to 10 seconds so the page has a moment to load before scraping starts. This improved the results! Let me know how that works for you, or if you'd like further assistance! 😊
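For anyone who wants to try the same two ideas outside the plugin, here is a minimal Python sketch. It is not how the plugin works internally. It assumes you already have the robots.txt text as a string, and the `load` and `extract` callables, the `my-scraper` user agent, and the `example.com` URLs are hypothetical placeholders. The sample rules below are made up for illustration and are not Crexi's actual file.

```python
import time
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def scrape_with_delay(urls, load, extract, delay_seconds=10.0, sleep=time.sleep):
    """Load each URL, wait delay_seconds, then extract data from the page.

    load and extract are hypothetical callables supplied by the caller;
    sleep is injectable so the wait can be skipped in tests.
    """
    results = {}
    for url in urls:
        page = load(url)
        sleep(delay_seconds)  # give the page a moment before extraction starts
        results[url] = extract(page)
    return results


if __name__ == "__main__":
    # Made-up rules that block every crawler from the whole site.
    sample_rules = "User-agent: *\nDisallow: /\n"
    print(is_allowed(sample_rules, "my-scraper", "https://example.com/properties/1"))
```

If `is_allowed` returns False for a row's URL, a longer delay may reduce failures but will not change the fact that the site asks crawlers to stay away.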
We haven't heard back from you in a bit, so we're going to go ahead and close things out here - feel free to let us know if you still need something!