Hi, I have an Apify task that ran 10 times last week. I want to get the data from all of those runs into the table. I have tried 3 methods:
Native Apify integration: it does not show me all the runs to choose from, and it only loads the last run by default.
API call: when I call the dataset/item API, the data exceeds the cell size limit.
How can I do this? I am fairly new to this.
Unfortunately, this can't be done via an API call. You'll have to manually download the datasets from Storage in your Apify account and upload them to Clay.
Thanks for jumping in Adnan. Hey Utsav, happy to provide some context! To import data from all the runs of your Apify task into a Clay table, you should be aware that the native Apify integration in Clay is designed to pull data from the latest run by default. If you want to import data from specific runs, you would typically need to enable a feature that allows pulling data from those specific runs. For your second method using an API call, you've encountered an issue with the data exceeding cell limitations. This is a common problem when dealing with large datasets. One way to handle this is by paginating the GET request to retrieve the dataset in smaller chunks. You can add query parameters like offset and limit to your GET request to manage the size of the data you are importing. Or you can use the Field paths to Return to pull in selected values (see image). That should also solve your cell limitation error. https://downloads.intercomcdn.com/i/o/w28k1kwz/1234200555/bedbc6d1567c00d200a601fc4eda/Berrycast_HZko0Vb1tm.png?expires=1730306700&signature=6497f7a0170a29605f6dc78d6266d40812253e0200c0421080516436602143be&req=dSIkEst%2BnYRaXPMW1HO4zZ80KbMZzvJKmVYBzlKMc4fmw0Ack3O9ekkwAJ9N%0AiH2U%0A Let me know if you need additional help on this!
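To make the pagination suggestion above concrete, here is a minimal sketch of walking a dataset in chunks with offset and limit. It assumes the Apify dataset items endpoint has the form https://api.apify.com/v2/datasets/{datasetId}/items and accepts token, offset, limit, and fields query parameters. The dataset ID, token, and field names shown are placeholders, and the helper names are made up for illustration, so please check them against your own account and the Apify API docs before relying on this.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List

# Assumed base URL of the Apify API (not stated in this thread).
API_BASE = "https://api.apify.com/v2"


def build_items_url(dataset_id: str, token: str, offset: int, limit: int,
                    fields: str = "") -> str:
    """Build the URL for one page of dataset items (hypothetical helper)."""
    params = {"token": token, "offset": offset, "limit": limit, "format": "json"}
    if fields:
        # Comma-separated field names, to keep each row small.
        params["fields"] = fields
    query = urllib.parse.urlencode(params)
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def fetch_page(url: str) -> List[dict]:
    """Download one page and parse it as a JSON array of items."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def iter_items(get_page: Callable[[int, int], List[dict]],
               page_size: int = 100) -> Iterator[dict]:
    """Yield every item by requesting pages until an empty page comes back."""
    offset = 0
    while True:
        page = get_page(offset, page_size)
        if not page:
            break
        yield from page
        # Advance by what was actually returned, in case the server caps limit.
        offset += len(page)


if __name__ == "__main__":
    DATASET_ID = "YOUR_DATASET_ID"   # placeholder
    TOKEN = "YOUR_APIFY_TOKEN"       # placeholder

    def get_page(offset: int, limit: int) -> List[dict]:
        url = build_items_url(DATASET_ID, TOKEN, offset, limit, fields="title,url")
        return fetch_page(url)

    for item in iter_items(get_page, page_size=100):
        print(item)
```

In Clay itself, the equivalent would be one HTTP API call per offset/limit pair, so each response stays under the cell limit. The fields value plays a similar role to Field paths to Return, trimming each item down to the values you need.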
Got it, I will explore pagination. For the integration, how do I enable pulling from all runs?
Hi Utsav, while you are not able to pull all runs from an Apify actor at once, what you can do is add each run as a source in your Apify table. Create a new table with Apify and select the appropriate actor and the latest task. You can continue adding sources within this table and select all of the relevant tasks you'd like to pull data from. This way, you can populate all of your Apify runs into your table. https://downloads.intercomcdn.com/i/o/w28k1kwz/1234523962/3c1f05a55b0bc99fabdde5eec567/CleanShot+2024-10-30+at+16_29_14%402x.png?expires=1730322000&signature=22fe711a2e4aec7c539952319be9539169bfbfb532c852a2b77b93064cffd5d8&req=dSIkEsx8nohZW%2FMW1HO4zcgVUFTvcR9DncjVCTt%2BIpxnEZV7wkUrcJVVpVji%0AgJSc%0A
Let me know if this makes sense and if you need further assistance with your setup!
We haven't heard back from you in a bit, so we're going to go ahead and close things out here - feel free to let us know if you still need something!