Apify Actor Domain Input Troubleshooting Guide

Hey, can you please have a look at this table: https://app.clay.com/workspaces/349620/workbooks/wb_2UMetbkw9uuk/tables/t_YsGt2h2AbpgC/views/gv_Z4aJukiDFRio I am trying to pass a domain to an Apify actor. This is the static input (no dynamic value) that I tested and sent from Clay, and it worked:

{
    "aggressivePrune": false,
    "clickElementsCssSelector": "[aria-expanded=\"false\"]",
    "clientSideMinChangePercentage": 15,
    "crawlerType": "cheerio",
    "debugLog": false,
    "debugMode": false,
    "expandIframes": true,
    "ignoreCanonicalUrl": false,
    "includeUrlGlobs": [
        {
            "glob": ""
        }
    ],
    "keepUrlFragments": true,
    "maxCrawlPages": 500,
    "proxyConfiguration": {
        "useApifyProxy": true
    },
    "readableTextCharThreshold": 100,
    "removeCookieWarnings": true,
    "removeElementsCssSelector": "nav, footer, script, style, noscript, svg,\n[role=\"alert\"],\n[role=\"banner\"],\n[role=\"dialog\"],\n[role=\"alertdialog\"],\n[role=\"region\"][aria-label*=\"skip\" i],\n[aria-modal=\"true\"]",
    "renderingTypeDetectionPercentage": 10,
    "saveFiles": false,
    "saveHtml": false,
    "saveHtmlAsFile": false,
    "saveMarkdown": true,
    "saveScreenshots": false,
    "startUrls": [
        {
            "url": "https://onepilot.co/",
            "method": "GET"
        }
    ],
    "useSitemaps": true
}

When I add it with a dynamic domain, I keep getting this error:

{
    "error": {
        "type": "invalid-input",
        "message": "Cannot parse input JSON body: Bad control character in string literal in JSON at position 615"
    }
}
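
For context on the message above: JSON does not allow raw control characters (a real line break or tab, anything below U+0020) inside a string literal; they have to be written as escapes such as `\n`. A minimal Python sketch of the difference, using a made-up one-key body rather than the real actor input:

```python
import json

# Hypothetical one-key bodies, only to contrast the two cases.
escaped = '{"selector": "nav,\\n[role=\\"alert\\"]"}'  # backslash + n: valid JSON
raw = escaped.replace("\\n", "\n")                     # real line break inside the string


def parses(text: str) -> bool:
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(escaped))  # the escape sequence is accepted
print(parses(raw))      # the raw line break is rejected
```

So if anything between the editor and the API turns an escape into the character it stands for, the body stops being valid JSON even though it looks the same on screen.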

I tried:

{
    "startUrls": [
        {
            "url": "{domainInput}",
            "method": "GET"
        }
    ]
}

and

{
    "startUrls": [
        {
            "url": {domainInput},
            "method": "GET"
        }
    ]
}

and

{
    "startUrls": [
        {
            "url": "${domainInput}",
            "method": "GET"
        }
    ]
}

Can you please help me with this? Thanks
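
A note on the three attempts above: the value of `url` has to be a JSON string, so if the placeholder is replaced with the raw cell text (an assumption about how the substitution works), the unquoted `{domainInput}` form leaves a bare token and cannot parse. When the body is built outside Clay, a safer pattern is to let a serializer do the quoting and escaping. A sketch in Python, where `build_body` is a hypothetical helper and the sample value is made up:

```python
import json


def build_body(start_url: str) -> str:
    """Serialize the actor input so quoting and escaping are handled by json.dumps."""
    body = {
        "startUrls": [
            # strip() drops stray spaces or line breaks around the cell value
            {"url": start_url.strip(), "method": "GET"},
        ]
    }
    return json.dumps(body)


# A cell value with a trailing line break, as can happen with pasted data.
print(build_body("https://onepilot.co/\n"))
```

The result is always parseable, whatever characters the cell contains, because the serializer escapes them instead of pasting them in verbatim.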

  • Muhammad S.

    It accepts full URLs like https://onepilot.co/. If you pass just the domain, onepilot.co, it won't work. So make sure you are passing a whole URL rather than a bare domain; I think that could be the problem.
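
If the column holds bare domains, they can be normalized into full URLs before being sent. A minimal Python sketch, assuming the sites are served over https; `to_start_url` is an illustrative name, and the helper drops any query string or fragment:

```python
from urllib.parse import urlsplit


def to_start_url(value: str) -> str:
    """Turn a bare domain or a URL into a URL with a scheme and a path."""
    value = value.strip()
    if "://" not in value:
        value = "https://" + value  # assumption: the site is served over https
    parts = urlsplit(value)
    path = parts.path or "/"
    return f"{parts.scheme}://{parts.netloc}{path}"


print(to_start_url("onepilot.co"))           # https://onepilot.co/
print(to_start_url("https://onepilot.co/"))  # https://onepilot.co/
```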

  • Romeo M.

    hmm. let me see

  • Romeo M.

    Same issue

  • Muhammad S.

    Can you share a screenshot of the section where you entered the domain?

  • Romeo M.

    Tried both of these:

  • Muhammad S.

    {"aggressivePrune":false,"clickElementsCssSelector":"[aria-expanded="false"]","clientSideMinChangePercentage":15,"crawlerType":"cheerio","debugLog":false,"debugMode":false,"expandIframes":true,"ignoreCanonicalUrl":false,"includeUrlGlobs":[{"glob":""}],"keepUrlFragments":true,"maxCrawlPages":500,"proxyConfiguration":{"useApifyProxy":true},"readableTextCharThreshold":100,"removeCookieWarnings":true,"removeElementsCssSelector":"nav, footer, script, style, noscript, svg,\n[role="alert"],\n[role="banner"],\n[role="dialog"],\n[role="alertdialog"],\n[role="region"][aria-label*="skip" i],\n[aria-modal="true"]","renderingTypeDetectionPercentage":10,"saveFiles":false,"saveHtml":false,"saveHtmlAsFile":false,"saveMarkdown":true,"saveScreenshots":false,"startUrls":[{"url":"https://onepilot.co/","method":"GET"}],"useSitemaps":true}
  • Muhammad S.

    Try copy-pasting it as is and see if this works. Never mind, it still has issues and won't work this way; you just have to remove all the extra spacing from within your body and that should fix it.
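
Removing the extra spacing can be done mechanically instead of by hand, which avoids accidentally dropping escapes such as `\"`. A sketch in Python: parse the pretty-printed body and re-serialize it with compact separators. The `minify` helper and the shortened sample body are illustrative.

```python
import json


def minify(pretty: str) -> str:
    """Re-serialize JSON without insignificant whitespace, keeping escapes intact."""
    return json.dumps(json.loads(pretty), separators=(",", ":"))


pretty = """
{
    "clickElementsCssSelector": "[aria-expanded=\\"false\\"]",
    "startUrls": [
        {
            "url": "https://onepilot.co/",
            "method": "GET"
        }
    ]
}
"""
print(minify(pretty))
```

Whitespace between tokens is never significant in JSON, so minifying changes the size of the body but not its meaning.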

  • Romeo M.

    Thanks, but it doesn’t work:

    {
        "error": {
            "type": "invalid-input",
            "message": "Cannot parse input JSON body: Expected ',' or '}' after property value in JSON at position 69"
        }
    }
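
Offset 69 is informative here. In the single-line body that was pasted, the inner quotes of `[aria-expanded="false"]` are not escaped, so the string ends at the quote after `=` and the parser then meets `false` where it expects `,` or `}`. A small Python sketch for locating such offsets; `explain` is an illustrative helper, and Python words the message differently from Apify:

```python
import json


def explain(text: str, width: int = 20) -> str:
    """Return the parser message plus the text around the failing offset."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - width, 0)
        return f"{err.msg} at {err.pos}: ...{text[start:err.pos + width]}..."
    return "valid JSON"


# Start of the body as posted, with the inner quotes left unescaped.
broken = '{"aggressivePrune":false,"clickElementsCssSelector":"[aria-expanded="false"]"}'
print(explain(broken))
```

Escaping the inner quotes as `\"` is what makes the same body parse.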

  • Muhammad S.

    {"aggressivePrune":false,"clickElementsCssSelector":"[aria-expanded=\"false\"]","clientSideMinChangePercentage":15,"crawlerType":"cheerio","debugLog":false,"debugMode":false,"expandIframes":true,"ignoreCanonicalUrl":false,"includeUrlGlobs":[{"glob":""}],"keepUrlFragments":true,"maxCrawlPages":500,"proxyConfiguration":{"useApifyProxy":true},"readableTextCharThreshold":100,"removeCookieWarnings":true,"removeElementsCssSelector":"nav, footer, script, style, noscript, svg,\n[role=\"alert\"],\n[role=\"banner\"],\n[role=\"dialog\"],\n[role=\"alertdialog\"],\n[role=\"region\"][aria-label*=\"skip\" i],\n[aria-modal=\"true\"]","renderingTypeDetectionPercentage":10,"saveFiles":false,"saveHtml":false,"saveHtmlAsFile":false,"saveMarkdown":true,"saveScreenshots":false,"startUrls":[{"url":"https://onepilot.co/","method":"GET"}],"useSitemaps":true}

    Sorry, the previous one was incorrect. Try this one, and if it still does not work, the Clay team will be able to have a look and troubleshoot.

  • Romeo M.

    That one was sent, and I had sent it before, but the challenge is sending dynamic domains.

  • Muhammad S.

    You can add a dynamic domain in there: just copy the body as is and replace the URL with the dynamic domain. Try it with "" around the dynamic domain and without "" as well.

  • Romeo M.

    OK, will do, thanks. By the way, it also shows a timeout for the cell, so I can't get the data back 😞

  • Muhammad S.

    That's a typical Apify thing. How long does it take for the actor to finish its run on average?

  • Muhammad S.

    What exactly are you trying to achieve? There might be alternative, easier solutions to that.

  • Romeo M.

    Here it is:

    {
        "error": {
            "type": "invalid-input",
            "message": "Cannot parse input JSON body: Bad control character in string literal in JSON at position 473"
        }
    }

    I tried with “” and without. Bummer.
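
Counting characters in the single-line body posted earlier, offset 473 lands right after `svg,` inside `removeElementsCssSelector`, which is where the first `\n` escape sits, not in `startUrls`. One plausible reading is that the `\n` escapes are being turned into real line breaks before the body reaches Apify; that is an inference from the offset, not something confirmed in the thread. The Python sketch below reproduces the offset by simulating that corruption, then shows a possible workaround: CSS selector lists do not need line breaks, so the selector can use spaces instead.

```python
import json

# The body from the thread, in the same key order.
body = {
    "aggressivePrune": False,
    "clickElementsCssSelector": '[aria-expanded="false"]',
    "clientSideMinChangePercentage": 15,
    "crawlerType": "cheerio",
    "debugLog": False,
    "debugMode": False,
    "expandIframes": True,
    "ignoreCanonicalUrl": False,
    "includeUrlGlobs": [{"glob": ""}],
    "keepUrlFragments": True,
    "maxCrawlPages": 500,
    "proxyConfiguration": {"useApifyProxy": True},
    "readableTextCharThreshold": 100,
    "removeCookieWarnings": True,
    "removeElementsCssSelector": (
        'nav, footer, script, style, noscript, svg,\n[role="alert"],\n'
        '[role="banner"],\n[role="dialog"],\n[role="alertdialog"],\n'
        '[role="region"][aria-label*="skip" i],\n[aria-modal="true"]'
    ),
    "renderingTypeDetectionPercentage": 10,
    "saveFiles": False,
    "saveHtml": False,
    "saveHtmlAsFile": False,
    "saveMarkdown": True,
    "saveScreenshots": False,
    "startUrls": [{"url": "https://onepilot.co/", "method": "GET"}],
    "useSitemaps": True,
}
minified = json.dumps(body, separators=(",", ":"))

# Simulate the suspected corruption: every "\n" escape becomes a real line break.
corrupted = minified.replace("\\n", "\n")
first_bad = min(i for i, ch in enumerate(corrupted) if ord(ch) < 0x20)
print(first_bad, repr(corrupted[first_bad - 4:first_bad]))  # 473 'svg,'

# Workaround sketch: collapse all whitespace in the selector to single spaces.
body["removeElementsCssSelector"] = " ".join(body["removeElementsCssSelector"].split())
safe = json.dumps(body, separators=(",", ":"))
print("\\n" in safe)  # False: no newline escapes left to be mangled
```

If the error disappears with the flattened selector, that would support the reading above; if it persists, the dynamic value itself is the next thing to check for hidden line breaks or tabs.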

  • Romeo M.

    Clay S. any ideas?

  • Romeo M.

    Muhammad S., I don't really need the data back, as I am scraping the website and sending it to an OpenAI vector database for now.

  • Muhammad S.

    The same input works just fine when I add "" around the company domain. Maybe you need to delete your Apify column and create a new one?

  • Bruno R.

    Hi Romeo, thanks for reaching out. Taking a look at this now - can you share the documentation of the Apify actor you're trying to set up in this table? I'll be happy to take a look at what's causing the error here. Are you currently working with the first or the second Apify column?

  • Romeo M.

    I still need to try Muhammad S.'s option in the meantime.

  • Bruno R.

    Based on the context you provided, I’ve set up a new Apify actor run column called "clay support apify setup." The scraper has been successfully configured, and it's now up and running. Below is the JSON body structure that works for this setup: https://downloads.intercomcdn.com/i/o/w28k1kwz/1234379674/7c03870d1d5ea42827351db8d3ca/CleanShot+2024-10-30+at+14_21_27%402x.png?expires=1730314800&signature=be77e944c46e770a3e18179a11375145b8d198b5f3f536fe0b2ab5c65937a4cd&req=dSIkEsp5lIdYXfMW1HO4zeFve%2BRRFaVkCcCXs5Yelvz42JP1zyIp6w326kFO%0An%2Bxw%0A I hope this was helpful. If there's anything else I can assist you with, please let me know!

  • Romeo M.

    Bruno R., I see you added it, but we can't run it like this because:

    {
        "error": {
            "type": "actor-memory-limit-exceeded",
            "message": "By launching this job you will exceed the memory limit of 32768MB for all your Actor runs and builds (currently used: 32768MB, requested: 8192MB). Please consider upgrading or purchasing extra memory as an add-on at https://console.apify.com/billing/subscription to increase your Actor memory limit."
        }
    }
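
The numbers in this error can be read directly: the account limit is 32768 MB, all of it is already in use, and the new run asks for another 8192 MB. Four runs of that size would fill the limit, so a further one is rejected until something finishes or each run requests less memory. A short Python sketch of the arithmetic; `can_start` is an illustrative name, not an Apify API:

```python
def can_start(limit_mb: int, used_mb: int, requested_mb: int) -> bool:
    """A run fits only if current usage plus the request stays within the limit."""
    return used_mb + requested_mb <= limit_mb


limit_mb, used_mb, requested_mb = 32768, 32768, 8192  # figures from the error message

print(can_start(limit_mb, used_mb, requested_mb))  # False
print(limit_mb // requested_mb)                    # 4
```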

  • Romeo M.

    Bruno R. There are some important functions here:

    {
        "aggressivePrune": false,
        "clickElementsCssSelector": "[aria-expanded=\"false\"]",
        "clientSideMinChangePercentage": 15,
        "crawlerType": "cheerio",
        "debugLog": false,
        "debugMode": false,
        "expandIframes": true,
        "ignoreCanonicalUrl": false,
        "includeUrlGlobs": [
            {
                "glob": ""
            }
        ],
        "keepUrlFragments": true,
        "maxCrawlPages": 500,
        "proxyConfiguration": {
            "useApifyProxy": true
        },
        "readableTextCharThreshold": 100,
        "removeCookieWarnings": true,
        "removeElementsCssSelector": "nav, footer, script, style, noscript, svg,\n[role=\"alert\"],\n[role=\"banner\"],\n[role=\"dialog\"],\n[role=\"alertdialog\"],\n[role=\"region\"][aria-label*=\"skip\" i],\n[aria-modal=\"true\"]",
        "renderingTypeDetectionPercentage": 10,
        "saveFiles": false,
        "saveHtml": false,
        "saveHtmlAsFile": false,
        "saveMarkdown": true,
        "saveScreenshots": false,
        "startUrls": [
            {
                "url": "https://onepilot.co/",
                "method": "GET"
            }
        ],
        "useSitemaps": true
    }

  • Bruno R.

    Hi Romeo, thanks for following up! I’ve optimized the parameters by reducing crawl depth, page limits, concurrency, and session rotations to stay within memory limits. This adjustment should resolve the issue. Feel free to tweak the current setup further if needed, and it should now run smoothly.

  • Channeled (APP)

    We haven't heard back from you in a bit, so we're going to go ahead and close things out here - feel free to let us know if you still need something!