
Issues with GPT 4.1 Integration: Bugs and Unreliable Responses

·
·

Really loving the integration of GPT 4.1! However, it seems to be super buggy as sometimes it only returns answers in 5/10 cases. I had to re-run 4 records in the first column too. What is wrong here??

  • Avatar of Channeled
    Channeled
    APP
    ·
    ·

    We're sorry to hear you're experiencing issues with the GPT 4.1 integration. The intermittent responses and need to re-run records could be due to a few factors:

    1. API usage limits: Ensure your OpenAI API plan has sufficient tokens per minute (TPM) for reliable use in Clay. For ChatGPT columns, we recommend 30,000 TPM.
    2. Network connectivity: Unstable internet connections can cause intermittent failures.
    3. Browser cache: Persistent error messages can sometimes be resolved by clearing your browser cache and refreshing the view.

    To troubleshoot, try these steps:

    1. Check your OpenAI API usage and limits at https://platform.openai.com/account/rate-limits
    2. Clear your browser cache and refresh Clay
    3. Ensure a stable internet connection

    If the issue persists, could you provide more details about the specific GPT 4.1 integration you're using in Clay?
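
A quick way to verify the first point, as a minimal sketch: OpenAI returns rate-limit headers on every API response, so a single small request shows the TPM ceiling and remaining budget on a key. This only applies if the column runs on your own OpenAI API key rather than Clay credits, and the header names are taken from OpenAI's documented x-ratelimit-* convention.

    import os
    import requests

    # Send one tiny chat completion and inspect the rate-limit headers.
    # Assumes OPENAI_API_KEY is set and that the key has access to gpt-4.1.
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"model": "gpt-4.1", "messages": [{"role": "user", "content": "ping"}]},
        timeout=30,
    )
    resp.raise_for_status()

    for header in (
        "x-ratelimit-limit-tokens",      # tokens per minute allowed on this key
        "x-ratelimit-remaining-tokens",  # tokens left in the current window
        "x-ratelimit-reset-tokens",      # time until the token window resets
    ):
        print(header, "=", resp.headers.get(header))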

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    I am using the GPT 4.1 Integration with Clay Credits so rate limits can't be an issue on my side 😕

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    Would really appreciate the help, as the results I actually get are super promising!!!

  • Avatar of LuisArturo
    LuisArturo
    ·
    ·

    Hey there Maximilian, thanks for reaching out and sorry for the delay here. I took a look at the setup and nothing is out of order. This may come down to some connection issues between the model and Clay, since it is a recent launch. I'll check in with the team about this.

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    Thank you LuisArturo, will monitor the next runs! Do you think it could be related to total tokens exceeding some limits?

  • Avatar of Smit V.
    Smit V.
    ·
    ·

    Hi Maximilian, that seems unlikely, because the cell errors out when the token limit is exceeded. Like Arturo mentioned, it seems to be some sort of connection issue because the model is brand new. Let me know if I can help you with anything else :)

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    Alright Smit V. 😁 I just saw that whenever it returned something like "sorry but I have no answer", it took about 20+ steps compared to 10-14 for the successful runs

  • Avatar of Bo (.
    Bo (.
    ·
    ·

    That makes sense — when the model can’t find a clear answer, it likely takes more steps attempting to reach one before eventually giving up. This would explain the 20+ steps compared to the usual 10–14 in successful runs.
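
As an illustration only (this is not Clay's actual implementation), the step pattern Bo describes is what any budgeted research loop looks like: a run that finds an answer exits early, while a run that never does burns the whole budget and then returns a fallback string. The max_steps value and fallback text here are made up for the sketch.

    from typing import Callable, Optional

    def run_research(step: Callable[[int], Optional[str]], max_steps: int = 25) -> str:
        """Call `step` until it yields an answer or the step budget runs out."""
        for i in range(1, max_steps + 1):
            answer = step(i)
            if answer is not None:
                return answer                        # conclusive result: stop early (~10-14 steps)
        return "Sorry, I could not find an answer."  # budget exhausted: fallback reply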

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    I will try making my prompt more specific about where to look, and hopefully this avoids the 20+ steps! In the meantime: is there any way to re-run these cells, since they contain an "answer" and not an error?

  • Avatar of Bo (.
    Bo (.
    ·
    ·

    Absolutely! You can re-run those cells a couple of ways:

    1. To re-run specific rows: Click the first row you want, hold Shift, click the last row, then right-click and select Run Columns.
    2. To re-run the entire column: Click the column header, then choose Run Column > Force Run All Rows.

    Let me know if you'd like help fine-tuning your prompt too.

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    Hey Bo (.! I have now split up the prompts into smaller focus areas and it has slightly improved. However, there are still about 10% of runs with the response "I could not find an answer sorry" and 20 steps executed. Can someone please help me figure out how to fix this issue, as I need to implement this in our automation workflow very soon? That would be amazing 🙂 !!!!

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    FYI: I have manually re-run the cells because I need to continue building and testing the pipeline.

  • Avatar of Bo (.
    Bo (.
    ·
    ·

    Hey Maximilian, I totally hear you, and honestly, what you're seeing is pretty normal. Even with a well-structured prompt, it's expected that some rows won't return an answer, especially if the input data is sparse or unclear. The AI will still go through its process, often using 20+ steps, trying to find something before it gives up. That fallback behavior is built in.

    Your move to split the prompt into more focused parts was smart; that usually helps reduce failure rates. If you're seeing about 10% return with "I could not find an answer," that's within a typical range, but we can still look at ways to tighten the prompt or handle those edge cases more gracefully in your workflow.

    Feel free to share the rows where it gave up and the ones where it did find an answer, that way we can help troubleshoot. Let me know if you have more questions.
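
For handling those edge cases in an automation rather than by hand, one hedged option is to treat the known "no answer" phrasings as retryable instead of final. Everything below is a sketch: fetch_answer stands in for however your pipeline actually reads the GPT 4.1 column value for a row (export, HTTP call, etc.), and the marker list should match the refusal wording you actually see.

    import time
    from typing import Callable

    NO_ANSWER_MARKERS = (
        "could not find an answer",
        "i have no answer",
    )

    def looks_like_refusal(text: str) -> bool:
        lowered = text.lower()
        return any(marker in lowered for marker in NO_ANSWER_MARKERS)

    def answer_with_retries(fetch_answer: Callable[[], str],
                            attempts: int = 3,
                            backoff_seconds: float = 5.0) -> str:
        """Re-run a row up to `attempts` times while it keeps returning a refusal."""
        last = ""
        for attempt in range(attempts):
            last = fetch_answer()
            if not looks_like_refusal(last):
                return last                              # a real answer: keep it
            time.sleep(backoff_seconds * (attempt + 1))  # back off, then re-run the row
        return last                                      # still a refusal after all attempts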

  • Avatar of Maximilian J.
    Maximilian J.
    ·
    ·

    Hello Bo (. 🙂 Okay, this is somewhat new to me, as Claude 3.5 before almost always returned an answer, although the quality was not on the same level as it is with 4.1 now... What typically helped is simply re-running the cell once or a few times until there is a real response. I can do this for manual qualification at scale, but this won't work for automations 😕

  • Avatar of Bo (.
    Bo (.
    ·
    ·

    Right — if re-running the cell gives you a proper response, it usually means the prompt might need some tweaking. Try identifying what’s missing or not working in the initial run, and then adjust the prompt to guide the model more clearly. And if Claude was working well for your use case, it’s totally fine to stick with it — no need to switch if it’s more reliable for what you need.
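
On the prompt-tweaking side, one sketch (not an official Clay recommendation) is to spell out exactly which inputs the model may use and to ask for a fixed sentinel such as NOT_FOUND when the evidence isn't there, so an automation can branch on it instead of pattern-matching apology text. The column references and the qualification question below are placeholders for whatever the table actually contains.

    # Placeholder prompt for an AI column; the {{...}} tokens stand for real column references.
    PROMPT_TEMPLATE = """Use ONLY these inputs:
    - Company name: {{Company Name}}
    - Website text: {{Website Scrape}}

    Task: answer in one sentence whether the company matches our qualification criteria.

    Rules:
    - Do not take extra research steps beyond the inputs above.
    - If the inputs do not contain enough evidence, reply with exactly: NOT_FOUND
    """

    def is_usable(response: str) -> bool:
        # Downstream automation branches on the sentinel instead of apology phrasing.
        return response.strip() != "NOT_FOUND"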