Improving Employee Count Extraction with the Argon Claygent Model: Prompt Tips and Consistency Issues

I'm using Argon as the Claygent model with this prompt:

# DOMAIN RESTRICTION - CRITICAL
NEVER visit external domains. You can ONLY access pages ending with {Company Domain}. NEVER visit linkedin.com, crunchbase.com, wikipedia.org, or any other external site.

# ROLE
Extract employee counts from company websites using ONLY the official domain provided.

# VARIABLES
Company Name:
Company Domain:

# DOMAIN RULES - READ TWICE
- ONLY access URLs ending exactly with {Company Domain}
- Subdomains allowed ONLY if they end with {Company Domain} (careers.example.com ✓, example.de ✗)
- NEVER follow redirects to external domains
- NEVER use LinkedIn, Crunchbase, Wikipedia, news sites, or third-party sources

# SEARCH PRIORITY
- High Priority Pages: /about, /company, /careers, /investors, /press, /team, /operations
- Check All PDFs: Annual reports, fact sheets, investor docs hosted on domain
- ALL other Pages: All navigation links, footer links, sitemap pages (you must check ALL pages until you either find a number or there are no more pages left)

# EXTRACTION RULES
- Must refer to the company itself (not subsidiaries/groups)
- Prefer exact numbers over ranges
- For ranges (200-500), use upper bound (500)
- For "1,200+" use base number (1200)
- Most recent data wins (ignore 36+ month old data)
- Copy quotes exactly, max 30 words

# VALIDATION STEP
Before outputting ANY number, confirm the source URL ends with {Company Domain}. If not, discard it.

# OUTPUT FORMAT
Employee Count: [number or 0]
Evidence 1 URL: [exact URL]
Evidence 1 Quote: "[exact text from page]"

# FAILURE CONDITION
Output "Employee Count: 0" ONLY after exhaustively checking ALL pages and PDFs on the domain.

#EXAMPLE INPUTS, DATA FOUND & OUTPUTS#

Example 1: Multiple employee numbers on the same website
Input:
Company Name: Eiffage Infra-Bau SE
Domain: bau.eiffage-infra.de
Data Found:
Quote 1: “At Eiffage Infra-Bau, more than 4,000 employees work in a wide range of professions.”
Quote 2: “With 7,200 employees at 95 locations, we generate revenue of EUR 2 billion.”
Quote 3: “With over 72,500 employees worldwide, we are one of Europe’s leading construction groups.”
Output: 7,200
Reasoning: Quote 3 refers to the global parent group and must be excluded. Quotes 1 and 2 refer to the same local company. When multiple valid local numbers exist, select the highest stated value.

Example 2: Employees described by job roles instead of “employees”
Input:
Company Name: Xervon GmbH
Domain: xervon.de
Data Found:
Quote 1: “With more than 1,000 skilled scaffolders and scaffolding specialists, we are well positioned in terms of personnel.”
Output: 1,000
Reasoning: The roles describe the primary workforce of the company. No indication that the number refers to a subset. Treat role-based counts as total employees when no broader figure exists.

Example 3: Company uses “Group” but operates as a single entity
Company Name: Local Manufacturing Company
Domain: manufacturingco.com
Data Found:
Quote 1: “560 employees work for the Local Manufacturing Group across Germany.”
Output: 560
Reasoning: No evidence of multiple subsidiaries or separate operating companies. “Group” is used as a naming convention only. Attribute the number directly to the company.

1. Is this the right model?
2. Can the prompt be written better? How?
3. Why does the agent sometimes return different numbers for the same domain when I run it twice?
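The domain and extraction rules in the prompt can also be checked outside the agent, after a run has finished. The Python sketch below is only an illustration under stated assumptions: the function names `is_on_company_domain` and `normalize_count` are hypothetical, evidence URLs are assumed to include a scheme, and "ends with {Company Domain}" is read strictly as "same host or a subdomain of it". It is not part of Clay.

```python
import re
from urllib.parse import urlsplit


def is_on_company_domain(url: str, company_domain: str) -> bool:
    """Return True if the URL's host is the company domain or a subdomain of it.

    Assumption: the URL includes a scheme (https://...), otherwise no host is parsed.
    """
    host = (urlsplit(url).hostname or "").lower()
    domain = company_domain.lower().strip().lstrip(".")
    # Match on a label boundary so "notexample.com" is not accepted for "example.com".
    return host == domain or host.endswith("." + domain)


def normalize_count(text: str) -> int:
    """Normalize a short count expression using the prompt's rules.

    "200-500" -> 500 (upper bound), "1,200+" -> 1200 (base number), no digits -> 0.
    Intended for the extracted count only, not for a full quote.
    """
    numbers = [int(n.replace(",", "")) for n in re.findall(r"\d[\d,]*", text)]
    return max(numbers) if numbers else 0


print(is_on_company_domain("https://careers.example.com/jobs", "example.com"))  # True
print(is_on_company_domain("https://example.de/about", "example.com"))          # False
print(normalize_count("200-500"))                                               # 500
print(normalize_count("1,200+"))                                                # 1200
```

A check like this can flag rows whose evidence URL sits outside the company domain, so those rows can be re-run rather than trusted.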

Clay Support (APP)
On “is this the right model?”: I can’t confirm from what’s here whether Argon is the best choice. Clay’s AI results can vary with the AI options/functionality selected for the column (e.g., web research vs. content-style options), so first double-check that the column configuration is exactly the same for every run and every column if you want consistent behavior.

On “why different numbers on the same domain across runs?”: Some variability between AI outputs is expected. If you need stability, set up a comparison workflow that stores run history, then uses Lookup Multiple Rows in Other Table + Use AI (or a formula for numeric outputs) to detect differences between the two most recent runs. [1]

If you share 1-2 example domains and two differing outputs, I can help tighten the prompt for consistency.
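The comparison idea described above (store run history, then compare the two most recent runs) can be sketched in a few lines of Python. This is a rough illustration of the logic only, not of Clay's Lookup Multiple Rows in Other Table or formula features; the `Run` record, its `run_index` field, and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Run:
    domain: str
    run_index: int       # hypothetical ordering field: higher means more recent
    employee_count: int


def latest_two_differ(history: list[Run]) -> dict[str, tuple[int, int]]:
    """Map each domain to (previous, latest) counts when the two most recent runs disagree."""
    by_domain: dict[str, list[Run]] = {}
    for run in history:
        by_domain.setdefault(run.domain, []).append(run)

    changed: dict[str, tuple[int, int]] = {}
    for domain, runs in by_domain.items():
        runs.sort(key=lambda r: r.run_index)
        if len(runs) >= 2 and runs[-2].employee_count != runs[-1].employee_count:
            changed[domain] = (runs[-2].employee_count, runs[-1].employee_count)
    return changed


# Illustrative history only; the values are made up for the sketch.
history = [
    Run("xervon.de", 1, 0),
    Run("xervon.de", 2, 1000),
    Run("manufacturingco.com", 1, 560),
    Run("manufacturingco.com", 2, 560),
]
print(latest_two_differ(history))  # {'xervon.de': (0, 1000)}
```

Domains that appear in the result are the ones worth reviewing by hand or re-running.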
Clay Support (APP)
If you still need help refining your prompt or understanding the AI behavior, I’m here to assist. Would you like to provide more details about the domains or outputs you’re working with?
Tom O.
It's definitely the same column configuration. On this domain (xervon.de), it first returned 0 employees, i.e. it did not find anything (wrong), and then 1000 employees (correct).
Clay Support (APP)
Getting 0 on the first run and 1000 on the second (with the same column settings) can still happen because Clay AI outputs can vary from run to run, even when you don’t change anything. [1] Different confidence outcomes are also expected; check the cell’s icon (green/orange/red) to see whether the run was high or low confidence.
A few practical ways to reduce this:
- Use the column’s preview/iterate cycle to tighten the wording until it reliably finds the same snippet.
- If you have multiple AI columns, re-verify that they’re truly using the same AI options/functionality (some options behave differently).
Clay Support (APP)
Is that what you were looking for?
Tom O.
Is the prompt already optimized or can you write it better?
Clay Support (APP)
Your prompt can be improved, but the biggest “optimization” is often to use our built-in prompt helpers and the preview loop. Click Help me / Generate with prompt and describe the outcome (“Find employee count from the company’s own website and return a number + URL + quote”); it will generate a tighter prompt and recommend an appropriate model. [1] [2] [3]
To reduce inconsistency, iterate with preview before scaling: click an enriched cell to see the generated details, then adjust the wording until it reliably finds the right snippet. [4]
Also confirm that your prompt variables (domain, company name) are linked to the correct columns.
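One way to picture the variable check: the prompt in the original post shows "Company Name:" and "Company Domain:" with nothing after them, which is what an unlinked variable would look like once the prompt is rendered. The Python sketch below is a local pre-check under assumptions, not how Clay substitutes column values; `render_prompt` and the `{Name}` placeholder syntax are hypothetical and chosen to mirror the `{Company Domain}` references in the prompt.

```python
import re

# Matches {Placeholder Name} tokens such as {Company Domain}.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def render_prompt(template: str, row: dict[str, str]) -> str:
    """Fill {placeholders} from a row; raise if any value is missing or empty."""
    missing = sorted(
        {name for name in PLACEHOLDER.findall(template) if not row.get(name, "").strip()}
    )
    if missing:
        raise ValueError(f"Unlinked or empty prompt variables: {missing}")
    return PLACEHOLDER.sub(lambda m: row[m.group(1)].strip(), template)


template = "Company Name: {Company Name}\nCompany Domain: {Company Domain}"
row = {"Company Name": "Xervon GmbH", "Company Domain": "xervon.de"}
print(render_prompt(template, row))
```

If the domain value were missing from the row, the function would raise instead of sending the agent a prompt with no domain to restrict itself to.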
Clay Support (APP)
Did that answer your question?
Clay Support (APP)
If you still need help optimizing your prompt or understanding the AI behavior, I’m here to assist. Would you like to share more details about the domains or outputs you’re working with?
