Does anyone have a "Company Name Clean-Up" formula that successfully translates raw company names into a more conversational version? Examples:
Dell Computer Corporation --> Dell
Acme Co, LLC --> Acme Co
Pest Control Usa --> Pest Control USA
#CONTEXT#
You are a text normalization assistant. You will receive a raw company name value and must produce an email-friendly version that feels natural and conversational, not like it was copied from a database.

#OBJECTIVE#
Normalize the provided company name in [ugly company name] to an email-friendly form by removing formal/legal suffixes and unnecessary elements, and applying correct capitalization.

#INSTRUCTIONS#
1. Input source:
- Read the raw company name from [ugly company name] only.
2. Remove formal/legal or database-like elements, including but not limited to:
- Corp, Corporation, Inc, Incorporated, LLC, LLP, PLC, Ltd, Limited, Co, Company, AG, GmbH, S.A., S.p.A., B.V., N.V., Pte, Pty, AB, AS, Oy, K.K., KK, Co., Ltd.
- Parenthetical notes, trailing commas, leading/trailing punctuation, and extra whitespace.
- Registered/trademark symbols (®, ™), and stock tickers (e.g., (NYSE: XXX), (NASDAQ: XXX)).
- Location or unit markers appended to the legal name when they look like data artifacts (e.g., “- US Headquarters”, “(Global)”, “Holdings” only if it reads as overly formal; keep if it’s core to identity like “Group” or “Studios” when integral).
3. Preserve meaningful brand identifiers:
- Keep distinctive words like “Labs”, “Studio/Studios”, “Group” (if brand-identity), “Systems”, “Solutions”, “Technologies/Technology” when they are core to how the brand is commonly referenced informally.
- If the name reduces to a single common word (e.g., “Apple”), keep it as is.
4. Capitalization rules:
- Title case: First letter of each word uppercase, remaining letters lowercase.
- Preserve well-known stylizations only if obviously branded (e.g., “3M”, “IBM”, “eBay”, “iRobot”); otherwise apply standard title case.
- Convert all-caps or all-lowercase to title case.
5. Hyphens, slashes, and connectors:
- Normalize multiple separators to single spaces unless the hyphen is essential (e.g., “Pro-X” can become “Pro X” unless brand-known).
6. Final output:
- Return only the cleaned, email-friendly company name string with no quotes and no extra commentary.
- If the input is empty or non-text, return an empty string.

#EXAMPLES#
Input ([ugly company name]): Acme Inc.
Output (emailFriendlyName): Acme
Input ([ugly company name]): Globex Corporation, Ltd.
Output (emailFriendlyName): Globex
Input ([ugly company name]): Initech, LLC (US Headquarters)
Output (emailFriendlyName): Initech
Input ([ugly company name]): Umbrella Holdings PLC
Output (emailFriendlyName): Umbrella
Input ([ugly company name]): Waystar Royco Inc
Output (emailFriendlyName): Waystar Royco
Input ([ugly company name]): eBay Inc.
Output (emailFriendlyName): eBay
Input ([ugly company name]): IBM Corporation
Output (emailFriendlyName): IBM
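For anyone who wants to prototype the purely mechanical part of this prompt (the parenthetical, symbol, and trailing-suffix removal in rule 2) before spending LLM credits, here is a minimal Python sketch. It is not part of the prompt above: the function name `strip_legal_noise` and the shortened `LEGAL_SUFFIXES` set are assumptions for illustration. It deliberately leaves out "Co"/"Company", dotted forms like "S.A.", and judgment calls such as "Holdings", which are better left to the LLM step.

```python
import re

# Assumed subset of the suffix list in the prompt; extend as needed.
LEGAL_SUFFIXES = {
    "corp", "corporation", "inc", "incorporated",
    "llc", "llp", "plc", "ltd", "limited", "gmbh",
}

def strip_legal_noise(raw):
    """Remove parentheticals, (R)/TM symbols, and trailing legal suffixes."""
    if not isinstance(raw, str):
        return ""  # the prompt asks for an empty string on non-text input
    name = re.sub(r"\([^)]*\)", " ", raw)   # "(US Headquarters)", "(NYSE: XXX)"
    name = re.sub(r"[®™]", "", name)        # registered/trademark symbols
    tokens = name.replace(",", " ").split() # commas become separators
    # Pop suffixes from the end only, so words inside the name are untouched.
    while tokens and tokens[-1].rstrip(".").lower() in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens).strip(" .,-")

print(strip_legal_noise("Globex Corporation, Ltd."))
print(strip_legal_noise("Initech, LLC (US Headquarters)"))
```

Capitalization is intentionally not touched here, so names like "eBay" and "IBM" pass through unchanged.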
I think their native one is pretty good, but maybe you have examples of it not working well, Bill R.?
Anthony R. Tanvir A. The Clay-native one seems to struggle with companies that have a string of capitalized letters (like initials). It only wants to capitalize the first one in those cases. So you get things like "Ccr Wealth Management" instead of "CCR Wealth Management". Otherwise it seems to do a decent job.
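To illustrate the initials problem, here is a rough Python sketch of one possible workaround: keep short all-caps tokens as they are, but only when the rest of the name is not all caps too (otherwise the casing carries no signal). This is not how Clay's native formula works; `smart_title` and the `max_acronym_len` cutoff are hypothetical names chosen for the example.

```python
def smart_title(name, max_acronym_len=4):
    """Title-case a name while keeping short all-caps tokens such as 'CCR'."""
    words = name.split()
    # If every alphabetic word is upper-case, we cannot tell initials from shouting.
    all_caps_input = all(w.isupper() for w in words if any(c.isalpha() for c in w))
    out = []
    for w in words:
        is_short_caps = w.isalnum() and w.isupper() and len(w) <= max_acronym_len
        if is_short_caps and not all_caps_input:
            out.append(w)               # likely initials: leave untouched
        elif w.islower() or w.isupper():
            out.append(w.capitalize())  # "inc" -> "Inc", "WEALTH" -> "Wealth"
        else:
            out.append(w)               # mixed case such as "eBay": leave untouched
    return " ".join(out)

print(smart_title("CCR Wealth Management"))
```

The catch is that this only helps when the raw value already has the initials in caps. Input that arrives entirely upper-case or entirely lower-case still needs a lookup list or an LLM pass to recover "CCR".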
I built one for Apify for this use case with modes for more casual or formal output using a combination of rule-based and LLM normalization. It can be called with Clay's Apify integration or via a "standby" API endpoint using the HTTP API integration (faster). It is a paid actor, but inexpensive and you can run a large batch on the Apify free tier. https://apify.com/superlativetech/superclean-company-names?fpr=8e9l1
Hey Jorge M.! I built some additional logic into your template and am even happier with the results. The biggest change I made was to have it check both the company website and company LinkedIn profile (optional) to ingest how the company refers to itself and use that as the primary guidance when available. Here's my updated prompt in case it's helpful! Thanks so much for the HUGE head start with this 🙌 Cc: Tanvir A. Anthony R.
Hey Bill, you can try this prompt I've consistently used to normalize company names:

#CONTEXT#
You are a data normalization specialist focused on cleaning and standardizing company names. Your task is to convert raw, inconsistently formatted company names into their clean, conversational versions: the way people actually refer to these companies in normal speech.

## Your Goal
Transform formal, over-punctuated, or awkwardly capitalized company names into their natural, recognizable form. The output should be how you'd hear someone say the company name in conversation, not how it appears in legal documentation.

## Core Rules

**Remove legal entity suffixes** unless they're part of the company's actual brand name:
- Remove: LLC, Inc., Corp., Corporation, Co., Ltd., Limited, L.P., LP, PLC, GmbH, AG, S.A., etc.
- Exception: If the suffix is genuinely part of how the company brands itself (e.g., "Ben & Jerry's" doesn't have one, but keep it if it does), preserve it.

**Fix capitalization to standard title case:**
- Capitalize the first letter of each significant word
- Lowercase articles, prepositions, and conjunctions unless they're the first word
- Exception: Proper nouns and acronyms maintain their correct capitalization

**Standardize acronyms and abbreviations:**
- Expand common abbreviations to full words unless the acronym is the company's primary identifier
- USA → United States (unless USA is central to the name like "Made in USA Corp")
- Fix inconsistent capitalization in acronyms (Usa → USA, usa → USA)

**Remove redundancy and noise:**
- Strip leading/trailing whitespace
- Remove duplicate words
- Eliminate unnecessary punctuation

**Preserve brand-critical elements:**
- Keep numbers and symbols that are core to brand identity (3M, 7-Eleven)
- Maintain hyphens in hyphenated names (Salt-Life, Mary-Kay)
- Preserve ampersands (&) when they're part of the brand (Ben & Jerry's, Johnson & Johnson)

## Examples

| Raw Input | Cleaned Output | Rule Applied |
| --- | --- | --- |
| Dell Computer Corporation | Dell | Removed legal entity suffix + redundant descriptor |
| Acme Co, LLC | Acme Co | Removed LLC; kept "Co" as part of brand |
| Pest Control Usa | Pest Control USA | Fixed capitalization of acronym |
| amazon.com, inc. | Amazon | Removed domain extension + legal suffix + standardized capitalization |
| THE COCA-COLA company | Coca-Cola | Removed "the"; restored proper brand capitalization |
| McDonald's Corp | McDonald's | Removed legal suffix; preserved apostrophe in brand |
| 3M Company | 3M | Removed generic descriptor; preserved number that's core to brand |
| Starbucks Coffee Corporation | Starbucks | Removed both legal suffix and generic descriptor |
| Netflix, Inc. | Netflix | Removed legal entity suffix |
| Apple Inc | Apple | Removed legal entity suffix |

## When You're Uncertain
- If you can't determine what the company's actual brand name is, output what remains after stripping legal suffixes and fixing obvious formatting errors
- When a name contains unclear abbreviations, expand them unless they appear to be the company's primary identifier
- If a company name is ambiguous between a legal form and a brand choice, prefer the conversational version (assume they want the brand name, not the filing name)

Process each company name individually. Return only the cleaned name, with no explanations, no citations, and no additional text.
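If you end up tweaking a prompt like this, the example pairs double as a small regression set. A minimal Python sketch, assuming you can wrap whatever cleaner you are testing (a Clay column export, an API call, a local function) as a plain `str -> str` callable; `EXAMPLES` and `failures` are hypothetical names, and the pairs are copied from the examples in the prompt above.

```python
from typing import Callable, List, Tuple

# Raw input -> expected cleaned output, taken from the prompt's examples.
EXAMPLES = {
    "Dell Computer Corporation": "Dell",
    "Acme Co, LLC": "Acme Co",
    "Pest Control Usa": "Pest Control USA",
    "amazon.com, inc.": "Amazon",
    "THE COCA-COLA company": "Coca-Cola",
    "McDonald's Corp": "McDonald's",
    "3M Company": "3M",
    "Starbucks Coffee Corporation": "Starbucks",
    "Netflix, Inc.": "Netflix",
    "Apple Inc": "Apple",
}

def failures(clean: Callable[[str], str]) -> List[Tuple[str, str, str]]:
    """Return (raw, expected, got) for every example the cleaner gets wrong."""
    wrong = []
    for raw, expected in EXAMPLES.items():
        got = clean(raw).strip()
        if got != expected:
            wrong.append((raw, expected, got))
    return wrong

# Example: see how far plain title-casing gets on its own.
for raw, expected, got in failures(str.title):
    print(f"{raw!r}: expected {expected!r}, got {got!r}")
```

Re-running this after each prompt change makes it easier to notice when a fix for one case (say, acronyms) quietly breaks another.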
