Improving Regex for Extracting Email Subjects and Threads in Clay
Hello, team 👋

I’m working with a column (text type) that contains full email messages in the following structure:

Subject: [subject line]
Thread: [thread title]
Body of Email: [email body]

My goal is to extract specific parts of this text, particularly the Subject and Thread values, into separate fields.

In Google Sheets, I would typically use a formula like this to extract the subject:

=REGEXEXTRACT(A1, "Subject:\s*(.+?)\s*Thread:")

I tried using the “Extract Values from Data” function and applied the following regex pattern in Clay:

Subject:\s*(.+?)\s*Thread

However, the result I get is still prefixed with “Subject:” and includes a line break before “Thread.” Example output:

Subject: Supporting Reliable Digital Experiences at Company\n\nThread

My questions are:

1) Is it possible to adjust the regex pattern so that the result only includes the clean subject text (e.g., Supporting Reliable Digital Experiences at Company)?
2) Is there a better method to extract specific sections of a text field?

Thanks in advance!
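
For reference, here is a minimal Python sketch of the behavior I'm after. It is not Clay's implementation. The output above looks like the full regex match rather than the capture group, but that is only my guess about what the tool returns. The sample thread title and body text are made up for illustration, and `extract_subject_and_thread` is a hypothetical helper name.

```python
import re

# Sample message in the structure described above.
# The thread title and body are invented for illustration.
SAMPLE = (
    "Subject: Supporting Reliable Digital Experiences at Company\n\n"
    "Thread: Re: Supporting Reliable Digital Experiences\n\n"
    "Body of Email: Hi there, following up on my last note."
)

# Same idea as the pattern in the question, extended with a second
# capture group for the thread. re.DOTALL lets "." cross line breaks
# in case a value wraps onto more than one line.
PATTERN = re.compile(
    r"Subject:\s*(.+?)\s*Thread:\s*(.+?)\s*Body of Email:",
    re.DOTALL,
)


def extract_subject_and_thread(text):
    """Return (subject, thread), or None if the markers are missing."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # group(1) and group(2) hold only the captured text, with the
    # "Subject:" / "Thread:" labels and surrounding whitespace excluded.
    return match.group(1), match.group(2)


# group(0) is the full match and still contains the labels, which
# resembles the unwanted output shown in the question.
full_match = PATTERN.search(SAMPLE).group(0)
subject, thread = extract_subject_and_thread(SAMPLE)

print(repr(full_match))
print(subject)
print(thread)
```

If the Clay function can only return the full match, the equivalent would be some way to select capture group 1, or a follow-up step that strips the “Subject:” label and trailing whitespace.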