Validity: Configuration Scenarios

What These Scenarios Cover

This page walks through three real-world configurations of DQS validity analysis. Each scenario covers a specific business problem, shows the exact settings to use, and explains how to read the results.

These walkthroughs build on the concepts from the main Validity article. Read that first if you are new to validity metrics, the diagnostic flow, or pattern configuration.

Scenario 1: Secondary Email Validation on a Custom Text Field

The Problem

Your organization stores a secondary email address in a custom Secondary_Email__c text field on the Contact object. Unlike the standard Salesforce Email field, a text field has no built-in format validation. Users paste, type, and import anything into it. Marketing wants to use these secondary addresses for a re-engagement campaign, but nobody knows how many are structurally valid. You need a concrete number so marketing can set realistic campaign projections and your ops team can scope the cleanup.

Why not the standard Email field? Salesforce’s native Email field type validates format on input. Values in a standard Email field already pass basic format checks. DQS email validation is useful on custom Text fields that store email addresses without Salesforce’s built-in enforcement.

Configuration

Use Format Validation mode on the Contact object, targeting the Secondary_Email__c field. You need the headline validity rate and a count of usable records. Placeholder detection and noise analysis are not relevant here because email addresses either match the format or they don’t.

Setting	Value	Why
Analysis Mode	Format Validation	You need the match rate and valid count, not the full invalid breakdown
Pattern Type	Email	Built-in pattern: `^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`
Include Blanks	OFF	Blank emails are a completeness problem, not a validity problem. Keep them out of this analysis.
Case Sensitive	OFF	Email domains are case-insensitive, and most mail systems ignore case in the local part too

The Email pattern is a built-in preset. You do not need to write any regex. Select “Email” from the pattern picker and the regex is applied automatically.

Sample Results

Metric	Value
Validity Rate	71%
Valid Count	35,500

Total Contact records evaluated: 50,000.

Reading the Results

Start with the headline: 71% validity. That means 29% of secondary email addresses fail the format check. Of 50,000 Contacts with a populated Secondary_Email__c, only 35,500 have a structurally valid address.

What 29% invalid looks like in practice: These are values missing the “@” symbol (john.company.com), missing a domain extension (john@company), containing double dots, or including spaces (john @company.com). Because this is a text field, Salesforce accepted all of them on entry. Every message sent to these addresses bounces.
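To see how the quoted Email pattern treats values like these, here is a minimal Python sketch. It is not how DQS runs the check internally; the function name `is_valid_email_format` and the sample list are illustrative assumptions, and only the regex itself comes from the Configuration table.

```python
import re

# Email pattern copied from the Configuration table above.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def is_valid_email_format(value: str) -> bool:
    """Return True when the entire value matches the quoted Email pattern."""
    return EMAIL_PATTERN.fullmatch(value) is not None


# Illustrative sample values (hypothetical, modeled on the examples in the text).
samples = [
    "john@company.com",   # well-formed
    "john.company.com",   # missing "@"
    "john@company",       # missing domain extension
    "john @company.com",  # contains a space
]

for value in samples:
    status = "valid" if is_valid_email_format(value) else "invalid"
    print(f"{value!r}: {status}")
```

One thing worth checking in your own org: the pattern as quoted accepts consecutive dots (for example, `john@company..com` matches), so a double-dot value would only be flagged if the scan applies checks beyond this regex.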

The campaign math changes. Marketing has been projecting re-engagement reach based on 50,000 secondary addresses. The real addressable audience is 35,500. Open rates, click rates, and conversion projections all need to be recalculated against the valid base, not the inflated total.
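The recalculation is simple arithmetic. The sketch below uses the sample figures from this scenario; the 20% open rate is a made-up placeholder for illustration, not a benchmark, and `addressable_audience` is a hypothetical helper name.

```python
def addressable_audience(total_records: int, validity_rate: float) -> int:
    """Number of records that can actually receive the campaign."""
    return round(total_records * validity_rate)


TOTAL_CONTACTS = 50_000   # populated Secondary_Email__c values evaluated
VALIDITY_RATE = 0.71      # headline Validity Rate from the scan

valid_count = addressable_audience(TOTAL_CONTACTS, VALIDITY_RATE)
cleanup_scope = TOTAL_CONTACTS - valid_count

# Hypothetical open rate, used only to show how projections shift.
ASSUMED_OPEN_RATE = 0.20
opens_on_inflated_base = round(TOTAL_CONTACTS * ASSUMED_OPEN_RATE)
opens_on_valid_base = round(valid_count * ASSUMED_OPEN_RATE)

print(f"Addressable audience: {valid_count:,}")
print(f"Records to clean up:  {cleanup_scope:,}")
print(f"Projected opens: {opens_on_inflated_base:,} (inflated) vs {opens_on_valid_base:,} (valid base)")
```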

Why Format Validation is enough here. You don’t need Advanced Format Validation mode for this scenario. The question is simple: “How many secondary emails match a valid format?” Validity Rate and Valid Count answer that question. If you later need to scope a cleanup project with exact invalid counts, switch to Advanced Format Validation for the full breakdown.

What to Do Next

Use Valid Count (35,500) as the real addressable audience for campaign planning. Scope a cleanup project for the remaining 14,500 records: export them, identify the most common format errors, and fix them through data enrichment or manual correction. Consider adding a Salesforce validation rule on Secondary_Email__c to enforce email format on future entries, or convert the field to the Email type if your processes allow it.

Scenario 2: Product Code Validation with Fixed Length

The Problem

Your company uses 8-character product codes in a custom Product_Code__c field on the Opportunity Product object. These codes drive inventory lookups, pricing rules, and ERP integration. The ERP sync has been failing on roughly 5% of records each week, and the integration team suspects malformed product codes. You need to confirm how many codes fail the format check and get the exact cleanup scope.

Configuration

Use Advanced Format Validation mode on the Opportunity Product object, targeting the Product_Code__c field. You need the full valid/invalid breakdown so the integration team has exact record counts for their remediation project.

Setting	Value	Why
Analysis Mode	Advanced Format Validation	You need Invalid Count to scope the cleanup, plus Noise Rate to check for junk entries
Pattern Type	Fixed Length	Product codes are always exactly 8 characters
Fixed Length	8	Your standard code length
Include Blanks	ON	A blank product code is invalid for ERP sync. Count it as a failure.
Case Sensitive	OFF	Product codes are not case-dependent in your system

The Fixed Length pattern generates the regex `^.{8}$` automatically. Any value that is not exactly 8 characters fails validation.
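A quick way to build intuition for the 8-character check is to run that regex against a few typical values. This is a sketch in Python, not the DQS implementation; `is_valid_product_code` and the sample codes are hypothetical.

```python
import re

# Regex quoted above for a Fixed Length of 8.
FIXED_LENGTH_8 = re.compile(r"^.{8}$")


def is_valid_product_code(value: str) -> bool:
    """Return True when the value is exactly 8 characters long."""
    return FIXED_LENGTH_8.fullmatch(value) is not None


# Hypothetical product codes illustrating common failure modes.
samples = {
    "AB123456": "well-formed, 8 characters",
    "AB1234": "truncated, 6 characters",
    "AB123456 ": "trailing space, 9 characters",
    "PC-12345678": "user-added prefix, 11 characters",
}

for code, description in samples.items():
    status = "valid" if is_valid_product_code(code) else "invalid"
    print(f"{code!r} ({description}): {status}")
```

Note that the check is purely about length: any 8 characters pass, whether or not they form a real product code.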

Sample Results

Foundation Metrics:

Metric	Value
Validity Rate	94.2%
Valid Count	9,420

Advanced Metrics:

Metric	Value
Invalid Rate	5.8%
Invalid Count	580
Noise Rate	0.4%
Noisy Records Count	40

Total records evaluated: 10,000.

Reading the Results

5.8% invalid confirms the integration team’s estimate. 580 product codes out of 10,000 do not match the 8-character format. These are the records breaking the ERP sync.

Invalid Count (580) is the cleanup scope. Your integration team now has a concrete number. Instead of investigating each sync failure individually, they can pull the 580 records, categorize the format errors, and batch-fix them. Common problems in product code fields include truncated codes (5-7 characters from copy-paste errors), codes with trailing spaces (9 characters because of an invisible space), and codes with dashes or prefixes added by users (“PC-12345678”).

Noise Rate (0.4%) is low but worth noting. 40 records contain noise patterns: repeated characters (“XXXXXXXX”), keyboard entries (“asdfghjk”), or special character strings. These 40 records are not format errors. They are junk entries that happen to be exactly 8 characters long. Validity Rate counted them as valid because they pass the length check, but they are garbage data that will fail the ERP lookup for a different reason. Noise Rate catches what the format check misses.
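The source does not spell out the exact rules behind Noise Rate, so the following Python sketch is only a rough approximation of the categories named above (repeated characters, keyboard entries, special character runs). Every threshold, regex, and function name here is an assumption chosen for illustration.

```python
import re

# Assumed thresholds; the real noise rules are not documented here.
REPEATED_CHARS = re.compile(r"(.)\1{3,}")            # same character 4+ times in a row
SPECIAL_RUN = re.compile(r"[^A-Za-z0-9\s]{3,}")      # 3+ consecutive special characters
KEYBOARD_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")


def has_keyboard_walk(value: str, min_run: int = 4) -> bool:
    """Detect a run of adjacent keys from a single keyboard row."""
    lowered = value.lower()
    for row in KEYBOARD_ROWS:
        for start in range(len(row) - min_run + 1):
            if row[start:start + min_run] in lowered:
                return True
    return False


def is_noisy(value: str) -> bool:
    """Return True when any of the assumed noise heuristics fires."""
    return (
        REPEATED_CHARS.search(value) is not None
        or SPECIAL_RUN.search(value) is not None
        or has_keyboard_walk(value)
    )


for code in ["AB123456", "XXXXXXXX", "asdfghjk"]:
    print(f"{code!r}: length ok = {len(code) == 8}, noisy = {is_noisy(code)}")
```

All three sample codes are 8 characters long, which is why a length check alone cannot separate the real code from the junk.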

Include Blanks ON matters here. With Include Blanks enabled, any record where Product_Code__c is empty counts as invalid. If you had left this setting off, those blank records would be excluded from evaluation entirely, and your Invalid Count would be lower than the true number of records failing ERP sync. Since a blank product code breaks the integration the same way a malformed one does, including blanks gives you the accurate failure scope.
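The effect of the setting is easiest to see on a tiny dataset. This sketch assumes a blank is a null or empty value and that excluded blanks are dropped from the evaluated total, as described above; `summarize_validity` is a hypothetical helper, not a DQS function.

```python
import re
from typing import Dict, Iterable, Optional


def summarize_validity(
    values: Iterable[Optional[str]], pattern: str, include_blanks: bool
) -> Dict[str, int]:
    """Count evaluated, valid, and invalid values for a regex pattern."""
    regex = re.compile(pattern)
    evaluated = 0
    valid = 0
    for value in values:
        if value is None or value == "":
            # Blanks either count as evaluated failures or are skipped entirely.
            if include_blanks:
                evaluated += 1
            continue
        evaluated += 1
        if regex.fullmatch(value):
            valid += 1
    return {"evaluated": evaluated, "valid": valid, "invalid": evaluated - valid}


# Hypothetical field values: two good codes, one truncated code, two blanks.
codes = ["AB123456", "AB1234", None, "", "CD987654"]

print("Include Blanks ON: ", summarize_validity(codes, r"^.{8}$", include_blanks=True))
print("Include Blanks OFF:", summarize_validity(codes, r"^.{8}$", include_blanks=False))
```

With blanks included, three of the five records count as invalid; with blanks excluded, only one of three does, which understates how many records would fail the sync.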

What to Do Next

Export the 580 invalid records for the integration team. Categorize errors by type: truncated codes, extra characters, trailing spaces. Fix them in bulk using a data update job. For the 40 noisy records, investigate the source. If they came from a specific import or user, address that root cause. After cleanup, add a Salesforce validation rule enforcing the 8-character length on Product_Code__c to prevent new bad entries. Rescan to verify your new Validity Rate.

Scenario 3: Web-to-Lead Company Name Noise Detection

The Problem

Your web-to-lead form requires the Company field. Lead volume is strong: 20,000 new leads per quarter. But the SDR team reports that many leads have garbage company names, entries like “asdf”, “test”, “xxx”, or “na na na.” These leads waste SDR time and pollute your segmentation. A basic completeness check shows 98% of leads have a Company value. You suspect the 98% is misleading because junk entries are technically “populated.”

Configuration

Use Advanced Format Validation mode on the Lead object, targeting the Company field. You need Noise Rate to quantify the garbage that hides behind a healthy completeness score.

For the format pattern, there is no strict format rule for company names. Company names are free text. Use a minimal text validation to check that the value contains at least one alphanumeric character.

Setting	Value	Why
Analysis Mode	Advanced Format Validation	You need Noise Rate and Noisy Records Count to quantify junk entries
Pattern Type	Custom	No built-in pattern fits free-text company names
Custom Pattern	`^.*[a-zA-Z0-9].*$`	Matches any value containing at least one letter or digit. Catches values that are purely special characters.
Include Blanks	ON	Blank company names are a problem too. Include them in the failure count.
Case Sensitive	OFF	Not relevant for this pattern, but leave it off as the default

The real value of this scan is in the noise metrics, not the format validation. The custom pattern is intentionally loose because you are not enforcing a specific company name format. You are running the scan in Advanced mode to get access to Noise Rate and Noisy Records Count.
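To confirm how loose this check is, you can try it against a few company names. The sketch assumes the custom pattern is meant to be `^.*[a-zA-Z0-9].*$` (anything, then one letter or digit, then anything), since that is the form that matches the stated intent of requiring at least one alphanumeric character. The helper name and sample values are hypothetical.

```python
import re

# Assumed form of the custom pattern: at least one letter or digit anywhere.
COMPANY_PATTERN = re.compile(r"^.*[a-zA-Z0-9].*$")


def passes_company_format(value: str) -> bool:
    """Return True when the value contains at least one alphanumeric character."""
    return COMPANY_PATTERN.fullmatch(value) is not None


# Hypothetical Company values drawn from the kinds of entries described here.
samples = ["Acme Corp", "asdf", "xxx", "---", "!!!", "   "]

for name in samples:
    status = "passes" if passes_company_format(name) else "fails"
    print(f"{name!r}: {status}")
```

Junk such as “asdf” and “xxx” passes, which is exactly why the noise metrics carry the weight in this scenario.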

Sample Results

Foundation Metrics:

Metric	Value
Validity Rate	97.5%
Valid Count	19,500

Advanced Metrics:

Metric	Value
Invalid Rate	2.5%
Invalid Count	500
Noise Rate	12%
Noisy Records Count	2,400

Total Lead records evaluated: 20,000.

Reading the Results

97.5% validity is expected and not the point. Almost every value passes the loose format check because the pattern only requires one alphanumeric character. The 500 invalid records are blank values (counted because Include Blanks is ON) and entries with only special characters or whitespace, values like “---”, “...”, or “!!!”. Those are easy to identify and delete.

Noise Rate (12%) is the real finding. 2,400 leads have company names that contain noise patterns. These are entries with repeated characters (“aaaa”, “xxxxx”), runs of consecutive special characters (“!@#$%”) mixed in with letters or digits, or control characters. They pass the format check because they contain at least one alphanumeric character, but the values are garbage.

The true data quality picture:

Category	Records	What It Means
Clean and valid	17,100	Real company names ready for SDR outreach
Invalid (pure junk)	500	No alphanumeric content at all. Delete or quarantine.
Noisy (hidden junk)	2,400	Looks populated but contains garbage. Manual review or auto-flag.

Your SDR team is right: the lead quality problem is real. 2,900 out of 20,000 leads (14.5%) have unusable company data. If SDR effort is spread evenly across leads, that is roughly 14.5% of SDR time spent on leads that can never be properly routed, enriched, or segmented.
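The breakdown follows directly from the scan metrics. This sketch reproduces the arithmetic with the sample figures and assumes, as the table does, that the invalid and noisy groups do not overlap.

```python
TOTAL_LEADS = 20_000
INVALID_COUNT = 500      # fail the loose format check
NOISY_COUNT = 2_400      # pass the format check but contain noise patterns

# Assumption: invalid and noisy records are disjoint groups.
unusable = INVALID_COUNT + NOISY_COUNT
clean = TOTAL_LEADS - unusable
unusable_rate = unusable / TOTAL_LEADS

print(f"Clean and valid: {clean:,}")
print(f"Unusable:        {unusable:,} ({unusable_rate:.1%})")
```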

The completeness vs. validity gap. Completeness says 98% of leads have a Company value. Validity says 97.5% pass the format check. Noise Rate says 12% of evaluated leads pass that check but are still garbage. Each dimension reveals a different layer of the problem. Completeness alone misses the junk that Noise Rate catches.

What to Do Next

Build a cleanup queue for the 2,900 combined invalid and noisy records. For the 500 purely invalid records, auto-delete or quarantine them. For the 2,400 noisy records, decide: auto-delete leads with no other useful data, or flag them for manual review if phone or email data is still usable.

Fix the source. The junk is coming from your web form. Add client-side validation: a minimum character length, block repeated-character patterns, and consider CAPTCHA for bot prevention. After implementing form changes, run the scan again next quarter and compare Noise Rate to this baseline.

Choosing Your Configuration

Use this table to pick the right starting point for your validity analysis.

If You Need To…	Start With	Key Settings
Check email format on custom text fields	Format Validation	Pattern Type: Email, Include Blanks: OFF
Validate fixed-length codes (product codes, SKUs, postal codes)	Advanced Format Validation	Pattern Type: Fixed Length, set your character count, Include Blanks: ON
Validate URL format on website fields	Format Validation	Pattern Type: URL, Include Blanks: OFF
Enforce a custom business format (regex)	Advanced Format Validation	Pattern Type: Custom, enter your regex pattern
Detect junk and noise in free-text fields	Advanced Format Validation	Use a loose format pattern, focus on Noise Rate and Noisy Records Count
Scope a data cleanup project for an integration	Advanced Format Validation	Include Blanks: ON, use Invalid Count and Noisy Records Count for project sizing

For a full reference of all 6 validity metrics, pattern types, and noise detection details, return to the main Validity article.
