Defining Data Quality
Data quality measures how well your data serves its intended purpose. It is not about whether data is “correct” in absolute terms. It is about whether your data is fit for use in decision-making, operations, and analytics.
A customer address is high quality if it reaches the customer. A product code is high quality if your systems recognize it. Quality depends on context.
The “Fit for Purpose” Principle
Data quality is contextual. A shipping address needs street-level precision. A marketing region needs only country or state. Both can be “high quality” at different precision levels.
When assessing data quality, ask: What does this data need to do? Then measure whether it can do that.
The Five Dimensions Framework
This guide measures data quality across five key dimensions. Dimension-based frameworks like this one are used across industries, and they overlap with the quality characteristics described in ISO 8000 and in DAMA guidance.
| Dimension | What It Measures | Example |
|---|---|---|
| Completeness | Required data is present | All mandatory fields are filled |
| Validity | Data conforms to formats | Email addresses have valid format |
| Uniqueness | No duplicate records | One record per customer |
| Timeliness | Data is current | Contact info updated within 90 days |
| Consistency | Data is uniform | “USA” used consistently, not “US” or “United States” |
Each dimension answers a specific question about your data. Together they provide a complete picture of data health.
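To make the five dimensions concrete, here is a minimal sketch that scores a handful of records on each one. The sample records, field names, the simplified email pattern, the 90-day freshness window, and the choice of "USA" as the canonical country value are all assumptions made for illustration. This is not how any particular product computes its scores.

```python
import re
from datetime import date

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "ana@example.com", "country": "USA", "updated": date(2024, 5, 1)},
    {"id": 2, "email": "bob@example", "country": "US", "updated": date(2024, 4, 20)},
    {"id": 3, "email": None, "country": "USA", "updated": date(2023, 11, 2)},
    {"id": 4, "email": "ana@example.com", "country": "United States", "updated": date(2024, 5, 10)},
]

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # simplified format check (assumption)
CANONICAL_COUNTRY = "USA"   # assumed house standard
TODAY = date(2024, 6, 1)    # fixed date so the example is reproducible
MAX_AGE_DAYS = 90           # assumed freshness window


def pct(part, whole):
    """Percentage rounded to one decimal place; 0.0 when there is nothing to measure."""
    return round(100 * part / whole, 1) if whole else 0.0


def profile(records):
    """Return one percentage per dimension for the sample fields."""
    total = len(records)
    emails = [r["email"] for r in records if r["email"]]
    fresh = sum(1 for r in records if (TODAY - r["updated"]).days <= MAX_AGE_DAYS)
    uniform = sum(1 for r in records if r["country"] == CANONICAL_COUNTRY)
    return {
        "completeness": pct(len(emails), total),                              # email present
        "validity": pct(sum(1 for e in emails if EMAIL_RE.match(e)), len(emails)),
        "uniqueness": pct(len(set(emails)), len(emails)),                     # distinct emails
        "timeliness": pct(fresh, total),                                      # updated recently
        "consistency": pct(uniform, total),                                   # canonical country
    }


print(profile(RECORDS))
```

For this sample, completeness and timeliness come out at 75.0, validity and uniqueness at 66.7, and consistency at 50.0. Each number answers one of the questions in the table above.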
For detailed guidance on each dimension, see The Five Dimensions.
Industry Standards and Frameworks
ISO 8000
The ISO 8000 standard defines data quality requirements for master data exchange. It establishes principles for data accuracy, completeness, and consistency across organizations.
DAMA-DMBOK
The Data Management Association’s Body of Knowledge (DAMA-DMBOK) defines data quality as one of eleven knowledge areas in data management. It provides guidance on measurement, monitoring, and improvement processes.
The 1-10-100 Rule
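This rule of thumb illustrates how the cost of a data error grows roughly tenfold at each stage it goes unaddressed. The dollar figures are relative costs per record, not measured amounts: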
| Stage | Cost | Example |
|---|---|---|
| Prevention | $1 | Validation at data entry |
| Correction | $10 | Cleaning data after entry |
| Failure | $100 | Business impact of bad data |
Investing in data quality at the source saves significant costs downstream.
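As a worked example of the arithmetic, the sketch below applies the 1-10-100 ratios to 1,000 problem records under two scenarios. The record counts in each scenario are hypothetical, and the per-record costs are the rule's illustrative ratios rather than measured figures.

```python
# Relative cost per record at each stage, taken from the 1-10-100 rule.
COST_PER_RECORD = {"prevention": 1, "correction": 10, "failure": 100}


def total_cost(records_by_stage):
    """Sum the relative cost of records handled at each stage."""
    return sum(COST_PER_RECORD[stage] * count for stage, count in records_by_stage.items())


# Hypothetical split of 1,000 problem records (assumed numbers).
caught_early = total_cost({"prevention": 900, "correction": 80, "failure": 20})
caught_late = total_cost({"prevention": 0, "correction": 600, "failure": 400})

print(caught_early)  # 900*1 + 80*10 + 20*100 = 3700
print(caught_late)   # 600*10 + 400*100 = 46000
```

Under these assumed splits, the same 1,000 records cost more than twelve times as much when most of them slip past entry validation.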
Data Quality vs Related Concepts
Data Quality vs Data Management
Data management is the broader practice of collecting, storing, and maintaining data. Data quality is one component of data management, focused specifically on fitness for use.
| Concept | Scope | Focus |
|---|---|---|
| Data Management | All data practices | Storage, access, security, lifecycle |
| Data Quality | Fitness for purpose | Completeness, validity, uniqueness, timeliness, consistency |
| Data Governance | Policies and ownership | Who owns data, who can change it, what rules apply |
Data Quality vs Data Accuracy
Accuracy asks: Does this value reflect reality? Quality asks: Does this data work for its purpose?
An email address can be valid (correct format) but inaccurate (the person no longer uses it). DQS measures quality rather than accuracy because checks such as format and completeness can be automated, while accuracy typically requires external verification.
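The difference can be sketched in a few lines. The format check below needs nothing but the value itself, while the accuracy check needs an outside source of truth. Here `ACTIVE_MAILBOXES` is a hypothetical stand-in for that external verification, and the pattern is a deliberately simplified assumption, not a full email grammar.

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # simplified pattern (assumption)

# Hypothetical stand-in for an external source such as a verification service.
ACTIVE_MAILBOXES = {"current.user@example.com"}


def is_valid_format(email):
    """Validity: can be decided from the value alone, so it is easy to automate."""
    return bool(EMAIL_RE.match(email))


def is_accurate(email):
    """Accuracy: requires checking the value against the outside world."""
    return email in ACTIVE_MAILBOXES


old_address = "former.user@example.com"
print(is_valid_format(old_address), is_accurate(old_address))
```

The old address passes the format check and fails the accuracy check, which is exactly the valid-but-inaccurate case described above.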
How Data Quality is Measured
Quantitative Metrics
Data quality is expressed through measurable indicators:
| Metric Type | Example | Calculation |
|---|---|---|
| Percentage | Fill Rate | (Populated Records / Total Records) × 100 |
| Count | Duplicate Count | Number of records with matching values |
| Score | Validity Score | Weighted average across validation rules |
| Ratio | Conformance Rate | Conforming Values / Total Values |
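The four calculations in the table can be sketched as small functions. The function names, the treatment of empty strings as unpopulated, and the reading of "duplicate count" as the number of records that share a value with at least one other record are assumptions for illustration.

```python
from collections import Counter

EMPTY = (None, "")  # values treated as unpopulated (assumption)


def fill_rate(values):
    """Percentage: (populated records / total records) x 100."""
    populated = sum(1 for v in values if v not in EMPTY)
    return 100 * populated / len(values) if values else 0.0


def duplicate_count(values):
    """Count: records whose value also appears on at least one other record."""
    counts = Counter(v for v in values if v not in EMPTY)
    return sum(n for n in counts.values() if n > 1)


def conformance_rate(values, rule):
    """Ratio: conforming values / total values, where `rule` returns True or False."""
    return sum(1 for v in values if rule(v)) / len(values) if values else 0.0


def validity_score(rule_results):
    """Score: weighted average of (pass_rate, weight) pairs across validation rules."""
    total_weight = sum(weight for _, weight in rule_results)
    if not total_weight:
        return 0.0
    return sum(rate * weight for rate, weight in rule_results) / total_weight


codes = ["A-100", "A-100", "B-200", None, "C-300"]
print(fill_rate(codes))
print(duplicate_count(codes))
print(conformance_rate(codes, lambda v: isinstance(v, str) and len(v) == 5))
print(validity_score([(1.0, 3), (0.5, 1)]))
```

Weights in `validity_score` let a critical rule count for more than a cosmetic one. With equal weights it reduces to a plain average.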
Thresholds and Targets
Organizations set thresholds based on business requirements:
| Level | Threshold | Use Case |
|---|---|---|
| Critical | 99%+ | Regulatory reporting fields |
| High | 95%+ | Customer-facing data |
| Standard | 85%+ | Operational data |
| Low | 70%+ | Historical or archival data |
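A threshold table like this one translates directly into a lookup. The sketch below flags fields whose score falls below the target for their assigned level. The field names, scores, and level assignments are hypothetical.

```python
# Minimum acceptable score per level, taken from the table above.
THRESHOLDS = {"critical": 99.0, "high": 95.0, "standard": 85.0, "low": 70.0}


def meets_target(score, level):
    """True when a score reaches the threshold for its level."""
    return score >= THRESHOLDS[level]


def failing_fields(scores, levels):
    """Return the names of fields that miss their target, sorted alphabetically."""
    return sorted(field for field, score in scores.items()
                  if not meets_target(score, levels[field]))


# Hypothetical field scores and level assignments.
scores = {"tax_id": 98.5, "email": 96.0, "notes": 72.0}
levels = {"tax_id": "critical", "email": "high", "notes": "low"}

print(failing_fields(scores, levels))
```

Here `tax_id` fails even though it has the highest score, because a regulatory field is held to the 99% bar. That is the fit-for-purpose principle expressed as a rule.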
Continuous vs Point-in-Time Measurement
Point-in-time measurement provides a snapshot. Continuous measurement tracks trends and catches degradation early.
DQS supports both approaches:
- Run ad-hoc scans for immediate assessment
- Schedule recurring scans for ongoing monitoring
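To show what continuous measurement adds over a snapshot, here is a minimal sketch that compares the latest scan score with the average of the scans before it. The window size, the tolerance, and the score history are assumptions, and this is not a description of how DQS evaluates scheduled scans.

```python
def degraded(history, window=3, tolerance=2.0):
    """Return True when the latest score is more than `tolerance` points
    below the mean of the `window` scores that preceded it."""
    if len(history) <= window:
        return False  # not enough scans yet to form a baseline
    baseline = sum(history[-window - 1:-1]) / window
    return baseline - history[-1] > tolerance


# Hypothetical completeness scores from four scheduled scans.
steady = [96.0, 95.5, 96.5, 95.0]
dropping = [96.0, 95.5, 96.5, 91.0]

print(degraded(steady))
print(degraded(dropping))
```

A single ad-hoc scan of the last value in `dropping` would report 91%, which might look acceptable on its own. Only the history shows that it is a five-point fall from the recent baseline.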
Why Organizations Struggle
1. Data Silos
When data lives in disconnected systems, inconsistencies occur naturally. Sales has one version of a customer record. Support has another. Neither knows which is correct.
2. Manual Entry Errors
Human data entry is prone to typos, inconsistent formatting, and missing information. Without validation rules, these errors compound over time.
3. No Clear Ownership
When no one is responsible for data quality, it becomes everyone’s problem and no one’s priority. Data stewardship requires explicit assignment.
4. Lack of Measurement
You cannot improve what you do not measure. Many organizations assume their data is good enough without establishing baselines or tracking metrics.
5. One-Time Cleanup Projects
Treating data quality as a project rather than a process leads to temporary improvements that degrade over time.
The Business Impact
Poor data quality affects every function:
| Function | Impact |
|---|---|
| Marketing | Campaigns sent to wrong addresses, wasted spend |
| Sales | Time wasted on duplicate leads, lost context |
| Finance | Inaccurate reports, compliance risks |
| Operations | Decisions based on flawed data |
| AI/ML | Models trained on bad data produce bad outputs |
Quantifying the Cost
Research from MIT Sloan and industry studies shows:
- Organizations lose 15-25% of revenue annually due to poor data quality
- Over 25% of organizations lose more than $5 million per year on data issues (IBM 2025)
- Employees spend up to 27% of their time correcting bad data
Connection to AI Readiness
Traditional data quality (the five dimensions) prepares your data for reporting and automation. AI applications like Agentforce depend on the same foundations: complete records, valid formats, consistent values, current data, and no duplicates.
On top of those five dimensions, AI deployment introduces one additional concern: sensitive data exposure. Before connecting AI agents to your Salesforce data, you need to know where PII lives so you can mask or exclude it.
DQS measures both traditional data quality and AI readiness in a single platform:
- Five Data Quality dimensions: Completeness, Validity, Uniqueness, Timeliness, Consistency
- PII Detection: Scans text fields for sensitive data (SSNs, credit cards, personal info) before AI exposure
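As an illustration of what pattern-based PII scanning can look like, the sketch below searches free text for US Social Security number formats and for card-like digit runs that pass the Luhn checksum. The patterns and function names are assumptions chosen for this example. They do not describe how DQS detects PII, and real detection needs broader patterns and context checks.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")          # US SSN layout only (assumption)
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")       # 13 to 19 digits, optional separators


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right and sum the results."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text):
    """Return (kind, matched_text) pairs for SSN-like and card-like values."""
    findings = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_valid(digits):  # filters out most random digit runs
            findings.append(("credit_card", m.group()))
    return findings


note = "Customer SSN is 123-45-6789, card 4111 1111 1111 1111 on file."
print(find_pii(note))
```

The Luhn check is what keeps an order number or phone-like digit run from being reported as a card. A scan like this tells you which fields to mask or exclude before an AI agent can read them.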
Building a Data Quality Practice
Effective data quality requires three elements:
1. Measurement
Establish baselines before improvement. Know where you stand across each dimension and field.
2. Process
Define workflows for ongoing data maintenance:
- Data entry validation rules
- Regular cleansing schedules
- Issue escalation procedures
- Change management protocols
3. Culture
Build organization-wide commitment:
- Assign data stewards for each domain
- Include data quality in performance metrics
- Celebrate improvements and share wins
- Make quality visible through dashboards
Getting Started with DQS
DQS provides the measurement foundation for your data quality practice:
- Select capabilities: Choose which dimensions to measure
- Define scope: Pick the objects and fields to analyze
- Configure thresholds: Set your quality standards
- Run scans: Execute analysis across your data
- Review results: Identify issues and prioritize fixes
The first step is understanding your current state. Take the AI Readiness Assessment to benchmark your data quality maturity in 3 minutes.
Next Steps
- Working in Salesforce? Start with Data Quality in Salesforce
- Measure it with a Data Quality Score — also called a data reliability score
- Dive deeper into Completeness, the first dimension
- Read about The Five Dimensions for a complete overview
- Learn about Agentforce Preparation for AI-specific requirements
- Take the AI Readiness Assessment to see your current scores