Data Quality Verification
Dune is committed to delivering high-fidelity blockchain data that teams can trust for critical analytics, applications, and business decisions. Our data quality framework ensures accuracy, completeness, and consistency across 100+ blockchains through continuous automated verification and monitoring. Data quality isn’t a one-off test. It’s a set of automated, continuous checks running at every stage of our pipeline, from raw block ingestion through transformation and delivery.

Monitoring: Status Page · Data Freshness
Trusted at Scale
Dune has been indexing and serving blockchain data longer than any other onchain data platform. Our infrastructure processes millions of queries per day across 100+ blockchains, powering analytics for leading exchanges, DeFi protocols, investment firms, and blockchain foundations worldwide. This isn’t just scale for its own sake; it’s operational proof. We have encountered, and built systems to handle, every chain-specific edge case, every reorg pattern, and every data quality challenge that exists in onchain data. The quality framework described on this page has been refined through years of production operation at a scale no other onchain data provider matches.

“Huge credit to the Dune team for their speed, precision, and commitment to making onchain data accessible and actionable.” Luke Judges, Director at Ripple
How Our Pipeline Works
Dune’s data pipeline processes blockchain data through three stages, each with its own quality controls:

1. Ingestion: We ingest raw block, transaction, log, and trace data directly from blockchain nodes. Automated checks at this stage catch missing blocks, duplicates, and malformed data.
2. Automated Quality Processing: Continuous, 24/7 reconciliation and validation runs across all data tiers. This includes gap detection, reorg handling, deduplication passes, ABI validation for decoded data, and cross-reference checks between data layers.
3. Validated Output: The result is production-ready datasets that have been validated through the processes above. Raw data preserves full chain history with integrity guarantees. Decoded data accurately parses contract interactions. Curated datasets like dex.trades and tokens.transfers apply standardized schemas and business logic to make blockchain data easily accessible for all your use cases.
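To make the three-stage flow concrete, here is a minimal sketch of a pipeline with per-stage checks. The record shapes, function names, and checks are simplified assumptions for illustration only, not Dune’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Block:
    number: int
    tx_count: int

def ingest(raw_blocks: list[dict]) -> list[Block]:
    """Stage 1: parse raw node output, rejecting malformed payloads."""
    blocks = []
    for raw in raw_blocks:
        if "number" not in raw or raw.get("tx_count", -1) < 0:
            raise ValueError(f"malformed block payload: {raw}")
        blocks.append(Block(number=raw["number"], tx_count=raw["tx_count"]))
    return blocks

def validate(blocks: list[Block]) -> list[Block]:
    """Stage 2: gap and duplicate checks before anything is published."""
    numbers = [b.number for b in blocks]
    if len(numbers) != len(set(numbers)):
        raise ValueError("duplicate block numbers detected")
    missing = set(range(min(numbers), max(numbers) + 1)) - set(numbers)
    if missing:
        raise ValueError(f"missing blocks: {sorted(missing)}")
    return sorted(blocks, key=lambda b: b.number)

def publish(blocks: list[Block]) -> None:
    """Stage 3: only validated blocks reach downstream tables."""
    for block in blocks:
        print(f"block {block.number}: {block.tx_count} transactions")

publish(validate(ingest([{"number": 100, "tx_count": 2}, {"number": 101, "tx_count": 0}])))
```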
Our Data Quality Framework
Gap Detection
We ensure complete chain histories with no missing blocks or transactions. Our automated systems continuously scan for gaps in data sequences and flag any missing records for immediate remediation.
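A minimal sketch of the idea, assuming block numbers are the sequence being scanned; this is an illustration of gap detection, not Dune’s internal tooling.

```python
def find_gaps(block_numbers: list[int]) -> list[tuple[int, int]]:
    """Return inclusive (start, end) ranges of missing block numbers."""
    seen = sorted(set(block_numbers))
    gaps = []
    for prev, curr in zip(seen, seen[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps

# Blocks 103-104 were never indexed, so they are flagged for re-harvesting.
print(find_gaps([100, 101, 102, 105, 106]))  # [(103, 104)]
```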
Duplicate Detection
Our data pipeline includes validation checks to prevent duplicate records from appearing in our datasets. This is critical for accurate counting, aggregation, and analysis.
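As an illustrative sketch (not Dune’s schema or code), a duplicate check can be as simple as counting records by their primary key, here assumed to be the transaction hash.

```python
from collections import Counter

def find_duplicates(tx_hashes: list[str]) -> list[str]:
    """Return transaction hashes that appear more than once."""
    return [h for h, n in Counter(tx_hashes).items() if n > 1]

print(find_duplicates(["0xaa", "0xbb", "0xaa"]))  # ['0xaa']
```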
Block Integrity
Every block must contain all of its expected components in the correct structure and sequence. We verify that blocks maintain continuity across chains and that all transactions, event logs, and trace data within each block are complete and properly ordered.
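The sketch below illustrates two such checks with hypothetical field names: parent-hash continuity between consecutive blocks, and completeness and ordering of transaction indices within each block. It is a simplified example, not Dune’s verification code.

```python
def check_block_integrity(blocks: list[dict]) -> list[str]:
    """Report continuity and ordering problems in a run of blocks."""
    errors = []
    # Parent-hash continuity: each block must point at the previous block's hash.
    for prev, curr in zip(blocks, blocks[1:]):
        if curr["parent_hash"] != prev["hash"]:
            errors.append(f"block {curr['number']}: parent hash mismatch")
    # Completeness and ordering of transactions within each block.
    for block in blocks:
        indices = [tx["index"] for tx in block["transactions"]]
        if indices != list(range(len(indices))):
            errors.append(f"block {block['number']}: transaction indices incomplete or out of order")
    return errors

print(check_block_integrity([
    {"number": 1, "hash": "0x01", "parent_hash": "0x00", "transactions": [{"index": 0}]},
    {"number": 2, "hash": "0x02", "parent_hash": "0x01", "transactions": [{"index": 0}, {"index": 2}]},
]))  # ['block 2: transaction indices incomplete or out of order']
```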
Re-org Handling
Blockchain reorganizations can invalidate recently indexed data. Our raw data infrastructure includes reorg detection and handling mechanisms to maintain data accuracy (see the sketch after this list):
- Finality awareness: For chains with higher reorg propensity, we wait for sufficient block confirmations before indexing. This ensures that only finalized, stable blocks are included, preventing the ingestion of potentially reversible transactions.
- Reorg detection: We run checks to identify blocks where chain reorganizations have occurred, flagging affected data for correction.
- Data correction: When reorgs are detected, affected blocks are re-harvested and replaced with the canonical chain state.
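A minimal sketch of these three mechanisms, with an assumed confirmation depth and block hashes standing in for full block data; it illustrates the pattern rather than Dune’s implementation.

```python
CONFIRMATIONS = 12  # illustrative finality threshold; the real depth varies per chain

def safe_head(chain_tip: int) -> int:
    """Highest block number considered stable enough to index."""
    return chain_tip - CONFIRMATIONS

def detect_reorged(stored: dict[int, str], canonical: dict[int, str]) -> list[int]:
    """Block numbers whose stored hash no longer matches the canonical chain."""
    return sorted(n for n, h in stored.items() if canonical.get(n) != h)

def correct(stored: dict[int, str], canonical: dict[int, str]) -> dict[int, str]:
    """Replace reorged blocks with their canonical versions (re-harvested from a node in practice)."""
    fixed = dict(stored)
    for n in detect_reorged(stored, canonical):
        fixed[n] = canonical[n]
    return fixed

stored = {100: "0xaaa", 101: "0xbbb"}
canonical = {100: "0xaaa", 101: "0xccc"}  # block 101 was reorged away
print(detect_reorged(stored, canonical))  # [101]
print(correct(stored, canonical))         # {100: '0xaaa', 101: '0xccc'}
```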
Token Metadata Updates
Token metadata (names, symbols, decimals, contract addresses) changes frequently as new tokens are deployed and standards evolve. Dune continuously updates token metadata across all supported chains to ensure you always have access to the latest information.
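To show what these fields are, the sketch below reads name, symbol, and decimals directly from an ERC-20 contract. It assumes web3.py and an RPC endpoint you supply; it is an illustration of the metadata being tracked, not Dune’s ingestion code.

```python
from web3 import Web3

# Minimal ERC-20 ABI fragment covering only the metadata fields.
ERC20_METADATA_ABI = [
    {"name": "name", "inputs": [], "outputs": [{"name": "", "type": "string"}],
     "stateMutability": "view", "type": "function"},
    {"name": "symbol", "inputs": [], "outputs": [{"name": "", "type": "string"}],
     "stateMutability": "view", "type": "function"},
    {"name": "decimals", "inputs": [], "outputs": [{"name": "", "type": "uint8"}],
     "stateMutability": "view", "type": "function"},
]

def fetch_token_metadata(rpc_url: str, address: str) -> dict:
    """Read onchain metadata for a single ERC-20 token."""
    w3 = Web3(Web3.HTTPProvider(rpc_url))
    token = w3.eth.contract(address=Web3.to_checksum_address(address), abi=ERC20_METADATA_ABI)
    return {
        "address": address,
        "name": token.functions.name().call(),
        "symbol": token.functions.symbol().call(),
        "decimals": token.functions.decimals().call(),
    }
```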
Null Detection & Validation
We identify and address missing values in critical data fields. Our validation rules flag unexpected nulls in essential columns, ensuring data completeness and preventing errors in downstream analysis.
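A minimal sketch of a null check against a set of required columns; the column names are hypothetical and the rule is simplified for illustration.

```python
REQUIRED_COLUMNS = ["block_number", "tx_hash", "from_address"]  # hypothetical schema

def find_null_violations(rows: list[dict]) -> list[tuple[int, str]]:
    """Return (row index, column name) pairs where a required value is missing."""
    violations = []
    for i, row in enumerate(rows):
        for col in REQUIRED_COLUMNS:
            if row.get(col) is None:
                violations.append((i, col))
    return violations

print(find_null_violations([{"block_number": 1, "tx_hash": "0xaa", "from_address": None}]))
# [(0, 'from_address')]
```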
Monitoring & Transparency
Real-time visibility into data health:

Data Freshness
Check current ingestion latency for all supported chains and curated datasets.
Status Page
Subscribe to alerts for any incidents affecting data availability or quality.
“Dune has by far the largest and most comprehensive dataset on the market.” Dan Smith, Senior Research at Blockworks