Executive Summary

Introduction

One of the foremost recommendations from the FHWA-sponsored workshops on Traffic Data Quality (TDQ) in 2003 was a call for "guidelines and standards for calculating data quality measures." These guidelines and standards are expected to contain methods for calculating and reporting the data quality measures for various applications and levels of aggregation.

The objective of this project is to develop methods and tools that enable traffic data collectors and users to determine the quality of the traffic data they provide, share, and use. This report presents a framework of methodologies for calculating data quality metrics for different applications, illustrated with case study examples. The report also presents guidelines and standards for calculating data quality measures that are intended to address key traffic data quality issues.

Framework for Data Quality Measurement

The framework is developed based on the six recommended fundamental measures of traffic data quality: accuracy, completeness, validity, timeliness, coverage, and accessibility.
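To make the measures concrete, the sketch below shows how three of them (completeness, validity, and accuracy) might be computed for a single count station. This is a minimal illustration only; the record layout, the range check used for validity, and the example volumes are assumptions, not part of the framework itself.

```python
# Illustrative sketch only: the record fields, the validity rule, and the
# reference counts below are assumptions, not part of the FHWA framework.

def completeness(records, expected_count):
    """Percent of expected observations actually received."""
    return 100.0 * len(records) / expected_count

def validity(records, min_volume=0, max_volume=3000):
    """Percent of received records passing a simple range check (assumed rule)."""
    valid = [r for r in records if min_volume <= r["volume"] <= max_volume]
    return 100.0 * len(valid) / len(records)

def accuracy_mape(measured, reference):
    """Mean Absolute Percent Error against ground-truth reference counts."""
    errors = [abs(m - t) / t for m, t in zip(measured, reference)]
    return 100.0 * sum(errors) / len(errors)

# Example: a detector that reported only 6 of 24 expected hourly volumes.
records = [{"volume": v} for v in (120, 95, 80, 4000, 150, 210)]
print(round(completeness(records, expected_count=24), 1))         # 25.0
print(round(validity(records), 1))                                # 83.3 (4000 fails)
print(round(accuracy_mape([100, 210, 330], [110, 200, 300]), 1))  # 8.0
```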

The framework takes into account that there are different types of traffic data and different customers and users, and it recognizes that traffic data is used for different applications. As such, the needs and quality requirements differ across data customers and applications. Table ES-1 shows the range of data consumers, types of data, and possible applications considered in developing the framework.

Table ES-1. Types of Data Consumers and Applications

Data Consumers | Type of Data | Applications or Users
Traffic operators (of all stripes) | Original source data; archived source data | Traffic management; incident management
Archived data administrators | Original source data | Database administration
Archived data users (planners and others) | Original source data; archived source data; archived processed data | Analysis; planning; modeling (development and calibration)
Traffic data collectors | Original source data; archived source data | Traffic monitoring; equipment calibration; data collection planning
Information Service Providers | Original source data (real time) | Dissemination of traveler information
Travelers | Traveler information | Pre-trip planning

The framework is structured as a sequence of steps for calculating the data quality measures and assessing quality, as shown in Figure ES-1. The first step is to determine the potential consumers or users of the data. This matters because the type of data consumer or application determines the type of data, and thus the methods for calculating the quality measures and the thresholds for evaluating data quality. The remaining steps cover methods for calculating each data quality measure to allow a quantitative assessment of quality, establishing acceptable quality targets, and reporting data quality.

The application of the methods in the framework is illustrated with three case studies. The case studies are intended only to illustrate the application of the methodologies in evaluating traffic data quality; they are not intended to, and do not, represent a review of the quality of the data of the agencies that provided data for these case studies.

Figure ES-1. Structure of Data Quality Assessment Framework

Guidelines for Data Quality Measurement

The guidelines address technical issues related to the data quality standards and data sharing, provide estimates of the level of effort required for measuring and reporting data quality, and specify procedures for using metadata. The guidelines include the essential elements described below.

Establishing Acceptable Data Quality Targets

Estimated data quality targets are provided for different applications and are defined for the six data quality measures. These targets reflect acceptable quality based on the data user's needs and applications. Depending on the user and application, a data quality measure falling outside its threshold could mean the data are unacceptable for the intended application, or it could indicate that the data ought to be used with caution. Table ES-2 shows the estimated data quality targets for some representative transportation applications. These estimates were based on experience and validated through beta testing with several FHWA offices. With regard to the accessibility measure, it is estimated that all applications can be adequately served with access times in the 5- to 10-minute range.
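As an illustration of applying these thresholds, the sketch below compares a computed accuracy value against the Table ES-2 targets for the Highway Performance Monitoring System. The dictionary keys and the choice of each range's upper bound are assumptions made for this example.

```python
# Sketch of checking a computed MAPE against the Table ES-2 HPMS accuracy
# targets. Key names and the use of each range's upper bound are assumptions.

HPMS_MAPE_TARGETS = {
    "urban_interstate": 10.0,   # 5-10% in Table ES-2; upper bound used here
    "other_urban": 10.0,
    "rural_interstate": 8.0,
    "other_rural": 10.0,
}

def check_accuracy(facility_type, mape_pct):
    target = HPMS_MAPE_TARGETS[facility_type]
    if mape_pct <= target:
        return f"{mape_pct}% MAPE meets the {target}% target"
    return f"{mape_pct}% MAPE is outside the {target}% target: use with caution"

print(check_accuracy("rural_interstate", 7.2))   # meets target
print(check_accuracy("rural_interstate", 12.5))  # outside target
```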

Level of Effort Required for Traffic Data Quality Assessment

The extra cost associated with assessing and reporting data quality was considered an important issue at the regional TDQ workshops. Estimates of the level of effort are expressed in hours of labor required to implement a data quality assessment program. These estimates represent the effort required to assess the quality of existing data; they do not account for the effort required to maintain or improve data quality. Table ES-3 shows estimated levels of effort for developing and maintaining a data quality assessment system within an agency.

Specifications and Procedures for Using Metadata for Reporting Data Quality

Metadata is an extremely important consideration for data sharing in general, and especially for communicating data quality. Commonly referred to as "data about data," metadata is typically thought of as a description of a dataset. It is recommended that the ASTM metadata standard, once approved, be used for documenting traffic data quality.
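As a simple illustration of the idea, the sketch below embeds the six data quality measures in a dataset-level metadata record. The field names and structure are invented for this example and do not reproduce the ASTM standard's schema.

```python
# Hypothetical metadata record carrying data quality alongside a dataset.
# All field names and values are illustrative assumptions.
import json

metadata = {
    "dataset": "Freeway mainline detector volumes, District 5",
    "collection_period": "2004-01-01/2004-12-31",
    "data_quality": {
        "accuracy_mape_pct": 8.5,
        "completeness_pct": 92.0,
        "validity_pct": 97.5,
        "timeliness": "archived within 24 hours of collection",
        "coverage": "58% of freeway mileage",
        "accessibility": "public archive, typical retrieval under 10 minutes",
    },
}

print(json.dumps(metadata, indent=2))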

Table ES-2. Data Quality Targets

Each entry below lists targets for five data quality measures¹: accuracy², completeness, validity, timeliness, and typical coverage.

Transportation Planning Applications: Standard demand forecasting for Long Range Planning
Data: Daily traffic volumes
Accuracy²: Freeways: 7%; principal arterials: 15%; minor arterials: 20%; collectors: 25%
Completeness: At a given location, 25% (12 consecutive hours out of a 48-hour count)
Validity: Up to 15% failure rate for 48-hour counts; up to 10% failure rate for permanent count stations
Timeliness: Within three years of model validation year
Typical coverage: 55-60% of freeway mileage; 25% of principal arterials; 15% of minor arterials; 10-15% of collectors

Transportation Planning Applications: Highway Performance Monitoring System
Data: AADT
Accuracy²: 5-10% urban Interstate; 10% other urban; 8% rural Interstate; 10% other rural (mean absolute error)
Completeness: 80% for continuous counts; 70-80% for portable machine counts (24-/48-hour counts)
Validity: Up to 15% failure rate for 48-hour counts; up to 10% failure rate for permanent count stations
Timeliness: Data one year old or less
Typical coverage: 55-60% of freeway mileage; 25% of principal arterials; 15% of minor arterials; 10-15% of collectors

Transportation Operations: Traveler Information
Data: Travel times for entire trips or portions of trips over multiple links
Accuracy²: 10-15% RMSE
Completeness: 95-100% valid data
Validity: Less than 10% failure rate
Timeliness: Data required close to real time
Typical coverage: 100% area coverage

Highway Safety: Exposure for safety analysis
Data: AADT and VMT by segment
Accuracy²: 5-10% urban Interstate; 10% other urban; 8% rural Interstate; 10% other rural (mean absolute error)
Completeness: 80% for continuous count data; 50% for portable machine counts (24-/48-hour counts)
Validity: Up to 15% failure rate for 48-hour counts; up to 10% failure rate for permanent count stations
Timeliness: Data one year old or less
Typical coverage: 55-60% of freeway mileage; 25% of principal arterials; 15% of minor arterials; 10-15% of collectors

Pavement Management: Historical and forecasted loadings
Data: Link vehicle class
Accuracy²: 20% combination unit; 12% single unit
Completeness: 80% for continuous count data; 50% for portable machine counts (24-/48-hour counts)
Validity: Up to 15% failure rate for 48-hour counts; up to 10% failure rate for permanent count stations
Timeliness: Data three years old or less
Typical coverage: 55-60% of freeway mileage; 25% of principal arterials; 15% of minor arterials; 10-15% of collectors

Notes:
¹ "Accessibility" for all applications is discussed in the text.
² Percentage figures correspond to estimates of Mean Absolute Percent Error (MAPE), except where another error measure is noted.
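For reference, the MAPE cited in the table notes is conventionally computed as

\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \frac{|x_i - t_i|}{t_i} \]

where x_i is the measured value, t_i is the corresponding ground-truth reference value, and n is the number of observations.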

Table ES-3. Level of Effort Estimates for Traffic Data Quality Assessment and Reporting

Task | Action item | Assumed units | Level of effort | Frequency

General
Develop mechanism/system for data quality assessment | Develop data reduction software or procedures | Per program | 40 hours | One time
 | Design and implement input data procedures | Per program | 40 hours | One time
 | Test, refine, and update systems and software | Per program | 40 hours | Periodic
Develop data quality reporting system | Design/develop reporting procedures and metadata templates | Per program | 40 hours | One time

Accuracy
Develop reference or ground-truth data | Design and collect sample baseline data | Per site or data source | 8 hours | As required
Assess accuracy of original source field data (using independent equipment) and of archived data | Download and process review data; implement framework/software to calculate accuracy measures | Per site or data source | 1 hour | As required
 | Review results against targets | Per site or data source | 15 minutes | As required

Completeness, validity, and timeliness
Assess quality of original source and archived data | Download, process, and review data; implement framework to calculate quality measures | Per site or data source | 1 hour | As required
 | Review results against targets | Per site or data source | 15 minutes | As required

Coverage and accessibility
Assess coverage and accessibility of data for the program | Review coverage and accessibility requirements for the program | Per program | 1 hour | As required
 | Download and review data; implement framework to evaluate data | Per program | 1 hour | As required

Data quality reporting and improvements
Summarize and report data quality to potential users (metadata) | Compile and report data quality to users (metadata) | Per program | 8 hours | Periodic/as required
Identify improvements and communicate quality problems | Communicate quality problems to field personnel; schedule maintenance | Per site or data source | 4 hours | Periodic/as required

Note: "As required" means based on need and time scales, e.g., annual, semi-annual, monthly, weekly, daily, or per request.

Guidelines for Data Sharing Agreements

Data sharing agreements codify the roles, expectations, and responsibilities among the parties providing and using traffic data. Such agreements can occur entirely between public entities, entirely between private entities, or between private and public entities. Table ES-4 presents suggested minimum data acceptance standards for incorporating ITS-generated traffic data into traffic monitoring programs for planning and engineering purposes.

Table ES-4. Standards for Data Transfer Agreements

Type of location | Proposed minimum quantity standard | Proposed quality standard

Roadway sections
Single location | Seven consecutive days per month |
Single corridor | 100 percent coverage one day per month | Daily count within 10 percent of machine or manual count; within 15 percent of hourly count, as measured once per year. Twenty percent sample of locations.
Areawide | 75 percent coverage one day per month | Daily count within 10 percent of machine or manual count; within 15 percent of hourly count, as measured once per year. Five percent sample of locations.

Intersections
Single location | Seven consecutive days per month | N/A
Single corridor | 100 percent coverage one day per month | Five and 10 percent standard applied every five miles in corridor once per year. Five percent sample of intersection locations.
Areawide | 75 percent coverage one day per month | Five and 10 percent standard applied to one location per corridor per year. One percent sample of locations.
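As an illustration, the sketch below checks a roadway-section corridor against the Table ES-4 standards, reading the quality standard as daily counts within 10 percent and hourly counts within 15 percent of a machine or manual reference count. The function names and example values are assumptions for this example.

```python
# Sketch of applying the Table ES-4 roadway-section corridor standards;
# the interpretation of the quality standard and all values are assumptions.

def meets_corridor_quantity(days_with_full_coverage_this_month):
    """Single corridor: 100 percent coverage at least one day per month."""
    return days_with_full_coverage_this_month >= 1

def meets_corridor_quality(its_daily, ref_daily, its_hourly, ref_hourly):
    """Daily count within 10 percent, and hourly count within 15 percent,
    of the machine or manual reference count (annual check, as read here)."""
    daily_ok = abs(its_daily - ref_daily) / ref_daily <= 0.10
    hourly_ok = abs(its_hourly - ref_hourly) / ref_hourly <= 0.15
    return daily_ok and hourly_ok

print(meets_corridor_quantity(2))                       # True
print(meets_corridor_quality(18500, 19800, 950, 1150))  # False (hourly off ~17%)
```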

Conclusions and Recommendations

Data quality is ultimately defined by the extent to which a data set satisfies the needs of the person judging it. A better understanding of data quality, and the means to assess it, offers various benefits, including greater confidence and efficacy in decisions based on the data. This project developed a framework and guidelines for measuring and assessing the quality of traffic data for different applications. The case studies used to illustrate the application of the framework were selected to represent a diverse range of data sources and applications. The guidelines include guidance on quality targets, the levels of effort required to establish a data quality assessment system within an agency, approaches for reporting data quality through metadata, and standards for data sharing agreements. The examples for metadata and the proposed standards for data sharing agreements provide useful guidance in those areas.

The beta testing, although limited, provided an opportunity to validate the concepts and methodologies presented in the framework, as well as some straw-man estimates of data quality targets and levels of effort. It is recommended that the estimated levels of effort and quality targets be further tested and validated against actual experience in using the framework and guidelines.

