The purpose of the beta testing phase of the project was to test the concepts and framework, including the methodologies and the guidelines for assessing traffic data quality. It was expected that data from actual State DOT projects would be used to test the applicability of the data quality assessment methodologies. To that end, the framework was sent to selected individuals at the DOTs of Florida, Georgia, Illinois, Maryland, Utah, Virginia, and Washington for review and, where possible, application to their local data. The original intent of testing the framework with state data was abandoned when it became clear that full-scale beta testing placed an unreasonable demand on the public agencies; instead, only reviews of the concepts presented in the framework were requested. A few review comments were received and are summarized below.
In developing the guidelines for implementing the framework, estimates were developed of the level of effort required to establish a data quality assessment system, along with straw man estimates of acceptable data quality levels. Review comments on the guidelines and reality checks on the straw man estimates were sought from various offices of the U.S. DOT, the FHWA, and a few state DOTs; these comments are also presented below. The estimates of the levels of effort and the acceptability levels were revised based on the reviewers' recommendations.
Of the seven state DOTs included in the beta testing, four provided written comments on the framework as a whole. None of the states actually applied the framework to their data. Detailed review comments on the framework follow.
Florida DOT (FDOT) agrees that the quality measures are useful but noted that some may be difficult to calculate. Completeness and validity measures can be calculated easily for continuous counters. However, FDOT believes it would not be practical to compute the accuracy of counters by routinely performing manual counts and comparing them with machine counts, because (i) counting from video is very time consuming, (ii) manual counts are very error prone, and (iii) it is difficult to synchronize the times on the video with the times in the permanent counter.
As far as the timeliness measure is concerned, FDOT considers data to be timely as long as they reside in the database at the time they are extracted for processing.
FDOT estimates 100 percent coverage of the state highway system every year, because volumes on the few roads that cannot be counted due to construction are estimated by applying a growth factor to the most recent measured year.
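The growth-factor estimate FDOT describes can be sketched as follows; the function name, the 2 percent growth rate, and the volumes are illustrative assumptions, not FDOT's actual procedure or values.

```python
def estimate_aadt(last_measured_aadt: float,
                  annual_growth_factor: float,
                  years_since_count: int) -> float:
    """Estimate AADT for a road that could not be counted by applying an
    annual growth factor to the most recent measured value.
    (Illustrative sketch only; not FDOT's actual procedure.)"""
    return last_measured_aadt * (annual_growth_factor ** years_since_count)

# A road last counted 2 years ago at 12,000 AADT, assuming 2% annual growth:
print(round(estimate_aadt(12000, 1.02, 2)))  # -> 12485
```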
Georgia DOT (GDOT) agrees with all of the quality measures and noted that the framework is well-presented and easy to follow. GDOT is currently undertaking a similar exercise in assessing data quality and determining archiving processes and needs. The results to date are generally similar to those presented by the case studies in the framework.
Washington DOT (WsDOT) noted that the timeliness definition works for its ATR sites. However, for short counts (AADT - archived data), AADTs are calculated only at the end of the year (usually in the May-June timeframe, to coincide with the HPMS submittal), because the factors based on ATR sites are not available until February or March. WsDOT also observed that the coverage measure should mention group factors, i.e., whether there is a sufficient number of stations to accurately develop group factors.
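The dependence of short-count AADTs on ATR-derived group factors can be sketched in a few lines; the function name and factor values below are hypothetical illustrations of the general factoring technique, not WsDOT's actual factors.

```python
def aadt_from_short_count(short_count_adt: float,
                          seasonal_factor: float,
                          axle_factor: float = 1.0) -> float:
    """Expand a short-duration count to an AADT estimate using a seasonal
    (group) factor derived from continuous ATR stations. Until the group
    factors are computed from the full year of ATR data, this estimate
    cannot be produced -- hence the end-of-year timing WsDOT describes.
    (Illustrative sketch; factor values are assumptions.)"""
    return short_count_adt * seasonal_factor * axle_factor

# A 48-hour count taken in a month when volumes run ~10% above the annual
# average is factored down (hypothetical values):
print(round(aadt_from_short_count(11000, 1 / 1.10)))  # -> 10000
```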
Washington State has started using Data Stewards and data dictionaries, and providing a "data mart" from which ISPs can access the data they need. WsDOT is pleased to see this concept in the framework.
Beta testing of the guidelines focused on validating the straw man estimates of the acceptability levels of the quality measures and the estimated level of effort to implement a data quality assessment system within an agency. The following are comments on the draft guidelines.
Minnesota Department of Transportation (MnDOT) noted that data quality and the anticipated variance in data quality are related to so many variables that establishing parameters and targets is extremely difficult. The Office of Transportation Data and Analysis at MnDOT has traditionally approached data quality from at least three perspectives:
* A recount is requested if adjusted volumes exceed acceptable percent change margins.
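MnDOT's recount rule can be sketched as a simple threshold check; the 25 percent margin and function name below are illustrative assumptions, not MnDOT's actual acceptance margin.

```python
def recount_needed(adjusted_volume: float,
                   prior_volume: float,
                   max_percent_change: float = 25.0) -> bool:
    """Flag a count for recounting when the adjusted volume differs from
    the prior volume by more than the acceptable percent change margin.
    (The 25% default is a hypothetical value for illustration.)"""
    if prior_volume == 0:
        return True  # no basis for comparison; review manually
    percent_change = abs(adjusted_volume - prior_volume) / prior_volume * 100
    return percent_change > max_percent_change

print(recount_needed(13000, 10000))  # -> True  (30% change)
print(recount_needed(10500, 10000))  # -> False (5% change)
```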
Offices of the FHWA reviewed the straw man estimates of acceptable levels for the quality measures and the levels of effort. The suggested changes to the initial estimates were incorporated.
There was some discussion about the data completeness measure. It was observed that a single completeness statistic does not distinguish completeness over time from completeness across locations at the same time. For example, consider two extreme cases in which 95 percent of the possible readings are provided, i.e., completeness (or availability) is 95 percent. In the first case, data from 95 percent of the sensors are always available, but data from the other 5 percent of the sensors are never available. In the second case, data from all the sensors are available 95 percent of the time. The two cases differ significantly from each other, yet both yield an overall availability of 95 percent. By this example, a single statistic combines spatial and temporal completeness and can therefore be misleading. It was noted that completeness needs to be addressed in greater detail, i.e., a user needs a report of when and where data are available.
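The two extreme cases above can be reproduced in a few lines of Python; the 100-sensor, 100-interval grid is an illustrative construction, not real sensor data.

```python
n_sensors, n_intervals = 100, 100  # 1 = reading received, 0 = missing

# Case 1: 95 sensors always report; the other 5 never report.
case1 = [[1] * n_intervals if s < 95 else [0] * n_intervals
         for s in range(n_sensors)]

# Case 2: every sensor reports in 95 of the 100 intervals.
case2 = [[1] * 95 + [0] * 5 for _ in range(n_sensors)]

def overall_completeness(grid):
    """Fraction of expected readings actually received (single statistic)."""
    return sum(map(sum, grid)) / (len(grid) * len(grid[0]))

def per_sensor_completeness(grid):
    """Completeness of each sensor individually (spatial detail)."""
    return [sum(row) / len(row) for row in grid]

# The single statistic cannot tell the two cases apart:
print(overall_completeness(case1), overall_completeness(case2))  # -> 0.95 0.95

# The per-sensor breakdown can:
print(min(per_sensor_completeness(case1)))  # -> 0.0  (five dead sensors)
print(min(per_sensor_completeness(case2)))  # -> 0.95 (all sensors equally spotty)
```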
It is important to note that a single completeness statistic as illustrated above is not misleading in itself. It indicates the overall magnitude of data completeness, in the same way that a national congestion statistic conveys an overall picture without providing details on congestion at specific locations. More detailed completeness statistics are desirable, and this can be achieved by reporting the completeness measure at several levels of detail. It is conceivable that operations managers and other mid-level managers would appreciate a single completeness statistic (rather than having to run their own calculations to obtain one). Based on the single statistic, one can determine whether further detail is required to characterize the quality of the data. Similarly, there could be links to other levels of detail, e.g., "completeness by day" or "completeness by road," permitting detailed analysis by the desired category.
The framework is developed to provide a single statistic for the completeness measure, intended as an "overall" measure of data completeness. Detailed information about the spatial and temporal variability of completeness can then be obtained if desired.
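The multi-level reporting described above, a single overall statistic with drill-downs such as "completeness by road" and "completeness by day," can be sketched as follows; the road names, dates, and reading counts are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (road, day, readings_received, readings_expected),
# assuming 288 five-minute readings expected per counter per day.
records = [
    ("I-95",  date(2003, 6, 1), 280, 288),
    ("I-95",  date(2003, 6, 2), 288, 288),
    ("SR-50", date(2003, 6, 1), 144, 288),
    ("SR-50", date(2003, 6, 2), 288, 288),
]

def completeness(rows):
    """Received readings as a fraction of expected readings."""
    received = sum(r[2] for r in rows)
    expected = sum(r[3] for r in rows)
    return received / expected

# Top level: the single overall statistic provided by the framework.
print(f"overall: {completeness(records):.3f}")        # -> overall: 0.868

# Drill-down: completeness by road (spatial detail).
by_road = defaultdict(list)
for r in records:
    by_road[r[0]].append(r)
for road, rows in sorted(by_road.items()):
    print(f"{road}: {completeness(rows):.3f}")        # I-95: 0.986, SR-50: 0.750

# Drill-down: completeness by day (temporal detail).
by_day = defaultdict(list)
for r in records:
    by_day[r[1]].append(r)
for day, rows in sorted(by_day.items()):
    print(f"{day}: {completeness(rows):.3f}")         # 0.736, then 1.000
```

The same aggregation function serves every level; only the grouping key changes, which is what makes the layered reporting cheap to provide once the raw records are archived.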