
APPENDIX A: AUSTIN, TEXAS CASE STUDY

Introduction

The following sections describe procedures for calculating these data quality measures in a specific setting: traffic data collection, dissemination, and archiving in Austin, Texas. Because the exact calculation of the data quality measures may vary depending upon the specific data consumer, it is useful to first identify the primary data consumers in the traffic data flows (as defined in the National ITS Architecture at http://itsarch.iteris.com/itsarch/) and thus whose perspectives should be represented in calculating data quality. Readers should note that most of the information and details in this case study are accurate; however, some details and results have been embellished or simplified for the purposes of the example.

Traffic Data Flows: Identifying the Data Consumers

Figure A.1 shows a simplified version of the physical entities (as defined in the National ITS Architecture) involved in traffic data collection, dissemination, and archiving in Austin, Texas. Figure A.2 illustrates the data flows from another perspective, with additional detail related to the specific context of Austin traffic data. In this example, there are five primary data consumers whose perspectives should be represented in calculating data quality measures:

  1. Traffic operations personnel at the traffic management center;
  2. The archived data administrator;
  3. The information service provider (ISP);
  4. Archived data users; and
  5. Travelers.

In this example, the archived data administrator and the ISP use the exact same data stream (i.e., original source data) as the traffic operations personnel. Thus, these three data consumers should share a common definition for data quality measures since their data is identical (they may, however, have different views of what quality level is acceptable for the six measures). From Figure A.2, it can be seen that there are three different types of traffic data for which we should calculate data quality:

  1. Original source data;
  2. Archive database; and
  3. Traveler information.

Simplified chart represents mapping of traffic data in Austin case study to National ITS Architecture. Using wireless, fixed-point, short-range, and vehicle-to-vehicle communications, traffic data flow among travelers, centers, field, and vehicles.

Figure A.1. Simplified Austin Case Study Mapped to National ITS Architecture

In diagram, Austin traffic data flow via private network to traffic operations personnel and via secure FTP to archived data administrator and ISP. Via public Web site, archived data users access database and travelers (also via phones) access ISP data.

Figure A.2. Data Flows and Data Consumers in Austin Case Study

An inherent principle in this methodology is the need to re-calculate data quality when the data has undergone significant change or transformation. Thus, the data quality results at different points in the data flow may differ slightly because the data itself is modified as it flows from field devices to various data consumers. The following sections describe specific calculation procedures for the six data quality measures for these three types of data.

Calculation of Data Quality Measures

For the Austin case study, we consider a single day of data (i.e., August 29, 2003) collected by the Texas Department of Transportation (TxDOT) as an example. Readers should note that data quality could also be reported for other time scales, such as every hour, week, month, or year. For this particular example day, there were 654 unique single- and double-loop detectors (in which a "detector" measures traffic data for a lane) configured to report lane-by-lane traffic data (i.e., volume, occupancy, speed) at 1-minute intervals. Each 1-minute reading from each detector is considered to represent one record. For example, Figure A.3 shows a sample of the original source data for Austin.

DET_ID   DATE       END_TIME  VOLUME  OCC  SPEED
6009921  8/29/2003  7:00:24   22      8    67
6009921  8/29/2003  7:01:23   27      10   63
6009921  8/29/2003  7:02:23   23      9    68
6009921  8/29/2003  7:03:23   29      10   68
6009921  8/29/2003  7:04:23   19      11   67
6009921  8/29/2003  7:05:23   34      12   68
6009921  8/29/2003  7:06:23   22      12   67
6009921  8/29/2003  7:07:23   28      11   63
6009921  8/29/2003  7:08:23   29      11   67
6009921  8/29/2003  7:09:24   22      8    63
6009921  8/29/2003  7:10:23   18      7    68
6009921  8/29/2003  7:11:23   28      10   66
6009921  8/29/2003  7:12:23   21      12   66
6009921  8/29/2003  7:13:23   29      11   65
6009921  8/29/2003  7:14:23   34      13   66
6009921  8/29/2003  7:15:24   20      7    69

Figure A.3. Sample of Original Source Data for Austin

The Austin data was used to illustrate the calculation of the six data quality measures, as described below.

Accuracy

For the purposes of this example, we assume that reference measurements are available for the three different versions of data: original source data, archive database, and traveler information.

Original Source Data

In this example, consider that we would like to know the accuracy of the speed values in the original source data. Ground truth data have been obtained from a calibrated portable reference sensor that was temporarily installed at a representative sample of detector locations. To compute the accuracy of the original source data, we will summarize the reference data to 1-minute time periods to match those of the detector data being tested.

For visual reference, a chart is created that compares the actual speed measurements to the reference measurements (Figure A.4). The mean absolute percent error was calculated as 12.0 percent using Equation 1, and the root mean squared error was calculated as 11 mph using Equation 2.
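Equations 1 and 2 are not reproduced in this appendix; assuming they follow the standard definitions of mean absolute percent error and root mean squared error, a sketch in LaTeX, where x_i is the test detector measurement, r_i is the paired reference measurement, and n is the number of paired observations:

    \text{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \frac{\lvert x_i - r_i \rvert}{r_i}
    \qquad \text{(Equation 1)}

    \text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( x_i - r_i \right)^2}
    \qquad \text{(Equation 2)}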

Chart compares test detector speeds with reference sensor speeds in miles per hour to determine accuracy of speed values in original source data for Austin. Mean absolute percent error is 12.0%, and root mean squared error is 11 mph.

Figure A.4. Accuracy of Speed Values in Original Source Data

Archive Database

In this example, consider that we wish to compare the accuracy of traffic volume values from an operations-based sensor to a nearby automatic traffic recorder (ATR) that has recently been calibrated. One of the many data products available through the data archive are hourly traffic volumes; therefore, the reference measurements are also summed to match the exact date and time of the hourly traffic volumes in the data archive.

For visual reference, a chart is created that compares the volume counts from the archive database to the reference counts from the ATR (Figure A.5). The mean absolute percent error was calculated as 4.4 percent using Equation 1, and the root mean squared error was calculated as 131 vehicles using Equation 2.

Chart compares hourly volume from archive database with reference automatic traffic recorder to determine accuracy of hourly traffic volumes in archive database for Austin. Mean absolute percent error is 4.4%, and root mean squared error is 131 vehicles.

Figure A.5. Accuracy of Hourly Traffic Volumes in Archive Database

In this example, one can see that the accuracy of the hourly traffic volumes in the archive database is fairly good, with the mean absolute percent error being less than 5 percent.

Traveler Information

In this example, the ISP provides route-based speed and travel time reports on its website and through a voice-responsive phone system. The route speeds and travel times are updated every minute in both systems, while the speed and travel time values are based on a rolling 2-minute average. The ISP also estimates route speeds and travel times if some of the original source data are missing.

As a means to ensure a quality product, the ISP arranges for reference travel time measurements to be obtained along selected Austin routes for various times of the day. The ISP uses the travel time accuracy procedures described in an FHWA report (Travel Time Data Collection for Measurement of Advanced Traveler Information Systems Accuracy, EDL Document No. 13867).

The ISP travel times are visually compared to the reference travel times using similar charts (see Figure A.6). The mean absolute percent error was calculated as 8.6 percent using Equation 1, and the root mean squared error was calculated as 1.56 minutes using Equation 2.

Chart compares ATIS route travel times with reference travel times in minutes to determine accuracy of route travel time values in traveler information for Austin. Mean absolute percent error is 8.6%, and root mean squared error is 1.56 minutes.

Figure A.6. Accuracy of Route Travel Time Values in Traveler Information

Completeness

In the Austin example, we calculated data completeness for the three different versions of data: original source data, archive database, and traveler information. In this particular example, the data process includes the flagging and eventual purging of invalid data values. Therefore, the completeness statistics will only include valid data values. The potential contribution of invalid data values to the completeness measure can be determined by combining the completeness and validity statistics.

Original Source Data

In Austin, there are 513 on-line detectors that should report a data record every minute for the entire day. Another 141 detectors are configured but "off-line" for acceptance or evaluation testing—these detectors are not counted in the completeness measure because they are not malfunctioning. Of the 513 detectors, 78 are non-trap or single-loop detectors (mostly on ramps and service roads) that only report volume and occupancy values. The remaining 435 detectors are trap or double-loop detectors that report volume, occupancy, and speed values. Thus, we expect to have 738,720 valid volume and occupancy records per day (513 total detectors × 1,440 records per day) but only 626,400 valid speed records per day (435 trap detectors × 1,440 records per day). The Austin field computers perform basic validity tests on 20-second data and replace "invalid" values with "-1" values. When computing completeness, however, we only consider valid values, so the "-1" values are not counted as valid. Table A.1 contains the completeness statistics and data used in the calculations. The completeness statistics in Table A.1 indicate that the original source data in Austin is almost fully complete, with only 1 to 2 percent of the data being incomplete (i.e., missing or invalid).

Table A.1. Completeness Statistics for Original Source Data
                                              Volume        Occupancy     Speed
Number of records with valid values           731,787       731,787       616,991
Number of records that require valid values   738,720       738,720       626,400
                                              (513 total    (513 total    (435 trap
                                              detectors)    detectors)    detectors)
Percent complete                              99%           99%           98%
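The percent complete statistic is a simple ratio of valid records received to records expected. A minimal sketch of the calculation in Python, using the counts from Table A.1 (the variable names are illustrative, not from the Austin system):

    # One record per detector per minute over a full day.
    RECORDS_PER_DAY = 24 * 60                    # 1,440 one-minute records

    expected = {
        "volume": 513 * RECORDS_PER_DAY,         # 738,720 (all detectors)
        "occupancy": 513 * RECORDS_PER_DAY,      # 738,720 (all detectors)
        "speed": 435 * RECORDS_PER_DAY,          # 626,400 (trap detectors only)
    }
    valid = {"volume": 731_787, "occupancy": 731_787, "speed": 616_991}

    for field in expected:
        pct = 100.0 * valid[field] / expected[field]
        print(f"{field}: {pct:.0f}% complete")
    # volume: 99% complete, occupancy: 99% complete, speed: 98% complete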

Incomplete data can be caused by 1) large amounts of invalid data; or 2) missing data due to communication, hardware, or software failures. The completeness statistics must be viewed in combination with validity statistics to pinpoint the most likely cause of missing data.

Data quality reports should fully specify or disclose information related to the amount of expected data (the denominator of percent complete), especially for the completeness measure. Note that in this example, we have observed that 141 detectors are off-line for acceptance or evaluation testing. Malfunctioning detectors should not be discounted from expected data counts simply because device owners are aware of their malfunction but have not been able to repair the devices. The practice of listing malfunctioning detectors by considering them "off-line" is not recommended as it obscures the true device failure rate and data quality results.

Archive Database

As shown in Figure A.2, the archived data administrator retrieves the original source data from the traffic management center. The archive administrator performs several data processing steps in preparation for loading into the data archive:

  1. Additional validation checking (beyond what is done by field computers);
  2. Aggregation of 1-minute data to 5-minute intervals; and
  3. Estimation or imputation of data values for 5-minute intervals with incomplete data.

The additional validation rules (step 1 above) used by the archive administrator are described later when discussing data validity. If a data value from the original source data fails these additional validation rules, the original value is flagged as invalid and not included in subsequent processing steps. Data values of "-1" that were marked as invalid by field computers are also flagged and are not included in subsequent processing steps.

The aggregation step (step 2) combines all 1-minute records within a 5-minute period (e.g., 12:00 to 12:05 am, 12:05 to 12:10 am, etc.) and computes total volume, average occupancy and average speed. Additional attributes are appended to the record to indicate how many valid 1-minute records were included in the 5-minute summary statistics.

Once the preliminary 5-minute subtotals are calculated, a factored volume estimate is calculated based on the number of valid 1-minute volume values in the 5-minute subtotal. For example, consider a 5-minute volume subtotal of 125 vehicles that is based on 4 valid 1-minute volume values. Because one of the 1-minute volume values is missing, the archive administrator calculates a volume estimate for the full 5-minute period based on the 4 minutes of data as follows: 125 vehicles × (5 values expected ÷ 4 valid values) ≈ 156 vehicles. This estimated volume count is marked as an estimate, and the estimation method is documented in the archive metadata. Five-minute average occupancy and speed values with fewer than 5 valid values are not factored up since they are averages and not sums (as is the case with volumes). Five-minute time periods with no valid values are left as missing or null, and no estimates are provided.
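A minimal sketch of this aggregation-and-factoring step, assuming the 1-minute volumes for a period arrive as a list with None marking invalid or missing minutes (the function name is illustrative):

    def five_minute_volume(minute_volumes, expected=5):
        # Keep only the valid 1-minute values.
        valid = [v for v in minute_volumes if v is not None]
        if not valid:
            return None            # no valid values: leave the period missing
        subtotal = sum(valid)
        # Factor the subtotal up to a full 5-minute estimate.
        return round(subtotal * expected / len(valid))

    # Example from the text: 125 vehicles over 4 valid minutes -> 156.
    print(five_minute_volume([30, 32, None, 31, 32]))

Average occupancy and speed for the period would simply be the mean of the valid values, with no factoring, matching the rule described above.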

After the archive administrator has performed these processing steps, completeness statistics are computed by counting the valid data values in the data archive. With 5-minute subtotals, the data archive should have 288 records per day for each detector. There should be 147,744 records with valid volume and occupancy values (513 total detectors × 288 records per day). Similarly, there should be 125,280 records with valid speed values (435 trap detectors × 288 records per day). Note that missing or null data values are not counted as valid data values for the purposes of the following completeness statistics. Table A.2 contains the completeness statistics and data used in the calculations, and indicates that the archive database is still nearly complete.

Table A.2. Completeness Statistics for Data Archive
                                              Volume        Occupancy     Speed
Number of records with valid values           146,729       146,925       124,420
Number of records that require valid values   147,744       147,744       125,280
                                              (513 total    (513 total    (435 trap
                                              detectors)    detectors)    detectors)
Percent complete                              99%           99%           99%

Readers should note that in processing the speed data, it was determined that about 17 percent of the reported speed values were "missing" because no vehicles were recorded during a 5-minute period (VOLUME=0 and OCCUPANCY=0). In the original source data, the value of SPEED=0 was replaced with a null or missing data value to better represent the traffic being recorded by the detectors. Even though 17 percent of the speed values are missing, in this example we do not count that against the percent complete measure since the missing speed values were presumably not caused by a detector malfunction, and thus should not lower data quality. However, data users should be cautious when datasets contain many time periods where no vehicles were recorded (VOLUME=OCCUPANCY=SPEED=0), as this may indicate detector failures.
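A sketch of this zero-speed handling, assuming a 5-minute record is represented as a small dictionary (a hypothetical layout for illustration):

    def mask_zero_speed(record):
        # An empty road is not a 0-mph measurement: when no vehicles were
        # recorded, store the speed as missing rather than zero.
        if record["volume"] == 0 and record["occupancy"] == 0:
            record["speed"] = None
        return record

    print(mask_zero_speed({"volume": 0, "occupancy": 0, "speed": 0}))
    # {'volume': 0, 'occupancy': 0, 'speed': None}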

Traveler Information

In this example, the ISP provides route-based speed and travel time reports as traveler information on its website and through a voice-responsive phone system. The route speeds and travel times are updated every minute in both systems, while the speed and travel time values are based on a rolling 2-minute average. There are a total of 6 routes being monitored; thus, one would expect a total of 8,640 reported travel times during the day (6 routes × 1,440 updates per day, or one update per minute). The ISP estimates route speeds and travel times if some of the original source data are missing. The ISP's policy is to provide its best estimate of travel time, even if that travel time is based on historical data or speed limits instead of real-time data.

The ISP has automated a quality control process that monitors the availability of its website and voice-responsive phone system at periodic times throughout the day. Because of the ISP's policy of estimating travel times even when the original source data is incomplete, the main factor affecting completeness will be website and phone system availability. For this example, consider that a hardware failure in the phone system caused 60 minutes of downtime during this particular day, corresponding to 360 missed updates (60 minutes × 6 routes). Travel time reports via the website were available at all sampled times of the day. Table A.3 contains the completeness statistics for the traveler information.

Table A.3. Completeness (Availability) Statistics for Traveler Information
                                              Travel Times     Travel Times on Voice-
                                              on Website       Responsive Phone System
Number of records with valid values           8,640            8,280
Number of records that require valid values   8,640            8,640
                                              (6 routes,       (6 routes,
                                              updated every    updated every
                                              minute)          minute)
Percent complete                              100%             96%

Table A.3 indicates that the completeness or availability of the traveler information was relatively high for both traveler information products. Because of the common ISP practice of estimating values when original source data are missing, the availability of traveler information is affected more by hardware or software failures in the ISP's own systems than by gaps in the original source data. In cases where the ISP does not estimate missing values, the availability will also reflect missing values in the original source data.

Validity

For the Austin example, we calculate data validity for the three different datasets: original source data, archive database, and traveler information.

Original Source Data

In Austin, the field computers perform basic validity checks on the original source data before it is sent to the traffic management center.

Unfortunately, the Austin field computers use the same error code of "-1" for both invalid data and communication failures. Ideally, different error codes would be used so that missing data problems could be diagnosed.

To calculate validity of the original source data, we simply count the number of 1-minute data values that have been marked as valid values (i.e., those without "-1" values), and then divide by the total number of data values. Table A.4 contains the validity statistics and data used in the calculations.

Table A.4. Validity Statistics for Original Source Data
                                              Volume     Occupancy   Speed
Number of records meeting validity criteria   731,787    731,787     616,991
Number of records subjected to validity       731,886    731,886     620,616
criteria
Percent valid                                 99.9%      99.9%       99%
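Because invalid readings are overwritten with "-1", percent valid for the original source data reduces to counting non-"-1" values. A minimal sketch, assuming readings holds the received 1-minute values for one field:

    def percent_valid(readings, invalid_flag=-1):
        # Share of received readings that passed the field validity checks.
        valid = sum(1 for r in readings if r != invalid_flag)
        return 100.0 * valid / len(readings)

    print(f"{percent_valid([22, 27, -1, 23, 29]):.0f}% valid")   # 80% valid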

Table A.4 indicates that the validity of the original source data was very high, as less than 1 percent of the data failed the validity checks. This could be due to two reasons: 1) the data could be legitimately valid; or 2) the validity checks could be too few or not rigorous enough.

Archive Database

The archive administrator applies additional validation rules beyond the basic checks performed by the field computers.

Note that these additional validation rules are applied to the original source data before it is aggregated into 5-minute periods. In some cases, validation rules may be applied at several different points in the data flow between original source data and the archive database. Table A.5 contains the validity statistics and data used in the calculations.

Table A.5. Validity Statistics for Archive Database
                                              Volume     Occupancy   Speed
Number of records meeting validity criteria   712,828    713,809     599,518
Number of records subjected to validity       731,886    731,886     620,616
criteria
Percent valid                                 97%        98%         97%

Table A.5 indicates that the validity of the archive database is still quite high, as less than 3 percent of the data failed the additional validity checks. Because of the number of additional validation checks, we can be reasonably assured that there are no major data validity problems with either the original source data or the archive database.

Traveler Information

In this example, consider that the ISP applies its validity criteria after the original source data has already been processed into route travel times. Thus, the ISP uses a different set of validation rules than the archive administrator.

Table A.6 contains the validity statistics for the ISP route travel times and data used in the calculations.

Table A.6. Validity Statistics for Traveler Information
                                              Route Travel Time
Number of records meeting validity criteria   8,380
Number of records subjected to validity       8,640
criteria
Percent valid                                 97%

Timeliness

Original Source Data

In measuring the timeliness of the original source data, we examine the data flow between the field computers and the traffic management center. There are four field computers that are expected to supply the traffic management center computer with data messages every minute, where a data message consists of the volume, occupancy and speed values for the previous minute. By examining the timestamps of the data messages, we can calculate the timeliness of this data flow. Note that in this example, the timestamps represent the time the data messages arrived at the traffic management center, not the time the data messages departed the field computers. This data timestamp convention should be confirmed when calculating timeliness, as it could dramatically affect the results.

The traffic operations personnel have decided that data messages received up to 5 seconds later than expected are acceptable. In analyzing the timestamps on the 1-minute data messages, we find that 5,699 of the 5,707 data messages were received at the traffic management center within 65 seconds of the previous message. Therefore, timeliness is calculated as:

% timely data = 5,699 on-time messages ÷ 5,707 total messages received = 99.8%

By further analyzing the timestamps, we calculate that the average delay for the 8 late messages is 28 seconds. This means that when a data message was received late, it arrived on average 28 seconds later than expected.
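A sketch of this timeliness calculation, assuming arrival_times is a chronologically ordered list of message-arrival times in seconds, and that a message is on time if it arrives within 65 seconds of the previous one (the 60-second schedule plus the 5-second allowance):

    def timeliness(arrival_times, threshold=65.0, interval=60.0):
        # Gaps between consecutive arrivals; a gap over the threshold is late.
        gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
        delays = [g - interval for g in gaps if g > threshold]
        pct_timely = 100.0 * (len(gaps) - len(delays)) / len(gaps)
        avg_delay = sum(delays) / len(delays) if delays else 0.0
        return pct_timely, avg_delay

    # Four messages; the third arrives 28 seconds later than scheduled.
    print(timeliness([0.0, 60.0, 148.0, 208.0]))   # ~(66.7, 28.0)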

Archive Database

The archive administrator has a scheduled secure FTP download of the previous day's original source data from the traffic management center at 3 a.m. the following morning. The administrator also has a scheduled script that automatically transforms and loads the aggregated data into the archive database at 6 a.m. that same morning. In the Austin example, the original source data were collected August 29th, downloaded from the center's FTP site at 3 a.m. on August 30th, then loaded into the archive database by 6 a.m. One of the data archive users (e.g., the traffic operations personnel) expects the previous day's archived data to be available by 6 a.m. every morning since they have traffic management applications that rely on the data archive. Thus, if any portion of the previous day's archived data is loaded after 6 a.m., those data are considered late.

In this example, assume that a software malfunction prevented 10 percent of the archived data records from being loaded into the archive database by the 6 a.m. deadline. The archive administrator arrived at work, fixed the malfunction, and had the remaining 10 percent of the archived data loaded by 9 a.m. In this example, the timeliness is as follows:

% timely data = 90% (the share of archived data records loaded by the 6 a.m. deadline)

The average delay for the late data is 3 hours, since the remaining records were loaded at 9 a.m. rather than 6 a.m.

Traveler Information

In this example, consider that the ISP would like to evaluate the timeliness of the updates to the route travel times on its website and voice-responsive phone system. For both systems, the ISP has a goal of providing condition updates every minute, based on the original source data. Now consider the hardware failure in the phone system that was discussed in the completeness example. The phone system was not available for 60 minutes, and the phone system provides travel times for 6 different routes. Thus, the timeliness is as follows:

% timely data = 8,280 on-time updates ÷ 8,640 route travel time updates = 96%

In this example, average delay for late data is not calculated because the travel time updates in the phone system were not available at all for the entire 60 minutes.

Coverage

Original Source Data

The traffic operations personnel at TxDOT have focused their real-time data collection and traffic management activities on the freeways in the Austin area. Therefore, their goal is to monitor the entire freeway network in the Austin area with real-time traffic data. They have chosen to focus initial deployments on the most congested parts of the freeway network, with later deployments covering less congested freeway locations. As standard practice, TxDOT has installed detectors on the freeway main lanes between every major entrance or exit ramp, which results in an average detector spacing of 0.4 miles. They consider this spacing adequate to represent the freeway locations between point detectors. Additionally, they have placed detectors in every freeway lane and on all entrance and exit ramps.

Because of this emphasis, the traffic operations personnel consider only the freeway functional class. In the Austin metropolitan planning area, there are a total of 174 centerline-miles of freeway. TxDOT has installed traffic detectors along 23 freeway centerline-miles. Therefore, the percent of freeway coverage is 23/174 = 13 percent, with an average detector spacing of 0.4 miles.

Archive Database

The archive administrator has also chosen to focus the coverage statistics on the freeway network only, so the coverage statistics for the archive database are exactly the same as for the original source data: 23/174 = 13 percent freeway coverage, with an average detector spacing of 0.4 miles.

Traveler Information

Because the ISP is attempting to provide traveler information for all major roadways, they consider arterial streets in reporting coverage statistics. Because the freeway routes for which travel times are provided correspond with freeway detector locations, the freeway coverage statistics for traveler information are the same as in the original source data and archive database. Therefore, the percent of freeway coverage is 23/174 = 13 percent, with an average detector spacing of 0.4 miles. However, the arterial street coverage is 0 percent since no arterial street data is available from TxDOT or the City of Austin.

Accessibility

Original Source Data

The accessibility of the original source data is first described in qualitative terms:

The original source data is accessible through a private computer network to the traffic operations personnel, who provide the same original source data to the archived data administrator and the ISP through periodic secure file transfer protocols. The archived original source data is also available on CD-ROM upon written request.

The archive administrator and the ISP have different software scripts they use to import, validate, and load the data into their system. The archive administrator is using customized software with advanced features that enables relatively fast data imports. The ISP is using commercial-off-the-shelf software that performs slightly slower than the archive software. The accessibility of the original source data in quantitative terms is as follows:

The archive administrator is able to retrieve and import a full day of original source data in 8 minutes (actual clock time). Over the course of a day, the ISP has tracked and recorded its data retrieval and import time as 10 minutes (actual clock time).

From this example, we can see that the original source data is easily accessible only to a limited number of data consumers (e.g., traffic operations personnel, archive administrator and ISP). We also note that the original source data is more accessible to the archive administrator than the ISP (e.g., 8 minutes vs. 10 minutes) because of customized software.

Archive Database

The accessibility of the archive database will be of interest to archived data users, who wish to retrieve and manipulate data products from the archive. The accessibility is first defined in qualitative terms:

The data archive is available to all data consumers through a public website.

In this example, consider that the archive administrator would like to measure how accessible the planning data products are to data consumers from the Capital Area Metropolitan Planning Organization (CAMPO). The archive administrator devises a simple exercise that asks users to retrieve average annual daily traffic (AADT) volumes for specific locations, then records how long it takes a sample of users to retrieve the desired data.

The accessibility of planning data in the data archive is such that it requires data consumers an average of 12 minutes to retrieve the desired data (e.g., AADT values).

In this example, the accessibility exercise is modeled after website usability tests since the primary access to the data archive is through a website. This example illustrates a test for a single data retrieval function. In most data archives, however, the accessibility exercise will most likely be performed for several of the most common data retrieval functions.

Traveler Information

The accessibility of traveler information will be of interest to travelers, who wish to make more informed travel decisions. The accessibility of the traveler information is as follows:

Route-based traveler information is available through a public website and a voice-responsive phone system.

Consider that the ISP would like to measure how accessible traveler information is on its website and phone system. The ISP recruits a sample of 50 travelers to do usability tests by offering a small incentive (e.g., three free months of personalized travel information). The usability tests measure how long it takes travelers to obtain current travel conditions for a specified route.

The accessibility of traveler conditions on the public website is such that it requires an average user 20 seconds to obtain data for the specified route. The accessibility of the phone system is such that it requires an average user 60 seconds to obtain data for the specified route.

In this example, we can see that the traveler information is relatively accessible to many users. However, there are numerous other information outlets (e.g., changeable message signs, radio reports, etc.) that might improve accessibility for those without Internet access or mobile phones. We also note that the website appears to be much more accessible than the phone system. After viewing these statistics, the ISP might decide to upgrade the voice recognition software because the usability tests revealed that the delay was associated with poor voice recognition. Or the ISP might note through the usability tests that, despite the longer access time, travelers using the phone system had satisfaction ratings comparable to those of travelers using the website.

In most cases, data accessibility may not be as dynamic as the other data quality measures. The most appropriate time to measure data accessibility is after major system interface or design changes. Measuring accessibility or usability at this time will allow system designers to see whether their interface or design changes have improved accessibility to data consumers.

Interpretation of Data Quality Statistics

The data quality statistics for the Austin case study are summarized in Table A.7.

Table A.7. Traffic Data Quality "Scorecard" for Austin Case Study
Data Quality Measures     Original Source Data     Archive Database          Traveler Information
Accuracy                  One-minute speeds:       Hourly volumes:           Travel times:
  MAPE                    12.0%                    4.4%                      8.6%
  RMSE                    11 mph                   131 vehicles              1.56 minutes
Completeness              Volume: 99%              Volume: 99%               Website: 100%
  Percent complete        Occupancy: 99%           Occupancy: 99%            Phone: 96%
                          Speed: 98%               Speed: 99%
Validity                  Volume: 99.9%            Volume: 97%               Route travel times: 97%
  Percent valid           Occupancy: 99.9%         Occupancy: 98%
                          Speed: 99%               Speed: 97%
Timeliness
  Percent timely data     99.8%                    90%                       96%
  Average data delay      28 seconds               3 hours                   n.a.
Coverage                  Freeways: 13% with       Freeways: 13% with        Freeways: 13% with 0.4-mile
  Percent coverage        0.4-mile spacing         0.4-mile spacing          spacing; Arterials: 0%
Accessibility             Archive admin.:          Retrieve AADT values:     Website: 20-second avg.
  Avg. access time        8 minutes;               12 minutes avg.           access time; Phone:
                          ISP: 10 minutes          access time               60-second avg. access time

The results in Table A.7 indicate that, in general, the quality of traffic detector data is reasonably high for most data consumers. The quality measure with perhaps the lowest score was percent coverage, which can be expected since Austin is in the process of deploying their freeway detector system. The accuracy of speed data as collected from the field could be improved, as the mean absolute percent error was 12 percent. The traffic operations personnel might decide to devote additional resources to calibrate the double-loop detector for speed measurement, or perhaps they might decide that the existing accuracy is adequate to detect incidents. The phone access time for traveler information is 3 times as long as the website, so perhaps the ISP might decide to fine-tune the voice recognition software to decrease phone access times.

Although all six data quality measures are recommended for each data consumer, it should be evident that some data consumers will value certain aspects of data quality more than others. For example, traffic operations personnel and ISPs may consider timeliness a critical measure, whereas archived data users may be less concerned about timeliness. The project team considered the calculation of a composite data quality score but did not further develop the concept: a single score would be difficult to interpret unless value judgments were applied to the measures that are not reported as percentages, such as timeliness delay or accessibility, and different data consumers may wish to weight each data quality measure differently according to their own priorities. This does not, however, preclude data consumers from constructing their own composite score based on those priorities.
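As a purely hypothetical illustration of such a consumer-defined composite (not a scheme developed or endorsed in this report), a consumer could restrict attention to the percentage-based measures, choose weights reflecting their own priorities, and combine them:

    # Hypothetical weights and scores for one data consumer; the non-percentage
    # measures (average delay, access time) would need value judgments first.
    weights = {"accuracy": 0.3, "completeness": 0.2, "validity": 0.2,
               "timeliness": 0.2, "coverage": 0.1}          # weights sum to 1.0
    scores = {"accuracy": 100 - 12.0,    # e.g., 100 minus MAPE for speeds
              "completeness": 98.0, "validity": 99.0,
              "timeliness": 99.8, "coverage": 13.0}
    composite = sum(weights[m] * scores[m] for m in weights)
    print(f"composite data quality score: {composite:.1f} out of 100")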




