Advanced Public Transportation Systems: Evaluation Guidelines January 1994
Final Report

Prepared by
Robert F. Casey and John Collura
John A. Volpe National Transportation Systems Center
Kendall Square
Cambridge, MA 02142

Prepared for
Office of Technical Assistance
Federal Transit Administration
400 Seventh Street SW
Washington, DC 20590

Distributed in Cooperation with
Technology Sharing Program
U.S. Department of Transportation
Washington, D.C. 20590

DOT-T-94-10

PREFACE

This document was prepared by the Office of Research and Analysis, Volpe National Transportation Systems Center, under the sponsorship of the Advanced Public Transportation Systems (APTS) Program, Federal Transit Administration (FTA) and with the guidance of Mr. Ronald Fisher, FTA's Director of Training, Research, and Rural Transportation. The Volpe Center operates under the auspices of DOT's Research and Special Programs Administration (RSPA).

The major contributors were Mr. Robert Casey, RSPA/Volpe Center Operations Research Analyst, and Dr. John Collura, RSPA/Volpe Center Faculty Fellow and Professor of Civil Engineering at the University of Massachusetts, Amherst. Technical assistance also was provided by Ms. Judith Schwenk and Mr. Lawrence Labell of the RSPA/Volpe Center and Dr. Thomas Horan of the Institute of Public Policy at George Mason University.

The summaries of the breakout sessions at the recent National Workshop on APTS Evaluations also were useful in the completion of the guidelines. The summaries were prepared by Ms. Katherine Turnbull of the Texas Transportation Institute, Mr. John Mason of Science Applications International Corporation, Mr. Joel Markowitz of the Metropolitan Transportation Commission (San Francisco), and Mr. Philip Shucet of Michael Baker Jr., Inc.

The preparation of this document was also facilitated by prior evaluation work by Mr. Mark Abkowitz, Ms.
Carla Heaton, Mr. Chester McCall, Mr. Howard S. Slavin, and Mr. Robert Waksman as part of the Federal Transit Administration's Service and Methods Demonstration program.

The document consists of evaluation guidelines for use by contractors responsible for evaluating APTS operational tests. Although these guidelines are intended for the APTS Program, their potential applicability extends beyond the evaluation of FTA-sponsored operational tests to the evaluation of any innovative use of advanced technology in public transportation. It is anticipated that this document will be modified periodically to reflect additional experience gained in evaluating APTS operational tests.

TABLE OF CONTENTS

1. OVERVIEW OF EVALUATION GUIDELINES
2. BACKGROUND
   2.1 Overview of the Evaluation Process
       2.1.1 Evaluation Frame of Reference
       2.1.2 Evaluation Planning
       2.1.3 Evaluation Implementation
       2.1.4 Potential Evaluation Spin-Offs
   2.2 Coordination of APTS Evaluations
3. GUIDELINES FOR PLANNING EVALUATION ACTIVITIES
   3.1 Determination of Site Data Requirements and Sources
   3.2 Determination of Measures and Collection/Derivation Techniques Required to Address APTS Program Objectives and Other Relevant Project Objectives/Issues
       3.2.1 Basic Set of Measures
             3.2.1.1 APTS Costs and Functional Characteristics
             3.2.1.2 User Acceptance
             3.2.1.3 System Efficiency and Effectiveness
             3.2.1.4 Impacts
             3.2.1.5 Relationship Between APTS Program Objectives and the Categories of Measures
             3.2.1.6 Other Objectives and Measures
       3.2.2 Data Collection/Derivation Techniques
   3.3 Planning Considerations Relative to Data Collection and Analysis
       3.3.1 Basic Data Collection/Analysis Design
       3.3.2 Measure Stratification
             3.3.2.1 Categorization of a Measure Into Additive Components
             3.3.2.2 Categorization of a Measure According to Target Market, Operational, Geographic, or Time Categories
       3.3.3 Grouping of Raw Data Into Class Intervals
       3.3.4 Sampling Requirements
       3.3.5 Timing of Data Collection
4. GUIDELINES FOR PERFORMING EVALUATION ACTIVITIES
   4.1 Monitoring/Performance of Data Collection
   4.2 Data Reduction, Analysis, and Presentation
5. CONTENT AND ORGANIZATION OF REPORTS
   5.1 Evaluation Plan
   5.2 Monthly Evaluation Progress Reports
   5.3 Interim Evaluation Reports
   5.4 Final Summary Evaluation Report
   5.5 Quarterly Project Progress Reports
APPENDIX A - SURVEY EXECUTION AND DESIGN
   A.1 Defining the Survey Universe
   A.2 Sampling the Survey Universe
   A.3 Techniques for Surveying the Samples Selected
   A.4 Survey Design Principles
       A.4.1 Organization
       A.4.2 Length
       A.4.3 Question Sequence and Wording
       A.4.4 Standardized Questions
             A.4.4.1 Behavioral Measures
             A.4.4.2 Attitudinal Measures
             A.4.4.3 Social and Demographic Measures
   A.5 Non-Response Bias
   A.6 Interviews With Transportation Agency Personnel
   A.7 References
APPENDIX B - STATISTICAL METHODOLOGY
   B.1 Definitions
   B.2 Data Analysis Determination
   B.3 Sample Size Determination
   B.4 Data Collection
   B.5 Analysis Methods
   B.6 Methodology Documentation
   B.7 References
APPENDIX C - GLOSSARY
APPENDIX D - BIBLIOGRAPHY
APPENDIX E - GENERAL WORK TASKS OF EVALUATION CONTRACTOR
   E.1 Task 1: Evaluation and Task Administration Plans
   E.2 Task 2: Implementation and Analysis of Data
   E.3 Task 3: Report Preparation

LIST OF EXHIBITS

EXHIBIT 1: Selected Examples and Applications of APTS
EXHIBIT 2: Evaluation Relationships
EXHIBIT 3: Evaluation Process
EXHIBIT 4: APTS Operational Test Planning, Implementation, and Evaluation Sequence of and Responsibility for Activities
EXHIBIT 5: Basic Site Data Requirements for APTS Operational Tests
EXHIBIT 6: Typical Sources for Site Data
EXHIBIT 7: APTS Program Objectives and Examples of Corresponding Measures
EXHIBIT 8: Examples of Data Collection Techniques for Selected Measures
EXHIBIT 9: FTA/Section 15 Worksheet for Functional Distribution of Expense Object Classes/Level B
EXHIBIT 10: Service Area for the Seattle Project
EXHIBIT 11: Distribution of Park-and-Ride Users for the Seattle Project
EXHIBIT 12: Passenger Volume for the Seattle Project
EXHIBIT 13: Bus Schedule Adherence for the Minneapolis Urban Corridor Project
EXHIBIT 14: Corridor Demographic Characteristics for the Shirley Highway Express Bus-On-Freeway Project
EXHIBIT 15: Charge-A-Ride Usage by Card Type and Time Period for the Merrimack Valley Charge-A-Ride Program
EXHIBIT 16: Comparison of Fare Payment Times Using Different Methods
EXHIBIT 17: Project Effectiveness Measures for the Seattle Project
EXHIBIT 18: Highway Travel Time Distributions for the Minneapolis Urban Corridor Project
EXHIBIT 19: Results of Before and After Analyses for Portland Self-Service Fare Collection
EXHIBIT 20: Benefit-Cost Analysis Results of Salt Lake City Rider Information System
EXHIBIT A-1: Summary of Survey Sampling Methods and Applicable Survey Techniques
EXHIBIT A-2: On-Board Bus Survey -- Katy Transitway Transit User Survey
EXHIBIT A-3: Carpool/Vanpool Survey
EXHIBIT A-4: Freeway Motorist Survey
EXHIBIT A-5: Bus Riders Survey
EXHIBIT A-6: Bus Riders Mail-Back Survey
EXHIBIT A-7: Bellevue Smart Traveler Project Surveys
EXHIBIT A-8: Washington, DC, Self-Administered Post Card Bus Survey
EXHIBIT A-9: 1979 Downtown Crossing Bus Passenger Survey
EXHIBIT A-10: Recommendations for Questions on Boarding and Alighting Points (for user surveys only)
EXHIBIT A-11: Recommendations for Questions on Trip Origin
EXHIBIT A-12: Recommendations for Questions on Trip Destination
EXHIBIT A-13: Recommendations for Questions on Trip Start and End Times
EXHIBIT A-14: Recommendations for Questions on Access Mode to Transit Vehicle
EXHIBIT A-15: Recommendations for Questions on When Present Mode was First Used
EXHIBIT A-16: Recommendations for Questions on Former Transportation Mode
EXHIBIT A-17: Set of Attitudinal Questions on Travel by Transit and Auto
EXHIBIT A-18: Set of Questions on General Attitudes of the Population
EXHIBIT A-19: Recommendations for Questions on Respondents' Sex
EXHIBIT A-20: Recommendations for Questions on Respondents' Age
EXHIBIT A-21: Recommendations for Questions on Respondents' Income
EXHIBIT A-22: Recommendations for Questions on Auto Availability
EXHIBIT A-23: Recommendations for Questions on Auto Ownership
EXHIBIT A-24: Recommendations for Questions on Whether Respondent has a Driver's License
EXHIBIT A-25: Recommendations for Questions on Respondents' Occupation
EXHIBIT A-26: Recommendations for Questions on Respondents' Educational Level
EXHIBIT A-27: Recommendations for Questions on Length of Residence

1. OVERVIEW OF EVALUATION GUIDELINES

This document presents guidelines for planning, implementing, and reporting the findings of the evaluation of the Federal Transit Administration's (FTA) Advanced Public Transportation Systems (APTS) operational tests. These evaluation guidelines are intended for use by organizations engaged by the Research and Special Programs Administration/Volpe National Transportation Systems Center (Volpe Center) to evaluate the APTS operational tests. In addition, the guidelines will be useful to state and local organizations involved in the design and evaluation of Advanced Public Transportation Systems.

An objective of these guidelines is to foster consistency of evaluation philosophy and techniques, and comparability and transferability of results to improve the quality and utility of information obtained from the APTS program. The guidelines are designed to emphasize the assessment of the APTS Program's national objectives as well as the objectives of the local implementing agency.

The various operational tests implemented under the APTS Program are meant to serve as learning tools and/or as models for other locales throughout the country.
In order for these tests to have maximum effectiveness in their respective operational capacities, a consistent, carefully structured approach to project evaluation is desirable. This document has been prepared to provide a common framework and methodology for developing and then executing the evaluation of individual operational tests.

These evaluation guidelines are by no means comprehensive -- that is, they do not offer a suggested or preferred course of action for every conceivable situation that might arise. Nor are they to be rigidly or blindly followed, since each operational test and each site will be unique and will require somewhat tailor-made evaluation procedures. It is anticipated that these guidelines will be modified during the course of the APTS Program to reflect experience gained in implementing and monitoring the evaluations of individual tests. Although it is not the desire to update these guidelines frequently, modifications resulting from field experience will be made where appropriate for enhancement of performance and evaluation of the various projects.

In order to put these guidelines into a meaningful context, Chapter 2 provides background information on the FTA/APTS Program and the operational test evaluation process. Chapters 3 and 4 present guidelines relative to planning and executing operational test evaluations. Finally, Chapter 5 presents the recommended content and organization for each type of report to be prepared in conjunction with the evaluation process.

2. BACKGROUND

The Federal Transit Administration has developed the Advanced Public Transportation Systems (APTS) Program, which is an integral part of the overall U.S. DOT Intelligent Vehicle Highway Systems (IVHS) effort. A major aim of the APTS Program is to promote research and development of innovative applications of advanced navigation, information, and communication technologies.
These technologies would be designed and tested to achieve APTS Program goals directed toward enhancing the ability of public transportation systems to satisfy customer needs and contributing to the achievement of broader community goals and local objectives. The APTS Program goals and objectives will be discussed further within the context of the evaluation frame of reference.

The wide array of new technologies provides a unique opportunity to discover innovative and useful applications in public transportation. These operational tests and evaluations will be the principal activities of the APTS Program. Real world testing will be done in urban and rural areas using those technologies which appear to offer promise and represent useful applications. Major technologies include automated vehicle location systems, smart card systems, dynamic ridesharing systems, passenger information systems, high occupancy vehicle systems, and vehicle component monitoring systems. Exhibit 1 provides selected examples of these technologies and associated applications.

Tests will involve joint ventures with state and local governments, and, when appropriate, universities and private vendors. Tests may range from 3-4 years: 1-2 years to develop implementation plans, 1 year to implement service, and 1 year to evaluate the APTS application and associated impacts.

In order for the APTS Program to encourage significant technological innovations by many urban and rural areas, the technologies tested and the results obtained must be evaluated, well documented, and widely distributed. It is important not only that the operational tests be structured and evaluated to facilitate transferability of results but also that evaluation results be disseminated so that prospective beneficiaries in other urban and rural areas are made aware of the potential of such technologies. Accordingly, a significant element of the APTS Program is the technology sharing function.

EXHIBIT 1. SELECTED EXAMPLES AND APPLICATIONS OF APTS

APTS Example: Automated vehicle location (AVL) system using satellite or ground-based technologies and computerized dispatching techniques
Applications:
- controlling and monitoring the use of vehicles
- estimating vehicle positions to assist dispatchers in improving on-street schedule adherence
- obtaining boarding and alighting information in conjunction with automatic passenger counters (APCs)
- assisting in the development of more realistic schedules
- facilitating the assignment of individuals to shared ride, demand response services
- assisting in the preparation of daily driver logs

APTS Example: Smart card systems using a contact or contactless plastic card with a microchip and storage and processing capabilities
Applications:
- facilitating the collection of fares, the verification of travel, and the acquisition of information about passengers and vehicle usage
- encouraging the coordination of various modes including bus, rail, auto, and parking services
- aiding in the establishment of a postpayment fare system and the application of employer and human service agency-based subsidy programs
- assisting in the design of a comprehensive, historical vehicle maintenance and parts inventory database

APTS Example: Dynamic ridesharing systems using real-time communication methods with the aid of touch-tone telephone, television, radio, and videotex systems
Applications:
- providing quick and easy access to up-to-date information to aid an individual in arranging a carpool or vanpool the same day or evening before a trip

APTS Example: Passenger information system using audio, visual, and/or hard copy methods such as digitized voice, interactive television, videotex, automated map displays, computer monitors and printers, and other devices located in terminals, stations, vehicles, places of employment, and at home; also could be provided in conjunction with a traffic management center (TMC)
Applications:
- supplying passengers with real time information on routes, schedules, cancellations, delays, rerouting, and other aspects of service to make travel easier and to facilitate intermodal transfers

APTS Example: High occupancy vehicle systems (HOVs) including preferential treatment methods and park and ride facilities
Applications:
- providing traffic control signal preemption capabilities
- monitoring vehicle occupancy remotely to enforce HOV lane restrictions

APTS Example: Vehicle component monitoring systems
Applications:
- assisting in the early detection of problems with vehicle components (e.g., engine, exhaust system) to avoid component failure while vehicle is in service

The exact number, general content, and location of the APTS operational tests are yet to be determined. For each fiscal year program, a series of primary objectives will be selected, and a group of proposals corresponding to each objective, and in keeping with total budgetary constraints, will be developed. Then, following an investigation, analysis, and negotiation process involving FTA, the Volpe Center, and candidate sites, a final set of operational tests and respective sites will be agreed upon. Once final negotiation and transfer of funds between FTA and the APTS local sponsor are completed, the operational test can be implemented and evaluated.

As part of its responsibility to evaluate the operational tests implemented under the APTS Program, the Volpe Center shall engage contractor support to participate in all phases of the evaluation process. Exhibit 2 shows the interaction among FTA, the Volpe Center, the local sponsor, the evaluation contractor, and the APTS vendors involved in the operational test.
FTA/APTS staff is responsible for overseeing and guiding all aspects of the operational test including planning, site selection, negotiations with the site, implementation, and evaluation. The local sponsor is responsible for planning and implementing the actual conduct of the operational test as well as performing most of the data collection. The Volpe Center assists FTA in the activities for which FTA is responsible, and directs and monitors the efforts of the evaluation contractor. The Volpe Center, the evaluation contractor, and the vendors interface with the local sponsor (or the implementing agency, if different from the local sponsor). While being directly responsible to the Volpe Center for its activities, the evaluation contractor will maintain an informal association and relationship with the local sponsor, the APTS vendors, and the cognizant FTA Project Manager. The APTS vendors, as deemed appropriate by FTA and the Volpe Center, may participate in a review of the evaluation plan, data reduction and analysis, and the interim and final reports. The APTS vendors may serve on the local evaluation review team as discussed in Section 2.2.

2.1 OVERVIEW OF THE EVALUATION PROCESS

The evaluation process can be thought of conceptually as a link between the operational tests and technology transfer portions of the APTS Program. That is, it serves as a bridge between the conduct of an operational test at a particular site and the understanding of its actual performance at that site as well as its potential effectiveness in other locales. The quality of the evaluation process directly influences the accuracy and perceptiveness of the operational test assessment and ultimately affects the applicability and transferability of test findings.

Exhibit 3 is a flow diagram representing the evaluation process for an APTS operational test.
The diagram is divided into four major sections: the evaluation frame of reference, evaluation planning, evaluation implementation, and potential evaluation spin-offs. (The specific organizational responsibilities associated with the various aspects of each APTS test are given later in this chapter.) The first and fourth sections can be thought of, respectively, as input to and output from the active phases of the evaluation process, which are planning and implementation. A discussion of each of the four sections follows.

EXHIBIT 3: EVALUATION PROCESS

2.1.1 Evaluation Frame of Reference

The evaluation frame of reference consists of four elements: the operational test application(s); APTS Program objectives; external influences; and local issues, objectives and site characteristics.

An APTS operational test will consist of one or more technological applications introduced individually or sequentially. For example, a test might include the use of a smart card to facilitate automatic fare collection. Another example could consist of an automated vehicle location (AVL) system to determine vehicle position, followed by the installation of an automated passenger counting (APC) system and a computerized dispatching and scheduling system which work in conjunction with the AVL system.

Each APTS operational test also is intended to meet the goals of the APTS Program, which are: 1) to enhance the ability of public transportation to satisfy customer needs; and 2) to contribute to broader community goals by providing information on innovative applications of available IVHS technologies. These goals can be translated into the following set of objectives:

Objective #1: Enhance the Quality of On-Street Service to Customers
- Improve the quality, timeliness, and availability of customer information,
- Increase the convenience of fare payments within and between modes,
- Improve safety and security,
- Reduce passenger travel times, and
- Enhance opportunities for customer feedback.

Objective #2: Improve System Productivity and Job Satisfaction
- Reduce transit system costs,
- Improve schedule adherence and incident response,
- Increase the timeliness and accuracy of operating data for service planning and scheduling,
- Enhance the response to vehicle and facility failures,
- Provide integrated information management systems and better management practices, and
- Reduce worker stress and increase job satisfaction.

Objective #3: Enhance the Contribution of Public Transportation Systems to Overall Community Goals
- Facilitate the ability to provide discounted fares to special user groups (e.g., disabled persons or employees eligible for tax-free employer subsidies),
- Improve communication with users having disabilities (e.g., visual or hearing impairments),
- Enhance the mobility of users with ambulatory disabilities,
- Increase the extent, scope, and effectiveness of Transportation Demand Management programs,
- Increase the utilization of high occupancy vehicles, with an emphasis on reducing the use of single occupant vehicles, and
- Assist in achieving regional air quality goals and mandates established in the Clean Air Act Amendments and the Intermodal Surface Transportation Efficiency Act (ISTEA).

Objective #4: Expand the Knowledge Base of Professionals Concerned with APTS Innovations
- Conduct thorough evaluations of operational tests,
- Develop an effective information dissemination process,
- Showcase successful APTS innovations in model operational tests, and
- Assist system design and integration.

Objective #1 relates primarily to the riders and their desire for improved transit service. Objective #2, on the other hand, deals in part with management aspects regarding system costs, service planning, scheduling, and operations.
Objective #3 concerns broader impacts in terms of the degree to which an APTS application contributes to local community goals and national issues pertaining to, for example, the special needs of disabled persons, congestion management activities, user-side subsidy initiatives, energy, air quality, and accessibility. In Section 3.2.1, measures are presented to examine the level to which these first three objectives are attained in each operational test.

The fourth objective is directed at expanding the knowledge base of policy-makers, engineers, planners, researchers, and other individuals interested in the application of advanced technology to improve public transit. Because this objective is a broader, overarching aim of the entire evaluation program, its level of achievement will not be assessed using measures such as those discussed in Section 3.2.1. Instead, an effort will be made to cull information from interim and final evaluation reports prepared as part of each operational test, and this information will be disseminated in publications such as FTA's APTS Briefs, IVHS America's Newsletter, and technical journals and conference proceedings of other organizations. In addition, selected evaluation results will be summarized on electronic bulletin boards commonly available to transportation professionals, and results will be presented at national and international meetings. Finally, where appropriate, the findings and conclusions of the evaluations will be used as a basis for discussion in focus groups, meetings, and seminars.

It should also be emphasized that for any given operational test, there may be objectives, over and above the APTS Program objectives, which are important evaluation considerations. These might be state or local objectives which other participants (e.g., transit operator, state transportation agency, community group, or local government) are striving to attain (e.g., to encourage ridesharing into the downtown area for the purposes of reducing parking requirements or traffic congestion in the central business district, or to preserve the stability, cohesion, and authenticity of neighborhoods). The extent to which these state and local objectives relate to the APTS Program objectives should be identified by the contractor.

The operational test site can consist of anything from a corridor in a city to a group of cities or towns, and can be at any point along the population and density spectrum. An understanding of the unique demographic, economic, geographic, and transportation characteristics of the site, as well as prevailing attitudes toward transportation, is a useful and necessary adjunct to knowledge about the APTS application and associated objectives.

To the maximum extent possible, external influences on the project should also be identified and, if necessary, appropriate strategies should be designed to reduce the likelihood that such influences will have adverse effects on the operational test. For example, if the APTS application has radio frequency (RF) spectrum requirements, such requirements should be analyzed, and political negotiation with authorized communication agencies should be initiated as early as possible.

Information on the planned APTS innovations, project objectives, other issues and site characteristics, and external influences will generally be available from the application submitted to FTA by the site prior to approval of the project. Depending on the timing of the evaluation contractor's initial involvement in the project, a more detailed description of the project may be available in the form of a Project Implementation Plan.
Further background on the operational test (e.g., genesis of the project concept, recent history of transit/para-transit developments at the site) can be obtained through discussions with the FTA Project Manager, the Volpe Center staff, and the local sponsor.

2.1.2 Evaluation Planning

The evaluation planning phase of the evaluation process is the period during which the contractor interacts with FTA, the Volpe Center, and various agencies at the local level to transform the evaluation frame of reference into a detailed, structured program for conducting the evaluation. This phase sets the stage for the entire evaluation effort and, in addition, provides an opportunity to reassess and, if necessary, restructure the planned operational test.

The planning phase begins with the preparation of an Evaluation Strategy for the particular project, which describes:

(1) Pertinent information on the APTS application and site (in particular, an indication of what features of the operational test are unique and merit emphasis in the evaluation).
(2) APTS Program objectives addressed by the operational test.
(3) Relevant local, state and/or national objectives and issues addressed (and the relative emphasis to be placed on these objectives vs. APTS objectives).
(4) Key issues to be resolved.
(5) External influences to be addressed.
(6) Recommended scope and focus of the evaluation including a discussion of the APTS costs and functional characteristics and a review of the potential efficiency, effectiveness and other impacts anticipated.

The Evaluation Strategy may be prepared by the Volpe Center or the contractor. The contents of each Evaluation Strategy will vary from test to test depending on the nature and timing of the project. The Evaluation Strategy becomes the basis for the more detailed Evaluation Plan[1] which is developed by the contractor.
While the Volpe Center will provide a general evaluation strategy including suggestions regarding measures to be used, data to be collected, and analytical techniques to be employed, it is generally the contractor's responsibility to refine and elaborate on the Volpe Center's suggested strategy by developing specific procedures for collecting and analyzing data relative to project objectives, issues, and the site.

____________________
[1] Chapter 3 presents guidelines relative to the evaluation planning phase. The recommended content and organization of the Evaluation Plan is presented in Chapter 5.

In developing the Evaluation Plan, the contractor is encouraged to propose changes to the approach recommended by the Volpe Center, particularly if the proposed modifications have significant potential to improve the objectivity, accuracy, completeness, and/or efficiency of the project evaluation effort or to enhance the transferability of project findings. In addition, total evaluation costs relative to potential findings must be borne in mind at all times.

Throughout the process of developing the Evaluation Plan, the contractor is urged to keep in close contact with the local sponsor or project team responsible for implementing and operating the test and performing data collection. This continuing liaison with the local sponsor will ensure that the proposed methods of data collection are consistent with the resources available at the local level, with the operational implementation plan developed by the site, with important local objectives, and with reasonable costs for the evaluation contractor efforts.

As is apparent from the preceding discussion, the evaluation planning phase entails substantial and continued interaction among all parties involved in the operational test. Ideally, planning of the evaluation effort should be coordinated, and take place concurrently with the planning of the project itself.
This coordination between the implementation/operation and evaluation planning cycles permits optimum flexibility in the conduct of the overall test. Where possible, operational aspects of the test will be planned to conform to requirements of the evaluation, rather than the evaluation having to be integrated into a pre-existing, rigid operational structure.

The concurrence of the two planning cycles ensures that the Evaluation Plan is completed prior to the implementation of the project. Early development of the Plan, in turn, allows the necessary lead time for "before" data collection -- that is, observations of phenomena such as transit system performance prior to the introduction of the APTS application(s) as well as possible information on community awareness and attitudes prior to project implementation.

Throughout this phase of the project, it is critical to recognize that the FTA Project Manager is the final authority in negotiating any operational test modifications with the local sponsor.

2.1.3 Evaluation Implementation

The evaluation implementation phase is the period during which the approved Evaluation Plan is executed. Activities during this phase include collection/analysis of data relative to project objectives and issues, collection/analysis of data on site characteristics, compilation of a chronology describing the implementation and operation of the test, and recording of external factors which might influence operational test findings and results.
Contractor functions during this phase include monitoring and, in selected instances, supervising the data collection process (generally to be performed by the local sponsor), any data collection not performed by the local sponsor, data reduction and analysis, subjective analysis of information relative to project issues, and synthesis of project findings into one or more Interim Evaluation Reports and a Final Summary Evaluation Report.[2] This phase not only generates information on which the final assessment of the operational test is based but also provides feedback information relative to ongoing transit operations. The ongoing evaluation activities, while adding to the cumulative body of quantitative and qualitative information regarding the project impacts, provide interim indications of costs and functions of APTS applications and the preliminary effects of these applications on transit system efficiency and effectiveness. These interim findings serve as useful input to the local agency responsible for implementing and operating the test by suggesting the need for operational modifications.

[2] Chapter 4 presents guidelines relative to the evaluation implementation phase. Chapter 5 gives the recommended content and organization of the various contractor reports prepared during this phase, including the Monthly Evaluation Progress Report, the Annual Project Status Summary, the Interim Evaluation Report, and Final Summary Evaluation Report. In addition, Chapter 5 describes the content of the local sponsor's quarterly Project Progress Report to FTA, which can serve as useful input to the contractor's work.

During this phase, modifications may be made to the evaluation procedures originally specified in the Evaluation Plan. For instance, examination of interim findings may reveal certain gaps or redundancies in the originally planned data collection program. Still other reasons for modifying the evaluation procedure might be changes in the operational test, unanticipated developments or institutional factors at the site, or discovery of an improved evaluation procedure. Procedural steps to accomplish this necessary update for the Evaluation Plan appear in Chapter 5.

The culmination of the evaluation implementation phase is the Final Summary Evaluation Report, which presents the following types of findings:

(1) Evaluation of the project in terms of its attainment of relevant APTS Program objectives and other (local and/or national) project objectives.
(2) Insight into project issues associated with operational feasibility and characteristics of the applications.
(3) Assessment of the influence of site-specific characteristics and external factors on the outcome of the operational test.
(4) Lessons learned, based on practical experience, relative to the implementation and operation of the APTS applications (possibly to include recommendations for project modifications in the test site or for future applications in other locales).
(5) Appraisal of the evaluation procedures employed in terms of effectiveness, cost, accuracy, etc.

In essence, this report presents an assessment of the impact of the APTS applications at the site and provides guidance for the transferability of results to other locales. The body of the Final Summary Evaluation Report includes both narrative and graphic exposition, while detailed quantitative data and documentation of procedures are provided in technical appendices. Since the report is intended for a variety of audiences -- including transportation planners; transit operators; federal, state, and local officials; and private industry -- it contains an Executive Summary which highlights the salient project findings.

2.1.4 Potential Evaluation Spin-Offs

It is anticipated that each operational test will give rise to potential implementation and analytical spin-offs.
The Final Summary Evaluation Report, while essentially documenting the history and effects of a single project, also serves the broader function of increasing the understanding of and stimulating the application of the demonstrated APTS technologies in other localities. Information presented in the report provides a versatile basis for comparing the effects of a particular APTS application with those of other similar projects, suggesting modifications to the applications for future use, and predicting the effectiveness and utility of the APTS applications in other cities. Moreover, the report's assessment of project evaluation procedures can serve as a stimulus for improving the state of the art of evaluation techniques. Since these broader functions of the Final Summary Evaluation Report generally materialize after the test period and are not within the purview of the evaluation contractor assigned to a particular project, they are shown in Exhibit 3 as potential evaluation spin-offs.

2.2 COORDINATION OF APTS EVALUATIONS

Exhibit 4 summarizes the various activities involved in planning, implementing, and evaluating an APTS operational test and indicates the allocation of responsibility for these activities. The sequence of activities ranges from overall APTS Program definition, to the operation and evaluation of an individual test, to the spin-off uses of the project. It can be seen that the entire stream of activities, especially those comprising the evaluation process, involves extensive interaction among FTA, the local sponsor, the Volpe Center, the evaluation contractor, and the APTS vendors.

Moreover, it should be noted that the activities shown do not always occur in a fixed sequence. Time constraints may require that some of the steps be performed in parallel, and there will ideally be considerable interaction and feedback between the project planning and evaluation planning phases.
The review functions of the Volpe Center, the local sponsor, and the APTS vendors associated with the data analysis provide a mechanism to identify, on a continuing basis, major problems (if any) so that APTS operational changes can be made (if necessary) during the course of the test. Evaluation spin-offs, while arising out of individual tests, will result in activities which extend beyond the FTA, Volpe Center, local sponsor, and evaluation contractor.

The diversity of activities and generally long (three to four years) time frame for an individual test necessitate close and continual coordination among the groups involved. To facilitate communication among local test participants and the contractor concerning the evaluations, FTA will encourage the establishment of a local evaluation review team consisting of representatives of transit providers, metropolitan planning agencies, human service organizations, environmental groups, APTS vendors, and the general public. It may also be appropriate to include faculty from local colleges and universities on the evaluation review team. The contractor will meet with the local evaluation review team to discuss the project objectives and the emphasis to be placed on each objective in the evaluation; to determine the roles and responsibilities of all parties involved in the anticipated data collection activities; to review problems encountered (if any) during the conduct of major data collection activities and overall operational test implementation; to present preliminary findings and results of the data analyses; and to seek the team's input.

EXHIBIT 4: APTS OPERATIONAL TEST PLANNING, IMPLEMENTATION, AND EVALUATION: SEQUENCE OF AND RESPONSIBILITY FOR ACTIVITIES

KEY: P = Primary role; M = Monitoring role; R = Review function

a  Includes local evaluation review team.
b  Local evaluation review team will be established as part of negotiations.
c  Primary role may also be assigned to the contractor. It may be necessary to have the contractor on-site to monitor the conduct of some data collection efforts, such as an on-board survey, to ensure that such efforts are carried out properly and that appropriate personnel are available to address unanticipated problems and questions.
d  FTA will disseminate information from these reports, where appropriate. Such information will appear in FTA's APTS Briefs, IVHS America's Newsletter, professional conference papers, and electronic bulletin boards. The final evaluation reports themselves will also be published.

However, as important as coordination within a particular project is coordination across test sites, so as to maximize the effectiveness of the APTS Program in encouraging the application of innovations. This coordination across sites is especially important with respect to the evaluation process. Given the multiplicity of sites, operational tests, and participating organizations within the APTS Program, there is a strong need for coordination of the evaluation process so as to achieve consistency in the planning, implementation, and output of individual project evaluations.

With respect to the conduct of the evaluations, such coordination will ensure that: (1) the scope of each evaluation effort is consistent with the importance of that particular APTS test relative to other APTS tests; (2) the technical approaches used to evaluate tests are consistent with the current state-of-the-art of evaluation techniques; (3) common data and definitions are employed; and (4) statistical reliability is maintained. With respect to evaluation output, such coordination will ensure that the Final Summary Evaluation Reports associated with individual projects are consistent in terms of content, format, perspective, and level of detail.
This consistency in output will, in addition, enhance the spin-off potential of the evaluations. The achievement of a basic data set of uniform quality across operational tests will make possible inter-project comparisons in terms of rider characteristics, site characteristics, user acceptance, and system efficiency and effectiveness and associated criteria. These types of comparisons will be especially significant in the case of multiple applications of a particular APTS technology in several locations, or in the case of operational tests involving alternative APTS technologies directed towards a particular APTS Program objective.

The coordination of the individual evaluation efforts will be achieved through the Volpe Center's active and continual participation in the program, with functions ranging from initial planning of each project evaluation effort, to monitoring of the contractor team, and finally to the synthesis of individual operational tests, evaluation reports and results. This document constitutes the first stage of the Volpe Center's evaluation coordination function, in that it describes general procedures to be followed by each contractor in performing the various evaluation tasks specified in the contract.

3. GUIDELINES FOR PLANNING EVALUATION ACTIVITIES

This chapter presents guidelines for planning the evaluation activities associated with a particular APTS operational test. As was mentioned in Chapter 2, the evaluation planning phase of the evaluation process is that period during which the contractor prepares a detailed Evaluation Plan based on the Volpe Center's Evaluation Strategy. The Evaluation Plan contains, among other things, a listing of relevant quantitative and qualitative measures related to various APTS, local, and national objectives and relevant issues, associated data collection and analysis procedures, and site-specific data requirements and sources (both one-time and recurring).
As such, the Evaluation Plan constitutes a structured, time-phased program for subsequently conducting the evaluation.

The chapter is organized into three sections, corresponding to the basic decision-making elements shown in Exhibit 3:

. determination of site data requirements and sources,
. determination of measures and collection/derivation techniques required to address APTS Program objectives and other relevant objectives/issues, and
. planning considerations relative to data collection and analysis.

The organization of the chapter is not meant to imply a highly ordered time-sequencing of activities, since the evaluation planning phase is in fact highly iterative and dynamic. Moreover, it is important to realize that the guidelines comprise a basic set of ground rules for planning evaluations. The evaluation contractor will, in all probability, need to depart from these guidelines during the actual planning phase, so as to conform to the unique conditions surrounding a given operational test.

The contractor should recognize its responsibility in working with the local sponsor and the Volpe Center to assure that an objective assessment of the project is achieved. One or more site visits during the evaluation planning phase are desirable to establish working relationships and channels of communication among the involved organizations and to uncover any constraints which may have a significant bearing on the development of the Evaluation Plan. During this planning effort, clarification must be made regarding responsibilities for performing and/or overseeing various activities. The Evaluation Plan should indicate the finally agreed upon allocation of responsibility between the contractor and local evaluation review teams.

EXHIBIT 5: BASIC SITE DATA REQUIREMENTS FOR APTS OPERATIONAL TESTS

1. Population
2. Square miles
3. Population density, persons per square mile
4. Number of persons in the labor force
5. Number of households, by type
6. Age, sex, education, occupation, income distributions
7. Household auto ownership
8. Number of persons with no driver's license
9. Modal split, by trip purpose or time of day if available
10. Existing (pre-operational test) transit service characteristics
    . Organizational arrangements
    . Route miles (fixed route systems)
    . Tour area (non-fixed route systems)
    . In-service vehicles per square mile of service area (non-fixed route systems)
    . In-service vehicles per hour within service area
    . Time of service operation throughout day
    . Days of service operation throughout year
    . Service frequency (fixed route systems)
    . Fare schedule
11. Description of paratransit service characteristics
    . Data on taxi operations
    . Information on carpool promotion/matching programs
12. Map of the site showing:
    . The APTS project service area - note that this might be a contiguous area served throughout by the APTS transit system, or it might be two or more non-contiguous areas linked by the APTS service through a travel corridor
    . The existing transportation network - major highways, transit lines, commuter rail lines
    . Air quality attainment and non-attainment areas
    . Major topographical features such as rivers
    . The central business district
    . Any other important activity centers
13. Description of relevant site features such as:
    . Weather conditions
    . Seasonal population variations
    . Institutional/political climate
    . Economic conditions
    . Cost indices (e.g., cost of living index, prevailing transit wage rates)
    . Population/employment growth rate, land use development patterns
    . Residential mobility
    . Air quality conditions concerning ozone, lead, carbon monoxide, PM10, and other environmental concerns

3.1 DETERMINATION OF SITE DATA REQUIREMENTS AND SOURCES

The purpose of the site data is to provide an in-depth understanding of those characteristics of the site which might in some way influence the outcome of the project or the interpretation of project results.
Obviously, the APTS operational test will not be implemented in a static environment, but rather it will affect the surrounding area. Thus, an examination of certain site characteristics is necessary in order to assess fully and accurately the impacts of the APTS application. An additional function of site data is to enhance the comparability and transferability of APTS project findings. Specifically, if conclusions drawn from one project are to be compared with findings of other similar projects or "transferred" to other potential sites, there must exist an objective approach for such a comparison or transfer. This requires the identification of a set of site-specific measures which permit one to classify sites in terms of meaningful similarities or to identify significant areas in which sites differ. Such measures might employ data pertaining to demographic and land use attributes, transportation facilities, and vehicle travel characteristics, both intra and inter-urban. In addition, information on the political/institutional climate of the area and prevailing attitudes toward transportation-related issues might be helpful in anticipating or understanding any problems regarding implementation and evaluation of the project. A review of past transit project evaluations indicates an inconsistency in both the amounts of and details concerning reported site-specific data. To some extent this inconsistency reflects a lack of standardized site data requirements, but more significantly it reflects deficiencies in knowledge regarding the interplay between site characteristics and test results. In an attempt to shed further light on the subject, a basic set of data requirements has been developed for use in APTS operational test projects (see Exhibit 5). Contractors are encouraged to propose additions or deletions to this list, in the context of particular projects, if it is felt that the nature and scope of the project call for a wider or narrower set of site descriptors. 
Contractors are also encouraged to propose permanent additions, deletions, or changes to this minimum list based on their cumulative experience in conducting APTS evaluations.

EXHIBIT 6: TYPICAL SOURCES FOR SITE DATA

DATA NEEDED                      TYPICAL SOURCES

Demographic                      U.S. Bureau of the Census
                                 City or County Clerk
                                 State Department of Labor
                                 State Department of Internal Revenue
                                 City or County Planning Board

Air Quality                      Environmental Protection Agency

Land Use Characteristics         City Directories
                                 Local, Regional and State Planning Agencies
                                 Tax Assessor's Records
                                 Planning Studies

Motor Vehicle Travel             State Highway Department (or State DOT)
                                 U.S. Census (Journey-to-work)
                                 Local Traffic Department
                                 Earlier Travel Surveys
                                 State Registration Records
                                 Gasoline Tax Collection Records

Public Transportation Travel     Private Transit-Paratransit Companies
                                 Transit Authorities
                                 State Highway Department (or State DOT)
                                 Local Planning Agency
                                 U.S. Census (Journey-to-work)
                                 Earlier Travel Surveys

Travel by Intercity Modes        Federal Agencies such as:
(air, rail, bus)                   Federal Aviation Administration
                                   Interstate Railroad Administration
                                   Federal Railroad Administration
                                   Department of Commerce
                                 State Regulatory Agencies
                                 Earlier Travel Surveys
                                 Private Carriers

Aside from the site data requirements in Exhibit 5, it may be desirable in certain instances to collect a standardized set of attitudinal measures to obtain a profile of the community. Examples would be general opinions regarding the role of government, environmental issues, adequacy of transportation facilities, and desirability of travel by alternative modes. Since the value of this type of data for evaluation and transferability purposes has not yet been fully explored, community profile data will be collected only in selected operational tests (to be identified by the Volpe Center). Appendix A contains sample questionnaires which might be used to obtain such data.
As experience is gained in this area, a standardized approach to developing an attitudinal profile of the test site may be formally incorporated into these guidelines.

It is anticipated that the data set and descriptive information shown in Exhibit 5 will be available from secondary sources or from the local sponsor and will not involve specialized data collection activities (an exception being attitudinal profile data, which will entail surveys). Exhibit 6 indicates typical sources for various categories of site-specific data.[3]

Once the contractor has determined the type of site data required and the appropriate sources, two decisions remain: (1) the geographic scope of the area, and (2) the time period(s). Regarding the geographic scope, it was indicated above that a basic data set should be assembled for the APTS service area.[4] In some cases, data conforming exactly to the service area boundary may be unavailable or may be obtained only by aggregation of fine-grained data (e.g., Census tract). If data is available for an area approximating the service area, the contractor may choose to use this pre-existing data base rather than deriving a special data base, provided that such a substitution will not be misleading and bias the evaluation. On the other hand, the use of fine-grained data may be appropriate if the service area is large and heterogeneous and thus should be divided into zones.

[3] Adapted from Heaton, Carla; McCall, Chester; and Waksman, Robert; "Evaluation Guidelines for Service and Methods Demonstration Projects", USDOT/UMTA-SMD; Washington, DC, 1976.

[4] A definition of the APTS service area may not be available at the outset of the project, but rather will need to be developed during the evaluation implementation phase on the basis of user surveys.

The time period(s) for which data is to be assembled depends on the time period of the operational test and the rate at which conditions at the site are changing.
If the project spans a fairly long period it may be desirable to gather site data for periods before, during, and after the project. In the case of a rapidly changing area or a staged project, data for even more points in time may be necessary. Moreover, if an historical perspective on the site is deemed relevant to the evaluation, it may be desirable to obtain 1980 as well as 1990 Census figures or recent trend data for key variables such as population, employment, and modal split. Since original data collection by the contractor is not anticipated, the number and exact timing of site data periods will be constrained by the collection cycles of existing sources.

3.2 DETERMINATION OF MEASURES AND COLLECTION/DERIVATION TECHNIQUES REQUIRED TO ADDRESS APTS PROGRAM OBJECTIVES AND OTHER RELEVANT PROJECT OBJECTIVES/ISSUES

It was pointed out in Chapter 2 that the Evaluation Strategy will set forth a recommended set of APTS Program objectives, relevant project objectives (of local and national significance), and project issues to be examined. The contractor, in developing the Evaluation Plan, is responsible for reviewing this recommended set in the context of the local sponsor's Project Implementation Plan and the various national and local perspectives, and then proposing appropriate modifications to the list of objectives and issues. Once the set of project objectives and issues has been finalized (which involves obtaining concurrence from the Volpe Center), the contractor must associate with these items a set of germane measures and identify suitable techniques to derive each measure and to collect necessary data. It is important to note that certain issues may not lend themselves to the use of quantitative measures but may rather involve qualitative analysis of pertinent information.

The material presented below is intended to guide the contractor in developing appropriate measures and associated collection/derivation techniques.
It is important to recognize that this material will undoubtedly be modified as information is gained through the consistent application and analysis of evaluation techniques on the operational tests. Therefore, because revisions to data program requirements in terms of basic data sets, collection and analysis procedures, and presentation techniques can be expected, the fundamental value of this section of the guidelines lies in the manner in which it structures the approach to the selection of measures and the selection of techniques for collecting/deriving them.

In preparing this material, considerable documentation was reviewed (see Bibliography). In addition, direct observation of and participation in many previous and ongoing Federally-funded projects have permitted those preparing this document not only to identify a logical structure for project evaluation but also to highlight problem areas of which all potential project evaluators should be aware. The specific projects which contributed the greatest amount of insight were the evaluation plan development for the APTS/AVL operational tests and the Service and Methods demonstration projects.

3.2.1 Basic Set of Measures

To assist the evaluation contractor and the local evaluation team in the selection of measures to assess operational test objectives, six categories of measures are suggested:

. APTS costs,
. APTS functional characteristics,
. user acceptance,
. transit system efficiency,
. transit system effectiveness, and
. impacts.

The first three categories of measures relate directly to the costs, functional aspects, and utility of the APTS application and associated equipment. The next two categories pertain to transit system performance in terms of actual delivery and usage of the transit services provided. The final category of measures addresses project impacts related to critical transportation issues and societal goals and concerns.
While many operational tests will be designed to achieve the same (or similar) objectives, some tests might be unique in their ability to address certain objectives. Consequently, "priority objectives" should be identified in these unique tests, and a corresponding set of measures should be formulated so that these "priority objectives" are given proper attention, emphasis and evaluation resources. Furthermore, the type of measure and the method of measurement should be considered as discussed below.

. Type of measure
    Quantitative -- a measure which is expressed in terms of counts, measurements, dollars, or other physical units
    Qualitative -- a measure which is expressed in terms of people's attitudes, perceptions, or observations

. Method of obtaining measure
    Collected -- obtained by measurement (vehicle travel time), counting (number of passengers), surveying (perceived reliability), or from records (daily revenue)
    Derived -- calculated from collected measures either by simple arithmetic procedures (passenger miles per seat mile) or through use of analytic models (reduction in air pollution or fuel consumption)

In reviewing the basic set of measures, it is important to note that some of these measures would be more meaningful if stratified by time of day (peak versus off-peak), location (corridor versus arterial), person time segments (waiting, access, transfer, in-vehicle), route type (fixed route versus demand responsive), and vehicle tour segments (in-service, non-service). Because such a classification of measures would have needlessly extended the list, the subject of stratification, or categorization, with respect to specific data collection plans is discussed separately in Section 3.3.2 of this chapter.
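As an illustration only, the following minimal Python sketch shows how a derived measure (passenger miles per seat mile) might be calculated from collected measures and stratified by time of day. The record layout, the field names (`period`, `passenger_miles`, `seat_miles`), and the sample figures are hypothetical assumptions made for this sketch; they are not prescribed by these guidelines and are not data from any operational test.

```python
from collections import defaultdict


def passenger_miles_per_seat_mile(records):
    """Return passenger miles per seat mile for each stratum.

    Each record is a dict with hypothetical keys:
    'period' (the stratum), 'passenger_miles', and 'seat_miles'.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # stratum -> [passenger miles, seat miles]
    for rec in records:
        totals[rec["period"]][0] += rec["passenger_miles"]
        totals[rec["period"]][1] += rec["seat_miles"]
    # Skip strata with no seat miles to avoid dividing by zero.
    return {p: pm / sm for p, (pm, sm) in totals.items() if sm > 0}


# Hypothetical collected measures, for illustration only.
trips = [
    {"period": "peak", "passenger_miles": 300.0, "seat_miles": 400.0},
    {"period": "peak", "passenger_miles": 300.0, "seat_miles": 400.0},
    {"period": "off-peak", "passenger_miles": 100.0, "seat_miles": 400.0},
]

ratios = passenger_miles_per_seat_mile(trips)
print(ratios)
```

Totals are summed within each stratum before dividing, so that longer trips carry proportionally more weight than they would if per-trip ratios were averaged. Other stratifications (location, route type) could be handled the same way by changing the key used for grouping.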
The above categories of measures are not to be construed as a minimum requirement for every APTS project, since an evaluation need only encompass measures corresponding to the APTS Program objectives and other project objectives/issues addressed by the particular operational test. Rather, the categories of measures should be used by the contractor as a checklist from which the most germane measures can be selected and to which other relevant measures can be added as appropriate.

It will be noted that, for some of the APTS Program objectives, it is possible to measure attainment from two vantage points: the actual and the perceived attributes of the transit system (as represented by quantitative and qualitative measures, respectively). In the case of transit travel time, it might be appropriate to measure actual changes in travel time and then to compare with perceived travel time. Similarly, in the case of APTS equipment reliability and user acceptance measures, comparisons with user perceptions and attitudes might also be appropriate.

Until more is learned about the interrelationship between actual measurements and attitudinal data, it is not possible to set forth hard and fast rules for when to supplement quantitative measures with qualitative measures. Clearly, it may be prohibitively expensive to employ this two-pronged procedure for each area of interest; on the other hand, mere reliance on quantitative measures may result in overlooking what is in fact the major behavioral determinant -- people's perceptions of the system. For the time being, the contractor must exercise sound judgment in deciding which situations are unique and instructive enough to warrant a two-pronged data collection effort. In no case should an attitudinal measure ever be used in place of a quantitative measure, where both are available.
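The two-pronged comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `travel_time_gap`, the use of simple means, and the sample values are all hypothetical, and an actual evaluation would apply the sampling and statistical procedures set out in its Evaluation Plan.

```python
from statistics import mean


def travel_time_gap(actual_minutes, perceived_minutes):
    """Compare mean measured travel time with mean rider-reported travel time.

    Both arguments are lists of minutes. A positive gap means riders
    perceive trips as longer than the measured values.
    """
    actual = mean(actual_minutes)
    perceived = mean(perceived_minutes)
    return {
        "actual_mean": actual,
        "perceived_mean": perceived,
        "gap": perceived - actual,
    }


# Hypothetical values, for illustration only: measured times might come from
# vehicle records and perceived times from an on-board survey.
result = travel_time_gap([20, 22, 24], [25, 27, 29])
print(result)
```

The perceived figure is reported alongside the measured one here and does not replace it, in keeping with the rule that an attitudinal measure should not substitute for a quantitative measure where both are available.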
The rationale underlying each category of measures and their association with operational test objectives is discussed in Sections 3.2.1.1 to 3.2.1.6. Further discussion of data collection/derivation techniques appears in Section 3.2.2.

3.2.1.1 APTS Costs and Functional Characteristics

Central to an operational test evaluation is the performance of the APTS system and its individual components. Questions surrounding the costs and functional characteristics (including reliability, usefulness, maintainability, adherence to specifications) should be addressed, and the relationship between these APTS characteristics and overall operational test objectives should be examined. Examples of such questions are:

. What are the life cycle costs (including fixed and recurring expenses) of the APTS system and its individual components? Which are "start-up" costs associated with the newness of the system and might be avoided in future applications?
. Is the automated vehicle location system easy to use and are vehicle positions determined quickly and accurately so that on-time scheduling can be carried out and that passengers are provided with timely information?
. Is the smart card system reliable, and does the system meet the required design specifications?

To the extent possible, the objective (or objectives) related to a particular APTS component should be clearly articulated and the specific component costs and associated functionality should be determined. This will facilitate the comparison of APTS costs and associated benefits. It is recognized, however, that individual component costs may be difficult to determine if the procurement process allows lump sum bids.

3.2.1.2 User Acceptance

The extent to which various APTS applications are actually utilized will be an extremely important dimension of performance in each operational test.
The percentages and numbers of riders using a smart card for fare payments are examples of quantitative measurements in this category. In addition, qualitative measures of user acceptance (or utility) would be employed, examples of which include the attitudes of riders regarding the usefulness of AVL-based pre-trip information and the perceptions of dispatchers concerning the benefits of component monitoring equipment.

3.2.1.3 System Efficiency and Effectiveness

Transit system performance is typically viewed in terms of efficiency and effectiveness, both of which may be influenced by the use of the APTS application and other technology. Efficiency is related to the extent to which system inputs such as vehicles, personnel, fuel, and funds are employed to produce outputs; examples of outputs include the actual number of vehicle miles or vehicle hours of service. For example, reductions in unit operating costs would be examined in part with the use of efficiency measures such as the operating cost per vehicle mile or operating cost per vehicle hour. Effectiveness concerns the users and actual demand for service and relates to financial aspects such as revenue and cost effectiveness. In addition, non-financial aspects of effectiveness include service utilization, quality, convenience, safety, security, and service reliability.

3.2.1.4 Impacts

To examine the extent to which the operational test responds to critical transportation issues and national mandates such as the Americans with Disabilities Act, the Clean Air Act, and other Federal legislative efforts, both quantitative and qualitative impact measures are required. Such impacts may be anticipated or unanticipated and positive or negative.
These impacts relate to, for example, the transit agency and its internal activities and administrative procedures; aspects of human factors; privacy; and matters dealing with equity, social, energy, traffic congestion, air quality, special mobility needs, and institutional and political concerns. For example, the use of a smart card might facilitate the implementation of a more equitable and efficient fare policy, as may have been anticipated, but unexpectedly require a reorganization of the transit system's finance department and the existing fare collection and accounting activities and procedures. Another example concerns the use of an automated vehicle location (AVL) system which, as intended, may improve on-time scheduling; however, such scheduling improvements will only be realized after the transit dispatching staff has been properly trained and has learned to use the AVL system for the purpose of communicating with the bus operators.

3.2.1.5 Relationship Between APTS Program Objectives and the Categories of Measures

While the six categories of measures discussed above are not meant to be exhaustive, they do provide structure and guidance in the selection of measures to evaluate APTS program objectives #1, #2, and #3 to the extent that they are associated with each operational test.

The first APTS program objective, as stated in Section 2.1.1, focuses on enhancing the quality of on-street service to riders in terms of safety, security, convenience, ease of travel, and travel time. These concerns fall largely under the categories of APTS functional characteristics, transit service efficiency and effectiveness, user acceptance, and impacts as discussed above. Examples of corresponding measures appear in Exhibit 7.

The second APTS program objective is to improve system productivity and job satisfaction.
Anticipated system productivity improvements might result from reductions in system costs; better schedule adherence; quick and effective responses to incidents and vehicle and facility failures; and information management systems to provide reliable and accurate operating data in a timely manner. Job satisfaction pertains directly to another group of potential APTS beneficiaries; that is, the employees, such as drivers, dispatchers, and data analysts. An APTS application may lead to a change in the day-to-day activities of such employees and may, in turn, lead to reductions in worker stress and increases in job satisfaction. Examples of measures to evaluate the association of each test with this objective are given in Exhibit 7. The third APTS program objective centers around the contribution of public transportation to larger societal issues and community goals. These issues and goals relate to such elements as special mobility needs, traffic congestion, air quality, energy, privacy, equity, and other concerns. Appropriate measures to assess this APTS objective are mainly included in Exhibit 7 under the categories of user acceptance, effectiveness, and impacts. As discussed in Section 2.1.1., the fourth APTS program objective is a somewhat broader objective than the other three and consequently, the above measures will not be used to measure its level of achievement in each test. However, as mentioned in Section 2.1.1., to expand the knowledge base, results of tests will be disseminated in journals, conference proceedings, electronic bulletin boards, technical meetings, and seminars. Each category of measures includes criteria associated with various aspects of APTS applications ranging from their costs and functional characteristics to their association with overall transit system efficiency and effectiveness and other broader societal issues, such as air quality, energy, and special mobility needs. 
The results of each evaluation will be widely disseminated as discussed in Chapter 2, so that professionals have access to the knowledge they need regarding the actual performance of APTS technologies and the use of the analytical techniques employed in the analyses. The availability of such knowledge will lead to the design of improved APTS applications, the conduct of more thorough evaluations, and the utilization of enhanced evaluation analysis tools.

3.2.1.6 Other Objectives and Measures

The six categories of measures in Exhibit 7 are also useful in the selection of measures for other operational test objectives. As pointed out in Section 2.1.2, there will likely be state or local objectives in addition to the APTS program objectives. For example, a state objective might be to reduce the amount of financial operating assistance needed. This would imply that either operating costs must decrease or operating revenues (e.g., fares) must increase. Measures associated with this objective relate to system efficiency and effectiveness. Another example might be a desire to revitalize the central business district. Measures for this objective would fall under the area of economic concerns in the impacts category.

3.2.2 Data Collection/Derivation Techniques

Once the relevant measures for project evaluation have been determined, it is necessary to identify appropriate collection or derivation techniques. Collected measures can be obtained through the following four basic methods:

(1) By measurements, using various instruments, such as stopwatches, odometers, speedometers, and lap-top computers. The accuracy of the recorded data is a function of the accuracy of the measuring instrument itself. Typical measurements include travel times and vehicle velocities.
(2) By counts or observations involving tallies either from discrete digitized recording equipment, lap-top computers, or manual counts. Typical counts would be numbers of passengers in vehicles.

(3) By surveys or interviews which provide information relative to the individual being questioned, said information to include such items as origin, destination, income level, previous travel modes, observations of how the service is functioning, and attitudes towards transit amenities.

(4) By searching records such as those available through the transit system, local sponsor, and other local planning agencies and Census records.

Derived measures can be calculated either through the use of simple arithmetic processes or special analytic models. This form of measures builds upon basic data collected through some of the above means. An illustration of a simple derived measure might be dividing passengers per day by vehicle miles per day to obtain passengers per vehicle mile. Examples of the latter type of derived measures resulting from analytic models might be the use of a time-delay curve to estimate vehicle speeds or the calculation of reductions in fuel consumption and air pollution based on a model using changes in traffic volumes as input.

In view of the large number and variety of measures in Exhibit 7 and the even larger number which are likely to arise during the course of the APTS Program, it would be very difficult to specify in these guidelines a preferred method of data collection for each measure. Moreover, it would be inappropriate to attempt to choose a set of "best" methods from among the techniques already tried; rather, it is desirable to encourage the continual development and implementation of novel techniques with potential for increasing the efficiency or accuracy of evaluations.
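The simple derived measure mentioned above (passengers per day divided by vehicle miles per day) can be sketched in a few lines. This is only an illustration: the function name and the sample figures are assumptions, not values drawn from any APTS test.

```python
def passengers_per_vehicle_mile(passengers_per_day, vehicle_miles_per_day):
    """Derived measure: daily passengers divided by daily vehicle miles.

    Both inputs are collected measures (for example, from counts and
    from transit system records). Names and units are illustrative.
    """
    if vehicle_miles_per_day <= 0:
        raise ValueError("vehicle miles per day must be positive")
    return passengers_per_day / vehicle_miles_per_day


# Hypothetical figures for one route on one day.
rate = passengers_per_vehicle_mile(12_000, 4_800)
print(f"{rate:.2f} passengers per vehicle mile")
```

Because the result depends entirely on the two collected inputs, its accuracy is limited by the less accurate of the two.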
Finally, there is really no requirement for uniformity among data collection techniques, but rather there is a need for consistency and comparability of the data obtained by these collection techniques. The techniques can differ from project to project, as long as they are comparable in terms of accuracy and yield data in a form suitable for analysis both within the project and among projects.

For the above reasons, it is not the intent here to prescribe a standardized approach to data collection. However, it is appropriate to discuss the potential applicability of some of the specific techniques, drawing where possible from previous experience. Exhibit 8 illustrates the range of techniques employed for selected measures in past transportation projects [5].

[5] For further details on collecting transit data, see "Review of Data Collection Techniques," prepared by Booz-Allen & Hamilton, Inc. for FTA, March 1985.

EXHIBIT 8. EXAMPLES OF DATA COLLECTION TECHNIQUES FOR SELECTED MEASURES

Travel times for transit vehicles:
- On-board checkers or on-street checker with stop watches or lap-top computers
- Time referenced equipment connected to bus

Speeds for transit vehicles and autos:
- On-street checkers with radar units or other equipment
- Test vehicle with use of odometer, clock, and other equipment
- Real-time surveillance system with image processing capabilities

Counting auto occupants:
- On-street counts recorded on paper, counters, or lap-top computers

Counting transit vehicle passengers:
- On-board checkers or on-street counts recorded on paper, counters, or lap-top computers
- Bus drivers recording passenger load
- Automatic Passenger Counters

Travel times for autos:
- On-street checkers at selected locations recording license plates and times; calculation of elapsed time by matching plates; possibly in conjunction with video camera and image processing technology
- Time lapse aerial photographs or video
- Floating car with observers to record travel time and stopped time delay using stop watches or other equipment

Counting of transit vehicles and autos:
- Permanent or temporary tube counters or loop detector in lanes or zones of interest
- Visual counts recorded by persons
- Time lapse aerial photographs or video
- Real-time surveillance system with image processing capabilities
- Electronic detectors

Demographic/behavioral/attitudinal data on users/non-users/operators:
- Post cards distributed to auto drivers at exit ramps, to boarding and on-board passengers, and at park-n-ride facilities
- Forms, usually no longer than one page, distributed and returned by mail or collected on buses
- Sampling of autos by noting license plates and subsequent identification through Department of Motor Vehicles files; possibly with video camera and image processing technology
- Interview conducted either at home, work, or within the transit system itself (on board, at stations, etc.) or with transit or local officials

Specific comments on these techniques and general recommendations applicable to collecting the measures follow:

(1) Travel time, speed, and vehicle volume data collection techniques can range from manual to automatic. In general, automatic techniques are effective only where the magnitude of data requirements or some other special circumstances warrant their use. Some of the more sophisticated automatic procedures are subject to reliability problems. Failure of these devices can cause loss of vital data, which will in turn delay the evaluation and considerably increase costs. In addition, the measurement accuracy of automatic or semiautomatic devices may be questionable, particularly if they have not been used extensively before.
In cases where definitive information on device accuracy is not available, it is essential to confirm the accuracy of automatically collected data by periodic use of manual devices. Simple manual devices can be deployed so as to maximize utilization of roadside personnel. For example, in one project, the use of special counters by each observer permitted keeping track of the auto occupancy of each vehicle counted, with the result that two measures were obtained at once. In other projects, special manual devices were used to obtain vehicle counts and occupancy data simultaneously.

(2) Past experience has shown that there is a lack of consistency between passenger counts recorded by transit personnel and counts by on-board or roadside observers. For instance, in one project, it was found that bus drivers tend to overestimate the passenger load and that on-board and on-street counters tend, on the average, to be consistent with each other. If transit personnel are to record such data, it is essential that verifications be made during the project to detect any potential bias or unusual variability in these data.

(3) In utilizing transit system records and service area records, such as census data, it is critical to ascertain the accuracy of these data. Usually, discussions with personnel who initially record these data will provide an assessment of accuracy. Further, where special data are collected for the project by a local organization, monitoring procedures will be established to assure that no modifications in procedures or notations have occurred which might have an impact on the evaluation process.

(4) Demographic, behavioral, and attitudinal data on users and non-users of the services provided as part of the operational test, as well as attitudinal information from transit operators, can be collected through a wide variety of survey and interview techniques, with varying degrees of respondent cooperation, accuracy, and cost.
In view of the large amount of documented survey experience relating to both transportation and general market research contexts, and in view of the large anticipated role of surveys in APTS evaluations, Appendix A has been devoted to a discussion of survey design and execution.

In evaluating the array of existing and potentially innovative collection techniques relative to a particular measure, some of which are included in Exhibit 8 as examples, the contractor should consider factors such as the cost and accuracy of each method, the availability of local resources to implement each method, the ease of implementation, and the ultimate data analysis requirements.

With respect to cost, the contractor should apply sound judgment in determining whether the anticipated cost of using a particular technique is justifiable in terms of the contribution to the overall project evaluation of the specific measure being collected. Clearly, the total project expenditure for data collection should be allocated among individual measures, taking into account each measure's contribution to the project evaluation. The contractor should make special note of any data item which is relevant to the evaluation but whose collection cost appears to be disproportionately high in relation to other items.

The contractor should determine whether the accuracy of a particular technique is consistent with the accuracy requirement for the measure, which in turn is dependent on the relative importance of the measure. A very accurate technique is probably not warranted for a relatively insignificant measure, especially if that technique would be expensive to implement. In addition, a high degree of accuracy for some measures may be inconsistent with a lesser degree of accuracy for others.

The contractor should also evaluate alternative techniques in light of the available local resources -- labor resources as well as equipment.
An attempt should be made to utilize existing equipment or rental equipment arrangements wherever feasible, rather than opting for techniques which require the purchase of new equipment (which might not be needed by the locality after the APTS evaluation).

The contractor's Evaluation Plan should contain justification for selecting the particular technique applicable to each measure in terms of these considerations. In the case of a novel technique, it is required that the contractor demonstrate acceptable accuracy before it can be used as the sole source for data collection. It is further required that the evaluation contractor document its experience with those data collection methods employed in an evaluation, as explained below in Chapter IV. As this further experience develops, the Volpe Center will make this information available via updates to this Guidelines document.

3.3 PLANNING CONSIDERATIONS RELATIVE TO DATA COLLECTION AND ANALYSIS

The preceding section contained guidelines relative to specifying appropriate measures and collection/derivation techniques for addressing APTS Program objectives and other project objectives and issues. This section completes the discussion of evaluation planning activities with general guidelines for data collection and analysis procedures. The material in this section, while intended to be applied to individual measures selected for inclusion in the evaluation, is presented in a general context. The following topics are included: basic data collection/analysis design, measure stratification, sampling requirements, and the timing of data collection.

3.3.1 Basic Data Collection/Analysis Design

A significant aspect of the evaluation process for APTS operational tests is determining the basic data collection and analysis design to be employed relative to specific project objectives.
There is a great variety of potential design approaches, ranging from an "after-only" design (a one-shot case study approach involving a single set of measurements taken after the project is operational) to a "before-after with control group" design (involving a comparison of multiple measurements). A General Accounting Office (1991) report entitled "Designing Evaluations" presents guidelines, with the use of a "decision tree," to assist in the selection of an evaluation design including case studies, cross-section or panel surveys, comparative group analyses, or a before and after study. A comprehensive discussion of the specific utility and the relative pros and cons of the various design approaches can be found in Donald T. Campbell and Julian C. Stanley, Experimental and Quasi-Experimental Designs for Research, 1968, and L. Mohr, Impact Analysis for Program Evaluation, 1988. The information which follows is intended to discuss the relative advantages of various approaches in the context of the APTS program and to highlight the major considerations involved in selecting the appropriate design for each APTS evaluation, or for individual measures included in the evaluations.

In general, a single set of measurements (for example, taken while the test is in operation) will be insufficient for assessing the impact of the test, since it will not provide any yardstick with which to interpret the measurements. It is recommended, therefore, that every data collection/analysis program be structured around some form of comparison. If such an approach is for some reason infeasible, the contractor must indicate the reason(s) in the Evaluation Plan.

Given that the basic data collection/analysis design will generally be in the form of a comparison of multiple measurements, the next question to be considered is what types of comparison are appropriate. The two main forms of comparison are before vs. after and test vs. control.
In a before-after comparison, a given measure is collected on a given system element before the experimental or exemplary operational test technique is instituted and then again while the technique is operational [6]. In a test-control comparison, a given measure is collected on a system element which has been affected by the introduction of a technique (test unit) and also on an equivalent system element which has not been similarly treated (control unit). Each type of comparison is somewhat limited: the before-after comparison fails to show what portion of the change in the measure is due to external factors; the test-control comparison shows the difference between "after" measures and hence accounts for external factors, but fails to indicate the degree of change from the before state to the after state. Accordingly, it is desirable, where feasible, to conduct a before-after comparison in conjunction with a test-control comparison. In other words, the data collection/analysis design should, if possible, involve observation of both a control and a test unit before and after the institution of the APTS application.

To make the foregoing discussion more concrete, consider a large area with many bus routes and suppose that a certain fraction of them are treated in some manner (i.e., an APTS application is implemented which can be expected to reduce bus travel time). If pre-application and post-application measures of travel time are made only on the treated routes and a reduction in time is indicated, there is no way of knowing the extent to which the improvement is attributable to external factors (for instance, a decrease in auto traffic on the streets where the buses operate).
In order to account for, in a quantitative fashion, these known or unknown factors which have arisen during the interval between the before and after measurements, it is necessary to make before and after measurements of bus travel time on routes which are comparable to the test routes and therefore susceptible to the same set of external factors. The difference between the travel time reduction on the test vs. control routes can then be taken as the true change due to the application. To make these statements, it is necessary to be fairly confident that conditions affecting both control and experimental units are reasonably similar -- a requirement which is sometimes difficult, if not impossible, to assure.

[6] As is discussed below, a before-after comparison does not necessarily imply a single measurement before the operational test is implemented and another measurement while it is in operation. Rather, this type of comparison can take the form of a series of measurements prior to, during, and after the operational phase of the operational test. If the project is implemented in stages, there will be a series of measurements corresponding to each stage.

To reiterate, the proper use of the combined before-after/test-control approach guarantees to the greatest extent that any observed improvement is indeed due to an operational test application. Thus, the contractor should employ both types of comparisons wherever appropriate and feasible.

The determination of appropriateness of the combined approach involves a consideration of the scope and time span of the operational test. Regarding the scope of the project, the larger the geographic area encompassed by or affected by the project, the greater the possibility that no control units can be identified (i.e., the entire population is composed of test units). Regarding the time span of the project, no generalizations can be made since tests will vary in length depending on a variety of factors.
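The combined before-after/test-control comparison described above amounts to subtracting the change observed on the control routes from the change observed on the test routes. The following is a minimal sketch under stated assumptions: the function names and the travel times (in minutes) are hypothetical, and a real evaluation would also need the sampling and significance considerations treated elsewhere in these guidelines.

```python
from statistics import mean


def mean_change(before, after):
    """Change in the mean of a measure from the before to the after period."""
    return mean(after) - mean(before)


def net_change(test_before, test_after, control_before, control_after):
    """Change on test units minus change on control units.

    The control change stands in for external factors affecting both
    groups, so the remainder is attributed to the APTS application.
    """
    return (mean_change(test_before, test_after)
            - mean_change(control_before, control_after))


# Hypothetical bus travel times in minutes on three test and three control routes.
test_before = [42.0, 45.0, 39.0]
test_after = [36.0, 38.0, 34.0]
control_before = [41.0, 44.0, 41.0]
control_after = [40.0, 43.0, 40.0]

effect = net_change(test_before, test_after, control_before, control_after)
print(f"Net change attributed to the application: {effect:+.1f} minutes")
```

In this made-up case the test routes improve by 6 minutes and the control routes by 1 minute, so only 5 minutes of the improvement would be attributed to the application. The attribution holds only if the control routes are genuinely comparable to the test routes.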
As a general rule-of-thumb, the desirability of the combined before-after/test-control approach increases with the time span of the project, since this approach reveals internal as well as external changes occurring over the project's duration.

The determination of feasibility of the combined approach involves questions of data availability and project timing. If there is a known deficiency in either type of comparison, then only the valid comparison should be employed; it is generally better to do without a before observation or a control observation than to settle for unsuitable before or control data.

In the event that only one type of comparison is feasible, there are alternative techniques and precautionary measures available to the contractor to compensate for the absence of the other type of comparison. If no control group exists (e.g., if the operational test affects the entire population of observation units, making each one a test unit) or if no suitable group can be found (each test unit is unique), then the contractor should be especially observant throughout the evaluation period of possible external factors which might influence the interpretation of project results. Any statistics regarding the before vs. after change due to the applied technique should be examined very carefully in the context of these observed external factors, and any conclusions based on such statistics should be qualified accordingly.

If, due to project timing, there is no opportunity to perform before measurements, or if it is known beforehand that the units to be observed will undergo considerable change between the before and after periods, the contractor should attempt to obtain surrogate data for the before period.
Possible sources of surrogate data would include: (1) surveys conducted after the test is operational which question people about conditions or their behavior prior to the implementation of the technique; and (2) demographic and travel data collected by the local highway department, planning agency, or transit operator some time prior to the operational test. The surrogate data can be used to provide some indication of the magnitude of the before-after change experienced by the test and control groups.

In using the before-after and/or test-control approach, one of the key steps is identifying comparable units. To as great an extent as possible, the units observed for the before case must be equivalent to the units observed for the after case. Returning to the previous example of bus routes, before-after comparability is not a difficult problem, since the same routes can be observed for both time periods. The only note of caution is that the routes should be unchanged (with respect to length, number and location of stops, etc.) from one measurement period to the next.

Test-control comparability, on the other hand, raises some interesting problems. Theoretically, the test and control units should be as nearly alike as possible to rule out any chance of the observed change being a result of something other than the operational test application. Test and control units should be chosen which are similar in terms of variables assumed to be related to the particular measure. Again, using the example of bus routes and the measure travel time, matching of test and control routes could be done on the basis of such descriptors as route length, total trips along the route, peak headway, and average speed.
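One way to make the matching of test and control routes concrete is to put each descriptor on a common scale and pick the candidate control route closest to the test route. The guidelines do not prescribe a matching method; the nearest-neighbor approach below, the field names, and the route data are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

# Descriptors named in the text; the field names and units are assumptions.
DESCRIPTORS = ("route_length_mi", "daily_trips", "peak_headway_min", "avg_speed_mph")


def standardize(routes):
    """Convert each descriptor to a z-score so no single unit dominates."""
    scaled = {name: [] for name in routes}
    for descriptor in DESCRIPTORS:
        values = [route[descriptor] for route in routes.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, route in routes.items():
            z = (route[descriptor] - mu) / sigma if sigma else 0.0
            scaled[name].append(z)
    return scaled


def closest_control(test_name, control_names, routes):
    """Return the control route with the smallest Euclidean distance to the test route."""
    scaled = standardize(routes)

    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(scaled[a], scaled[b])))

    return min(control_names, key=lambda name: distance(test_name, name))


# Hypothetical routes: T1 is a test route, C1 and C2 are candidate controls.
routes = {
    "T1": {"route_length_mi": 8.0, "daily_trips": 60, "peak_headway_min": 10, "avg_speed_mph": 12.0},
    "C1": {"route_length_mi": 8.5, "daily_trips": 58, "peak_headway_min": 10, "avg_speed_mph": 12.5},
    "C2": {"route_length_mi": 15.0, "daily_trips": 30, "peak_headway_min": 30, "avg_speed_mph": 20.0},
}

match = closest_control("T1", ["C1", "C2"], routes)
print(f"Closest control route for T1: {match}")
```

A numerical match of this kind is only a screening aid; local knowledge of the routes should still confirm that the chosen control is exposed to the same external factors.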
The Volpe Center's Evaluation Strategy will generally suggest the basic data collection/analysis design to be employed for each project as a whole or for particular measures (e.g., before-after comparison, test-control comparison, both types of comparison, or a single set of measurements). The contractor should determine the feasibility of such suggestions in terms of the data availability and time frame of the particular project and site. The contractor's Evaluation Plan should then elaborate on the approach finally selected for each measure, indicating information such as the specific units chosen for the control and test groups.

3.3.2 Measure Stratification

Measure stratification refers to the categorization of individual measures for collection/derivation and/or analysis purposes. Examples of measure stratification are: (1) peak versus off-peak time periods, (2) day of the week, (3) revenue (in-service) versus non-revenue service, (4) waiting, access, transfer and in-vehicle travel times, and (5) fixed route versus demand responsive.

Measure stratification improves the quality of the evaluation by allowing an assessment of how changes in measures relate to the stratification categories, hence facilitating the formulation of more specific findings and conclusions. Whereas collection of an unstratified measure provides only a single, average reference point, the use of a stratified measure provides a series of reference points, each of which may be significant to the analysis and interpretation of results. Knowledge of inter-category differences in results enhances transferability; for instance, if a particular operational test proves to be especially beneficial in congested areas but of limited value in sparsely traveled areas, then other sites considering implementation of the service will know to focus their efforts in congested areas.
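The contrast between a single unstratified average and a series of stratified reference points can be shown with a short sketch. The stratum labels and travel times below are hypothetical, and the helper name is an assumption.

```python
from collections import defaultdict
from statistics import mean


def stratified_means(observations):
    """Group (stratum, value) pairs and return the mean value per stratum."""
    groups = defaultdict(list)
    for stratum, value in observations:
        groups[stratum].append(value)
    return {stratum: mean(values) for stratum, values in groups.items()}


# Hypothetical bus travel times in minutes, tagged by time period.
observations = [
    ("peak", 48.0),
    ("peak", 52.0),
    ("off-peak", 30.0),
    ("off-peak", 34.0),
]

overall = mean(value for _, value in observations)
by_period = stratified_means(observations)

print(f"Unstratified mean: {overall:.1f} minutes")
for period, avg in by_period.items():
    print(f"{period}: {avg:.1f} minutes")
```

Here the unstratified mean of 41 minutes hides an 18-minute gap between peak and off-peak conditions, which is the kind of inter-category difference stratification is meant to expose.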
Stratification can take the following forms: (1) categorization of a measure into additive components (e.g., measuring person trip time in terms of trip components such as access time, line-haul time); (2) categorization of a measure, and possibly its components, according to target market, operational, geographic, or time categories (e.g., measuring trip time for peak and off-peak periods); and (3) grouping of raw values of a measure into class intervals, with class intervals determined either before or after data collection (e.g., determining the distribution of early, late, and on-time arrivals).

It is not possible a priori to present a standardized approach to be used for each measure. Clearly, the appropriate type and level of stratification depend on the particular measure and on the characteristics of the site and project. However, in order to provide the contractor with some guidance in this area, examples of possible types and levels of stratification are presented below.

3.3.2.1 Categorization of a Measure Into Additive Components

This form of stratification involves collecting and reporting data separately for specific components, or sub-breakdowns, of a measure. The purpose of categorizing in this manner is to single out the effect of an APTS application on these specific components. Examples of this form of stratification are available for measures relating to travel time, reliability, and productivity.

Person transit trip time for fixed route systems can be broken into segments as depicted in the following diagram:

[Diagram not reproduced: a trip from Origin to Destination divided into consecutive segments.]

where:
Segment A = Access time
Segment W = Waiting time for first vehicle or for subsequent transfer
Segment T = In-vehicle transit time
Segment E = Egress time
ti = Time for ith trip segment

If further amplification is desired, access time and egress time can be subdivided into walking, riding, and other portions; or in-vehicle transit time can be subdivided into collection, line-haul, and distribution phases.
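Because the trip segments are additive, total person trip time is simply the sum of the segment times, and a before-after change can be reported component by component. The sketch below assumes one aggregated value per segment type (all waits and transfers folded into a single waiting figure); the class name and the times in minutes are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class TripTime:
    """Person transit trip time split into additive components (minutes)."""
    access: float      # Segment A
    waiting: float     # Segment W, first wait plus any transfer waits
    in_vehicle: float  # Segment T
    egress: float      # Segment E

    def total(self):
        """Total trip time is the sum of all segment times."""
        return sum(getattr(self, f.name) for f in fields(self))


def component_changes(before, after):
    """Change in each component, so the affected segment can be singled out."""
    return {f.name: getattr(after, f.name) - getattr(before, f.name)
            for f in fields(TripTime)}


# Hypothetical average trip before and after an APTS application.
before = TripTime(access=5.0, waiting=8.0, in_vehicle=22.0, egress=5.0)
after = TripTime(access=5.0, waiting=5.0, in_vehicle=20.0, egress=5.0)

print(f"Total before: {before.total():.0f} min, after: {after.total():.0f} min")
for name, delta in component_changes(before, after).items():
    print(f"{name}: {delta:+.0f} min")
```

Reporting the components separately shows that the 5-minute saving in this made-up case comes from waiting and in-vehicle time, with access and egress unchanged.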
In the case of demand-responsive systems, some of the trip time components might take on a different definition: for example, access time would be zero, and waiting time would refer to the difference between the caller's requested time of pick-up and the arrival time of the vehicle at the origin. In cases where the caller is told that pick-up can only be made later than the requested time [7], wait time can be further divided into the time between the requested pick-up time and the promised pick-up time, and the time between the promised pick-up time and the arrival time of the vehicle at the origin. This latter travel time component is, in itself, a basic transit system reliability measure in the category of effectiveness measures summarized in Exhibit 7. In-vehicle transit time, if desired, can be divided into the direct routing travel time (the time between the person's origin and destination if no other pick-ups or drop-offs are made) and the detour travel time (the time spent detouring to make other pick-ups and drop-offs).

Transit vehicle time is always to be broken into in-service time and non-service time. However, if desired, these two prime categories can be further divided as indicated below.
For fixed route systems:

In-service
- In motion
- Loading
- Non-productive -- waiting for lights, metering, or other obstacles to motion

Non-service
- Garage to first service point
- Last service point to garage
- Dead turnaround time
- Deadhead time
- Other

For demand responsive systems:

In-service
- In motion with one or more passengers onboard
- In motion with no passengers onboard and in the act of picking up one or more passengers
- Loading

Non-service
- Garage to first pick-up point
- Last drop-off point to garage
- Between first pick-up point and last drop-off point with no passengers onboard and not in the act of picking up one or more passengers

[7] Due to the potential ambiguity associated with requests for immediate service, the contractor should note how the particular transit operator maintains data on requested and promised pick-up times.

These time segments are depicted in the following diagram:

[Diagram not reproduced: vehicle path through points A to I.]

where:
Point A = Garage
B = First pick-up point
C = Drop-off point -- no passengers on vehicle but driver is instructed to proceed immediately to pick up a passenger
D = Pick-up point
E = Drop-off point -- no passengers on vehicle and there are no requests for immediate pick-up; driver is instructed to proceed to central waiting point
F = Point enroute to central waiting point -- driver is instructed to proceed immediately to pick up a passenger
G = Pick-up point
H = Last drop-off point of day
I = Garage

Note that in segments BC and GH pick-ups and drop-offs are being made and at least one passenger is always onboard. Also, all pick-up and drop-off points include time spent waiting for riders to board and deboard vehicles.

For operating costs of APTS operational tests, it has been decided that the aggregation of cost items should be consistent with FTA Section 15 expense categories. Exhibit 9 is a matrix showing the distribution of expense object classes into functional areas under Section 15.
Because of possible differences in current internal accounting practices, it is essential that any techniques for disaggregation and allocation of costs be described in the Evaluation Plan. In addition, because of different funding mechanisms, it is important to review individual transit authority practices in depth. Operating costs should also be reported using a consistent time frame for reporting periods.

3.3.2.2 Categorization of a Measure According to Target Market, Operational, Geographic, or Time Categories

The primary purpose of this form of stratification is to evaluate the effect of APTS applications in different contexts. As in the case of categorization into additive components, this form of stratification involves collecting and reporting measures separately for each category. Examples are as follows:

    Target Market:
        Trip purpose -- work/non-work
        User group -- commuters/non-commuters
        Mode -- auto/transit/other
    Operational:
        Type of transit service -- express/local; fixed route/demand responsive
        Direction of traffic flow -- inbound/outbound
        Type of thoroughfare -- freeway/arterial
    Geographic:
        Within/outside central business district
        Zones with different demographic characteristics
    Time:
        Peak/off-peak
        Weekday/week-end

Finer stratification in the above examples is also possible. For instance, within the target market category, the trip purpose "non-work" can be divided into medical, social, recreational, etc.; non-commuter can be stratified into elderly, disabled (ambulatory and non-ambulatory), unemployed, etc.; and mode can be divided into solo driver auto, carpool auto, chauffeured auto, and specific local transit service options. Types of bus service can be divided into local feeder, local line-haul, and express line-haul, and further divided into individual routes, and beyond that into route segments. Time of day can be refined into the four Section 15 categories (A.M. peak, midday, P.M.
peak, night) or even further into hour, half-hour, or 15-minute segments within certain categories.

In some instances it will be desirable to partition collected data into various target market categories, since most operational tests will probably consist of specific innovations aimed at particular user groups. The decision as to whether to stratify collected data by operational and geographic categories depends on the nature of the project and thus will have to be made on a case-by-case basis. However, it is recommended that serious consideration be given to using a minimum time of day stratification (peak/off-peak) for every measure, since many transit system operating characteristics as well as general traffic conditions vary widely between peak and off-peak periods. The decision as to stratification of data collection within the peak period (i.e., morning vs. evening peak) and within the off-peak period (i.e., midday vs. nighttime) should be made in accordance with the time of APTS service operation throughout the day and the variability of travel conditions and other relevant factors between the different categories. It is important to note that the peak period may shift depending upon distance from the CBD and type of transit system. Other issues regarding data stratification and analysis are discussed in Section 4.2.

3.3.3 Grouping of Raw Data Into Class Intervals

Measure stratification can also refer to the grouping of raw data into intervals, with intervals determined before or after data collection. Whereas the first two forms of stratification involve collecting and reporting a measure separately for each category (e.g., change in travel time during peak periods and during off-peak periods), this type of grouping produces a frequency distribution for the particular measure.

Survey data on traveler behavior, characteristics, and attitudes are a good example of pre-collection determination of intervals.
For instance, comparisons of users and non-users of an APTS test can be made using distributions of such measures as age, income, auto availability, and attitudes toward transit, with the particular response categories of each measure having been determined beforehand. Appendix A contains recommended response categories for selected demographic and travel behavior measures, as well as sample questions and response categories for selected attitudinal data.

Reliability measures provide examples of intervals that can be determined after data collection. The difference between scheduled and actual arrival time at an access point would be collected in its raw form (i.e., each vehicle's time difference in minutes), but would be reported as a frequency distribution. A suggested minimum stratification of this measure is:

    % early
    % on time (vehicles arriving within +x or -y minutes of scheduled time)
    % late

The contractor should be aware of differences in transit company standards with respect to schedule adherence, and the potential impact on data collection and analysis procedures.

Vehicle delays due to breakdowns can be grouped according to the following minimum stratification:

    % No delay (delay of 2 minutes or less)
    % Delayed
    % Total disruption of service

If further detail is desired, the late category under schedule adherence and the delayed category under vehicle reliability can be divided into categories such as: 1-5 minutes delay, 6-10 minutes delay, over 10 minutes delay.

The basic intent of grouping is to summarize the raw data without masking the real form of the distribution for a given measure. In addition, the extent of grouping may also depend upon the specific analyses which are planned.

Interval grouping can be used in conjunction with either of the two forms of stratification previously discussed. For instance, person trip time can be stratified into components (access time, etc.), and time period (peak vs.
off-peak), and then grouped into 5- or 10-minute intervals to obtain a frequency distribution.

As was stated above, it is not possible in these guidelines to present a standardized approach to stratification for each measure. The contractor will therefore have to rely on judgment and past experience to determine which types of variable stratification are most likely to enhance understanding of specific areas of project effectiveness and potential application.

The contractor should plan data collection activities with the finest stratification which can be justified as appropriate for the APTS objectives. Since the ultimate sample size will be directly related to the number of categories employed, the contractor should make sure that the available sample units are sufficient to support the level of stratification deemed desirable. The Evaluation Plan developed by the contractor should contain justification for the type(s) and level of stratification selected, as well as evidence that such stratifications are feasible from the standpoint of data and sample size availability.

3.3.4 Sampling Requirements

Once the contractor has determined the basic data collection/analysis design for the project evaluation and the type(s) and level of stratification for each measure, the final question to be addressed is sampling requirements. In general, data required from records maintained by the transit operator or other organizations should be available on a continual basis over the entire lifetime of the experimental test, and such data should not require sampling. On the other hand, data obtained from measurements, counts, and surveys will generally not be available on a continual basis but will have to be collected in the form of samples. There may also be situations where measurements or counts yield continual data, but sampling is desired in order to reduce data processing expenses.
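
As a rough illustration of how the level of stratification drives sampling requirements, the sketch below multiplies a per-category sample size by the number of stratification cells and checks the result against the sample units available. It assumes the common textbook formula for estimating a mean, n = (z * s / e)^2, where z is the standard normal value for the chosen confidence level, s is an assumed standard deviation, and e is the tolerable error. The formula choice, the function names, and all numeric inputs are illustrative assumptions; they are not taken from these guidelines or from Appendix B, whose recommendations should govern an actual Evaluation Plan.

```python
import math
from statistics import NormalDist


def per_cell_sample_size(std_dev, tolerable_error, confidence=0.95):
    """Textbook sample size for estimating a mean to within +/- tolerable_error.

    Assumed formula: n = (z * s / e)^2, rounded up to a whole observation.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided normal value
    return math.ceil((z * std_dev / tolerable_error) ** 2)


def total_sample_size(levels_per_dimension, std_dev, tolerable_error, confidence=0.95):
    """Return (number of cells, total observations) for a fully crossed stratification.

    levels_per_dimension lists the number of categories in each stratification
    dimension, e.g. [2, 2] for peak/off-peak crossed with weekday/week-end.
    """
    cells = math.prod(levels_per_dimension)
    per_cell = per_cell_sample_size(std_dev, tolerable_error, confidence)
    return cells, cells * per_cell


def is_feasible(available_units, required_units):
    """True when the available sample units can support the stratification."""
    return available_units >= required_units


if __name__ == "__main__":
    # Hypothetical inputs: standard deviation of 6 minutes in trip time,
    # tolerable error of 1.5 minutes, 95 percent confidence.
    cells, required = total_sample_size([2, 2], std_dev=6.0, tolerable_error=1.5)
    print(cells, required, is_feasible(200, required))
```

Under these assumed inputs each cell needs 62 observations, so the four peak/off-peak by weekday/week-end cells need 248 in total, and a pool of 200 available observations would not support that level of stratification. Adding a third dimension multiplies the requirement again, which is why the feasibility evidence belongs in the Evaluation Plan.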
When collection of a particular measure involves sampling, an estimate of the minimum sample size must be made prior to the initiation of the data collection effort. In estimating sample size requirements, the objective is to have a large enough sample to be able to draw valid inferences about the population from which the sample is drawn. As might be expected, the determination of appropriate sample sizes involves trade-offs between the desired level of precision and the cost of data collection. These trade-off decisions in turn require a determination, during the evaluation planning phase, of the appropriate types of analyses to be performed (e.g., estimates of population parameters, comparisons between two or more groups of sampled data).

Appendix B presents specific guidelines relevant to estimating required sample sizes. Included in the discussion are: (1) references to statistics books containing sample size equations, (2) recommendations regarding values for the three input factors in the sample size equation, and (3) suggestions regarding implementation of the field data collection effort based on the calculated sample size values. Appendix B also contains a section on the basic types of possible statistical analyses, appropriate confidence levels, and desirable reporting formats.

The contractor should follow the guidelines in Appendix B to develop appropriate sample sizes for each measure. The Evaluation Plan should contain the sample size values, along with an explanation of any assumptions or special procedures underlying these values (e.g., equations, input factor values used).

3.3.5 Timing of Data Collection

For measures based on sampling, another issue to be addressed by the contractor is the timing of data collection.
The exact periods during which measures are collected have a significant effect on the validity and representativeness of evaluation results, since the operation and effectiveness of a transportation system are sensitive to various factors associated with time. Four basic questions arise concerning the timing of data collection:

    (1) The appropriate season(s) of the year and day(s) of the week to include in the sample,
    (2) The appropriate duration of each data collection period,
    (3) The proper time to initiate data collection, and
    (4) The appropriateness of "one-shot" vs. periodic monitoring.

The particular season(s) and day(s) depend largely on the assumed sensitivity of the APTS application to each time unit. If it is deemed appropriate to assess the impact of the APTS application under reasonably normal conditions, data collection should be performed during the fall and spring, when weather conditions are not severe, schools are in session, and few people are on vacation. To the extent that the experimental test evaluation involves measures related to travel patterns and transit usage, the contractor should attempt to schedule data collection activities during those two seasons, which are most representative of normal conditions. On the other hand, if severe weather conditions or other atypical conditions are an inherent feature of the site and it is desirable to examine the experimental test under a full range of possible conditions, the contractor should schedule data collection throughout the year so the sample observations include extreme as well as normal conditions.

If a particular transit service operates seven days a week, then the sample of days should include both weekdays and week-end days (in fact, the data should be stratified by weekday vs. week-end day to highlight the differences during these two periods). Regarding which day(s) to include in the weekday sample, similar logic applies as in the case of seasons.
If the aim is to observe the project under typical weekday conditions, then any day(s) with abnormal traffic patterns should be avoided. In some cities, there is a difference between Monday and/or Friday conditions vs. Tuesday/Wednesday/Thursday conditions; if this is known to be the case for a particular test site, then data collection should be scheduled for the three "typical" days rather than either of the atypical days. The contractor should consider the special characteristics of the operational test and the site in deciding which days are appropriate. If a large number of days is going to be involved, and there is no particularly significant distinction among days of the week, then a randomly selected sample of days would be preferable.

The duration of each data collection period should be determined based on the degree of day-to-day variability and on the required sample size. If the particular item being measured is suspected to vary in behavior from one day to the next, then the data collection period should include several days; if it has been determined that only Tuesdays, Wednesdays, and Thursdays can be used, then several weeks may be necessary to achieve the required sample of days. Moreover, if the sample size required for a particular variable is large, then several days of data collection may be appropriate to obtain the minimum sample of observations.

The choice of initiation time for each data collection period is dependent on a number of considerations, the chief one being that the "after" data collection not begin until the APTS application is fully operational and its performance has stabilized. In general, it will probably take at least a few months for an APTS application to become fully operational, with all the "bugs" worked out and possible behavioral influences associated with the application eliminated. The desire is to achieve a "steady state" for the system after the application has been implemented.
The time to achieve this "steady state" undoubtedly will vary from project to project. Thus, data collection related to the test should not commence until these adjustments and modifications are completed. Other factors determining the initiation date for data collection are the desire to avoid summer and winter months and the overall schedule of the operational test.

In most instances, data collection will be performed for discrete phases of the operational test (i.e., before the project is implemented, while the project is operational, and possibly after the project is terminated). Post-operational test data collection would only be performed if there were a desire to see whether operation of the APTS experiment for a limited period had led to permanent changes in people's travel patterns or attitudes. However, if operational test elements are by nature changing continually or if it is expected that the APTS application will cause gradual but continual changes in transit performance measures, then a periodic process of data collection would be more appropriate than merely "before," "during," and "after" data collection. The multitude of data points obtained from a periodic monitoring process will make possible the examination of functional relationships either among measures of interest or in a time series. Moreover, monitoring of certain measures during the early months following introduction of the application(s) may be useful in determining when the effects have stabilized enough to initiate full-scale data collection. It should be noted that if periodic data collection is appropriate, then a sequential analysis procedure (similar to control charts) may be useful to permit reductions in sampling requirements.

The contractor's Evaluation Plan should indicate the exact timing of data collection for each measure involving sampling.
This information should be presented in a schedule which also shows the projected implementation dates for the various elements of the project.

4. GUIDELINES FOR PERFORMING EVALUATION ACTIVITIES

This chapter presents suggestions relative to implementing the evaluation of an APTS operational test. During the evaluation implementation phase of the evaluation process, data collection/analysis relating to site characteristics, quantitative measures, and qualitative measures is undertaken according to the plans and procedures laid out in the Evaluation Plan. In addition, information is gathered relative to the project's operational history and external events which may have some bearing on the project outcome. This information is eventually incorporated into the analysis and interpretation of project results.

Contractor functions during the evaluation implementation phase include monitoring and/or performance of data collection activities, data reduction and analysis, subjective analysis of information relative to project issues, and synthesis of project findings into a Final Summary Evaluation Report. In accordance with these contractor functions, this chapter of the guidelines is organized into two sections: (1) monitoring/performance of data collection and (2) data reduction, analysis, and presentation. The recommended content and organization of the various contractor reports prepared during this phase are presented in Chapter 5.

During this phase, the contractor must maintain a sensitivity to the relationships among the organizations involved in the project -- in particular the local sponsor or project team, FTA, and the Volpe Center (see Chapter 2). The contractor must work closely with these groups at the appropriate times, while maintaining the role and perspective of an external, objective organization assessing the impact of the operational test.
4.1 MONITORING/PERFORMANCE OF DATA COLLECTION

Since much of the data required for evaluations will be unavailable from pre-existing data bases and secondary sources, each operational test will undoubtedly involve significant data collection efforts. Given the considerable amount of time and money which will be spent on data collection, careful management and oversight of the data collection process are essential. Where possible and appropriate, data collection may involve the use of students from local colleges and universities.

The contractor is responsible for ensuring that data collection is performed according to the Volpe Center/FTA-approved Evaluation Plan. There are three potential alternatives associated with data collection. One of these occurs when the local sponsor or operator collects all data (under FTA/APTS and/or local funding), and the contractor acts in a monitoring role to assure the quality and timeliness of data collected, as well as adherence to procedures laid out in the Evaluation Plan. A second alternative occurs when the contractor collects the data, and coordinates the timing and performance of these activities through the local sponsor or operator. The third possibility is one in which both collect various elements of the data.

In order to monitor and/or perform the data collection activities called for in a given evaluation, the contractor will need to maintain open channels of communication with the site, in the form of visits, telephone and written correspondence with the appropriate local agencies, and subscriptions to local newspapers. In the rare instance where day-to-day contact with the site is necessary, the contractor should arrange to base a member of the firm at the site.

Whether data collection is being performed by the contractor or by the local sponsor, the contractor must stay closely involved in all phases to make sure the procedures specified in the Evaluation Plan are followed.
In cases where the local sponsor or other local agency is collecting data, the contractor should meet frequently with the agency to discuss progress and problems, work out solutions to the problems, and observe key phases of field data collection. In addition, the contractor should occasionally perform independent spot checks, especially in the case of measures for which the local agency has limited experience in data collection.

The contractor is expected to inform the Volpe Center of the status of data collection in its Monthly Evaluation Progress Reports (see Chapter 5 for the recommended content and organization of this type of report). Should there be an unacceptable degradation of quality or timeliness of data collected by the local sponsor, the contractor should notify the Volpe Center in writing. The Volpe Center will in turn take steps through the FTA Project Manager to rectify the situation.

Over and above monitoring data collection activities, the contractor should keep abreast of the status of the operational test. This awareness of project operational status is important so that: (1) data collection activities can be smoothly coordinated with ongoing project activities (causing minimum disruption of day-to-day operations), and (2) evaluation results can be interpreted in the context of project history. The local sponsor's quarterly project progress reports to FTA/Volpe Center (see Chapter 5 for recommended content and organization) will be a useful source of information on the project's operational evaluation. However, the contractor is encouraged to obtain a more detailed account of progress/problems relative to implementing and operating the APTS test by talking with the local sponsor at the site.

In addition to keeping abreast of project operations, the contractor should be continually watching at the site for unexpected (external) events which might affect the validity of project results.
In any implemented operational test, no matter how well controlled or planned, the possibility remains for unexpected events to occur that may have an impact on measures of the project