Pellkofer et al. 2003, Abstract: For robust and secure behavior in natural environments, an autonomous vehicle needs an elaborate vision sensor as its main source of information. The vision sensor must be adaptable to the external situation, the mission, the capabilities of the vehicle and the knowledge about the external world accumulated up to the present time. In the EMS-Vision system, this vision sensor consists of four cameras with different focal lengths mounted on a highly dynamic pan-tilt camera head. Image processing, gaze control and behavior decision interact with each other in a closed loop. The image processing experts specify so-called regions of attention (RoAs) for each object in 3D object coordinates. These RoAs should be visible with a resolution as required by the measurement techniques applied. The behavior decision module specifies the relevance of obstacles like road segments, crossings or landmarks in the situation context. The gaze control unit takes all this information in order to plan, optimize and perform a sequence of smooth pursuits, interrupted by saccades. The sequence with the best information gain is performed. The information gain depends on the relevance of objects or object parts, the duration of smooth pursuit maneuvers, the quality of perception and the number of saccades. The functioning of the EMS-Vision system is demonstrated in a complex and scalable autonomous mission with the UBM test vehicle VAMORS.

Technical vision systems for practically useful real-time guidance of ground vehicles are in their second decade of development. Early quasi-static image evaluation systems for this purpose date back several years further. In 1986, the use of CCD cameras with digital microprocessor systems onboard standard-sized test vehicles for road driving started both in the USA and in Europe. While the DARPA-funded US approach aimed at slow cross-country driving, in Europe driving on well-structured roads at high speeds was favored right from the beginning. The project ‘Prometheus’ in the European ‘EUREKA’ framework advanced vision for road vehicles considerably between 1987 and 1994. Most researchers and developers selected a single camera fixed to the car body, for simplicity (see the proceedings of the Symposium on ‘Intelligent Vehicles’ since 1992). At UBM, a combination of two cameras with different focal lengths, mounted fixed relative to each other on a pointing platform, was selected rather early; it combines a larger field of view nearby with good resolution in a smaller area further away. This approach produced good results, though not fully satisfactory ones in the long run. This has led to the development, since 1996, of a third-generation dynamic vision system called ‘Expectation-based, Multi-focal, Saccadic’ (EMS) Vision. Results with this new vision system are described in this paper.
   The paper is organized as follows: Section 2 motivates the interaction mechanism developed between perception and gaze control. Section 3 describes the sensor concept of our vision system. Section 4 gives a short introduction to the EMS-Vision system. The components of the gaze control system follow in Section 5. Sections 6 and 7 explain how the different perception experts, especially those for road recognition, specify their gaze requirements using regions of attention. Section 8 describes the algorithm for optimizing the viewing direction, which runs in the gaze control unit. Section 9 presents experimental results of an autonomous turn-off maneuver. Section 10 concludes the contribution.