Dickmanns 1993a: According to the 4D-approach to dynamic machine vision developed at UniBwM over the last decade, active vision is more than just actively controlling the viewing parameters of the camera system. The core of the method is an actively servo-maintained internal representation of the relevant spatio-temporal physical processes, including objects and subjects in the visually observed scene; in the future, this will also encompass mental processes assumed to determine control decisions in subjects. Not only spatial relationships but also temporal coherence and causality are reconstructed from image sequences and represented.
A large fraction of our knowledge about the real world concerns the temporal domain; we learn to understand this during early life, more or less subconsciously, while acquiring the capability of crawling, walking and manipulating other objects under Earth gravity. The temporal sequence of states of objects and their transition characteristics under given circumstances constitute essential knowledge about the real world, providing us with the capability of acting adequately even though this knowledge does not seem to be represented explicitly. This has long been overlooked in Artificial Intelligence, which concentrated its efforts on explicitly represented abstract knowledge about quasi-static relations between objects in the world.
The natural sciences and engineering technology have developed adequate methods for representing these facts about the physical world within the framework of differential equations with time as the monotonically increasing independent variable. As I. Kant elaborated in his ‘Critiques ...’ more than two centuries ago, it has to be kept in mind that space and time are not properties of objects. We cannot help carrying them into the world through our sensing and analysis systems; we ourselves exist in these basic four dimensions. Therefore, it was decided to install these four dimensions in our machine intelligence system right from the beginning in order to be able to deal with the real world efficiently. This was the main contribution of our approach to machine vision; the rest follows almost automatically.
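The two ideas above, a differential-equation model with time as the independent variable and an internal representation kept aligned with observations by feedback, can be illustrated with a small Python sketch that tracks a point moving along one axis. It is not taken from the paper: the constant-velocity model, the fixed-gain (alpha-beta) correction, the gain values, and all names (`State`, `predict`, `correct`, `track`) are assumptions chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical internal representation of one moving object."""
    position: float  # metres
    velocity: float  # metres per second


def predict(state: State, dt: float) -> State:
    """Advance the model dx/dt = v, dv/dt = 0 by one time step dt."""
    return State(state.position + state.velocity * dt, state.velocity)


def correct(predicted: State, measured_position: float, dt: float,
            alpha: float = 0.5, beta: float = 0.2) -> State:
    """Feed the prediction error back into the internal state.

    alpha and beta are illustrative fixed gains (an alpha-beta filter),
    assumed here in place of whatever estimator the original system used.
    """
    residual = measured_position - predicted.position
    return State(predicted.position + alpha * residual,
                 predicted.velocity + (beta / dt) * residual)


def track(measurements, dt: float, initial: State):
    """Run the predict/correct loop over a sequence of position measurements."""
    state = initial
    history = []
    for z in measurements:
        state = correct(predict(state, dt), z, dt)
        history.append(state)
    return history


if __name__ == "__main__":
    dt = 0.1
    # Synthetic, noise-free observations of an object moving at 2 m/s.
    observations = [2.0 * dt * k for k in range(1, 101)]
    final = track(observations, dt, State(0.0, 0.0))[-1]
    print(final)
```

Only position is measured, yet the velocity estimate is recovered because the temporal model links successive states; the internal representation carries information that no single image (here, no single measurement) contains.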