Observation and Autonomy

In this series of articles, we look at the history and challenges of autonomous robotics, with the aim of identifying what is still missing for autonomy to be deployed at scale.

Part 1:

The beginning of large-scale autonomy 

From a functional point of view, autonomous robotics rests on two capabilities: observation, and the action that follows from it. Every robot has both: the first allows it to perceive its environment and the second allows it to act on that environment. The latest industrial cobots and the latest humanoid robots from Boston Dynamics embody the expertise developed on the action side. Historically, the first automata also emerged from this focus on action: their operation relied on a simple binary signal from a sensor rather than on any form of intelligence acquired through observation.

If we consider robot automation as the ability to perform tasks without human supervision, then these automatic robots (automata or automatons) already existed before the first industrial revolution. However, autonomous robots emerged much later.

What is the difference between automatic and autonomous robots? 

An automatic robot performs pre-programmed actions in response to a manual trigger or a basic sensor.

An autonomous robot can be defined as a machine that is given a goal but is programmed only to observe and learn from its environment. The robot then decides by itself which intermediate actions to take to reach that goal.
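
To make the distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical and chosen for illustration: the function names, the action names and the toy grid map (where "#" marks an obstacle) do not come from any real robot. The automatic robot replays a fixed sequence when a binary sensor fires; the autonomous robot is only given a goal and works out the intermediate steps from the map it observes, here with a simple breadth-first search standing in for real decision-making.

```python
from collections import deque

Cell = tuple[int, int]  # (row, column) on the toy grid


def automatic_robot(sensor_triggered: bool) -> list[str]:
    """Automaton: replay a fixed, pre-programmed sequence when a binary sensor fires."""
    if sensor_triggered:
        return ["close_gripper", "lift", "rotate", "release"]
    return []


def autonomous_robot(observed_map: list[str], start: Cell, goal: Cell) -> list[Cell]:
    """Autonomous robot: only the goal is given; the intermediate steps are
    derived from the observed map with a breadth-first search."""
    rows, cols = len(observed_map), len(observed_map[0])
    parents = {start: None}  # maps each visited cell to the cell it was reached from
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the parents to rebuild the path from start to goal.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and observed_map[nr][nc] != "#" and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return []  # no route to the goal in the observed map
```

If the map changes, the autonomous robot finds a different route without being reprogrammed, while the automaton keeps replaying the same four actions whatever stands in its way.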

At the beginning of the 21st century, a new paradigm emerged in the robotics sector and redefined the limits of autonomy. Technological advances were driven by events that encouraged research, particularly on autonomous cars. After a first edition in 2004, the Defense Advanced Research Projects Agency (DARPA) held the second DARPA Grand Challenge in 2005, bringing together teams to explore what autonomous cars could do. The challenge, which consisted of driving a car from point A to point B without a driver, was won by Stanley, Stanford University's entry and one of the first autonomous cars. It was equipped with multiple sensors (lidars and cameras for perception, GPS for localization) combined with Machine Learning (ML) to understand its environment. After this convincing demonstration, lidar and ML became the ingredients of the magic recipe for making cars more autonomous.

Endowing a robot with a high level of autonomy relies on a specific process: perceiving, localizing, planning, controlling and acting. The main limit on autonomous systems today lies in the first step, perception. The sensors mentioned above are generally used together to provide a global perception of the scene (depth, movement, semantics), and the favorite remains the lidar.

Integrating a lidar requires software frameworks that interpret its raw data in order to understand the world: where the obstacles are and what they are. It is this combination that moves the sensor from observation to interpretation.
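
The perceive, localize, plan, control and act process described above can be sketched as a loop. The following Python example is a deliberately simplified toy, not the architecture of any real vehicle: the robot moves along a single axis, the range sensor is simulated and noise-free, localization is plain dead reckoning, and all names and parameter values (World, run, the 1 m safety distance, the controller gain) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class World:
    robot_x: float     # true robot position in metres (not read directly by the robot)
    obstacle_x: float  # position of a static obstacle ahead of the robot, in metres


def perceive(world: World) -> float:
    """Simulated range sensor: distance from the robot to the obstacle ahead."""
    return world.obstacle_x - world.robot_x


def localize(estimate: float, velocity: float, dt: float) -> float:
    """Dead reckoning: integrate the last commanded velocity."""
    return estimate + velocity * dt


def plan(estimate: float, goal_x: float, obstacle_range: float, safety: float = 1.0) -> float:
    """Aim for the goal, but never closer than `safety` metres to the obstacle."""
    return min(goal_x, estimate + obstacle_range - safety)


def control(estimate: float, target_x: float, gain: float = 0.8, v_max: float = 1.0) -> float:
    """Proportional controller with a speed limit."""
    velocity = gain * (target_x - estimate)
    return max(-v_max, min(v_max, velocity))


def act(world: World, velocity: float, dt: float) -> None:
    """Apply the velocity command to the simulated robot."""
    world.robot_x += velocity * dt


def run(world: World, goal_x: float, steps: int = 200, dt: float = 0.1) -> float:
    """Run the loop for a fixed number of steps and return the final true position."""
    estimate, velocity = world.robot_x, 0.0
    for _ in range(steps):
        obstacle_range = perceive(world)
        estimate = localize(estimate, velocity, dt)
        target_x = plan(estimate, goal_x, obstacle_range)
        velocity = control(estimate, target_x)
        act(world, velocity, dt)
    return world.robot_x
```

With an obstacle at 5 m and a goal at 10 m, run(World(0.0, 5.0), 10.0) brings the robot to rest about 1 m short of the obstacle instead of reaching the goal. Every later stage depends on what perceive returns: if the sensor misses the obstacle, the planner has no reason to stop.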

Levels of autonomy:

Following the DARPA Grand Challenge and the emergence of the “magic recipe”, various level-3 prototypes (conditional automation on the SAE scale, which runs from 0 to 5) have been built. However, today’s systems seem to have reached a ceiling at this level and are struggling to get past it. In fact, the ever heavier use of sensors, especially lidars, is proving to be a limited answer to a major and persistent difficulty: the handling of rare cases.

The 2018 accident that killed Elaine Herzberg was the first recorded pedestrian fatality involving an autonomous car, and it had major implications for the autonomous robotics industry. In this case, the vision system developed by Uber at the time failed to correctly recognize and identify the obstacle in its path, a pedestrian crossing the road. Although very rare, this type of accident reflects the weakness of the observations on which computer vision algorithms are based.

To compensate for this difficulty of perception in rare cases, current systems tend to add more sensors in order to collect near-exhaustive data on the robot’s environment. This creates heavy, complex and expensive systems. Perhaps this is where the major problem of observation lies: wouldn’t it be better to rethink the vision system and the use of sensors, and put intelligence back at the heart of autonomous systems?

The real industrial revolution will take place when we move from automata to autonomous systems in unstructured and uncontrolled environments.