Algorithms for Walking, Running, Swimming, Flying, and Manipulation

© Russ Tedrake, 2020

How to cite these notes

**Note:** These are working notes used for a course being taught
at MIT. They will be updated throughout the Spring 2020 semester. Lecture videos are available on YouTube.

Table of contents | Next Chapter |

Robots today move far too conservatively, and accomplish only a fraction of the tasks and achieve a fraction of the performance that they are mechanically capable of. In many cases, we are still fundamentally limited by control technology which matured on rigid robotic arms in structured factory environments. The study of underactuated robotics focuses on building control systems which use the natural dynamics of the machines in an attempt to achieve extraordinary performance in terms of speed, efficiency, or robustness.

Let's start with some examples, and some videos.

The world of robotics changed when, in late 1996, Honda Motor Co. announced that they had been working for nearly 15 years (behind closed doors) on walking robot technology. Their designs have continued to evolve, resulting in a humanoid robot they call ASIMO (Advanced Step in Innovative MObility). For nearly 20 years, Honda's robots were widely considered to represent the state of the art in walking robots, although there are now many robots with designs and performance very similar to ASIMO's. We will dedicate effort to understanding a few of the details of ASIMO when we discuss algorithms for walking... for now I just want you to become familiar with the look and feel of ASIMO's movements [watch the asimo video below now].

I hope that your first reaction is to be incredibly impressed with the
quality and versatility of ASIMO's movements. Now take a second look.
Although the motions are very smooth, there is something a little
unnatural about ASIMO's gait. It feels a little like an astronaut
encumbered by a heavy space suit. In fact this is a reasonable analogy...
ASIMO is walking a bit like somebody that is unfamiliar with his/her
dynamics. Its control system is using high-gain feedback, and therefore
considerable joint torque, to cancel out the natural dynamics of the
machine and strictly follow a desired trajectory. This control approach
comes with a stiff penalty. ASIMO uses roughly 20 times the energy
(scaled) that a human uses to walk on the flat (measured by cost of
transport)

For contrast, let's now consider a very different type of walking
robot, called a passive dynamic walker (PDW). This "robot" has no motors,
no controllers, no computer, but is still capable of walking stably down a
small ramp, powered only by gravity [watch videos above now]. Most people
will agree that the passive gait of this machine is more natural than
ASIMO's; it is certainly more efficient. Passive walking machines have a
long history - there are patents for passively walking toys dating back to
the mid 1800's. We will discuss, in detail, what people know about the
dynamics of these machines and what has been accomplished experimentally.
This most impressive passive dynamic walker to date was built by Steve
Collins in Andy Ruina's lab at Cornell

Passive walkers demonstrate that the high-gain, dynamics-cancelling feedback approach taken on ASIMO is not a necessary one. In fact, the dynamics of walking is beautiful, and should be exploited - not cancelled out.

The world is just starting to see what this vision could look like. This video from Boston Dynamics is one of my favorites of all time:

This result is a marvel of engineering (the mechanical design alone is amazing...). In this class, we'll teach you the computational tools to required to make robots perform this way. We'll also try to reason about how robust these types of maneuvers are and can be. Don't worry, if you do not have a super lightweight, super capable, and super durable humanoid, then a simulation will be provided for you.

The story is surprisingly similar in a very different type of machine. Modern airplanes are extremely effective for steady-level flight in still air. Propellers produce thrust very efficiently, and today's cambered airfoils are highly optimized for speed and/or efficiency. It would be easy to convince yourself that we have nothing left to learn from birds. But, like ASIMO, these machines are mostly confined to a very conservative, low angle-of-attack flight regime where the aerodynamics on the wing are well understood. Birds routinely execute maneuvers outside of this flight envelope (for instance, when they are landing on a perch), and are considerably more effective than our best aircraft at exploiting energy (eg, wind) in the air.

As a consequence, birds are extremely efficient flying machines; some
are capable of migrating thousands of kilometers with incredibly small
fuel supplies. The wandering albatross can fly for hours, or even days,
without flapping its wings - these birds exploit the shear layer formed by
the wind over the ocean surface in a technique called dynamic soaring.
Remarkably, the metabolic cost of flying for these birds is
indistinguishable from the baseline metabolic cost

Birds are also incredibly maneuverable. The roll rate of a highly
acrobatic aircraft (e.g, the A-4 Skyhawk) is approximately 720
deg/sec

Although many impressive statistics about avian flight have been
recorded, our understanding is partially limited by experimental
accessibility - it is quite difficult to carefully measure birds (and the
surrounding airflow) during their most impressive maneuvers without
disturbing them. The dynamics of a swimming fish are closely related, and
can be more convenient to study. Dolphins have been known to swim
gracefully through the waves alongside ships moving at 20
knots

Despite a long history of success in industrial applications, and the huge potential for consumer applications, we still don't have robot arms that can perform any meaningful tasks in the home. Admittedly, the perception problem (using sensors to detect/localize objects and understand the scene) for home robotics is incredibly difficult. But even if we were given a perfect perception system, our robots are still a long way from performing basic object manipulation tasks with the dexterity and versatility of a human.

Most robots that perform object manipulation today use a stereotypical pipeline. First, we enumerate a handful of contact locations on the hand (these points, and only these points, are allowed to contact the world). Then, given a localized object in the environment, we plan a collision-free trajectory for the arm that will move the hand into a "pre-grasp" location. At this point the robot closes it's eyes (figuratively) and closes the hand, hoping that the pre-grasp location was good enough that the object will be successfully grasped using e.g. only current feedback in the fingers to know when to stop closing. "Underactuated hands" make this approach more successful, but the entire approach really only works well for enveloping grasps.

The enveloping grasps approach may actually be sufficient for a number
of simple pick-and-place tasks, but it is a very poor representation of
how humans do manipulation. When humans manipulate objects, the contact
interactions with the object and the world are very rich -- we often use
pieces of the environment as fixtures to reduce uncertainty, we commonly
*exploit* slipping behaviors (e.g. for picking things up, or
reorienting it in the hand), and our brains don't throw NaNs if we use the
entire surface of our arms to e.g. manipulate a large object.

By the way, in most cases, if the robots fail to make contact at the anticipated contact times/locations, bad things can happen. The results are hilarious and depressing at the same time. (Let's fix that!)

Classical control techniques for robotics are based on the idea that feedback can be used to override the dynamics of our machines. These examples suggest that to achieve outstanding dynamic performance (efficiency, agility, and robustness) from our robots, we need to understand how to design control systems which take advantage of the dynamics, not cancel them out. That is the topic of this course.

Surprisingly, many formal control ideas do not support the idea of "exploiting" the dynamics. Optimal control formulations (which we will study in depth) allow it in principle, but optimal control of nonlinear systems is still a relatively ad hoc discipline. Sometimes I joke that in order to convince a control theorist to consider the dynamics, you have to do something drastic, like taking away her control authority - remove a motor, or enforce a torque-limit. These issues have created a formal class of systems, the underactuated systems, for which people have begun to more carefully consider the dynamics of their machines in the context of control.

According to Newton, the dynamics of mechanical systems are second order ($F = ma$). Their state is given by a vector of positions, $\bq$ (also known as the configuration vector), and a vector of velocities, $\dot{\bq}$, and (possibly) time. The general form for a second-order control dynamical system is: $$\ddot{\bq} = {\bf f}(\bq,\dot{\bq},\bu,t),$$ where $\bu$ is the control vector.

As we will see, the dynamics for many of the robots that we care about turn out to be affine in commanded torque, so let's consider a slightly constrained form: \begin{equation}\ddot{\bq} = {\bf f}_1(\bq,\dot{\bq},t) + {\bf f}_2(\bq,\dot{\bq},t)\bu \label{eq:f1_plus_f2}.\end{equation} For a control dynamical system described by equation \eqref{eq:f1_plus_f2}, if we have \begin{equation} \textrm{rank}\left[{\bf f}_2 (\bq,\dot{\bq},t)\right] < \dim\left[\bq\right],\label{eq:underactuated_low_rank}\end{equation} then the system is underactuated. Be careful, though -- sometimes we will write equations that look like \eqref{eq:f1_plus_f2} but tack on additional constraints like $|\bu|\le 1$; as we will discuss below, input limits and other constraints can also make a system underactuated.

Notice that whether or not a control system is underactuated may depend
on the state of the system or even on time, although for most systems
(including all of the systems in this book) underactuation is a global
property of the system. We will refer to a system as underactuated if it is
underactuated in *all* states and times. In practice, we often refer
informally to systems as fully actuated as long as they are fully actuated
in *most* states (e.g., a "fully-actuated" system might still have
joint limits or lose rank at a kinematic singularity). Admittedly, this
permits the existence of a gray area, where it might feel awkward to
describe the *system* as either fully- or underactuated (we should
instead only describe its states); even powerful robot arms on factory
floors do have actuator limits, but we can typically design controllers for
them as though they were fully actuated. The primary interest of this text
is in systems for which the underactuation is useful/necessary for
developing a control strategy.

Consider the simple robot manipulator illustrated above. As described in the Appendix, the equations of motion for this system are quite simple to derive, and take the form of the standard "manipulator equations": $${\bf M}(\bq)\ddot\bq + \bC(\bq,\dot\bq)\dot\bq = \btau_g(\bq) + {\bf B}\bu.$$ It is well known that the inertia matrix, ${\bf M}(\bq)$ is (always) uniformly symmetric and positive definite, and is therefore invertible. Putting the system into the form of equation \ref{eq:f1_plus_f2} yields: \begin{align*}\ddot{\bq} =& {\bf M}^{-1}(\bq)\left[ \btau_g(\bq) + \bB\bu - \bC(\bq,\dot\bq)\dot\bq \right].\end{align*} Because ${\bf M}^{-1}(\bq)$ is always full rank, we find that a system described by the manipulator equations is fully-actuated if and only if $\bB$ is full row rank. For this particular example, $\bq = [\theta_1,\theta_2]^T$ and $\bu = [\tau_1,\tau_2]^T$ (motor torques at the joints), and $\bB = {\bf I}_{2 \times 2}$. The system is fully actuated.

I personally learn best when I can experiment and get some physical intuition. The companion software for the course should make it easy for you to see this system in action. I will link to online jupyter notebooks from the main text like this:

Try it out! You'll see how to simulate the double pendulum, and even how to inspect the dynamics symbolically.

Note: You can also run the code on your own machines (see the Appendix for details).

While the basic double pendulum is fully actuated, imagine the somewhat bizarre case that we have a motor to provide torque at the elbow, but no motor at the shoulder. In this case, we have $\bu = \tau_2$, and $\bB(\bq) = [0,1]^T$. This system is clearly underactuated. While it may sound like a contrived example, it turns out that it is almost exactly the dynamics we will use to study as our simplest model of walking later in the class.

The matrix ${\bf f}_2$ in equation \ref{eq:f1_plus_f2} always has dim$[\bq]$ rows, and dim$[\bu]$ columns. Therefore, as in the example, one of the most common cases for underactuation, which trivially implies that ${\bf f}_2$ is not full row rank, is dim$[\bu] < $ dim$[\bq]$. This is the case when a robot has joints with no motors. But this is not the only case. The human body, for instance, has an incredible number of actuators (muscles), and in many cases has multiple muscles per joint; despite having more actuators than position variables, when I jump through the air, there is no combination of muscle inputs that can change the ballistic trajectory of my center of mass (barring aerodynamic effects). My control system is underactuated.

A quick note about notation. When describing the dynamics of rigid-body systems in this class, I will use $\bq$ for configurations (positions), $\dot{\bq}$ for velocities, and use $\bx$ for the full state ($\bx = [\bq^T,\dot{\bq}^T]^T$). There is an important limitation to this convention (3D angular velocity should not be represented as the derivative of 3D pose) described in the Appendix, but it will keep the notes cleaner. Unless otherwise noted, vectors are always treated as column vectors. Vectors and matrices are bold (scalars are not).

Fully-actuated systems are dramatically easier to control than underactuated systems. The key observation is that, for fully-actuated systems with known dynamics (e.g., ${\bf f}_1$ and ${\bf f}_2$ are known for a second-order control-affine system), it is possible to use feedback to effectively change an arbitrary control problem into the problem of controlling a trivial linear system.

When ${\bf f}_2$ is full row rank, it is invertible

Let's say that we would like our simple double pendulum to act like a
simple single pendulum (with damping), whose dynamics are given by:
\begin{align*} \ddot \theta_1 &= -\frac{g}{l}\sin\theta_1 -b\dot\theta_1 \\
\ddot\theta_2 &= 0. \end{align*} This is easily achieved using

Since we are embedding a nonlinear dynamics (not a linear one), we refer to this as "feedback cancellation", or "dynamic inversion". This idea reveals why I say control is easy -- for the special case of a fully-actuated deterministic system with known dynamics. For example, it would have been just as easy for me to invert gravity. Observe that the control derivations here would not have been any more difficult if the robot had 100 joints.

As always, I highly recommend that you take a few minutes to read through the source code.

Fully-actuated systems are feedback equivalent to $\ddot\bq = \bu$,
whereas *underactuated systems are not feedback equivalent to $\ddot\bq =
\bu$*. Therefore, unlike fully-actuated systems, the control designer
has no choice but to reason about the more complex dynamics of the plant in
the control design. When these dynamics are nonlinear, this can dramatically
complicate feedback controller design.

A related concept is feedback linearization. The feedback-cancellation controller in the example above is an example of feedback linearization -- using feedback to convert a nonlinear system into a controllable linear system. Asking whether or not a system is "feedback linearizable" is not the same as asking whether it is underactuated; even a controllable linear system can be underactuated, as we will discuss soon.

Although the dynamic constraints due to missing actuators certainly embody the spirit of this course, many of the systems we care about could be subject to other dynamic constraints as well. For example, the actuators on our machines may only be mechanically capable of producing some limited amount of torque, or there may be a physical obstacle in the free space with which we cannot permit our robot to come into contact with.

In practice it can be useful to separate out constraints which depend only on the input, e.g. $\phi(\bu)\ge0$, such as actuator limits, as they can often be easier to handle than state constraints. An obstacle in the environment might manifest itself as one or more constraints that depend only on position, e.g. $\phi(\bq)\ge0$.

By our generalized definition of underactuation, we can see that input constraints can certainly cause a system to be underactuated. State (only) constraints are more subtle -- in general these actually reduce the dimensionality of the state space, therefore requiring less dimensions of actuation to achieve "full" control, but we only reap the benefits if we are able to perform the control design in the "minimal coordinates" (which is often difficult).

Input and state constraints can complicate control design in similar ways to having an insufficient number of actuators, (i.e., further limiting the set of the feasible trajectories), and often require similar tools to find a control solution.

You might have heard of the term "nonholonomic system" (see e.g.

Contrast the wheeled robot example with a robot on train tracks. The train tracks correspond to a holonomic constraint: the track constraint can be written directly in terms of the configuration $\bq$ of the system, without using the velocity ${\bf \dot{q}}$. Even though the track constraint could also be written as a differential constraint on the velocity, it would be possible to integrate this constraint to obtain a constraint on configuration. The track restrains the possible configurations of the system.

A nonholonomic constraint like the no-side-slip constraint on the wheeled vehicle certainly results in an underactuated system. The converse is not necessarily true—the double pendulum system which is missing an actuator is underactuated but would not typically be called a nonholonomic system. Note that the Lagrangian equations of motion are a constraint of the form \[\bphi(\bq,{\bf \dot{q}},{\bf \ddot{q}},t) = 0,\] so do not qualify as a nonholonomic constraint.

The control of underactuated systems is an open and interesting problem in controls. Although there are a number of special cases where underactuated systems have been controlled, there are relatively few general principles. Now here's the rub... most of the interesting problems in robotics are underactuated:

- Legged robots are underactuated. Consider a legged machine with $N$ internal joints and $N$ actuators. If the robot is not bolted to the ground, then the degrees of freedom of the system include both the internal joints and the six degrees of freedom which define the position and orientation of the robot in space. Since $\bu \in \Re^N$ and $\bq \in \Re^{N+6}$, equation \eqref{eq:underactuated_low_rank} is satisfied.
- (Most) Swimming and flying robots are underactuated. The story is the same here as for legged machines. Each control surface adds one actuator and one DOF. And this is already a simplification, as the true state of the system should really include the (infinite-dimensional) state of the flow.
- Robot manipulation is (often) underactuated. Consider a fully-actuated robotic arm. When this arm is manipulating an object with degrees of freedom (even a brick has six), it can become underactuated. If force closure is achieved, and maintained, then we can think of the system as fully-actuated, because the degrees of freedom of the object are constrained to match the degrees of freedom of the hand. That is, of course, unless the manipulated object has extra DOFs (for example, any object that is deformable).

Even control systems for fully-actuated and otherwise unconstrained systems can be improved using the lessons from underactuated systems, particularly if there is a need to increase the efficiency of their motions or reduce the complexity of their designs.

This course is based on the observation that there are new computational tools from optimization theory, control theory, motion planning, and even machine learning which can be used to design feedback control for underactuated systems. The goal of this class is to develop these tools in order to design robots that are more dynamic and more agile than the current state-of-the-art.

The target audience for the class includes both computer science and mechanical/aero students pursuing research in robotics. Although I assume a comfort with linear algebra, ODEs, and Python, the course notes aim to provide most of the material and references required for the course.

I have a confession: I actually think that the material we'll cover in these notes is valuable far beyond robotics. I think that systems theory provides a powerful language for organizing computation in exceedingly complex systems -- especially when one is trying to program and/or analyze systems with continuous variables in a feedback loop (which happens throughout computer science and engineering, by the way). I hope you find these tools to be broadly useful, even if you don't have a humanoid robot capable of performing a backflip at your immediate disposal.

- The state of the humanoid can be represented by the angles and the angular velocities of all its joints.
- While doing the backflip (state in the left figure), the humanoid is fully actuated.
- While standing (state in the right figure), the humanoid is fully actuated.

- For any twice-differentiable desired trajectory $\bq_{\text{des}}(t)$, is it always possible to find a control signal $\bu(t)$ such that $\bq(t) = \bq_{\text{des}}(t)$ for all $t \geq 0$, provided that $\bq(0) = \bq_{\text{des}}(0)$ and $\dot \bq(0) = \dot \bq_{\text{des}}(0)$?
- Now consider the simplest fully-actuated robot: the double integrator. The dynamics of this system reads $m \ddot q = u$, and you can think of it as a cart of mass $m$ moving on a straight rail, controlled by a force $u$. The figure below depicts its phase portrait when $u=0$. Is it possible to find a control signal $u(t)$ that drives the double integrator from the initial state $\bx(0) = [2, 0.5]^T$ to the origin along a straight line (blue trajectory)? Does the answer change if we set $\bx(0)$ to be $[2, -0.5]^T$ (red trajectory)?
- The dynamics \ref{eq:f1_plus_f2} are $n=\dim[\bq]$
*second-order*differential equations. However, it is always possible (and we'll frequently do it) to describe these equations in terms of $2n$*first-order*differential equations $\dot \bx = f(\bx,t)$. To this end, we simply define $$f(\bx,t) = \begin{bmatrix} \dot\bq \\ {\bf f}_1(\bq,\dot\bq,t) + {\bf f}_2(\bq,\dot\bq,t)\bu \end{bmatrix}.$$ For any twice-differentiable trajectory $\bx_{\text{des}}(t)$, is it always possible to find a control $\bu(t)$ such that $\bx(t) = \bx_{\text{des}}(t)$ for all $t \geq 0$, provided that $\bx(0) = \bx_{\text{des}}(0)$?

- Identify the set of states $\bx = [\bq^T, \dot \bq^T]^T$ in which the system is underactuated.
- For
*all*the states in which the system is underactuated, identify an acceleration $\ddot \bq (\bq, \dot \bq)$ that cannot be instantaneously achieved. Provide a rigorous proof of your claim by using the equations of motion: plug the candidate accelerations in the dynamics, and try to come up with contradictions such as $mg=0$.

Table of contents | Next Chapter |