# Underactuated Robotics

Algorithms for Walking, Running, Swimming, Flying, and Manipulation

Russ Tedrake

How to cite these notes   |   Send me your feedback

Note: These are working notes used for a course being taught at MIT. They will be updated throughout the Spring 2020 semester. Lecture videos are available on YouTube.

# Planning and Control through Contact

So far we have developed a fairly strong toolbox for planning and control with "smooth" systems -- systems where the equations of motion are described by a function $\dot{\bx} = f(\bx,\bu)$ which is smooth everywhere. But our discussion of the simple models of legged robots illustrated that the dynamics of making and breaking contact with the world are more complex -- these are often modeled as hybrid dynamics with impact discontinuities at the collision event and constrained dynamics during contact (with either soft or hard constraints).

My goal for this chapter is to extend our computational tools into this richer class of models. Many of our core tools still work: trajectory optimization, Lyapunov analysis (e.g. with sums-of-squares), and LQR all have natural equivalents.

Let's start with a warm-up exercise: trajectory optimization for the rimless wheel. We already have basically everything that we need for this, and it will form a nice basis for generalizing our approach throughout the chapter.

# Trajectory optmization for the rimless wheel

The rimless wheel was our simplest example of a passive-dynamic walker; it has no control inputs but exhibits a passively stable rolling fixed point. We've also already seen that trajectory optimization can be used as a tool for finding limit cycles of a smooth passive system, e.g. by formulating a direct collocation problem: \begin{align*} \find_{\bx[\cdot],h} \quad \subjto \quad & \text{collocation constraints}(\bx[n], \bx[n+1], h), \quad \forall n \in [0, N-1] \\ & \bx[0] = \bx[N], \\ & h_{min} \le h \le h_{max},\end{align*} where $h$ was the timestep between the trajectory break points.

It turns out that applying this to the rimless wheel is quite straight forward. We still want to find a periodic trajectory, but now have to take into account the collision event. We can do this by modifying the periodicity condition. Let's force the initial state to be just after the collision, and the final state to be just before the collision, and make sure they are related to each other via the collision equation: \begin{align*} \find_{\bx[\cdot],h} \quad \subjto \quad & \text{collocation constraints}(\bx[n], \bx[n+1], h), \quad \forall n \in [0, N-1] \\ & \theta[0] = \gamma - \alpha, \\ & \theta[N] = \gamma + \alpha, \\ & \dot\theta[0] = \dot\theta[N] \cos(2\alpha)\\ & h_{min} \le h \le h_{max}. \end{align*} Although it is likely not needed for this simple example (since the dynamics are sufficiently limited), for completeness one should also add constraints to ensure that none of the intermediate points are in contact, $$\gamma - \alpha \le \theta[n] < \gamma + \alpha, \quad \forall n \in [1,N-1].$$

The result is a simple and clean numerical algorithm for finding the rolling limit cycle solution of the rimless wheel. Please take it for a spin:

examples/contact.ipynb

The specific case of the rimless wheel is quite clean. But before we apply it to the compass gait, the kneed compass gait, the spring-loaded inverted pendulum, etc, then we should stop and figure out a more general form.

# (Autonomous) Hybrid Systems

Recall how we modeled the dynamics of the simple legged robots. First, we derived the equations of motion (independently) for each possible contact configuration -- for example, in the spring-loaded inverted pendulum (SLIP) model we had one set of equations governing the $(x,y)$ position of the mass during the flight phase, and a completely separate set of equations written in polar coordinates, $(r,\theta)$, describing the stance phase. Then we did a little additional work to describe the transitions between these models -- e.g., in SLIP we transitioned from flight to stance when the foot first touches the ground. When simulating this model, it means that we have a discrete "event" which occurs at the moment of foot collision, and an immediate discontinuous change to the state of the robot (in this case we even change out the state variables).

The language of hybrid systems gives us a rich language for describing systems of this form, and a suite of tools for analyzing and controlling them. The term "hybrid systems" is a bit overloaded, here we use "hybrid" to mean both discrete- and continuous-time, and the particular systems we consider here are sometimes called autonomous hybrid systems because the internal dynamics can cause the discrete changes without any exogeneous input††This is in contrast to, for instance, the model of a power-train where a change in gears comes as an external input.. In the hybrid systems formulation, we describe a system by a set of modes each described by (ideally smooth) continuous dynamics, a set of guards which here are continuous functions whos zero-level set describes the conditions which trigger an event, and a set of resets which describe the discrete update to the state that is triggered by the guard. Each guard is associated with a particular mode, and we can have multiple guards per mode. Every guard has at most one reset. You will occasionally here guards referred to as "witness functions", since they play that role in simulation, and resets are sometimes referred to as "transition functions".

The imagery that I like to keep in my head for hybrid systems is illustrated below for a simple example of a robot's heel striking the ground. A solution trajectory of the hybrid system has a continuous trajectory inside each mode, punctuated by discrete updates when the trajectory hits the zero-level set of the guard (here the distance between the heel and the ground becomes zero), with the reset describing the discrete change in the state variables.

For this robot foot, we can decompose the dynamics into distinct modes: (1) foot in the air, (2) only heel on the ground, (3) heel and toe on the ground, (4) only toe on the ground (push-off). More generally, we will write the dynamics of mode $i$ as ${\bf f}_i$, the guard which signals the transition mode $i$ to mode $j$ as ${\bf \phi}_{i,j}$ (where $\phi_{i,j}(\bx_i) > 0$ inside mode $i$), and the reset map from $i$ to $j$ as ${\bf \Delta}_{i,j}$, as illustrated in the following figure:

# Hybrid trajectory optimization

Using the more general language of modes, guards, and resets, we can begin to formulate the "hybrid trajectory optimization" problem. In hybrid trajectory optimization, there is a major distinction between trajectory optimization where the mode sequence is known apriori and the optimization is just attempting to solve for the continuous-time trajectories, vs one in which we must also discover the mode sequence.

For the case when the mode sequence is fixed, then hybrid trajectory optimization is as simple as stitching together multiple individual mathematical programs into a single mathematical program, with the boundary conditions constrained to enforce the guard/reset constraints. For a simple hybrid system with a given a mode sequence and using the shorthand $\bx_k$, etc, for the state in mode in the $k$th segment of the sequence, we can write: \begin{align*} \find_{\bx_k[\cdot],h_k} \quad \subjto \quad & \bx_0[0] = \bx_0, \\ \forall k \quad & \phi_{k,k+1}(\bx_k[N_k]) = 0, \\ & \bx_{k+1}[0] = {\bf \Delta}_{i,j}(\bx_k[N_k]), \\ & \phi_{k,k'}(\bx_k[n_k]) > 0, \quad \forall n_k \in [0, N_k], \forall k' \\ & h_{min} \le h_k \le h_{max}, \\ & \text{collocation constraints}_k(\bx_k[n_k], \bx_k[n_k+1], h_k), \quad \forall n_k \in [0, N_k-1]. \end{align*} It is then natural to add control inputs (as additional decision variables), and to add an objective and any more constraints.

As a simple example of this hybrid trajectory optimization, I thought it would be fun to see if we can formulate the search for initial conditions that optimizes a basketball "trick shot". A quick search turned up this video for inspiration.

Let's start simpler -- with just a "bounce pass". We can capture the dynamics of a bouncing ball (in the plane, ignoring spin) with some very simple dynamics: $$\bq = \begin{bmatrix}x \\ z\end{bmatrix}, \qquad \ddot{\bq} = \begin{bmatrix} 0 \\ -g \end{bmatrix}.$$ During any time interval without contact of duration $h$, we can actually integrate these dynamics perfectly: $$\bx(t+h) = \begin{bmatrix} x(t) + h\dot{x}(t) \\ z(t) + h \dot{z}(t) - \frac{1}{2}gh^2 \\ \dot{x}(t) \\ \dot{z}(t) - hg \end{bmatrix}.$$ With the bounce pass, we just consider collisions with the ground, so we have a guard, ${\bf \phi}(\bx) = z,$ which triggers when $z=0$, and a reset map which assumes an elastic collision with coefficient of restitution $e$: $$\bx^+ = {\bf \Delta(\bx^-)} = \begin{bmatrix} x^- & z^- & \dot{x}^- & - e \dot{z}^- \end{bmatrix}^T.$$

We'll formulate the problem as this: given an initial ball position $(x = 0, z = 1)$, a final ball position 4m away $(x=4, z=1)$, find the initial velocity to achieve that goal in 5 seconds. Clearly, this implies that $\dot{x}(0) = 4/5.$ The interesting question is -- what should we do with $\dot{z}(0)$? There are multiple solutions -- which involve bouncing a different number of times. We can find them all with a simple hybrid trajectory optimization, using the observation that there are two possible solutions for each number of bounces -- one that starts with a positive $\dot{z}(0)$ and one with a negative $\dot{z}(0)$.

examples/contact.ipynb

Now let's try our trick shot. I'll move our goal to $x_f = -1m, z_f = 3m,$ and introduce a vertical wall at $x=0$, and move our initial conditions back to $x_0=-.25m.$ The collision dynamics, which now must take into account the spin of the ball, are in the appendix. The first bounce is against the wall, the second is against the floor. I'll also constrain the final velocity to be down (have to approach the hoop from above). Try it out.

examples/contact.ipynb

In this example, we could integrate the dynamics in each segment analytically. That is the exception, not the rule. But you can take the same steps with a little more code to use, e.g. direct transcription or collocation with multiple break points in each segment.

# Deriving hybrid models: minimal vs floating-base coordinates

There is some work to do in order to derive the equations of motion in this form. Do you remember how we did it for the rimless wheel and compass gait examples? In both cases we assumed that exactly one foot was attached to the ground and that it would not slip, this allowed us to write the Lagrangian as if there was a pin joint attaching the foot to the ground to obtain the equations of motion. For the SLIP model, we derived the flight phase and stance phase using separate Lagrangian equations each with different state representations. I would describe this as the minimal coordinates modeling approach -- it is elegant and has some important computational advantages that we will come to appreciate in the algorithms below. But it's a lot of work! For instance, if we also wanted to consider friction in the foot contact of the rimless wheel, we would have to derive yet another set of equations to describe the sliding mode (adding, for instance, a prismatic joint that moved the foot along the ramp), plus the guards which compute the contact force for a given state and the distance from the boundary of the friction cone, and on and on.

Fortunately, there is an alternative modeling approach for deriving the modes, guards, and resets for contact that is more general (though admittedly also more complex). We can instead model the robot in the floating-base coordinates -- we add a fictitious six degree-of-freedom "floating-base" joint connecting some part of the robot to the world (in planar models, we use just three degrees-of-freedom, e.g. $(x,z,\theta)$). We can derive the equations of motion for the floating-base robot once, without considering contact, then add the additional constraints that come from being in contact as contact forces which get applied to the bodies. The resulting manipulator equations take the form $$\bM({\bq})\ddot{\bq} + \bC(\bq,\dot{\bq})\dot\bq = \btau_g(\bq) + \bB\bu + \sum_i \bJ_i^T(\bq) \blambda_i,$$ where $\blambda_i$ are the constraint forces and $\bJ_i$ are the constraint Jacobians. Conveniently, if the guard function in our contact equations is the signed distance from contact, $\phi_i(\bq)$, then this Jacobian is simply $\bJ_i(\bq) = \pd{\phi_i}{\bq}$. I've written the basic derivations for the common cases (position constraints, velocity constraints, impact equations, etc) in the appendix. What is important to understand here is that this is an alternative formulation for the equations governing the modes, guards, and resets, but that is it no longer a minimal coordinate system -- the equations of motion are written in $2N$ state variables but the system might actually be constrained to evolve only along a lower dimensional manifold (if we write the rimless wheel equations with three configuration variables for the floating base, it still only rotates around the toe when it is in stance and is inside the friction cone). This will have implications for our algorithms.

Grandia 2019 - Feedback MPC for Torque-controlled Legged Robots looks at a relaxation of the friction cone via a barrier certificate for iLQR. "In particular, the friction cone is implemented through a perturbed second-order cone constraint. This for- mulation adds a convex penalty to the cost function and avoids numerical ill-conditioning at the origin of the cone." Could go in the playbook, if not here.