Wednesday, March 20, 2013

Fitting a rotation matrix to corresponding points

When tracking a rigid body by following a bunch of points on the body, we need to be able to find the rotation matrix and translation vector that tell us the new orientation and position of the body relative to some canonical orientation and position.

Consider a collection of  $n$ points $A = [a_1 \cdots a_n]$ in $\mathbb{R}^m$ (where, it's likely that $m=3$).  If we assume that $A$ has been rotated by $R$, translated by $t$, and corrupted by Gaussian noise to produce a corresponding collection of points $B = [b_1 \cdots b_n]$ s.t. $b_i=Ra_i+t+w$ where $w\sim \mathcal{N}_m(0,cI)$.  If we want to recover $R$ and $t$, we can use Umeyama's paper to approximately do so.  For those not interested in the proofs, the key equations are numbers 34-42.

It boils down to this:
  1. Find the mean and variance around the mean of both point sets (eqs. 34-37).
  2. Find the covariance of the mean-centered point sets (eq. 38).
  3. Take the singular value decomposition of the the covariance.
  4. Replace the diagonal matrix of singular values with an identity matrix, negating the last value if the determinant of the covariance matrix is less-than zero (eq. 39).
  5. Multiply the matrix back out and you have the rotation matrix, $R$ (eq. 40).
  6. Rotate the mean of the first set and scale by the trace of the singular value matrix (with the same negation as in step 4).  Subtract this from the mean of the second set and you have the translation $t$ (eqs. 41-42). 
Step 4 will probably not be necessary in most cases for the purpose of tracking  a rigid body with motion capture markers. It is intended to prevent the algorithm from finding a mirrored copy of the rotation matrix in cases where the noise is particularly severe.  The scaling in step 6 should also be small if we're dealing with an actual rigid body; so we might optionally skip that part as well.

In pretty equations:

$\bar{a} =  \frac{1}{n}\sum_{i=1}^n a_i $
$\sigma_a^2 = \frac{1}{n}\sum_{i=1}^n \left\| a_i - \bar{a}\right\|^2$  note that this is not the covariance of the points around the mean, but rather the variance of the scalar distance from the mean.
$\Sigma_{ab} = \frac{1}{n}\sum_{i=1}^n (b_i - \bar{b})(a_i - \bar{a})^\text{T}$

$UDV^\text{T} = \text{SVD}(\Sigma_{ab})$
$S = I$ --- If $\text{det}(\Sigma_{ab})<0$ then $S_{mm}=-1$.
$c = \frac{1}{\sigma_a^2}\text{tr}(DS)$

We then find:
$R = USV^\text{T}$
$t = \bar{b} - cR\bar{a}$

In matlab, we have:
m = size(a,1);
n = size(a,2); 
abar = mean(a,2);
bbar = mean(b,2);
aa = a - repmat(abar, 1,n);
bb = b - repmat(bbar,1,n);
va = mean(sum(aa.*aa),2);
sig = bb*aa'*(1.0/n);
[U D V] = svd(sig);
S = eye(m);
if prod(diag(D))<0)
  S(m,m)=-1;
end
R = U*S*V';
c = trace(D*S)/va;
t = bbar - c*R*abar;

This seems to do the trick, and with a little understanding of the singular value decomposition, it's not too unintuitive.  Many thanks to Shinji Umeyama.