mascado.distortions.polyfit module

Functions for least-squares fits of polynomial vector fields.

Interface follows functional paradigm.

Some functions take upper bounds, for example for outlier counts and condition numbers. The defaults for these bounds are chosen rather restrictively so you are aware of potential problems, when you have to turn them up.

Examples

>>> from mascado.distortions.polynomials import (
...     PolyVectorField, Legendre)
>>> vf = PolyVectorField(Legendre(3))
>>> grid = np.array([[0, 0], [1, 1], [0, 1], [1, 0], [0.5, 0.5]])
>>> vectors = np.array([[1, 1], [0, 1], [0.5, 3], [-1, -1], [100, 3]])
>>> params, inliers, resvar = polyfit_svd_iterative(
...     vf, grid, vectors, sigmas=0.5, maxoutliers=1, iterinfo='  ')
  5 2D data points, 6 parameters
  0 outliers before first iteration, 1 max
    1. iteration: resvar 7.98e+03, outliers: 1 new, 0 old
    2. iteration: resvar 1.12, outliers: 0 new, 1 old
  Iterations: 2, outliers: 1
  Residual variance: 1.12
>>> vf.set_params(params)
mascado.distortions.polyfit.polyfit_svd(vf, positions, vectors, sigmas=None, maxcondition=100.0, info=' ')[source]

Fit vector field to data using SVD.

Does not use or alter internal parameters of vector field.

Parameters:
  • vf (mascado.distortions.polynomials.PolyVectorField) – Model of vector field providing the Vandermonde matrix.
  • positions ((N,2)-shaped array) – x,y positions of data points.
  • vectors ((N,2)-shaped array) – x,y values at data points.
  • sigmas (float or (N, 1) array or (N, 2) array or None) – Uncertainties for vectors. If one-dimensional, the same weight is used for x and y component.
  • maxcondition (float) – Function raises a RuntimeError if the condition value of the weighted Vandermonde matrix is larger than this argument.
  • info (whitespace or False) – Indent for info printed to console. If False, nothing is printed.
Returns:

  • params (array) – Parameter list for vector field.
  • residuals ((N,2)-shaped array) – Residual vectors.
  • resvar (float) – Residual variance, aka. reduced chi squared. NaN if the problem is not over-determined.

Raises:

RuntimeError – If the weighted Vandermonde matrix is conditioned badly.

Notes

Formulate least-squares problem as matrix equation using Vandermonde matrices \(V_x,V_y\) of polynomials in x and y direction at given data points. (Polynomials are linear in their coefficients.) \(q\) are the vector components and \(a\) the polynomial coefficients:

\[q \approx X \cdot a\]
\[\begin{split}q =& [q_{1x}, ..., q_{Nx}, q_{1y}, ..., q_{Ny}]^T \\ a =& [a_{0x}, ..., a_{mx}, a_{0y}, ..., a_{my}]^T \\ X =& \begin{bmatrix} V_x & \mathbf{0} \\ \mathbf{0} & V_y \end{bmatrix} \,.\end{split}\]

Minimize the sum of squared residuals

\[\lVert r \rVert_2^2 = \lVert q - X \cdot a \rVert_2^2\]

using the pseudo inverse of \(W \cdot X\), obtained by SVD:

\[a = X^+ \cdot q \,.\]

Check this fine StackExchange answer for details.

In the weighted case, this becomes

\[W \cdot q \approx W \cdot X \cdot a\]
\[\lVert r \rVert_2^2 = \lVert W \cdot q - W \cdot X \cdot a \rVert_2^2\]
\[a = (W \cdot X)^+ \cdot W \cdot q \,.\]

\(W\) is the weighting matrix

\[W = \operatorname{diag}\left( \frac{1}{\sigma_1}, ..., \frac{1}{\sigma_N}\right)\,.\]

The values are not squared, because the matrix is applied inside the L2-norm.

mascado.distortions.polyfit.polyfit_svd_iterative(vf, positions, vectors, sigmas, maxoutliers, sigmacut=5, initialinliers=None, maxcondition=100.0, info=' ', iterinfo=False)[source]

Fit vector field with iterative outlier rejection.

Outlier rejection is realized through sigma clipping. Because the outliers are introducing systematic errors (“bumps”) in the preliminary solutions, good data points might be rejected, if we would just cut all residuals larger than sigmacut times sigmas. Therefore the sigmacut is scaled with the square root of the residual variance, so, initially, the applied sigmacut is larger than given. As the residual variance converges towards 1, if all outliers are found, the threshold converges to the given sigmacut.

If no sigmas are passed, the root mean square over all residuals is used instead (not scaled by the residual variance, because that’s already in the RMS of residuals).

The set of outliers increases with every iteration, until no more outliers are found and the algorithm halts. Outliers from previous iterations are not reconsidered in subsequent iterations, because that might cause an infinite loop.

Parameters:
  • vf (mascado.distortions.polynomials.PolyVectorField) – 2D vector field. Internal parameters are neither used nor changed.
  • positions ((N, 2)-shaped array)
  • vectors ((N, 2)-shaped array)
  • sigmas (float or { (N,), (N, 2) }-shaped array or None)
  • maxoutliers (int) – Function raises RuntimeError if the number of outliers increases above this number.
  • sigmacut (float) – Multiplier for standard deviation specifying the upper bound for inliers.
  • initialinliers ((N,)-shaped bool array) – Data points, where this array is False, are not used in any iteration.
  • maxcondition (float) – Passed on to polyfit_svd().
  • info (whitespace or False) – Indent for info printed to console. If False, nothing is printed.
  • iterinfo (whitespace or False) – Additional indent for info printed in every iteration. Nothing is printed by default.
Returns:

  • params (float array) – Parameter list for vector field.
  • inliers ((N,)-shaped bool array) – Mask for inliers.
  • residuals ((inlier count, 2)-shaped array) – Residual vectors of inliers.
  • resvar (float) – Residual variance of inliers.

Raises:

RuntimeError – If the maximum outlier count is exceeded.