Numpy/Scipy : solving several least squares with the same design matrix

I have a least-squares problem that I solve via scipy.linalg.lstsq(M, b), where:

  • M has shape (n,n)
  • b has shape (n,)

The issue is that I have to solve it many times for different b's. How can I do this more efficiently? I guess that lstsq does a lot of work that is independent of the value of b.

Ideas?

If your linear system is well-determined, I'd store M's LU decomposition and reuse it for all the b's, either individually or with a single solve call on a 2-D array B whose columns are the stacked b's; it really depends on your problem, but it's the same idea either way. Let's suppose you get each b one at a time:

import numpy as np
from scipy.linalg import lstsq, lu_factor, lu_solve, svd, pinv

# as you didn't specify any practical dimensions
n = 100
# number of b's
nb_b = 10

# generate a random n-by-n matrix M
M = np.random.rand(n, n)

# set of nb_b right-hand-side vectors b, stacked as columns
B = np.random.rand(n, nb_b)

# compute pivoted LU decomposition of M
M_LU = lu_factor(M)
# then solve for each b
X_LU = np.asarray([lu_solve(M_LU,B[:,i]) for i in range(nb_b)])
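Note that (as far as I can tell from the SciPy docs) lu_solve also accepts a 2-D right-hand side, so if all the b's are known up front you can drop the Python loop entirely; a minimal sketch, where the solutions come back as columns rather than rows:

# one call for all right-hand sides; X_LU_all[:,i] solves M x = B[:,i]
X_LU_all = lu_solve(M_LU, B)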

but if it is under- or over-determined, you need to use lstsq as you did:

X_lstsq = np.asarray([lstsq(M,B[:,i])[0] for i in range(nb_b)])
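scipy.linalg.lstsq likewise accepts a matrix of stacked right-hand sides, so the loop can be collapsed into one call when the b's are available in advance; a small sketch:

# one lstsq call for all columns of B; solutions are the columns of X_lstsq_all
X_lstsq_all = lstsq(M, B)[0]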

or simply store the pseudo-inverse M_pinv, computed with pinv (built on lstsq) or pinv2 (built on SVD):

# compute the pseudo-inverse of M
M_pinv = pinv(M)
X_pinv = np.asarray([np.dot(M_pinv,B[:,i]) for i in range(nb_b)])
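Since applying the pseudo-inverse is just a matrix product, this loop too reduces to a single product over all columns of B:

# apply M_pinv to every right-hand side at once
X_pinv_all = np.dot(M_pinv, B)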

or you can do the work yourself, as pinv2 does for instance: just store the SVD of M and solve manually:

# compute svd of M
U,s,Vh = svd(M)

def solve_svd(U,s,Vh,b):
    # U diag(s) Vh x = b <=> diag(s) Vh x = U.T b = c
    c = np.dot(U.T,b)
    # diag(s) Vh x = c <=> Vh x = diag(1/s) c = w (trivial inversion of a diagonal matrix)
    w = np.dot(np.diag(1/s),c)
    # Vh x = w <=> x = Vh.H w (where .H stands for hermitian = conjugate transpose)
    x = np.dot(Vh.conj().T,w)
    return x

X_svd = np.asarray([solve_svd(U,s,Vh,B[:,i]) for i in range(nb_b)])
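The same SVD solve can also be vectorized over all b's with broadcasting, avoiding both the loop and the explicit diag(1/s) matrix; a sketch assuming M is real (use U.conj().T otherwise):

# C = U.T B, scale row i by 1/s[i], then map back through Vh
C = np.dot(U.T, B)
X_svd_all = np.dot(Vh.conj().T, C / s[:, np.newaxis])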

All of these give the same result when checked with np.allclose (unless the system is not well-determined, in which case the direct LU approach fails).
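For instance, a quick consistency check on the loop-based variants (which all stack the solutions as rows of shape (nb_b, n)):

# expected to print True three times for a well-conditioned random M
print(np.allclose(X_LU, X_lstsq))
print(np.allclose(X_LU, X_pinv))
print(np.allclose(X_LU, X_svd))

Finally, in terms of performance: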

%timeit M_LU = lu_factor(M); X_LU = np.asarray([lu_solve(M_LU,B[:,i]) for i in range(nb_b)])
1000 loops, best of 3: 1.01 ms per loop

%timeit X_lstsq = np.asarray([lstsq(M,B[:,i])[0] for i in range(nb_b)])
10 loops, best of 3: 47.8 ms per loop

%timeit M_pinv = pinv(M); X_pinv = np.asarray([np.dot(M_pinv,B[:,i]) for i in range(nb_b)])
100 loops, best of 3: 8.64 ms per loop

%timeit U,s,Vh = svd(M); X_svd = np.asarray([solve_svd(U,s,Vh,B[:,i]) for i in range(nb_b)])
100 loops, best of 3: 5.68 ms per loop

Nevertheless, it's up to you to check these timings with dimensions appropriate to your problem.

Hope this helps.

Your question is unclear, but I am guessing you want to solve Mx = b via scipy.linalg.lstsq(M, b) for different arrays (b0, b1, b2, ...). If that is the case, you could parallelize the process with concurrent.futures.ProcessPoolExecutor. The documentation for this is fairly simple and can help Python run multiple SciPy solvers at once.
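A minimal sketch of that idea, assuming the b's are collected in a list bs; note that for small systems the cost of pickling M to each worker process can outweigh the parallel speed-up:

import numpy as np
from functools import partial
from concurrent.futures import ProcessPoolExecutor
from scipy.linalg import lstsq

def solve_one(M, b):
    # solve a single least-squares problem, keeping only the solution
    return lstsq(M, b)[0]

if __name__ == "__main__":
    n, nb_b = 100, 10
    M = np.random.rand(n, n)
    bs = [np.random.rand(n) for _ in range(nb_b)]
    # each solve_one call runs in a separate worker process
    with ProcessPoolExecutor() as executor:
        xs = list(executor.map(partial(solve_one, M), bs))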

Hope this helps.

You can factorize M into either QR or SVD products and find the lsq solution manually.
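For the QR route, here is a sketch of the manual solve using scipy.linalg.qr and solve_triangular, assuming M has full column rank; the factorization is computed once and reused for every right-hand side:

import numpy as np
from scipy.linalg import qr, solve_triangular

n, nb_b = 100, 10
M = np.random.rand(n, n)
B = np.random.rand(n, nb_b)

# economic QR: M = Q R with orthonormal Q and upper-triangular R
Q, R = qr(M, mode='economic')

# M x ~= b  <=>  R x = Q.T b, solved by back substitution for all columns of B
X_qr = solve_triangular(R, np.dot(Q.T, B))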

Comments
  • Yes, that's exactly what I am doing, and I was wondering whether there is a better solution, since the design matrix of the least-squares problem (M) does not change; maybe part of the computation could be done once, for all the b's.
  • I am still not sure I understand, but I don't think the solver is computing anything non-trivial independently of b.
  • I think you understood. This is sad :(