## Numpy/Scipy : solving several least squares with the same design matrix

I face a least-squares problem that I solve via `scipy.linalg.lstsq(M, b)`, where:

- `M` has shape `(n, n)`
- `b` has shape `(n,)`

The issue is that I have to solve it a bunch of times for different `b`'s. How can I do something more efficient? I guess that `lstsq` does a lot of work independently of the value of `b`.

Any ideas?

In case your linear system is well-determined, I'd store `M`'s LU decomposition and reuse it for each `b` individually, or simply make one solve call with a 2-D array `B` holding the horizontally stacked `b`'s. It really depends on your problem, but it's globally the same idea. Let's suppose you get each `b` one at a time; then:

```python
import numpy as np
from scipy.linalg import lstsq, lu_factor, lu_solve, svd, pinv

# as you didn't specify any practical dimensions
n = 100
# number of b's
nb_b = 10

# generate a random n-square matrix M
M = np.random.rand(n**2).reshape(n, n)

# set of nb_b right-hand-side vectors b, stored as columns
B = np.random.rand(n*nb_b).reshape(n, nb_b)

# compute the pivoted LU decomposition of M
M_LU = lu_factor(M)

# then solve for each b
X_LU = np.asarray([lu_solve(M_LU, B[:, i]) for i in range(nb_b)])
```
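As mentioned above, if all the `b`'s are available at once, `lu_solve` also accepts a 2-D right-hand side, so the per-column loop can be replaced by a single call. A minimal sketch of that variant (reusing the same kind of `M` and `B` as in the snippet above):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n, nb_b = 100, 10
M = np.random.rand(n, n)
B = np.random.rand(n, nb_b)

# factorize once...
M_LU = lu_factor(M)

# ...then solve all right-hand sides in one call:
# X_all has shape (n, nb_b), one solution column per b
X_all = lu_solve(M_LU, B)

# this agrees with the column-by-column loop
X_loop = np.asarray([lu_solve(M_LU, B[:, i]) for i in range(nb_b)]).T
print(np.allclose(X_all, X_loop))
```

The single call lets LAPACK handle all the triangular solves in one go instead of re-entering Python for each column.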

but if it is under- or over-determined, you need to use `lstsq` as you did:

```python
X_lstsq = np.asarray([lstsq(M, B[:, i])[0] for i in range(nb_b)])
```

or simply store the pseudo-inverse `M_pinv` with `pinv` (built on `lstsq`) or `pinv2` (built on the SVD):

```python
# compute the pseudo-inverse of M
M_pinv = pinv(M)
X_pinv = np.asarray([np.dot(M_pinv, B[:, i]) for i in range(nb_b)])
```

or you can also do the work yourself, as in `pinv2` for instance: just store the SVD of `M` and solve manually:

```python
# compute the SVD of M
U, s, Vh = svd(M)

def solve_svd(U, s, Vh, b):
    # U diag(s) Vh x = b  <=>  diag(s) Vh x = U.T b = c
    c = np.dot(U.T, b)
    # diag(s) Vh x = c  <=>  Vh x = diag(1/s) c = w
    # (trivial inversion of a diagonal matrix)
    w = np.dot(np.diag(1/s), c)
    # Vh x = w  <=>  x = Vh.H w (where .H stands for hermitian = conjugate transpose)
    x = np.dot(Vh.conj().T, w)
    return x

X_svd = np.asarray([solve_svd(U, s, Vh, B[:, i]) for i in range(nb_b)])
```

All of these give the same result when checked with `np.allclose` (unless the system is not well-determined, in which case the direct LU approach fails). Finally, in terms of performance:

```
%timeit M_LU = lu_factor(M); X_LU = np.asarray([lu_solve(M_LU,B[:,i]) for i in range(nb_b)])
1000 loops, best of 3: 1.01 ms per loop

%timeit X_lstsq = np.asarray([lstsq(M,B[:,i])[0] for i in range(nb_b)])
10 loops, best of 3: 47.8 ms per loop

%timeit M_pinv = pinv(M); X_pinv = np.asarray([np.dot(M_pinv,B[:,i]) for i in range(nb_b)])
100 loops, best of 3: 8.64 ms per loop

%timeit U,s,Vh = svd(M); X_svd = np.asarray([solve_svd(U,s,Vh,B[:,i]) for i in range(nb_b)])
100 loops, best of 3: 5.68 ms per loop
```

Nevertheless, it's up to you to check these timings with your actual dimensions.

Hope this helps.


Your question is unclear, but I am guessing you mean to solve the equation `Mx = b` through `scipy.linalg.lstsq(M, b)` for different arrays (`b0, b1, b2, ...`). If that is the case, you could just parallelize the process with `concurrent.futures.ProcessPoolExecutor`. The documentation for this is fairly simple and can help Python run multiple scipy solvers at once.
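A minimal sketch of that idea; the matrix `M`, the right-hand sides, and the helper `solve_one` are made up here for illustration:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor
from scipy.linalg import lstsq

def solve_one(args):
    """Solve a single least-squares problem M x = b."""
    M, b = args
    return lstsq(M, b)[0]

if __name__ == "__main__":
    n, nb_b = 100, 8
    M = np.random.rand(n, n)
    bs = [np.random.rand(n) for _ in range(nb_b)]

    # each (M, b) pair is shipped to a worker process
    with ProcessPoolExecutor() as pool:
        solutions = list(pool.map(solve_one, [(M, b) for b in bs]))

    print(len(solutions), solutions[0].shape)
```

Note that pickling `M` to every worker has a cost, so this only pays off when each individual solve is expensive relative to the inter-process overhead.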

Hope this helps.


You can factorize `M` into either QR or SVD products and find the lsq solution manually.
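A hedged sketch of the QR route for an over-determined system: with `M = QR` (economic factorization), the least-squares solution satisfies `R x = Q.T b`, so `Q` and `R` can be stored once and each new `b` costs only a matrix-vector product plus a triangular solve. The dimensions below are made up for illustration:

```python
import numpy as np
from scipy.linalg import qr, solve_triangular, lstsq

m, n, nb_b = 200, 100, 10     # over-determined: more rows than columns
M = np.random.rand(m, n)
B = np.random.rand(m, nb_b)

# economic QR: Q is (m, n) with orthonormal columns, R is (n, n) upper triangular
Q, R = qr(M, mode='economic')

# reuse the factorization for every right-hand side
X_qr = np.asarray([solve_triangular(R, Q.T @ B[:, i]) for i in range(nb_b)])

# agrees with lstsq applied column by column
X_ls = np.asarray([lstsq(M, B[:, i])[0] for i in range(nb_b)])
print(np.allclose(X_qr, X_ls))
```

This assumes `M` has full column rank; for rank-deficient problems the SVD-based approach above is the safer choice.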


##### Comments

- Yes, that's exactly what I am doing, and I was wondering whether there is a better solution: since the design matrix of the least-squares problem (`M`) does not change, maybe a part of the computation could be done once for all `b`'s.
- I am still unsure if I understand, but I don't think the solver is computing anything non-trivial independently of `b`.
- I think you understood. This is sad :(