## sklearn: Found arrays with inconsistent numbers of samples when calling LinearRegression.fit()

I'm just trying to fit a simple linear regression, but I'm baffled by the error raised by this code:

    regr = LinearRegression()
    regr.fit(df2.iloc[1:1000, 5].values, df2.iloc[1:1000, 2].values)

which produces:

ValueError: Found arrays with inconsistent numbers of samples: [ 1 999]

These selections must have the same dimensions, and they should be numpy arrays, so what am I missing?

It looks like sklearn requires `X` to have the shape (number of rows, number of columns). If your data has the shape (number of rows,), such as `(999,)`, it does not work: the 1-D array is treated as a single row, which is why the error reports 1 sample against 999. Use `numpy.reshape()` to change the shape of the array to `(999, 1)`, e.g.

    data = data.reshape((999, 1))

In my case, that fixed it.
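To make the reshape step concrete, here is a minimal sketch. The synthetic arrays `x` and `y` are assumptions standing in for the two DataFrame columns in the question; `reshape(-1, 1)` lets NumPy infer the row count instead of hard-coding 999. Note that recent scikit-learn versions report this problem with a differently worded error ("Expected 2D array, got 1D array instead"), but the fix is the same.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data standing in for the two DataFrame columns (assumption).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=999)   # shape (999,): 1-D, one value per sample
y = 3.0 * x + 2.0                  # shape (999,): a 1-D target is fine

# Turn the 1-D feature into a column: 999 samples, 1 feature.
X = x.reshape(-1, 1)               # shape (999, 1)

regr = LinearRegression()
regr.fit(X, y)

print(x.shape, X.shape)
print(regr.coef_, regr.intercept_)
```

Only `X` needs the extra dimension here; `y` can stay 1-D.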


It looks like you are using a pandas DataFrame (judging by the name `df2`).

You could also do the following:

    regr = LinearRegression()
    regr.fit(df2.iloc[1:1000, 5].to_frame(), df2.iloc[1:1000, 2].to_frame())

NOTE: I have removed `.values` because it converts the pandas Series to a `numpy.ndarray`, and `numpy.ndarray` has no `to_frame()` method.


Seen on the Udacity deep learning foundation course:

    df = pd.read_csv('my.csv')
    ...
    regr = LinearRegression()
    regr.fit(df[['column x']], df[['column y']])


I think the `X` argument of `regr.fit` needs to be two-dimensional (a matrix), so the following should work. Selecting the column with a list, `[5]`, keeps the result 2-D.

    regr = LinearRegression()
    regr.fit(df2.iloc[1:1000, [5]].values, df2.iloc[1:1000, 2].values)
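The pandas-based answers in this thread all come down to whether a selection yields a 1-D Series or a 2-D DataFrame. The sketch below compares the resulting shapes; the small frame `df2` and its column names are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 10 rows, 6 numeric columns named a..f.
df2 = pd.DataFrame(np.arange(60, dtype=float).reshape(10, 6),
                   columns=list("abcdef"))

by_position = df2.iloc[1:10, 5]               # scalar position -> Series (1-D)
by_list = df2.iloc[1:10, [5]]                 # list of positions -> DataFrame (2-D)
via_to_frame = df2.iloc[1:10, 5].to_frame()   # Series -> one-column DataFrame
by_name = df2[["f"]]                          # double brackets -> DataFrame (2-D)

print(by_position.values.shape)   # (9,)
print(by_list.values.shape)       # (9, 1)
print(via_to_frame.shape)         # (9, 1)
print(by_name.shape)              # (10, 1)
```

Any of the 2-D forms can be passed as `X`; the 1-D Series form is the one that triggers the error.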


I encountered this error because I converted my data to an `np.array`. I fixed the problem by converting my data to an `np.matrix` instead and taking the transpose.

Raises ValueError:
`regr.fit(np.array(x_list), np.array(y_list))`

Correct:
`regr.fit(np.transpose(np.matrix(x_list)), np.transpose(np.matrix(y_list)))`
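The transpose of an `np.matrix` works because it produces a column of shape `(N, 1)`. The same shape can be obtained from a plain array, which may be preferable since NumPy's documentation no longer recommends `np.matrix` (and, as far as I know, newer scikit-learn releases may reject it). A minimal sketch, with `x_list` as hypothetical sample data:

```python
import numpy as np

x_list = [1.0, 2.0, 3.0, 4.0]          # hypothetical sample data

flat = np.array(x_list)                # shape (4,): the form that fails as X
col_reshape = flat.reshape(-1, 1)      # shape (4, 1)
col_newaxis = flat[:, np.newaxis]      # shape (4, 1), equivalent spelling

print(flat.shape, col_reshape.shape, col_newaxis.shape)
```

Either column form can be passed as `X` in place of `np.transpose(np.matrix(x_list))`.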


##### Comments

- My data shape is (10L,). How do I convert it to (10L,1)? When I use `data=data.reshape(len(data),1)`, the resulting shape is (10L,1L), not (10L,1).
- @user3841581 please refer to this post.
- @Boern Thanks for the comment. I also discovered that X_train should be of size (N,1) but y_train should be of size (N,) not (N,1), otherwise it does not work, at least not for me.
- `data.reshape(...)` may show a deprecation warning if `data` is a Series object. Use `data.values.reshape(...)` instead.
- data = data.reshape(-1,1)
- Thanks! This is really the simplest and easiest to understand!
- Actually, the Y parameter is expected as a (length, ) shape. Thanks!
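On the shape of `y` raised in the comments above: the sketch below, which assumes a reasonably recent scikit-learn and uses made-up data, fits `LinearRegression` with both a 1-D and a column-shaped target. In the versions I am aware of both are accepted, but the fitted attributes and predictions take different shapes, so `(N,)` is usually the more convenient choice for a single target.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(10, dtype=float).reshape(-1, 1)   # shape (10, 1)
y_1d = 2.0 * X.ravel() + 1.0                    # shape (10,)
y_2d = y_1d.reshape(-1, 1)                      # shape (10, 1)

m1 = LinearRegression().fit(X, y_1d)
m2 = LinearRegression().fit(X, y_2d)

print(m1.coef_.shape, m1.predict(X).shape)   # (1,) (10,)
print(m2.coef_.shape, m2.predict(X).shape)   # (1, 1) (10, 1)
```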