Parameter values for parameter (n_estimators) need to be a sequence

I am getting the error below and am not sure how to fix it. Can you please help? The entire code can be found at https://github.com/kthouz/NYC_Green_Taxi/blob/master/NYC%20Green%20Taxi.ipynb

# optimize n_estimator through grid search
def optimize_num_trees(alg,param_test,scoring_method,train,predictors,target):
    """
    This function is used to tune parameters of a predictive algorithm
    alg: sklearn model,
    param_test: dict, parameters to be tuned
    scoring_method: str, method to be used by the cross-validation to evaluate the model
    train: pandas.DataFrame, training data
    predictors: list, labels to be used in the model training process. They should be in the column names of train
    target: str, target variable
    """
    gsearch = GridSearchCV(estimator=alg, param_grid = param_test, scoring=scoring_method,n_jobs=2,iid=False,cv=5)
    gsearch.fit(train[predictors],train[target])
    return gsearch

# get results of the search grid
gs_cls = optimize_num_trees(model_cls,param_test,'roc_auc',train,predictors,target)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-c7419a90cdb1> in <module>()
      1 
      2 # get results of the search grid
----> 3 gs_cls = optimize_num_trees(model_cls,param_test,'roc_auc',train,predictors,target)
      4 

<ipython-input-40-2b76f2ffb87f> in optimize_num_trees(alg, param_test, scoring_method, train, predictors, target)
     57     target: str, target variable
     58     """
---> 59     gsearch = GridSearchCV(estimator=alg, param_grid = param_test, scoring=scoring_method,n_jobs=2,iid=False,cv=5)
     60     gsearch.fit(train[predictors],train[target])
     61     return gsearch

/Users/anaconda/lib/python3.5/site-packages/sklearn/grid_search.py in __init__(self, estimator, param_grid, scoring, fit_params, n_jobs, iid, refit, cv, verbose, pre_dispatch, error_score)
    810             refit, cv, verbose, pre_dispatch, error_score)
    811         self.param_grid = param_grid
--> 812         _check_param_grid(param_grid)
    813 
    814     def fit(self, X, y=None):

/Users/anaconda/lib/python3.5/site-packages/sklearn/grid_search.py in _check_param_grid(param_grid)
    346             if True not in check:
    347                 raise ValueError("Parameter values for parameter ({0}) need "
--> 348                                  "to be a sequence.".format(name))
    349 
    350             if len(v) == 0:

ValueError: Parameter values for parameter (n_estimators) need to be a sequence.

I ran into a similar error with the following code:

# optimize n_estimator through grid search
# define range over which number of trees is to be optimized
param_test = {'n_estimators':range(30,151,20)} 

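In Python 3, range(30,151,20) returns a range object rather than a list, and this version of GridSearchCV rejects it. You can change

range(30,151,20) to np.arange(30,151,20)

or wrap it in a list: list(range(30,151,20)).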
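A minimal sketch of why the original grid fails under Python 3, using only the standard library. The `accepted_types` tuple is an assumption that only approximates the check shown in the traceback; it is not scikit-learn's actual code. Judging from the np.arange fix, the real check also accepts NumPy arrays, which are left out here to keep the sketch dependency-free.

```python
# Assumption: a rough stand-in for the type check behind the traceback.
# The real check apparently also accepts numpy.ndarray (hence np.arange works).
accepted_types = (list, tuple)

values = range(30, 151, 20)

# In Python 3, range() returns a lazy range object, not a list.
print(type(values).__name__)               # range
print(isinstance(values, accepted_types))  # False

# Materializing the range gives a plain list that passes the check.
fixed = list(values)
print(fixed)                               # [30, 50, 70, 90, 110, 130, 150]
print(isinstance(fixed, accepted_types))   # True

param_test = {'n_estimators': fixed}
```

Under Python 2 the same range(...) call returned a list, which would explain why older notebooks written this way ran without the error.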

I got a similar error:

ValueError: Parameter values for parameter (warm_start) need to be a sequence(but not a string) or np.ndarray.

The value for each key apparently needs to be wrapped in list brackets [ ], even when there is only one value.

My erroneous code:

params = {
    'max_depth': [11],
    'warm_start': True
}

My correct code:

params = {
    'max_depth': [11],
    'warm_start': [True]
}
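If a grid is assembled from mixed inputs, one option is to normalize it before handing it to GridSearchCV. The sketch below is only an illustration: `wrap_scalars` is a hypothetical helper, not part of scikit-learn, and it assumes that anything other than a list, tuple, or range should count as a single value.

```python
def wrap_scalars(params):
    """Hypothetical helper: make every value in a parameter dict a list."""
    grid = {}
    for name, value in params.items():
        if isinstance(value, (list, tuple, range)):
            # Already a collection of candidates: store it as a plain list.
            grid[name] = list(value)
        else:
            # Scalars (and strings, which count as one value) get wrapped.
            grid[name] = [value]
    return grid

params = wrap_scalars({'max_depth': [11], 'warm_start': True})
print(params)  # {'max_depth': [11], 'warm_start': [True]}
```

Strings are deliberately treated as single values, which matches the "but not a string" wording in the error message.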

GridSearchCV expects each parameter's values as a sequence, so you should always give them as a list or NumPy array, even if there is only a single value.

For example, the dictionary below raises the error when passed to GridSearchCV, because the value of n_jobs, -1, is a single integer and not a sequence (list or array):

parameters={'alpha':[0.01, 0.1, 1, 10], 'n_jobs':-1}

If you wrap the -1 in a list, GridSearchCV won't raise the error:

parameters={'alpha':[0.01,0.1,1,10], 'n_jobs': [-1]}
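Conceptually, a grid search tries every combination of the listed values, which is why each entry has to be something it can iterate over. The following is a rough standard-library sketch of that expansion. `expand_grid` is a hypothetical helper written for illustration, not scikit-learn's implementation (scikit-learn provides sklearn.model_selection.ParameterGrid for this).

```python
from itertools import product

def expand_grid(param_grid):
    """Hypothetical helper: list every combination in a parameter grid."""
    names = sorted(param_grid)
    value_lists = [param_grid[name] for name in names]
    # product() iterates over each value list; a bare scalar such as -1
    # is not iterable, so it could not take part in the expansion.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

parameters = {'alpha': [0.01, 0.1, 1, 10], 'n_jobs': [-1]}
candidates = expand_grid(parameters)
for candidate in candidates:
    print(candidate)
```

With four values for alpha and one for n_jobs, this yields four candidate settings.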

Hope it helps.

Comments
  • Check the param_test variable and see whether you have initialized it correctly. It should be a dictionary or a list of dictionaries. The link you added declares param_test as a dictionary: param_test = {'n_estimators':range(50,200,25)}. You can also try param_test = {'n_estimators':list(range(50,200,25))} in case it needs a list and not a range object.
  • Can you read that? I can't...
  • Edit your question and use the {} button above the edit box to format your code as code so it doesn't reflow.
  • {'n_estimators':list(range(50,200,25))} Adding this fixed it. Thank you!!!
  • @ClockSlave I hit the same issue and params = {'max_depth': list(range(1,11))} helped me out too. Can anyone explain why this issue occurs? What was I missing when I wrote params = {'max_depth': range(1,11)}?