Hot questions: using neural networks with grid search

Question:

I'm trying to use the Keras scikit-learn wrapper to make random search over parameters easier. I wrote the example code here, where:

  1. I generate an artificial dataset:

I am using make_moons from scikit-learn:

from sklearn.datasets import make_moons
dataset = make_moons(1000)
  2. Model builder definition:

I define the build_fn function the wrapper needs (imports shown for completeness):

from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization
from keras.regularizers import l2, activity_l2
from keras.wrappers.scikit_learn import KerasClassifier

def build_fn(nr_of_layers = 2,
             first_layer_size = 10,
             layers_slope_coeff = 0.8,
             dropout = 0.5,
             activation = "relu",
             weight_l2 = 0.01,
             act_l2 = 0.01,
             input_dim = 2):

    result_model = Sequential()
    result_model.add(Dense(first_layer_size,
                           input_dim = input_dim,
                           activation=activation,
                           W_regularizer= l2(weight_l2),
                           activity_regularizer=activity_l2(act_l2)
                           ))

    current_layer_size = int(first_layer_size * layers_slope_coeff) + 1

    for index_of_layer in range(nr_of_layers - 1):

        result_model.add(BatchNormalization())
        result_model.add(Dropout(dropout))
        result_model.add(Dense(current_layer_size,
                               W_regularizer= l2(weight_l2),
                               activation=activation,
                               activity_regularizer=activity_l2(act_l2)
                               ))

        current_layer_size = int(current_layer_size * layers_slope_coeff) + 1

    result_model.add(Dense(1,
                           activation = "sigmoid",
                           W_regularizer = l2(weight_l2)))

    result_model.compile(optimizer="rmsprop", metrics = ["accuracy"], loss = "binary_crossentropy")

    return result_model

NeuralNet = KerasClassifier(build_fn)
  3. Parameter grid definition:

Then I defined a parameter grid:

param_grid = {
    "nr_of_layers" : [2, 3, 4, 5],
    "first_layer_size" : [5, 10, 15],
    "layers_slope_coeff" : [0.4, 0.6, 0.8],
    "dropout" : [0.3, 0.5, 0.8],
    "weight_l2" : [0.01, 0.001, 0.0001],
    "verbose" : [0],
    "batch_size" : [1],
    "nb_epoch" : [30]
}
  4. RandomizedSearchCV phase:

I defined a RandomizedSearchCV object and fitted it with the artificial dataset:

random_search = RandomizedSearchCV(NeuralNet, 
    param_distributions=param_grid, verbose=2, n_iter=1, scoring="roc_auc")
random_search.fit(dataset[0], dataset[1])

What I got after running this code in the console is:

Traceback (most recent call last):
  File "C:\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-c5bdbc2770b7>", line 2, in <module>
    random_search.fit(dataset[0], dataset[1])
  File "C:\Anaconda2\lib\site-packages\sklearn\grid_search.py", line 996, in fit
    return self._fit(X, y, sampled_params)
  File "C:\Anaconda2\lib\site-packages\sklearn\grid_search.py", line 553, in _fit
    for parameters in parameter_iterable
  File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 800, in __call__
    while self.dispatch_one_batch(iterator):
  File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 658, in dispatch_one_batch
    self._dispatch(tasks)
  File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 566, in _dispatch
    job = ImmediateComputeBatch(batch)
  File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 180, in __init__
    self.results = batch()
  File "C:\Anaconda2\lib\site-packages\sklearn\externals\joblib\parallel.py", line 72, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "C:\Anaconda2\lib\site-packages\sklearn\cross_validation.py", line 1550, in _fit_and_score
    test_score = _score(estimator, X_test, y_test, scorer)
  File "C:\Anaconda2\lib\site-packages\sklearn\cross_validation.py", line 1606, in _score
    score = scorer(estimator, X_test, y_test)
  File "C:\Anaconda2\lib\site-packages\sklearn\metrics\scorer.py", line 175, in __call__
    y_pred = y_pred[:, 1]
IndexError: index 1 is out of bounds for axis 1 with size 1

This code works fine when I use the accuracy metric instead of scoring="roc_auc". Can anyone explain what's wrong? Has anyone had a similar problem?


Answer:

There is a bug in KerasClassifier that causes this issue. I have opened an issue for it on the repo: https://github.com/fchollet/keras/issues/2864

The fix is described there as well. In the meantime, you can define your own KerasClassifier as a temporary workaround:

import numpy as np

class FixedKerasClassifier(KerasClassifier):
    def predict_proba(self, X, **kwargs):
        kwargs = self.filter_sk_params(Sequential.predict_proba, kwargs)
        probs = self.model.predict_proba(X, **kwargs)
        # A single sigmoid output has shape (n_samples, 1); pad it to
        # (n_samples, 2) so scorers like roc_auc can index column 1.
        if probs.shape[1] == 1:
            probs = np.hstack([1 - probs, probs])
        return probs
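The np.hstack line is the heart of the workaround: scikit-learn's roc_auc scorer indexes column 1 of a (n_samples, 2) probability array, while a single sigmoid output yields shape (n_samples, 1). A minimal NumPy sketch of the padding (illustrative values only):

```python
import numpy as np

# A sigmoid output gives one probability per sample: P(class 1).
probs = np.array([[0.9], [0.2], [0.6]])   # shape (3, 1)

# Pad with the complement so column 0 is P(class 0) and column 1 is
# P(class 1), matching what scikit-learn scorers expect.
padded = np.hstack([1 - probs, probs])    # shape (3, 2)

print(padded.shape)    # (3, 2)
print(padded[:, 1])    # [0.9 0.2 0.6] -- the original class-1 probabilities
```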

Question:

My sequential dense DNN seems to run through each parameter in my parameter grid three times during grid search. I expect it to run once per epoch value specified in the grid: 10, 50 and 100. Why does this happen?

model architecture:

def build_model():
    print('building DNN architecture')
    model = Sequential()
    model.add(Dropout(0.02, input_shape = (150,)))
    model.add(Dense(8, init = 'normal', activation = 'relu'))
    model.add(Dropout(0.02))
    model.add(Dense(16, init = 'normal', activation = 'relu'))
    model.add(Dense(1, init = 'normal'))
    model.compile(loss = 'mean_squared_error', optimizer = 'adam')
    print('model successfully compiled')
    return model

Grid search on epochs:

from sklearn.model_selection import GridSearchCV
epochs = [10,50,100]
param_grid = dict(epochs = epochs)
grid = GridSearchCV(estimator = KerasRegressor(build_fn = build_model), param_grid = param_grid)
grid_result = grid.fit(x_train, y_train)
grid_result.best_params_

Answer:

Because GridSearchCV does both grid search and cross-validation. For each parameter combination, three splits (the default in older scikit-learn; newer versions default to five) are used for cross-validation, which is why you see the model trained three times per parameter set.

You can change the number of folds (splits) with the "cv" parameter. Check it out in the documentation.
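To see the fit count concretely, here is a small sketch using a plain scikit-learn estimator (a DecisionTreeClassifier and synthetic data, chosen purely for illustration rather than the asker's Keras model): with three parameter values and cv=5, GridSearchCV performs 3 x 5 = 15 fits.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=60, random_state=0)

param_grid = {"max_depth": [1, 2, 3]}   # 3 parameter combinations
grid = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=5)
grid.fit(X, y)

# Each combination is fitted once per fold: 3 combinations x 5 folds = 15 fits.
print(len(grid.cv_results_["params"]))   # 3
print(grid.n_splits_)                    # 5
```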

Question:

I am trying to run a grid search for a neural network, but I keep getting some strange errors. My code looks like:

parameters={'learning_rate':["constant", "invscaling", "adaptive"], 
                 'hidden_layer_sizes': (156,), 'alpha': [10.0 ** -np.arange(1, 7)], 
                 'activation': ["logistic", "relu", "Tanh"]}
grid= GridSearchCV(MLPClassifier(),parameters, n_jobs=-1, cv=10)
grid.fit(train_x, train_y)

The error message I get is:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I have also tried using just one value for activation and learning_rate, but the problem persists. Is there anything I am doing wrong?


Answer:

I spotted two mistakes in your code.

First: the alpha values should be a plain list of numbers, not a list containing a single NumPy array; a list comprehension does the job.

Second: in 'activation': ["logistic", "relu", "Tanh"], Tanh should be tanh (MLPClassifier's activation names are lowercase).

The following changes should make the code work:

Replace:

'alpha': [10.0 ** -np.arange(1, 7)]
'activation': ["logistic", "relu", "Tanh"]

With:

'alpha': [10.0 ** -i for i in range(1,7)]
'activation': ["logistic", "relu", "tanh"]

Putting everything together:

parameters={'learning_rate':["constant", "invscaling", "adaptive"], 
             'hidden_layer_sizes': (156,), 'alpha': [10.0 ** -i for i in range(1,7)], 
             'activation': ["logistic", "relu", "tanh"]}

grid= GridSearchCV(MLPClassifier(), parameters, n_jobs=-1, cv=10)

grid.fit(train_x, train_y)
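The root cause of the ValueError is worth spelling out: 10.0 ** -np.arange(1, 7) is a single NumPy array, so wrapping it in brackets gives a one-element list, and grid search then passes the whole array as alpha; when MLPClassifier validates it with `if self.alpha < 0.0`, the array comparison raises the "truth value is ambiguous" error. A quick sketch of the difference (the comprehension yields six separate floats):

```python
import numpy as np

bad = [10.0 ** -np.arange(1, 7)]            # one element: a length-6 array
good = [10.0 ** -i for i in range(1, 7)]    # six elements: plain floats

print(len(bad))    # 1  -> grid search tries alpha=array([...]) once
print(len(good))   # 6  -> grid search tries alpha=0.1, 0.01, ..., 1e-06

# The validation check that blows up on the array form:
# bad[0] < 0.0 is an array of six booleans, so `if bad[0] < 0.0:` is ambiguous.
```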

Question:

I am trying to teach myself how to grid-search the number of neurons in a basic multi-layer neural network, using Python's GridSearchCV and KerasClassifier along with Keras. The code below works very well on other datasets, but I could not make it work for the Iris dataset, and I cannot find out why; I must be missing something here. The result I get is:

Best: 0.000000 using {'n_neurons': 3} 0.000000 (0.000000) with: {'n_neurons': 3} 0.000000 (0.000000) with: {'n_neurons': 5}

from pandas import read_csv

import numpy
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler

from keras.wrappers.scikit_learn import KerasClassifier
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from sklearn.model_selection import GridSearchCV

dataframe=read_csv("iris.csv", header=None)
dataset=dataframe.values
X=dataset[:,0:4].astype(float)
Y=dataset[:,4]

seed=7
numpy.random.seed(seed)

#encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

#one-hot encoding
dummy_y = np_utils.to_categorical(encoded_Y)

#scale the data
scaler = StandardScaler()
X = scaler.fit_transform(X)

def create_model(n_neurons=1):
    #create model
    model = Sequential()
    model.add(Dense(n_neurons, input_dim=X.shape[1], activation='relu')) # hidden layer
    model.add(Dense(3, activation='softmax')) # output layer
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=10, initial_epoch=0, verbose=0)
# define the grid search parameters
neurons=[3, 5]

#this does 3-fold cross-validation by default. One can change k via the cv parameter.
param_grid = dict(n_neurons=neurons)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X, dummy_y)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

For the purposes of illustration and computational efficiency I search over only two values. I sincerely apologize for asking such a simple question. I am new to Python, having switched from R because I realized the deep learning community uses Python.


Answer:

Haha, this is probably the funniest thing I ever experienced on Stack Overflow :) Check:

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)

and you should see different behavior. The reason your model gets a "perfect" score (a cross-entropy of 0 corresponds to the best possible model) is that you haven't shuffled your data: because Iris consists of three balanced classes, each of your folds had a single class as its target:

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 (first fold ends here) 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 (second fold ends here)2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]

Such problems are really easy for any model to solve, which is why you got a perfect match.

Shuffle your data beforehand and you should see the expected behavior.
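You can reproduce the effect with a plain KFold split over sorted labels (a sketch, not the asker's exact pipeline): with 150 samples sorted by class and 3 folds, each test fold contains exactly one class, so the train folds never see it.

```python
import numpy as np
from sklearn.model_selection import KFold

# Iris-like targets: 150 samples sorted by class, 50 per class.
y = np.repeat([0, 1, 2], 50)

# Without shuffling, the fold boundaries line up exactly with the classes.
for fold, (train_idx, test_idx) in enumerate(KFold(n_splits=3).split(y)):
    print(fold, np.unique(y[test_idx]))   # each test fold holds a single class

# Shuffling the data (or using StratifiedKFold) mixes all classes into
# every fold, restoring a meaningful cross-validation score.
```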

Question:

I am trying to use GridSearchCV along with MLPClassifier to fit some training data with the best parameters:

parameters={
    'learning_rate': ["constant", "invscaling", "adaptive"],
    'hidden_layer_sizes': [x for x in itertools.product((10,20,30,40,50,100),repeat=3)],
    'alpha': [10.0 **-np.arange(1, 7)],
    'activation': ["logistic", "relu", "Tanh"]
    }
ord_pred = MLPClassifier(hidden_layer_sizes = (100,1))
clf = GridSearchCV(estimator=ord_pred, param_grid=parameters, n_jobs=-1, verbose=10)
orders_prior1 = orders_prior.groupby('product_id').filter(lambda x: len(x) >= 3).fillna(0)
clf.fit(orders_prior1[['user_id','order_number','order_dow','order_hour_of_day','days_since_prior_order']],
        orders_prior1['product_id'], orders_prior1['user_order'])

However, I got the following errors/exceptions:

   if self.alpha < 0.0:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The above exception was the direct cause of the following exception:

TransportableException                    Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in retrieve(self)
    681                 if 'timeout' in getfullargspec(job.get).args:
--> 682                     self._output.extend(job.get(timeout=self.timeout))
    683                 else:

C:\Anaconda3\lib\multiprocessing\pool.py in get(self, timeout)
    643         else:
--> 644             raise self._value
    645 

TransportableException: TransportableException
___________________________________________________________________________
ValueError                                         Wed Aug 16 19:23:55 2017
PID: 18804                            Python 3.6.2: C:\Anaconda3\python.exe
...........................................................................
C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self=<sklearn.externals.joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _fit_and_score>, (MLPClassifier(activation='logistic',
       alph...on_fraction=0.1, verbose=False, warm_start=False),           user_id  order_number  order_dow  orde...               7.0  

[32433710 rows x 5 columns], 0             196
1           14084
2           ...
Name: product_id, Length: 32433710, dtype: int64, <function _passthrough_scorer>, memmap([    1606,     1610,     1618, ..., 32433707, 32433708, 32433709]), memmap([       0,        1,        2, ..., 32190332, 32190334, 32190356]), 10, {'activation': 'logistic', 'alpha': array([  1.00000000e-01,   1.00000000e-02,   1.0...0000000e-04,   1.00000000e-05,   1.00000000e-06]), 'hidden_layer_sizes': (10, 10, 10), 'learning_rate': 'constant'}), {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True})]
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129 
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _fit_and_score>
        args = (MLPClassifier(activation='logistic',
       alph...on_fraction=0.1, verbose=False, warm_start=False),           user_id  order_number  order_dow  orde...               7.0  

[32433710 rows x 5 columns], 0             196
1           14084
2           ...
Name: product_id, Length: 32433710, dtype: int64, <function _passthrough_scorer>, memmap([    1606,     1610,     1618, ..., 32433707, 32433708, 32433709]), memmap([       0,        1,        2, ..., 32190332, 32190334, 32190356]), 10, {'activation': 'logistic', 'alpha': array([  1.00000000e-01,   1.00000000e-02,   1.0...0000000e-04,   1.00000000e-05,   1.00000000e-06]), 'hidden_layer_sizes': (10, 10, 10), 'learning_rate': 'constant'})
        kwargs = {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True}
    132 
    133     def __len__(self):
    134         return self._size
    135 

...........................................................................
C:\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py in _fit_and_score(estimator=MLPClassifier(activation='logistic',
       alph...on_fraction=0.1, verbose=False, warm_start=False), X=          user_id  order_number  order_dow  orde...               7.0  

[32433710 rows x 5 columns], y=0             196
1           14084
2           ...
Name: product_id, Length: 32433710, dtype: int64, scorer=<function _passthrough_scorer>, train=memmap([    1606,     1610,     1618, ..., 32433707, 32433708, 32433709]), test=memmap([       0,        1,        2, ..., 32190332, 32190334, 32190356]), verbose=10, parameters={'activation': 'logistic', 'alpha': array([  1.00000000e-01,   1.00000000e-02,   1.0...0000000e-04,   1.00000000e-05,   1.00000000e-06]), 'hidden_layer_sizes': (10, 10, 10), 'learning_rate': 'constant'}, fit_params={}, return_train_score=True, return_parameters=True, return_n_test_samples=True, return_times=True, error_score='raise')
    233 
    234     try:
    235         if y_train is None:
    236             estimator.fit(X_train, **fit_params)
    237         else:
--> 238             estimator.fit(X_train, y_train, **fit_params)
        estimator.fit = <bound method BaseMultilayerPerceptron.fit of ML...n_fraction=0.1, verbose=False, warm_start=False)>
        X_train =           user_id  order_number  order_dow  orde...               7.0  

[21606079 rows x 5 columns]
        y_train = 1606        17762
1610        17762
1618        ...
Name: product_id, Length: 21606079, dtype: int64
        fit_params = {}
    239 
    240     except Exception as e:
    241         # Note fit time as time until error
    242         fit_time = time.time() - start_time

...........................................................................
C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in fit(self=MLPClassifier(activation='logistic',
       alph...on_fraction=0.1, verbose=False, warm_start=False), X=          user_id  order_number  order_dow  orde...               7.0  

[21606079 rows x 5 columns], y=1606        17762
1610        17762
1618        ...
Name: product_id, Length: 21606079, dtype: int64)
    613 
    614         Returns
    615         -------
    616         self : returns a trained MLP model.
    617         """
--> 618         return self._fit(X, y, incremental=False)
        self._fit = <bound method BaseMultilayerPerceptron._fit of M...n_fraction=0.1, verbose=False, warm_start=False)>
        X =           user_id  order_number  order_dow  orde...               7.0  

[21606079 rows x 5 columns]
        y = 1606        17762
1610        17762
1618        ...
Name: product_id, Length: 21606079, dtype: int64
    619 
    620     @property
    621     def partial_fit(self):
    622         """Fit the model to data matrix X and target y.

...........................................................................
C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in _fit(self=MLPClassifier(activation='logistic',
       alph...on_fraction=0.1, verbose=False, warm_start=False), X=          user_id  order_number  order_dow  orde...               7.0  

[21606079 rows x 5 columns], y=1606        17762
1610        17762
1618        ...
Name: product_id, Length: 21606079, dtype: int64, incremental=False)
    320         if not hasattr(hidden_layer_sizes, "__iter__"):
    321             hidden_layer_sizes = [hidden_layer_sizes]
    322         hidden_layer_sizes = list(hidden_layer_sizes)
    323 
    324         # Validate input parameters.
--> 325         self._validate_hyperparameters()
        self._validate_hyperparameters = <bound method BaseMultilayerPerceptron._validate...n_fraction=0.1, verbose=False, warm_start=False)>
    326         if np.any(np.array(hidden_layer_sizes) <= 0):
    327             raise ValueError("hidden_layer_sizes must be > 0, got %s." %
    328                              hidden_layer_sizes)
    329 

...........................................................................
C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in _validate_hyperparameters(self=MLPClassifier(activation='logistic',
       alph...on_fraction=0.1, verbose=False, warm_start=False))
    386         if not isinstance(self.shuffle, bool):
    387             raise ValueError("shuffle must be either True or False, got %s." %
    388                              self.shuffle)
    389         if self.max_iter <= 0:
    390             raise ValueError("max_iter must be > 0, got %s." % self.max_iter)
--> 391         if self.alpha < 0.0:
        self.alpha = array([  1.00000000e-01,   1.00000000e-02,   1.0...0000000e-04,   1.00000000e-05,   1.00000000e-06])
    392             raise ValueError("alpha must be >= 0, got %s." % self.alpha)
    393         if (self.learning_rate in ["constant", "invscaling", "adaptive"] and
    394                 self.learning_rate_init <= 0.0):
    395             raise ValueError("learning_rate_init must be > 0, got %s." %

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
___________________________________________________________________________

During handling of the above exception, another exception occurred:

JoblibValueError                          Traceback (most recent call last)
<ipython-input-20-7c1268d1d451> in <module>()
      9 orders_prior1 = orders_prior.groupby('product_id').filter(lambda x: len(x) >= 3).fillna(0)
     10 # up = orders_prior['product_id'].unique()
---> 11 clf.fit(orders_prior1                      [['user_id','order_number','order_dow','order_hour_of_day','days_since_prior_order']]                      ,orders_prior1['product_id'], orders_prior1['user_order'])
     12 
     13 # ord_pred.partial_fit(orders_prior.fillna(0).iloc[0:894]\

C:\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups)
    943             train/test set.
    944         """
--> 945         return self._fit(X, y, groups, ParameterGrid(self.param_grid))
    946 
    947 

C:\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in _fit(self, X, y, groups, parameter_iterable)
    562                                   return_times=True, return_parameters=True,
    563                                   error_score=self.error_score)
--> 564           for parameters in parameter_iterable
    565           for train, test in cv_iter)
    566 

C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self, iterable)
    766                 # consumption.
    767                 self._iterating = False
--> 768             self.retrieve()
    769             # Make sure that we get a last message telling us we are done
    770             elapsed_time = time.time() - self._start_time

C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py in retrieve(self)
    717                     ensure_ready = self._managed_backend
    718                     backend.abort_everything(ensure_ready=ensure_ready)
--> 719                 raise exception
    720 
    721     def __call__(self, iterable):

JoblibValueError: JoblibValueError
        233 
        234     try:
        235         if y_train is None:
        236             estimator.fit(X_train, **fit_params)
        237         else:
    --> 238             estimator.fit(X_train, y_train, **fit_params)
            estimator.fit = <bound method BaseMultilayerPerceptron.fit of ML...n_fraction=0.1, verbose=False, warm_start=False)>
            X_train =           user_id  order_number  order_dow  orde...               7.0  

    [21606079 rows x 5 columns]
            y_train = 1606        17762
    1610        17762
    1618        ...
    Name: product_id, Length: 21606079, dtype: int64
            fit_params = {}
        239 
        240     except Exception as e:
        241         # Note fit time as time until error
        242         fit_time = time.time() - start_time

    ...........................................................................
    C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in fit(self=MLPClassifier(activation='logistic',
           alph...on_fraction=0.1, verbose=False, warm_start=False), X=          user_id  order_number  order_dow  orde...               7.0  

    [21606079 rows x 5 columns], y=1606        17762
    1610        17762
    1618        ...
    Name: product_id, Length: 21606079, dtype: int64)
        613 
        614         Returns
        615         -------
        616         self : returns a trained MLP model.
        617         """
    --> 618         return self._fit(X, y, incremental=False)
            self._fit = <bound method BaseMultilayerPerceptron._fit of M...n_fraction=0.1, verbose=False, warm_start=False)>
            X =           user_id  order_number  order_dow  orde...               7.0  

    [21606079 rows x 5 columns]
            y = 1606        17762
    1610        17762
    1618        ...
    Name: product_id, Length: 21606079, dtype: int64
        619 
        620     @property
        621     def partial_fit(self):
        622         """Fit the model to data matrix X and target y.

    ...........................................................................
    C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in _fit(self=MLPClassifier(activation='logistic',
           alph...on_fraction=0.1, verbose=False, warm_start=False), X=          user_id  order_number  order_dow  orde...               7.0  

    [21606079 rows x 5 columns], y=1606        17762
    1610        17762
    1618        ...
    Name: product_id, Length: 21606079, dtype: int64, incremental=False)
        320         if not hasattr(hidden_layer_sizes, "__iter__"):
        321             hidden_layer_sizes = [hidden_layer_sizes]
        322         hidden_layer_sizes = list(hidden_layer_sizes)
        323 
        324         # Validate input parameters.
    --> 325         self._validate_hyperparameters()
            self._validate_hyperparameters = <bound method BaseMultilayerPerceptron._validate...n_fraction=0.1, verbose=False, warm_start=False)>
        326         if np.any(np.array(hidden_layer_sizes) <= 0):
        327             raise ValueError("hidden_layer_sizes must be > 0, got %s." %
        328                              hidden_layer_sizes)
        329 

    ...........................................................................
    C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py in _validate_hyperparameters(self=MLPClassifier(activation='logistic',
           alph...on_fraction=0.1, verbose=False, warm_start=False))
        386         if not isinstance(self.shuffle, bool):
        387             raise ValueError("shuffle must be either True or False, got %s." %
        388                              self.shuffle)
        389         if self.max_iter <= 0:
        390             raise ValueError("max_iter must be > 0, got %s." % self.max_iter)
    --> 391         if self.alpha < 0.0:
            self.alpha = array([  1.00000000e-01,   1.00000000e-02,   1.0...0000000e-04,   1.00000000e-05,   1.00000000e-06])
        392             raise ValueError("alpha must be >= 0, got %s." % self.alpha)
        393         if (self.learning_rate in ["constant", "invscaling", "adaptive"] and
        394                 self.learning_rate_init <= 0.0):
        395             raise ValueError("learning_rate_init must be > 0, got %s." %

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

It is hard to detect the real cause of this, and the search runs for a very long time before failing. Maybe it has something to do with the range of the alpha parameter search, or something else. How do I fix this? Thanks.


Answer:

Change 'alpha': [10.0 **-np.arange(1, 7)] to 'alpha': 10.0 **-np.arange(1, 7). As you can see in the documentation of np.arange(), the function itself already returns an array. Wrapping it in brackets (making it a list containing a single array) means GridSearchCV passes the entire array as one alpha value, and MLPClassifier's hyperparameter validation (the self.alpha < 0.0 check in the traceback above) then throws the ambiguous-truth-value exception.
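A minimal sketch of the difference, using sklearn's ParameterGrid to show how many candidate values each spelling actually produces (names here mirror the grid from the traceback):

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Wrong: a list containing one array. GridSearchCV sees a single
# candidate, and that candidate is the whole array.
wrong = {'alpha': [10.0 ** -np.arange(1, 7)]}

# Right: the array itself is the sequence of candidate values.
right = {'alpha': 10.0 ** -np.arange(1, 7)}

print(len(ParameterGrid(wrong)))  # 1 candidate (the entire array at once)
print(len(ParameterGrid(right)))  # 6 candidates: 1e-1 down to 1e-6
```

With the corrected spelling, each fit receives a scalar alpha, so the `self.alpha < 0.0` validation works as intended.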

Question:

I am trying to train a simple NN with one hidden layer for binary classification. I tried to use GridSearchCV to get the best parameters, but training won't go beyond the first epoch.

I'm not getting any values for best_parameters = gridSearchCV.best_params_ and best_accurcy = gridSearchCV.best_score_ after it stops.

Code
def build_classifier_grid(optimizer):
    classifier_grid = Sequential()
    classifier_grid.add(Dense(output_dim = 6, init = 'uniform',activation = 'relu', input_dim = 11))
    classifier_grid.add(Dense(output_dim = 6, init = 'uniform',activation = 'relu'))
    classifier_grid.add(Dense(output_dim = 1, init = 'uniform',activation = 'sigmoid'))
    classifier_grid.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier_grid

classifier_grid = KerasClassifier(build_fn = build_classifier_grid)

parameters = {'batch_size': [25,32], 
              'nb_epoch' : [100, 500],
              'optimizer': ['adam', 'rmsprop']}

gridSearchCV = GridSearchCV(estimator = classifier_grid, 
                            param_grid = parameters, 
                            scoring = 'accuracy', 
                            cv = 10)

gridSearchCV = gridSearchCV.fit(X_train, y_train)

Getting like:

Epoch 1/1
7200/7200 [==============================] - 5s 676us/step - loss: 0.5647 - acc: 0.7961
Epoch 1/1
7200/7200 [==============================] - 5s 681us/step - loss: 0.5626 - acc: 0.7950
Epoch 1/1
7200/7200 [==============================] - 5s 684us/step - loss: 0.5523 - acc: 0.7956
...
Epoch 1/1
7200/7200 [==============================] - 10s 1ms/step - loss: 0.6167 - acc: 0.7929
Epoch 1/1
8000/8000 [==============================] - 11s 1ms/step - loss: 0.5504 - acc: 0.7959

Answer:

It is not stuck at all; it is just training each model for only one epoch, which is the default value. The problem is that you use the parameter nb_epoch, whereas the correct name in Keras 2.x is epochs, so your value is silently ignored.
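A sketch of the corrected grid, assuming the rest of the setup stays as in the question: only the 'nb_epoch' key is renamed so the wrapper actually forwards it to fit.

```python
# 'nb_epoch' is the Keras 1.x name and is ignored by KerasClassifier in
# Keras 2.x, which falls back to the default of 1 epoch. Rename it:
parameters = {'batch_size': [25, 32],
              'epochs': [100, 500],          # was 'nb_epoch'
              'optimizer': ['adam', 'rmsprop']}
```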

Question:

I'm trying to do a grid search for a multi-class problem in neural networks. I am not able to get the optimum parameters; the kernel keeps on compiling. Is there any problem with my code? Please do help.

import keras

from keras.models import Sequential
from keras.layers import Dense

# defining the baseline model:

def neural(output_dim=10,init_mode='glorot_uniform'):
    model = Sequential()
    model.add(Dense(output_dim=output_dim,
                    input_dim=2,
                    activation='relu',
                    kernel_initializer= init_mode))
    model.add(Dense(output_dim=output_dim,
                    activation='relu',
                    kernel_initializer= init_mode))
    model.add(Dense(output_dim=3,activation='softmax'))

    # Compile model
    model.compile(loss='sparse_categorical_crossentropy', 
                  optimizer='adam', 
                  metrics=['accuracy'])
    return model

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.model_selection import GridSearchCV
estimator = KerasClassifier(build_fn=neural, 
                            epochs=5, 
                            batch_size=5, 
                            verbose=0)

# define the grid search parameters
batch_size = [10, 20, 40, 60, 80, 100]
epochs = [10, 50, 100]
init_mode = ['uniform', 'lecun_uniform', 'normal', 'zero', 
             'glorot_normal', 'glorot_uniform', 'he_normal', 'he_uniform']
output_dim = [10, 15, 20, 25, 30,40]

param_grid = dict(batch_size=batch_size, 
                  epochs=epochs,
                  output_dim=output_dim,
                  init_mode=init_mode)

grid = GridSearchCV(estimator=estimator, 
                    scoring= 'accuracy',
                    param_grid=param_grid, 
                    n_jobs=-1,cv=5)

grid_result = grid.fit(X_train, Y_train)

# summarize results

print("Best: %f using %s" % (grid_result.best_score_, 
                             grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Answer:

There's no error in your code.

Your current param grid has 864 different combinations of parameters possible.

(6 values in 'batch_size' × 3 values in 'epochs' × 8 in 'init_mode' × 6 in 'output_dim') = 864

GridSearchCV will iterate over all those possibilities and your estimator will be cloned that many times. And that is again repeated 5 times because you have set cv=5.

So your model will be cloned (compiled and params set according to the possibilities) a total of 864 x 5 = 4320 times.

So you keep seeing in the output that the model is being compiled that many times.
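The arithmetic above can be verified before launching the search; this is a hedged sanity check (not part of the original answer) using sklearn's ParameterGrid on the same grid as the question:

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the question.
param_grid = dict(batch_size=[10, 20, 40, 60, 80, 100],
                  epochs=[10, 50, 100],
                  init_mode=['uniform', 'lecun_uniform', 'normal', 'zero',
                             'glorot_normal', 'glorot_uniform',
                             'he_normal', 'he_uniform'],
                  output_dim=[10, 15, 20, 25, 30, 40])

n_candidates = len(ParameterGrid(param_grid))
print(n_candidates)      # 6 * 3 * 8 * 6 = 864 parameter combinations
print(n_candidates * 5)  # 4320 total fits with cv=5
```

Counting candidates this way up front makes it obvious when a grid is too large to finish in reasonable time.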

To check if GridSearchCV is working or not, use its verbose param.

grid = GridSearchCV(estimator=estimator, 
                    scoring= 'accuracy',
                    param_grid=param_grid, 
                    n_jobs=1,cv=5, verbose=3)

This will print the current possible params being tried, the cv iteration, time taken to fit on it, current accuracy etc.