Hot questions on using neural networks in AdaBoost

Question:

I implemented AdaBoost for a project, but I'm not sure I've understood AdaBoost correctly. Here's what I implemented; please let me know if it is a correct interpretation.

1. My weak classifiers are 8 different neural networks. Each of these predicts with around 70% accuracy after full training.
2. I train all these networks fully and collect their predictions on the training set, so I have 8 vectors of predictions on the training set.

Now I use AdaBoost. My interpretation of AdaBoost is that it will find a final classifier as a weighted average of the classifiers I have trained above, and its role is to find these weights. So, for every training example I have 8 predictions, and I'm combining them using AdaBoost weights. Note that with this interpretation, the weak classifiers are not retrained during the AdaBoost iterations; only the weights are updated. But the updated weights in effect create new classifiers in each iteration.

Here's the pseudo code:

```
all_alphas = []
all_classifier_indices = []
initialize all training example weights to 1/(num of examples)
compute error for all 8 networks on the training set
for i in 1 to T:
    find the classifier with lowest weighted error.
    compute the weights (alpha) according to the Adaboost confidence formula
    Update the weight distribution, according to the weight update formula in Adaboost.
    all_alphas.append(alpha)
    all_classifier_indices.append(selected_classifier)
```

After `T` iterations, there are `T` alphas and `T` classifier indices; each of these `T` classifier indices points to one of the 8 neural net prediction vectors.

Then on the test set, for every example, I predict by summing over `alpha*classifier`.
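The procedure described above can be made concrete with a minimal Python sketch. It assumes binary labels and predictions encoded as +1/-1 and uses the standard discrete AdaBoost formulas (`alpha = 0.5 * ln((1 - err) / err)` and the exponential weight update). The function names `boost_fixed_classifiers` and `predict` are illustrative and do not come from any library. The weighted errors are recomputed in every round, because the example weights change.

```python
import math


def boost_fixed_classifiers(predictions, labels, T):
    """Combine fixed, pre-trained classifiers with AdaBoost-style weights.

    predictions: one list of +1/-1 training predictions per classifier
    labels:      list of +1/-1 training labels
    Returns (alphas, classifier indices) chosen over at most T rounds.
    """
    n = len(labels)
    weights = [1.0 / n] * n  # uniform example weights
    all_alphas, all_classifier_indices = [], []

    for _ in range(T):
        # weighted error of every classifier under the current distribution
        errors = [
            sum(w for w, p, y in zip(weights, preds, labels) if p != y)
            for preds in predictions
        ]
        best = min(range(len(predictions)), key=lambda k: errors[k])
        err = errors[best]
        if err >= 0.5:  # no classifier beats chance on this distribution
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect classifier
        alpha = 0.5 * math.log((1.0 - err) / err)

        # raise the weight of misclassified examples, lower the rest, renormalize
        weights = [
            w * math.exp(-alpha * y * p)
            for w, p, y in zip(weights, predictions[best], labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]

        all_alphas.append(alpha)
        all_classifier_indices.append(best)

    return all_alphas, all_classifier_indices


def predict(alphas, indices, test_predictions):
    """Sign of the alpha-weighted sum of the selected classifiers' +1/-1 outputs."""
    n = len(test_predictions[0])
    scores = [
        sum(a * test_predictions[k][i] for a, k in zip(alphas, indices))
        for i in range(n)
    ]
    return [1 if s >= 0 else -1 for s in scores]
```

Here `test_predictions` has the same layout as `predictions`, but holds each network's outputs on the test set.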

I want to use AdaBoost with neural networks, but I think I've misinterpreted the AdaBoost algorithm.

Boosting summary:

1- Train your first weak classifier on the training data.

2- The first trained classifier makes mistakes on some samples and correctly classifies others. Increase the weight of the wrongly classified samples and decrease the weight of the correctly classified ones. Retrain your classifier with these weights to get your second classifier.

In your case, you first have to resample with replacement from your data according to these updated weights to create a new training set, and then train your classifier on this new set.

3- Repeat the second step T times, and at the end of each round calculate the alpha weight for the classifier according to the formula.

4- The final classifier is the weighted sum of the decisions of the T classifiers.
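The steps above can be sketched in Python as follows. This is only a sketch under stated assumptions: labels are +1/-1, `train` is a hypothetical callable that fits a fresh weak learner (for example a small network) on the resampled data and returns a prediction function, and `train_stump` is a toy stand-in learner for 1-D inputs that exists only to make the example runnable.

```python
import math
import random


def adaboost_by_resampling(X, y, train, T, seed=0):
    """AdaBoost in which every round retrains on a weighted resample.

    train(X, y) must return a function model(x) -> +1 or -1 (assumed interface).
    Returns a list of (alpha, model) pairs.
    """
    rng = random.Random(seed)
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []

    for _ in range(T):
        # resample with replacement according to the current weights
        idx = rng.choices(range(n), weights=weights, k=n)
        model = train([X[i] for i in idx], [y[i] for i in idx])

        # the weighted error is measured on the original training set
        preds = [model(x) for x in X]
        err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
        if err >= 0.5:  # not better than chance: drop it and draw a new resample
            continue
        err = max(err, 1e-10)  # avoid division by zero for a perfect learner
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, model))

        # increase the weight of mistakes, decrease the rest, renormalize
        weights = [w * math.exp(-alpha * t * p) for w, p, t in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]

    return ensemble


def ensemble_predict(ensemble, x):
    """Sign of the alpha-weighted sum of the individual decisions."""
    score = sum(alpha * model(x) for alpha, model in ensemble)
    return 1 if score >= 0 else -1


def train_stump(X, y):
    """Toy stand-in weak learner: best threshold and sign on 1-D inputs."""
    best = None
    for thr in sorted(set(X)):
        for sign in (1, -1):
            errs = sum(1 for x, t in zip(X, y) if (sign if x >= thr else -sign) != t)
            if best is None or errs < best[0]:
                best = (errs, thr, sign)
    _, thr, sign = best
    return lambda x: sign if x >= thr else -sign


X = list(range(10))
y = [-1] * 5 + [1] * 5
ensemble = adaboost_by_resampling(X, y, train_stump, T=5)
print([ensemble_predict(ensemble, x) for x in X])
```

Skipping a round whose learner is no better than chance is one possible design choice; other implementations stop boosting at that point instead.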

Hopefully it is clear from this explanation that you have done it a bit wrongly. Instead of retraining your networks on the reweighted data set in each round, you trained them all on the original data set. In effect, you are using a random-forest-type ensemble (except that you are using NNs instead of decision trees).

PS: There is no guarantee that boosting increases accuracy. In fact, all the boosting methods I'm aware of have so far failed to improve accuracy with NNs as weak learners (the reason lies in the way boosting works and needs a lengthier discussion).

Question:

For a binary classification problem I want to use the `MLPClassifier` as the base estimator in the `AdaBoostClassifier`. However, this does not work because `MLPClassifier` does not implement `sample_weight`, which is required for `AdaBoostClassifier` (see here). Before that, I tried using a Keras model and the `KerasClassifier` within `AdaBoostClassifier`, but that also did not work, as mentioned here.

One approach, proposed by user V1nc3nt, is to build a custom `MLPclassifier` in TensorFlow that takes the `sample_weight` into account.

User V1nc3nt shared large parts of his code, but since I have only limited experience with TensorFlow, I am not able to fill in the missing parts. Hence, I was wondering if anyone has found a working solution for building AdaBoost ensembles from MLPs, or can help me complete the solution proposed by V1nc3nt.

Thank you very much in advance!

Based on the references you have given, I have modified `MLPClassifier` to accommodate `sample_weight` by resampling the training data according to the weights.

Try this!

```
import numpy as np
from sklearn.neural_network import MLPClassifier


class customMLPClassifer(MLPClassifier):
    def resample_with_replacement(self, X_train, y_train, sample_weight):
        X_train, y_train = np.asarray(X_train), np.asarray(y_train)

        # normalize sample_weight so it sums to 1 and can be used as probabilities
        sample_weight = np.asarray(sample_weight, dtype=np.float64)
        sample_weight = sample_weight / sample_weight.sum()

        # draw len(X_train) row indices from 0 to len(X_train)-1 with replacement;
        # rows with a larger weight are drawn more often
        draws = np.random.choice(len(X_train), size=len(X_train), p=sample_weight)

        # build the resampled X and y from the rows at the drawn indices
        return X_train[draws], y_train[draws]

    def fit(self, X, y, sample_weight=None):
        # emulate sample weighting by training on a weighted bootstrap sample
        if sample_weight is not None:
            X, y = self.resample_with_replacement(X, y, sample_weight)

        # the public parent fit() takes care of warm_start itself
        return super().fit(X, y)
```

Question:

Hi, I have recently been taking a course and doing some reading on AdaBoost.

I have looked at some code that uses AdaBoost to boost the performance of a neural network.

As far as I know, with multiple classes AdaBoost can be done as follows:

(1) Initialize the weight of every training example to 1.

(2) After training, re-weight the data: increase the weight of an example if the classifier gets it wrong, and reduce the weight if the classifier predicts it correctly.

(3) Finally, combine all the classifiers and take the class with the maximum score (probability).
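As a rough illustration of steps (1) to (3), here is a minimal sketch of one round of the multi-class SAMME variant of AdaBoost. It assumes classes are encoded as integers `0..K-1` and that the weights start out equal (`1/N` each rather than 1, which is equivalent after normalization). `samme_round` and `samme_predict` are illustrative names, not library functions.

```python
import math


def samme_round(weights, predictions, labels, n_classes):
    """One multi-class AdaBoost (SAMME) update. Returns (alpha, new_weights)."""
    total = sum(weights)
    err = sum(w for w, p, t in zip(weights, predictions, labels) if p != t) / total
    err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite

    # the log(K - 1) term keeps alpha positive as long as the classifier
    # beats random guessing among K classes (error < 1 - 1/K)
    alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1.0)

    # step (2): only misclassified examples are scaled up; renormalizing
    # then lowers the relative weight of the correctly classified ones
    new_weights = [
        w * math.exp(alpha) if p != t else w
        for w, p, t in zip(weights, predictions, labels)
    ]
    norm = sum(new_weights)
    return alpha, [w / norm for w in new_weights]


def samme_predict(alphas, votes, n_classes):
    """Step (3): votes[m] is classifier m's class for one example;
    return the class with the largest total alpha."""
    scores = [0.0] * n_classes
    for alpha, c in zip(alphas, votes):
        scores[c] += alpha
    return max(range(n_classes), key=lambda c: scores[c])
```

With two classes the `log(K - 1)` term vanishes, and the update reduces to binary AdaBoost up to a constant factor in alpha.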

I was able to write some code for this with Keras and sklearn:

```
model = Model(img_input, o)
model.fit_generator(...)  # some parameters
```

My question is:

I would like to know how AdaBoost is used with a neural network.

I can imagine two ways to do this, but I am not sure which one AdaBoost uses:

(1) After complete training (1 hour), we re-weight the training data, then train again, and repeat until the iterations are over.

(2) Once all the data has been fed through the neural network for the first time, we re-weight the training data.

The difference between (1) and (2) is how we define one iteration in AdaBoost:

(1) would take too long to complete all the iterations.

(2) somehow doesn't make sense to me, because I don't think the whole process would converge that fast, or else the number of iterations would need to be set very large.