Hot questions on using neural networks for multiclass classification


Question:

So, I would like to calculate the ROC curve and AUC for my code, where I have 28 classes and an image can belong to several classes at the same time. For example, an image can belong to classes 1, 2 and 3 simultaneously. The label in y_true is a vector of 28 positions, with a 1 at the position of each class the image belongs to. For example, if an image belongs to classes 2, 3 and 5, positions 2, 3 and 5 of the vector are marked with 1 -> [0,0,1,1,0,1,0,0,0,...,0]
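For concreteness, here is a minimal sketch of that encoding (the helper name multi_hot and the use of NumPy are illustrative, not from the original code):

import numpy as np

def multi_hot(present_classes, n_classes=28):
    # Multi-hot label vector: a 1 at each position whose class is present.
    y = np.zeros(n_classes, dtype=np.float32)
    y[list(present_classes)] = 1.0
    return y

multi_hot([2, 3, 5])  # -> [0., 0., 1., 1., 0., 1., 0., ..., 0.]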

def data_validate(samples, loss, network, f1_class):
    x, y_true = samples  # x -> input values, y_true -> multi-hot labels
    x = x.cuda()  # to GPU
    y_true = y_true.cuda()  # to GPU
    y_pred = network(x)  # runs the forward pass from model.py on the {batch_size} inputs and returns the fc output
    y_pred = torch.sigmoid(y_pred)
    erro = loss(y_pred, y_true)
    f1_class.acumulate(y_pred.cpu().detach(), y_true.cpu().detach(), th=0.5)
    print(y_pred)
    for i in range(28):
        auc_score = roc_auc_score(y_true[:][i].cpu().detach(), y_pred.cpu().detach(), multi_class='ovr')

    return erro, y_pred.cpu().detach(), y_true.cpu().detach()

but I receive this error --> ValueError: Target scores need to be probabilities for multiclass roc_auc, i.e. they should sum up to 1.0 over classes


Answer:

Code changed to:

def data_validate(samples, loss, network, f1_class):
    x, y_true = samples  # x -> input values, y_true -> multi-hot labels
    x = x.cuda()  # to GPU
    y_true = y_true.cuda()  # to GPU
    y_pred = network(x)  # runs the forward pass from model.py on the {batch_size} inputs and returns the fc output
    y_pred = torch.sigmoid(y_pred)
    erro = loss(y_pred, y_true)
    f1_class.acumulate(y_pred.cpu().detach(), y_true.cpu().detach(), th=0.5)

    # Normalise each row so its 28 scores sum to 1, as the error message requires.
    row_sums = torch.sum(y_pred, dim=1, keepdim=True)
    y_pred = y_pred / row_sums

    # One AUC per image: each image's 28 scores are ranked against its multi-hot
    # labels, which is a binary problem, so multi_class='ovr' is no longer needed.
    for i in range(len(y_pred)):
        auc_score = roc_auc_score(y_true[i].cpu().detach(), y_pred[i].cpu().detach())

    return erro, y_pred.cpu().detach(), y_true.cpu().detach()

Like Priya said.
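Alternatively, scikit-learn can score multi-label targets directly, which avoids the normalisation step entirely; a minimal sketch, assuming y_true and y_pred are the (batch, 28) tensors from the function above (note that roc_auc_score raises an error for any class with no positive sample in the batch):

from sklearn.metrics import roc_auc_score

y_true_np = y_true.cpu().detach().numpy()
y_pred_np = y_pred.cpu().detach().numpy()

# average=None returns one ROC AUC per class; each class is scored as an
# independent binary problem, so the scores do not need to sum to 1.
per_class_auc = roc_auc_score(y_true_np, y_pred_np, average=None)
macro_auc = roc_auc_score(y_true_np, y_pred_np, average='macro')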

Question:

I am trying to use a fully connected neural network (multilayer perceptron) to perform multi-class classification. My training data (X) are different DNA strings of equal length. Each of these sequences has a floating-point value associated with it (e.g. t_X), which I use to simulate labels (y) for my data as follows: y ~ np.random.poisson(constant * t_X).

After training my Keras model (please see below), I made a histogram of the predicted labels and the test labels, and the issue I am facing is that my model seems to classify a lot of sequences incorrectly; please see the image linked below.

Histogram link

My training data looks like the following:

X , Y  
CTATTACCTGCCCACGGTAAAGGCGTTCTGG,    1
TTTCTGCCCGCGGCCTGGCAATTGATACCGC,    6
TTTTTACACGCCTTGCGTAAAGCGGCACGGC,    4
TTGCTGCCTGGCCGATGGTCTATGCCGCTGC,    7

I one-hot encode my Y's, and my X sequences are turned into tensors of dimensions (batch size, sequence length, number of characters); these numbers are something like 10,000 by 50 by 4.

My keras model looks like:

model = Sequential()
model.add(Flatten(input_shape=(50, 4)))  # flatten each (50, 4) one-hot sequence into a 200-dim vector
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(50, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(len(one_hot_encoded_labels), activation='softmax'))

I have tried the following different loss functions

#model.compile(loss='mean_squared_error',optimizer=Adam(lr=0.00001), metrics=['accuracy'])
#model.compile(loss='mean_squared_error',optimizer=Adam(lr=0.0001), metrics=['mean_absolute_error',r_square])
#model.compile(loss='kullback_leibler_divergence',optimizer=Adam(lr=0.00001), metrics=['categorical_accuracy'])
#model.compile(loss=log_poisson_loss,optimizer=Adam(lr=0.0001), metrics=['categorical_accuracy'])
#model.compile(loss='categorical_crossentropy',optimizer=Adam(lr=0.0001), metrics=['categorical_accuracy'])
model.compile(loss='poisson',optimizer=Adam(lr=0.0001), metrics=['categorical_accuracy'])

The loss behaves reasonably; it goes down and flattens out with increasing epochs. I have tried different learning rates, different optimizers, different numbers of neurons in each layer, different numbers of hidden layers and different types of regularization.

I think my model always puts most predicted labels around the peak of the test data (please see the linked histogram), but it is unable to classify the sequences with fewer counts in the test set. Is this a common problem?

Without going to other architectures (like convolutional or recurrent networks), does anyone know how I might be able to improve classification performance for this model?

Training data file


Answer:

From your histogram distributions, it is clear that you have a very imbalanced test data set, and I assume your training data has the same distribution. That may be why the network performs poorly: it doesn't have enough data for many of the classes to learn their features. You can try some sampling or weighting techniques so the network sees the classes in more balanced proportions.

Here is a link which explains various methods for handling such imbalanced data sets.
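For example, a minimal class-weighting sketch, assuming y_train_labels holds the integer class labels (a hypothetical name) and model is the Keras model above:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# 'balanced' weights each class inversely to its frequency in the training set.
classes = np.unique(y_train_labels)
weights = compute_class_weight(class_weight='balanced', classes=classes, y=y_train_labels)
class_weight = dict(zip(classes, weights))

# Keras scales each sample's contribution to the loss by its class weight.
model.fit(X_train, y_train, epochs=20, class_weight=class_weight)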

Second, you can check the model's performance with cross-validation, which makes it easy to see whether the error is reducible or irreducible. If it is irreducible, you can't improve any further (you would have to try another approach for that situation).

Third, there are correlations within each sequence, and a simple feed-forward network can't capture such relations; a recurrent network can capture these dependencies in the data set. Here is a simple example of that. The example is for binary classification, but it can be extended to multi-class as in your case.
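A minimal recurrent sketch for the (50, 4) one-hot DNA tensors described in the question; the layer size and n_classes are illustrative, not taken from the original code:

from keras.models import Sequential
from keras.layers import LSTM, Dense

n_classes = 10  # hypothetical; use the number of distinct labels in your data

model = Sequential()
model.add(LSTM(32, input_shape=(50, 4)))  # reads each sequence position by position
model.add(Dense(n_classes, activation='softmax'))  # class probabilities
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])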

As for loss-function selection, it is completely problem specific. You can check this link, which explains when each loss function can be helpful.

Question:

I am trying to figure out how to build a neural network in which, let's say, I have 3 output labels (A, B, C).

Now my data consists of rows in which 2 of the labels can be 1, e.g. A and B will be 1 and C will be 0. I want to train my neural network so that it predicts either A or B. I don't want it to be trained to give high probability to both A and B (as in multilabel problems); I want only one of them.

The reason for this is that the rows having 1 in both A and B are more like don't-care rows, for which predicting either A or B is correct. So I don't want the neural network to find a minimum where it tries to predict both A and B.

Is it possible to train neural network like this?


Answer:

I think using a weight is the best way I can think of for your application.

Define a weight w for each sample such that w = 0 if A = 1 and B = 1, and w = 1 otherwise. Now define your loss function as:

w * (CE(A) + CE(B)) + w' * min(CE(A), CE(B)) + CE(C)

where CE(A) gives the cross-entropy loss over label A, and w' denotes the complement of w. The loss function is quite simple to understand: it tries to predict both A and B correctly when A and B are not both 1; otherwise, it tries to predict either A or B correctly. Remember, which one of A and B will be predicted correctly cannot be known in advance, and it may not be consistent across batches. The model will always try to predict class C correctly.
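A minimal Keras/TensorFlow sketch of this loss, assuming a three-unit sigmoid output ordered [A, B, C]; the function name and shapes are illustrative:

import tensorflow as tf

def dont_care_loss(y_true, y_pred):
    eps = tf.keras.backend.epsilon()
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    # Element-wise binary cross-entropy for each of the three labels.
    ce = -(y_true * tf.math.log(y_pred) + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    ce_a, ce_b, ce_c = ce[:, 0], ce[:, 1], ce[:, 2]
    # w' = 1 only on the don't-care rows where A = 1 and B = 1; w is its complement.
    w_comp = y_true[:, 0] * y_true[:, 1]
    w = 1.0 - w_comp
    return w * (ce_a + ce_b) + w_comp * tf.minimum(ce_a, ce_b) + ce_c

It can then be passed to model.compile(loss=dont_care_loss, ...) like any other Keras loss.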

If you are using your own weights to indicate sample importance, then you should multiply the entire expression above by that weight.

However, I wouldn't be surprised if you get similar (or even better) performance with the classic multi-label loss function. Assuming an equal proportion of each label combination, only in 1/8th of cases are you allowing your network to predict either A or B; otherwise, the network has to predict all three of them correctly. Usually, the simpler loss functions work better.
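For comparison, the classic multi-label setup is just independent sigmoid outputs trained with binary cross-entropy; a minimal sketch with a hypothetical input dimension:

from keras.models import Sequential
from keras.layers import Dense

n_features = 20  # hypothetical input dimension

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=n_features))
model.add(Dense(3, activation='sigmoid'))  # one independent probability per label A, B, C
# binary_crossentropy applied per output unit is the classic multi-label loss
model.compile(optimizer='adam', loss='binary_crossentropy')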

Question:

I am using an ANN for multiclass classification (12 classes) in Python. However, I am getting errors. Here is the code snippet:

import keras
from keras.models import Sequential
from keras.layers import Dense

# Initialising the ANN
classifier = Sequential()

# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 8, kernel_initializer = 'uniform', activation = 'relu', input_dim = 4))

# Adding the second hidden layer
classifier.add(Dense(units = 8, kernel_initializer = 'uniform', activation = 'relu'))

# Adding the output layer
classifier.add(Dense(units = 13, kernel_initializer = 'uniform', activation = 'softmax'))

# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])

# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 200, epochs = 100)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

The program runs all the way through training the network and also computes y_pred. After that I get the following error, i.e. the confusion matrix is not formed.

The error:

ValueError: Classification metrics can't handle a mix of multiclass and continuous-multioutput targets


Answer:

import numpy as np
from sklearn.metrics import confusion_matrix

y_pred = classifier.predict(X_test)  # (n_samples, n_classes) array of class probabilities

# Collapse each probability row to the index of its most likely class.
predictions = np.argmax(y_pred, axis=-1)

cm = confusion_matrix(y_test, predictions)
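Note that this assumes y_test holds integer class labels; if it is one-hot encoded, apply np.argmax(y_test, axis=-1) to it as well before calling confusion_matrix.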

I hope this resolves your problem.

Question:

I am trying to build a model to predict the damage grade of a house. I am using Keras for this.

There are 5 values to be predicted in the column 'damage_grade', ranging from 1 to 5. The higher the number, the more damage a house has sustained.

I would also like to mention that I am a beginner in Keras and this is my first Keras model. I am building it with help from the Keras documentation.

My code is:

X_train = rtrain_df.drop("damage_grade", axis=1) 
Y_train = rtrain_df["damage_grade"] 
X_test = rtest_df.drop("building_id", axis=1).copy() 
X_train.shape, Y_train.shape, X_test.shape

import keras 
from keras.models import Sequential 
from keras.layers import Dense, Dropout, Activation 
from keras.optimizers import SGD

model = Sequential() 
model.add(Dense(64, activation='relu', input_dim=46)) # there are 46 features in my dataset to be trained 
model.add(Dropout(0.5)) 
model.add(Dense(64, activation='relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(X_train, Y_train, epochs=20, batch_size=128)

When trying to fit the model it gives the following error:

ValueError: Error when checking target: expected dense_6 to have shape (10,) but got array with shape (1,)

There are around 600,000 records to be trained.


Answer:

There are a few errors in your code:

  • You've given 10 units in the last Dense layer. It must equal the number of values to be predicted, i.e. 5.
  • You must convert your Y_train into a categorical (one-hot) array with 5 columns ('damage_grade' shifted from 1-5 to 0-4).

Below is the corrected code:

X_train = rtrain_df.drop("damage_grade", axis=1) 
Y_train = rtrain_df["damage_grade"] 
X_test = rtest_df.drop("building_id", axis=1).copy() 
X_train.shape, Y_train.shape, X_test.shape

import keras 
from keras.models import Sequential 
from keras.layers import Dense, Dropout, Activation 
from keras.optimizers import SGD

from keras.utils import np_utils
Y_train_cat = np_utils.to_categorical(Y_train - 1) # shift labels 1-5 to 0-4, giving 5 one-hot columns

model = Sequential() 
model.add(Dense(64, activation='relu', input_dim=46))
model.add(Dropout(0.5)) 
model.add(Dense(64, activation='relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(5, activation='softmax')) 

# The last Dense layer is the output layer that produces the probabilities
# for the 5 classes.

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(X_train, Y_train_cat, epochs=20, batch_size=128)

import numpy as np

predictions = model.predict(X_test)
result = np.argmax(predictions, axis=1) # index (0-4) of the class with the highest probability
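Since the labels were shifted down by one before training (see the to_categorical call above), map the predictions back to the original scale with result + 1 to obtain damage grades from 1 to 5.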