## Scikit-learn confusion matrix


I can't figure out whether I've set up my binary classification problem correctly. I labeled the positive class 1 and the negative class 0. However, it is my understanding that by default scikit-learn uses class 0 as the positive class in its confusion matrix (so the inverse of how I set it up). This is confusing to me. Is the top row, in scikit-learn's default setting, the positive or the negative class? Let's assume this confusion matrix output:

```python
confusion_matrix(y_test, preds)

[[30  5]
 [ 2 42]]
```

How would this look in a confusion matrix? Are the actual instances the rows or the columns in scikit-learn?

```
            prediction                        prediction
             0     1                           1     0
           ----- -----                       ----- -----
actual 0 |  TN  |  FP  |    (OR)   actual 1 |  TP  |  FN  |
       1 |  FN  |  TP  |                  0 |  FP  |  TN  |
```

**Confusion matrix,** Example of confusion matrix usage to evaluate the quality of the output of a classifier on the iris data set. The diagonal elements represent the number of points for which the predicted label is equal to the true label, while off-diagonal elements are those that are mislabeled by the classifier.
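A minimal sketch of that iris example, assuming a recent scikit-learn where `ConfusionMatrixDisplay.from_estimator` is available; the particular split and classifier here are illustrative, not the documentation's exact choices:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Rows are true labels, columns are predicted labels;
# the diagonal holds the correctly classified counts.
print(confusion_matrix(y_test, clf.predict(X_test)))

# Plot the same matrix as a colored grid.
ConfusionMatrixDisplay.from_estimator(clf, X_test, y_test)
```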

Following the example from Wikipedia: if a classification system has been trained to distinguish between cats and non-cats, a confusion matrix will summarize the results of testing the algorithm for further inspection. Assuming a sample of 27 animals (8 cats and 19 non-cats), the resulting confusion matrix could look like the table below:

```
                   Actual
                 cat   non-cat
pred   cat         5         2
       non-cat     3        17
```

**With sklearn**

If you want to maintain the structure of the Wikipedia confusion matrix, pass the predicted values first and then the actual values:

```python
from sklearn.metrics import confusion_matrix

y_true = [0,0,0,1,0,0,1,0,0,1,0,1,0,0,0,0,1,0,0,1,1,0,1,0,0,0,0]
y_pred = [0,0,0,1,0,0,1,0,0,1,0,1,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0]

confusion_matrix(y_pred, y_true, labels=[1,0])
```

```
Out[1]:
array([[ 5,  2],
       [ 3, 17]], dtype=int64)
```
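If you prefer to work with class names directly, `confusion_matrix` also accepts string labels; a small sketch using the same data, where the `'cat'`/`'non-cat'` mapping is the one implied above:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0,0,0,1,0,0,1,0,0,1,0,1,0,0,0,0,1,0,0,1,1,0,1,0,0,0,0]
y_pred = [0,0,0,1,0,0,1,0,0,1,0,1,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0]

# Map 1 -> 'cat', 0 -> 'non-cat'; `labels` then fixes the row/column order.
true_names = np.where(np.array(y_true) == 1, 'cat', 'non-cat')
pred_names = np.where(np.array(y_pred) == 1, 'cat', 'non-cat')

confusion_matrix(pred_names, true_names, labels=['cat', 'non-cat'])
# array([[ 5,  2],
#        [ 3, 17]])
```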

**Another way, with pandas crosstab**

```python
import numpy as np
import pandas as pd

true = pd.Categorical(list(np.where(np.array(y_true) == 1, 'cat', 'non-cat')),
                      categories=['cat', 'non-cat'])
pred = pd.Categorical(list(np.where(np.array(y_pred) == 1, 'cat', 'non-cat')),
                      categories=['cat', 'non-cat'])

pd.crosstab(pred, true, rownames=['pred'], colnames=['Actual'], margins=False)
```

```
Out[2]:
Actual   cat  non-cat
pred
cat        5        2
non-cat    3       17
```
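As a usage note, `pd.crosstab` can also normalize the table for you, which is handy when you care about rates rather than raw counts; `normalize='index'` below gives per-predicted-class proportions:

```python
# Each row sums to 1: the fraction of each predicted class
# that was actually 'cat' vs 'non-cat'.
pd.crosstab(pred, true, rownames=['pred'], colnames=['Actual'], normalize='index')
```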

I hope this helps.

**sklearn.metrics.plot_confusion_matrix,** Plot Confusion Matrix. Read more in the User Guide. Parameters: `estimator` — a fitted classifier instance. `normalize` — normalizes the confusion matrix over the true (rows) or predicted (columns) conditions, or over all the population; if None, the confusion matrix will not be normalized. `display_labels` — array-like of shape (n_classes,), default=None; target names used for plotting.
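A sketch of that plotting API. Note that `plot_confusion_matrix` was deprecated and removed in later scikit-learn releases, so this assumes the `ConfusionMatrixDisplay` equivalent; the toy data is mine:

```python
from sklearn.metrics import ConfusionMatrixDisplay

y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

# Normalize over the true condition (rows), so each row sums to 1.
ConfusionMatrixDisplay.from_predictions(
    y_true,
    y_pred,
    normalize='true',
    display_labels=['negative', 'positive'],  # order matches sorted labels [0, 1]
)
```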

**Short answer**
In binary classification, when using the argument `labels`,

```python
confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0], labels=[0, 1]).ravel()
```

the class labels, `0` and `1`, are considered to be `Negative` and `Positive`, respectively. This is due to the order implied by the list, and not the alpha-numerical order.
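This is why the common idiom is to unpack the output of `ravel()` in exactly that order; for the call above:

```python
from sklearn.metrics import confusion_matrix

# ravel() flattens row by row, so with labels=[0, 1] (negative first)
# the order is: tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0], labels=[0, 1]).ravel()
print(tn, fp, fn, tp)  # 0 2 1 1
```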

**Verification:**
Consider imbalanced class labels like this (using imbalanced classes to make the distinction easier):

```python
>>> y_true = [0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0]
>>> y_pred = [0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0]
>>> table = confusion_matrix(y_true, y_pred, labels=[0,1]).ravel()
```

this would give you a confusion table as follows:

```python
>>> table
array([12,  1,  2,  1])
```

which corresponds to:

```
             Actual
              1        0
         ___________________
pred  1 |  TP=1   |  FP=1   |
      0 |  FN=2   |  TN=12  |
```

where `FN=2` means that there were 2 cases where the model predicted the sample to be negative (i.e., `0`) but the actual label was positive (i.e., `1`), hence the False Negative count equals 2.

Similarly for `TN=12`: in 12 cases the model correctly predicted the negative class (`0`), hence the True Negative count equals 12.

This way everything adds up, assuming that `sklearn` considers the first label (in `labels=[0,1]`) as the negative class. Therefore, here, `0`, the first label, represents the negative class.
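A quick way to see that it is the `labels` order, not the alpha-numeric order, that drives the layout is to reverse the list on the same data; this check reuses the `y_true`/`y_pred` from the verification above:

```python
>>> confusion_matrix(y_true, y_pred, labels=[1,0]).ravel()
array([ 1,  2,  1, 12])
```

With `labels=[1,0]`, class `1` now comes first, so `ravel()` returns the counts as tp, fn, fp, tn — the mirror of the default order.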

**sklearn.metrics.confusion_matrix,** A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It can be used to evaluate the performance of a classification model. By definition a confusion matrix \(C\) is such that \(C_{i,j}\) is equal to the number of observations known to be in group \(i\) but predicted to be in group \(j\). Thus in binary classification, the count of true negatives is \(C_{0,0}\), false negatives is \(C_{1,0}\), true positives is \(C_{1,1}\) and false positives is \(C_{0,1}\).

**Supporting Answer:**

When reading the confusion matrix values produced by **sklearn.metrics**, be aware that the order of the values is

**[ True Negative   False Positive ]
[ False Negative  True Positive ]**

If you interpret the values wrong, say TP for TN, your accuracy and ROC AUC will more or less match, but your **precision, recall, sensitivity, and F1-score will take a hit**, and you will end up with completely different metrics. This will lead you to a false judgement of your model's performance.

Do make sure to clearly identify what the 1 and 0 in your model represent. This heavily dictates the results of the confusion matrix.
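To make that concrete, here is a small sketch (the counts reuse the verification table above) showing how swapping TP and TN leaves accuracy untouched while precision and recall change completely:

```python
# Counts from the verification table above: tn, fp, fn, tp.
tn, fp, fn, tp = 12, 1, 2, 1
total = tn + fp + fn + tp

# Correct reading of the matrix:
precision = tp / (tp + fp)     # 1/2   = 0.50
recall    = tp / (tp + fn)     # 1/3   ≈ 0.33
accuracy  = (tp + tn) / total  # 13/16 ≈ 0.81

# Reading the matrix upside down swaps TP<->TN and FP<->FN:
precision_swapped = tn / (tn + fn)  # 12/14 ≈ 0.86
recall_swapped    = tn / (tn + fp)  # 12/13 ≈ 0.92
# Accuracy is symmetric under this swap, so it still comes out 13/16
# and gives no hint that the cells were misread.
```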

**Experience:**

I was working on predicting fraud (binary supervised classification), where fraud was denoted by 1 and non-fraud by 0. My model was trained on a **scaled-up, perfectly balanced data set**, so during in-time testing the values of the confusion matrix did not look suspicious when I read my results in the order
**[TP FP]
[FN TN]**

Later, when I had to perform an **out-of-time test on a new, imbalanced test set**, I realized that the above order of the confusion matrix was **wrong**, and different from the one given on sklearn's documentation page, which states the order as **tn, fp, fn, tp**. Plugging in the correct order made me realize the blunder and what a difference it had made in my judgement of the model's performance.

**How to create a confusion matrix in Python using scikit-learn,** scikit-learn sorts labels in ascending order, thus 0's are the first column/row and 1's are the second one.
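A small sketch of that default ascending sort; with no `labels` argument, the distinct class values are ordered automatically (the toy data is mine):

```python
>>> from sklearn.metrics import confusion_matrix as cm
>>> # Without `labels`, sklearn sorts the classes ascending: [0, 1],
>>> # so row/column 0 is class 0 and row/column 1 is class 1.
>>> cm([1, 0, 1, 1], [1, 0, 0, 1])
array([[1, 0],
       [1, 2]])
```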

**Scikit-learn confusion matrix,** A confusion matrix is a summary of prediction results on a classification problem. The ConfusionMatrix visualizer is a ScoreVisualizer that takes a fitted scikit-learn classifier and a set of test X and y values and returns a report showing how each test value's predicted class compares to its actual class.

**Confusion Matrix in Machine Learning,** In a multilabel confusion matrix `MCM`, the count of true negatives is `MCM[:, 0, 0]`, false negatives is `MCM[:, 1, 0]`, true positives is `MCM[:, 1, 1]`, and false positives is `MCM[:, 0, 1]`. Multiclass data will be treated as if binarized under a one-vs-rest transformation.
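A short sketch of that per-class layout, using `sklearn.metrics.multilabel_confusion_matrix` on multiclass data; the toy labels are mine:

```python
from sklearn.metrics import multilabel_confusion_matrix

y_true = [0, 1, 2, 2, 0]
y_pred = [0, 2, 2, 1, 0]

# One 2x2 matrix per class, each laid out as [[tn, fp], [fn, tp]]
# under a one-vs-rest binarization.
mcm = multilabel_confusion_matrix(y_true, y_pred)
print(mcm.shape)  # (3, 2, 2)
print(mcm[0])     # class 0: [[3 0]
                  #           [0 2]]
```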

**Scikit Learn: Confusion Matrix, Accuracy, Precision and Recall,** `sklearn.metrics.confusion_matrix(y_true, y_pred, labels=None)` computes a confusion matrix to evaluate the accuracy of a classification. Given an array or list of expected values and a list of predictions from your machine learning model, the `confusion_matrix()` function will calculate the confusion matrix and return the result as an array.
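Finally, a minimal multi-class sketch of that call (labels sorted ascending by default; the toy data is illustrative):

```python
from sklearn.metrics import confusion_matrix

expected  = [2, 0, 2, 2, 0, 1]
predicted = [0, 0, 2, 2, 0, 2]

# Rows are true classes [0, 1, 2]; columns are predicted classes.
confusion_matrix(expected, predicted)
# array([[2, 0, 0],
#        [0, 0, 1],
#        [1, 0, 2]])
```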