I can't figure out whether I've set up my binary classification problem correctly. I labeled the positive class 1 and the negative class 0. However, it is my understanding that by default scikit-learn uses class 0 as the positive class in its confusion matrix (the inverse of how I set it up). This is confusing to me. In scikit-learn's default setting, is the top row the positive or the negative class? Let's assume this confusion matrix output:

confusion_matrix(y_test, preds)

[[30  5]
 [ 2 42]]

What would it look like as a labeled confusion matrix? Are the actual instances the rows or the columns in scikit-learn?

          prediction                        prediction

           0       1                          1       0

         -----   -----                      -----   -----

      0 | TN   |  FP        (OR)         1 |  TP  |  FP

actual   -----   -----             actual   -----   -----

      1 | FN   |  TP                     0 |  FN  |  TN

1 Answer

scikit-learn sorts the labels in ascending order, so 0 occupies the first row and column and 1 the second. With your labels, the top row is therefore the negative class (0).

For example:

>>> from sklearn.metrics import confusion_matrix as cm

>>> y_test = [1, 0, 0]

>>> y_pred = [1, 0, 0]

>>> cm(y_test, y_pred)

array([[2, 0],

       [0, 1]])

>>> y_pred = [4, 0, 0]

>>> y_test = [4, 0, 0]

>>> cm(y_test, y_pred)

array([[2, 0],

       [0, 1]])

>>> y_test = [-2, 0, 0]

>>> y_pred = [-2, 0, 0]

>>> cm(y_test, y_pred)

array([[1, 0],

       [0, 2]])

You can change this ordering by passing the labels argument to confusion_matrix:

>>> y_test = [1, 0, 0]

>>> y_pred = [1, 0, 0]

>>> cm(y_test, y_pred)

array([[2, 0],

       [0, 1]])

>>> cm(y_test, y_pred, labels=[1, 0])

array([[1, 0],

       [0, 2]])
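
To tie this back to the matrix in the question, here is a minimal sketch. The y_true and y_pred lists are made-up data, constructed only so that the default call reproduces the matrix from the question; they are not the asker's real data. Passing labels=[1, 0] moves the positive class to the first row and column:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical data built to reproduce the matrix from the question:
# 35 actual negatives (30 predicted as 0, 5 predicted as 1),
# 44 actual positives (2 predicted as 0, 42 predicted as 1).
y_true = [0] * 35 + [1] * 44
y_pred = [0] * 30 + [1] * 5 + [0] * 2 + [1] * 42

# Default: labels sorted ascending, so rows/columns are ordered 0, 1.
default_cm = confusion_matrix(y_true, y_pred)

# Explicit label order: rows/columns are ordered 1, 0.
flipped_cm = confusion_matrix(y_true, y_pred, labels=[1, 0])

print(default_cm)
print(flipped_cm)
```

Under these assumptions the default result is [[30, 5], [2, 42]], read as [[TN, FP], [FN, TP]], and the flipped result is [[42, 2], [5, 30]], read as [[TP, FN], [FP, TN]].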

Actual and predicted values are arranged as in your first diagram: actual values are in rows and predictions are in columns.

>>> y_test = [5, 5, 5, 0, 0, 0]

>>> y_pred = [5, 0, 0, 0, 0, 0]

>>> cm(y_test, y_pred)

array([[3, 0],

       [2, 1]])

  • true: 0, predicted: 0 (value: 3, position [0, 0])
  • true: 5, predicted: 0 (value: 2, position [1, 0])
  • true: 0, predicted: 5 (value: 0, position [0, 1])
  • true: 5, predicted: 5 (value: 1, position [1, 1])
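
The four cells listed above can also be read off in one step by flattening the 2x2 matrix row by row with ravel(). This is only a sketch: treating the larger label (5) as the positive class is an assumption made here for illustration, not something confusion_matrix itself decides.

```python
from sklearn.metrics import confusion_matrix

y_test = [5, 5, 5, 0, 0, 0]
y_pred = [5, 0, 0, 0, 0, 0]

# Rows are actual values, columns are predictions, labels sorted as 0, 5.
# Flattening row by row therefore yields the cells in the order
# [0, 0], [0, 1], [1, 0], [1, 1].
# Assumption for this sketch: label 5 plays the role of the positive class.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

print(tn, fp, fn, tp)  # 3 0 2 1
```

This unpacking only works for a binary problem, where the matrix has exactly four cells.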
