A Guide to Intuitive Confusion Matrices


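Introduction to Confusion Matrices
When making discrete predictions, we usually want to assess a model's quality with more than a single summary metric such as accuracy. The usual tool for this is a confusion matrix plot. Its color scale, however, can be misleading and unintuitive. Here we enhance the standard confusion matrix so that your results come across at a glance. For readability, we call this enhanced version the "coin-flip confusion matrix" (CCM).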

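The confusion matrix is a classic tool for evaluating a model in more detail. When the results have to be communicated simply, we can modify the standard matrix, for example by normalizing its color scale, so that it reads more intuitively.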

First, we simulate some toy data. For simplicity we start with 3 classes, i.e. 3 distinct data labels (n_classes = 3). The figure below shows the dataset in 2D space.

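Second, we split the data into a training set and a test set and fit two models: a logistic regression and a "dummy model". The dummy model predicts at random, so it has no predictive power and serves as the baseline against which we compare the logistic regression.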

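# generate the data and make predictions with our two models
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

n_classes = 3
X, y = make_classification(n_samples=10000, n_features=10,
                           n_classes=n_classes, n_clusters_per_class=1,
                           n_informative=10, n_redundant=0)
y = y.astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
# dummy model: uniformly random class labels, no predictive power
prediction_naive = np.random.randint(low=0, high=n_classes, size=len(y_test))
# logistic regression fitted on the training set
clf = LogisticRegression().fit(X_train, y_train)
prediction = clf.predict(X_test)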
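
As a hedged aside, the random baseline can also be expressed with scikit-learn's DummyClassifier. The sketch below is an alternative formulation, not the article's own code; the random_state values and the names dummy, prediction_dummy and acc_dummy are assumptions made for illustration. With balanced classes, its accuracy should land close to 1 / n_classes.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

n_classes = 3
# same data settings as in the article; random_state is an assumption for reproducibility
X, y = make_classification(n_samples=10000, n_features=10,
                           n_classes=n_classes, n_clusters_per_class=1,
                           n_informative=10, n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

# strategy="uniform" draws every class with equal probability, ignoring the features
dummy = DummyClassifier(strategy="uniform", random_state=0).fit(X_train, y_train)
prediction_dummy = dummy.predict(X_test)
acc_dummy = accuracy_score(y_test, prediction_dummy)
print(f"dummy accuracy: {acc_dummy:.2f} (chance level: {1 / n_classes:.2f})")
```

Using an estimator for the baseline keeps both models behind the same fit/predict interface, which can be convenient when the comparison is later wrapped in a loop.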

[Figure: the simulated dataset visualized in 2D]

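The Standard Confusion Matrix
We now have predictions from both models. Which one performs better? To give a more fine-grained answer, we compute a confusion matrix.
First, I plot the confusion matrices with the default color bar. Its colormap is centered at 0.5 (white) and ranges from 0 (green) to 1 (pink). For the good model we see a difference in hue (pink versus green), whereas for the bad model the main diagonal and the off-diagonal cells look alike. Still, this tells us little about the model's properties. A false positive rate (FPR) of 25% is colored green on the off-diagonal, but is that really an improvement over a naive prediction? What about an FPR of 40%? That higher FPR would also be colored light green. Yet with 3 classes a random prediction yields roughly 33% per cell, so this prediction would be worse than random!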

# plot_cm_standard is defined in the code listing further below
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 8))
plot_cm_standard(y_true=y_test, y_pred=prediction, title="Awesome Model",
                 list_classes=[str(i) for i in range(n_classes)],
                 normalize="prediction", ax=ax1)
plot_cm_standard(y_true=y_test, y_pred=prediction_naive, title="Rolling Dice Model",
                 list_classes=[str(i) for i in range(n_classes)],
                 normalize="prediction", ax=ax2)
plt.show()

[Figure: standard confusion matrices of the two models]
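
To make the "worse than random" argument concrete, here is a minimal numerical sketch. It assumes balanced classes and uniformly random predictions, builds a confusion matrix by hand with NumPy, and normalizes each column (each predicted class) to sum to 1. The sample size, the seed and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 3
n_samples = 30_000

# balanced true labels and uniformly random predictions
y_true = rng.integers(0, n_classes, size=n_samples)
y_rand = rng.integers(0, n_classes, size=n_samples)

# confusion matrix: rows = true class, columns = predicted class
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (y_true, y_rand), 1)

# normalize by prediction: every column sums to 1
cm_by_prediction = cm / cm.sum(axis=0, keepdims=True)

chance_level = 1 / n_classes
max_dev = np.abs(cm_by_prediction - chance_level).max()
print(np.round(cm_by_prediction, 2))
print(f"largest deviation from 1/{n_classes}: {max_dev:.3f}")
```

Every cell of the random model sits near 1/3, so under these assumptions an off-diagonal value of 0.25 beats chance while 0.40 does not, even though a color scale centered at 0.5 paints both green.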

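Now for the secret sauce: CM_Norm adjusts the color bar so that its midpoint equals the expected accuracy of a random prediction. In essence, the "naive prediction accuracy" is our origin, because a model that predicts worse than a coin flip is not a useful model to begin with (hence the name "coin-flip confusion matrix"). In other words, we are interested in the model's excess performance rather than its absolute error rate. Two examples: with 3 classes the origin of the color bar is set to 1/3, and with 10 classes it is set to 1/10.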
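
The re-centering can be illustrated without any plotting. The sketch below uses a hypothetical helper, recenter, that applies the same piecewise-linear mapping as CM_Norm (minimum to 0, midpoint to 0.5, maximum to 1) via np.interp; the helper name and the example cell values are assumptions for illustration.

```python
import numpy as np


def recenter(values, midpoint, vmin=0.0, vmax=1.0):
    """Map vmin -> 0, midpoint -> 0.5, vmax -> 1, linearly in between (hypothetical helper)."""
    return np.interp(values, [vmin, midpoint, vmax], [0.0, 0.5, 1.0])


n_classes = 3
chance = 1 / n_classes

# example cell values: perfect 0, the 25% case, chance level, the 40% case, perfect 1
cells = np.array([0.0, 0.25, chance, 0.40, 1.0])
positions = recenter(cells, midpoint=chance)

for cell, pos in zip(cells, positions):
    print(f"cell value {cell:.2f} -> colormap position {pos:.3f}")
```

With the midpoint at 1/3, a cell value of 0.25 falls on the low side of the colormap's center and 0.40 on the high side, whereas a scale centered at 0.5 places both below the center. Matplotlib also ships matplotlib.colors.TwoSlopeNorm, which provides a comparable two-segment scaling around a center value.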

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib.colors import Normalize
from sklearn.metrics import accuracy_score, confusion_matrix


class CM_Norm(Normalize):
    """ normalize the colorbar around a value
    """
    def __init__(self, vmin=None, vmax=None, midpoint=None, clip=False):
        self.midpoint = midpoint
        super().__init__(vmin, vmax, clip)

    def __call__(self, value, clip=None):
        # piecewise-linear map: vmin -> 0, midpoint -> 0.5, vmax -> 1
        x, y = [self.vmin, self.midpoint, self.vmax], [0, 0.5, 1]
        return np.ma.masked_array(np.interp(value, x, y), np.isnan(value))


def plot_cm_standard(y_true, y_pred, list_classes: list, normalize: str, title: str = None, ax=None):
    """ plot the standard confusion matrix!
    :param y_true: np.array, the true values
    :param y_pred: np.array, the predicted values
    :param list_classes: list, of names of the classes
    :param normalize: str, either None, prediction or true
    :param title: str, title of the plot
    :param ax: matplotlib axes to draw on (optional)
    """
    # diverging color map (green to pink) on the default linear 0..1 scale
    cmap = sns.diverging_palette(145, 325, s=200, as_cmap=True)
    # the confusion matrix (rows: true class, columns: predicted class)
    cm = confusion_matrix(y_true=y_true, y_pred=y_pred)
    # use normalization?
    if normalize == 'prediction':
        # every column (predicted class) sums to 1
        cm = np.round(cm.astype('float') / cm.sum(axis=0)[np.newaxis, :], 2)
    elif normalize == 'true':
        # every row (true class) sums to 1
        cm = np.round(cm.astype('float') / cm.sum(axis=1)[:, np.newaxis], 2)
    ax = sns.heatmap(cm, annot=True, cmap=cmap, square=True, annot_kws={'fontsize': 18}, ax=ax, vmin=0, vmax=1)
    # axis labels
    ax.set_xticklabels(list_classes)
    ax.set_yticklabels(list_classes)
    # compute accuracy from the function arguments, not from a global y_test
    accuracy = np.round(accuracy_score(y_true=y_true, y_pred=y_pred), 2)
    # titles and labels
    ax.set_title((title or "") + f" (Acc.: {accuracy})")
    ax.set_ylabel('True')
    ax.set_xlabel('Prediction')
    # layout
    plt.grid(False)
    plt.tight_layout()


def plot_cm(y_true, y_pred, list_classes: list, normalize: str, title: str = None, ax=None):
    """ plot the confusion matrix and normalize the values
    :param y_true: np.array, the true values
    :param y_pred: np.array, the predicted values
    :param list_classes: list, of names of the classes
    :param normalize: str, either None, prediction or true
    :param title: str, title of the plot
    :param ax: matplotlib axes to draw on (optional)
    """
    # color map, with the color bar centered on the chance level 1 / n_classes
    cmap = sns.diverging_palette(145, 325, s=200, as_cmap=True)
    norm = CM_Norm(midpoint=1 / len(list_classes), vmin=0, vmax=1)
    # the confusion matrix (rows: true class, columns: predicted class)
    cm = confusion_matrix(y_true=y_true, y_pred=y_pred)
    # use normalization?
    if normalize == 'prediction':
        # every column (predicted class) sums to 1
        cm = np.round(cm.astype('float') / cm.sum(axis=0)[np.newaxis, :], 2)
    elif normalize == 'true':
        # every row (true class) sums to 1
        cm = np.round(cm.astype('float') / cm.sum(axis=1)[:, np.newaxis], 2)
    ax = sns.heatmap(cm, annot=True, cmap=cmap, norm=norm, square=True, annot_kws={'fontsize': 18}, ax=ax)
    # axis labels
    ax.set_xticklabels(list_classes)
    ax.set_yticklabels(list_classes)
    # compute accuracy from the function arguments, not from a global y_test
    accuracy = np.round(accuracy_score(y_true=y_true, y_pred=y_pred), 2)
    # titles and labels
    ax.set_title((title or "") + f" (Acc.: {accuracy})")
    ax.set_ylabel('True')
    ax.set_xlabel('Prediction')
    # layout
    plt.grid(False)
    plt.tight_layout()

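The normalization has the following effect: lighter colors indicate worse performance, and darker colors indicate better performance. This holds both on the main diagonal (true positive rate: the closer to 1, the better) and off the diagonal (false positive rate: the closer to 0, the better). The standard confusion matrix does not distinguish this finely between the two kinds of rates!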
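
What the diagonal measures depends on the normalization direction. The small sketch below uses hand-made labels (an assumption, chosen so the numbers can be checked by eye) and scikit-learn's confusion_matrix with its normalize argument: 'pred' divides each column by its sum, matching normalize="prediction" in the plotting functions, and 'true' divides each row by its sum.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# tiny hand-made example (hypothetical labels)
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 0]

cm = confusion_matrix(y_true, y_pred)                       # raw counts
by_pred = confusion_matrix(y_true, y_pred, normalize="pred")  # columns sum to 1
by_true = confusion_matrix(y_true, y_pred, normalize="true")  # rows sum to 1

# the built-in normalization agrees with the manual division used in the article
assert np.allclose(by_pred, cm / cm.sum(axis=0, keepdims=True))
assert np.allclose(by_true, cm / cm.sum(axis=1, keepdims=True))

print("diagonal, normalized by prediction:", np.round(np.diag(by_pred), 2))
print("diagonal, normalized by true class:", np.round(np.diag(by_true), 2))
```

Normalized by prediction, the diagonal is the share of each predicted class that was correct (per-class precision); normalized by true class, it is the share of each true class that was recovered (per-class true positive rate). The article's plots use the former, so "rate" in the text is best read in that column-wise sense.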

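Strong Colors Mean a Strong Model!
The figure below compares the logistic regression with its dummy counterpart: the vivid colors of the "great" model's confusion matrix immediately signal its high true positive rate and low false positive rate!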

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 8))
plot_cm(y_true=y_test, y_pred=prediction, title="Awesome Model",
        list_classes=[str(i) for i in range(n_classes)],
        normalize="prediction", ax=ax1)
plot_cm(y_true=y_test, y_pred=prediction_naive, title="Rolling Dice Model",
        list_classes=[str(i) for i in range(n_classes)],
        normalize="prediction", ax=ax2)
plt.show()

[Figure: coin-flip confusion matrices (CCM) of the two models]

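Increasing Complexity
More complex classification problems aggravate the problem of unintuitive confusion matrices. The CCM really starts to shine once we deal with more classes: although the confusion matrix is larger, you can still compare the two models' performance at a glance!
To illustrate this, we simulate a discrete prediction problem with 10 classes: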

n_classes = 10
X, y = make_classification(n_samples=10000, n_features=10,
                           n_classes=n_classes, n_clusters_per_class=1,
                           n_informative=10, n_redundant=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
prediction_naive = np.random.randint(low=0, high=n_classes, size=len(y_test))
clf = LogisticRegression().fit(X_train, y_train)
prediction = clf.predict(X_test)


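Now compare the classic confusion matrix with the CCM. The "normal" confusion matrix offers only a coarse visualization: all we can tell is which model is "better", from the pink main diagonal (good model) versus the green main diagonal (dummy model). We cannot, however, visually compare the FPRs of the two models. Moreover, comparing two models of similar performance would come down to comparing individual cells, which is far too cumbersome when presenting results to an audience.
The CCM gives us a more informative color scheme: despite the larger number of cells, we can still see at a glance that the logistic regression is the better model, because its matrix consists of dark green and pink, compared with the light green and white of the dummy model's matrix: strong colors, strong performance. Beyond picking the stronger model, we also get an indication of the logistic regression's strengths and weaknesses. For example, we see that when the model predicts "class 1", it turns out to be wrong more often than for any other prediction, and that a true "class 1" is never predicted as "class 9".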

fig, (ax1, ax2) = plt.subplots(1,2, figsize=(20,8))
plot_cm_standard(y_true=y_test, y_pred=prediction, title="Awesome Model", list_classes=[str(i) for i in range(n_classes)],
        normalize="prediction", ax=ax1)
plot_cm_standard(y_true=y_test, y_pred=prediction_naive, title="Rolling Dice Model", list_classes=[str(i) for i in range(n_classes)],
        normalize="prediction", ax=ax2)
fig, (ax1, ax2) = plt.subplots(1,2, figsize=(20,8))
plot_cm(y_true=y_test, y_pred=prediction, title="Our Awesome Model", list_classes=[str(i) for i in range(n_classes)],
        normalize="prediction", ax=ax1)
plot_cm(y_true=y_test, y_pred=prediction_naive, title="Rolling Dice", list_classes=[str(i) for i in range(n_classes)],
        normalize="prediction", ax=ax2)
plt.show()

[Figure: standard confusion matrices and CCMs for the 10-class problem]
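
The kind of reading described above (which prediction is wrong most often, which confusions never occur) can also be pulled out of the matrix programmatically. The sketch below works on a small hypothetical confusion matrix that is already normalized by prediction; the numbers and variable names are invented for illustration and are not the article's results.

```python
import numpy as np

# hypothetical confusion matrix, normalized by prediction
# rows: true class, columns: predicted class (each column sums to 1)
cm = np.array([[0.90, 0.20, 0.05],
               [0.05, 0.60, 0.00],
               [0.05, 0.20, 0.95]])

# share of correct predictions per predicted class (the main diagonal)
correct_share = np.diag(cm)
# the predicted class that turns out wrong most often
weakest_prediction = int(np.argmin(correct_share))

# off-diagonal cells that are exactly zero: confusions that never happen
off_diagonal = ~np.eye(cm.shape[0], dtype=bool)
never_confused = [(int(t), int(p)) for t, p in zip(*np.where((cm == 0) & off_diagonal))]

print("prediction that is wrong most often:", weakest_prediction)
print("(true, predicted) pairs that never occur:", never_confused)
```

In this invented example, predictions of class 1 are the least reliable, and a true class 1 is never predicted as class 2. The heatmap conveys the same information visually; the code is merely a way to double-check a reading before presenting it.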

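Conclusions on Confusion Matrices
Here are the key takeaways from this article:
• Use a confusion matrix to evaluate the predictions of a classification model
• For classification, not only accuracy matters but also the true positive/negative rates
• Evaluate your model against a naive baseline, such as a random prediction or a heuristic
• When plotting a confusion matrix, normalize the color bar relative to the performance of the naive baseline model
• The CCM lets you assess a model's performance more intuitively and is better suited for presentations than the regular confusion matrix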
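
As a hedged illustration of the baseline idea in the takeaways, the sketch below computes the expected accuracy of two naive baselines from label counts alone: uniform random guessing and the "always predict the most frequent class" heuristic. The label distribution is hypothetical, and picking the stronger of the two as the color bar midpoint is a suggestion for imbalanced data, not something the article prescribes.

```python
from collections import Counter

# hypothetical, imbalanced test labels: 6 of class 0, 3 of class 1, 1 of class 2
labels = [0] * 6 + [1] * 3 + [2] * 1
counts = Counter(labels)
n = len(labels)
priors = {cls: k / n for cls, k in counts.items()}

# expected accuracy when every class is guessed with equal probability
uniform_random_acc = 1 / len(counts)
# accuracy of always predicting the most frequent class
majority_acc = max(priors.values())

# assumption: center the color bar on the stronger of the two naive baselines
colorbar_midpoint = max(uniform_random_acc, majority_acc)

print(f"uniform random baseline: {uniform_random_acc:.2f}")
print(f"majority-class baseline: {majority_acc:.2f}")
print(f"suggested color bar midpoint: {colorbar_midpoint:.2f}")
```

With balanced classes, as in the article's simulations, both baselines coincide at 1 / n_classes, which is why that value works as the midpoint there.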


From： https://blog.csdn.net/axecute/article/details/145073441