I have an 18x18 confusion matrix. To visualize the errors better, I want to color each cell according to its value in the normalized matrix.
I would like a color gradation that represents the data more adequately. For example, cells where the proportion of correct answers is above 80% would use one color scale, cells between 20% and 80% a second one, and cells below 20% a third.
I have the following code, which builds a DataFrame and, from it, the confusion matrix:
import os
import numpy as np
import pandas as pd
import seaborn as sn

working_path = os.getcwd()  # Current working directory; files in this folder can be opened by name alone
df = pd.read_csv("salida.txt", delimiter="\t")  # Load the tab-separated text file into a DataFrame
df.rename(columns={'Number of Syllables': 'NSyllables'}, inplace=True)  # Shorten the name of the syllable-count column
conf_matC = pd.crosstab(df['TargetC'], df['RespC'], rownames=['Target'], colnames=['Response'], margins=True)  # Confusion matrix for CONSONANTS
ncmC = conf_matC.drop(["All", "**"], axis=0)  # Drop the "All" and "**" rows (no relevant information)
confusion_matrixC = ncmC.drop(["All", "**", "gr", "zr"], axis=1)  # Drop the "All", "**", "gr" and "zr" columns, leaving the full consonant confusion matrix
ncmC1 = confusion_matrixC / confusion_matrixC.max().astype(np.float64)  # Normalize: divide each column by its maximum
normarlize_confusion_matrixC = ncmC1.round(2)  # Round to two decimals
printed_matrixC = sn.heatmap(normarlize_confusion_matrixC, cmap='Oranges', annot=False)  # Draw the consonant confusion matrix as a heatmap
With this code, we get the following array:
Response     b   ch     d    f     g     k  ...    rr     s     t    x    y     z
Target                                      ...
b         1.00  0.0  0.10  0.0  0.08  0.00  ...  0.00  0.00  0.00  0.0  0.0  0.00
ch        0.00  1.0  0.00  0.0  0.00  0.00  ...  0.00  0.00  0.00  0.0  0.0  0.00
d         0.00  0.0  1.00  0.0  0.08  0.00  ...  0.08  0.00  0.02  0.0  0.0  0.00
f         0.00  0.0  0.00  1.0  0.00  0.00  ...  0.00  0.00  0.00  0.0  0.0  0.36
g         0.03  0.0  0.00  0.0  1.00  0.00  ...  0.00  0.00  0.00  0.0  0.0  0.00
k         0.00  0.0  0.00  0.0  0.00  1.00  ...  0.00  0.00  0.00  0.0  0.0  0.00
l         0.00  0.0  0.03  0.0  0.00  0.00  ...  0.00  0.00  0.00  0.0  0.0  0.00
m         0.03  0.0  0.00  0.0  0.08  0.00  ...  0.08  0.00  0.00  0.0  0.0  0.00
n         0.00  0.0  0.10  0.0  0.00  0.00  ...  0.08  0.00  0.00  0.0  0.0  0.00
ny        0.00  0.0  0.00  0.0  0.00  0.00  ...  0.00  0.00  0.00  0.0  0.0  0.00
p         0.00  0.0  0.00  0.0  0.00  0.00  ...  0.00  0.00  0.04  0.0  0.0  0.00
r         0.00  0.0  0.00  0.0  0.00  0.00  ...  0.08  0.00  0.02  0.0  0.0  0.00
rr        0.00  0.0  0.00  0.0  0.00  0.00  ...  1.00  0.00  0.00  0.0  0.0  0.00
s         0.00  0.0  0.00  0.0  0.00  0.00  ...  0.00  1.00  0.00  0.0  0.0  0.00
t         0.00  0.0  0.00  0.0  0.00  0.09  ...  0.00  0.00  1.00  0.0  0.0  0.12
x         0.00  0.0  0.00  0.0  0.00  0.00  ...  0.00  0.00  0.00  1.0  0.0  0.04
y         0.00  0.0  0.00  0.0  0.00  0.00  ...  0.00  0.00  0.00  0.0  1.0  0.00
z         0.00  0.0  0.00  0.0  0.00  0.00  ...  0.00  0.12  0.00  0.0  0.0  1.00
The same DataFrame, shown as an image:
I first drew a heatmap of the confusion matrix, but it has several display errors, as shown in the following image.
How can I fix this and plot the matrix in color without the numbers being cut off? Is it possible to use different color gradations according to the degree of success, as explained above?
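For reference, here is a minimal sketch of the kind of banded heatmap described above, not a definitive solution. It assumes the values lie in [0, 1], stacks three matplotlib colormaps into one `ListedColormap` (20, 60 and 20 colors for the 0-20%, 20-80% and 80-100% bands) and passes it to `sn.heatmap` with `vmin=0, vmax=1`. The helper `banded_cmap`, the small `demo` table, the chosen colormaps and the output filename are all hypothetical stand-ins. The `ax.set_ylim` call is only a guess at the cropping problem, which matplotlib 3.1.1 was known to cause with seaborn heatmaps.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sn
from matplotlib.colors import ListedColormap


def banded_cmap(low="Blues", mid="Oranges", high="Greens"):
    """Stack three colormaps into one 100-color map (hypothetical helper).

    With vmin=0 and vmax=1, colors 0-19 cover [0.0, 0.2), colors 20-79
    cover [0.2, 0.8) and colors 80-99 cover [0.8, 1.0].
    """
    parts = [
        plt.get_cmap(low)(np.linspace(0.1, 0.9, 20)),   # hits below 20%
        plt.get_cmap(mid)(np.linspace(0.1, 0.9, 60)),   # hits between 20% and 80%
        plt.get_cmap(high)(np.linspace(0.1, 0.9, 20)),  # hits above 80%
    ]
    return ListedColormap(np.vstack(parts), name="banded")


# Hypothetical stand-in for normarlize_confusion_matrixC (values in [0, 1]).
labels = ["b", "ch", "d", "f"]
demo = pd.DataFrame(
    [[1.00, 0.00, 0.10, 0.00],
     [0.00, 1.00, 0.00, 0.00],
     [0.00, 0.00, 0.85, 0.50],
     [0.00, 0.00, 0.00, 0.36]],
    index=labels, columns=labels,
)

fig, ax = plt.subplots(figsize=(6, 5))
sn.heatmap(demo, cmap=banded_cmap(), vmin=0, vmax=1,
           annot=True, fmt=".2f", ax=ax)
# Assumption: the cut-off cells come from the matplotlib 3.1.1 cropping bug;
# setting the y-limits explicitly keeps the first and last rows whole.
ax.set_ylim(len(demo), 0)
fig.tight_layout()
fig.savefig("confusion_matrix_banded.png", dpi=150)
```

For the real 18x18 matrix, `demo` would be replaced by `normarlize_confusion_matrixC`, probably with a larger `figsize` and a smaller annotation font (for example `annot_kws={"size": 7}`).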
Thank you very much