I'm just getting started in the world of neural networks. We are trying to find a good fit for the following dataset: https://www.kaggle.com/tanlikesmath/diabetic-retinopathy-resized (the classic Diabetic Retinopathy).
I'm really stuck at the moment: no matter what I do, I never get a CCR (correct classification rate) higher than 0.75. I have tried several types of data augmentation and modifications to the VGG16 network (which is the one we have been asked to use for this assignment). Sometimes during the iterations I watch it go up, but by the end, when the epoch is over, it ends up between 0.72 and 0.75 again.
I'm pretty new to this, and it's obvious I'm doing something wrong, but I don't know what yet.
import datetime

from keras import callbacks
from keras.applications import vgg16
from keras.callbacks import TensorBoard
from keras.layers import Dense, Dropout, Flatten
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator

# Data generator: rescaling, centering, and a 30% validation split
train_datagen = ImageDataGenerator(
    rescale=1./255,
    # featurewise_std_normalization=True,
    # samplewise_std_normalization=False,
    featurewise_center=True,
    samplewise_center=True,
    validation_split=0.30)
train_generator = train_datagen.flow_from_dataframe(
    dataframe=trainLabels,
    directory='resized_train_cropped/resized_train_cropped/',
    x_col="image",
    y_col="level",
    target_size=(224, 224),
    batch_size=10,
    class_mode='categorical',
    color_mode='rgb',  # remove this or not?
    subset='training')
validation_generator = train_datagen.flow_from_dataframe(
    dataframe=trainLabels,
    directory='resized_train_cropped/resized_train_cropped/',
    x_col="image",
    y_col="level",
    target_size=(224, 224),
    batch_size=10,
    class_mode='categorical',
    subset='validation')
# VGG16 convolutional base (ImageNet weights) with a new classifier head
model = Sequential()
model.add(vgg16.VGG16(include_top=False, weights='imagenet', input_tensor=None,
                      input_shape=(224, 224, 3), pooling=None, classes=5))
model.add(Flatten())
model.add(Dense(4096, activation='relu', name='fc1'))
model.add(Dropout(0.5))
model.add(Dense(4096, activation='relu', name='fc2'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax', name='predictions'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc', 'mse'])
log_dir="logs\\fit\\" +'Prueba'+ datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)
parada=callbacks.callbacks.EarlyStopping(monitor='val_loss',mode='min',verbose=1,restore_best_weights=True)
learningRate=callbacks.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, verbose=1, mode='min', min_delta=0.0001, cooldown=0, min_lr=0)
model.fit_generator(
    train_generator,
    steps_per_epoch=500,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=100,
    validation_freq=1,
    callbacks=[tensorboard_callback, parada, learningRate])
Here is the code with what I have so far. One of my main questions is how to use the featurewise_std_normalization, samplewise_std_normalization, featurewise_center and samplewise_center attributes, which give me a warning that I must first "fit" the generator to the images, but I don't know how to do that. I think this may be one of the keys to improving.
If anyone can give me some advice I would be very grateful.
You are already using a lot of methods, which is very good. Let me give you some explanations:
Image generator
Indeed, you have to fit your generator first, but note that fitting an ImageDataGenerator does not train a model that generates new images: it computes statistics over your training data (the dataset-wide mean and standard deviation) that some of the preprocessing options need. You do that with:
train_datagen.fit(x_train)
Calling fit_generator() on your model, as you are doing, is not enough. The warning appears because, among the hyperparameters it points to,
featurewise_std_normalization, samplewise_std_normalization, featurewise_center, samplewise_center,
the featurewise ones depend on statistics of the whole dataset, which fit_generator() cannot compute on the fly; you need to call train_datagen.fit(x_train) beforehand (the samplewise options are computed per image, and the remaining parameters do not need this step either, which is why they alone would not trigger the warning). Basically, what the warning is telling you is that the generator is running, but those options are having no effect. I have deduced all this from the Keras documentation itself and from my own tests; I say "deduced" because the documentation here is fairly vague and does not state it explicitly.
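As a minimal sketch of that call (the image_paths variable and the use of load_img here are my own assumptions, not part of your code), fitting the generator on an in-memory array of images would look like this:

import numpy as np
from keras.preprocessing.image import img_to_array, load_img

# Hypothetical: a list of paths to (a subset of) your training images
x_train = np.stack([
    img_to_array(load_img(p, target_size=(224, 224)))
    for p in image_paths
])

# Compute the dataset-wide statistics (mean, std) that the featurewise
# options need; this must happen before the generators are used for training
train_datagen.fit(x_train)

One caveat worth checking against your Keras version: fit() computes the statistics on the array exactly as you pass it, while rescale is applied at generation time, so it may be safer to pass already-rescaled images (divided by 255) so that the statistics match what the network will actually see.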
I leave you here the section of the documentation that deals with this, so you can also see more examples that may help you.
How to solve this warning?
I propose three options (there may be more):
1. Fit the generator with train_datagen.fit(x_train). The downside is that if your image dataset is at all large, it will not fit in memory and this will not work for you.
2. Remove these parameters and use others that do not require fitting.
3. Check the RAM capacity of your graphics card and calculate the number of images you can load. If it is a decent amount, take a representative sample of your image dataset and fit the generator on just that sample with train_datagen.fit(), as in the sketch below.
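Here is a sketch of that third option, assuming trainLabels holds the file names in its 'image' column as in your code (the sample size and random_state are arbitrary choices of mine):

import os
import numpy as np
from keras.preprocessing.image import img_to_array, load_img

SAMPLE_SIZE = 1000  # hypothetical: adjust to what fits in your memory
sample = trainLabels['image'].sample(SAMPLE_SIZE, random_state=42)

directory = 'resized_train_cropped/resized_train_cropped/'
x_sample = np.stack([
    img_to_array(load_img(os.path.join(directory, name), target_size=(224, 224)))
    for name in sample
])

# Fit the generator on the representative sample only
train_datagen.fit(x_sample)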
Functioning of a Neural Network.
From what you describe, what is happening is that you are overtraining your neural network (overfitting). To reduce overfitting you can use various methods like the ones you are already using; in the end, neural networks lack explainability, and often it comes down to testing various hypotheses and combinations to see whether they work or not:
We must differentiate between loss (the loss function) and accuracy. They are two totally different measures in a neural network, and they have no direct relationship.
Loss: the loss function is what the network optimizes, trying to make the error as small as possible (or as large as possible if it is, for example, a profit-maximization problem). The network adjusts its weights to minimize this error.
Accuracy: a function that simply measures the percentage of correct predictions over the total; a metric external to the neural network, used to evaluate its performance.
Taking the above into account, and skipping the mathematics to keep it quick and understandable: the loss function tries to classify each image as confidently as possible, while accuracy does not care how confident you were about an image; you either got it right or you did not. An example:
Your CNN gives an image a 51% probability that the patient has diabetes, and indeed they do. For the accuracy metric this is a full hit: you got it right. For the loss function, however, this output is not good, because it means that a binary classification assigned only 51% to the correct class, so it could do much better; the network will simply try to increase that probability next time, so that instead of 51% it is 60%, 70%, 80%, 99%. But your accuracy will remain at the same 100%.
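To make the example concrete, here is a tiny NumPy sketch (mine, not part of your code) showing how the cross-entropy loss keeps falling as the predicted probability of the true class grows, while the accuracy never moves:

import numpy as np

def cross_entropy(p_correct):
    # Loss for a single example, given the probability assigned to the true class
    return -np.log(p_correct)

def accuracy(p_correct):
    # A binary example counts as a hit if the true class gets more than 50%
    return 1.0 if p_correct > 0.5 else 0.0

for p in (0.51, 0.70, 0.99):
    print(f"p={p:.2f}  loss={cross_entropy(p):.3f}  accuracy={accuracy(p):.0%}")
# p=0.51  loss=0.673  accuracy=100%
# p=0.70  loss=0.357  accuracy=100%
# p=0.99  loss=0.010  accuracy=100%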
For this reason, more epochs do not automatically mean a better model: the network keeps optimizing to make its error as small as possible, but this can lead to overtraining (overfitting) and lower the accuracy. It ends up classifying many training images with a probability close to 100%, but what it is really doing is memorizing those images; it cannot generalize, so it misclassifies others. On top of that, models like this usually do very badly later, when you run them on the test set.
Finally, I want to stress that this is an intuitive account of what is happening, and I have left the mathematics out so that it stays understandable; in short, it is a rough, back-of-the-envelope explanation.