Тренировка после 15 эпох в наборе данных CIFAR-10, кажется, делает потерю проверки больше не снижающейся, придерживаясь приблизительно 1,4 (с точностью подтверждения 60%). Я перетасовал тренировочный набор, разделил его на 255 и импортировал как float32. Я пробовал множество архитектур, как с выпадением, так и без него в слоях Conv2D, и, похоже, ничего не работает. Та же архитектура обеспечивает точность 99,7% в тестовых наборах для MNIST. Пожалуйста, посмотрите архитектуру ниже:
(Примечание: я пытался увеличить отсев и увеличивать / уменьшать скорость обучения оптимизатора Adam, чтобы предотвратить переоснащение, все, что он делает, это предотвращает переоснащение, но и тренировочные, и тестовые наборы теперь имеют одинаковую низкую точность около 60%).
with tf.device('/gpu:0'):
tf.placeholder(tf.float32, shape=(None, 20, 64))
#placeholder initialized (pick /cpu:0 or /gpu:0)
seed = 6
np.random.seed(seed)
modelnn = Sequential()
neurons = x_train_reduced.shape[1:]
modelnn.add(Convolution2D(32, 3, 3, input_shape=neurons, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(32, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
#modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
#modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
#modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Flatten())
#modelnn.add(Dropout(0.5))
modelnn.add(Dense(1024, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(Dropout(0.5))
modelnn.add(Dense(512, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(Dropout(0.5))
modelnn.add(Dense(10, activation='softmax'))
modelnn.compile(loss='categorical_crossentropy', optimizer=optimizer_input, metrics=['accuracy'])
y_train = to_categorical(y_train)
modelnn.fit(x_train_reduced, y_train, nb_epoch=nb_epoch_count, shuffle=True, batch_size=bsize,
validation_split=0.1)
Результаты:
44100/44100 [==============================] - 22s - loss: 2.1453 - acc: 0.2010 - val_loss: 1.9812 - val_acc: 0.2959
Epoch 2/50
44100/44100 [==============================] - 24s - loss: 1.9486 - acc: 0.3089 - val_loss: 1.8685 - val_acc: 0.3567
Epoch 3/50
44100/44100 [==============================] - 18s - loss: 1.8599 - acc: 0.3575 - val_loss: 1.7822 - val_acc: 0.3982
Epoch 4/50
44100/44100 [==============================] - 18s - loss: 1.7925 - acc: 0.3933 - val_loss: 1.7272 - val_acc: 0.4229
Epoch 5/50
44100/44100 [==============================] - 18s - loss: 1.7425 - acc: 0.4195 - val_loss: 1.6806 - val_acc: 0.4459
Epoch 6/50
44100/44100 [==============================] - 18s - loss: 1.6998 - acc: 0.4440 - val_loss: 1.6436 - val_acc: 0.4682
Epoch 7/50
44100/44100 [==============================] - 18s - loss: 1.6636 - acc: 0.4603 - val_loss: 1.6156 - val_acc: 0.4837
Epoch 8/50
44100/44100 [==============================] - 18s - loss: 1.6333 - acc: 0.4781 - val_loss: 1.6351 - val_acc: 0.4776
Epoch 9/50
44100/44100 [==============================] - 18s - loss: 1.6086 - acc: 0.4898 - val_loss: 1.5732 - val_acc: 0.5063
Epoch 10/50
44100/44100 [==============================] - 18s - loss: 1.5776 - acc: 0.5065 - val_loss: 1.5411 - val_acc: 0.5227
Epoch 11/50
44100/44100 [==============================] - 18s - loss: 1.5585 - acc: 0.5145 - val_loss: 1.5485 - val_acc: 0.5212
Epoch 12/50
44100/44100 [==============================] - 18s - loss: 1.5321 - acc: 0.5288 - val_loss: 1.5354 - val_acc: 0.5316
Epoch 13/50
44100/44100 [==============================] - 18s - loss: 1.5082 - acc: 0.5402 - val_loss: 1.5022 - val_acc: 0.5427
Epoch 14/50
44100/44100 [==============================] - 18s - loss: 1.4945 - acc: 0.5438 - val_loss: 1.4916 - val_acc: 0.5490
Epoch 15/50
44100/44100 [==============================] - 192s - loss: 1.4762 - acc: 0.5535 - val_loss: 1.5159 - val_acc: 0.5394
Epoch 16/50
44100/44100 [==============================] - 18s - loss: 1.4577 - acc: 0.5620 - val_loss: 1.5389 - val_acc: 0.5257
Epoch 17/50
44100/44100 [==============================] - 18s - loss: 1.4425 - acc: 0.5671 - val_loss: 1.4590 - val_acc: 0.5667
Epoch 18/50
44100/44100 [==============================] - 18s - loss: 1.4258 - acc: 0.5766 - val_loss: 1.4552 - val_acc: 0.5763
Epoch 19/50
44100/44100 [==============================] - 18s - loss: 1.4113 - acc: 0.5805 - val_loss: 1.4439 - val_acc: 0.5767
Epoch 20/50
44100/44100 [==============================] - 18s - loss: 1.3971 - acc: 0.5879 - val_loss: 1.4473 - val_acc: 0.5769
Epoch 21/50
44100/44100 [==============================] - 18s - loss: 1.3850 - acc: 0.5919 - val_loss: 1.4251 - val_acc: 0.5871
Epoch 22/50
44100/44100 [==============================] - 18s - loss: 1.3668 - acc: 0.6006 - val_loss: 1.4203 - val_acc: 0.5910
Epoch 23/50
44100/44100 [==============================] - 18s - loss: 1.3549 - acc: 0.6051 - val_loss: 1.4207 - val_acc: 0.5939
Epoch 24/50
44100/44100 [==============================] - 18s - loss: 1.3373 - acc: 0.6111 - val_loss: 1.4516 - val_acc: 0.5784
Epoch 25/50
44100/44100 [==============================] - 18s - loss: 1.3285 - acc: 0.6149 - val_loss: 1.4146 - val_acc: 0.5922
Epoch 26/50
44100/44100 [==============================] - 18s - loss: 1.3134 - acc: 0.6205 - val_loss: 1.4090 - val_acc: 0.6024
Epoch 27/50
44100/44100 [==============================] - 18s - loss: 1.3043 - acc: 0.6239 - val_loss: 1.4307 - val_acc: 0.5959
Epoch 28/50
44100/44100 [==============================] - 18s - loss: 1.2862 - acc: 0.6297 - val_loss: 1.4241 - val_acc: 0.5978
Epoch 29/50
44100/44100 [==============================] - 18s - loss: 1.2706 - acc: 0.6340 - val_loss: 1.4046 - val_acc: 0.6067
Epoch 30/50
44100/44100 [==============================] - 18s - loss: 1.2634 - acc: 0.6405 - val_loss: 1.4120 - val_acc: 0.6037
Epoch 31/50
44100/44100 [==============================] - 18s - loss: 1.2473 - acc: 0.6446 - val_loss: 1.4067 - val_acc: 0.6045
Epoch 32/50
44100/44100 [==============================] - 18s - loss: 1.2411 - acc: 0.6471 - val_loss: 1.4083 - val_acc: 0.6098
Epoch 33/50
44100/44100 [==============================] - 18s - loss: 1.2241 - acc: 0.6498 - val_loss: 1.4091 - val_acc: 0.6076
Epoch 34/50
44100/44100 [==============================] - 18s - loss: 1.2121 - acc: 0.6541 - val_loss: 1.4209 - val_acc: 0.6127
Epoch 35/50
44100/44100 [==============================] - 18s - loss: 1.1995 - acc: 0.6582 - val_loss: 1.4230 - val_acc: 0.6131
Epoch 36/50
44100/44100 [==============================] - 18s - loss: 1.1884 - acc: 0.6622 - val_loss: 1.4024 - val_acc: 0.6124
Epoch 37/50
44100/44100 [==============================] - 18s - loss: 1.1778 - acc: 0.6657 - val_loss: 1.4328 - val_acc: 0.6080
Epoch 38/50
44100/44100 [==============================] - 18s - loss: 1.1612 - acc: 0.6683 - val_loss: 1.4246 - val_acc: 0.6159
Epoch 39/50
44100/44100 [==============================] - 18s - loss: 1.1466 - acc: 0.6735 - val_loss: 1.4282 - val_acc: 0.6122
Epoch 40/50
44100/44100 [==============================] - 18s - loss: 1.1325 - acc: 0.6783 - val_loss: 1.4311 - val_acc: 0.6157
Epoch 41/50
44100/44100 [==============================] - 18s - loss: 1.1213 - acc: 0.6806 - val_loss: 1.4647 - val_acc: 0.6047
Epoch 42/50
44100/44100 [==============================] - 18s - loss: 1.1064 - acc: 0.6842 - val_loss: 1.4631 - val_acc: 0.6047
Epoch 43/50
44100/44100 [==============================] - 18s - loss: 1.0967 - acc: 0.6870 - val_loss: 1.4535 - val_acc: 0.6106
Epoch 44/50
44100/44100 [==============================] - 18s - loss: 1.0822 - acc: 0.6893 - val_loss: 1.4532 - val_acc: 0.6149
Epoch 45/50
44100/44100 [==============================] - 18s - loss: 1.0659 - acc: 0.6941 - val_loss: 1.4691 - val_acc: 0.6108
Epoch 46/50
44100/44100 [==============================] - 18s - loss: 1.0610 - acc: 0.6956 - val_loss: 1.4751 - val_acc: 0.6106
Epoch 47/50
44100/44100 [==============================] - 18s - loss: 1.0397 - acc: 0.6981 - val_loss: 1.4857 - val_acc: 0.6041
Epoch 48/50
44100/44100 [==============================] - 18s - loss: 1.0208 - acc: 0.7039 - val_loss: 1.4901 - val_acc: 0.6106
Epoch 49/50
44100/44100 [==============================] - 18s - loss: 1.0187 - acc: 0.7036 - val_loss: 1.4994 - val_acc: 0.6106
Epoch 50/50
44100/44100 [==============================] - 18s - loss: 1.0024 - acc: 0.7070 - val_loss: 1.5078 - val_acc: 0.6039
Time: 1109.7512991428375
Neural Network now trained from dimensions (49000, 3, 32, 32)
Обновление: дальнейшее тестирование, включая BatchNormalization как с MaxNorm, так и без него -
Новая архитектура:
modelnn.add(Convolution2D(32, 3, 3, input_shape=neurons, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(32, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(BatchNormalization())
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
# modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
# modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
# modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Flatten())
modelnn.add(Dense(1024, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.5))
modelnn.add(Dense(512, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.5))
modelnn.add(Dense(10, activation='softmax'))