{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [
"Start by getting the MNIST data (the handwritten digits), again through the Keras interface. Remember, we are trying to train a model to classify each input image according to which digit (0-9) it represents.\n",
"\n",
"The Keras interface to the MNIST data returns the images and integers giving the true values. Here we explicitly reshape each image into a 1D vector and scale the pixel values to the range 0 to 1, then use the Keras to_categorical function to convert the true values into \"one-hot\" vectors.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"training data shape: (60000, 28, 28)\n",
"First label: [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]\n"
]
}
],
"source": [
"# import the data\n",
"from tensorflow.keras.datasets import mnist\n",
"(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
"\n",
"# flatten each training image into a 1D vector and scale pixel values to [0, 1]\n",
"shape = train_images.shape\n",
"print('training data shape: ', shape)\n",
"train_images = train_images.reshape((shape[0], shape[1] * shape[2]))\n",
"train_images = train_images.astype('float32') / 255\n",
"\n",
"# flatten and scale the test images in the same way\n",
"shape = test_images.shape\n",
"test_images = test_images.reshape((shape[0], shape[1] * shape[2]))\n",
"test_images = test_images.astype('float32') / 255\n",
"\n",
"# convert the integer labels into one-hot vectors\n",
"from tensorflow.keras.utils import to_categorical\n",
"train_labels = to_categorical(train_labels)\n",
"print('First label: ', train_labels[0])\n",
"test_labels = to_categorical(test_labels)\n",
"\n",
"# number of input nodes (pixels per image; the same for training and test images)\n",
"ninput = shape[1] * shape[2]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we will define the Keras model and the parameters we will use to train it. We start by instantiating a model using models.Sequential(), then add layers using the add() method. A layers.Dense layer takes the number of neurons as its first argument, and you can specify its activation function with the activation= keyword.\n",
"\n",
"For the first layer, you have to specify the input shape; each subsequent layer infers its input shape from the output of the previous layer.\n",
"\n",
"After adding the layers, use the compile() method to specify an optimizer with the optimizer= keyword, a loss function with the loss= keyword, and, optionally, metrics such as accuracy with the metrics= keyword.\n",
"\n",
"Note that Keras supports a layers.Normalization layer that you can use as the first layer to normalize the data. Its adapt() method computes the mean and variance to use from a dataset."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from tensorflow.keras import models\n",
"from tensorflow.keras import layers\n",
"from tensorflow.keras import optimizers"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" normalization (Normalizatio (None, 784) 1569 \n",
" n) \n",
" \n",
" dense (Dense) (None, 30) 23550 \n",
" \n",
" dense_1 (Dense) (None, 10) 310 \n",
" \n",
"=================================================================\n",
"Total params: 25,429\n",
"Trainable params: 23,860\n",
"Non-trainable params: 1,569\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"#instantiate a model\n",
"net = models.Sequential()\n",
"\n",
"# add layers\n",
"net.add(layers.Normalization(input_shape=(ninput,)))\n",
"net.add(layers.Dense(30,activation='sigmoid'))\n",
"\n",
"# how many neurons does the last layer need to have?\n",
"net.add(layers.Dense(10, activation='sigmoid'))\n",
"\n",
"# print out a summary of the architecture\n",
"net.summary()"
]
},
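{
"cell_type": "markdown",
"metadata": {},
"source": [
"You now specify the optimizer and loss function using the compile() method. You can find information about the available optimizers and loss functions in the Keras documentation. Here are a couple of examples. Each call to compile() replaces the previous configuration, so the second call (Adam) is the one in effect for the training below."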
]
},
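{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# example 1: stochastic gradient descent with an explicit learning rate\n",
"net.compile(optimizer=optimizers.SGD(learning_rate=30.), loss='categorical_crossentropy', metrics=['accuracy'])\n",
"# example 2: Adam with default settings (this call replaces the configuration above)\n",
"net.compile(optimizer=optimizers.Adam(), loss='categorical_crossentropy', metrics=['accuracy'])"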
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we train the model with the input data using the fit() method, specifying the number of epochs with the epochs= keyword and the batch size with the batch_size= keyword. We can also include validation data with the validation_data= keyword, which takes a tuple of (validation_input, validation_labels); here we pass the test set."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/10\n",
"469/469 [==============================] - 5s 7ms/step - loss: 1.0112 - accuracy: 0.7948 - val_loss: 0.5002 - val_accuracy: 0.8972\n",
"Epoch 2/10\n",
"469/469 [==============================] - 3s 6ms/step - loss: 0.4113 - accuracy: 0.9012 - val_loss: 0.3352 - val_accuracy: 0.9160\n",
"Epoch 3/10\n",
"469/469 [==============================] - 3s 6ms/step - loss: 0.3127 - accuracy: 0.9167 - val_loss: 0.2794 - val_accuracy: 0.9254\n",
"Epoch 4/10\n",
"469/469 [==============================] - 3s 7ms/step - loss: 0.2683 - accuracy: 0.9259 - val_loss: 0.2488 - val_accuracy: 0.9298\n",
"Epoch 5/10\n",
"469/469 [==============================] - 3s 7ms/step - loss: 0.2403 - accuracy: 0.9330 - val_loss: 0.2282 - val_accuracy: 0.9341\n",
"Epoch 6/10\n",
"469/469 [==============================] - 3s 6ms/step - loss: 0.2198 - accuracy: 0.9384 - val_loss: 0.2130 - val_accuracy: 0.9389\n",
"Epoch 7/10\n",
"469/469 [==============================] - 3s 6ms/step - loss: 0.2034 - accuracy: 0.9431 - val_loss: 0.2012 - val_accuracy: 0.9424\n",
"Epoch 8/10\n",
"469/469 [==============================] - 3s 6ms/step - loss: 0.1900 - accuracy: 0.9470 - val_loss: 0.1899 - val_accuracy: 0.9442\n",
"Epoch 9/10\n",
"469/469 [==============================] - 3s 6ms/step - loss: 0.1787 - accuracy: 0.9501 - val_loss: 0.1812 - val_accuracy: 0.9473\n",
"Epoch 10/10\n",
"469/469 [==============================] - 3s 6ms/step - loss: 0.1686 - accuracy: 0.9531 - val_loss: 0.1746 - val_accuracy: 0.9488\n"
]
}
],
"source": [
"history = net.fit(train_images,train_labels,epochs=10,batch_size=128, \n",
" validation_data=(test_images,test_labels))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can now inspect the learning curves. The fit() method returns a History object whose history attribute is a dictionary with keys 'loss' and 'val_loss', among others. Each entry is a list with one value per epoch, so you can plot loss against epoch for both the training data and the validation data."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"history keys: dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])\n"
]
},
{
"data": {
"image/png": "",
"text/plain": [
"