TensorFlow

[1]:
%matplotlib inline
[2]:
import warnings
warnings.simplefilter('ignore', RuntimeWarning)
[3]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
[4]:
import tensorflow as tf
import tensorflow.keras as keras

Keras

[5]:
Dense = keras.layers.Dense

We can consider a deep learning model as a black box with a collection of unknown parameters. For example, when the output is a Dense layer with a single node, the entire network is just doing some form of regression. If that single node uses a sigmoid activation function, the model is essentially doing logistic regression.
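
As a minimal sketch with made-up numbers (the weights below are illustrative, not fitted values), the single unit computes sigmoid(x · w + b), which is exactly the logistic regression model:

# Illustrative weights and bias for a single sigmoid unit (not fitted values)
w = np.array([0.5, -0.25])
b = 0.1
x = np.array([1.0, 2.0])

# The unit's output is the logistic regression probability sigmoid(x.w + b)
p = 1 / (1 + np.exp(-(x @ w + b)))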

A single unit with a sigmoid activation function

[6]:
X_train = pd.read_csv('data/X_train.csv')
X_test = pd.read_csv('data/X_test.csv')
y_train = pd.read_csv('data/y_train.csv')
y_test = pd.read_csv('data/y_test.csv')
[7]:
X_train.shape
[7]:
(981, 11)
[8]:
model01 = keras.models.Sequential([
    Dense(1,
          activation='sigmoid',
          input_shape=X_train.shape[1:]),
])
[9]:
model01.compile(loss="binary_crossentropy",
                optimizer="sgd",
                metrics=["accuracy"])
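
The loss, optimizer, and metrics can also be passed as class instances rather than strings. A sketch of an essentially equivalent compile call (the learning rate shown is just the SGD default) would be:

model01.compile(loss=keras.losses.BinaryCrossentropy(),
                optimizer=keras.optimizers.SGD(learning_rate=0.01),
                metrics=[keras.metrics.BinaryAccuracy()])
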
[10]:
model01.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 1)                 12
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
[11]:
model01.layers
[11]:
[<tensorflow.python.keras.layers.core.Dense at 0x1527d1910>]
[12]:
model01.layers[0].name
[12]:
'dense'
[13]:
model01.layers[0].activation
[13]:
<function tensorflow.python.keras.activations.sigmoid(x)>
[14]:
hist = model01.fit(X_train,
                   y_train,
                   epochs=20,
                   verbose=0,
                   validation_split=0.2)
[15]:
import pandas as pd
[16]:
df = pd.DataFrame(hist.history)
[17]:
df.head()
[17]:
       loss  accuracy  val_loss  val_accuracy
0  1.029838  0.371173  0.959598      0.380711
1  0.922509  0.409439  0.866757      0.431472
2  0.833006  0.436224  0.791162      0.467005
3  0.759757  0.474490  0.730747      0.532995
4  0.700539  0.536990  0.684353      0.553299
[18]:
df.plot()
pass
../_images/notebooks_B12_Keras_Building_Blocks_21_0.png
[19]:
model01.evaluate(X_test, y_test)
11/11 [==============================] - 0s 591us/step - loss: 0.4566 - accuracy: 0.8293
[19]:
[0.4565536081790924, 0.8292682766914368]
[20]:
k = 5
np.c_[model01.predict(X_test.iloc[:k, :]), y_test[:k]]
[20]:
array([[0.71404094, 0.        ],
       [0.17881006, 0.        ],
       [0.56976336, 1.        ],
       [0.33437088, 1.        ],
       [0.87005341, 1.        ]])

Saving and loading model weights

[21]:
model01.save('titanic.h5')
[22]:
model011A = keras.models.load_model('titanic.h5')
[23]:
model011A.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 1)                 12
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
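
Besides saving the full model, Keras can also save and restore just the parameter values with save_weights and load_weights; a minimal sketch (the filename is illustrative) is:

# Save only the weights; the architecture is not stored
model01.save_weights('titanic_weights.h5')

# Restore the weights into a model built with the same architecture
model011A.load_weights('titanic_weights.h5')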

Replicating the results with logistic regression

[24]:
import h5py
[25]:
fh = h5py.File('titanic.h5', 'r')
[26]:
fh.keys()
[26]:
<KeysViewHDF5 ['model_weights', 'optimizer_weights']>
[27]:
fh['model_weights'].keys()
[27]:
<KeysViewHDF5 ['dense']>
[28]:
fh['model_weights'].get('dense').get('dense').keys()
[28]:
<KeysViewHDF5 ['bias:0', 'kernel:0']>
[29]:
bias = fh['model_weights'].get('dense').get('dense').get('bias:0')
bias
[29]:
<HDF5 dataset "bias:0": shape (1,), type "<f4">
[30]:
wts = fh['model_weights'].get('dense').get('dense').get('kernel:0')
wts
[30]:
<HDF5 dataset "kernel:0": shape (11, 1), type "<f4">
[31]:
bias[:]
[31]:
array([-0.4318819], dtype=float32)
[32]:
wts[:]
[32]:
array([[-0.26382327],
       [ 0.11888377],
       [ 1.0947229 ],
       [-0.20582937],
       [-0.35638043],
       [ 0.02472745],
       [ 0.10281957],
       [ 0.03278307],
       [-0.25210753],
       [-0.26847404],
       [ 0.09856348]], dtype=float32)
[33]:
logodds = bias[:] + X_test.iloc[:5, :] @ wts[:]
odds = np.exp(logodds)
p_lr = odds/(1 + odds)
[34]:
p_nn = model01.predict(X_test.iloc[:5])
[35]:
pd.DataFrame(np.c_[p_lr, p_nn], columns=['lr', 'nn'])
[35]:
         lr        nn
0  0.714041  0.714041
1  0.178810  0.178810
2  0.569763  0.569763
3  0.334371  0.334371
4  0.870053  0.870053
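
The same kernel and bias can also be read directly from the in-memory model with get_weights, avoiding the need to open the HDF5 file by hand; a quick check might look like this:

# get_weights returns [kernel, bias] for the single Dense layer
kernel, bias_vec = model01.get_weights()

# Recompute the probabilities from the extracted parameters
p_check = 1 / (1 + np.exp(-(bias_vec + X_test.iloc[:5, :] @ kernel)))
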
[36]:
fh.close()

Building blocks

A keras model is composed of layers. Each layer has its own activation function. Each layer also has its own biases and weights. To set initial random weights, there are several possible strategies known as initializers. To fit the model, you need to specify a loss function. During training, the optimizer finds biases and weights that minimize the loss function. Model performance is evaluated using metrics.

Commonly used versions of these classes or functions come built-in with keras.

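The string shortcuts used elsewhere in this notebook ('sigmoid', 'sgd', 'binary_crossentropy', 'accuracy') are resolved to these built-in objects. Each submodule exposes a get function that performs the lookup, for example:

# Look up the built-in object behind each string shortcut
keras.activations.get('sigmoid')
keras.losses.get('binary_crossentropy')
keras.optimizers.get('sgd')
keras.metrics.get('accuracy')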

Layers

[37]:
[x for x in dir(keras.layers) if
 x[0].isupper() and
 not x.startswith('_')]
[37]:
['AbstractRNNCell',
 'Activation',
 'ActivityRegularization',
 'Add',
 'AdditiveAttention',
 'AlphaDropout',
 'Attention',
 'Average',
 'AveragePooling1D',
 'AveragePooling2D',
 'AveragePooling3D',
 'AvgPool1D',
 'AvgPool2D',
 'AvgPool3D',
 'BatchNormalization',
 'Bidirectional',
 'Concatenate',
 'Conv1D',
 'Conv1DTranspose',
 'Conv2D',
 'Conv2DTranspose',
 'Conv3D',
 'Conv3DTranspose',
 'ConvLSTM2D',
 'Convolution1D',
 'Convolution1DTranspose',
 'Convolution2D',
 'Convolution2DTranspose',
 'Convolution3D',
 'Convolution3DTranspose',
 'Cropping1D',
 'Cropping2D',
 'Cropping3D',
 'Dense',
 'DenseFeatures',
 'DepthwiseConv2D',
 'Dot',
 'Dropout',
 'ELU',
 'Embedding',
 'Flatten',
 'GRU',
 'GRUCell',
 'GaussianDropout',
 'GaussianNoise',
 'GlobalAveragePooling1D',
 'GlobalAveragePooling2D',
 'GlobalAveragePooling3D',
 'GlobalAvgPool1D',
 'GlobalAvgPool2D',
 'GlobalAvgPool3D',
 'GlobalMaxPool1D',
 'GlobalMaxPool2D',
 'GlobalMaxPool3D',
 'GlobalMaxPooling1D',
 'GlobalMaxPooling2D',
 'GlobalMaxPooling3D',
 'Input',
 'InputLayer',
 'InputSpec',
 'LSTM',
 'LSTMCell',
 'Lambda',
 'Layer',
 'LayerNormalization',
 'LeakyReLU',
 'LocallyConnected1D',
 'LocallyConnected2D',
 'Masking',
 'MaxPool1D',
 'MaxPool2D',
 'MaxPool3D',
 'MaxPooling1D',
 'MaxPooling2D',
 'MaxPooling3D',
 'Maximum',
 'Minimum',
 'Multiply',
 'PReLU',
 'Permute',
 'RNN',
 'ReLU',
 'RepeatVector',
 'Reshape',
 'SeparableConv1D',
 'SeparableConv2D',
 'SeparableConvolution1D',
 'SeparableConvolution2D',
 'SimpleRNN',
 'SimpleRNNCell',
 'Softmax',
 'SpatialDropout1D',
 'SpatialDropout2D',
 'SpatialDropout3D',
 'StackedRNNCells',
 'Subtract',
 'ThresholdedReLU',
 'TimeDistributed',
 'UpSampling1D',
 'UpSampling2D',
 'UpSampling3D',
 'Wrapper',
 'ZeroPadding1D',
 'ZeroPadding2D',
 'ZeroPadding3D']
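
A layer is a callable object that transforms a tensor. As an illustration (the shapes here are chosen arbitrarily):

# A Dense layer mapping each input to 3 output units with a ReLU activation
layer = keras.layers.Dense(3, activation='relu')

# Calling the layer on a batch of 2 samples with 4 features builds its weights
out = layer(tf.ones((2, 4)))
out.shape  # TensorShape([2, 3])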

Activations

[38]:
[x for x in dir(keras.activations) if
 x[0].islower() and
 not x.startswith('_')]
[38]:
['deserialize',
 'elu',
 'exponential',
 'get',
 'hard_sigmoid',
 'linear',
 'relu',
 'selu',
 'serialize',
 'sigmoid',
 'softmax',
 'softplus',
 'softsign',
 'swish',
 'tanh']

Example

[39]:
x = tf.range(-10, 10, 0.1)
y = keras.activations.sigmoid(x)
[40]:
plt.plot(x, y);
../_images/notebooks_B12_Keras_Building_Blocks_49_0.png

Initializers

[41]:
[x for x in dir(keras.initializers) if
 x[0].isupper() and
 not x.startswith('_')]
[41]:
['Constant',
 'GlorotNormal',
 'GlorotUniform',
 'HeNormal',
 'HeUniform',
 'Identity',
 'Initializer',
 'LecunNormal',
 'LecunUniform',
 'Ones',
 'Orthogonal',
 'RandomNormal',
 'RandomUniform',
 'TruncatedNormal',
 'VarianceScaling',
 'Zeros']

Example

[42]:
init = keras.initializers.LecunNormal(seed=0)
init(shape=(2,3)).numpy()
[42]:
array([[-1.3641988 , -0.38696545, -0.532354  ],
       [ 0.06994588, -0.21366763,  0.07270983]], dtype=float32)
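
Initializers are usually not called directly but passed to a layer via the kernel_initializer and bias_initializer arguments; for example:

# Dense layer whose kernel starts from a seeded LeCun normal draw and whose biases start at zero
Dense(4,
      kernel_initializer=keras.initializers.LecunNormal(seed=0),
      bias_initializer='zeros')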

Losses

[43]:
[x for x in dir(keras.losses) if
 x[0].isupper() and
 not x.startswith('_')]
[43]:
['BinaryCrossentropy',
 'CategoricalCrossentropy',
 'CategoricalHinge',
 'CosineSimilarity',
 'Hinge',
 'Huber',
 'KLD',
 'KLDivergence',
 'LogCosh',
 'Loss',
 'MAE',
 'MAPE',
 'MSE',
 'MSLE',
 'MeanAbsoluteError',
 'MeanAbsolutePercentageError',
 'MeanSquaredError',
 'MeanSquaredLogarithmicError',
 'Poisson',
 'Reduction',
 'SparseCategoricalCrossentropy',
 'SquaredHinge']

Example

[44]:
loss = keras.losses.BinaryCrossentropy()
[45]:
y_true = [1,0,0,1]
y_pred = [0.9, 0.2, 0.3, 0.8]
loss(y_true, y_pred).numpy()
[45]:
0.2270805
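
The same number can be reproduced by hand as the mean negative log-likelihood of the true labels (up to Keras's internal clipping of the predictions):

yt = np.array([1, 0, 0, 1])
yp = np.array([0.9, 0.2, 0.3, 0.8])

# Mean of -[y log(p) + (1 - y) log(1 - p)] over the four samples, roughly 0.2271
-np.mean(yt * np.log(yp) + (1 - yt) * np.log(1 - yp))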

Metrics

[46]:
[x for x in dir(keras.metrics) if
 x[0].isupper() and
 not x.startswith('_')]
[46]:
['AUC',
 'Accuracy',
 'BinaryAccuracy',
 'BinaryCrossentropy',
 'CategoricalAccuracy',
 'CategoricalCrossentropy',
 'CategoricalHinge',
 'CosineSimilarity',
 'FalseNegatives',
 'FalsePositives',
 'Hinge',
 'KLD',
 'KLDivergence',
 'LogCoshError',
 'MAE',
 'MAPE',
 'MSE',
 'MSLE',
 'Mean',
 'MeanAbsoluteError',
 'MeanAbsolutePercentageError',
 'MeanIoU',
 'MeanRelativeError',
 'MeanSquaredError',
 'MeanSquaredLogarithmicError',
 'MeanTensor',
 'Metric',
 'Poisson',
 'Precision',
 'PrecisionAtRecall',
 'Recall',
 'RecallAtPrecision',
 'RootMeanSquaredError',
 'SensitivityAtSpecificity',
 'SparseCategoricalAccuracy',
 'SparseCategoricalCrossentropy',
 'SparseTopKCategoricalAccuracy',
 'SpecificityAtSensitivity',
 'SquaredHinge',
 'Sum',
 'TopKCategoricalAccuracy',
 'TrueNegatives',
 'TruePositives']

Example

[47]:
metric = keras.metrics.Accuracy()
[48]:
metric.reset_states()
[49]:
metric.update_state(
    [[1], [2], [3]],
    [[1], [1], [3]]
)
[49]:
<tf.Variable 'UnreadVariable' shape=() dtype=float32, numpy=3.0>
[50]:
metric.result().numpy()
[50]:
0.6666667
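
Metrics are stateful: they keep accumulating across update_state calls until reset_states is called again. For example, adding one more (correct) comparison raises the running accuracy from 2/3 to 3/4:

# One more matching pair brings the tally to 3 correct out of 4
metric.update_state([[4]], [[4]])
metric.result().numpy()  # 0.75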

Optimizers

[51]:
[x for x in dir(keras.optimizers) if
 x[0].isupper() and
 not x.startswith('_')]
[51]:
['Adadelta',
 'Adagrad',
 'Adam',
 'Adamax',
 'Ftrl',
 'Nadam',
 'Optimizer',
 'RMSprop',
 'SGD']

Example

[52]:
opt = keras.optimizers.Adam(learning_rate=0.1)
[53]:
v = tf.Variable(10.0)
loss = lambda: v**2/2.0
n_steps = opt.minimize(loss, [v]).numpy()
[54]:
v.numpy()
[54]:
9.9
[55]:
n_steps = opt.minimize(loss, [v]).numpy()
[56]:
v.numpy()
[56]:
9.800028
[57]:
n_steps
[57]:
2
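
With Adam, the size of the first updates is roughly the learning rate regardless of the gradient's scale, which is why v moves from 10 to about 9.9 per step. Plain SGD instead steps by learning_rate × gradient; a quick sketch for comparison:

# Gradient of v**2/2 at v=10 is 10, so one SGD step moves v by 0.1 * 10 = 1.0
v2 = tf.Variable(10.0)
opt_sgd = keras.optimizers.SGD(learning_rate=0.1)
opt_sgd.minimize(lambda: v2**2 / 2.0, [v2])
v2.numpy()  # 9.0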

The TensorFlow Datasets project

The tensorflow-datasets package makes it simple to download standard datasets for deep learning.

[58]:
! python3 -m pip install --quiet tensorflow-datasets
[59]:
import tensorflow_datasets as tfds
[60]:
ds, info = tfds.load(name='fashion_mnist',
                     as_supervised=True,
                     with_info=True)
[61]:
ds
[61]:
{'test': <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>,
 'train': <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>}
[62]:
info
[62]:
tfds.core.DatasetInfo(
    name='fashion_mnist',
    version=3.0.1,
    description='Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.',
    homepage='https://github.com/zalandoresearch/fashion-mnist',
    features=FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
    }),
    total_num_examples=70000,
    splits={
        'test': 10000,
        'train': 60000,
    },
    supervised_keys=('image', 'label'),
    citation="""@article{DBLP:journals/corr/abs-1708-07747,
      author    = {Han Xiao and
                   Kashif Rasul and
                   Roland Vollgraf},
      title     = {Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning
                   Algorithms},
      journal   = {CoRR},
      volume    = {abs/1708.07747},
      year      = {2017},
      url       = {http://arxiv.org/abs/1708.07747},
      archivePrefix = {arXiv},
      eprint    = {1708.07747},
      timestamp = {Mon, 13 Aug 2018 16:47:27 +0200},
      biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1708-07747},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }""",
    redistribution_info=,
)
[63]:
X_train, X_test = ds['train'], ds['test']

Displaying the dataset

[65]:
tfds.as_dataframe(X_train.take(5), info)
[65]:
  image  label
0   Img  2 (Pullover)
1   Img  1 (Trouser)
2   Img  8 (Bag)
3   Img  4 (Coat)
4   Img  1 (Trouser)
[66]:
tfds.show_examples(X_train, info);
../_images/notebooks_B12_Keras_Building_Blocks_84_0.png

Feeding the data into a deep learning pipeline

The data is ready to be fed to a keras model.
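
A minimal sketch of what such a pipeline might look like (the scaling function, shuffle buffer, and batch size are illustrative choices, not prescribed by the dataset):

def scale(image, label):
    # Cast the uint8 pixels to float32 values in [0, 1]
    return tf.cast(image, tf.float32) / 255.0, label

train_ds = (X_train
            .map(scale)
            .shuffle(10_000)
            .batch(32)
            .prefetch(tf.data.experimental.AUTOTUNE))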

Coming in the next lecture.