Learning

Learning probabilistic models. (Chapter 20)

class probabilistic_learning.CountingProbDist(observations=None, default=0)[source]

Bases: object

A probability distribution formed by observing and counting examples. If p is an instance of this class and o is an observed value, then there are 3 main operations: p.add(o) increments the count for observation o by 1. p.sample() returns a random element from the distribution. p[o] returns the probability for o (as in a regular ProbDist).

add(o)[source]: Add an observation o to the distribution.

smooth_for(o)[source]: Include o among the possible observations, whether or not it’s been observed yet.

top(n)[source]: Return (count, obs) tuples for the n most frequent observations.

sample()[source]: Return a random sample from the distribution.
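As a rough illustration of the operations listed above, the sketch below builds a count-based distribution from scratch. It is not the library's implementation: the class name TinyCountingDist and the way the default pseudo-count is applied are assumptions made for this example.

```python
import random
from collections import Counter


class TinyCountingDist:
    """Hypothetical stand-in illustrating a count-based distribution."""

    def __init__(self, observations=None, default=0):
        self.counts = Counter()
        self.default = default  # assumed: starting pseudo-count for a new value
        for o in observations or []:
            self.add(o)

    def add(self, o):
        """Increment the count for observation o by 1."""
        self.smooth_for(o)
        self.counts[o] += 1

    def smooth_for(self, o):
        """Make o a known value, whether or not it has been observed."""
        if o not in self.counts:
            self.counts[o] = self.default

    def __getitem__(self, o):
        """Return the estimated probability of o."""
        self.smooth_for(o)
        total = sum(self.counts.values())
        return self.counts[o] / total if total else 0.0

    def top(self, n):
        """Return (count, obs) tuples for the n most frequent observations."""
        return [(c, o) for o, c in self.counts.most_common(n)]

    def sample(self, rng=random):
        """Draw one value with probability proportional to its count."""
        values = list(self.counts)
        weights = [self.counts[v] for v in values]
        return rng.choices(values, weights=weights)[0]


p = TinyCountingDist(["a", "b", "a", "a"])
print(p["a"], p.top(1))
```

With a non-zero default, every value registered through smooth_for starts with that pseudo-count, which keeps unseen values from getting probability zero.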

probabilistic_learning.NaiveBayesLearner(dataset, continuous=True, simple=False)[source]: Return a naive Bayes classifier for dataset, dispatching to the simple, continuous (Gaussian), or discrete variant according to simple and continuous.

probabilistic_learning.NaiveBayesSimple(distribution)[source]

A simple naive Bayes classifier that takes a dictionary of CountingProbDist objects and classifies items according to these distributions. The input dictionary has the following form:

(ClassName, ClassProb): CountingProbDist

probabilistic_learning.NaiveBayesDiscrete(dataset)[source]: Just count how many times each value of each input attribute occurs, conditional on the target value. Count the different target values too.

probabilistic_learning.NaiveBayesContinuous(dataset)[source]: Count how many times each target value occurs. Also, find the means and deviations of input attribute values for each target value.
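The counting step described for the discrete variant can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: examples are plain tuples, the function names are hypothetical, and add-one smoothing with an assumed number of values per attribute (n_values) is used so that unseen values do not zero out a class.

```python
import math
from collections import Counter, defaultdict


def train_discrete_nb(examples, target_index):
    """Count target values and attribute values conditional on the target."""
    target_counts = Counter()
    attr_counts = defaultdict(Counter)  # (target, attr_index) -> value counts
    for ex in examples:
        t = ex[target_index]
        target_counts[t] += 1
        for i, v in enumerate(ex):
            if i != target_index:
                attr_counts[(t, i)][v] += 1
    return target_counts, attr_counts


def predict_discrete_nb(model, example, target_index, n_values=2):
    """Pick the class maximising log P(class) + sum of log P(value | class)."""
    target_counts, attr_counts = model
    total = sum(target_counts.values())

    def score(t):
        s = math.log(target_counts[t] / total)
        for i, v in enumerate(example):
            if i == target_index:
                continue
            # Add-one smoothing; n_values is an assumption of this sketch.
            s += math.log((attr_counts[(t, i)][v] + 1) / (target_counts[t] + n_values))
        return s

    return max(target_counts, key=score)


# Toy data: (outlook, windy, play); the last column is the target.
rows = [
    ("sunny", "no", "yes"),
    ("sunny", "no", "yes"),
    ("rainy", "yes", "no"),
    ("rainy", "no", "no"),
    ("sunny", "yes", "yes"),
]
model = train_discrete_nb(rows, target_index=2)
prediction = predict_discrete_nb(model, ("sunny", "no", None), target_index=2)
print(prediction)
```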

Deep learning. (Chapter 20)

class deep_learning4e.Node(weights=None, value=None)[source]

Bases: object

A single unit of a layer in a neural network. :param weights: weights between parent nodes and current node :param value: value of current node

class deep_learning4e.Layer(size)[source]

Bases: object

A layer in a neural network based on a computational graph. :param size: number of units in the current layer

forward(inputs)[source]: Define the operation to get the output of this layer

class deep_learning4e.Activation[source]

Bases: object

Abstract base class for neural-network activation functions.

Subclasses implement function and its derivative; calling an instance applies the activation to its input.

function(x)[source]: Apply the activation function to input x.

derivative(x)[source]: Return the derivative of the activation function at x.

class deep_learning4e.Sigmoid[source]

Bases: Activation

Logistic sigmoid activation, 1 / (1 + e**-x).

function(x)[source]: Return the logistic sigmoid of x.

derivative(value)[source]: Return the sigmoid derivative given the layer output value.

class deep_learning4e.ReLU[source]

Bases: Activation

Rectified Linear Unit activation, max(0, x).

function(x)[source]: Return max(0, x).

derivative(value)[source]: Return the ReLU derivative (1 if value > 0 else 0).

class deep_learning4e.ELU(alpha=0.01)[source]

Bases: Activation

Exponential Linear Unit activation, with scale alpha for non-positive inputs.

function(x)[source]: Return x if positive else alpha * (e**x - 1).

derivative(value)[source]: Return the ELU derivative given the layer output value.

class deep_learning4e.LeakyReLU(alpha=0.01)[source]

Bases: Activation

Leaky ReLU activation, with small slope alpha for negative inputs.

function(x)[source]: Return max(x, alpha * x).

derivative(value)[source]: Return the Leaky ReLU derivative (1 if value > 0 else alpha).

class deep_learning4e.Tanh[source]

Bases: Activation

Hyperbolic tangent activation.

function(x)[source]: Return tanh(x).

derivative(value)[source]: Return the tanh derivative given the layer output value (1 - value**2).

class deep_learning4e.SoftMax[source]

Bases: Activation

Softmax activation that normalises a vector into a probability distribution.

function(x)[source]: Return the softmax of vector x (normalised exponentials).

derivative(x)[source]: Return a placeholder unit gradient for each element of x.

class deep_learning4e.SoftPlus[source]

Bases: Activation

SoftPlus activation, log(1 + e**x) (a smooth approximation of ReLU).

function(x)[source]: Return log(1 + e**x) for x.

derivative(x)[source]: Return the SoftPlus derivative at x (the logistic sigmoid).

class deep_learning4e.Linear[source]

Bases: Activation

Identity (linear) activation that returns its input unchanged.

function(x)[source]: Return x unchanged.

derivative(x)[source]: Return an all-ones gradient matching the shape of x.

class deep_learning4e.InputLayer(size=3)[source]

Bases: Layer

1D input layer. Layer size is the same as input vector size.

forward(inputs)[source]: Take each value of the inputs to each unit in the layer.

class deep_learning4e.OutputLayer(size=3)[source]

Bases: Layer

1D softmax output layer (Section 19.3.2).

forward(inputs, activation=<class 'deep_learning4e.SoftMax'>)[source]: Apply activation (softmax by default) to inputs and store it in each node.

class deep_learning4e.DenseLayer(in_size=3, out_size=3, activation=<class 'deep_learning4e.Sigmoid'>)[source]

Bases: Layer

1D dense layer in a neural network. :param in_size: (int) input vector size :param out_size: (int) output vector size :param activation: (Activation object) activation function

forward(inputs)[source]: Apply the activation to each unit’s weighted sum of inputs and return the outputs.

class deep_learning4e.ConvLayer1D(size=3, kernel_size=3)[source]

Bases: Layer

1D convolution layer in a neural network. :param kernel_size: convolution kernel size

forward(features)[source]: Convolve each input channel in features with its node kernel and return the outputs.

class deep_learning4e.MaxPoolingLayer1D(size=3, kernel_size=3)[source]

Bases: Layer

1D max pooling layer in a neural network. :param kernel_size: max pooling area size

forward(features)[source]: Apply 1D max pooling over each channel in features and return the pooled outputs.

class deep_learning4e.BatchNormalizationLayer(size, eps=0.001)[source]

Bases: Layer

Batch normalization layer.

forward(inputs)[source]: Normalise inputs by their mean and std, then scale and shift by the layer weights.

deep_learning4e.init_examples(examples, idx_i, idx_t, o_units)[source]: Init examples from dataset.examples.

deep_learning4e.stochastic_gradient_descent(dataset, net, loss, epochs=1000, l_rate=0.01, batch_size=1, verbose=False)[source]: Gradient descent algorithm to update the learnable parameters of a network. :return: the updated network

deep_learning4e.adam(dataset, net, loss, epochs=1000, rho=(0.9, 0.999), delta=1e-08, l_rate=0.001, batch_size=1, verbose=False)[source]: [Figure 19.6] Adam optimizer to update the learnable parameters of a network. Required parameters are similar to gradient descent. :return: the updated network
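To make the roles of rho, delta and l_rate concrete, here is a sketch of one Adam update on a single scalar parameter, following the standard formulation of the algorithm. The function name adam_step and the toy objective are assumptions; the library applies the same idea to whole weight arrays and may differ in detail.

```python
import math


def adam_step(theta, grad, m, v, t, l_rate=0.001, rho=(0.9, 0.999), delta=1e-8):
    """Return updated (theta, m, v) after one Adam step at time step t >= 1."""
    m = rho[0] * m + (1 - rho[0]) * grad       # first-moment (mean) estimate
    v = rho[1] * v + (1 - rho[1]) * grad ** 2  # second-moment estimate
    m_hat = m / (1 - rho[0] ** t)              # bias correction for early steps
    v_hat = v / (1 - rho[1] ** t)
    theta = theta - l_rate * m_hat / (math.sqrt(v_hat) + delta)
    return theta, m, v


# Minimise f(theta) = (theta - 3)**2, whose gradient is 2 * (theta - 3).
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * (theta - 3), m, v, t, l_rate=0.05)
print(theta)
```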

deep_learning4e.BackPropagation(inputs, targets, theta, net, loss)[source]: The back-propagation algorithm for multilayer networks in only one epoch, to calculate gradients of theta. :param inputs: a batch of inputs in an array. Each input is an iterable object :param targets: a batch of targets in an array. Each target is an iterable object :param theta: parameters to be updated :param net: a list of predefined layer objects representing their linear sequence :param loss: a predefined loss function taking array of inputs and targets :return: gradients of theta, loss of the input batch

deep_learning4e.get_batch(examples, batch_size=1)[source]: Split examples into multiple batches
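Splitting examples into batches amounts to slicing the list in steps of batch_size. The generator below is a hypothetical sketch; whether the library shuffles the examples or keeps a shorter final batch is not stated in the source.

```python
def get_batches(examples, batch_size=1):
    """Yield consecutive slices of examples; the last batch may be shorter."""
    for i in range(0, len(examples), batch_size):
        yield examples[i:i + batch_size]


batches = list(get_batches([1, 2, 3, 4, 5], batch_size=2))
print(batches)
```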

class deep_learning4e.NeuralNetworkLearner(dataset, hidden_layer_sizes, l_rate=0.01, epochs=1000, batch_size=10, optimizer=<function stochastic_gradient_descent>, loss=<function mean_squared_error_loss>, verbose=False, plot=False)[source]

Bases: object

Simple dense multilayer neural network. :param hidden_layer_sizes: size of hidden layers in the form of a list

fit(X, y)[source]: Train the network with the configured optimizer and loss, returning self.

predict(example)[source]: Forward-pass example through the trained net and return the index of the max output.

class deep_learning4e.PerceptronLearner(dataset, l_rate=0.01, epochs=1000, batch_size=10, optimizer=<function stochastic_gradient_descent>, loss=<function mean_squared_error_loss>, verbose=False, plot=False)[source]

Bases: object

Simple perceptron neural network.

fit(X, y)[source]: Train the perceptron with the configured optimizer and loss, returning self.

predict(example)[source]: Forward-pass example and return the index of the maximum output unit.

deep_learning4e.keras_dataset_loader(dataset, max_length=500)[source]: Helper function to load keras datasets. :param dataset: keras data set type :param max_length: max length of each input sequence

deep_learning4e.SimpleRNNLearner(train_data, val_data, epochs=2, verbose=False)[source]

RNN example for text sentiment analysis.

Parameters:

train_data – a tuple of (training data, targets). Training data: an ndarray of training examples, each encoded by an embedding. Targets: an ndarray holding the target of each example, each mapped to an integer
val_data – a tuple of (validation data, targets)
epochs – number of epochs
verbose – verbosity mode

Returns:

a keras model

deep_learning4e.AutoencoderLearner(inputs, encoding_size, epochs=200, verbose=False)[source]: Simple example of a linear autoencoder that learns to reproduce its input. :param inputs: a batch of input data in np.ndarray type :param encoding_size: int, the size of encoding layer :param epochs: number of epochs :param verbose: verbosity mode :return: a keras model

Reinforcement Learning (Chapter 21)

class reinforcement_learning.PassiveDUEAgent(pi, mdp)[source]

Bases: object

Passive (non-learning) agent that uses direct utility estimation on a given MDP and policy:

from reinforcement_learning import PassiveDUEAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
agent = PassiveDUEAgent(policy, sequential_decision_environment)
for i in range(200):
    run_single_trial(agent,sequential_decision_environment)
    agent.estimate_U()
agent.U[(0, 0)] > 0.2
True

estimate_U()[source]: Update utility estimates from the most recent completed trial by direct utility estimation: average the observed reward-to-go for each visited state, blend it with the running estimate, and reset the trial history. Must be called only once the MDP has reached a terminal state.
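Direct utility estimation treats the reward-to-go observed from each state in a finished trial as a sample of that state's utility. The sketch below shows the idea with an every-visit running average; the function names and the exact blending rule are assumptions and may differ from what estimate_U does.

```python
def reward_to_go(rewards, gamma=1.0):
    """Return the discounted reward-to-go for every step of one finished trial."""
    totals, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        totals.append(running)
    return totals[::-1]


def update_estimates(U, counts, states, rewards, gamma=1.0):
    """Fold one trial into the running averages U, using visit counts per state."""
    for s, g in zip(states, reward_to_go(rewards, gamma)):
        counts[s] = counts.get(s, 0) + 1
        old = U.get(s, 0.0)
        U[s] = old + (g - old) / counts[s]  # incremental mean
    return U


# One trial that visits a, b and then a terminal state with reward +1.
U = update_estimates({}, {}, ["a", "b", "goal"], [-0.04, -0.04, 1.0])
print(U)
```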

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward)

class reinforcement_learning.PassiveADPAgent(pi, mdp)[source]

Bases: object

[Figure 21.2] Passive (non-learning) agent that uses adaptive dynamic programming on a given MDP and policy:

from reinforcement_learning import PassiveADPAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
agent = PassiveADPAgent(policy, sequential_decision_environment)
for i in range(100):
    run_single_trial(agent,sequential_decision_environment)

agent.U[(0, 0)] > 0.2
True
agent.U[(0, 1)] > 0.2
True

class ModelMDP(init, actlist, terminals, gamma, states)[source]

Bases: MDP

A modified version of the input MDP with an editable transition model P and a custom function T.

T(s, a)[source]: Return a list of tuples with probabilities for states based on the learnt model P.

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward).

class reinforcement_learning.PassiveTDAgent(pi, mdp, alpha=None)[source]

Bases: object

[Figure 21.4] Abstract class for a passive (non-learning) agent that uses temporal differences to learn utility estimates. Override the update_state method to convert a percept into a state and reward. The mdp provided should be an instance of a subclass of the MDP class:

from reinforcement_learning import PassiveTDAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
agent = PassiveTDAgent(policy, sequential_decision_environment, alpha=lambda n: 60./(59+n))
for i in range(200):
    run_single_trial(agent,sequential_decision_environment)

agent.U[(0, 0)] > 0.2
True
agent.U[(0, 1)] > 0.2
True

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward).
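The temporal-difference update behind this agent moves the estimate for a state toward the observed reward plus the discounted estimate of the successor state. The sketch below is a standalone illustration with hypothetical names (td_update, step_size); it reuses the decaying step size 60 / (59 + n) from the example above.

```python
def td_update(U, s, r, s_next, alpha, gamma=0.9):
    """Apply U[s] <- U[s] + alpha * (r + gamma * U[s_next] - U[s]) in place."""
    U.setdefault(s, 0.0)
    U.setdefault(s_next, 0.0)
    U[s] += alpha * (r + gamma * U[s_next] - U[s])
    return U[s]


def step_size(n):
    """Decaying learning rate as a function of the visit count n."""
    return 60.0 / (59 + n)


U = {"goal": 1.0}
new_value = td_update(U, "a", -0.04, "goal", step_size(1))
print(new_value)
```

A step size that shrinks with the visit count lets early trials move the estimates quickly while later trials only fine-tune them.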

class reinforcement_learning.QLearningAgent(mdp, Ne, Rplus, alpha=None)[source]

Bases: object

[Figure 21.8] An exploratory Q-learning agent. It avoids having to learn the transition model because the Q-value of a state can be related directly to those of its neighbors:

from reinforcement_learning import QLearningAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
q_agent = QLearningAgent(sequential_decision_environment, Ne=5, Rplus=2, alpha=lambda n: 60./(59+n))
for i in range(200):
    run_single_trial(q_agent,sequential_decision_environment)

q_agent.Q[((0, 1), (0, 1))] >= -0.5
True
q_agent.Q[((1, 0), (0, -1))] <= 0.5
True

f(u, n)[source]: Exploration function. Returns fixed Rplus until agent has visited state, action a Ne number of times. Same as ADP agent in book.

actions_in_state(state)[source]: Return actions possible in given state. Useful for max and argmax.

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward).

class reinforcement_learning.SARSALearningAgent(mdp, Ne, Rplus, alpha=None)[source]

Bases: QLearningAgent

[Section 21.3] An on-policy temporal-difference control agent (SARSA: State-Action-Reward-State-Action). It is identical to the Q-learning agent except for the update rule: instead of bootstrapping on the maximum Q-value over next actions, SARSA bootstraps on the Q-value of the action a1 that its exploration policy will actually take in the next state. Being on-policy, SARSA learns the value of the policy it is following, exploration included, rather than that of the greedy policy.
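The difference between the two update rules comes down to the bootstrap target. The sketch below contrasts them on a toy Q-table; all function names are hypothetical, and the fixed alpha and gamma values are chosen only for illustration.

```python
def q_learning_target(Q, r, s_next, actions, gamma=0.9):
    """Off-policy target: bootstrap on the best next action."""
    return r + gamma * max(Q.get((s_next, a), 0.0) for a in actions)


def sarsa_target(Q, r, s_next, a_next, gamma=0.9):
    """On-policy target: bootstrap on the action actually chosen next."""
    return r + gamma * Q.get((s_next, a_next), 0.0)


def td_control_update(Q, s, a, target, alpha=0.5):
    """Move Q[(s, a)] a fraction alpha of the way toward target."""
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (target - old)
    return Q[(s, a)]


Q = {("s1", "left"): 1.0, ("s1", "right"): 0.0}
q_target = q_learning_target(Q, r=0.0, s_next="s1", actions=["left", "right"])
s_target = sarsa_target(Q, r=0.0, s_next="s1", a_next="right")
print(q_target, s_target)
```

When the exploration policy picks the worse action ("right" here), the SARSA target is lower than the Q-learning target, which is how exploration ends up reflected in the learned values.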

reinforcement_learning.run_single_trial(agent_program, mdp)[source]: Execute a trial for the given agent_program and mdp. mdp should be an instance of a subclass of mdp.MDP.

Reinforcement Learning (Chapter 21)

class reinforcement_learning4e.PassiveDUEAgent(pi, mdp)[source]

Bases: object

Passive (non-learning) agent that uses direct utility estimation on a given MDP and policy:

from reinforcement_learning4e import PassiveDUEAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
agent = PassiveDUEAgent(policy, sequential_decision_environment)
for i in range(200):
    run_single_trial(agent,sequential_decision_environment)
    agent.estimate_U()
agent.U[(0, 0)] > 0.2
True

estimate_U()[source]: Update utility estimates from the most recent completed trial by direct utility estimation: average the observed reward-to-go for each visited state, blend it with the running estimate, and reset the trial history. Must be called only once the MDP has reached a terminal state.

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward)

class reinforcement_learning4e.PassiveADPAgent(pi, mdp)[source]

Bases: object

[Figure 21.2] Passive (non-learning) agent that uses adaptive dynamic programming on a given MDP and policy:

from reinforcement_learning4e import PassiveADPAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
agent = PassiveADPAgent(policy, sequential_decision_environment)
for i in range(100):
    run_single_trial(agent,sequential_decision_environment)

agent.U[(0, 0)] > 0.2
True
agent.U[(0, 1)] > 0.2
True

class ModelMDP(init, actlist, terminals, gamma, states)[source]

Bases: MDP

A modified version of the input MDP with an editable transition model P and a custom function T.

T(s, a)[source]: Return a list of tuples with probabilities for states based on the learnt model P.

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward).

class reinforcement_learning4e.PassiveTDAgent(pi, mdp, alpha=None)[source]

Bases: object

[Figure 21.4] Abstract class for a passive (non-learning) agent that uses temporal differences to learn utility estimates. Override the update_state method to convert a percept into a state and reward. The mdp provided should be an instance of a subclass of the MDP class:

from reinforcement_learning4e import PassiveTDAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
agent = PassiveTDAgent(policy, sequential_decision_environment, alpha=lambda n: 60./(59+n))
for i in range(200):
    run_single_trial(agent,sequential_decision_environment)

agent.U[(0, 0)] > 0.2
True
agent.U[(0, 1)] > 0.2
True

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward).

class reinforcement_learning4e.QLearningAgent(mdp, Ne, Rplus, alpha=None)[source]

Bases: object

[Figure 21.8] An exploratory Q-learning agent. It avoids having to learn the transition model because the Q-value of a state can be related directly to those of its neighbors:

from reinforcement_learning4e import QLearningAgent, run_single_trial
from mdp import sequential_decision_environment
north = (0, 1)
south = (0,-1)
west = (-1, 0)
east = (1, 0)
policy = {(0, 2): east, (1, 2): east, (2, 2): east, (3, 2): None, (0, 1): north, (2, 1): north,
          (3, 1): None, (0, 0): north, (1, 0): west, (2, 0): west, (3, 0): west,}
q_agent = QLearningAgent(sequential_decision_environment, Ne=5, Rplus=2, alpha=lambda n: 60./(59+n))
for i in range(200):
    run_single_trial(q_agent,sequential_decision_environment)

q_agent.Q[((0, 1), (0, 1))] >= -0.5
True
q_agent.Q[((1, 0), (0, -1))] <= 0.5
True

f(u, n)[source]: Exploration function. Returns fixed Rplus until agent has visited state, action a Ne number of times. Same as ADP agent in book.

actions_in_state(state)[source]: Return actions possible in given state. Useful for max and argmax.

update_state(percept)[source]: To be overridden in most cases. The default case assumes the percept to be of type (state, reward).

class reinforcement_learning4e.SARSALearningAgent(mdp, Ne, Rplus, alpha=None)[source]

Bases: QLearningAgent

[Section 22.3] An on-policy temporal-difference control agent (SARSA: State-Action-Reward-State-Action). It is identical to the Q-learning agent except for the update rule: instead of bootstrapping on the maximum Q-value over next actions, SARSA bootstraps on the Q-value of the action a1 that its exploration policy will actually take in the next state. Being on-policy, SARSA learns the value of the policy it is following, exploration included, rather than that of the greedy policy.

reinforcement_learning4e.run_single_trial(agent_program, mdp)[source]: Execute a trial for the given agent_program and mdp. mdp should be an instance of a subclass of mdp.MDP.

Perception (Chapter 24)

perception4e.array_normalization(array, range_min, range_max)[source]: Normalize an array to the range (range_min, range_max).
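Min-max normalisation to a target range can be sketched on a plain list as below. The function name is hypothetical, and returning range_min for a constant input is an assumed convention; the library function operates on arrays and may treat that edge case differently.

```python
def normalize_to_range(values, range_min, range_max):
    """Linearly rescale values so min maps to range_min and max to range_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [float(range_min)] * len(values)  # assumed convention for constant input
    scale = (range_max - range_min) / (hi - lo)
    return [range_min + (v - lo) * scale for v in values]


scaled = normalize_to_range([2, 4, 6], 0, 255)
print(scaled)
```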

perception4e.gradient_edge_detector(image)[source]: Image edge detection by calculating gradients in the image :param image: numpy ndarray or an iterable object :return: numpy ndarray, representing a gray scale image

perception4e.gaussian_derivative_edge_detector(image)[source]: Image edge detector using derivative-of-Gaussian kernels.

perception4e.laplacian_edge_detector(image)[source]: Extract image edges with a Laplacian filter.

perception4e.show_edges(edges)[source]: Helper function to display an edge image.

perception4e.sum_squared_difference(pic1, pic2)[source]: Sum of squared differences (SSD) of two frames.
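The sum of squared differences compares two equally sized frames pixel by pixel. A minimal sketch on nested lists follows; the function name ssd is hypothetical, and the library version works on arrays.

```python
def ssd(pic1, pic2):
    """Sum the squared per-pixel differences of two equally sized 2D frames."""
    return sum(
        (a - b) ** 2
        for row1, row2 in zip(pic1, pic2)
        for a, b in zip(row1, row2)
    )


difference = ssd([[1, 2], [3, 4]], [[1, 0], [0, 4]])
print(difference)
```

A value of 0 means the frames are identical; larger values indicate more change between them.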

perception4e.gen_gray_scale_picture(size, level=3)[source]

Generate a picture with different gray scale levels

Parameters:

size – size of generated picture
level – the number of gray-scale levels in the picture; the range (0, 255) is divided equally among the levels

Returns:

image as a numpy ndarray

perception4e.probability_contour_detection(image, discs, threshold=0)[source]: Detect edges/contours by applying a set of discs to an image. :param image: an image as a numpy ndarray :param discs: a set of discs/filters to apply to pixels of image :param threshold: threshold to tell whether the pixel at (x, y) is on an edge :return: image showing edges, as a numpy ndarray

perception4e.group_contour_detection(image, cluster_num=2)[source]: Detecting contours in an image with k-means clustering :param image: an image in numpy ndarray type :param cluster_num: number of clusters in k-means

perception4e.image_to_graph(image)[source]: Convert an image to a graph in adjacency-matrix form.

perception4e.generate_edge_weight(image, v1, v2)[source]: Find the edge weight between two vertices in an image. :param image: image in numpy ndarray type :param v1, v2: vertices in the image in the form (x index, y index)

class perception4e.Graph(image)[source]

Bases: object

Graph in adjacency-matrix form representing an image.

bfs(s, t, parent)[source]: Breadth-first search to tell whether there is a path between source s and sink t. :param parent: a list to save the path between s and t

min_cut(source, sink)[source]: Find the minimum cut of the graph between source and sink

perception4e.gen_discs(init_scale, scales=1)[source]: Generate a collection of disc pairs by splitting a round disc at different angles. :param init_scale: the initial size of each half disc :param scales: number of scales for each type of half disc; the size is doubled at each scale :return: the collection of generated discs: [discs of scale1, discs of scale2…]

perception4e.load_MINST(train_size, val_size, test_size)[source]: Load the MNIST dataset from Keras.

perception4e.simple_convnet(size=3, num_classes=10)[source]: Simple convolutional network for digit recognition. :param size: number of convolution layers :param num_classes: number of output classes :return: a convolutional network as a Keras model

perception4e.train_model(model)[source]: Train the simple convolutional network.

perception4e.selective_search(image)[source]: Selective search for object detection. :param image: str, the path of image or image in ndarray type with 3 channels :return: list of bounding boxes, each element is in form of [x_min, y_min, x_max, y_max]

perception4e.pool_rois(feature_map, rois, pooled_height, pooled_width)[source]: Applies ROI pooling for a single image and various ROIs. :param feature_map: ndarray, in shape of (width, height, channel) :param rois: list of roi :param pooled_height: height of pooled area :param pooled_width: width of pooled area :return: list of pooled features

perception4e.pool_roi(feature_map, roi, pooled_height, pooled_width)[source]: Applies a single ROI pooling to a single image. :param feature_map: ndarray, in shape of (width, height, channel) :param roi: region of interest, in form of [x_min_ratio, y_min_ratio, x_max_ratio, y_max_ratio] :return: feature of pooling output, in shape of (pooled_width, pooled_height)