Visualize Moving MNIST Python – Complete Datasets

MNIST is a large collection of square, 28×28-pixel grayscale images of handwritten single digits from 0 to 9. In total, there are 70,000 images: 60,000 in the training set and 10,000 in the test set. Each image is labeled with the digit it represents, giving ten classes (0 to 9). Creating a dataset like MNIST takes a lot of work: the data is hard to gather, and it is harder still to prepare it for a specific need.

Moving MNIST Dataset

The Moving MNIST dataset contains 10,000 video sequences of 20 frames each. In every sequence, two digits move independently around a frame with a spatial resolution of 64×64 pixels. The digits frequently overlap one another and bounce off the edges of the frame.
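To make this layout concrete, the sketch below builds a random stand-in array with the dimensions of one sequence (20 frames of 64×64 pixels) and tiles the frames side by side so the whole sequence can be viewed as a single image. The helper name `sequence_to_strip` and the synthetic data are assumptions made for illustration; they are not part of the dataset or of any library.

```python
import numpy as np

def sequence_to_strip(seq):
    """Tile a (n_frames, height, width) sequence into one wide image."""
    seq = np.asarray(seq)
    if seq.ndim != 3:
        raise ValueError("expected an array of shape (n_frames, height, width)")
    # Concatenate the frames along the width axis.
    return np.concatenate(list(seq), axis=1)

# Synthetic stand-in for one Moving MNIST sequence (random noise, not real data).
rng = np.random.default_rng(0)
fake_sequence = rng.integers(0, 256, size=(20, 64, 64), dtype=np.uint8)

strip = sequence_to_strip(fake_sequence)
print(strip.shape)  # (64, 1280)
```

The resulting strip can be passed to `pyplot.imshow(strip, cmap='gray')` to see all 20 frames in one row.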

Visualize MNIST Dataset Using Python

We can load MNIST with Keras and display the first few digits with Matplotlib using the following Python code.

from keras.datasets import mnist
from matplotlib import pyplot

# Load the dataset (it is downloaded on first use).
(train_X, train_y), (test_X, test_y) = mnist.load_data()

# Print the shapes of the image and label arrays.
print('X_train: ' + str(train_X.shape))
print('Y_train: ' + str(train_y.shape))
print('X_test:  ' + str(test_X.shape))
print('Y_test:  ' + str(test_y.shape))

# Show the first nine training images in a 3x3 grid.
for i in range(9):
    pyplot.subplot(3, 3, i + 1)
    pyplot.imshow(train_X[i], cmap='gray')
pyplot.show()

The output will be the following, together with a figure showing the first nine digits.

Downloading data from
11490434/11490434 [==============================] - 0s 0us/step
X_train: (60000, 28, 28)
Y_train: (60000,)
X_test:  (10000, 28, 28)
Y_test:  (10000,)
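As an alternative to drawing nine separate subplots, the images can be tiled into one array and shown with a single `imshow` call. The sketch below uses random stand-in data with the same shape as MNIST images, so it runs without downloading anything; `make_grid` is a hypothetical helper name, not a Keras or Matplotlib function.

```python
import numpy as np

def make_grid(images, nrows, ncols):
    """Arrange the first nrows*ncols images of shape (H, W) into one grid image."""
    images = np.asarray(images)[: nrows * ncols]
    n, h, w = images.shape
    if n < nrows * ncols:
        raise ValueError("not enough images to fill the grid")
    # (nrows, ncols, H, W) -> (nrows, H, ncols, W) -> (nrows*H, ncols*W)
    return (images.reshape(nrows, ncols, h, w)
                  .swapaxes(1, 2)
                  .reshape(nrows * h, ncols * w))

# Random stand-in for nine 28x28 digit images (not real MNIST data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(9, 28, 28), dtype=np.uint8)

grid = make_grid(fake_images, 3, 3)
print(grid.shape)  # (84, 84)
```

With real data, `make_grid(train_X, 3, 3)` would produce the same 3×3 layout as the subplot loop, in row-major order.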

Visualize Moving MNIST Dataset Using Python

This tutorial uses fastai to process image sequences. The model's task is to predict the next frames of a sequence. We work through a toy example in which MNIST digits slide across a canvas, so the task maps one tuple of images to another tuple of images.

  • First, we build a Moving MNIST dataset.
  • Then we prepare the data for a simple model that predicts how the digits will move.
  • To build the dataset, we define a few constants:
  • digit_size: the resolution of the MNIST images (28×28).
  • image_size: the size of the canvas (64×64).
  • step_length: how far a digit moves per frame, which determines how quickly it crosses the canvas.
  • Each digit is shifted along a randomly generated trajectory.

# The module name is missing in the original; fastai.vision.all is assumed here.
from fastai.vision.all import *

path = untar_data(URLs.MNIST)
files = get_image_files(path/'training')
img = load_image(files[0])

digit_size = 28
image_size = 64
step_length = 0.2
N = len(files)

def get_rand_img():
    # Not defined in the original; assumed to return a random training digit as a uint8 tensor.
    return tensor(load_image(files[np.random.randint(N)], mode='L'))

def get_random_trajectory(seq_length):
    "Generate a trajectory of top-left (x, y) pixel positions."
    canvas_size = image_size - digit_size
    x, y, v_x, v_y = np.random.random(4)
    out_x, out_y = [], []
    for i in range(seq_length):
        # Take a step along velocity.
        y += v_y * step_length
        x += v_x * step_length
        # Bounce off edges.
        if x <= 0:
            x = 0
            v_x = -v_x
        if x >= 1.0:
            x = 1.0
            v_x = -v_x
        if y <= 0:
            y = 0
            v_y = -v_y
        if y >= 1.0:
            y = 1.0
            v_y = -v_y
        out_x.append(x * canvas_size)
        out_y.append(y * canvas_size)
    return tensor(out_x, dtype=torch.uint8), tensor(out_y, dtype=torch.uint8)

x, y = get_random_trajectory(10)

def generate_moving_digit(n_frames, image_size=64):
    "Move one digit on the canvas."
    digit_image = get_rand_img()
    xs, ys = get_random_trajectory(n_frames)
    canvas = torch.zeros((n_frames, 1, image_size, image_size), dtype=torch.uint8)
    for i, (x, y) in enumerate(zip(xs, ys)):
        canvas[i, 0, y:(y+digit_size), x:(x+digit_size)] = digit_image
    return canvas

# Multiple digits with various trajectories can be combined simultaneously.
def generate_moving_digits(n_frames, digits=1):
    "Generate multiple digits and merge them with a pixel-wise maximum."
    return torch.stack([generate_moving_digit(n_frames) for n in range(digits)]).max(dim=0)[0]

digits = generate_moving_digits(5, 2)
# digits now holds five 64×64 frames, each showing both digits.
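The bouncing and compositing logic can also be tried without fastai or the MNIST download. The following is a minimal NumPy sketch of the same idea, not the tutorial's implementation: the function names, the `rng` parameter, and the solid white squares standing in for digit images are all assumptions made for illustration.

```python
import numpy as np

def random_trajectory(seq_length, canvas_size, step_length=0.2, rng=None):
    """Return integer top-left (x, y) positions for a point bouncing in the unit square."""
    rng = np.random.default_rng() if rng is None else rng
    x, y, v_x, v_y = rng.random(4)
    xs, ys = [], []
    for _ in range(seq_length):
        # Take a step along the velocity.
        x += v_x * step_length
        y += v_y * step_length
        # Clamp to the unit square and reverse direction at each wall.
        if x <= 0.0:
            x, v_x = 0.0, -v_x
        if x >= 1.0:
            x, v_x = 1.0, -v_x
        if y <= 0.0:
            y, v_y = 0.0, -v_y
        if y >= 1.0:
            y, v_y = 1.0, -v_y
        xs.append(int(x * canvas_size))
        ys.append(int(y * canvas_size))
    return xs, ys

def moving_digits(digit_images, n_frames, image_size=64, rng=None):
    """Move each square image along its own trajectory; overlaps keep the brighter pixel."""
    rng = np.random.default_rng() if rng is None else rng
    frames = np.zeros((n_frames, image_size, image_size), dtype=np.uint8)
    for digit in digit_images:
        size = digit.shape[0]
        xs, ys = random_trajectory(n_frames, image_size - size, rng=rng)
        for i, (x, y) in enumerate(zip(xs, ys)):
            window = frames[i, y:y + size, x:x + size]
            np.maximum(window, digit, out=window)  # in-place pixel-wise maximum
    return frames

# Two solid 28x28 squares stand in for digit images (not real MNIST data).
squares = [np.full((28, 28), 255, dtype=np.uint8) for _ in range(2)]
frames = moving_digits(squares, n_frames=5, rng=np.random.default_rng(0))
print(frames.shape)  # (5, 64, 64)
```

Scaling the unit-square position by `image_size - size` keeps the whole digit inside the canvas, which is why the trajectory is generated in normalized coordinates first.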

Because generate_moving_digits already returns a tensor, using fastai's mid-level API is fairly straightforward.

class ImageSeq(fastuple):
    @classmethod
    def create(cls, t, cl_type=TensorImageBW):
        return cls(tuple(cl_type(im) for im in t))
    def show(self, ctx=None, **kwargs):
        # Concatenate the frames along the width so the sequence is shown as one strip.
        return show_image(torch.cat([t for t in self], dim=-1), ctx=ctx, **self[0]._show_args, figsize=(10,5), **kwargs)

img_seq = ImageSeq.create(digits)

To split a sequence into (x, y), we write a simple function that takes the first n_in frames as input and the last n_out frames as the target.

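def get_items(n_in=3, n_out=3, n_digits=2):
    n_frames = n_in + n_out
    digits = generate_moving_digits(n_frames, n_digits)
    x, y = digits[0:n_in], digits[n_in:]
    return x, y

class ImageSeqTransform(Transform):
    def __init__(self, n_in, n_out, n_digits=2, cl_type=TensorImageBW):
        # The body is missing in the original; storing the arguments is assumed, since encodes reads them.
        self.n_in, self.n_out, self.n_digits, self.cl_type = n_in, n_out, n_digits, cl_type
    def encodes(self, idx):
        # idx is ignored: a fresh sequence is generated on every call.
        x, y = get_items(self.n_in, self.n_out, self.n_digits)
        return ImageSeq.create(x, self.cl_type), ImageSeq.create(y, self.cl_type)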
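The input/target split is plain slicing along the time axis. As a small illustration, the sketch below applies it to a dummy NumPy array with the same layout as the generated frames; `split_sequence` is a hypothetical name, and the array contents are placeholders.

```python
import numpy as np

def split_sequence(frames, n_in):
    """Split frames of shape (n_frames, ...) into (input, target) along the time axis."""
    if not 0 < n_in < len(frames):
        raise ValueError("n_in must leave at least one frame on each side")
    return frames[:n_in], frames[n_in:]

# Dummy sequence of 6 frames with shape (1, 64, 64) each (placeholder values).
sequence = np.arange(6 * 1 * 64 * 64).reshape(6, 1, 64, 64)
x, y = split_sequence(sequence, n_in=3)
print(x.shape, y.shape)  # (3, 1, 64, 64) (3, 1, 64, 64)
```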

We pass a list of indices to the TfmdLists constructor. Because the images are generated on the fly, the indices serve only as a counter.

idxs = range_of(10)
splits = [0,1,2,3,4,5,6,7], [8,9]
tls = TfmdLists(idxs, ImageSeqTransform(3,3), splits=splits)

We will combine everything into a DataLoaders object, and then we can begin training.

dls = tls.dataloaders(bs=4, after_batch=[IntToFloatTensor, Normalize.from_stats(*mnist_stats)])
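The two batch transforms convert the uint8 pixels to floats and then standardize them. A rough NumPy sketch of that arithmetic follows; it illustrates the idea and is not fastai's implementation. The mean and standard deviation are assumed to match fastai's `mnist_stats`, and `int_to_normalized_float` is a hypothetical name.

```python
import numpy as np

# Assumed to match fastai's mnist_stats; adjust if your version differs.
MNIST_MEAN, MNIST_STD = 0.131, 0.308

def int_to_normalized_float(batch_uint8, mean=MNIST_MEAN, std=MNIST_STD):
    """Scale uint8 pixels to [0, 1], then standardize with the given mean and std."""
    scaled = batch_uint8.astype(np.float32) / 255.0
    return (scaled - mean) / std

# A black pixel and a white pixel.
batch = np.array([[0, 255]], dtype=np.uint8)
out = int_to_normalized_float(batch)
print(out)
```

After this step, black background pixels map to a small negative value and white strokes to a value well above 1, which keeps the inputs roughly centered.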

Grabbing one batch and inspecting it with explode_types shows that we get three images as input and three as output.

b = dls.one_batch()
explode_types(b)

# Wrap the steps above in a helper that builds DataLoaders for N generated sequences.
def get_dls(n_in, n_out, N=100, bs=4):
    idxs = range_of(N)
    splits = RandomSplitter()(idxs)
    tls = TfmdLists(idxs, ImageSeqTransform(n_in, n_out), splits=splits)
    return tls.dataloaders(bs=bs, after_batch=[IntToFloatTensor, Normalize.from_stats(*mnist_stats)])

dls = get_dls(3, 3, N=1000, bs=4)
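For intuition about the random split, the sketch below shuffles the indices and holds out a fraction of them for validation. It illustrates the general idea and is not fastai's code: the 20% validation share is assumed to mirror the default of `RandomSplitter`, and `random_split` is a hypothetical name.

```python
import numpy as np

def random_split(n_items, valid_pct=0.2, seed=None):
    """Shuffle the indices 0..n_items-1 and return (train_indices, valid_indices)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_items)
    n_valid = int(valid_pct * n_items)
    return perm[n_valid:].tolist(), perm[:n_valid].tolist()

train_idx, valid_idx = random_split(1000, valid_pct=0.2, seed=42)
print(len(train_idx), len(valid_idx))  # 800 200
```

Because every sequence is generated on the fly, the split here only decides how many items each loader yields, not which images they contain.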

To display our ImageSeq objects, we must define a custom show_batch method using the @typedispatch decorator.

@typedispatch
def show_batch(x:ImageSeq, y:ImageSeq, samples, ctxs=None, max_n=6, nrows=None, ncols=2, figsize=None, **kwargs):
    if figsize is None:
        figsize = (ncols*6, max_n*1.2)
    if ctxs is None:
        _, ctxs = plt.subplots(min(x[0].shape[0], max_n), ncols, figsize=figsize)
    # Left column: input frames. Right column: target frames.
    for i, ctx in enumerate(ctxs):
        samples[i][0].show(ctx=ctx[0]), samples[i][1].show(ctx=ctx[1])


In this article, we covered the MNIST dataset, the Moving MNIST dataset, and how to visualize them using Python. MNIST consists of still images of the handwritten digits 0 to 9, while Moving MNIST consists of video sequences in which those digits move across the frame.


What does the Python function plot() do?

Matplotlib's plot() function draws points (markers) in a diagram. By default, it draws a line from one point to the next. The function takes parameters that specify the points in the diagram.
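A short sketch of both behaviors, assuming Matplotlib is installed: the first call joins the points with a line, and the second uses the format string "o" to draw markers only. The headless "Agg" backend is selected only so the example also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
from matplotlib import pyplot

xs = [0, 1, 2, 3]
ys = [0, 1, 4, 9]

fig, ax = pyplot.subplots()
(line,) = ax.plot(xs, ys)          # default: points joined by a line
(markers,) = ax.plot(xs, ys, "o")  # format string "o": circle markers only
ax.set_xlabel("x")
ax.set_ylabel("y")
```

In an interactive session, calling `pyplot.show()` afterwards displays the figure.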

What is the MNIST dataset?

The MNIST collection includes a large number of handwritten digits. It is a well-known dataset in the field of image processing and is widely used to test machine learning approaches. MNIST is an abbreviation for “Modified National Institute of Standards and Technology database.”
