Creating Human Photographic Memory – Artificially.

All of the code from this blog is available at the GitHub repo and can be run from scratch.

What is Photographic Memory ?

The Occipital lobe (in red) of the cerebral cortex stores and processes visual imagery in mammals. [src]

Photographic memory, also known as Eidetic memory, is the ability to recall an image from memory accurately. In humans and other mammals, the cerebral cortex is the part of the brain that deals with attention, perception, awareness, thought, consciousness and memory.

The Occipital lobe is one of the four main lobes of the cerebral cortex. Positioned at the back of the head, it is the main visual processing unit of a mammalian brain.

Our Objective

The objective of this exercise is to create a computer system that can uniquely remember an image or a kind of object, with all its physical properties. The system should also be able to differentiate between the object being remembered and memories of other kinds of objects.

Such a computer system should then be able to replace photographic memories with digital files, making memories permanent. Given enough computational power, this system should also be able to stream memories in and out of a biological brain via an external neural link.

Advanced Computer Vision

Today’s Artificial Intelligence vision architectures can match or outperform human vision on a number of specific tasks. A photographic memory implementation based on such architectures could be used to process extremely detailed memories. However, to keep things simple, we ask only a few things of this system:

  1. Can it remember what a bicycle looks like? Or a tree?
  2. Can it differentiate between the two?
  3. Can it tell you from memory whether a new image is that of a bicycle (or a tree)?
  4. Can it scale indefinitely to hold memories of all objects around us, and even remember a particular bicycle?

We are going to cover points 1, 2 and 3 in this post and lay the groundwork for point 4.

A Convolutional Neural Network

A CNN in action (src)

A Convolutional Neural Network (CNN) is a specific category of Artificial Neural Network, often used for vision-based tasks. Its design was inspired by the biological design of the visual cortex, the part of the brain located inside the Occipital lobe.
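To make the word "convolutional" concrete, here is a minimal sketch of the core operation in plain NumPy. It is not the code from the repo: the toy image and the hand-written edge filter are assumptions chosen for illustration, whereas a real CNN learns its filters from data.

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before sliding.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch under the kernel element-wise and sum it up.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 grayscale image (assumed for illustration): dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge filter; a trained CNN would learn filters like this.
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])

feature_map = conv2d(image, vertical_edge)
print(feature_map)
```

The feature map is non-zero only where the filter straddles the dark-to-bright boundary, which is how a single filter "notices" one kind of visual pattern. A CNN stacks many such filters in layers.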

The Visual Cortex (src)
A Simple Neural Network

Although biological memory representation is contained in the parts of the brain we are discussing, the computer system we will design aims to emulate the concept of remembrance in general. It does not replace or replicate the minute cellular or sub-cellular structure of the brain and its parts.

Our Goal

We want to create a computer system that can remember what bicycles look like and what trees look like. It should then be able to tell the difference between them.

Approach

  1. We’ll create a CNN and train it to learn to remember bicycles and trees.
  2. The training procedure will ensure that the network learns to understand the two kinds of objects, bicycles and trees, as distinct from each other.
  3. We would then ask the CNN to show us how it understands the concept of a bicycle and the concept of a tree. This will involve extracting the embedding or cluster properties of this particular object being seen by the network.

The embedding is a commonly used term by Data Scientists to represent qualities of an input after it gets processed by a Neural Network. For simple understanding, we will refer to this as the representation of the memory that the Neural Network creates, of whatever was shown to it.

Training the Network

Data, the model architecture and the training procedure (hyperparameters etc.) are the main elements that go into creating an accurate Neural Network. Let’s start with the first one.

Data

We’ll download data for bicycles and trees using DuckDuckGo’s search API. Let’s look at some samples:

The DuckDuckGo search API can be used to fetch thousands of images that we can train our network on.

Training the CNN

It takes less than 10 seconds to train a network to more than 99% accuracy. This training was done on a typical laptop, not a high-end machine or GPU.

Inference from the CNN

Now that the CNN has learnt what Bicycles and Trees look like, let’s show our CNN an image of a bicycle and see if it can tell us what it is:

The Network is 99.99% sure that it is a bicycle.
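A confidence figure like this typically comes from applying a softmax to the raw scores (logits) that a classifier outputs. Here is a minimal sketch of that step. The two logit values are made up for illustration and are not the actual outputs of the network in this post.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

class_names = ["bicycle", "tree"]

# Hypothetical raw scores for one image (assumed values, for illustration only).
logits = np.array([6.0, -3.2])

probs = softmax(logits)
predicted = class_names[int(np.argmax(probs))]

for name, p in zip(class_names, probs):
    print(f"{name}: {p:.4%}")
print("prediction:", predicted)
```

A large gap between the two raw scores is what pushes the winning probability so close to 100%.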

Hey CNN, what does a Bicycle look like?

After successful training, the Network is able to tell us accurately what it sees, for the things it has been trained to see. We will now go deeper into the Network and ask it for the embedding (the representation of remembrance) of how it sees and remembers a bicycle. Most modern Deep Learning tools like TensorFlow and PyTorch work with mathematical constructs called Tensors. Tensors are used to represent inputs, outputs and most things that a Neural Network works with. A simple way to understand a Tensor is to think of a multi-dimensional array, a generalization of a matrix to any number of dimensions. Our embedding will also be a Tensor.
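To give a feel for Tensors of different dimensions, here is a small sketch using NumPy arrays as stand-ins for framework Tensors. The 224x224 image size and the batch size of 8 are assumptions for illustration. Only the 512-element embedding length comes from this post.

```python
import numpy as np

scalar = np.array(3.0)               # 0-D: a single number
embedding = np.zeros(512)            # 1-D: a 512-element vector, like our embedding
image = np.zeros((224, 224, 3))      # 3-D: height x width x RGB channels (assumed size)
batch = np.zeros((8, 224, 224, 3))   # 4-D: a stack of 8 such images (assumed batch size)

for name, tensor in [("scalar", scalar), ("embedding", embedding),
                     ("image", image), ("batch", batch)]:
    print(f"{name:9s} dimensions={tensor.ndim} shape={tensor.shape}")
```

TensorFlow and PyTorch Tensors expose the same idea of a shape, with extra machinery for GPU execution and automatic differentiation.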

This bicycle:

appears like this to the Neural Network:

This is just a small part of the whole Tensor for this Bicycle, as created by our Neural Network.

It is not easy to make sense of the visual memory represented by the CNN output, especially in a large Tensor. This particular Tensor has 512 elements in it.

Understanding and Visualizing Embeddings

A better way to visualize this remembrance is to reduce the number of dimensions, keeping only the most significant components of the data. We will now show all of our bicycle and tree images to the CNN and ask it to return how it remembers them (the embeddings). We then plot these embeddings in 3-dimensional space, down from 512 dimensions, so that the plot retains the most important features of the embeddings.

Dimensionality reduction, which identifies the primary components of an embedding, is often done with algorithms like PCA and t-SNE. Here we plot the PCA projection in 3 dimensions.
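For readers curious about what PCA does under the hood, here is a minimal sketch that projects 512-dimensional vectors down to 3 dimensions using NumPy's SVD. The embeddings here are synthetic random data standing in for real CNN outputs, and `pca_project` is a hypothetical helper written for this illustration, not a function from the repo.

```python
import numpy as np

def pca_project(embeddings: np.ndarray, n_components: int = 3):
    """Project rows of `embeddings` onto their top principal components."""
    # PCA works on mean-centered data.
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of `vt` are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    # Fraction of the total variance captured by each kept component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return projected, explained[:n_components]

# Synthetic stand-ins for real embeddings (assumed data): two groups of
# 50 vectors each, drawn around different centers in 512 dimensions.
rng = np.random.default_rng(0)
bicycles = rng.normal(loc=-1.0, scale=0.5, size=(50, 512))
trees = rng.normal(loc=1.0, scale=0.5, size=(50, 512))
embeddings = np.vstack([bicycles, trees])

points_3d, explained = pca_project(embeddings)
print("projected shape:", points_3d.shape)
print("variance explained by each component:", explained)
```

Each row of `points_3d` can then be drawn as one point in a 3D scatter plot. With real embeddings, a library implementation such as scikit-learn's PCA would usually be the more convenient choice.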

We plot all bicycle embeddings in Grey, and all tree embeddings in Black. This is how the Network plots these representations of visual remembrance:

All bicycles are represented on the left side of the 3D space, and all trees are represented on the right side. You may see 4-5 tree embeddings pretty close to the bicycle embeddings. Want to take a guess why that might be? (Hint: can they appear together?)

So is this how A.I. sees the world?

Yes. We’ve now seen how a typical Convolutional Neural Network “sees” things, after it has been trained to see them.

But then, how does it remember what it saw?

All embeddings created by a Convolutional Neural Network can be stored as files. Since embeddings can be understood as a visual representation of what the CNN saw, the embedding files give us a collection of memories that can be kept permanently.
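As a rough sketch of this idea, the snippet below writes embeddings to `.npy` files, reads them back, and then "recalls" which stored memory a new embedding most resembles using cosine similarity. Everything here is an assumption for illustration: the embeddings are random vectors rather than real CNN outputs, the file naming is made up, and cosine similarity is just one common way to compare embeddings.

```python
import os
import tempfile
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two vectors by angle: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Random stand-ins for real 512-element embeddings (assumed data).
rng = np.random.default_rng(1)
memories = {
    "bicycle": rng.normal(size=512),
    "tree": rng.normal(size=512),
}

# A temporary directory keeps this demo self-cleaning; a real system
# would write to a persistent location instead.
with tempfile.TemporaryDirectory() as memory_dir:
    # Store: one file per remembered object.
    for label, embedding in memories.items():
        np.save(os.path.join(memory_dir, f"{label}.npy"), embedding)

    # Recall: load every stored memory back from disk.
    recalled = {
        filename[:-len(".npy")]: np.load(os.path.join(memory_dir, filename))
        for filename in os.listdir(memory_dir)
        if filename.endswith(".npy")
    }

# A "new image": the bicycle embedding with a little noise added.
new_embedding = memories["bicycle"] + rng.normal(scale=0.1, size=512)

best_match = max(
    recalled,
    key=lambda label: cosine_similarity(new_embedding, recalled[label]),
)
print("closest stored memory:", best_match)
```

The files hold exactly the numbers the network produced, so a stored memory can be reloaded and compared against new images long after the original picture is gone.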

Resources

The following resources were used in the creation of this blog:

PicKey.ai

https://blog.pickey.ai
