Photographic memory, also known as eidetic memory, is the ability to accurately recall an image from memory. In humans and other mammals, the cerebral cortex is the part of the brain that deals with attention, perception, awareness, thought, consciousness and memory.
The occipital lobe is one of the four main lobes of the cerebral cortex. It is positioned at the back of the head and is the main visual processing unit of the mammalian brain.
The objective of this exercise is to create a computer system that can uniquely remember an image or a kind of object, with all its physical properties. The system should also be able to differentiate between the object being remembered and memories of other kinds of objects.
Such a computer system should then be able to replace photographic memories with digital files, making memories permanent. Given enough computational power, it should also be able to stream memories in and out of a biological brain via an external neural link.
Today’s Artificial Intelligence vision architectures can outperform human vision on a number of tasks. A photographic memory implementation based on such architectures could be used to process extremely detailed memories. However, to keep things simple, we have a small ask from this system:
This post covers points 1, 2 and 3 and lays the groundwork for point 4.
A Convolutional Neural Network (CNN) is a specific category of Artificial Neural Network, often used for vision-based tasks. Its design was inspired by the biological design of the visual cortex, the part of the brain located inside the occipital lobe.
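To make the idea of a convolution concrete, here is a minimal sketch in Python, assuming NumPy is installed. It slides one small filter over a toy image, which is the basic operation a CNN layer repeats many times. The `conv2d` helper, the 5x5 image and the hand-written edge filter are all illustrative assumptions. A real CNN learns its filters from data and uses an optimized library implementation.

```python
import numpy as np


def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch under the filter element-wise and sum the result.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Toy 5x5 "image": dark (0) on the left, bright (1) on the right.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hand-written vertical-edge filter (an assumption for illustration only).
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = conv2d(image, kernel)
print(feature_map)
```

The feature map is large where the filter sits on the dark-to-bright edge and zero where the patch is uniformly bright, which is how a filter "notices" a visual pattern.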
Although biological memory representation is contained in the parts of the brain we are discussing, the computer system we will design aims to emulate the concept of remembrance in general. It does not replace or replicate the minute cellular or sub-cellular structure of the brain and its parts.
We want to create a computer system that can remember what bicycles look like. It should also remember what trees look like, and then be able to tell the difference between them.
Embedding is a term commonly used by Data Scientists for the representation of an input after it has been processed by a Neural Network. For simplicity, we will refer to it as the representation of the memory that the Neural Network creates of whatever was shown to it.
Data, the model architecture and the training procedure (hyperparameters etc.) are the main elements that go into creating an accurate Neural Network. Let’s start with the first one.
We’ll download data for bicycles and trees using DuckDuckGo’s search API. Let’s look at some samples:
Now that the CNN has learnt what bicycles and trees look like, let’s show it an image of a bicycle and see if it can tell us what it is:
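As a rough sketch of what this prediction step involves, the snippet below turns raw class scores into a label and a confidence using softmax. It assumes NumPy. The `LABELS` order, the `predict` helper and the score values are hypothetical. In practice the scores would come from the trained CNN rather than being typed in by hand.

```python
import numpy as np

LABELS = ["bicycle", "tree"]  # hypothetical class order, assumed for this sketch


def softmax(scores: np.ndarray) -> np.ndarray:
    """Convert raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def predict(scores) -> tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(np.asarray(scores, dtype=float))
    idx = int(np.argmax(probs))
    return LABELS[idx], float(probs[idx])


# Made-up raw scores standing in for the output of a trained network.
label, confidence = predict([2.0, -1.0])
print(label, round(confidence, 3))
```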
After successful training, the Network can accurately tell us how it sees something it has been trained to see. We will now go deeper into the Network and ask it for the embedding (the representation of remembrance) of how it sees and remembers a bicycle. Most modern Deep Learning tools, like TensorFlow and PyTorch, work with mathematical constructs called Tensors. Tensors are used to represent inputs, outputs and most other things that a Neural Network works with. A simple way to understand a Tensor is to think of it as a multi-dimensional array, a generalization of a matrix. Our embedding will also be a Tensor.
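The "multi-dimensional array" view of a Tensor can be illustrated with NumPy arrays, used here as a stand-in for framework Tensors. In this sketch the batch size, the 224x224 image size and the random values are assumptions. The 512-element embedding length is the size reported later in this post.

```python
import numpy as np

rng = np.random.default_rng(0)

# A batch of 4 RGB images as a 4-dimensional tensor:
# (batch, channels, height, width). The 224x224 size is an assumption.
images = rng.random((4, 3, 224, 224), dtype=np.float32)

# Stand-in embeddings: one 512-element vector per image (random values,
# not real network output).
embeddings = rng.standard_normal((4, 512)).astype(np.float32)

print(images.ndim, images.shape)          # number of dimensions and shape
print(embeddings.ndim, embeddings.shape)
print(embeddings[0, :5])                  # a small slice of one embedding
```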
This bicycle:
appears like this to the Neural Network:
This is just a small part of the whole Tensor for this Bicycle, as created by our Neural Network.
It is not easy to make sense of the visual memory represented by the CNN output, especially in a large Tensor. This particular Tensor has 512 elements in it.
A better way to visualize this remembrance is to reduce it to its most significant components. We now show all of our bicycle and tree images to the CNN and ask it to return how it remembers them (the embeddings). We then plot these embeddings in a 3-dimensional space, down from 512 dimensions, so that the plot retains the most important features of the embeddings.
Dimensionality reduction, and understanding the principal components of an embedding, is often done with algorithms like PCA and t-SNE. Here we plot the PCA projection in 3 dimensions.
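The reduction from 512 dimensions to 3 can be sketched with a small PCA built on NumPy's SVD. This is one common way to compute PCA, not necessarily the tooling used for the plot in this post. The `pca_reduce` helper is hypothetical, and the two synthetic clusters merely stand in for real bicycle and tree embeddings.

```python
import numpy as np


def pca_reduce(embeddings: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Project row-vector embeddings onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)

# Synthetic stand-ins for real embeddings: two clusters of 512-d vectors.
bicycles = rng.normal(loc=0.0, scale=1.0, size=(50, 512))
trees = rng.normal(loc=1.0, scale=1.0, size=(50, 512))
embeddings = np.vstack([bicycles, trees])

points_3d = pca_reduce(embeddings, n_components=3)
print(points_3d.shape)  # (100, 3)
```

Each row of `points_3d` is one image's embedding expressed in three coordinates, ready to be passed to any 3D scatter plot.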
We plot all bicycle embeddings in grey and all tree embeddings in black. This is how the Network’s representations of visual remembrance look when plotted:
Yes. We’ve now seen how a typical Convolutional Neural Network “sees” things, after it has been trained to see them.
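Telling one kind of memory from another can also be done directly on the embeddings, without a plot. The sketch below assumes NumPy and uses cosine similarity to the average embedding of each class, which is one simple approach among several. The helper names and the synthetic "prototype plus noise" vectors are assumptions standing in for real CNN embeddings.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest_class(embedding: np.ndarray, centroids: dict) -> str:
    """Return the class whose average embedding is most similar."""
    return max(centroids, key=lambda name: cosine_similarity(embedding, centroids[name]))


rng = np.random.default_rng(1)

# Synthetic data: each class is a hypothetical prototype vector plus noise.
bicycle_proto = rng.standard_normal(512)
tree_proto = rng.standard_normal(512)
bicycles = bicycle_proto + 0.5 * rng.standard_normal((20, 512))
trees = tree_proto + 0.5 * rng.standard_normal((20, 512))

# One "remembered" average embedding per kind of object.
centroids = {"bicycle": bicycles.mean(axis=0), "tree": trees.mean(axis=0)}

# A new, unseen bicycle-like embedding.
new_bicycle = bicycle_proto + 0.5 * rng.standard_normal(512)
print(nearest_class(new_bicycle, centroids))
```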
All embeddings created by a Convolutional Neural Network can be stored as files. Since an embedding can be understood as a representation of what the CNN saw, the embedding files give us a collection of memories that persist for as long as the files are kept.
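Storing embeddings as files might look like the following minimal sketch, which assumes NumPy and its `.npz` archive format. The `save_memory` and `load_memory` helpers, the file name and the random embeddings are illustrative. Any format that round-trips the numbers exactly would serve the same purpose.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_memory(path: Path, embeddings: np.ndarray, labels: list) -> None:
    """Write embeddings and their labels to a single compressed .npz file."""
    np.savez_compressed(path, embeddings=embeddings, labels=np.asarray(labels))


def load_memory(path: Path):
    """Read embeddings and labels back from disk."""
    with np.load(path) as data:
        return data["embeddings"], data["labels"].tolist()


rng = np.random.default_rng(0)
# Random stand-ins for the 512-element embeddings of one bicycle and one tree.
embeddings = rng.standard_normal((2, 512)).astype(np.float32)
labels = ["bicycle", "tree"]

# A temporary directory keeps the sketch self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "memories.npz"
    save_memory(path, embeddings, labels)
    restored, restored_labels = load_memory(path)

print(np.array_equal(embeddings, restored), restored_labels)
```

For real use, the file would be written to durable storage rather than a temporary directory.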
The following resources were used in the creation of this blog: