After reading about Dimensions of Dialogue (the seminal work on CCNs) and GlyphNet, I decided to investigate the idea of two artificial neural networks (NNs) communicating over a noisy channel. The idea is to have an arbitrary number of classes that serve as content to be communicated effectively between an observer and speaker, all with the chance of some of the information in the communication to be lost.

Previous work has focused on an agent generating an image and an agent decoding the original class. I decided I wanted to expand on this by creating two NNs that could both serve as “speakers” and “listeners.” Their architectures are simple: convolutional neural networks that share weights, encoding to a single output (the class being communicated) while also generating an image. An initial class is fed into these ConvNets by means of a one-hot encoded vector. In this initial stage, they also receive an image of only ones or zeros. They then generate an image that has noise added to it. This noisy image is finally fed into the other network, along with a vector of the length of the one-hotted classes, but with only zeros.

This process not only allows both agents to produce outputs, it also allows them to pass this content back and forth over the noisy channel, all while trying to keep the information regarding the original class intact. Similar to a game of telephone between two very forgetful people, the networks pass this image to-and-fro while trying to retain its recognizability.

So, what does all of this mean? Not much in the absence of results.

Noise-robust representations

But when we see these, it becomes clear that, given constraining environments with a lack of capacity (These are only made with 8*8 pixels!) and noise, representations that look a whole lot like our own letters and numbers emerge!

There are a number of ways to move forward with this. Future posts will focus on adversarial additive interference along with different ways of communicating - like audio.