Autoencoder-assisted decoding of behavioral video

Top left: High-speed video of the behaving mouse during the task, cropped so that only the mouse's face is visible. Top right: Simultaneous recording from ~800 neurons using a Neuropixels array. This panel shows the average activity within each recorded brain region (blue: visual cortex; red: hippocampus; yellow: thalamus; purple: motor cortex; green: striatum). 80-200 cells were recorded in each brain region.
Data courtesy of Nick Steinmetz, Matteo Carandini, Kenneth Harris.

Middle left: An autoencoder (AE) was trained to nonlinearly compress the video into a low-dimensional latent space (d = 8 here). This panel shows the AE's reconstruction, obtained by mapping from the 8-d latent space back into image space. Even with this very low-dimensional representation, most visible features of the video are recovered.
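
To make the compress-then-reconstruct idea concrete, here is a minimal NumPy sketch of an autoencoder with an 8-d latent space. It is not the network used for this figure: the one-hidden-layer tanh architecture, the plain gradient-descent training, the function names, and the synthetic "frames" are all assumptions chosen to keep the example short and self-contained.

```python
import numpy as np

def init_autoencoder(n_pixels, n_latent=8, seed=0):
    """Randomly initialize a one-hidden-layer autoencoder (toy architecture)."""
    rng = np.random.default_rng(seed)
    return {
        "W_enc": rng.normal(0.0, 1.0 / np.sqrt(n_pixels), (n_pixels, n_latent)),
        "b_enc": np.zeros(n_latent),
        "W_dec": rng.normal(0.0, 1.0 / np.sqrt(n_latent), (n_latent, n_pixels)),
        "b_dec": np.zeros(n_pixels),
    }

def encode(params, frames):
    """Nonlinearly compress flattened frames (T, n_pixels) to latents (T, n_latent)."""
    return np.tanh(frames @ params["W_enc"] + params["b_enc"])

def decode(params, latents):
    """Map latents (T, n_latent) back into image space (T, n_pixels)."""
    return latents @ params["W_dec"] + params["b_dec"]

def train_step(params, frames, lr=0.02):
    """One gradient-descent step on the reconstruction error.

    Returns the loss evaluated before the parameter update.
    """
    n_frames = frames.shape[0]
    z = encode(params, frames)
    err = decode(params, z) - frames
    loss = 0.5 * np.sum(err ** 2) / n_frames

    # Backpropagate through the linear decoder and the tanh encoder.
    g_recon = err / n_frames
    g_pre = (g_recon @ params["W_dec"].T) * (1.0 - z ** 2)
    grads = {
        "W_dec": z.T @ g_recon,
        "b_dec": g_recon.sum(axis=0),
        "W_enc": frames.T @ g_pre,
        "b_enc": g_pre.sum(axis=0),
    }
    for name, grad in grads.items():
        params[name] -= lr * grad
    return loss

# Synthetic stand-in for video: 200 "frames" of 64 pixels driven by 3 hidden factors.
rng = np.random.default_rng(1)
frames = 0.1 * rng.normal(size=(200, 3)) @ rng.normal(size=(3, 64))

params = init_autoencoder(n_pixels=64, n_latent=8)
losses = [train_step(params, frames) for _ in range(300)]

latents = encode(params, frames)   # 8-d latent trace for every frame
recon = decode(params, latents)    # reconstruction in image space
```

Here `latents` plays the role of the 8-d traces and `recon` the role of the reconstructed video. A real application would typically use a deeper convolutional network and a deep-learning framework instead of hand-written gradients.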

Middle right: The 8-d latent variables output by the AE, plotted as a function of time.

Bottom: Next, we regressed the 8-d latent signal onto the ~800-dimensional neural activity, then mapped the predicted latents back into image space (using the trained AE) to obtain a decoder that maps neural activity to the face image. The decoded image is shown on the left; the decoded 8-d latent traces are shown in gray on the right.
Preliminary results from the Paninski lab.
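
The decoding step in the bottom panel can be sketched as a regression from neural activity to the AE latents, followed by the AE's latent-to-image mapping. The caption does not say which regression was used, so the ridge penalty below is an assumption, as are the function names, the toy array sizes, and the random linear map standing in for the trained AE decoder.

```python
import numpy as np

def _add_intercept(neural):
    """Append a column of ones so the regression can learn an offset."""
    return np.hstack([neural, np.ones((neural.shape[0], 1))])

def fit_ridge_decoder(neural, latents, alpha=1.0):
    """Closed-form ridge regression predicting latents (T, d) from neural (T, N)."""
    X = _add_intercept(neural)
    penalty = alpha * np.eye(X.shape[1])
    penalty[-1, -1] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ latents)

def predict_latents(weights, neural):
    """Predict the latent trace from neural activity."""
    return _add_intercept(neural) @ weights

def decode_frames(weights, neural, decode_fn):
    """Neural activity -> predicted latents -> image space via decode_fn."""
    return decode_fn(predict_latents(weights, neural))

# Toy data (sizes are arbitrary, not those of the recording).
rng = np.random.default_rng(0)
n_time, n_neurons, n_latent, n_pixels = 1000, 80, 8, 64
neural = rng.normal(size=(n_time, n_neurons))
true_map = rng.normal(size=(n_neurons, n_latent)) / np.sqrt(n_neurons)
latents = neural @ true_map + 0.1 * rng.normal(size=(n_time, n_latent))

# Stand-in for the trained AE decoder: a fixed random linear map.
ae_decoder_matrix = rng.normal(size=(n_latent, n_pixels))
def decode_fn(z):
    return z @ ae_decoder_matrix

# Fit on the first 800 time points, evaluate on the held-out 200.
train, test = slice(0, 800), slice(800, 1000)
weights = fit_ridge_decoder(neural[train], latents[train], alpha=1.0)
pred = predict_latents(weights, neural[test])
decoded_frames = decode_frames(weights, neural[test], decode_fn)

# Held-out R^2 pooled over the 8 latent dimensions.
resid = np.sum((pred - latents[test]) ** 2)
total = np.sum((latents[test] - latents[test].mean(axis=0)) ** 2)
r2 = 1.0 - resid / total
```

Evaluating on held-out time points, as done here with `r2`, guards against reading an overfit regression as successful decoding, which matters when the number of neurons is comparable to the number of time samples.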

Video Feature Extraction
