Statistically Testing Spike Responses to Stimulus#

There are many ways to identify and select “responsive” cells for inclusion in analysis. In this notebook, we demonstrate one inclusion criterion from [Siegle et al., 2021], which uses convolution to smooth spiking activity before testing for responsiveness.

Given their high-resolution capabilities, Neuropixels probes can record from hundreds to thousands of neurons simultaneously, resulting in a large and complex dataset. Spikes are recorded at precise time points, but for many types of analysis it is helpful to convert these discrete spike events into a continuous measure of spiking activity. The convolution operation in this notebook does this, creating a “smoothed” version of the original spiking activity to improve the selection of responsive cells.
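As a minimal sketch of this idea, here is how a handful of hypothetical spike times can be binned and smoothed into a continuous trace (using a simple uniform kernel for illustration; the notebook itself uses a causal exponential filter later):

```python
import numpy as np
from scipy.ndimage import convolve1d

# hypothetical spike times (seconds) for one unit
spike_times = np.array([0.012, 0.034, 0.035, 0.120, 0.450])

# count the discrete events in 5 ms bins
bin_size = 0.005
edges = np.arange(0.0, 0.5 + bin_size, bin_size)
counts, _ = np.histogram(spike_times, bins=edges)

# smooth with a small normalized kernel to get a continuous trace;
# a normalized kernel redistributes spikes without changing their total
kernel = np.ones(3) / 3.0
smoothed = convolve1d(counts.astype(float), kernel)

# totals match (up to float error): smoothing conserves the spike count
print(counts.sum(), smoothed.sum())
```

Because the kernel sums to 1, the smoothed trace spreads each spike over neighboring bins without inventing or losing spikes, which is exactly the property the exponential filter below relies on.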

Environment Setup#

⚠️Note: If running on a new environment, run this cell once and then restart the kernel⚠️

try:
    from databook_utils.dandi_utils import dandi_download_open
except:
    !git clone https://github.com/AllenInstitute/openscope_databook.git
    %cd openscope_databook
    %pip install -e .
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

from scipy.ndimage import convolve1d
from scipy.signal.windows import exponential  # scipy.signal.exponential in older SciPy releases

%matplotlib inline

Downloading Ecephys File#

Change the values below to download the file you’re interested in. In this example, we use the Units table of an Ecephys file from the Allen Institute’s Visual Coding - Neuropixels dataset, so you’ll have to choose a file with the same kind of data. Set dandiset_id and dandi_filepath to correspond to the dandiset ID and filepath of the file you want. If you’re accessing an embargoed dataset, set dandi_api_key to your DANDI API key.

dandiset_id = "000021"
dandi_filepath = "sub-703279277/sub-703279277_ses-719161530.nwb"
download_loc = "."
dandi_api_key = None
io = dandi_download_open(dandiset_id, dandi_filepath, download_loc, dandi_api_key=dandi_api_key)
nwb = io.read()
File already exists
Opening file
c:\Users\carter.peene\AppData\Local\Programs\Python\Python39\lib\site-packages\hdmf\spec\namespace.py:531: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.1.3 because version 1.5.1 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
c:\Users\carter.peene\AppData\Local\Programs\Python\Python39\lib\site-packages\hdmf\spec\namespace.py:531: UserWarning: Ignoring cached namespace 'core' version 2.2.2 because version 2.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."

Getting Units Data and Stimulus Data#

Below, the Units table is retrieved from the file. It contains many metrics for every putative neuronal unit, printed below. For the analysis in this notebook, we are only interested in the spike_times column: for each unit, it is an array of the timestamps at which a spike was measured.

units = nwb.units
units.colnames
('waveform_duration',
 'cluster_id',
 'peak_channel_id',
 'cumulative_drift',
 'amplitude_cutoff',
 'snr',
 'recovery_slope',
 'isolation_distance',
 'nn_miss_rate',
 'silhouette_score',
 'velocity_above',
 'quality',
 'PT_ratio',
 'l_ratio',
 'velocity_below',
 'max_drift',
 'isi_violations',
 'firing_rate',
 'amplitude',
 'local_index',
 'spread',
 'waveform_halfwidth',
 'd_prime',
 'presence_ratio',
 'repolarization_slope',
 'nn_hit_rate',
 'spike_times',
 'spike_amplitudes',
 'waveform_mean')
units[:10]
waveform_duration cluster_id peak_channel_id cumulative_drift amplitude_cutoff snr recovery_slope isolation_distance nn_miss_rate silhouette_score ... local_index spread waveform_halfwidth d_prime presence_ratio repolarization_slope nn_hit_rate spike_times spike_amplitudes waveform_mean
id
950921187 0.604355 4 850249267 481.80 0.425574 2.209140 -0.118430 17.537571 0.009496 0.036369 ... 4 50.0 0.357119 2.962274 0.99 0.381716 0.473829 [1.0439430431793884, 1.543311060144649, 2.7287... [0.0001908626967721937, 0.00016134635752077775... [[0.0, 0.5961149999999966, 5.378099999999993, ...
950921172 0.521943 3 850249267 681.53 0.390098 1.959983 -0.109729 14.677643 0.003857 0.103446 ... 3 40.0 0.260972 2.067810 0.99 0.536663 0.445946 [10.406435026164546, 17.127986534673788, 18.48... [0.00014485615850768024, 0.0001722424107984555... [[0.0, -1.341600000000002, -0.4586399999999933...
950921152 0.467002 2 850249267 1070.71 0.500000 2.522905 -0.109867 15.783665 0.017776 0.027818 ... 2 50.0 0.247236 2.220043 0.99 0.566559 0.284058 [1.2775103414155262, 2.3915133536963493, 3.701... [0.00014859435856024575, 0.0001531048673600236... [[0.0, -0.6427199999999993, -2.836079999999998...
950921135 0.467002 1 850249267 253.42 0.500000 2.803475 -0.150379 26.666930 0.023742 0.076530 ... 1 40.0 0.233501 2.339206 0.99 0.669090 0.590737 [9.473732504122962, 13.198542576065163, 18.302... [0.00032386170367170055, 0.0004518112387675137... [[0.0, -3.2800950000000078, -6.087510000000009...
950921111 0.439531 0 850249267 141.82 0.018056 4.647943 -0.328727 66.901065 0.006595 NaN ... 0 30.0 0.219765 4.395994 0.99 1.261416 0.952667 [1.1677100445138795, 1.1707767194728813, 1.349... [0.00015644521007973124, 0.000214412247939483,... [[0.0, -0.9291749999999945, -6.120270000000007...
950927711 1.455946 482 850249273 2.46 0.000895 1.651500 -0.039932 39.400278 0.000033 NaN ... 464 100.0 0.274707 5.557479 0.37 0.467365 0.000000 [2613.8652081509977, 2624.5193369599215, 2734.... [0.00012946663895843286, 0.0001203425053985725... [[0.0, 6.3216435986159105, 10.324204152249129,...
950921285 2.087772 11 850249273 318.53 0.036848 1.379817 NaN 27.472722 0.000903 0.291953 ... 11 100.0 0.288442 2.751337 0.89 0.372116 0.258065 [39.04904580954626, 39.457346913598556, 40.495... [7.768399792002802e-05, 8.405736507197006e-05,... [[0.0, 5.330324999999991, 2.4261899999999486, ...
950921271 0.947739 10 850249273 1008.50 0.001727 1.420617 -0.008204 30.027595 0.000707 0.406673 ... 10 100.0 0.288442 3.847234 0.96 0.498618 0.796491 [16.751918851114475, 26.127977537450867, 28.65... [0.00016516929470324686, 0.0001501058102103845... [[0.0, -3.103230000000032, 5.680349999999983, ...
950921260 0.453266 9 850249273 175.00 0.000081 4.969091 -0.184456 89.804006 0.000000 0.223876 ... 9 60.0 0.192295 5.274090 0.99 1.140487 0.997333 [0.9620761551434307, 2.092045877265143, 2.4040... [0.0003836112198262231, 0.0004093908262843732,... [[0.0, 1.9104149999999982, -7.270770000000016,...
950921248 0.439531 8 850249273 261.11 0.065478 2.147758 -0.085677 84.145512 0.005474 0.076346 ... 8 60.0 0.274707 3.022387 0.99 0.463570 0.976667 [1.316477113448928, 1.779311698293908, 2.96088... [0.00013780619334124896, 0.0001439873831056905... [[0.0, -3.6761399999999953, 0.8334300000000014...

10 rows × 29 columns

units_spike_times = units["spike_times"]
print(units_spike_times.shape)
(3232,)
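Note that spike_times is a ragged column: each of the 3,232 entries is its own variable-length array of timestamps. A small stand-in example (with made-up times) of how such a column is typically iterated:

```python
import numpy as np

# hypothetical stand-in for the ragged spike_times column: one array per unit
units_spike_times = [
    np.array([0.10, 0.35, 1.20]),
    np.array([0.05]),
    np.array([0.40, 0.41, 0.42, 2.00]),
]

# per-unit spike counts and a crude overall firing-rate estimate
recording_duration = 2.5  # seconds (assumed for this sketch)
for unit_idx, spikes in enumerate(units_spike_times):
    rate = len(spikes) / recording_duration
    print(f"unit {unit_idx}: {len(spikes)} spikes, {rate:.2f} Hz")
```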

Selecting Stimulus Times#

Different types of stimulus require different kinds of inclusion criteria. Since the available stimulus tables vary significantly depending on which NWB file and which experimental session you’re analyzing, you may have to adjust some values below for your analysis. First, select which stimulus table you want by changing the key used below in nwb.intervals. The list of stimulus table names is printed below to inform this choice. Additionally, you’ll have to modify the function stim_select to select the stimulus times you want to use. In this example, the stimulus type is the presentation of Gabor patches, and the stimulus times are chosen where a Gabor patch is shown at x and y coordinates (40, 40).

stimulus_names = list(nwb.intervals.keys())
print(stimulus_names)
['drifting_gratings_presentations', 'flashes_presentations', 'gabors_presentations', 'invalid_times', 'natural_movie_one_presentations', 'natural_movie_three_presentations', 'natural_scenes_presentations', 'spontaneous_presentations', 'static_gratings_presentations']
stim_table = nwb.intervals["gabors_presentations"]
print(stim_table.colnames)
stim_table[:10]
('start_time', 'stop_time', 'stimulus_name', 'stimulus_block', 'temporal_frequency', 'x_position', 'y_position', 'color', 'mask', 'opacity', 'phase', 'size', 'units', 'stimulus_index', 'orientation', 'spatial_frequency', 'contrast', 'tags', 'timeseries')
start_time stop_time stimulus_name stimulus_block temporal_frequency x_position y_position color mask opacity phase size units stimulus_index orientation spatial_frequency contrast tags timeseries
id
0 89.896827 90.130356 gabors 0.0 4.0 10.0 -10.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 0.0 0.08 0.8 [stimulus_time_interval] [(1, 1, timestamps pynwb.base.TimeSeries at 0x...
1 90.130356 90.380565 gabors 0.0 4.0 -30.0 20.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 90.0 0.08 0.8 [stimulus_time_interval] [(2, 1, timestamps pynwb.base.TimeSeries at 0x...
2 90.380565 90.630774 gabors 0.0 4.0 20.0 -20.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 0.0 0.08 0.8 [stimulus_time_interval] [(3, 1, timestamps pynwb.base.TimeSeries at 0x...
3 90.630774 90.880983 gabors 0.0 4.0 30.0 20.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 90.0 0.08 0.8 [stimulus_time_interval] [(4, 1, timestamps pynwb.base.TimeSeries at 0x...
4 90.880983 91.131199 gabors 0.0 4.0 0.0 -40.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 90.0 0.08 0.8 [stimulus_time_interval] [(5, 1, timestamps pynwb.base.TimeSeries at 0x...
5 91.131199 91.381415 gabors 0.0 4.0 30.0 30.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 45.0 0.08 0.8 [stimulus_time_interval] [(6, 1, timestamps pynwb.base.TimeSeries at 0x...
6 91.381415 91.631631 gabors 0.0 4.0 0.0 10.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 90.0 0.08 0.8 [stimulus_time_interval] [(7, 1, timestamps pynwb.base.TimeSeries at 0x...
7 91.631631 91.881847 gabors 0.0 4.0 -40.0 -20.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 0.0 0.08 0.8 [stimulus_time_interval] [(8, 1, timestamps pynwb.base.TimeSeries at 0x...
8 91.881847 92.132049 gabors 0.0 4.0 -30.0 -30.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 0.0 0.08 0.8 [stimulus_time_interval] [(9, 1, timestamps pynwb.base.TimeSeries at 0x...
9 92.132049 92.382250 gabors 0.0 4.0 10.0 -30.0 [1.0, 1.0, 1.0] circle 1.0 [3644.93333333, 3644.93333333] [20.0, 20.0] deg 0.0 0.0 0.08 0.8 [stimulus_time_interval] [(10, 1, timestamps pynwb.base.TimeSeries at 0...
### select start times from table that fit certain criteria here

# stim_select = lambda row: True  # default: select every stimulus time
stim_select = lambda row: float(row.x_position) == 40 and float(row.y_position) == 40
stim_times = [float(stim_table[i].start_time) for i in range(len(stim_table)) if stim_select(stim_table[i])]
print(len(stim_times))
45
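The same pattern adapts to other stimulus tables. For example, for a drifting-gratings table you might select trials by orientation instead. A sketch using hypothetical stand-in rows, since column names vary between tables:

```python
from collections import namedtuple

# hypothetical stand-in rows exposing a drifting-gratings table's columns
Row = namedtuple("Row", ["start_time", "orientation", "temporal_frequency"])
rows = [Row(10.0, 0.0, 2.0), Row(12.0, 90.0, 2.0), Row(14.0, 90.0, 4.0)]

# keep only trials matching the condition of interest
stim_select = lambda row: float(row.orientation) == 90.0
stim_times = [float(row.start_time) for row in rows if stim_select(row)]
print(stim_times)  # [12.0, 14.0]
```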

Getting Unit Spike Response Counts#

With stimulus times selected for each trial, we can generate a spike matrix to perform our analysis on. The spike matrix has dimensions Units × Time × Trials. You may set time_resolution to the duration, in seconds, of each time bin in the matrix. Additionally, window_start_time and window_end_time can be set to the window bounds, in seconds, relative to the onset of the stimulus at time 0. Finally, the spike matrix will also be averaged across trials to get the average spike counts over time for each unit, called mean_spike_counts.

# bin size for counting spikes
time_resolution = 0.005

# start and end times (relative to the stimulus at 0 seconds) that we want to examine and align spikes to
window_start_time = -0.25
window_end_time = 0.75
# time bins used
bin_edges = np.arange(window_start_time, window_end_time, time_resolution)

# useful throughout analysis
n_units = len(units_spike_times)
n_trials = len(stim_times)

# index of the bin at stimulus onset, used later to separate the baseline and response windows
stimulus_onset_idx = int(-bin_edges[0] / time_resolution)
# 3D spike matrix to be populated with spike counts
spike_matrix = np.zeros((n_units, len(bin_edges), n_trials))

# populate 3D spike matrix for each unit for each stimulus trial by counting spikes into bins
for unit_idx in range(n_units):
    spike_times = units_spike_times[unit_idx]

    for stim_idx, stim_time in enumerate(stim_times):
        # get spike times that fall within the bin's time range relative to the stim time        
        first_bin_time = stim_time + bin_edges[0]
        last_bin_time = stim_time + bin_edges[-1]
        first_spike_in_range, last_spike_in_range = np.searchsorted(spike_times, [first_bin_time, last_bin_time])
        spike_times_in_range = spike_times[first_spike_in_range:last_spike_in_range]

        # convert spike times into relative time bin indices
        bin_indices = ((spike_times_in_range - (first_bin_time)) / time_resolution).astype(int)
        
        # mark that there is a spike at these bin times for this unit on this stim trial
        for bin_idx in bin_indices:
            spike_matrix[unit_idx, bin_idx, stim_idx] += 1

# get average spikes by unit over time
mean_spike_counts = np.mean(spike_matrix, axis=2)
# make it spikes per second rather than per bin time
mean_firing_rate = mean_spike_counts / time_resolution

mean_firing_rate.shape
(3232, 200)
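The inner per-spike loop above can equivalently be written with np.histogram, which some readers may find easier to reason about. A sketch on synthetic data (hypothetical spike and stimulus times):

```python
import numpy as np

time_resolution = 0.005
bin_edges = np.arange(-0.25, 0.75, time_resolution)  # 200 bin left edges, as above

spike_times = np.array([0.90, 1.001, 1.013, 1.499])  # hypothetical single-unit train
stim_times = [1.0, 5.0]                              # hypothetical trial onsets

trial_counts = np.zeros((len(bin_edges), len(stim_times)))
for stim_idx, stim_time in enumerate(stim_times):
    # histogram needs one more edge than bins, so append the window's right edge
    edges = np.append(stim_time + bin_edges, stim_time + bin_edges[-1] + time_resolution)
    trial_counts[:, stim_idx], _ = np.histogram(spike_times, bins=edges)

# all 4 spikes fall within the first trial's window, none in the second's
print(trial_counts.sum(axis=0))
```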

Plotting Function#

Here we define a plotting function to show the spiking behavior throughout the time window, with the stimulus time clearly marked. It is used below to show the mean firing rates as well as the change in firing rate relative to baseline at the onset of the stimulus. The plot below is not very useful for analysis yet; the spikes will be convolved and firing rates normalized for a clearer view further down.

### method to show plot of spike counts of units over time

def show_counts(counts_array, title="", c_label="", aspect="auto", vmin=None, vmax=None):
    fig, ax = plt.subplots(figsize=(6,12)) # change fig size for different plot dimensions
    img = ax.imshow(counts_array, 
                    extent=[np.min(bin_edges), np.max(bin_edges), 0, len(counts_array)], 
                    aspect=aspect,
                    vmin=vmin,
                    vmax=vmax) # change vmax to get a better depiction of your data

    ax.plot([0, 0],[0, len(counts_array)], ':', color='white', linewidth=1.0)

    ax.set_xlabel("Time relative to stimulus onset (s)")
    ax.set_ylabel("Unit #")

    ax.set_title(title)

    cbar = fig.colorbar(img, shrink=0.5)
    cbar.set_label(c_label)
show_counts(mean_firing_rate,
            title="Mean Unit Firing Rates",
            c_label=f"Firing Rate (spikes / second)",
            vmin=0,
            vmax=50)
../_images/7daf46cc2f88254560eadaaad3f76d63d4269dd26baf046f0f0b4348264852e0.png

Computing SDFs#

The inclusion criterion discussed in [Siegle et al., 2021] consists of a few steps: convolve the spike counts for each trial with a causal exponential filter, convert from spike counts to firing rate, subtract the mean firing rate of the baseline window, and then average across trials.

The end result is a matrix that provides a smoothed, continuous, and baseline-normalized representation of neuronal firing activity, which is often easier to work with and interpret than raw spike times, particularly when dealing with large populations of neurons. As can be seen below, the normalized firing rate plot identifies a small set of responsive neurons.

Below, the exponential filter used for the convolution is shown, as well as the convolved SDFs for each unit with the baseline firing rates subtracted. For a precise description of this calculation, see page 15 of [Siegle et al., 2021]. The code for the SDF calculation can be found here.

sigma = 0.01

filtPts = int(5*sigma / time_resolution)
expFilt = np.zeros(filtPts*2)
expFilt[-filtPts:] = exponential(filtPts, center=0, tau=sigma/time_resolution, sym=False)
expFilt /= expFilt.sum()

plt.plot(expFilt)
plt.title("Causal Exponential Filter")
Text(0.5, 1.0, 'Causal Exponential Filter')
../_images/b9edad8285a945299bf41a6cc1f0734323b5fad6dd9607cc7a267f9577f49968.png
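The leading zeros in expFilt are what make the filter causal: convolve1d centers the kernel, so without the zero-padded first half the smoothing would smear activity backward in time. A self-contained check using a stand-in decay (not the exact scipy window) and a single impulse:

```python
import numpy as np
from scipy.ndimage import convolve1d

# same construction as above, with a stand-in exponential decay
filt_pts = 10
decay = np.exp(-np.arange(filt_pts) / 2.0)
kernel = np.zeros(filt_pts * 2)
kernel[-filt_pts:] = decay          # decay occupies the second half only
kernel /= kernel.sum()              # normalize so convolution conserves spike counts

impulse = np.zeros(50)
impulse[25] = 1.0                   # one "spike" at bin 25
response = convolve1d(impulse, kernel)

# nothing precedes the spike, the peak sits at the spike bin,
# and the total (one spike) is conserved
print(response[:25].max(), np.argmax(response), response.sum())
```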
# convolve spike matrix with the causal exponential filter
sdfs = convolve1d(spike_matrix, expFilt, axis=1)

# convert from spike counts to firing rate
sdfs /= time_resolution

show_counts(np.mean(sdfs, axis=2),
            title="Mean Unit SDFs",
            c_label="Firing rate (spikes / second)",
            vmin=0,
            vmax=50)
../_images/d9ee31e78d67a23cb8b0970ac00d2ab6b88732b87cfb282c861ce732d5483551.png
# subtract each unit's baseline (pre-stimulus) SDF from sdfs to yield change in firing rate
baseline_sdfs = np.mean(sdfs[:, :stimulus_onset_idx, :], axis=1)
normalized_sdfs = sdfs - np.expand_dims(baseline_sdfs, axis=1)

# compute the mean sdf across trials
mean_normalized_sdfs = np.mean(normalized_sdfs, axis=2)

show_counts(mean_normalized_sdfs,
            title="Mean Relative SDFs",
            c_label="Firing rate relative to baseline\n(spikes / second)",
            vmin=0,
            vmax=25)
../_images/b7f341dfb7b1bfe53688a30abf814d44c0994743e57894e6b911abfd47f1bbdf.png
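The np.expand_dims step above is just NumPy broadcasting: the per-(unit, trial) baseline needs its time axis restored before it can be subtracted from the (units, time, trials) array. A minimal sketch on toy data:

```python
import numpy as np

sdfs = np.arange(24, dtype=float).reshape(2, 3, 4)    # (units, time, trials)
baseline = np.mean(sdfs, axis=1)                      # (units, trials): time axis dropped
normalized = sdfs - np.expand_dims(baseline, axis=1)  # (units, 1, trials) broadcasts over time

# subtracting the time-mean leaves each (unit, trial) trace with zero mean
print(np.allclose(normalized.mean(axis=1), 0.0))  # True
```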

Selecting Units#

Units are included in their analysis if:

“their mean firing rate was greater than 0.1 spikes per second and the peak of the mean SDF after image change was greater than 5 times the standard deviation of the mean SDF during the baseline window”

Therefore the final step is to select units from our array that fit these two criteria. The selected units’ mean relative SDFs are shown below. Only a small subset of units (27 of 3,232) met both criteria.

evoked_firing_rates = np.mean(mean_normalized_sdfs[:,stimulus_onset_idx:], axis=1)
active_units = evoked_firing_rates > 0.1

sdf_response_peaks = np.max(mean_normalized_sdfs[:,stimulus_onset_idx:], axis=1)
responsive_units = sdf_response_peaks > 5 * np.std(mean_normalized_sdfs[:, :stimulus_onset_idx], axis=1)

selected_units = mean_normalized_sdfs[active_units & responsive_units]
selected_units.shape
(27, 200)
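The two criteria combine as boolean masks over units. A stand-in example with hypothetical per-unit statistics, using the baseline standard deviation as in the quoted criterion:

```python
import numpy as np

# hypothetical per-unit summary statistics for three units
evoked_firing_rates = np.array([0.05, 2.0, 1.5])   # mean evoked rate (spikes/s)
sdf_response_peaks = np.array([3.0, 30.0, 4.0])    # peak of mean SDF after stimulus
baseline_sds = np.array([0.5, 1.0, 2.0])           # SD of mean SDF in baseline window

active = evoked_firing_rates > 0.1                  # criterion 1: minimum firing rate
responsive = sdf_response_peaks > 5 * baseline_sds  # criterion 2: peak above 5x baseline SD
selected = active & responsive

print(selected)  # only the second unit passes both criteria
```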
show_counts(selected_units,
            title="Selected Units Mean Relative SDFs",
            c_label="Firing rate relative to baseline\n(spikes / second)",
            aspect=0.05,
            vmin=0,
            vmax=20)
../_images/8a505f207cd144f4e3791e28f49c00724195b2c98d5ded4690f7ee828a788029.png