This lesson is in the early stages of development (Alpha version)

Lossless Pipeline Parameters

Introduction to Lossless Parameters

Overview

Teaching: 35 min
Exercises: 10 min
Questions
  • What are parameters?

  • How does the Lossless pipeline use parameters to make decisions?

Objectives
  • Understand what parameters are and how they are used in the Lossless pipeline.

Introduction

Each script in the Lossless pipeline has parameters. Parameters can be thought of as sliders and dials for tuning the performance of a particular script. In Lossless, these are typically montage information, path specifications, memory and time allocations on the remote, and outlier detection decision criteria. Parameters in Lossless are defined and controlled by the associated batch configuration files.

It is highly recommended that parameters related to pipeline decision criteria be optimized for each study. For montage information and staging, editing the configuration files is required. The parameters for channel and time decision criteria, as well as montage information and staging, can be edited in batch configuration file c01. Batch configuration files c03 and c05 also contain parameters; however, these should only be edited by expert users and are not covered in this tutorial.

Channel Decision Criteria

This figure shows the decision criteria for removing artefactual channels during the pipeline. These decisions are based on statistical distributions of the data and the pipeline decision criteria parameters. The parameters can be edited to change the decisions the pipeline makes about channels. Decisions about removing channels are made by the pipeline based on the parameters and cannot be edited by the reviewer during the quality control procedure.

Decision Criteria Channels

A. The voltage variance for each channel calculated for every 1-second epoch.

B. The median, quantiles, and outliers for each 1-second epoch across all of the channels for the measure of voltage variance. In this example, a channel is marked as an outlier if the channel’s voltage variance is greater than 6 times the median to quantile distance. The multiplication factor for the median to quantile distance (6 in this example) is a parameter in c01 that can be edited.

C. Binary (yes/no) decision for each channel at every 1-second epoch based on whether the channel is an outlier during that epoch. Channels that are outliers are shown in yellow.

D. Plot that shows the percentage of time that each channel is an outlier. The red line is the critical cut-off value. If a channel is an outlier for a greater percentage of time than the critical cut-off, it will be marked for removal. The critical cut-off value is a parameter in c01 that can be edited. In this example the critical cut-off value is [0.2], indicating that a channel must be an outlier for more than 20% of the epochs to be marked for removal.
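Panels A to D can be read as a small algorithm. The following is a minimal sketch in Python, not the pipeline's MATLAB code. The function name flag_channels, the toy data, and the exact outlier rule (variance above the median plus the multiplication factor times the distance from the median to the upper quantile) are assumptions made for illustration. The arguments o, f_o and vals stand for the multiplication factor, the critical cut-off and the quantiles.

```python
from statistics import median


def quantile(values, q):
    """Linearly interpolated quantile of a list, with q between 0 and 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def flag_channels(variance, o=6.0, f_o=0.2, vals=(0.3, 0.7)):
    """variance[ch][ep] is the voltage variance of channel ch in epoch ep (panel A)."""
    n_ch, n_ep = len(variance), len(variance[0])
    outlier = [[False] * n_ep for _ in range(n_ch)]
    for ep in range(n_ep):
        # Panel B: distribution across channels within one epoch.
        column = [variance[ch][ep] for ch in range(n_ch)]
        med = median(column)
        # Assumed rule: only the upper quantile (vals[1]) is used in this sketch.
        threshold = med + o * (quantile(column, vals[1]) - med)
        for ch in range(n_ch):
            # Panel C: yes/no outlier decision per channel and epoch.
            outlier[ch][ep] = variance[ch][ep] > threshold
    # Panel D: fraction of epochs in which each channel is an outlier.
    fraction = [sum(row) / n_ep for row in outlier]
    flagged = [ch for ch in range(n_ch) if fraction[ch] > f_o]
    return outlier, fraction, flagged


# Toy data (hypothetical): seven well-behaved channels, five epochs.
base = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4]
variance = [[base[(ch + ep) % 7] for ep in range(5)] for ch in range(7)]
variance.append([40.0] * 5)  # channel 7 is noisy in every epoch
variance[0][2] = 50.0        # channel 0 has one brief artefact

outlier, fraction, flagged = flag_channels(variance)
print(fraction)  # share of epochs in which each channel is an outlier
print(flagged)   # channels marked for removal
```

In this toy data, channel 7 is an outlier in every epoch and is marked for removal. Channel 0 is an outlier in one epoch out of five, which equals the cut-off of 0.2 but does not exceed it, so it is kept.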

Channel Decision Parameters

In the batch configuration file c01, the parameters that begin with sd_ch_ are related to the channel decision criteria for voltage variance.

The [sd_ch_o] parameter is the factor by which the median to quantile distance is multiplied to determine outliers. Increasing this value will increase the critical distance for outlier detection, meaning that for every channel, fewer epochs will be considered outliers.

The [sd_ch_f_o] parameter is the critical cut-off and indicates what percentage of epochs have to be outliers for a channel to be considered artefactual. Increasing this value means that a greater percentage of epochs have to be outliers for that channel to be marked for removal, which results in fewer channels being considered artefactual.

If there are non-artefactual channels that are being marked for removal for the measure of voltage variance, it is recommended to try increasing the [sd_ch_o] and/or the [sd_ch_f_o] parameters.

The [sd_ch_vals] parameter specifies the quantiles used for the critical distance. The value for this parameter in the above example is [.3 .7], indicating that the 30% and 70% quantiles are used. When optimizing parameters, it is recommended to leave this parameter at the default values and change the [sd_ch_o] and [sd_ch_f_o] parameters to change pipeline decisions.
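The effect of raising each parameter can be checked with simple arithmetic. The sketch below is written in Python for illustration only. The helper names, the channel labels and all of the numbers are hypothetical, and the threshold formula is an assumed reading of "median to quantile distance" rather than the pipeline's actual code.

```python
def upper_threshold(med, upper_quantile, sd_ch_o):
    """Assumed rule: median plus sd_ch_o times the median-to-quantile distance."""
    return med + sd_ch_o * (upper_quantile - med)


def channels_marked(outlier_fraction, sd_ch_f_o):
    """Channels whose share of outlier epochs exceeds the critical cut-off."""
    return [ch for ch, frac in outlier_fraction.items() if frac > sd_ch_f_o]


# Hypothetical values for one epoch: median variance 10, 70% quantile 12.
print(upper_threshold(10.0, 12.0, 6))  # 22.0
print(upper_threshold(10.0, 12.0, 8))  # 26.0, a larger critical distance

# Hypothetical share of epochs in which each channel was an outlier.
outlier_fraction = {"E1": 0.05, "E2": 0.25, "E3": 0.60, "E4": 0.95}
print(channels_marked(outlier_fraction, 0.2))  # ['E2', 'E3', 'E4']
print(channels_marked(outlier_fraction, 0.5))  # ['E3', 'E4']
```

Raising the multiplication factor moves the outlier threshold further from the median, and raising the cut-off shortens the list of marked channels.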

Time Decision Criteria

This figure shows the decision criteria for removing artefactual time during the pipeline. These decisions are based on statistical distributions of the data and the pipeline decision criteria parameters. The parameters can be edited to change the decisions the pipeline makes about time periods. The time decision criteria follow the same idea as the channel decision criteria, but the data are collapsed in the other direction.

Decision Criteria Time

A. The voltage variance for each channel calculated for every 1-second epoch.

B. The median, quantiles, and outliers for each channel across all of the time points (1-second epochs) for the measure of voltage variance. In this example, a time point is marked as an outlier if the time point’s voltage variance is greater than 6 times the median to quantile distance. The multiplication factor for the median to quantile distance (6 in this example) is a parameter in c01 that can be edited.

C. Binary (yes/no) decision for each time point at every channel based on whether the time point is an outlier for that channel. Time points that are outliers are shown in yellow.

D. Plot that shows the percentage of channels that are outliers for each time point. The red line is the critical cut-off value. If a time point is an outlier for a greater percentage of channels than the critical cut-off, it will be marked for removal. The critical cut-off value is a parameter in c01 that can be edited. In this example the critical cut-off is [0.2], indicating that at any given time point, more than 20% of channels must be outliers for the time point to be identified as artefactual.
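The time criteria can be sketched in the same way, with the data collapsed in the other direction. This Python sketch is illustrative and is not the pipeline's MATLAB code. It assumes that the distribution for each channel is taken across time points (as panel C suggests) and that only the upper quantile is used. The function name flag_time_points and the toy data are hypothetical.

```python
from statistics import median


def quantile(values, q):
    """Linearly interpolated quantile of a list, with q between 0 and 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def flag_time_points(variance, o=6.0, f_o=0.2, upper_q=0.7):
    """variance[ch][ep] is the voltage variance of channel ch in epoch ep."""
    n_ch, n_ep = len(variance), len(variance[0])
    outlier = []
    for row in variance:
        # Distribution across time points within one channel.
        med = median(row)
        threshold = med + o * (quantile(row, upper_q) - med)
        outlier.append([v > threshold for v in row])
    # Fraction of channels that are outliers at each time point.
    fraction = [sum(outlier[ch][ep] for ch in range(n_ch)) / n_ch
                for ep in range(n_ep)]
    flagged = [ep for ep in range(n_ep) if fraction[ep] > f_o]
    return fraction, flagged


# Toy data (hypothetical): five channels, ten 1-second epochs.
base = [0.8, 0.9, 1.0, 1.1, 1.2]
variance = [[base[(ch + ep) % 5] for ep in range(10)] for ch in range(5)]
for ch in range(4):
    variance[ch][6] *= 30.0  # burst across four of five channels at epoch 6
variance[4][2] *= 30.0       # isolated artefact on one channel at epoch 2

fraction, flagged_epochs = flag_time_points(variance)
print(fraction)        # share of channels that are outliers, per epoch
print(flagged_epochs)  # epochs marked for removal
```

Epoch 6 is an outlier on four of five channels and is marked. Epoch 2 is an outlier on a single channel, which equals the cut-off of 0.2 but does not exceed it, so it is kept.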

Time Decision Parameters

In the batch configuration file c01 the parameters that begin with sd_t_ are related to the time decision criteria for voltage variance.

The [sd_t_o] parameter is the factor by which the median to quantile distance is multiplied to determine outliers. Increasing this value will increase the critical distance for outlier detection. This means that at every time point, fewer channels will be considered outliers.

The [sd_t_f_o] parameter is the critical cut-off and indicates what percentage of channels have to be outliers at a time point for that time point to be considered artefactual. Increasing this value means that a greater percentage of channels have to be outliers for that time point to be marked for removal, which results in fewer time points being considered artefactual.

If there are non-artefactual time points that are being marked for removal for the measure of voltage variance, it is recommended to try increasing the [sd_t_o] and/or the [sd_t_f_o] parameters.

The [sd_t_vals] parameter specifies the quantiles used for the critical distance. The value for this parameter in the above example is [.3 .7], indicating that the 30% and 70% quantiles are used. When optimizing parameters, it is recommended to leave this parameter at the default values and change the [sd_t_o] and [sd_t_f_o] parameters to change pipeline decisions.


Key Points

  • Parameters in batch configuration files can be edited to change the decisions made in the pipeline scripts.

  • Editing parameters is the only way to change the decisions the pipeline has made about channels.


Determining Optimal Parameters

Overview

Teaching: 35 min
Exercises: 10 min
Questions
  • How do you determine the optimal parameters for a dataset?

Objectives
  • Understand how to optimize parameters for each dataset.

Running the localParam.m Script

Prior to running the data through the pipeline, optimal parameters can be determined by running the localParam.m script on a loaded data file. This script is a version of the s01 pipeline script that is designed to be run locally. Running the script adds annotations to the file that enable you to quickly determine the impact of parameter edits and find optimal values. After the script is run, the annotations added to the EEG channel scroll plot can be visually inspected. The decisions that were made about channels and epochs show whether appropriate parameters were used.

Channels that are artefactual for most of the file should be identified by the pipeline and marked for removal. Channels that are only artefactual for short periods of time, such as channels that contain eyeblinks, should remain in the data. The artefacts in these channels (including the eyeblinks) can be isolated into a component that can then be marked for removal. Decisions about channels cannot be edited during the QC procedure, so parameters have to be edited in order to change channel decisions.

Decisions about time can be edited by the reviewer during the QC procedure; however, it is still important to ensure that the pipeline is making appropriate decisions about time. Epochs that have been marked for removal by the pipeline should be artefactual.

  1. Open MATLAB and navigate to your local project directory to make it your current path.

  2. Open EEGLAB by typing the following into the command window:

     >> addpath derivatives/BIDS-Lossless-EEG/code/install
     >> lossless_path
     >> eeglab
    
  3. If you are using an older version of MATLAB (pre-2014b), you will need to set the default figure renderer to OpenGL by typing the following into the command window:

     >> set(0,'DefaultFigureRenderer','OpenGL');
    
  4. In the EEGLAB drop-down menu, navigate to Import data-> Using EEGLAB functions and plugins-> From BIDS subject folder. In the file chooser navigate into the sub-001/eeg directory, select the sub-001_task-faceFO_eeg.set file and press open.

  5. Once the file has loaded, run the localParam.m script by typing the following into the command window:

     >> localParam
    
  6. The script will display ‘Done!’ in the command window when it has completed. The script made decisions about the data using the default parameters in the localParam.m script. To visualize the decisions that have been made, navigate in the EEGLAB drop-down menu to Edit and press Visually edit in scroll plot.

  7. In the Select visual editing parameters pop-up window, type 60 in the Y axis spacing [spacing] field and 40 in the window time length [winlength] field. The window should look like this:

    Visually Edit in Scroll Plot Window

  8. Press enter after you finish inputting the values to ensure that they have been saved. Then press the Ok button. This will load a window containing the EEG channel data.

  9. The script has added annotations to the file that show the decisions the script has made. These decisions are based on the parameters. First look at the decisions that have been made about channels. Channels that have been marked artefactual are shown in gray and have a gray manual flag and a coloured flag on the left side of the data window. In this example, there are several channels that have been marked as artefactual with the annotation ch_sd. These channels were found to be artefactual on the basis of channel voltage variance, using the channel decision criteria introduced in the previous episode.

  10. Scroll through the file to see the decisions that have been made about time. There is a period of time at 230s that has been marked for removal because of large artefacts across most channels.

Editing Parameters

  1. To edit the parameters, navigate to the localParam.m script (Face13/derivatives/BIDS-Lossless-EEG/code/scripts/localParam.m) and open the file in a plain text editor.

  2. The parameters are located at the top of this file. A description and the default value for each parameter can be found on the Lossless pipeline wiki.

  3. To investigate the impact of editing parameters, change the value for the sd_ch_o parameter to 4. This parameter is located on line 17 and the default value of this parameter is 16. Be sure to save the changes that have been made.

What will the impact of lowering the sd_ch_o parameter be?

Before you visually inspect the data, what do you think the outcome of the parameter edit will be? Will editing this parameter change the decisions that are made regarding epochs or channels? Will the script mark more or less as artefactual?

Solution

The sd_ch_o parameter is related to outlier detection for channels and is the factor by which the median to quantile distance is multiplied to determine outliers. Lowering this value will make the critical distance for determining outliers smaller and therefore more channels will be marked as artefactual.
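The reasoning in this solution can be checked on a single epoch of made-up numbers. The Python sketch below is illustrative and is not localParam.m. The variance values are hypothetical, the 70% quantile is assumed from the [.3 .7] example in the previous episode, and the threshold formula is an assumed reading of "median to quantile distance".

```python
from statistics import median


def quantile(values, q):
    """Linearly interpolated quantile of a list, with q between 0 and 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def count_outlier_channels(column, sd_ch_o, upper_q=0.7):
    """Count channels above median + sd_ch_o * (upper quantile - median)."""
    med = median(column)
    threshold = med + sd_ch_o * (quantile(column, upper_q) - med)
    return sum(v > threshold for v in column)


# Hypothetical voltage variance of ten channels within one 1-second epoch.
column = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 2.5, 3.5, 30.0]

n_default = count_outlier_channels(column, 16)  # default value of sd_ch_o
n_lowered = count_outlier_channels(column, 4)   # edited value of sd_ch_o
print(n_default, n_lowered)
```

With the factor at 16, only the channel with variance 30.0 lies beyond the threshold. With the factor at 4, the threshold moves closer to the median and the channel with variance 3.5 becomes an outlier as well.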

  4. If you would like to make a direct comparison between the different parameters, you may leave the scroll plot open. Otherwise, you can close this scroll plot before running the localParam.m script with edited parameters.

  5. To prepare for running the edited localParam.m script, the file and workspace memory need to be cleared. To do this, first navigate in the EEGLAB drop-down menu to File and click Clear study/Clear all.

  6. In the MATLAB Command Window, type clear. This should clear all of the variables from memory.

Steps between each run of the localParam.m script

Steps 4-6 should be repeated between any runs of the localParam.m script.

  7. Run the localParam.m script by typing the following in the command window:

     >> localParam
    
  8. The script will display ‘Done!’ in the command window when it has completed. To visualize the decisions that have been made, navigate in the EEGLAB drop-down menu to Edit and press Visually edit in scroll plot.

  9. Investigate the channels that have been marked as artefactual, focusing on the ch_sd mark. In the scroll plot, it is evident that lowering the sd_ch_o parameter resulted in more channels being marked as artefactual. Looking at the channels that have been marked, it can be seen that many of them are not truly artefactual and should not be marked for removal. This indicates that the lowered parameter is not ideal for this dataset and that the default value was more appropriate.

Editing other parameters

Other parameters in the localParam.m script can be edited by repeating this procedure. For example, the parameters sd_t_o and sd_t_f_o can be edited to change the time decision criteria function. The sd_ch_f_o parameter can also be edited to change the channel decision criteria function. The low_bound_hz and high_bound_hz parameters are for filtering and can be edited to change the filtering that is applied to the data.

Investigating Decision Making Criteria Figures

  1. The output figures for the time and channel decision criteria functions that were introduced in the previous episode can be plotted while running the localParam.m script. To plot the figures, navigate to the localParam.m script (Face13/derivatives/BIDS-Lossless-EEG/code/scripts/localParam.m) and open the file in a plain text editor.

  2. Scroll through the script until you reach the Calculate Data SD section, which says identifying comically bad epochs. In the script there will be a line that says 'plot_figs' 'off'. To plot the figures, change the off to on. Ensure that the changes have been saved. The script should look like this:

    Plot Epoch Decision Criteria Figures

Starting a new MATLAB session

Note that if you are starting a new MATLAB session you will need to complete steps 1-3 from the beginning of this episode to set the path for and open EEGLAB.

  1. In the EEGLAB drop-down menu, navigate to Import data-> Using EEGLAB functions and plugins-> From BIDS subject folder. In the file chooser navigate into the sub-001/eeg directory, select the sub-001_task-faceFO_eeg.set file and press open.

  2. Once the file has loaded, run the localParam.m script by typing the following into the command window:

     >> localParam
    
  3. The script will display ‘Done!’ in the command window when it has completed. The script will have produced 4 figures as it ran. These figures represent the same time decision criteria from the previous episode but show the decisions that are being made with the current parameters on the loaded file from the Face13 dataset.

Face13 Time Decision Criteria Figs

While trying to determine optimal parameters for a dataset, it can be helpful to plot the figures to see how parameter edits influence the different steps in the decision making criteria function.

Investigating Channel Decision Making Criteria

The figures for the channel decision making criteria can also be plotted. The decision criteria for channels are located below the time decision criteria in the script. The line that says 'plot_figs' 'off' needs to be changed to say 'plot_figs' 'on'. The script should look like this:

Plot Channel Decision Criteria Figures.


Key Points

  • Optimal parameters can be determined by running the localParam.m script.

  • Once optimal parameters are determined, these values can be input into the batch configuration file c01.


Making Decisions About Parameters

Overview

Teaching: 15 min
Exercises: 15 min
Questions
  • How do you identify incorrect parameters?

Objectives
  • Understand how to identify and optimize incorrect parameters.

Determining Optimal Parameters

The following examples are datasets that have been run through the pipeline with incorrect parameters. In each example, determine which parameters are incorrect and how they should be corrected. Remember to look at the decisions the pipeline has made about removing channels and time, and to check that the correct montage was used.

Examples

Example 1

Parameters Example 1 Channels

Parameters Example 1 Components

Parameters Example 1 Topos

Identifying Incorrect Parameters

  • For Channels: In this example, too many channels are being removed for low_r and ch_sd. Both of these marks are being added to non-artefactual channels, indicating that the parameters are not optimized. When scrolling through the data, channels that are being removed for low_r or ch_sd should be visibly artefactual. For example, in this figure there are a few channels at the bottom that are truly artefactual. These are channels that we would want to be marked for removal. There are also many channels at the top of the file that are being marked for removal because they were outliers on the channel voltage standard deviation measure (ch_sd). These channels contain eyeblinks but are not otherwise artefactual. We would want these channels to remain in the data for the ICA because the eyeblink artefacts can be isolated into one component and removed.

  • For Time: In this one screenshot there were no decisions that the pipeline made about time. We can agree that there weren’t any artefactual time periods that should have been marked for removal.

  • For Topographies: The topographies in this example are warped and labeled correctly. The easiest way to determine that the topographies are correct is by looking in the component data scroll plot for the component that contains eyeblinks. The topography for this eye component should have activation at the front of the head and be labeled as an eye component. In this example we can identify component 1 as containing eyeblinks and we can see in the topography that the activation is at the front of the head and that it has been correctly labeled as eye.

Determining Optimal Parameters

  • For Channels: The parameters for low_r and ch_sd for channels should be edited so the pipeline makes more accurate decisions about which channels are artefactual. The low_r marker identifies channels that have a low correlation coefficient to neighbouring channels throughout the file. The main parameters that can be edited to change low_r decisions are the [r_ch_o] and [r_ch_f_o] parameters.

The [r_ch_o] parameter is the threshold for identifying outlier channels during the fixed neighbour correlation criteria. This parameter is the multiplication factor that determines how many quantiles beyond the median a channel must be for it to be identified as an outlier. A higher value indicates that for a channel to be marked as an outlier, the channel must be further from the median.

The [r_ch_f_o] parameter is the threshold for flagging channels during the fixed neighbour correlation criteria and is used to determine what percentage of the file a channel has to be an outlier for it to be marked. A higher value indicates that the channel has to be an outlier for a greater percentage of the file to be marked.
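The role of [r_ch_o] can be illustrated with a rough Python sketch. This is not the pipeline's implementation: the function name low_r_outliers, the use of a 30% lower quantile, the channel labels, the correlation values and both r_ch_o values are hypothetical and chosen only to show the direction of the effect within a single epoch.

```python
from statistics import median


def quantile(values, q):
    """Linearly interpolated quantile of a list, with q between 0 and 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def low_r_outliers(correlations, r_ch_o, lower_q=0.3):
    """Channels whose neighbour correlation falls below
    median - r_ch_o * (median - lower quantile). Assumed rule."""
    values = list(correlations.values())
    med = median(values)
    threshold = med - r_ch_o * (med - quantile(values, lower_q))
    return [ch for ch, r in correlations.items() if r < threshold]


# Hypothetical correlation of each channel with its neighbours in one epoch.
correlations = {
    "E1": 0.95, "E2": 0.93, "E3": 0.92, "E4": 0.90, "E5": 0.90,
    "E6": 0.88, "E7": 0.87, "E8": 0.85, "E9": 0.60, "E10": 0.10,
}

print(low_r_outliers(correlations, 3))   # ['E9', 'E10']
print(low_r_outliers(correlations, 12))  # ['E10']
```

With the larger factor, a channel must be further below the median correlation before it counts as an outlier, so the moderately correlated channel E9 is no longer picked up.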

Increasing the value of either the [r_ch_o] or [r_ch_f_o] parameters would decrease the number of channels that are being marked for low_r. Since non-artefactual channels are being marked in this example, we want fewer to be marked.


Key Points

  • FIXME