Medical-Blocks

SWEC iEEG Dataset

Dataset summary

The SWEC iEEG Dataset contains fully anonymised multi-channel iEEG recordings collected from a total of 68 subjects with pharmacoresistant epilepsy who underwent pre-surgical evaluation. The data was recorded at the Sleep Wake Epilepsy Center (SWEC) of the Department of Neurology at the Inselspital in Bern, Switzerland. The dataset includes a total of 9328 hours of signal and 704 ictal events, annotated by board-certified epileptologist Prof. Kaspar Schindler.

Dataset structure

The dataset is divided into 68 folders, one per subject. All the files in each folder are prefixed with the ID of the corresponding patient. Each folder contains the entire data for one patient, divided into multiple HDF5 part files of roughly 10GB each (uncompressed). Each folder also contains a total file in the HDF5 virtual dataset (VDS) format, which combines all the parts into one continuous virtual recording for easy access and manipulation. The total file includes the ictal annotations.

The dataset comprises 696 files across 68 folders, for a total compressed size of 4.6TB (see Recordings).

Structure example

As an example of a typical subject folder, here is subject ID04:

|-- ...
|-- ID03
|-- ID04
|     |-- ID04_part_1.h5
|     |-- ID04_part_2.h5
|     |-- ID04_total.h5
|-- ID05
|-- ...

Recordings

The recordings can be accessed at the data/ieeg dataset in every part file, or alternatively at the data/ieeg dataset of the total file for a unified view. Access through the total file is recommended. The ieeg dataset has shape (C, T), with C channels and T samples, and is chunked into pieces containing 3 minutes of signal across all the channels (e.g., each chunk is (64, 92160) for a recording with a sample rate of 512 Hz and 64 channels), to fit a suitable random access pattern. Moreover, the data is compressed with lz4hc, so reading it requires an appropriate decoder (e.g., h5py with hdf5plugin).
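The following is a minimal sketch of reading a time window through the total file, assuming Python with h5py and hdf5plugin installed. The helper names (chunk_shape, chunk_aligned_window, read_window) are hypothetical, and reading sampling_rate from the root attributes of the file is an assumption about where the attributes are stored.

```python
import math


def chunk_shape(n_channels, sampling_rate, chunk_seconds=180):
    """Shape (C, T_chunk) of one chunk holding 3 minutes of all channels."""
    return (n_channels, int(sampling_rate) * chunk_seconds)


def chunk_aligned_window(start_s, stop_s, sampling_rate, chunk_seconds=180):
    """Expand [start_s, stop_s) in seconds to sample bounds on chunk edges."""
    chunk_len = int(sampling_rate) * chunk_seconds
    first = int(start_s * sampling_rate) // chunk_len * chunk_len
    last_sample = math.ceil(stop_s * sampling_rate)
    last = -(-last_sample // chunk_len) * chunk_len  # ceiling division
    return first, last


def read_window(total_path, start_s, stop_s):
    """Read all channels between start_s and stop_s (seconds) as a (C, T) array."""
    # Imported here so the arithmetic helpers above work without HDF5 libraries.
    import h5py
    import hdf5plugin  # noqa: F401  (importing registers the compression filters)

    with h5py.File(total_path, "r") as f:
        # Assumption: the sampling_rate attribute sits on the file root.
        fs = int(f.attrs["sampling_rate"])
        ieeg = f["data/ieeg"]
        lo = int(start_s * fs)
        hi = min(int(stop_s * fs), ieeg.shape[1])
        return ieeg[:, lo:hi]
```

For example, read_window("ID04/ID04_total.h5", 0, 180) would return the first 3 minutes of subject ID04. Windows that start and end on chunk boundaries, as computed by chunk_aligned_window, avoid decompressing a chunk for only part of its content.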

Annotations

The ictal annotations can be accessed at the data/seizures dataset of the total file. The seizures dataset is a structured array with fields onsets and offsets, giving the seizure onsets and offsets, respectively, in seconds since the beginning of the recording.
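Because the annotations are in seconds while the ieeg dataset is indexed in samples, a conversion step is usually needed. The sketch below shows one way to do it; the function name seizure_sample_ranges and its arguments are hypothetical, and rounding to the nearest sample is a choice made here, not something the dataset prescribes.

```python
def seizure_sample_ranges(onsets, offsets, sampling_rate, n_samples=None):
    """Convert seizure onsets/offsets in seconds to (start, stop) sample indices.

    If n_samples (the length T of the recording) is given, stops are clipped
    so that they never point past the end of the signal.
    """
    ranges = []
    for onset, offset in zip(onsets, offsets):
        start = int(round(onset * sampling_rate))
        stop = int(round(offset * sampling_rate))
        if n_samples is not None:
            stop = min(stop, n_samples)
        ranges.append((start, stop))
    return ranges


# With h5py, the two fields could be obtained roughly as follows:
#     seizures = f["data/seizures"][...]
#     onsets, offsets = seizures["onsets"], seizures["offsets"]
```

Each returned pair can then be used to slice the signal, e.g. ieeg[:, start:stop] for the samples of one seizure.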

File integrity

The total file also contains the datasets info/files and info/checksums. The files dataset lists the names of the part files associated with the specific patient, while the checksums dataset contains the blake2b checksum of each part file, which can be used to verify its integrity (e.g., with the b2sum <part> utility).
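As an alternative to b2sum, the check can be sketched in Python with the standard library. The function names below are hypothetical, and the sketch assumes the stored checksums are hexadecimal strings of the default 512-bit BLAKE2b digest (the b2sum default), listed in the same order as the part names.

```python
import hashlib
import os


def blake2b_hexdigest(path, block_size=1 << 20):
    """BLAKE2b digest of a file, read in blocks to keep memory use low."""
    digest = hashlib.blake2b()  # 64-byte digest, the same default as b2sum
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def _text(value):
    """HDF5 strings may come back as bytes; normalise them to str."""
    return value.decode() if isinstance(value, bytes) else str(value)


def verify_parts(folder, names, checksums):
    """Map each part file name to True if its digest matches the stored one."""
    results = {}
    for name, expected in zip(names, checksums):
        name, expected = _text(name), _text(expected)
        actual = blake2b_hexdigest(os.path.join(folder, name))
        results[name] = actual == expected.strip().lower()
    return results
```

Here names and checksums would be the contents of info/files and info/checksums read from the total file, and folder the subject directory containing the parts.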

Other attributes

Every file contains the attributes patient with the patient ID, channels with the number of channels, and sampling_rate with the sampling rate in Hz.

Dataset curation

Preparation

The iEEG signals were recorded intracranially by strip, grid, and depth electrodes. After 16-bit analog-to-digital conversion, the recordings were visually inspected for removal of channels corrupted by artifacts. The signals were then digitally band-pass filtered between 0.5 and 150 Hz using a fourth-order Butterworth forward-backward filter and finally downsampled to either 512Hz or 1024Hz.

Ethical considerations

All the subjects gave written informed consent that their iEEG data might be used for research and teaching purposes. The decision on the necessity for iEEG recordings, the electrode implantation scheme, and the decision about surgical therapy were made entirely on clinical grounds. These decisions were taken prior to and completely independently from the compilation of this dataset.

Additional information

Dataset curators

The dataset was created by Kaspar Schindler and the team at the SWEC. The dataset was further prepared for public availability by Francesco Carzaniga, Kaspar Schindler, Abbas Rahimi, and the team at the SWEC.

Initial version

An initial version containing the first 18 subjects with a different format can be found at this location. Please note that that version is considered obsolete and might be made unavailable without notice.

Licensing

The SWEC iEEG Dataset is licensed under the Community Data License Agreement – Permissive, Version 2.0.

Disclaimer

This dataset may only be used for research. For other applications any liability is denied. In particular, the dataset must not be used for diagnostic purposes.

Citation

If you are using this dataset, please cite the following:

@article{carzaniga2025foundation,
  title={A foundation model with multi-variate parallel attention to generate neuronal activity},
  author={Carzaniga, Francesco and Hersche, Michael and Sebastian, Abu and Schindler, Kaspar and Rahimi, Abbas},
  journal={arXiv preprint arXiv:2506.20354},
  year={2025}
}

Access To Database

To download the database using the browser, please use the following link:
SWEC iEEG Database Access.

If you prefer to download the database via the terminal using the wget command, follow the steps below. Ensure you are logged in to access your authentication tokens.

  1. Download the command list:
    Retrieve the file containing the necessary wget commands:
    Download wget_list_SWECiEEG.csv

  2. Set your environment variables:
    Before executing the commands, make sure to set the required variables in your system: $TOKEN and $REFRESH_TOKEN.


Our work is supported by

University of Bern