IDeFIX Version 1.0
A bioinformatics project developed by MF1 at the RKI, Berlin, Germany.
IDeFIX Documentation

IDeFIX is a tool for demultiplexing Illumina NGS data.

It reports inconsistencies between the raw data and the Sample Sheet, checks the Sample Sheet for duplicate indices or index combinations, and removes unwanted characters from it. In addition to the messages printed on the terminal, IDeFIX creates a file named IDeFIX_Report.csv in the project folder. For each index or index combination found in the raw data, this file lists its abundance, how often it occurs in the Sample Sheet, and the corresponding Index ID(s).

[Figure: IDeFIX workflow]

Dependencies and Limitations

  • Works for MiSeq and HiSeq 2000/2500 Systems
    • RunInfo.xml and SampleSheet.csv must be in the project folder (this is their default location)
  • Python 3.4 or higher required
  • Necessary Python modules:
    • numpy
    • pandas
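
The requirements above can be checked before a run. The following is only a minimal sketch and not part of IDeFIX: the helper name missing_requirements and its message texts are hypothetical, while the file names, module names and Python version are taken from the list above.

```python
import importlib.util
import sys
from pathlib import Path

# File and module names come from the requirements list above;
# the helper itself is a hypothetical illustration, not IDeFIX code.
REQUIRED_FILES = ("RunInfo.xml", "SampleSheet.csv")
REQUIRED_MODULES = ("numpy", "pandas")


def missing_requirements(project_dir):
    """Return a list of human-readable problems; the list is empty if all checks pass."""
    problems = []
    if sys.version_info < (3, 4):
        problems.append("Python 3.4 or higher is required")
    for name in REQUIRED_FILES:
        if not (Path(str(project_dir)) / name).is_file():
            problems.append("missing file: " + name)
    for module in REQUIRED_MODULES:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(module) is None:
            problems.append("missing module: " + module)
    return problems
```

Calling missing_requirements("path/to/project_dir") and printing the returned list shows what still needs to be fixed before IDeFIX is started.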

Download

To run IDeFIX, download the repository from GitLab:

git clone https://gitlab.com/rki_bioinformatics/IDeFIX.git

Features

The tool has two main features, which can be run independently of each other or in combination:

Character Correction

This feature removes backslashes, dots, tabs, whitespace and umlauts from SampleSheet.csv, because these characters interfere with the Illumina software "bcl2fastq". Character correction is enabled by default (its default value is True).
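
As an illustration of this kind of clean-up, the sketch below deletes the listed characters from a single Sample Sheet field. It is not taken from the IDeFIX source: the function name strip_unwanted is hypothetical, and the exact character set (in particular which umlauts are covered, and that they are deleted rather than transliterated) is an assumption.

```python
import re

# Assumed character set: backslash, dot, any whitespace (including tabs)
# and the German umlauts. IDeFIX itself may use a different set.
_UNWANTED = re.compile(r"[\\.\säöüÄÖÜ]")


def strip_unwanted(field):
    """Delete the unwanted characters from one Sample Sheet field."""
    return _UNWANTED.sub("", field)
```

The function is meant for individual field values. Applied to a whole file it would also delete the line breaks, since they count as whitespace.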

If you want to create an index report without character removal, please refer to Report of Indices and Optional Arguments.

If you only want to remove the unwanted characters, without creating an index report, use:

python3 path/to/IDeFIX.py path/to/project_dir --only_char_correct True

For more information on optional arguments, see Optional Arguments.

Report of Indices

Based on the binary NGS data and the Sample Sheet, IDeFIX reports index inconsistencies between the two, as well as duplicates within the Sample Sheet. The results are printed on the terminal and written to IDeFIX_Report.csv in the project directory.

By default, unwanted characters (see previous section) are removed beforehand.
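
The index report described above can be pictured with a small sketch. It assumes the read counts per index have already been extracted from the raw data, and it is not the IDeFIX implementation: the function name build_index_report, the input shapes and the row layout are hypothetical. Only the reported quantities (abundance, count in the Sample Sheet, Index IDs) and the threshold default of 1000 come from this document.

```python
from collections import defaultdict


def build_index_report(observed_counts, sample_sheet, threshold=1000):
    """Compare observed indices with Sample Sheet entries.

    observed_counts: dict mapping index sequence -> number of reads (assumed input)
    sample_sheet:    list of (index_id, index_sequence) tuples (assumed input)
    Returns (rows, duplicates); each row is
    (index, reads, count_in_sample_sheet, index_ids).
    """
    ids_by_index = defaultdict(list)
    for index_id, index in sample_sheet:
        ids_by_index[index].append(index_id)

    rows = []
    for index, reads in observed_counts.items():
        if reads >= threshold:  # only sufficiently abundant indices are reported
            ids = ids_by_index.get(index, [])
            rows.append((index, reads, len(ids), ids))
    rows.sort(key=lambda row: row[1], reverse=True)  # most abundant first

    # An index listed more than once in the Sample Sheet is a duplicate
    duplicates = {idx: ids for idx, ids in ids_by_index.items() if len(ids) > 1}
    return rows, duplicates
```

In this sketch, a row whose Sample Sheet count is 0 marks an index that is abundant in the raw data but absent from the Sample Sheet, and a count above 1 marks a duplicate.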

General usage of IDeFIX:

python3 path/to/IDeFIX.py path/to/project_directory <optional_arg> <value>

Optional Arguments

Optional Argument        Description                                                Default
-j, --jobs               number of jobs/processes                                   20
-t, --threshold          minimum number of reads for an index to be reported        1000
-c, --correct_chars      removes backslashes, dots, tabs, whitespace and umlauts    True
                         from the Sample Sheet; saves the corrected version as
                         SampleSheet.csv and the original as
                         SampleSheet_beforeIDeFIX.csv
-C, --only_char_correct  only removes backslashes, dots, tabs, whitespace and       False
                         umlauts, without creating an index report
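
When IDeFIX is started from another Python script, the command line can be assembled from the options listed above. The helper below is a hypothetical sketch, not part of IDeFIX. It assumes that every option is passed as a flag followed by a value, as in the general usage pattern, and that the long option names are spelled with two leading hyphens.

```python
def build_idefix_command(script, project_dir, jobs=None, threshold=None,
                         correct_chars=None, only_char_correct=None):
    """Assemble an argument list; options left as None are omitted,
    so IDeFIX falls back to its own defaults for them."""
    cmd = ["python3", str(script), str(project_dir)]
    options = (
        ("--jobs", jobs),
        ("--threshold", threshold),
        ("--correct_chars", correct_chars),
        ("--only_char_correct", only_char_correct),
    )
    for flag, value in options:
        if value is not None:
            # str() turns booleans into "True"/"False", matching the
            # "--only_char_correct True" form used in this document
            cmd += [flag, str(value)]
    return cmd
```

The returned list can be passed to subprocess.check_call from the standard library, for example subprocess.check_call(build_idefix_command("path/to/IDeFIX.py", "path/to/project_directory", jobs=4)).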