╔═══╦╗░╔╦═══╦═══╦╗░░╔═══╗
║╔═╗║║░║║╔═╗║╔═╗║║░░║╔══╝
║╚═╝║║░║║╚═╝║╚═╝║║░░║╚══╗
║╔══╣║░║║╔╗╔╣╔══╣║░╔╣╔══╝
║║░░║╚═╝║║║╚╣║░░║╚═╝║╚══╗
╚╝░░╚═══╩╝╚═╩╝░░╚═══╩═══╝

Picking Unique Relevant Peptides for viraL Experiments

Version: 0.3.1

Description
Requirements
How-To
How-To-Portable
Workflow

Description

Recent outbreaks of the Ebola and Zika viruses show that virology remains a field with an urgent need for research. In viral diagnostics, mass spectrometry is catching up with the established ELISA and PCR methods. Targeted mass spectrometry-based proteomic analysis is well suited to viral detection because it is highly sensitive to low-abundance peptides. Developing a targeted assay, however, requires a preselection of marker peptides. Purple was designed for this purpose as part of a thesis. Based on user input, Purple identifies peptides that are representative of a species or strain of interest. It compares target peptides against background peptides and checks uniqueness not only against identical background sequences but also against highly similar ones, which matters in biological applications such as viruses with a high mutation rate. In addition, predicting the detectability of the recommended peptides was explored in a proof of concept. The software suggests a sufficient number of peptides that differ strongly from a given background and can be detected. By simplifying targeted assay design, Purple supports robust, time-efficient, high-throughput diagnostic testing in targeted proteomics applications for viruses.
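
The idea of checking a target peptide against both identical and highly similar background peptides can be pictured with a small Python sketch. This is not Purple's implementation: the functions `similarity_percent` and `pick_unique` are hypothetical, the similarity measure is Python's `difflib.SequenceMatcher` used as a stand-in for whatever inexact matching Purple performs, and treating the threshold as "drop a peptide if any background peptide is at least this similar" is an assumption. The peptide strings are arbitrary placeholders.

```python
from difflib import SequenceMatcher


def similarity_percent(a, b):
    """Similarity of two peptide strings on a 0-100 scale (difflib stand-in)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def pick_unique(target_peptides, background_peptides, threshold=70):
    """Keep target peptides that have no exact or highly similar background match.

    Assumption: a peptide is rejected when its best background similarity
    is at or above the threshold. Purple's real rule may differ.
    """
    background = set(background_peptides)
    unique = []
    for peptide in target_peptides:
        if peptide in background:
            continue  # identical sequence exists in the background
        best = max((similarity_percent(peptide, b) for b in background), default=0.0)
        if best < threshold:
            unique.append((peptide, round(best, 1)))
    return unique


if __name__ == "__main__":
    # Placeholder sequences, chosen only to exercise the three cases.
    targets = ["ACDEFGHIK", "LMNPQRSTVW", "ACDEFGHLK"]
    background = ["ACDEFGHIK", "ACDEFGHLR"]
    for peptide, best in pick_unique(targets, background):
        print(peptide, best)
```

In this toy run the first target is dropped as an exact match, the third is dropped because it differs from a background peptide by a single residue, and only the clearly different second target is kept.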

Requirements

Python 3.4+
- tqdm
- biopython

How to use Purple

Download the latest version from the releases page and extract it.
Edit the config file src/config.yml

Parameter	Description	Example	Default
target	List of targets to find unique peptides	[Hepatitis B, Hepatitis A]	No default
threshold	Threshold to filter matches	Values between 0 and 100	70
update_DB	Build a database or use old one	True or False	False
path_DB	Path to folder with fasta files	C:/myFASTAs/	../res/DB/
path_output	Path to output folder to store results	C:/results/	../output/
targetFile	File name of the fasta with target entries	target.fasta
i_am_not_sure_about_target	Option to check targets before matching peptides	True or False	True
max_len_peptides	Maximum length of peptides	Positive numerical values	25
min_len_peptides	Minimum length of peptides	Positive numerical values	5
removeFragments	Option to remove proteins with "(Fragments)" in the header	True or False	No default
print_peptides	Print peptides at the end	True or False	False
comment	Comments for the log book	Text or numbers	no comment
do_genome_mapping	Do additional genome mapping afterwards	True or False	False
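
As a rough illustration of how the parameters in the table fit together, the following Python sketch merges user settings over the documented defaults and checks the documented value ranges. `DEFAULTS`, `REQUIRED` and `validate_config` are hypothetical names written for this README and are not part of Purple's code. The check that the minimum length does not exceed the maximum is an added assumption, and `targetFile` is left out because the table does not state a default for it.

```python
# Hypothetical helper, not part of Purple. Values mirror the table above.
DEFAULTS = {
    "threshold": 70,
    "update_DB": False,
    "path_DB": "../res/DB/",
    "path_output": "../output/",
    "i_am_not_sure_about_target": True,
    "max_len_peptides": 25,
    "min_len_peptides": 5,
    "print_peptides": False,
    "comment": "no comment",
    "do_genome_mapping": False,
}

# Parameters listed with "No default": the user has to supply them.
REQUIRED = ("target", "removeFragments")


def validate_config(user_settings):
    """Merge user settings over the defaults and check the documented ranges."""
    config = dict(DEFAULTS)
    config.update(user_settings)

    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    if not 0 <= config["threshold"] <= 100:
        raise ValueError("threshold must be between 0 and 100")
    if config["min_len_peptides"] <= 0 or config["max_len_peptides"] <= 0:
        raise ValueError("peptide lengths must be positive")
    # Assumption: a minimum above the maximum is treated as a mistake.
    if config["min_len_peptides"] > config["max_len_peptides"]:
        raise ValueError("min_len_peptides must not exceed max_len_peptides")
    return config


if __name__ == "__main__":
    settings = validate_config({"target": ["Hepatitis B"], "removeFragments": True})
    print(settings["threshold"], settings["min_len_peptides"], settings["max_len_peptides"])
```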

Run Purple in the console. Python is required.

```bash
python Purple_Main.py --config config.yml
```
Open results in the output folder (output)
- Peptide: Unique peptide.
- Score: Score of the inexact matching for each peptide.
- Occurrences: Number of occurrences for each peptide.
- Species: Species of the peptide.
- Protein name: Names of the proteins containing this peptide.
- Description: Complete header of the proteins listed in protein name.
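
The result columns listed above can be post-processed with a few lines of Python. The sketch below assumes a tab-separated text layout with a header row that uses exactly these column names. That layout is an assumption, since the actual file format of Purple's output is not described here, and `load_results`, `peptides_per_species` and the rows in `EXAMPLE` are made-up placeholders for illustration only.

```python
import csv
import io


def load_results(handle):
    """Read result rows into dictionaries and convert the numeric columns."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["Score"] = float(row["Score"])
        row["Occurrences"] = int(row["Occurrences"])
        rows.append(row)
    return rows


def peptides_per_species(rows):
    """Count how many reported peptides belong to each species."""
    counts = {}
    for row in rows:
        counts[row["Species"]] = counts.get(row["Species"], 0) + 1
    return counts


# Made-up placeholder rows that only show the assumed layout.
EXAMPLE = (
    "Peptide\tScore\tOccurrences\tSpecies\tProtein name\tDescription\n"
    "PEPTIDEK\t42.0\t1\tExample virus A\tprotein 1\texample header 1\n"
    "SAMPLER\t55.5\t2\tExample virus A\tprotein 2\texample header 2\n"
    "ANQTHERK\t30.0\t1\tExample virus B\tprotein 3\texample header 3\n"
)

if __name__ == "__main__":
    results = load_results(io.StringIO(EXAMPLE))
    for species, count in sorted(peptides_per_species(results).items()):
        print(species, count)
```

To read a real file instead of the placeholder text, pass an open file handle to `load_results` and adjust the delimiter if the output uses a different one.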

How to use Purple portable

Download the latest portable version from the releases page and extract it.
Edit the config file config/config.yml and specify database folder and target.

Parameter	Description	Example	Default
target	List of targets to find unique peptides	[Hepatitis B, Hepatitis A]	No default
threshold	Threshold to filter matches	Values between 0 and 100	70
update_DB	Build a database or use old one	True or False	False
path_DB	Path to folder with fasta files	C:/myFASTAs/	../res/DB/
path_output	Path to output folder to store results	C:/results/	../output/
targetFile	File name of the fasta with target entries	target.fasta
i_am_not_sure_about_target	Option to check targets before matching peptides	True or False	True
max_len_peptides	Maximum length of peptides	Positive numerical values	25
min_len_peptides	Minimum length of peptides	Positive numerical values	5
removeFragments	Option to remove proteins with "(Fragments)" in the header	True or False	No default
print_peptides	Print peptides at the end	True or False	False
comment	Comments for the log book	Text or numbers	no comment
do_genome_mapping	Do additional genome mapping afterwards	True or False	False

Run Purple portable by double-clicking Purple_Main.exe in the main folder (Python is not required), or run it from the command line:

```bash
Purple_Main.exe
```
Open results in the output folder (output)
- Peptide: Unique peptide.
- Score: Score of the inexact matching for each peptide.
- Occurrences: Number of occurrences for each peptide.
- Species: Species of the peptide.
- Protein name: Names of the proteins containing this peptide.
- Description: Complete header of the proteins listed in protein name.

Workflow

Author: Johanna Lechner