Download Wsearch

Installation

Overview

Wsearch screens large data sets of protein sequences for identification of potential AGO-binding sites.

Requirements

Wsearch requires only Python >= 3.3 and no additional libraries. It runs on any operating system, and no compilation is required.

Download

Running Wsearch

Input

As input, Wsearch requires one or more protein sequences in FASTA format and a Bidirectional Position-Specific Scoring Matrix (BPSSM).

The matrices for different taxonomic groups (plants, animals and all species) were calculated from experimentally verified Argonaute-binding proteins and their orthologous sequences, and are provided with the software.
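
To illustrate the FASTA input described above, here is a minimal sketch of a FASTA reader in plain Python. It is not taken from wsearch.py; the function name `read_fasta` and the example records are hypothetical, and the sketch only assumes the usual convention of a `>` header line followed by one or more sequence lines.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines of the current record are concatenated.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record input, given as lines of text.
example = """>protA hypothetical example record
MGWGSSW
NNGW
>protB
MKV
""".splitlines()

records = list(read_fasta(example))
```

With a real file, the same function could be fed an open file handle, for example `read_fasta(open("examples/ago.plants.fa"))`.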

Usage

To see the available options, type python wsearch.py --help

$ python3.3 wsearch.py -h
usage: wsearch.py [-h] -q QUERY -p PSSM [-o OUTPUT]

wsearch.py scans protein FASTA sequences for W-containing motifs.

optional arguments:
  -h, --help            show this help message and exit
  -q QUERY, -query QUERY, --query QUERY
                        file w/ fasta protein sequences
  -p PSSM, -pssm PSSM, --pssm PSSM
                        PSSM matrix
  -o OUTPUT, -out OUTPUT, --output OUTPUT
                        output file (default: [query].motifs)
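
The help text above says that wsearch.py scans sequences for W-containing motifs using a PSSM. How the program scores a site is not described here, so the following is only a rough sketch of one way such a scan could work: every tryptophan (W) is taken as a window centre, and the flanking residues are scored against a position-specific table. The function name, the PSSM layout (offset relative to W mapped to residue scores), the toy matrix, and the threshold are all assumptions made for illustration.

```python
def scan_for_w_motifs(sequence, pssm, flank, threshold):
    """Score every W-centred window; return (position, window, score) hits.

    pssm maps an offset relative to W (e.g. -1, 1) to a dict of
    residue -> score. This layout is an illustrative assumption,
    not the file format used by wsearch.py.
    """
    hits = []
    for i, residue in enumerate(sequence):
        if residue != "W":
            continue
        score = 0.0
        for offset in range(-flank, flank + 1):
            j = i + offset
            if offset == 0 or j < 0 or j >= len(sequence):
                continue  # skip the central W and positions off the ends
            # Residues missing from the table contribute nothing.
            score += pssm.get(offset, {}).get(sequence[j], 0.0)
        if score >= threshold:
            start = max(0, i - flank)
            # Report 1-based position of the central W.
            hits.append((i + 1, sequence[start:i + flank + 1], score))
    return hits


# Toy matrix: G and S next to W are favoured (hypothetical values).
toy_pssm = {-1: {"G": 1.0, "S": 0.5}, 1: {"G": 1.0, "S": 0.5}}
hits = scan_for_w_motifs("MAGWGSKWLL", toy_pssm, flank=1, threshold=1.5)
```

In this toy run only the W flanked by two glycines passes the threshold; the second W, flanked by K and L, scores zero and is dropped.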

Example

To search the ago.plants.fa file containing FASTA sequences using the ago.animals.mat PSSM, run:

$ python3.3 wsearch.py -query examples/ago.plants.fa -pssm pssm/ago.animals.mat

Custom PSSM

Input

To calculate a custom PSSM, use the makepssm.py script.

Usage

$ python3.3 makepssm.py

usage: makepssm.py [-h] [-q QUERY] [-d DB] [-o OUT]

makepssm.py calculates bidirectional PSSM matrix with Trp(W) residue in
center.

optional arguments:
  -h, --help            show this help message and exit
  -q QUERY, -query QUERY, --query QUERY
                        file w/ domain fasta sequences/subsequences
  -d DB, -db DB, --db DB
                        file w/ background fasta sequences/subsequences
  -o OUT, -out OUT, --out OUT
                        output PSSM file (by default as --in [file].pssm)

Example

To create a PSSM based on the FASTA sequences in ago.animals.fa, with the UniProt proteins in uniprot_sprot_human.fa as background, run:

$ python3.3 makepssm.py -query ago.animals.fa -db uniprot_sprot_human.fa -out ago.animals.mat
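
The exact calculation performed by makepssm.py is not documented here. As a rough illustration of the general idea of a W-centred PSSM built from query and background sequences, the sketch below counts residues at each offset around every W in the query set and converts them to log-odds scores against background residue frequencies. The function names, the pseudocount, and the log2-odds formula are assumptions chosen for the example, not the script's actual method.

```python
import math
from collections import Counter


def w_windows(sequence, flank):
    """Yield (offset, residue) pairs for positions flanking each W."""
    for i, residue in enumerate(sequence):
        if residue != "W":
            continue
        for offset in range(-flank, flank + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(sequence):
                yield offset, sequence[j]


def build_pssm(query_seqs, background_seqs, flank, pseudocount=1.0):
    """Return {offset: {residue: log2-odds score}} (illustrative only)."""
    background = Counter("".join(background_seqs))
    alphabet = sorted(background)
    bg_total = sum(background.values())

    # Count residues observed at each offset around W in the query set.
    counts = {}
    for seq in query_seqs:
        for offset, residue in w_windows(seq, flank):
            if residue in background:  # ignore residues with no background
                counts.setdefault(offset, Counter())[residue] += 1

    pssm = {}
    for offset, column in counts.items():
        total = sum(column.values()) + pseudocount * len(alphabet)
        pssm[offset] = {
            aa: math.log2(
                ((column[aa] + pseudocount) / total)
                / (background[aa] / bg_total)
            )
            for aa in alphabet
        }
    return pssm


# Toy data (hypothetical): two query motifs and a tiny background.
pssm = build_pssm(["GWG", "GWS"], ["GGSSAAWW"], flank=1)
```

Residues that are enriched next to W relative to the background get positive scores, and depleted residues get negative scores; the pseudocount keeps unseen residues from producing a logarithm of zero.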