Download Wsearch
Overview
Wsearch screens large sets of protein sequences to identify potential Argonaute (AGO)-binding sites.
Requirements
Wsearch requires only Python >= 3.3 and no additional libraries. It runs on any operating system, and no compilation is required.
Input
As input, Wsearch requires one or more protein sequences in FASTA format and a Bidirectional Position-Specific Scoring Matrix (BPSSM).
The matrices for different taxonomic groups (plants, animals and all species) were calculated from experimentally verified Argonaute-binding proteins and their orthologous sequences, and are provided with the software.
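For readers unfamiliar with the FASTA format mentioned above, the following is a minimal sketch of how such a file can be read in plain Python. It is not taken from the Wsearch source; the function name `read_fasta` and the example records are hypothetical and serve only as illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Hypothetical helper for illustration; not part of Wsearch.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up example input: two records, the first wrapped over two lines.
example = """>protein_1 hypothetical example
MGWSGWAAGW
GSWD
>protein_2
MKTAYIAK
""".splitlines()

records = list(read_fasta(example))
```

Each record consists of a header line starting with `>` followed by one or more sequence lines, which the sketch joins into a single string.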
Usage
To see the available options, type python wsearch.py --help
$ python3.3 wsearch.py
usage: wsearch.py [-h] -q QUERY -p PSSM [-o OUTPUT]

wsearch.py scans protein FASTA sequences for W-containing motifs.

optional arguments:
  -h, --help            show this help message and exit
  -q QUERY, -query QUERY, --query QUERY
                        file w/ fasta protein sequences
  -p PSSM, -pssm PSSM, --pssm PSSM
                        PSSM matrix
  -o OUTPUT, -out OUTPUT, --output OUTPUT
                        output file (default: [query].motifs)
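The help output above is in the format produced by Python's standard argparse module. As a rough sketch, an interface with the same options could be declared as follows. This is a reconstruction from the help text only, not the actual wsearch.py source, and the way the default output name is derived (query path plus `.motifs`) is an assumption based on the "[query].motifs" hint.

```python
import argparse


def build_parser():
    """Declare options matching the help text shown above (reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="wsearch.py",
        description="wsearch.py scans protein FASTA sequences "
                    "for W-containing motifs.",
    )
    parser.add_argument("-q", "-query", "--query", dest="query",
                        required=True, help="file w/ fasta protein sequences")
    parser.add_argument("-p", "-pssm", "--pssm", dest="pssm",
                        required=True, help="PSSM matrix")
    parser.add_argument("-o", "-out", "--output", dest="output",
                        help="output file (default: [query].motifs)")
    return parser


# Parse the same arguments as the example command on this page.
args = build_parser().parse_args(
    ["-query", "examples/ago.plants.fa", "-pssm", "pssm/ago.animals.mat"]
)

# Assumption: the default output name is the query path plus ".motifs".
output = args.output or args.query + ".motifs"
```

Because each option is registered under three spellings, `-q`, `-query` and `--query` are interchangeable on the command line.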
Example
To search the ago.plants.fa file containing FASTA sequences using the ago.animals.mat PSSM, run:
$ python3.3 wsearch.py -query examples/ago.plants.fa -pssm pssm/ago.animals.mat
Input
To calculate custom PSSMs, use the makepssm.py script.
Usage
$ python3.3 makepssm.py
usage: makepssm.py [-h] [-q QUERY] [-d DB] [-o OUT]

makepssm.py calculates bidirectional PSSM matrix with Trp(W) residue in center.

optional arguments:
  -h, --help            show this help message and exit
  -q QUERY, -query QUERY, --query QUERY
                        file w/ domain fasta sequences/subsequences
  -d DB, -db DB, --db DB
                        file w/ background fasta sequences/subsequences
  -o OUT, -out OUT, --out OUT
                        output PSSM file (by default as --in [file].pssm)
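To give an intuition for a scoring matrix centered on a Trp (W) residue, the sketch below collects fixed-width windows around each W in a sequence and computes per-position log-odds scores against background residue frequencies. This is a generic, textbook-style PSSM calculation and not the algorithm implemented in makepssm.py. The function names, the flank width, the pseudocount and the toy sequences are all hypothetical, and the sketch does not attempt to reproduce the "bidirectional" aspect of the matrices.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def w_windows(seq, flank):
    """Return windows of length 2*flank+1 centered on each W with full flanks."""
    return [seq[i - flank:i + flank + 1]
            for i, aa in enumerate(seq)
            if aa == "W" and i >= flank and i + flank < len(seq)]


def log_odds_matrix(windows, background, pseudocount=1.0):
    """Return one {residue: log2-odds} dict per window position."""
    bg_counts = Counter(aa for aa in background if aa in AMINO_ACIDS)
    bg_total = sum(bg_counts.values()) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(len(windows[0])):
        counts = Counter(w[pos] for w in windows)
        total = (sum(counts[aa] for aa in AMINO_ACIDS)
                 + pseudocount * len(AMINO_ACIDS))
        column = {}
        for aa in AMINO_ACIDS:
            p_obs = (counts[aa] + pseudocount) / total      # observed frequency
            p_bg = (bg_counts[aa] + pseudocount) / bg_total  # background frequency
            column[aa] = math.log2(p_obs / p_bg)
        matrix.append(column)
    return matrix


# Toy data (made up): one "domain" sequence and one background sequence.
windows = w_windows("GSGWGSGAAGWGDD", flank=2)
matrix = log_odds_matrix(windows, background="MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
```

A positive score at a position means the residue is more frequent there than in the background; by construction, W receives the highest score in the central column.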
Example
To create a PSSM based on the FASTA sequences in ago.animals.fa and background proteins from UniProt, run:
$ python3.3 makepssm.py -query ago.animals.fa -db uniprot_sprot_human.fa -out ago.animals.mat