
Welcome to the Dendrome Project!



TreeGenes::Scripts

The scripts listed here support common bioinformatics tasks: parsing, processing, and sequence comparison. All of them are written in Perl and require libraries that have been installed on our system. You may also add a Perl script that you have developed. If you encounter difficulties with any of the scripts listed here, please contact the help desk.

ace2fasta_contigs_align_pipe.pl

Takes Ace files and outputs an aligned fasta file for each contig, plus a fasta file of the contig consensus sequences. Used in the Re-sequencing alignment pipeline.
[ more info.. ]

addcolumn.pl

This script adds a column to a column file. The values of the new column are specified by an arithmetic string.
[ more info.. ]

alignedcontig2fasta_align_pipe.pl

Generates a fasta file for contigs from reads from multiple phrap outputs, after the contig consensus sequences have been aligned using a program such as probconsRNA.
[ more info.. ]

getcolumns.pl

This script extracts a column from a multicolumn file (splits on whitespace by default). You can specify more than one column at a time by - or -
[ more info.. ]
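
The default behavior described above (split each line on whitespace, then keep one or more columns) can be sketched in a few lines of Python. This is only an illustration, not the Perl script itself: the function name `get_columns` and the 1-based column numbering are assumptions made for the example.

```python
def get_columns(lines, columns):
    """Return the requested 1-based columns from whitespace-separated lines.

    Hypothetical helper for illustration; not part of getcolumns.pl.
    """
    rows = []
    for line in lines:
        fields = line.split()  # split on runs of whitespace
        if not fields:
            continue  # skip blank lines
        # Keep only the requested columns that exist on this line
        rows.append([fields[c - 1] for c in columns if 0 < c <= len(fields)])
    return rows


if __name__ == "__main__":
    table = ["id1 12 0.5", "id2 7 0.9"]
    for row in get_columns(table, [1, 3]):
        print("\t".join(row))
```

Columns that a line does not have are silently skipped in this sketch; the real script may handle short lines differently.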

grepseq.pl

Extracts sub-sequences from sequences on stdin based on a (Perl) regular expression given on the command line. Input sequences must be in labeled fasta format. By default the labels are searched using the regexp.
[ more info.. ]

gseq.pl

batchextract retrieves one or more sequence entries from NCBI specified by accession numbers. These must be given either on standard input, as separate arguments, or from a file.
[ more info.. ]

histogram.pl

Makes a 2D or 3D histogram data-set for gnuplot from data in specified column(s) of an input file. Lines not starting with a real number are ignored. It also understands "framed" tables dumped from databases.
[ more info.. ]
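
To make the simplest case concrete (one data column, giving bin and count pairs), here is a hedged Python sketch of the binning step. It is not the Perl script: the function name `histogram`, the `bin_width` parameter, and the choice to label each bin by its lower edge are all assumptions for illustration.

```python
import math
from collections import Counter


def histogram(lines, column=1, bin_width=1.0):
    """Bin the values of one 1-based column; return sorted (bin_start, count) pairs.

    Hypothetical helper for illustration; not part of histogram.pl.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        try:
            float(fields[0])  # ignore lines that do not start with a real number
            value = float(fields[column - 1])
            bin_start = math.floor(value / bin_width) * bin_width
        except (IndexError, ValueError, OverflowError):
            continue
        counts[bin_start] += 1
    return sorted(counts.items())


if __name__ == "__main__":
    data = ["# comment", "1.2", "1.8", "3.0"]
    for bin_start, count in histogram(data):
        print(bin_start, count)
```

Each output line is a pair that a plotting tool such as gnuplot could read as x and y values.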

mmaquery.pl

mrnaquery.pl retrieves the accession numbers of all the mRNA sequences for a specified organism from GenBank, omitting ESTs, STSs, working drafts, and patents. If the optional third argument is not given, the output is written to STDOUT.
[ more info.. ]

nuclcount.pl

This script counts n-mers in a set of sequences.
[ more info.. ]
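
Counting n-mers is usually done with a sliding window. The Python sketch below shows the idea; it is not the Perl script, and it assumes overlapping windows and case-insensitive counting, neither of which is stated on this page. The name `count_nmers` is hypothetical.

```python
from collections import Counter


def count_nmers(sequences, n):
    """Count every overlapping n-mer across a collection of sequences.

    Hypothetical helper for illustration; not part of nuclcount.pl.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()  # assumption: treat lower and upper case alike
        # Slide a window of width n along the sequence, one base at a time
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


if __name__ == "__main__":
    for nmer, count in sorted(count_nmers(["ACGTAC", "ACG"], 2).items()):
        print(nmer, count)
```

Sequences shorter than n contribute nothing, because the window never fits.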

orfprune.pl

Prunes a sequence file of open reading frames, either by masking or by removing sequence entries.
[ more info.. ]

pickseq.pl

Picks out sequence entries from a sequence file based on id/accession. It takes a newline-separated list of ids/accessions to pick, either as the first argument or from STDIN. GenBank and EMBL entries are identified by their accession, Fasta entries by id.
[ more info.. ]
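
For the Fasta case, picking entries by id comes down to parsing the records and testing each id against the wanted set. The Python sketch below illustrates this under the assumption that the id is the first word of the header line; it is not the Perl script, and the names `parse_fasta` and `pick_sequences` are hypothetical.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of fasta lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def pick_sequences(fasta_lines, wanted_ids):
    """Keep entries whose id (assumed: first word of the header) is wanted."""
    wanted = set(wanted_ids)
    picked = []
    for header, seq in parse_fasta(fasta_lines):
        words = header.split()
        if words and words[0] in wanted:
            picked.append((header, seq))
    return picked


if __name__ == "__main__":
    fasta = [">s1 first", "ACGT", "AC", ">s2", "GG"]
    for header, seq in pick_sequences(fasta, ["s1"]):
        print(">" + header)
        print(seq)
```

Using a set for the wanted ids keeps each lookup cheap even when the id list is long.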

PineSAP_alignment_osx

Runs the Pine SAP alignment pipeline.
[ more info.. ]

reformat.pl

This script reformats between sequence formats. It handles GenBank, EMBL, Fasta and all the other formats supported by bioperl. In addition it formats to labeled fasta (lfa), which is a handy extension of the fasta format.
[ more info.. ]

seqfilediff.pl

Reports the differences between two sequence files, and prints either the sequences unique to file1 or the sequences that the files have in common
[ more info.. ]
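
The two reporting modes map onto a set difference and a set intersection. The Python sketch below shows this on lists of sequence ids. It is not the Perl script, and it assumes entries are compared by id; the page does not say whether the real script compares ids or the sequences themselves. The name `seqfile_diff` and the `common` flag are hypothetical.

```python
def seqfile_diff(ids1, ids2, common=False):
    """Return ids found only in ids1, or ids found in both if common=True.

    The result keeps the order of ids1.
    Hypothetical helper for illustration; not part of seqfilediff.pl.
    """
    second = set(ids2)
    if common:
        return [i for i in ids1 if i in second]
    return [i for i in ids1 if i not in second]


if __name__ == "__main__":
    file1 = ["s1", "s2", "s3"]
    file2 = ["s2", "s4"]
    print("unique to file1:", seqfile_diff(file1, file2))
    print("in common:", seqfile_diff(file1, file2, common=True))
```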

splitbyorg.pl

This script splits EMBL/GenBank/Swissprot entries in a file/stream into subfiles for each organism
[ more info.. ]

tablefilter.pl

This script filters a table based on a specified string of conditions. Quote the string in single quotes ('') so that the shell does not interpolate the $ variables.
[ more info.. ]

tablesort.pl

Sorts a table by the columns you specify. The default is to sort from column one onwards until the row is unique.
[ more info.. ]
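
Sorting "from column one onwards until the row is unique" amounts to comparing rows field by field, with later columns breaking ties. A minimal Python sketch of both modes follows. It is not the Perl script: the name `table_sort`, the 1-based column numbers, and the plain string comparison (no numeric sorting) are assumptions for illustration.

```python
def table_sort(rows, columns=None):
    """Sort rows (lists of strings) by the given 1-based columns.

    With no columns given, rows are compared on column one, then column
    two, and so on, so later columns only break ties.
    Hypothetical helper for illustration; not part of tablesort.pl.
    """
    if columns is None:
        return sorted(rows)  # list comparison proceeds column by column
    return sorted(rows, key=lambda row: [row[c - 1] for c in columns])


if __name__ == "__main__":
    table = [["b", "2"], ["a", "3"], ["a", "1"]]
    for row in table_sort(table):
        print("\t".join(row))
```

Because Python's sort is stable, rows that tie on the chosen columns keep their input order.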

 
