Here are a few resources to help with bioinformatics tasks. All of the scripts here are written in Perl and require libraries that are installed on our system. These utilities assist with parsing, processing, and comparing sequences. To select an entire script, simply double-click on any section of the code. If you encounter difficulties with any of the scripts listed here, or would like to submit your own, please contact the help desk.
This script parses a blastx or blastn result and offers two output formats that summarize the BLAST output (see below for examples). Both formats ignore queries with no hits.
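As a rough illustration of this kind of summary (not the Perl script itself), the Python sketch below keeps the best-scoring hit per query. It assumes the BLAST result was saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which may differ from the report format the script actually reads; the function name `best_hits` is hypothetical. Queries with no hits never appear in tabular output, so they are skipped automatically.

```python
def best_hits(lines):
    """Return {query_id: (subject_id, evalue, bitscore)} for the top hit per query.

    Assumes tab-separated BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete hit record
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        # Keep the hit with the highest bit score seen so far for this query.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Calling `best_hits(open("result.tsv"))` on such a file (the file name is only an example) would give one summary entry per query that had at least one hit.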
Extracts sub-sequences from sequences on stdin based on a (Perl) regular expression given on the command line. Input sequences must be in labeled fasta format. By default, the labels are searched using the regexp.
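The idea of regexp-based extraction can be sketched as follows. This is a simplified Python illustration, not the script itself: it assumes plain fasta input and searches the residues rather than the labels of a labeled fasta file, and the names `read_fasta` and `extract_matches` are hypothetical.

```python
import re

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of fasta lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def extract_matches(lines, pattern):
    """Return (id, start, end, subsequence) for every regexp match.

    Positions are 1-based and inclusive; matches do not overlap.
    """
    regex = re.compile(pattern)
    hits = []
    for header, seq in read_fasta(lines):
        seq_id = header.split()[0] if header else ""
        for m in regex.finditer(seq):
            hits.append((seq_id, m.start() + 1, m.end(), m.group(0)))
    return hits
```

For example, `extract_matches(sys.stdin, "ATG[ACGT]{3}")` would report each ATG together with the three bases that follow it.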
batchextract retrieves one or more sequence entries from NCBI specified by accession numbers. These must be given on standard input, as separate arguments, or in a file.
Makes a 2D or 3D histogram data-set for gnuplot from data in specified column(s) of an input file. Lines not starting with a real number are ignored. It also understands "framed" tables dumped from databases.
mrnaquery.pl retrieves the accession numbers of all the mRNA sequences for a specified organism from GenBank, omitting ESTs, STSs, working drafts and patents. If the optional third argument is not given, the output is written to STDOUT.
Picks out sequence entries from a sequence file based on id/accession. It takes a newline-separated list of ids/accessions to pick, either as the first argument or from STDIN. GenBank and EMBL entries are identified by their accession, fasta entries by their id.
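For the fasta case, the selection step can be sketched in a few lines of Python. This is only an illustration under the assumption that the id is the first whitespace-delimited word after `>`; GenBank and EMBL parsing is left out, and the name `pick_fasta_entries` is hypothetical.

```python
def pick_fasta_entries(fasta_lines, wanted_ids):
    """Return the lines of every fasta entry whose id is in wanted_ids.

    Entries come back in file order, not in the order of the id list.
    """
    wanted = {i.strip() for i in wanted_ids if i.strip()}
    picked, keep = [], False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # A header line starts a new entry; decide whether to keep it.
            keep = bool(fields) and fields[0] in wanted
        if keep:
            picked.append(line.rstrip("\n"))
    return picked
```

Passing an open sequence file and an open id list (one id per line) would return just the requested entries.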
This script converts between sequence formats. It handles GenBank, EMBL, Fasta and all the other formats supported by bioperl. In addition, it can write labeled fasta (lfa), which is a handy extension of the fasta format.