Class Bio::Blast::Fastacmd
In: lib/bio/io/fastacmd.rb
Parent: Object

DESCRIPTION

Retrieves FASTA formatted sequences from a BLAST database using the NCBI fastacmd command.

This class requires the 'fastacmd' command and a BLAST database formatted with the '-o T' option of 'formatdb'.

USAGE

 require 'bio'

 fastacmd = Bio::Blast::Fastacmd.new("/db/myblastdb")

 entry = fastacmd.get_by_id("sp:128U_DROME")
 fastacmd.fetch("sp:128U_DROME")
 fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])

 fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"]).each do |fasta|
   puts fasta
 end

REFERENCES

Methods

each   each_entry   fetch   get_by_id   new  

Included Modules

Enumerable

Attributes

database  [RW]  Database file path.
fastacmd  [RW]  fastacmd command file path.

Public Class methods

new(blast_database_file_path)

This method provides a handle to a BLASTable database, which you can then use to retrieve sequences.

Prerequisites:

  • You have created a BLASTable database with the '-o T' option.
  • You have the NCBI fastacmd tool installed.

For example, suppose the original input file looks like:

 >my_seq_1
 ACCGACCTCCGGAACGGATAGCCCGACCTACG
 >my_seq_2
 TCCGACCTTTCCTACCGCACACCTACGCCATCAC
 ...

and you've created a BLASTable database from that with the command

 cd /my_dir/
 formatdb -i my_input_file -t Test -n Test -o T

then you can get a handle to this database with the command

 fastacmd = Bio::Blast::Fastacmd.new("/my_dir/Test")

Arguments:

  • database: path and name of the BLASTable database

[Source]

    # File lib/bio/io/fastacmd.rb, line 81
    def initialize(blast_database_file_path)
      @database = blast_database_file_path
      @fastacmd = 'fastacmd'
    end
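
One of the prerequisites above is that the NCBI fastacmd tool is installed. Since the constructor defaults the command name to 'fastacmd', the executable presumably has to be reachable through PATH unless the fastacmd attribute is set to a full path. The following is a minimal sketch of a check you could run before creating the handle; find_executable is a hypothetical helper written for this illustration and is not part of BioRuby.

```ruby
# Hypothetical helper (not part of BioRuby): returns the full path of an
# executable found in the given PATH-style string, or nil if there is none.
def find_executable(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

path = find_executable('fastacmd')
if path
  puts "fastacmd found at #{path}"
else
  puts 'fastacmd not found on PATH; install it or set the fastacmd attribute to its full path'
end
```

If the tool lives outside PATH, the writable fastacmd attribute listed under Attributes can be assigned the full path after calling new.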

Public Instance methods

each()

Alias for each_entry

each_entry()

Iterates over all sequences in the database.

 fastacmd.each_entry do |fasta|
   p [ fasta.definition[0..30], fasta.seq.size ]
 end

Returns: self, after yielding a Bio::FastaFormat object for each iteration

[Source]

     # File lib/bio/io/fastacmd.rb, line 130
     def each_entry
       cmd = [ @fastacmd, '-d', @database, '-D', '1' ]
       Bio::Command.call_command(cmd) do |io|
         io.close_write
         Bio::FlatFile.open(Bio::FastaFormat, io) do |f|
           f.each_entry do |entry|
             yield entry
           end
         end
       end
       self
     end
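
Because the class includes Enumerable and each is an alias for each_entry, the usual Enumerable methods (map, select, and so on) should be available on a Fastacmd object. The sketch below shows that pattern with a stand-in class, since running the real one needs fastacmd and a database. FakeFastacmd and Entry are hypothetical names used only for this illustration; the two sequences are the ones from the formatdb example above.

```ruby
# Stand-in for Bio::FastaFormat with only the two fields used here (hypothetical).
Entry = Struct.new(:definition, :seq)

# Stand-in for Bio::Blast::Fastacmd (hypothetical): yields in-memory entries
# instead of reading them from the fastacmd command.
class FakeFastacmd
  include Enumerable

  def initialize(entries)
    @entries = entries
  end

  # Same shape as the real method: yield each entry, then return self.
  def each_entry
    @entries.each { |entry| yield entry }
    self
  end
  alias each each_entry
end

db = FakeFastacmd.new([
  Entry.new('my_seq_1', 'ACCGACCTCCGGAACGGATAGCCCGACCTACG'),
  Entry.new('my_seq_2', 'TCCGACCTTTCCTACCGCACACCTACGCCATCAC')
])

# Enumerable methods are built on each, so they work without extra code.
lengths = db.map { |entry| [entry.definition, entry.seq.size] }
long    = db.select { |entry| entry.seq.size > 32 }.map(&:definition)

p lengths
p long
```

With a real database every call to an Enumerable method runs fastacmd again and streams the whole database, so it may be worth collecting results once for large databases.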

fetch(list)

Gets the sequences for a list of IDs in the database.

For example:

 p fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])

This method always returns an array of Bio::FastaFormat objects, even when the result is a single entry.


Arguments:

  • list: an array of IDs, or a single ID string, to retrieve from the database

Returns: an array of Bio::FastaFormat objects

[Source]

     # File lib/bio/io/fastacmd.rb, line 109
     def fetch(list)
       if list.respond_to?(:join)
         entry_id = list.join(",")
       else
         entry_id = list
       end

       cmd = [ @fastacmd, '-d', @database, '-s', entry_id ]
       Bio::Command.call_command(cmd) do |io|
         io.close_write
         Bio::FlatFile.new(Bio::FastaFormat, io).to_a
       end
     end
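
The ID handling in fetch can be illustrated without a BLAST database. The following is a minimal sketch that mirrors the first half of the source above; build_fetch_command is a hypothetical helper, not part of BioRuby, and it only builds the argument array rather than running fastacmd.

```ruby
# Hypothetical helper (not part of BioRuby): builds the argument array
# that fetch hands to Bio::Command.call_command, without executing it.
def build_fetch_command(fastacmd, database, list)
  # Anything responding to :join (such as an Array) becomes one
  # comma-separated string; a plain String is passed through unchanged.
  entry_id = list.respond_to?(:join) ? list.join(",") : list
  [fastacmd, '-d', database, '-s', entry_id]
end

single = build_fetch_command('fastacmd', '/db/myblastdb', 'sp:128U_DROME')
multi  = build_fetch_command('fastacmd', '/db/myblastdb',
                             ['sp:1433_SPIOL', 'sp:1432_MAIZE'])

p single
p multi
```

Both calls produce a single '-s' argument, which is why fetch accepts either a string or an array and still returns an array in both cases.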

get_by_id(entry_id)

Gets the sequence of a specific entry in the BLASTable database. For example:

 entry = fastacmd.get_by_id("sp:128U_DROME")

Arguments:

  • entry_id: ID of an entry in the BLAST database

Returns: a Bio::FastaFormat object

[Source]

    # File lib/bio/io/fastacmd.rb, line 94
    def get_by_id(entry_id)
      fetch(entry_id).shift
    end
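
Since get_by_id takes the first element of the array returned by fetch, it would return nil whenever that array is empty. The sketch below shows the pattern with in-memory stand-ins. It assumes fetch yields an empty array for an unknown ID, which the source above does not state; fake_fetch, fake_get_by_id, and SEQUENCES are hypothetical names for this illustration only.

```ruby
# In-memory stand-in for a database (hypothetical), using a sequence
# from the formatdb example earlier in this page.
SEQUENCES = {
  'my_seq_1' => 'ACCGACCTCCGGAACGGATAGCCCGACCTACG'
}.freeze

# Stand-in for fetch: always returns an Array, empty when nothing matches.
def fake_fetch(ids)
  Array(ids).map { |id| SEQUENCES[id] }.compact
end

# Same shape as get_by_id: first element of the fetched array, or nil.
def fake_get_by_id(id)
  fake_fetch(id).shift
end

hit  = fake_get_by_id('my_seq_1')
miss = fake_get_by_id('no_such_id')

puts hit.nil?  ? 'not found' : "found #{hit.size} residues"
puts miss.nil? ? 'not found' : "found #{miss.size} residues"
```

Under that assumption, callers of get_by_id may want to check the result for nil before using it.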
