Class: Bio::FlatFile
In: lib/bio/io/flatfile.rb, lib/bio/io/flatfile/autodetection.rb, lib/bio/io/flatfile/buffer.rb, lib/bio/io/flatfile/splitter.rb
Parent: Object
Bio::FlatFile is a helper and wrapper class for reading a biological data file. It acts like an IO object. It can detect the data format automatically, so users do not need to tell the class what the data is.
dbclass [R]: Returns the database class, which is either detected automatically or given in FlatFile#initialize.
entry [R]
raw [R]: If true, raw mode.
skip_leader_mode [RW]: The mode that controls how the leader of the data is skipped.
Same as Bio::FlatFile.open(nil, filename_or_stream, mode, perm, options).
Bio::FlatFile.auto(ARGF)
Bio::FlatFile.auto("embl/est_hum17.dat")
Bio::FlatFile.auto(IO.popen("gzip -dc nc1101.flat.gz"))
# File lib/bio/io/flatfile.rb, line 122
def self.auto(*arg, &block)
  self.open(nil, *arg, &block)
end
Detects the database class (== file format) of the given string. If detection fails, returns false or nil.
# File lib/bio/io/flatfile.rb, line 460
def self.autodetect(text)
  AutoDetect.default.autodetect(text)
end
Detects the database class (== file format) of the given file. If detection fails, returns nil.
# File lib/bio/io/flatfile.rb, line 440
def self.autodetect_file(filename)
  self.open_file(filename).dbclass
end
Detects the database class (== file format) of the given input stream. If detection fails, returns nil. Caution: the method reads some data from the input stream, and that data will be lost.
# File lib/bio/io/flatfile.rb, line 448
def self.autodetect_io(io)
  self.new(nil, io).dbclass
end
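The caution about lost data can be illustrated with the standard library alone. The following is a minimal sketch, not BioRuby code: `naive_detect` is a hypothetical stand-in for format detection, and its ">" rule is an assumption made only for this example. It shows that lines read for detection are no longer available from the same stream unless the stream can be rewound.

```ruby
require 'stringio'

# Hypothetical stand-in for format detection (NOT BioRuby's real rules):
# reads up to max_lines lines and guesses from the first one.
def naive_detect(io, max_lines = 3)
  head = []
  while head.size < max_lines && (line = io.gets)
    head << line
  end
  return nil if head.empty?
  head.first.start_with?('>') ? :fasta_like : nil
end

io = StringIO.new(">seq1\nACGT\n>seq2\nTTGA\n")
format = naive_detect(io, 2)   # consumes the first two lines
remaining = io.read            # only the lines after the consumed ones
io.rewind                      # possible on seekable streams only
everything = io.read           # the full content again
```

On a pipe or socket there is no `rewind`, which is presumably why the caution matters most for non-seekable input.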
This method is OBSOLETE. Please use autodetect_io(io) instead.
# File lib/bio/io/flatfile.rb, line 453
def self.autodetect_stream(io)
  $stderr.print "Bio::FlatFile.autodetect_stream will be deprecated." if $VERBOSE
  self.autodetect_io(io)
end
Executes the block for every entry in the stream. Same as FlatFile.open(*arg) { |ff| ff.each { |entry| … }}.
Bio::FlatFile.foreach('test.fst') { |e| puts e.definition }
# File lib/bio/io/flatfile.rb, line 194
def self.foreach(*arg)
  self.open(*arg) do |flatfileobj|
    flatfileobj.each do |entry|
      yield entry
    end
  end
end
Same as FlatFile.open, except that 'stream' must be an already opened stream object (IO, File, …, or anything else that has a 'gets' method).
Bio::FlatFile.new(Bio::GenBank, ARGF)
Bio::FlatFile.new(Bio::GenBank, IO.popen("gzip -dc nc1101.flat.gz"))
Compatibility Note: You can no longer specify ":raw => true" or ":raw => false". The styles below are DEPRECATED.
# Bio::FlatFile.new(nil, $stdin, :raw=>true) # => ERROR
# Please rewrite as below.
ff = Bio::FlatFile.new(nil, $stdin)
ff.raw = true

# Bio::FlatFile.new(nil, $stdin, true) # => ERROR
# Please rewrite as below.
ff = Bio::FlatFile.new(nil, $stdin)
ff.raw = true
# File lib/bio/io/flatfile.rb, line 225
def initialize(dbclass, stream)
  # 2nd arg: IO object
  if stream.kind_of?(BufferedInputStream)
    @stream = stream
  else
    @stream = BufferedInputStream.for_io(stream)
  end
  # 1st arg: database class (or file format autodetection)
  if dbclass then
    self.dbclass = dbclass
  else
    autodetect
  end
  #
  @skip_leader_mode = :firsttime
  @firsttime_flag = true
  # default raw mode is false
  self.raw = false
end
Bio::FlatFile.open(file, *arg)
Bio::FlatFile.open(dbclass, file, *arg)
Creates a new Bio::FlatFile object to read a file or a stream which contains dbclass data.
dbclass should be a class (or module) or nil. e.g. Bio::GenBank, Bio::FastaFormat.
If file is a filename (that is, an object without a gets method), the method opens a local file named file with File.open(filename, *arg).
When dbclass is omitted or nil is given as dbclass, the method tries to determine the database class (file format) automatically. If detection fails, dbclass is set to nil and FlatFile#next_entry will fail. You can still set dbclass afterwards with the FlatFile#dbclass= method.
Bio::FlatFile.open(Bio::GenBank, "genbank/gbest40.seq")
Bio::FlatFile.open(nil, "embl/est_hum17.dat")
Bio::FlatFile.open("genbank/gbest40.seq")
Bio::FlatFile.open(Bio::GenBank, $stdin)
If it is called with a block, the block will be executed with a new Bio::FlatFile object. If filename is given, the file is automatically closed when leaving the block.
Bio::FlatFile.open(nil, 'test4.fst') do |ff|
  ff.each { |e| print e.definition, "\n" }
end

Bio::FlatFile.open('test4.fst') do |ff|
  ff.each { |e| print e.definition, "\n" }
end
Compatibility Note: *arg is passed as-is to File.open, so you cannot specify ":raw => true" or ":raw => false".
# File lib/bio/io/flatfile.rb, line 80
def self.open(*arg, &block)
  # FlatFile.open(dbclass, file, mode, perm)
  # FlatFile.open(file, mode, perm)
  if arg.size <= 0
    raise ArgumentError, 'wrong number of arguments (0 for 1)'
  end
  x = arg.shift
  if x.is_a?(Module) then
    # FlatFile.open(dbclass, filename_or_io, ...)
    dbclass = x
  elsif x.nil? then
    # FlatFile.open(nil, filename_or_io, ...)
    dbclass = nil
  else
    # FlatFile.open(filename, ...)
    dbclass = nil
    arg.unshift(x)
  end
  if arg.size <= 0
    raise ArgumentError, 'wrong number of arguments (1 for 2)'
  end
  file = arg.shift
  # check if file is filename or IO object
  unless file.respond_to?(:gets)
    # 'file' is a filename
    _open_file(dbclass, file, *arg, &block)
  else
    # 'file' is a IO object
    ff = self.new(dbclass, file)
    block_given? ? (yield ff) : ff
  end
end
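The argument handling in the listing above can be restated as a small stand-alone sketch. It does not use BioRuby: `classify_open_args` and `FakeFormat` are hypothetical names introduced only to show how the first argument is interpreted and how a filename is told apart from a stream. A String counts as a filename because `Kernel#gets` is private, so `respond_to?(:gets)` is false for it.

```ruby
require 'stringio'

# Hypothetical stand-in for a database class such as Bio::GenBank.
FakeFormat = Class.new

# Hypothetical helper (not part of BioRuby) that mirrors the dispatch
# rules of FlatFile.open and reports what each argument was taken to be.
def classify_open_args(*arg)
  raise ArgumentError, 'wrong number of arguments (0 for 1)' if arg.empty?
  first = arg.shift
  if first.is_a?(Module)
    dbclass = first        # open(dbclass, filename_or_io, ...)
  elsif first.nil?
    dbclass = nil          # open(nil, filename_or_io, ...)
  else
    dbclass = nil          # open(filename_or_io, ...)
    arg.unshift(first)
  end
  raise ArgumentError, 'wrong number of arguments (1 for 2)' if arg.empty?
  file = arg.shift
  kind = file.respond_to?(:gets) ? :stream : :filename
  { dbclass: dbclass, kind: kind, file: file, rest: arg }
end

p classify_open_args(FakeFormat, 'genbank/gbest40.seq')
p classify_open_args('embl/est_hum17.dat', 'r')
p classify_open_args(nil, StringIO.new(">seq1\nACGT\n"))
```

The `rest` entry corresponds to the arguments that the real method forwards to File.open when a filename is given.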
Same as FlatFile.auto(filename, *arg), except that it accepts only a filename and does not accept an IO object. The file format is determined automatically.
It can accept a block. If a block is given, it returns the block's return value. Otherwise, it returns a new FlatFile object.
# File lib/bio/io/flatfile.rb, line 144
def self.open_file(filename, *arg)
  _open_file(nil, filename, *arg)
end
Opens URI specified as uri. uri must be a String or URI object. *arg is passed to OpenURI.open_uri or URI#open.
Like FlatFile.open, it can accept a block.
Note that you MUST explicitly require 'open-uri'. Because open-uri.rb modifies existing classes, it is not required by default.
# File lib/bio/io/flatfile.rb, line 177
def self.open_uri(uri, *arg)
  if block_given? then
    BufferedInputStream.open_uri(uri, *arg) do |stream|
      yield self.new(nil, stream)
    end
  else
    stream = BufferedInputStream.open_uri(uri, *arg)
    self.new(nil, stream)
  end
end
Same as FlatFile.auto(filename_or_stream, *arg).to_a
(This method may become OBSOLETE in the future.)
# File lib/bio/io/flatfile.rb, line 129
def self.to_a(*arg)
  self.auto(*arg) do |ff|
    raise 'cannot determine file format' unless ff.dbclass
    ff.to_a
  end
end
Determines the database class (file format). Pre-reads the number of lines given by the lines argument (default 31) for format determination. Returns the database class on success, or nil or false on failure.
The method can be called at any time (though this is not recommended). This might be useful if the input file is a mixture of data in multiple formats.
# File lib/bio/io/flatfile.rb, line 429
def autodetect(lines = 31, ad = AutoDetect.default)
  if r = ad.autodetect_flatfile(self, lines)
    self.dbclass = r
  else
    self.dbclass = nil unless self.dbclass
  end
  r
end
Closes input stream. (similar to IO#close)
# File lib/bio/io/flatfile.rb, line 351
def close
  @stream.close
end
Sets the database class. Please use this only if autodetect fails.
# File lib/bio/io/flatfile.rb, line 400
def dbclass=(klass)
  if klass then
    @dbclass = klass
    begin
      @splitter = @dbclass.flatfile_splitter(@dbclass, @stream)
    rescue NameError, NoMethodError
      begin
        splitter_class = @dbclass::FLATFILE_SPLITTER
      rescue NameError
        splitter_class = Splitter::Default
      end
      @splitter = splitter_class.new(klass, @stream)
    end
  else
    @dbclass = nil
    @splitter = nil
  end
end
(end position of the last entry) + 1
# File lib/bio/io/flatfile.rb, line 322
def entry_ended_pos
  @splitter.entry_ended_pos
end
A flag that controls whether entry start and end positions are recorded.
# File lib/bio/io/flatfile.rb, line 307
def entry_pos_flag
  @splitter.entry_pos_flag
end
Sets the flag that controls whether entry start and end positions are recorded.
# File lib/bio/io/flatfile.rb, line 312
def entry_pos_flag=(x)
  @splitter.entry_pos_flag = x
end
Returns the last raw entry as a string.
# File lib/bio/io/flatfile.rb, line 302
def entry_raw
  @splitter.entry
end
Start position of the last entry.
# File lib/bio/io/flatfile.rb, line 317
def entry_start_pos
  @splitter.entry_start_pos
end
Similar to IO#gets. Internal use only. Users should not call it directly.
# File lib/bio/io/flatfile.rb, line 395
def gets(*arg)
  @stream.gets(*arg)
end
(DEPRECATED) IO object in the flatfile object.
Compatibility Note: Bio::FlatFile#io is deprecated. Please use Bio::FlatFile#to_io instead.
# File lib/bio/io/flatfile.rb, line 255
def io
  warn "Bio::FlatFile#io is deprecated."
  @stream.to_io
end
Gets the next entry. Returns nil when no entry can be read, and raises UnknownDataFormatError if the database class has not been determined.
# File lib/bio/io/flatfile.rb, line 277
def next_entry
  raise UnknownDataFormatError,
    'file format auto-detection failed?' unless @dbclass
  if @skip_leader_mode and
      ((@firsttime_flag and @skip_leader_mode == :firsttime) or
       @skip_leader_mode == :everytime)
    @splitter.skip_leader
  end
  if raw then
    r = @splitter.get_entry
  else
    r = @splitter.get_parsed_entry
  end
  @firsttime_flag = false
  return nil unless r
  if raw then
    r
  else
    @entry = r
    @entry
  end
end
Pathname, filename or URI (or nil).
# File lib/bio/io/flatfile.rb, line 268
def path
  @stream.path
end
Returns current position of input stream. If the input stream is not a normal file, the result is not guaranteed. It is similar to IO#pos. Note that it will not be equal to io.pos, because FlatFile has its own internal buffer.
# File lib/bio/io/flatfile.rb, line 361
def pos
  @stream.pos
end
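Why a buffered reader's position can differ from the position of the underlying IO can be shown with a toy reader. The class below is a simplified, hypothetical model and is not BioRuby's BufferedInputStream: it keeps lines that were read ahead and pushed back, and reports a logical position that subtracts those buffered bytes from io.pos.

```ruby
require 'stringio'

# Hypothetical toy reader (not BioRuby code) with a push-back buffer.
class TinyBufferedReader
  def initialize(io)
    @io = io
    @pushed = []   # lines read from @io but handed back to the reader
  end

  # Returns a pushed-back line if there is one, else reads from the IO.
  def gets
    @pushed.shift || @io.gets
  end

  # Pushes a whole line back so that the next gets returns it again.
  def ungets(line)
    @pushed.unshift(line)
  end

  # Logical position: underlying position minus the buffered bytes.
  def pos
    @io.pos - @pushed.sum(&:bytesize)
  end
end

io = StringIO.new("LOCUS A\nLOCUS B\n")
reader = TinyBufferedReader.new(io)
line = reader.gets      # peek at the first line...
reader.ungets(line)     # ...and hand it back
puts "io.pos=#{io.pos} reader.pos=#{reader.pos}"
```

After the peek, the underlying stream has advanced past the first line while the reader's logical position has not, which is the kind of mismatch the note about io.pos describes.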
(Not recommended to use it.) Sets position of input stream. If the input stream is not a normal file, the result is not guaranteed. It is similar to IO#pos=. Note that it will not be equal to io.pos=, because FlatFile has its own internal buffer.
# File lib/bio/io/flatfile.rb, line 372
def pos=(p)
  @stream.pos=(p)
end
If true is given, the next_entry method returns an entry as text; if false, it returns the entry as a parsed object.
# File lib/bio/io/flatfile.rb, line 386
def raw=(bool)
  @raw = (bool ? true : false)
end
Resets file pointer to the start of the flatfile. (similar to IO#rewind)
# File lib/bio/io/flatfile.rb, line 343
def rewind
  r = (@splitter || @stream).rewind
  @firsttime_flag = true
  r
end
IO object in the flatfile object.
Compatibility Note: Bio::FlatFile#io is deprecated.
# File lib/bio/io/flatfile.rb, line 263
def to_io
  @stream.to_io
end