Class | Bio::FlatFileIndex |
In: | lib/bio/io/flatfile/bdb.rb lib/bio/io/flatfile/index.rb lib/bio/io/flatfile/indexer.rb |
Parent: | Object |
Bio::FlatFileIndex is a class for working with OBDA (Open Bio Database Access) flat-file indexes.
MAGIC_FLAT | = | 'flat/1' | magic string for flat/1 index |
MAGIC_BDB | = | 'BerkeleyDB/1' | magic string for BerkeleyDB/1 index |
# File lib/bio/io/flatfile/indexer.rb, line 734
def self.formatstring2class(format_string)
  case format_string
  when /genbank/i
    dbclass = Bio::GenBank
  when /genpept/i
    dbclass = Bio::GenPept
  when /embl/i
    dbclass = Bio::EMBL
  when /sptr/i
    dbclass = Bio::SPTR
  when /fasta/i
    dbclass = Bio::FastaFormat
  else
    raise "Unsupported format : #{format_string}"
  end
end
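The format lookup above is a case-insensitive substring match, so names such as 'GenBank' or 'my-fasta-v2' are accepted. A minimal sketch of the same dispatch, assuming a hypothetical helper `format_to_symbol` that returns symbols instead of Bio::* classes so that it runs without BioRuby installed:

```ruby
# Hypothetical stand-in for formatstring2class: returns symbols rather
# than Bio::* classes, so the sketch has no external dependencies.
def format_to_symbol(format_string)
  case format_string
  when /genbank/i then :genbank
  when /genpept/i then :genpept
  when /embl/i    then :embl
  when /sptr/i    then :sptr
  when /fasta/i   then :fasta
  else
    raise "Unsupported format : #{format_string}"
  end
end

upper    = format_to_symbol('GenBank')      # case is ignored
embedded = format_to_symbol('my-fasta-v2')  # substring match is enough

# An unknown name raises RuntimeError with the offending string.
error_message =
  begin
    format_to_symbol('pdb')
  rescue RuntimeError => e
    e.message
  end
```

Because the first matching pattern wins, the order of the `when` clauses would matter if one format name contained another.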
# File lib/bio/io/flatfile/indexer.rb, line 751
def self.makeindex(is_bdb, dbname, format, options, *files)
  if format then
    dbclass = formatstring2class(format)
  else
    dbclass = Bio::FlatFile.autodetect_file(files[0])
    raise "Cannot determine format" unless dbclass
    DEBUG.print "file format is #{dbclass}\n"
  end

  options = {} unless options
  pns = options['primary_namespace']
  sns = options['secondary_namespaces']

  parser = Indexer::Parser.new(dbclass, pns, sns)

  #if /(EMBL|SPTR)/ =~ dbclass.to_s then
  #a = [ 'DR' ]
  #parser.add_secondary_namespaces(*a)
  #end
  if sns = options['additional_secondary_namespaces'] then
    parser.add_secondary_namespaces(*sns)
  end

  if is_bdb then
    Indexer::makeindexBDB(dbname, parser, options, *files)
  else
    Indexer::makeindexFlat(dbname, parser, options, *files)
  end
end
Opens an existing databank. A databank is a directory that contains indexed files and configuration files. The type of the databank (flat or BerkeleyDB) is determined automatically.
Unlike FlatFileIndex.open, this method does not accept a block.
# File lib/bio/io/flatfile/index.rb, line 113
def initialize(name)
  @db = DataBank.open(name)
end
Opens an existing databank. A databank is a directory that contains indexed files and configuration files. The type of the databank (flat or BerkeleyDB) is determined automatically.
If a block is given, the databank object is passed to the block, the databank is closed automatically when the block terminates, and the value of the block is returned. Without a block, the databank object itself is returned.
# File lib/bio/io/flatfile/index.rb, line 88
def self.open(name)
  if block_given? then
    begin
      i = self.new(name)
      r = yield i
    ensure
      if i then
        begin
          i.close
        rescue IOError
        end
      end
    end
  else
    r = self.new(name)
  end
  r
end
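The block form follows the usual Ruby yield/ensure idiom. The sketch below reproduces that control flow with a hypothetical stand-in class, `DemoIndex` (not part of BioRuby), so that it runs without a databank on disk. It only illustrates that the object is closed when the block finishes and that the block's value is returned.

```ruby
# Hypothetical stand-in for Bio::FlatFileIndex; not part of BioRuby.
class DemoIndex
  attr_reader :name

  def initialize(name)
    @name = name
    @closed = false
  end

  def close
    raise IOError, 'already closed' if @closed
    @closed = true
    nil
  end

  def closed?
    @closed
  end

  # Same shape as FlatFileIndex.open: yield the object, always close it.
  def self.open(name)
    return new(name) unless block_given?

    index = nil
    begin
      index = new(name)
      yield index
    ensure
      if index
        begin
          index.close
        rescue IOError
          # Ignored: the block may already have closed the object.
        end
      end
    end
  end
end

seen = nil
result = DemoIndex.open('demo-db') do |idx|
  seen = idx
  idx.name.upcase
end

still_open = DemoIndex.open('demo-db') # no block: caller must close it
```

With the block form there is no need to call `close` explicitly; without a block, closing remains the caller's responsibility.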
# File lib/bio/io/flatfile/indexer.rb, line 781
def self.update_index(dbname, format, options, *files)
  if format then
    # Resolve the format name to a database class before building the parser.
    dbclass = formatstring2class(format)
    parser = Indexer::Parser.new(dbclass)
  else
    parser = nil
  end
  Indexer::update_index(dbname, parser, options, *files)
end
If true, a consistency check is performed every time the flat files are accessed. If nil or false, no checks are performed.
By default, always_check_consistency is true.
# File lib/bio/io/flatfile/index.rb, line 297
def always_check_consistency(bool)
  @db.always_check
end
If true is given, a consistency check is performed every time the flat files are accessed. If nil or false is given, no checks are performed.
By default, always_check_consistency is true.
# File lib/bio/io/flatfile/index.rb, line 288
def always_check_consistency=(bool)
  @db.always_check=(bool)
end
Checks consistency between the databank (index) and the original flat files.
Raises RuntimeError if the original flat files have changed since the databank was created.
Note that this check only compares file sizes as described in the OBDA specification.
# File lib/bio/io/flatfile/index.rb, line 278
def check_consistency
  check_closed?
  @db.check_consistency
end
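Because only file sizes are compared, an edit that leaves the byte length unchanged is not detected. A minimal sketch of a size-only check, assuming a hypothetical helper `size_consistent?` and a plain Hash of recorded sizes. This is not BioRuby's implementation, and it returns a boolean instead of raising RuntimeError:

```ruby
require 'tempfile'

# Hypothetical helper, not BioRuby's implementation: compare the size
# recorded at indexing time with each file's current size.
def size_consistent?(recorded_sizes)
  recorded_sizes.all? { |path, size| File.size?(path) == size }
end

file = Tempfile.new('flat')
file.write(">seq1\nACGT\n")
file.flush
recorded = { file.path => File.size(file.path) }

ok_before = size_consistent?(recorded)

# Same length, different content: a size-only check cannot see this.
File.write(file.path, ">seq1\nTTTT\n")
ok_same_size = size_consistent?(recorded)

# Appending a record changes the size, so the check now fails.
File.open(file.path, 'a') { |f| f.write(">seq2\nGG\n") }
ok_after_append = size_consistent?(recorded)

file.close!
```

If stronger guarantees are needed, a checksum recorded alongside the index would catch same-size edits, at the cost of reading every file.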
Closes the databank. Returns nil.
# File lib/bio/io/flatfile/index.rb, line 132
def close
  check_closed?
  @db.close
  @db = nil
end
Returns true if already closed. Otherwise, returns false.
# File lib/bio/io/flatfile/index.rb, line 139
def closed?
  if @db then
    false
  else
    true
  end
end
Returns the default namespaces as an array of strings, or nil. nil means all namespaces.
# File lib/bio/io/flatfile/index.rb, line 172
def default_namespaces
  @names
end
Sets the default namespaces. default_namespaces = nil means all namespaces in the databank.
default_namespaces = [ str1, str2, … ] sets the default namespaces to str1, str2, …
Default namespaces set with this method affect only the get_by_id, search, and include? methods.
The initial value is nil (that is, all namespaces are searched by default).
# File lib/bio/io/flatfile/index.rb, line 160
def default_namespaces=(names)
  if names then
    @names = []
    names.each { |x| @names.push(x.dup) }
  else
    @names = nil
  end
end
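The setter stores a duplicate of each name, so later changes to the caller's array or strings do not alter the stored defaults. A small sketch of that behaviour, using a hypothetical stand-in class `NamespaceHolder` (not part of BioRuby) that reuses the same copying logic:

```ruby
# Hypothetical stand-in that mirrors the accessor pair shown above.
class NamespaceHolder
  def default_namespaces
    @names
  end

  # Store a copy of every name; nil means "all namespaces".
  def default_namespaces=(names)
    @names = names ? names.map(&:dup) : nil
  end
end

holder = NamespaceHolder.new
requested = ['ACCESSION'.dup, 'VERSION'.dup]
holder.default_namespaces = requested

requested[0] << '_CHANGED' # mutate the caller's string in place
requested << 'LOCUS'       # and grow the caller's array

stored = holder.default_namespaces

holder.default_namespaces = nil # back to "search all namespaces"
after_reset = holder.default_namespaces
```

Note that the getter returns the internal array itself, so mutating the returned value would still change the stored defaults.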
Common interface defined in registry.rb. Searches the databank and returns the entry (or entries) as a string. Multiple entries may be returned, concatenated into one string. Returns an empty string if nothing is found.
# File lib/bio/io/flatfile/index.rb, line 122
def get_by_id(key)
  search(key).to_s
end
Searches the databank. If any entries are found, returns an array of unique IDs (primary identifiers). If nothing is found, returns nil.
This method is useful when the search result would be very large and the search method would therefore be very slow.
# File lib/bio/io/flatfile/index.rb, line 210
def include?(key)
  check_closed?
  if @names then
    r = @db.search_namespaces_get_unique_id(key, *@names)
  else
    r = @db.search_all_get_unique_id(key)
  end
  if r.empty? then
    nil
  else
    r
  end
end
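Despite the trailing question mark, include? returns an array of IDs or nil rather than true or false. The result works in a condition and can also be used directly as a list of IDs. A toy illustration of that return convention, using a hypothetical Hash-backed class `ToyIndex` (not part of BioRuby):

```ruby
# Hypothetical toy index: maps each search key to the IDs it hits.
class ToyIndex
  def initialize(table)
    @table = table
  end

  # Same return convention as include?: array of unique IDs, or nil.
  def include?(key)
    ids = @table.fetch(key, []).uniq
    ids.empty? ? nil : ids
  end
end

toy = ToyIndex.new(
  'AB000100' => ['AB000100'],
  'kinase'   => %w[P12345 Q67890 P12345]
)

hit   = toy.include?('kinase')      # array of unique IDs
miss  = toy.include?('no-such-key') # nil
label = toy.include?('AB000100') ? 'found' : 'absent'
```

Code that needs a strict boolean can negate the result twice (`!!`) or test it with `nil?`.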
Same as include?, but searches only the specified namespaces.
# File lib/bio/io/flatfile/index.rb, line 226
def include_in_namespaces?(key, *names)
  check_closed?
  r = @db.search_namespaces_get_unique_id(key, *names)
  if r.empty? then
    nil
  else
    r
  end
end
Same as include?, but searches only the primary namespace.
# File lib/bio/io/flatfile/index.rb, line 238
def include_in_primary?(key)
  check_closed?
  r = @db.search_primary_get_unique_id(key)
  if r.empty? then
    nil
  else
    r
  end
end
Returns the names of the namespaces defined in the databank, with the primary namespace first (example: [ 'LOCUS', 'ACCESSION', 'VERSION' ]).
# File lib/bio/io/flatfile/index.rb, line 251
def namespaces
  check_closed?
  r = secondary_namespaces
  r.unshift primary_namespace
  r
end
Returns the name of the primary namespace as a string.
# File lib/bio/io/flatfile/index.rb, line 259
def primary_namespace
  check_closed?
  @db.primary.name
end
Searches the databank and returns a Bio::FlatFileIndex::Results object. Only the default namespaces are searched if they have been set; otherwise all namespaces are searched.
# File lib/bio/io/flatfile/index.rb, line 177
def search(key)
  check_closed?
  if @names then
    @db.search_namespaces(key, *@names)
  else
    @db.search_all(key)
  end
end
Searches only the specified namespaces. Returns a Bio::FlatFileIndex::Results object.
# File lib/bio/io/flatfile/index.rb, line 189
def search_namespaces(key, *names)
  check_closed?
  @db.search_namespaces(key, *names)
end
Searches only the primary namespace. Returns a Bio::FlatFileIndex::Results object.
# File lib/bio/io/flatfile/index.rb, line 197
def search_primary(key)
  check_closed?
  @db.search_primary(key)
end
Returns the names of the secondary namespaces as an array of strings.
# File lib/bio/io/flatfile/index.rb, line 265
def secondary_namespaces
  check_closed?
  @db.secondary.names
end