Class | Bio::NCBI::REST |
In: | lib/bio/io/ncbirest.rb |
Parent: | Object |
The Bio::NCBI::REST class provides a REST client for the NCBI E-Utilities.
Entrez utilities index:
NCBI_INTERVAL | = | 1.0 / 3.0 |
Run retrieval scripts on weekends or between 9 pm and 5 am Eastern Time on
weekdays for any series of more than 100 requests. (Not yet implemented in
BioRuby.)
Wait for 1/3 second between requests. NCBI's restriction is: "Make no more than 3 requests every 1 second." |
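The interval rule can be illustrated with a small throttle. This is only a sketch of the idea, not BioRuby's implementation: the class name IntervalThrottle and its clock and sleeper parameters are hypothetical, introduced here so the waiting logic can be exercised without real delays.

```ruby
# Hypothetical helper (not part of BioRuby): keeps consecutive calls
# at least `interval` seconds apart.
class IntervalThrottle
  def initialize(interval,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @interval = interval
    @clock = clock
    @sleeper = sleeper
    @last = nil
  end

  # Sleeps only for the part of the interval that has not yet elapsed
  # since the previous call, and returns the number of seconds waited.
  def wait
    now = @clock.call
    waited = 0.0
    if @last
      remaining = @interval - (now - @last)
      if remaining > 0
        @sleeper.call(remaining)
        waited = remaining
      end
    end
    @last = now + waited
    waited
  end
end

# Same value as the NCBI_INTERVAL constant: at most 3 requests per second.
throttle = IntervalThrottle.new(1.0 / 3.0)
3.times { throttle.wait } # the first call returns immediately
```

Calling wait before each request would keep a client within the quoted limit; the time-of-day rule for long request series is a separate concern and is not covered by this sketch.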
# File lib/bio/io/ncbirest.rb, line 351
def self.efetch(*args)
  self.new.efetch(*args)
end

# File lib/bio/io/ncbirest.rb, line 343
def self.esearch(*args)
  self.new.esearch(*args)
end

# File lib/bio/io/ncbirest.rb, line 347
def self.esearch_count(*args)
  self.new.esearch_count(*args)
end
Retrieve database entries for the given IDs using the E-Utils (efetch) service.
For information on the possible arguments, see
ncbi = Bio::NCBI::REST.new
ncbi.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode"=>"xml"})
ncbi.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb", "retmode"=>"xml"})
ncbi.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})

Bio::NCBI::REST.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode"=>"xml"})
Bio::NCBI::REST.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Arguments:
Returns: | String |
# File lib/bio/io/ncbirest.rb, line 315
def efetch(ids, hash = {}, step = 100)
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  opts = default_parameters.merge({ "retmode" => "text" })
  opts.update(hash)

  case ids
  when Array
    list = ids
  else
    list = ids.to_s.split(/\s*,\s*/)
  end

  result = ""
  0.step(list.size, step) do |i|
    opts["id"] = list[i, step].join(',')
    unless opts["id"].empty?
      response = ncbi_post_form(serv, opts)
      result += response.body
    end
  end
  return result.strip
  #return result.strip.split(/\n\n+/)
end
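The efetch source above accepts either an Array or a comma-separated String of IDs and posts them in batches of step IDs per request. The batching can be tried in isolation without any network access. The helper name id_batches is hypothetical, and Enumerable#each_slice is used here in place of the 0.step loop; it should produce the same non-empty batches.

```ruby
# Hypothetical helper (not part of BioRuby): normalise the ID argument
# and return the comma-joined "id" parameter for each request.
def id_batches(ids, step = 100)
  list = ids.is_a?(Array) ? ids : ids.to_s.split(/\s*,\s*/)
  list.each_slice(step).map { |batch| batch.join(",") }
end

id_batches("185041, J00231,AAA52805", 2)
# => ["185041,J00231", "AAA52805"]
```

An empty ID list yields no batches, which matches the way efetch skips requests whose "id" parameter is empty.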
List the NCBI database names using the E-Utils (einfo) service.
pubmed protein nucleotide nuccore nucgss nucest structure genome books cancerchromosomes cdd gap domains gene genomeprj gensat geo gds homologene journals mesh ncbisearch nlmcatalog omia omim pmc popset probe proteinclusters pcassay pccompound pcsubstance snp taxonomy toolkit unigene unists
ncbi = Bio::NCBI::REST.new
ncbi.einfo

Bio::NCBI::REST.einfo
Returns: | array of strings (database names) |
# File lib/bio/io/ncbirest.rb, line 179
def einfo
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"
  opts = default_parameters.merge({})
  response = ncbi_post_form(serv, opts)
  result = response.body
  list = result.scan(/<DbName>(.*?)<\/DbName>/m).flatten
  return list
end
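The einfo method extracts database names from the response body with a regular expression rather than an XML parser. The sketch below applies the same pattern to a hand-written sample; the XML is an assumption about the general shape of the response, not captured server output.

```ruby
# Hand-written sample (assumed shape, not real server output).
einfo_xml = <<~XML
  <eInfoResult>
    <DbList>
      <DbName>pubmed</DbName>
      <DbName>protein</DbName>
      <DbName>nuccore</DbName>
    </DbList>
  </eInfoResult>
XML

# Same pattern as in the einfo source: each match is a one-element
# capture array, so flatten turns the result into a flat list of names.
db_names = einfo_xml.scan(/<DbName>(.*?)<\/DbName>/m).flatten
# db_names => ["pubmed", "protein", "nuccore"]
```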
Search the NCBI database for the given keywords using the E-Utils (esearch) service and return an array of entry IDs.
For information on the possible arguments, see
ncbi = Bio::NCBI::REST.new
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
ncbi.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})

Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
Bio::NCBI::REST.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Arguments:
Returns: | array of entry IDs or a number of results |
# File lib/bio/io/ncbirest.rb, line 246
def esearch(str, hash = {}, limit = nil, step = 10000)
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = default_parameters.merge({ "term" => str })
  opts.update(hash)

  case opts["rettype"]
  when "count"
    count = esearch_count(str, opts)
    return count
  else
    retstart = 0
    retstart = hash["retstart"].to_i if hash["retstart"]

    limit ||= hash["retmax"].to_i if hash["retmax"]
    limit ||= 100 # default limit is 100
    limit = esearch_count(str, opts) if limit == 0   # unlimit

    list = []
    0.step(limit, step) do |i|
      retmax = [step, limit - i].min
      opts.update("retmax" => retmax, "retstart" => i + retstart)
      response = ncbi_post_form(serv, opts)
      result = response.body
      list += result.scan(/<Id>(.*?)<\/Id>/m).flatten
    end
    return list
  end
end
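The paging loop in esearch turns limit and step into a series of retstart and retmax values, one pair per request. The sketch below reproduces only that arithmetic, with no network access; the helper name esearch_windows is hypothetical.

```ruby
# Hypothetical helper (not part of BioRuby): returns the
# [retstart, retmax] pair that each request in the loop would use.
def esearch_windows(limit, step = 10000, retstart = 0)
  windows = []
  0.step(limit, step) do |i|
    retmax = [step, limit - i].min
    windows << [i + retstart, retmax]
  end
  windows
end

esearch_windows(25, 10)
# => [[0, 10], [10, 10], [20, 5]]

# With the default limit of 100 and the default step, one request suffices.
esearch_windows(100)
# => [[0, 100]]
```

Because 0.step(limit, step) includes the end point, a limit that is an exact multiple of step produces one extra window with a retmax of 0; for example, esearch_windows(20, 10) ends with [20, 0].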
Arguments: | same as esearch method |
Returns: | number of results matching the query |
# File lib/bio/io/ncbirest.rb, line 277
def esearch_count(str, hash = {})
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = default_parameters.merge({ "term" => str })
  opts.update(hash)
  opts.update("rettype" => "count")
  response = ncbi_post_form(serv, opts)
  result = response.body
  count = result.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
  return count
end
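The count is taken from the first Count element in the response body. The sketch below shows why the call to first matters, under the assumption that a response can carry more than one Count element, for example a per-term count nested further down. The XML is hand-written for illustration and is not real server output.

```ruby
# Hand-written sample (assumed shape, not real server output): the
# total comes first, followed by a nested per-term count.
esearch_xml = <<~XML
  <eSearchResult>
    <Count>42</Count>
    <TranslationStack>
      <TermSet>
        <Term>tardigrada[All Fields]</Term>
        <Count>40</Count>
      </TermSet>
    </TranslationStack>
  </eSearchResult>
XML

# Same expression as in the esearch_count source.
total = esearch_xml.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
# total => 42
```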