Class Bio::PubMed
In: lib/bio/io/pubmed.rb
Parent: Bio::NCBI::REST

Description

The Bio::PubMed class provides several ways to retrieve bibliographic information from the PubMed database at

  http://www.ncbi.nlm.nih.gov/sites/entrez?db=PubMed

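Two types of queries are possible:

  • searching for PubMed IDs given a query string:
    • search
    • esearch
  • retrieving the MEDLINE text (authors, journal, abstract, and so on) given a PubMed ID:
    • query
    • pmfetch
    • efetch

The methods within each group are intended to be interchangeable and should return the same result.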

Additional information about the MEDLINE format and PubMed programmable APIs can be found on the following websites:

  • PubMed Overview:
      http://www.ncbi.nlm.nih.gov/entrez/query/static/overview.html
    
  • PubMed help:
      http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html
    
  • Entrez utilities index:
       http://www.ncbi.nlm.nih.gov/entrez/utils/utils_index.html
    
  • How to link:
      http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helplinks.chapter.linkshelp
    

Usage

  require 'bio'

  # To find PubMed IDs when you only have a query string:
  Bio::PubMed.esearch("(genome AND analysis) OR bioinformatics").each do |x|
    p x
  end

  Bio::PubMed.search("(genome AND analysis) OR bioinformatics").each do |x|
    p x
  end

  # To retrieve the MEDLINE entry for a given PubMed ID:
  puts Bio::PubMed.efetch(["10592173", "14693808"])
  puts Bio::PubMed.query("10592173")
  puts Bio::PubMed.pmfetch("10592173")

  # This can be converted into a Bio::MEDLINE object:
  manuscript = Bio::PubMed.query("10592173")
  medline = Bio::MEDLINE.new(manuscript)

Methods

efetch   esearch   pmfetch   query   search

Public Class methods

  # File lib/bio/io/pubmed.rb, line 204
  def self.efetch(*args)
    self.new.efetch(*args)
  end

  # File lib/bio/io/pubmed.rb, line 200
  def self.esearch(*args)
    self.new.esearch(*args)
  end

  # File lib/bio/io/pubmed.rb, line 216
  def self.pmfetch(*args)
    self.new.pmfetch(*args)
  end

  # File lib/bio/io/pubmed.rb, line 212
  def self.query(*args)
    self.new.query(*args)
  end

  # File lib/bio/io/pubmed.rb, line 208
  def self.search(*args)
    self.new.search(*args)
  end

Public Instance methods

Retrieves PubMed entries by PMID using Entrez efetch and returns them as MEDLINE formatted strings. One or more PubMed IDs can be provided:

  Bio::PubMed.efetch(123)
  Bio::PubMed.efetch([123,456,789])

Arguments:

  • ids: list of PubMed IDs (required)
  • hash: hash of E-Utils options
    • retmode: "xml", "html", …
    • rettype: "medline", …
    • retmax: integer (default 100)
    • retstart: integer
    • field
    • reldate
    • mindate
    • maxdate
    • datetype

Returns: Array of MEDLINE formatted Strings (the response is returned unsplit when a retmode other than "text" is given)

  # File lib/bio/io/pubmed.rb, line 117
  def efetch(ids, hash = {})
    opts = { "db" => "pubmed", "rettype"  => "medline" }
    opts.update(hash)
    result = super(ids, opts)
    if !opts["retmode"] or opts["retmode"] == "text"
      result = result.split(/\n\n+/)
    end
    result
  end
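The split on blank lines in the listing above can be tried without network access. The following is a minimal sketch, not part of the library: the sample text is made-up placeholder data standing in for a plain-text efetch response, and split_medline_records is a hypothetical helper name.

```ruby
# Hypothetical helper mirroring the split step used in efetch:
# records in the plain-text response are separated by blank lines.
def split_medline_records(text)
  text.split(/\n\n+/)
end

# Made-up placeholder text standing in for an efetch response.
sample = "PMID- 111\nTI  - First placeholder title\n\n" \
         "PMID- 222\nTI  - Second placeholder title\n"

records = split_medline_records(sample)
puts records.size
records.each { |record| puts record.lines.first }
```

Each element of the resulting array holds the text of one record, which is the form Bio::MEDLINE.new is given in the Usage section.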

Searches the PubMed database for the given keywords using E-Utils and returns an array of PubMed IDs.

For information on the possible arguments, see eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html#PubMed

Arguments:

  • str: query string (required)
  • hash: hash of E-Utils options
    • retmode: "xml", "html", …
    • rettype: "medline", …
    • retmax: integer (default 100)
    • retstart: integer
    • field
    • reldate
    • mindate
    • maxdate
    • datetype

Returns: array of PubMed IDs, or the number of results

  # File lib/bio/io/pubmed.rb, line 93
  def esearch(str, hash = {})
    opts = { "db" => "pubmed" }
    opts.update(hash)
    super(str, opts)
  end
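How the option hash is assembled can be illustrated in isolation. This is a sketch under stated assumptions: build_esearch_options is a hypothetical helper that only reproduces the two hash lines from the listing above, and it does not contact NCBI.

```ruby
# Hypothetical helper mirroring how esearch prepares its options:
# start from the default database, then merge the caller's hash.
def build_esearch_options(hash = {})
  opts = { "db" => "pubmed" }
  opts.update(hash) # caller-supplied keys are added, or replace the defaults
  opts
end

with_limits = build_esearch_options({ "retmax" => 5, "reldate" => 30 })
defaults    = build_esearch_options

p with_limits
p defaults
```

Since Hash#update lets later keys win, any key passed by the caller takes precedence over the default with the same name.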

Retrieves a PubMed entry by PMID using Entrez pmfetch and returns it as a MEDLINE formatted string.

Arguments:

  • id: PubMed ID (required)

Returns: MEDLINE formatted String

  # File lib/bio/io/pubmed.rb, line 183
  def pmfetch(id)
    host = "www.ncbi.nlm.nih.gov"
    path = "/entrez/utils/pmfetch.fcgi?tool=bioruby&mode=text&report=medline&db=PubMed&id="

    ncbi_access_wait

    http = Bio::Command.new_http(host)
    response = http.get(path + CGI.escape(id.to_s))
    result = response.body
    if result =~ /#{id}\s+Error/
      raise( result )
    else
      result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
      return result
    end
  end
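The clean-up applied to the response body can be examined on its own. The sketch below is illustrative only: clean_pmfetch_body is a hypothetical helper wrapping the single expression from the listing above, and the input is made-up placeholder text rather than a real server response.

```ruby
# Hypothetical helper mirroring the clean-up step in pmfetch:
# carriage returns become newlines, runs of newlines collapse to one,
# and the surrounding <pre> tags are removed.
def clean_pmfetch_body(body)
  body.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
end

# Made-up placeholder body with CRLF line endings and <pre> tags.
raw = "<pre>\r\nPMID- 111\r\n\r\nTI  - Placeholder title\r\n</pre>"

cleaned = clean_pmfetch_body(raw)
p cleaned
```

Because squeeze is called with "\n" as its argument, only repeated newlines are collapsed; the double space inside a tag such as "TI  -" is left intact.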

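Retrieves PubMed entries by PMID using Entrez query and returns them as MEDLINE formatted strings.

Arguments:

  • ids: one or more PubMed IDs (required)

Returns: MEDLINE formatted String, or an Array of Strings when more than one ID is given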

  # File lib/bio/io/pubmed.rb, line 153
  def query(*ids)
    host = "www.ncbi.nlm.nih.gov"
    path = "/sites/entrez?tool=bioruby&cmd=Text&dopt=MEDLINE&db=PubMed&uid="
    list = ids.collect { |x| CGI.escape(x.to_s) }.join(",")

    ncbi_access_wait

    http = Bio::Command.new_http(host)
    response = http.get(path + list)
    result = response.body
    result = result.scan(/<pre>\s*(.*?)<\/pre>/m).flatten

    if result =~ /id:.*Error occurred/
      # id: xxxxx Error occurred: Article does not exist
      raise( result )
    else
      if ids.size > 1
        return result
      else
        return result.first
      end
    end
  end
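The two string-handling steps in the listing above, building the uid list and extracting the pre blocks, can be sketched offline. The helper names uid_list, extract_pre_blocks and error_block? are hypothetical, and the HTML is made-up placeholder data. Note that scan followed by flatten yields an Array, so this sketch checks each extracted block for the error message; that per-block check is an assumption about the intent, not the library's code.

```ruby
require 'cgi'

# Hypothetical helper: builds the comma-separated uid list as query does.
def uid_list(*ids)
  ids.collect { |x| CGI.escape(x.to_s) }.join(",")
end

# Hypothetical helper: returns the contents of every <pre> block.
def extract_pre_blocks(html)
  html.scan(/<pre>\s*(.*?)<\/pre>/m).flatten
end

# Assumption: an error is reported inside one of the extracted blocks.
def error_block?(blocks)
  blocks.any? { |block| block =~ /id:.*Error occurred/ }
end

# Made-up placeholder page containing two records.
html = "<html><pre>\nPMID- 111\nTI  - First</pre><p>x</p>" \
       "<pre>\nPMID- 222\nTI  - Second</pre></html>"

list    = uid_list(10592173, "14693808")
records = extract_pre_blocks(html)

puts list
puts records.size
puts error_block?(records)
```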

Searches the PubMed database for the given keywords using Entrez query and returns an array of PubMed IDs. Caution: this method returns only the first 20 hits; using the esearch method instead is strongly recommended.

Arguments:

  • str: query string (required)

Returns: array of PubMed IDs

  # File lib/bio/io/pubmed.rb, line 134
  def search(str)
    host = "www.ncbi.nlm.nih.gov"
    path = "/sites/entrez?tool=bioruby&cmd=Search&doptcmdl=Brief&db=PubMed&term="

    ncbi_access_wait

    http = Bio::Command.new_http(host)
    response = http.get(path + CGI.escape(str))
    result = response.body
    result = result.scan(/value="(\d+)" id="UidCheckBox"/m).flatten
    return result
  end
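The request path and the ID extraction used by search can be reproduced without a network call. This is a rough sketch: search_path and extract_uids are hypothetical helper names, and the HTML fragment is made-up placeholder markup shaped to match the pattern in the listing above, not a captured PubMed page.

```ruby
require 'cgi'

# Hypothetical helper: builds the request path the way search does,
# escaping the query string for use in a URL.
def search_path(str)
  "/sites/entrez?tool=bioruby&cmd=Search&doptcmdl=Brief&db=PubMed&term=" +
    CGI.escape(str)
end

# Hypothetical helper: collects the numeric value of every checkbox
# whose id is UidCheckBox.
def extract_uids(html)
  html.scan(/value="(\d+)" id="UidCheckBox"/m).flatten
end

# Made-up placeholder markup with two result checkboxes.
html = '<input type="checkbox" value="111" id="UidCheckBox">' \
       '<input type="checkbox" value="222" id="UidCheckBox">'

path = search_path("genome AND analysis")
uids = extract_uids(html)

puts path
p uids
```

The IDs come back as strings because they are captured from the page text; call to_i on them if integers are needed.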
