Class Bio::KEGG::GENES
In: lib/bio/db/kegg/genes.rb
Parent: KEGGDB

Included Modules

Common::DblinksAsHash Common::PathwaysAsHash Common::OrthologsAsHash

Constants

DELIMITER = RS = "\n///\n"
TAGSIZE = 12
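As a rough illustration of how these two constants fit together: DELIMITER separates consecutive entries in a flat file, and TAGSIZE appears to be the width of the leading tag column (the constructor passes it to the parent class). The sketch below is standalone Ruby that does not load BioRuby; the sample text and the variable names flatfile, entries and fields are made up for the example.

```ruby
# Standalone sketch; the constants mirror the values listed above.
DELIMITER = "\n///\n"
TAGSIZE   = 12

# Hypothetical two-entry flat file (real entries carry many more fields).
flatfile = [
  'ENTRY'.ljust(TAGSIZE) + 'b0001',
  'NAME'.ljust(TAGSIZE)  + 'thrL',
  '///',
  'ENTRY'.ljust(TAGSIZE) + 'b0002',
  'NAME'.ljust(TAGSIZE)  + 'thrA',
  '///',
  ''
].join("\n")

# One string per entry; the trailing delimiter leaves no empty element.
entries = flatfile.split(DELIMITER)

# Split each line into its tag column and the remaining value.
fields = entries.map do |entry|
  entry.each_line.map do |line|
    [line[0, TAGSIZE].strip, line[TAGSIZE..-1].to_s.strip]
  end.to_h
end

puts entries.size          # number of entries found
puts fields.first['NAME']  # value of the NAME field of the first entry
```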

Public Class methods

new(entry)

Creates a new Bio::KEGG::GENES object.

Arguments: entry (the text of a single GENES entry)

Returns: Bio::KEGG::GENES object

     # File lib/bio/db/kegg/genes.rb, line 116
     def initialize(entry)
       super(entry, TAGSIZE)
     end

Public Instance methods

aalen()

Returns the length of the amino acid sequence described in the AASEQ lines.

Returns: Integer

     # File lib/bio/db/kegg/genes.rb, line 347
     def aalen
       fetch('AASEQ')[/\d+/].to_i
     end

aaseq()

Returns the amino acid sequence described in the AASEQ lines.

Returns: Bio::Sequence::AA object

     # File lib/bio/db/kegg/genes.rb, line 337
     def aaseq
       unless @data['AASEQ']
         @data['AASEQ'] = Bio::Sequence::AA.new(fetch('AASEQ').gsub(/\d+/, ''))
       end
       @data['AASEQ']
     end
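The two AASEQ accessors read the same field: the first run of digits is taken as the length, and the sequence is whatever remains once the digits are removed. A minimal standalone sketch of that idea follows. It uses a plain String instead of Bio::Sequence::AA, strips whitespace explicitly, and assumes a hypothetical field layout (residue count first, residues after).

```ruby
# Hypothetical content of the AASEQ field once the tag column is removed:
# the residue count followed by the residues, possibly over several lines.
aaseq_field = "9\nMKRIS\nTTIT"

# Same regex idea as the methods above: first number is the length ...
length = aaseq_field[/\d+/].to_i

# ... and the sequence is the field with digits (and, here, whitespace) removed.
residues = aaseq_field.gsub(/\d+/, '').gsub(/\s+/, '')

puts length    # 9
puts residues  # MKRISTTIT
```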

chromosome()

Returns the chromosome described in the POSITION line: the part before the colon if there is one, the whole value if it contains no ".." range, and nil otherwise.

Returns: String or nil

     # File lib/bio/db/kegg/genes.rb, line 236
     def chromosome
       if position[/:/]
         position.sub(/:.*/, '')
       elsif ! position[/\.\./]
         position
       else
         nil
       end
     end
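To make the three branches concrete, here is a small standalone sketch that applies the same regular expressions to a few position strings. The helper names chromosome_of and gbposition_of and the sample values are hypothetical; gbposition_of mirrors the regex used by the gbposition method of this class.

```ruby
# Standalone re-statement of the branch logic shown above.
def chromosome_of(position)
  if position[/:/]
    position.sub(/:.*/, '')   # "X:100..200" -> "X"
  elsif !position[/\.\./]
    position                  # no range at all: treat the value as the chromosome
  end                         # range without a chromosome prefix -> nil
end

# Counterpart: drop everything up to and including the first colon.
def gbposition_of(position)
  position.sub(/.*?:/, '')
end

%w[X:complement(100..200) 190..255 Y].each do |pos|
  puts "#{pos} -> #{chromosome_of(pos).inspect}, #{gbposition_of(pos).inspect}"
end
```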

codon_usage(codon = nil)

Returns the codon usage data described in the CODON_USAGE lines as a Hash keyed by lowercase codon ("ttt", "ttc", ...), with values taken from cu_list in t, c, a, g order.

Returns: Hash

     # File lib/bio/db/kegg/genes.rb, line 304
     def codon_usage(codon = nil)
       unless @data['CODON_USAGE']
         hash = Hash.new
         list = cu_list
         base = %w(t c a g)
         base.each_with_index do |x, i|
           base.each_with_index do |y, j|
             base.each_with_index do |z, k|
               hash["#{x}#{y}#{z}"] = list[i*16 + j*4 + k]
             end
           end
         end
         @data['CODON_USAGE'] = hash
       end
       @data['CODON_USAGE']
     end
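The index arithmetic i*16 + j*4 + k treats a codon as a three-digit base-4 number over the ordering t, c, a, g. The standalone sketch below spells that out; the helper codon_index and the dummy list of 64 values are assumptions made for the example and are not part of the library.

```ruby
BASES = %w(t c a g)

# Position of a codon in a flat 64-element list ordered as in the loop above.
def codon_index(codon)
  i, j, k = codon.downcase.chars.map { |b| BASES.index(b) }
  i * 16 + j * 4 + k
end

# Dummy usage list: value n at position n, so the mapping is easy to check.
list = (0..63).to_a

# Build the codon => value table the same way the nested loops do.
table = {}
BASES.product(BASES, BASES).each do |x, y, z|
  codon = "#{x}#{y}#{z}"
  table[codon] = list[codon_index(codon)]
end

puts codon_index('ttt')  # 0
puts codon_index('atg')  # 35
puts table.size          # 64
```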

cu_list()

Returns the codon usage data described in the CODON_USAGE lines as a flat Array of integers.

Returns: Array

     # File lib/bio/db/kegg/genes.rb, line 324
     def cu_list
       ary = []
       get('CODON_USAGE').sub(/.*/,'').each_line do |line| # cut 1st line
         line.chomp.sub(/^.{11}/, '').scan(/..../) do |cu|
           ary.push(cu.to_i)
         end
       end
       return ary
     end
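The parsing steps are: drop the first line, drop the first 11 characters of every remaining line, then read the rest in 4-character cells. A standalone sketch of those steps follows. The helper name parse_cu_list and the sample field are hypothetical, and the sample layout is only an assumption chosen to match the regular expressions, not a copy of a real KEGG entry.

```ruby
# Same three steps as the method above, applied to a plain string.
def parse_cu_list(field)
  field.sub(/.*/, '').each_line.flat_map do |line|   # cut the first line
    line.chomp.sub(/^.{11}/, '').scan(/..../).map(&:to_i)
  end
end

# Hypothetical field: a header line, then rows with an 11-character
# prefix followed by right-aligned 4-character counts.
header = 'CODON_USAGE'.ljust(12) + 'T   C   A   G'
rows = [[12, 3, 45, 6], [7, 0, 19, 101]].map do |counts|
  ' ' * 11 + counts.map { |n| n.to_s.rjust(4) }.join
end
field = ([header] + rows).join("\n")

cu = parse_cu_list(field)
p cu  # [12, 3, 45, 6, 7, 0, 19, 101]
```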
dblinks()

Alias for dblinks_as_hash

dblinks_as_hash()

Returns a Hash that maps each DB name to an Array of entry IDs in the DBLINKS field.

     # File lib/bio/db/kegg/genes.rb, line 98
     def dblinks_as_hash; super; end

dblinks_as_strings()

Returns the links to other databases described in the DBLINKS lines.

Returns: Array containing String objects

     # File lib/bio/db/kegg/genes.rb, line 286
     def dblinks_as_strings
       lines_fetch('DBLINKS')
     end

definition()

Returns the definition of the entry, described in the DEFINITION line.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 186
     def definition
       field_fetch('DEFINITION')
     end

division()

Returns the division of the entry, described in the ENTRY line.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 150
     def division
       entry['division']                   # CDS, tRNA etc.
     end

eclinks()

Returns the enzyme's EC numbers shown in the DEFINITION line, or an empty Array if the line has no [EC:...] block.

Returns: Array containing String

     # File lib/bio/db/kegg/genes.rb, line 193
     def eclinks
       ec_list = definition.slice(/\[EC:(.*?)\]/, 1)
       if ec_list
         ec_list.strip.split(/\s+/)
       else
         []
       end
     end
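The extraction relies on String#slice with a capture group: the text between "[EC:" and the next "]" is captured, then split on whitespace. A standalone sketch of the same steps is given below. The helper name ec_numbers and the definition strings are made up for illustration.

```ruby
# Pull the EC numbers out of a definition string, as the method above does.
def ec_numbers(definition)
  ec_list = definition.slice(/\[EC:(.*?)\]/, 1)   # capture group 1, or nil
  ec_list ? ec_list.strip.split(/\s+/) : []
end

# Hypothetical DEFINITION values.
with_ec    = 'carbamoyl-phosphate synthase large subunit [EC:6.3.5.5 6.3.4.16]'
without_ec = 'hypothetical protein'

p ec_numbers(with_ec)     # ["6.3.5.5", "6.3.4.16"]
p ec_numbers(without_ec)  # []
```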

entry()

Returns the "ENTRY" line content as a Hash. For example,

  {"organism"=>"E.coli", "division"=>"CDS", "id"=>"b0356"}

Returns: Hash

     # File lib/bio/db/kegg/genes.rb, line 126
     def entry
       unless @data['ENTRY']
         hash = Hash.new('')
         if get('ENTRY').length > 30
           e = get('ENTRY')
           hash['id']       = e[12..29].strip
           hash['division'] = e[30..39].strip
           hash['organism'] = e[40..80].strip
         end
         @data['ENTRY'] = hash
       end
       @data['ENTRY']
     end
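The ENTRY line is read as fixed-width columns: characters 12 to 29 hold the ID, 30 to 39 the division, and 40 onward the organism. The sketch below applies the same slices to a line assembled with ljust so the column widths are explicit. The helper name parse_entry_line and the sample line are hypothetical, and the padding used to build it is an assumption derived from the slice indices.

```ruby
# Same fixed-width slicing as the method above, on a plain string.
def parse_entry_line(line)
  hash = Hash.new('')                 # missing keys read as ''
  if line.length > 30
    hash['id']       = line[12..29].strip
    hash['division'] = line[30..39].strip
    hash['organism'] = line[40..80].strip
  end
  hash
end

# Hypothetical line: 12-char tag, 18-char ID, 10-char division, organism.
line   = 'ENTRY'.ljust(12) + 'b0356'.ljust(18) + 'CDS'.ljust(10) + 'E.coli'
parsed = parse_entry_line(line)

# A line of 30 characters or fewer yields an empty Hash with '' defaults.
short = parse_entry_line('ENTRY'.ljust(12) + 'b0356')

p parsed
p short['id']
```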

entry_id()

Returns the ID of the entry, described in the ENTRY line.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 143
     def entry_id
       entry['id']
     end

gbposition()

Returns the position in the genome described in the POSITION line, as a string in GenBank feature table location format.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 250
     def gbposition
       position.sub(/.*?:/, '')
     end

gene()

Returns the first gene name described in the NAME line.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 179
     def gene
       genes.first
     end

genes()

Returns the names of the entry as an Array, described in the NAME line.

Returns: Array containing String

     # File lib/bio/db/kegg/genes.rb, line 172
     def genes
       name.split(', ')
     end

locations()

Returns the position in the genome described in the POSITION line as a Bio::Locations object.

Returns: Bio::Locations object

     # File lib/bio/db/kegg/genes.rb, line 258
     def locations
       Bio::Locations.new(gbposition)
     end

motif()

Returns the motif information described in the MOTIF lines as a Hash that maps each motif database name to an Array of IDs.

Returns: Hash

     # File lib/bio/db/kegg/genes.rb, line 265
     def motif
       unless @data['MOTIF']
         hash = {}
         db = nil
         lines_fetch('MOTIF').each do |line|
           if line[/^\S+:/]
             db, str = line.split(/:/)
           else
             str = line
           end
           hash[db] ||= []
           hash[db] += str.strip.split(/\s+/)
         end
         @data['MOTIF'] = hash
       end
       @data['MOTIF']              # Hash of Array of IDs in MOTIF
     end
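The loop keeps track of the current database name: a line that starts with "name:" switches to that database, and any other line is treated as a continuation of the previous one. A standalone sketch of that grouping follows. The helper name parse_motif and the sample lines are hypothetical and stand in for whatever lines_fetch('MOTIF') returns.

```ruby
# Group motif IDs by database, carrying the database name across
# continuation lines, as the method above does.
def parse_motif(lines)
  hash = {}
  db = nil
  lines.each do |line|
    if line[/^\S+:/]
      db, str = line.split(/:/)   # "Pfam: A B" -> "Pfam", " A B"
    else
      str = line                  # continuation of the previous database
    end
    hash[db] ||= []
    hash[db] += str.strip.split(/\s+/)
  end
  hash
end

# Hypothetical MOTIF lines; the second one continues the Pfam list.
lines  = ['Pfam: CPSase_L_chain CPSase_L_D2', 'MGS', 'PROSITE: PS00866 PS00867']
motifs = parse_motif(lines)

p motifs
```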
nalen()

Alias for ntlen

name()

Returns the NAME line.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 164
     def name
       field_fetch('NAME')
     end
naseq()

Alias for ntseq

ntlen()

Returns the length of the nucleic acid sequence described in the NTSEQ lines.

Returns: Integer

     # File lib/bio/db/kegg/genes.rb, line 365
     def ntlen
       fetch('NTSEQ')[/\d+/].to_i
     end

ntseq()

Returns the nucleic acid sequence described in the NTSEQ lines.

Returns: Bio::Sequence::NA object

     # File lib/bio/db/kegg/genes.rb, line 354
     def ntseq
       unless @data['NTSEQ']
         @data['NTSEQ'] = Bio::Sequence::NA.new(fetch('NTSEQ').gsub(/\d+/, ''))
       end
       @data['NTSEQ']
     end

organism()

Returns the organism name of the entry, described in the ENTRY line.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 157
     def organism
       entry['organism']                   # H.sapiens etc.
     end
orthologs()

Alias for orthologs_as_hash

orthologs_as_hash()

Returns a Hash that maps each orthology ID to its definition in the ORTHOLOGY field.

     # File lib/bio/db/kegg/genes.rb, line 108
     def orthologs_as_hash; super; end

orthologs_as_strings()

Returns the orthologs described in the ORTHOLOGY lines.

Returns: Array containing String

     # File lib/bio/db/kegg/genes.rb, line 205
     def orthologs_as_strings
       lines_fetch('ORTHOLOGY')
     end

pathway()

Returns the PATHWAY lines as a String.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 212
     def pathway
       field_fetch('PATHWAY')
     end
pathways()

Alias for pathways_as_hash

pathways_as_hash()

Returns a Hash that maps each pathway ID to its name in the PATHWAY field.

     # File lib/bio/db/kegg/genes.rb, line 103
     def pathways_as_hash; super; end

pathways_as_strings()

Returns the pathways described in the PATHWAY lines.

Returns: Array containing String

     # File lib/bio/db/kegg/genes.rb, line 219
     def pathways_as_strings
       lines_fetch('PATHWAY')
     end

position()

Returns the position in the genome described in the POSITION line, with all whitespace removed.

Returns: String

     # File lib/bio/db/kegg/genes.rb, line 226
     def position
       unless @data['POSITION']
         @data['POSITION'] = fetch('POSITION').gsub(/\s/, '')
       end
       @data['POSITION']
     end

structure()

Returns the structure ID information described in the STRUCTURE lines.

Returns: Array containing String

     # File lib/bio/db/kegg/genes.rb, line 293
     def structure
       unless @data['STRUCTURE']
         @data['STRUCTURE'] = fetch('STRUCTURE').sub(/(PDB: )*/,'').split(/\s+/)
       end
       @data['STRUCTURE'] # ['PDB:1A9X', ...]
     end
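Read literally, the expression above removes a leading "PDB: " and splits the remainder on whitespace, so the resulting IDs would not carry the "PDB:" prefix that the trailing source comment suggests. The standalone sketch below applies the same expression to a made-up field value. The helper name parse_structure and the sample IDs are assumptions, and the real return value of fetch('STRUCTURE') may differ from this sample.

```ruby
# Same expression as in the method above, applied to a plain string.
def parse_structure(field)
  field.sub(/(PDB: )*/, '').split(/\s+/)
end

# Hypothetical STRUCTURE field content after the tag column is removed.
ids = parse_structure('PDB: 1A9X 1CE8 1KEE')

p ids                             # ["1A9X", "1CE8", "1KEE"]
p parse_structure('1A9X 1CE8')    # no prefix: the value is only split
```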
structures()

Alias for structure
