Class | Bio::Locations |
In: | lib/bio/location.rb |
Parent: | Object |
The Bio::Locations class is a container for Bio::Location objects: creating a Bio::Locations object (based on a GenBank style position string) will spawn an array of Bio::Location objects.
  locations = Bio::Locations.new('join(complement(500..550), 600..625)')
  locations.each do |loc|
    puts "class = " + loc.class.to_s
    puts "range = #{loc.from}..#{loc.to} (strand = #{loc.strand})"
  end

  # Output would be:
  #   class = Bio::Location
  #   range = 500..550 (strand = -1)
  #   class = Bio::Location
  #   range = 600..625 (strand = 1)

  # For the following three location strings, print the span and range
  ['one-of(898,900)..983',
   'one-of(5971..6308,5971..6309)',
   '8050..one-of(10731,10758,10905,11242)'].each do |loc|
    location = Bio::Locations.new(loc)
    puts location.span
    puts location.range
  end
Following the GenBank manual 'gbrel.txt', position notations are classified into 10 patterns, labeled (A) to (J) in the excerpt from the manual that follows.
3.4.12.2 Feature Location

The second column of the feature descriptor line designates the location of the feature in the sequence. The location descriptor begins at position 22. Several conventions are used to indicate sequence location.

Base numbers in location descriptors refer to numbering in the entry, which is not necessarily the same as the numbering scheme used in the published report. The first base in the presented sequence is numbered base 1. Sequences are presented in the 5' to 3' direction.

Location descriptors can be one of the following:

(A) 1. A single base;
(B) 2. A contiguous span of bases;
(C) 3. A site between two bases;
(D) 4. A single base chosen from a range of bases;
(E) 5. A single base chosen from among two or more specified bases;
(F) 6. A joining of sequence spans;
(G) 7. A reference to an entry other than the one to which the feature belongs (i.e., a remote entry), followed by a location descriptor referring to the remote sequence;
(H) 8. A literal sequence (a string of bases enclosed in quotation marks).
(C) A site between two residues, such as an endonuclease cleavage site, is indicated by listing the two bases separated by a carat (e.g., 23^24).

(D) A single residue chosen from a range of residues is indicated by the number of the first and last bases in the range separated by a single period (e.g., 23.79). (I) The symbols < and > indicate that the end point of the range is beyond the specified base number.

(B) A contiguous span of bases is indicated by the number of the first and last bases in the range separated by two periods (e.g., 23..79). (I) The symbols < and > indicate that the end point of the range is beyond the specified base number. Starting and ending positions can be indicated by base number or by one of the operators described below.

Operators are prefixes that specify what must be done to the indicated sequence to locate the feature. The following are the operators available, along with their most common format and a description.

(J) complement (location): The feature is complementary to the location indicated. Complementary strands are read 5' to 3'.
(F) join (location, location, .. location): The indicated elements should be placed end to end to form one contiguous sequence.
(F) order (location, location, .. location): The elements are found in the specified order in the 5' to 3' direction, but nothing is implied about the rationality of joining them.
(F) group (location, location, .. location): The elements are related and should be grouped together, but no order is implied.
(E) one-of (location, location, .. location): The element can be any one, but only one, of the items listed.
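To make the simpler notations concrete, here is a minimal sketch that sorts operator-free location strings into patterns (A) to (D) with regular expressions. The helper name `classify_location` and the returned symbols are hypothetical and are not part of BioRuby; the real parser also handles operators, remote entries, and literal sequences, which this sketch deliberately reports as unsupported.

```ruby
# Hypothetical helper (not BioRuby API): classify a simple location string
# that contains no operators such as join, complement, or one-of.
def classify_location(str)
  case str
  when /\A[<>]?\d+\z/       then :single_base        # (A) e.g. "467"
  when /\A<?\d+\.\.>?\d+\z/ then :span               # (B) e.g. "23..79", "<1..>888"
  when /\A\d+\^\d+\z/       then :site_between       # (C) e.g. "23^24"
  when /\A\d+\.\d+\z/       then :one_base_in_range  # (D) e.g. "23.79"
  else :unsupported                                  # operators, remote entries, ...
  end
end

%w[467 23..79 23^24 23.79 join(1..2,5..9)].each do |s|
  puts "#{s} => #{classify_location(s)}"
end
```

The two-period pattern is tested before the single-period one only for readability; the anchored expressions cannot both match the same string.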
locations | [RW] | (Array) An Array of Bio::Location objects |
operator | [RW] | (Symbol or nil) Operator. nil (means :join), :order, or :group (obsolete). |
Parses a GenBank style position string and returns a Bio::Locations object, which contains a list of Bio::Location objects.
locations = Bio::Locations.new('join(complement(500..550), 600..625)')
Arguments: | position: GenBank style position string, or an Array of Bio::Location objects |
Returns: | Bio::Locations object |
  # File lib/bio/location.rb, line 324
  def initialize(position)
    @operator = nil
    if position.is_a? Array
      @locations = position
    else
      position = gbl_cleanup(position)      # preprocessing
      @locations = gbl_pos2loc(position)    # create an Array of Bio::Location objects
    end
  end
Returns nth Bio::Location object.
  # File lib/bio/location.rb, line 361
  def [](n)
    @locations[n]
  end
Converts relative position in the locus to position in the whole of the DNA sequence.
This method can, for example, be used to relate positions in an RNA to those in the DNA sequence. With the optional :aa flag, the given position is interpreted as an amino-acid position rather than a nucleotide position, and the result is the absolute position of the first nucleotide of the corresponding codon.
  loc = Bio::Locations.new('complement(12838..13533)')
  puts loc.absolute(10)         # => 13524
  puts loc.absolute(10, :aa)    # => 13506
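Arguments: | n: position within the locus; type: optional flag, :aa if n is an amino-acid position rather than a nucleotide position |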
Returns: | position within the whole of the sequence |
  # File lib/bio/location.rb, line 451
  def absolute(n, type = nil)
    case type
    when :location
      ;
    when :aa
      n = (n - 1) * 3 + 1
      rel2abs(n)
    else
      rel2abs(n)
    end
  end
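The arithmetic behind the example values can be traced with a small stand-alone sketch. The listing above delegates to `rel2abs`, whose body is not shown here, so the mapping below is an assumption: for a single span on the complement strand, relative position 1 is taken to be the `to` end of the span. The `Span` struct and both helper names are hypothetical and not part of BioRuby.

```ruby
# Simplified stand-in for a single-span location (not Bio::Location).
Span = Struct.new(:from, :to, :strand)

# Hypothetical helper: map a 1-based relative position to an absolute one.
# Assumes complement-strand positions are counted backwards from span.to.
def rel_to_abs(span, n)
  return nil unless n.between?(1, span.to - span.from + 1)
  span.strand == -1 ? span.to - n + 1 : span.from + n - 1
end

# :aa semantics: n is an amino-acid index, so jump to the first base of its codon,
# using the same (n - 1) * 3 + 1 conversion as the listing.
def aa_to_abs(span, n)
  rel_to_abs(span, (n - 1) * 3 + 1)
end

locus = Span.new(12838, 13533, -1)
puts rel_to_abs(locus, 10)  # => 13524
puts aa_to_abs(locus, 10)   # => 13506
```

Amino acid 10 starts at relative nucleotide 28, and 13533 - 28 + 1 gives 13506, which matches the documented output.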
Iterates on each Bio::Location object.
  # File lib/bio/location.rb, line 354
  def each
    @locations.each do |x|
      yield(x)
    end
  end
Evaluates equality with another Bio::Locations object. Returns nil if the argument is not a Bio::Locations object; otherwise compares the sorted locations of both objects and returns true or false.
  # File lib/bio/location.rb, line 342
  def equals?(other)
    if ! other.kind_of?(Bio::Locations)
      return nil
    end
    if self.sort == other.sort
      return true
    else
      return false
    end
  end
Returns first Bio::Location object.
  # File lib/bio/location.rb, line 366
  def first
    @locations.first
  end
Returns last Bio::Location object.
  # File lib/bio/location.rb, line 371
  def last
    @locations.last
  end
Converts absolute position in the whole of the DNA sequence to relative position in the locus.
This method can, for example, be used to relate positions in a DNA sequence to those in an RNA. With the optional :aa flag, the method returns the position of the associated amino acid rather than that of the nucleotide.
  loc = Bio::Locations.new('complement(12838..13533)')
  puts loc.relative(13524)        # => 10
  puts loc.relative(13506, :aa)   # => 10
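Arguments: | n: nucleotide position within the whole of the sequence; type: optional flag, :aa to return the position in amino-acid coordinates |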
Returns: | position within the location |
  # File lib/bio/location.rb, line 419
  def relative(n, type = nil)
    case type
    when :location
      ;
    when :aa
      if n = abs2rel(n)
        (n - 1) / 3 + 1
      else
        nil
      end
    else
      abs2rel(n)
    end
  end
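A stand-alone sketch of the reverse mapping may help when checking values by hand. The body of `abs2rel` is not shown in this listing, so the walk over segments below is an assumption about its general approach: segments are visited in order, and the lengths of the segments already passed are added to the offset within the matching segment. `Segment`, `abs_to_rel`, and `abs_to_aa` are hypothetical names, not BioRuby API.

```ruby
# Simplified stand-in for one location segment (not Bio::Location).
Segment = Struct.new(:from, :to, :strand)

# Hypothetical helper: absolute position -> 1-based position within the locus.
def abs_to_rel(segments, pos)
  passed = 0 # total length of the segments already walked over
  segments.each do |seg|
    if pos.between?(seg.from, seg.to)
      within = seg.strand == -1 ? seg.to - pos + 1 : pos - seg.from + 1
      return passed + within
    end
    passed += seg.to - seg.from + 1
  end
  nil # pos lies outside every segment
end

# :aa semantics, using the same (n - 1) / 3 + 1 integer division as the listing.
def abs_to_aa(segments, pos)
  rel = abs_to_rel(segments, pos)
  rel && (rel - 1) / 3 + 1
end

locus = [Segment.new(12838, 13533, -1)]
puts abs_to_rel(locus, 13524)  # => 10
puts abs_to_aa(locus, 13506)   # => 10
p abs_to_rel(locus, 1)         # => nil
```

Position 13506 is relative nucleotide 28 on the complement strand, and (28 - 1) / 3 + 1 is 10, the inverse of the amino-acid example given for absolute.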
Returns an Array containing overall min and max position [min, max] of this Bio::Locations object.
  # File lib/bio/location.rb, line 377
  def span
    span_min = @locations.min { |a,b| a.from <=> b.from }
    span_max = @locations.max { |a,b| a.to <=> b.to }
    return span_min.from, span_max.to
  end
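As an illustration of what the returned pair means, the following sketch computes the same overall minimum and maximum over plain structs. `Interval` and `overall_span` are hypothetical names used only here; the two intervals correspond to the segments of 'join(complement(500..550), 600..625)' from the class example.

```ruby
# Simplified stand-in for a location with only its end points (not Bio::Location).
Interval = Struct.new(:from, :to)

# Hypothetical helper: smallest start and largest end across all intervals.
def overall_span(intervals)
  [intervals.map(&:from).min, intervals.map(&:to).max]
end

parts = [Interval.new(500, 550), Interval.new(600, 625)]
p overall_span(parts)  # => [500, 625]
```

The gap between 550 and 600 is not reflected in the result: the span only records the outermost coordinates.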
String representation.
Note: In some cases, the original notation cannot be recovered: the method cannot always tell "complement(join(…))" from "join(complement(..))", or "complement(order(…))" from "order(complement(..))".
Returns: | String |
  # File lib/bio/location.rb, line 472
  def to_s
    return '' if @locations.empty?
    complement_join = false
    locs = @locations
    if locs.size >= 2 and locs.inject(true) do |flag, loc|
        # check if each location is complement
        (flag && (loc.strand == -1) && !loc.xref_id)
      end and locs.inject(locs[0].from) do |pos, loc|
        if pos then
          (pos >= loc.from) ? loc.from : false
        else
          false
        end
      end then
      locs = locs.reverse
      complement_join = true
    end
    locs = locs.collect do |loc|
      lt = loc.lt ? '<' : ''
      gt = loc.gt ? '>' : ''
      str = if loc.from == loc.to then
              "#{lt}#{gt}#{loc.from.to_i}"
            elsif loc.carat then
              "#{lt}#{loc.from.to_i}^#{gt}#{loc.to.to_i}"
            else
              "#{lt}#{loc.from.to_i}..#{gt}#{loc.to.to_i}"
            end
      if loc.xref_id and !loc.xref_id.empty? then
        str = "#{loc.xref_id}:#{str}"
      end
      if loc.strand == -1 and !complement_join then
        str = "complement(#{str})"
      end
      if loc.sequence then
        str = "replace(#{str},\"#{loc.sequence}\")"
      end
      str
    end
    if locs.size >= 2 then
      op = (self.operator || 'join').to_s
      result = "#{op}(#{locs.join(',')})"
    else
      result = locs[0]
    end
    if complement_join then
      result = "complement(#{result})"
    end
    result
  end
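The heuristic in the listing (wrap everything in a single complement when every segment is on the reverse strand and the segments are stored in descending order) is easier to see in a reduced form. The sketch below is a simplification under stated assumptions: it ignores the `<`/`>` markers, carat sites, remote entry IDs, and replacement sequences, and `Piece` and `render` are hypothetical names rather than BioRuby API.

```ruby
# Simplified stand-in for a location segment (not Bio::Location).
Piece = Struct.new(:from, :to, :strand)

# Hypothetical helper: render segments as a GenBank style position string.
def render(pieces, operator = nil)
  return '' if pieces.empty?
  # All segments on the reverse strand and listed in descending order:
  # emit complement(join(...)) with the segments reversed.
  wrap = pieces.size >= 2 &&
         pieces.all? { |piece| piece.strand == -1 } &&
         pieces.each_cons(2).all? { |a, b| a.from >= b.from }
  ordered = wrap ? pieces.reverse : pieces
  strs = ordered.map do |piece|
    s = piece.from == piece.to ? piece.from.to_s : "#{piece.from}..#{piece.to}"
    piece.strand == -1 && !wrap ? "complement(#{s})" : s
  end
  out = strs.size >= 2 ? "#{operator || 'join'}(#{strs.join(',')})" : strs[0]
  wrap ? "complement(#{out})" : out
end

puts render([Piece.new(500, 550, -1), Piece.new(600, 625, 1)])
# => join(complement(500..550),600..625)
puts render([Piece.new(600, 625, -1), Piece.new(500, 550, -1)])
# => complement(join(500..550,600..625))
puts render([Piece.new(500, 550, -1), Piece.new(600, 625, -1)])
# => join(complement(500..550),complement(600..625))
```

The last two calls hold the same two reverse-strand segments and differ only in stored order, which is the only signal available for choosing between the two notations.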