Class: Bio::GFF::GFF3::Record::Gap
In: lib/bio/db/gff.rb
Parent: Object
Bio::GFF::GFF3::Record::Gap is a class that stores the data of a "Gap" attribute.
Code | = | Struct.new(:code, :length) | Code is a class that stores a single-letter code together with its length. |
data | [R] | Internal data. Users must not use it. |
Arguments:
    # File lib/bio/db/gff.rb, line 1246
    def initialize(str = nil)
      if str then
        @data = str.split(/ +/).collect do |x|
          if /\A([A-Z])([0-9]+)\z/ =~ x.strip then
            Code.new($1.intern, $2.to_i)
          else
            # .inspect belongs inside the interpolation
            warn "ignored unknown token: #{x.inspect}" if $VERBOSE
            nil
          end
        end
        @data.compact!
      else
        @data = []
      end
    end
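The tokenizing step in the listing above can be tried in isolation. What follows is a minimal standalone sketch, not the BioRuby class itself: `parse_gap` is a hypothetical helper name, and `Code` is redefined locally so that the snippet runs without the library.

```ruby
# Standalone sketch of the tokenizing logic shown above.
# `parse_gap` is a hypothetical helper, not part of BioRuby.
Code = Struct.new(:code, :length)

def parse_gap(str)
  return [] unless str
  tokens = str.split(/ +/).map do |token|
    # A valid token is one uppercase letter followed by a decimal length.
    if /\A([A-Z])([0-9]+)\z/ =~ token.strip
      Code.new($1.intern, $2.to_i)
    end
  end
  tokens.compact # unknown tokens became nil and are dropped here
end

codes = parse_gap("M8 D3 M6 I1 M6")
codes.each { |c| puts "#{c.code} #{c.length}" }
```

Tokens that do not match the pattern, such as a lowercase letter or a missing length, are dropped rather than raising an error, which mirrors the listing.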
Creates a new Gap object from the given sequence alignment.
Note that sites where both the reference and the target are gaps are silently removed.
Arguments:
    # File lib/bio/db/gff.rb, line 1362
    def self.new_from_sequences_na(reference, target,
                                   gap_regexp = /[^a-zA-Z]/)
      gap = self.new
      gap.instance_eval {
        __initialize_from_sequences_na(reference, target,
                                       gap_regexp)
      }
      gap
    end
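The internal method `__initialize_from_sequences_na` is not shown on this page. As a rough illustration of the idea only, the sketch below run-length encodes two aligned nucleotide strings into Gap codes. It assumes the GFF3 convention that `I` marks a gap in the reference and `D` a gap in the target. `gap_codes_from_alignment` is a hypothetical helper name, not BioRuby's implementation.

```ruby
# Standalone sketch: derive Gap codes from two aligned nucleotide strings.
# `gap_codes_from_alignment` is a hypothetical helper, not part of BioRuby.
# Assumption: I = gap in the reference, D = gap in the target (GFF3 convention).
def gap_codes_from_alignment(reference, target, gap_regexp = /[^a-zA-Z]/)
  codes = []
  reference.chars.zip(target.chars) do |r, t|
    r_gap = r.nil? || gap_regexp.match?(r)
    t_gap = t.nil? || gap_regexp.match?(t)
    next if r_gap && t_gap # both are gaps: the site is silently dropped
    code =
      if r_gap
        :I
      elsif t_gap
        :D
      else
        :M
      end
    if codes.last && codes.last[0] == code
      codes.last[1] += 1   # extend the current run
    else
      codes << [code, 1]   # start a new run
    end
  end
  codes.map { |c, n| "#{c}#{n}" }.join(" ")
end

puts gap_codes_from_alignment("atgg-taagac", "atggctaa-ac")
# prints "M4 I1 M3 D1 M2"
```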
Creates a new Gap object from the given alignment of a nucleotide reference sequence and an amino acid target sequence.
Note that sites where both the reference and the target are gaps are silently removed.
For incorrect alignments that break the 3:1 rule (three nucleotides per amino acid), gap positions are moved inside codons, unwanted gaps are removed, and forward or reverse frameshifts are inserted where needed.
For example,
    atgg-taagac-att
    M  V  K  -  I
is treated as:
    atggt<aagacatt
    M  V  K  >>I
An incorrect combination of a frameshift with another frameshift or with a gap may cause undefined behavior.
It is recommended to indicate forward frameshifts in the target sequence. Reverse frameshifts can be indicated in either the reference sequence or the target sequence.
Priority of regular expressions:
space > forward/reverse frameshift > gap
Arguments:
    # File lib/bio/db/gff.rb, line 1558
    def self.new_from_sequences_na_aa(reference, target,
                                      gap_regexp = /[^a-zA-Z]/,
                                      space_regexp = /\s/,
                                      forward_frameshift_regexp = /\>/,
                                      reverse_frameshift_regexp = /\</)
      gap = self.new
      gap.instance_eval {
        __initialize_from_sequences_na_aa(reference, target,
                                          gap_regexp,
                                          space_regexp,
                                          forward_frameshift_regexp,
                                          reverse_frameshift_regexp)
      }
      gap
    end
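The stated priority (space > forward/reverse frameshift > gap) matters because the default `gap_regexp` also matches whitespace, `>` and `<`. The sketch below shows one way such an ordering could be applied to a single character. `classify_symbol` is a hypothetical helper written for illustration, and the internal method may be organized differently.

```ruby
# Standalone sketch of the matching priority: space > frameshift > gap.
# `classify_symbol` is a hypothetical helper, not part of BioRuby.
def classify_symbol(ch,
                    gap_regexp = /[^a-zA-Z]/,
                    space_regexp = /\s/,
                    forward_frameshift_regexp = />/,
                    reverse_frameshift_regexp = /</)
  # The default gap_regexp matches any non-letter, so the more
  # specific patterns have to be tested before it.
  if space_regexp.match?(ch)
    :space
  elsif forward_frameshift_regexp.match?(ch)
    :forward_frameshift
  elsif reverse_frameshift_regexp.match?(ch)
    :reverse_frameshift
  elsif gap_regexp.match?(ch)
    :gap
  else
    :residue
  end
end

p "M >-<".chars.map { |c| classify_symbol(c) }
# prints [:residue, :space, :forward_frameshift, :gap, :reverse_frameshift]
```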
Returns true if self == other; otherwise, returns false.
    # File lib/bio/db/gff.rb, line 1586
    def ==(other)
      if other.class == self.class and
          @data == other.data then
        true
      else
        false
      end
    end
Processes nucleotide sequences and returns the gapped sequences as an array of two sequences.
Note on forward/reverse frameshifts: a forward frameshift is simply treated as a gap inserted into the target sequence, and a reverse frameshift as a gap inserted into the reference sequence.
Arguments:
    # File lib/bio/db/gff.rb, line 1686
    def process_sequences_na(reference, target, gap_char = '-')
      s_ref, s_tgt = dup_seqs(reference, target)

      s_ref, s_tgt = __process_sequences(s_ref, s_tgt,
                                         gap_char, gap_char,
                                         1, 1,
                                         gap_char, gap_char)

      if $VERBOSE and s_ref.length != s_tgt.length then
        warn "returned sequences not equal length"
      end
      return s_ref, s_tgt
    end
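The helper `__process_sequences` is not listed on this page. The sketch below illustrates the general idea for the nucleotide case under two assumptions: `I` and `R` insert gap characters into the reference, and `D` and `F` insert them into the target, as the frameshift note above suggests. `apply_gap_codes` is a hypothetical name, and the snippet is independent of BioRuby.

```ruby
# Standalone sketch: apply Gap codes to ungapped nucleotide sequences.
# `apply_gap_codes` is a hypothetical helper, not part of BioRuby.
def apply_gap_codes(gap_str, reference, target, gap_char = "-")
  s_ref = reference.dup # work on copies so the inputs stay untouched
  s_tgt = target.dup
  pos = 0               # current column, shared by both gapped sequences
  gap_str.scan(/([A-Z])([0-9]+)/) do |code, len|
    n = len.to_i
    case code
    when "I", "R"       # gap goes into the reference
      s_ref[pos, 0] = gap_char * n
    when "D", "F"       # gap goes into the target
      s_tgt[pos, 0] = gap_char * n
    end
    pos += n            # every code advances the column by its length
  end
  [s_ref, s_tgt]
end

ref, tgt = apply_gap_codes("M4 I1 M3 D1 M2", "atggtaagac", "atggctaaac")
puts ref # prints "atgg-taagac"
puts tgt # prints "atggctaa-ac"
```

Both returned strings have the same length when the codes are consistent with the input sequences, which is the condition the listing checks under `$VERBOSE`.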
Processes sequences and returns gapped sequences as an array of sequences. reference must be a nucleotide sequence, and target must be an amino acid sequence.
Note on reverse frameshifts: reverse frameshift characters are inserted into the reference sequence. For example, the alignment for "Gap=M3 R1 M2" is:
    atgaagat<aatgtc
    M  K  I  N  V
The alignment for "Gap=M3 R3 M3" is:
    atgaag<<<attaatgtc
    M  K  I  I  N  V
Arguments:
    # File lib/bio/db/gff.rb, line 1723
    def process_sequences_na_aa(reference, target,
                                gap_char = '-',
                                space_char = ' ',
                                forward_frameshift = '>',
                                reverse_frameshift = '<')
      s_ref, s_tgt = dup_seqs(reference, target)
      s_tgt = s_tgt.gsub(/./, "\\0#{space_char}#{space_char}")
      ref_increment = 3
      tgt_increment = 1 + space_char.length * 2
      ref_gap = gap_char * 3
      tgt_gap = "#{gap_char}#{space_char}#{space_char}"
      return __process_sequences(s_ref, s_tgt,
                                 ref_gap, tgt_gap,
                                 ref_increment, tgt_increment,
                                 forward_frameshift,
                                 reverse_frameshift)
    end
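The padding step in the listing above is what lets one amino acid line up with one codon. The snippet below reproduces only that step in isolation; the sample target `"MKINV"` is chosen for illustration.

```ruby
# Standalone sketch of the target-padding step from the listing above.
space_char = " "
target = "MKINV" # illustrative amino acid sequence

# Append two space characters after every residue so that each
# residue occupies as many columns as a codon in the reference.
padded = target.gsub(/./, "\\0#{space_char}#{space_char}")
tgt_increment = 1 + space_char.length * 2

p padded        # prints "M  K  I  N  V  "
p tgt_increment # prints 3
```

With the default single-character `space_char`, each residue takes three columns, the same as `ref_increment`, so a match of one residue advances both sequences by the same width.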