Converts an integer-encoded matrix of amino acids back to its
character representation under the package's 25-symbol alphabet. Inverse
of encode_aa_sequence.
Arguments
- matrix_input
Numeric matrix of integer-encoded amino acids, typically the output of
encode_aa_sequence. Values outside 1:25 (including 0 and NA) decode to the sentinel string "0". A non-matrix input is coerced via as.matrix.
Value
A character matrix of the same dimensions as matrix_input,
with the same dimnames. Each cell is either a one-character
amino acid code from the 25-symbol alphabet, or the sentinel
"0" for out-of-range and missing values.
Details
Decoding is a single vectorised lookup against the fixed alphabet
(A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, V for
the twenty standard residues, then B, Z, X, *, - for ambiguous
codes, stop codons, and gaps). The input matrix is flattened, indexed
against the alphabet vector in one operation, and reshaped; there is
no per-row or per-element loop.
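The flatten, index, reshape sequence can be sketched as follows. This is a minimal illustration and not the package's actual source: the names aa_alphabet and sketch_decode are hypothetical, and the alphabet order is assumed to match the listing above. Out-of-range and NA handling is included so that the sketch agrees with the documented return value.

```r
# Hypothetical stand-in for the package's alphabet vector; order assumed
# from the listing in Details.
aa_alphabet <- c("A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
                 "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V",
                 "B", "Z", "X", "*", "-")

# Hypothetical decoder sketch: one vectorised lookup, no loops.
sketch_decode <- function(matrix_input) {
  m <- as.matrix(matrix_input)           # coerce non-matrix input
  codes <- as.vector(m)                  # flatten (column-major)
  valid <- !is.na(codes) & codes >= 1 & codes <= 25
  out <- rep("0", length(codes))         # sentinel for everything else
  out[valid] <- aa_alphabet[codes[valid]]  # single indexing operation
  dim(out) <- dim(m)                     # reshape to the input dimensions
  dimnames(out) <- dimnames(m)           # keep row and column names
  out
}

sketch_decode(matrix(c(1, 2, 25, 10), nrow = 2, byrow = TRUE))
```

Filling the result with the sentinel first and overwriting only the valid positions avoids indexing the alphabet with 0 or NA, which would otherwise drop elements or produce NA.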
Values outside the valid range 1:25 (including 0, which
encode_aa_sequence produces for unrecognised characters)
are returned as the sentinel string "0". This preserves
round-trip consistency with the encoder: encoding then decoding any
character originally outside the alphabet yields "0" rather
than throwing an error. NA values are also mapped to
"0".
Row and column names of the input matrix are preserved on the output.
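The round-trip behaviour for characters outside the alphabet can be illustrated with a small self-contained sketch. It does not call the package: sketch_encode is a hypothetical stand-in that assumes the encoder behaves like match() with nomatch = 0, and sketch_decode0 is a simplified decoder that only handles the codes 0:25 such an encoder can produce.

```r
# Alphabet order as listed in Details; the vector name is hypothetical.
aa_alphabet <- c("A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
                 "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V",
                 "B", "Z", "X", "*", "-")

# Hypothetical encoder stand-in: unrecognised characters become 0.
sketch_encode <- function(chars) {
  chars <- as.matrix(chars)
  codes <- match(chars, aa_alphabet, nomatch = 0L)
  dim(codes) <- dim(chars)
  codes
}

# Simplified decoder for codes in 0:25 only: prepend the sentinel and
# shift the index by one so that code 0 selects "0".
sketch_decode0 <- function(codes) {
  out <- c("0", aa_alphabet)[as.vector(codes) + 1L]
  dim(out) <- dim(codes)
  out
}

# "U" and "J" are not in the 25-symbol alphabet.
orig <- matrix(c("A", "U", "-", "J"), nrow = 2)
round_trip <- sketch_decode0(sketch_encode(orig))
print(round_trip)
```

In this sketch the in-alphabet cells ("A" and "-") survive the round trip unchanged, while "U" and "J" come back as "0" without raising an error.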
See also
encode_aa_sequence for the inverse operation;
fasta_to_char_matrix for the FASTA-to-character-matrix
step that typically precedes encoding.
Examples
# 1. Decode a numeric matrix.
num_mat = matrix(c(1, 2, 25, 10), nrow = 2, byrow = TRUE)
decoded = decode_aa_sequence(num_mat)
print(decoded)
#>      [,1] [,2]
#> [1,] "A"  "R"
#> [2,] "-"  "I"
# 1 -> "A", 2 -> "R", 25 -> "-", 10 -> "I"
# 2. Round-trip consistency check (excluding unknowns).
orig = matrix(c("A", "C", "W", "G"), nrow = 2)
enc = encode_aa_sequence(orig)
dec = decode_aa_sequence(enc)
all.equal(orig, dec)
#> [1] TRUE
# 3. Out-of-range and NA values both decode to the sentinel "0".
decode_aa_sequence(matrix(c(0, NA, 30, 5), nrow = 2))
#>      [,1] [,2]
#> [1,] "0"  "0"
#> [2,] "0"  "C"