Computes the Shannon entropy of a categorical vector.
Usage
calculate_entropy(vctr, base = 2, precision = 6)
Arguments
- vctr
A vector (character, factor, or integer) representing categorical data.
- base
A numeric scalar. The base of the logarithm. Default is 2.
- precision
Integer. The number of decimal places to round the result to. Default is 6.
Value
A numeric scalar representing the entropy. Returns 0 if the vector contains
only one unique value or has length 0.
Details
Entropy is calculated as \(H(X) = -\sum p(x) \log_b p(x)\), where \(p(x)\)
is the proportion of observations belonging to category \(x\).
Examples
seq_vec = c("A", "A", "T", "G", "C", "A")
calculate_entropy(seq_vec)
#> [1] 1.792481
# Pure homogeneity
calculate_entropy(rep("A", 10))
#> [1] 0