Fast rank
Similar to base::rank but much faster. It accepts vectors, lists, data.frames or data.tables as input. In addition to the ties.method options provided by base::rank, it also provides ties.method="dense".
Like forder, sorting is done in "C-locale"; in particular, this may affect how capital/lowercase letters are ranked. See Details on forder for more.
bit64::integer64 type is also supported.
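As a minimal sketch of the C-locale behaviour mentioned above, the snippet below ranks a small character vector that mixes upper- and lowercase letters. The vector `s` is made up for illustration; the expected output assumes plain ASCII input.

```r
library(data.table)

# Hypothetical input mixing upper- and lowercase letters.
s <- c("b", "A", "a", "B")

# frank ranks strings in C-locale, where uppercase ASCII letters
# sort before lowercase ones: "A" < "B" < "a" < "b".
r_c <- frank(s)
r_c
#> [1] 4 1 3 2

# base::rank follows the session's collation locale, so its result
# may or may not match the above depending on the locale in effect.
rank(s)
```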
Arguments
- x
A vector, or list with all its elements identical in length, or data.frame or data.table.
- ...
Only for lists, data.frames and data.tables. The columns to calculate ranks based on. Do not quote column names. If ... is missing, all columns are considered by default. To sort by a column in descending order prefix "-", e.g., frank(x, a, -b, c). -b works when b is of type character as well.
- cols
A character vector of column names (or numbers) of x, for which to obtain ranks.
- order
An integer vector with only possible values of 1 and -1, corresponding to ascending and descending order. The length of order must be either 1 or equal to that of cols. If length(order) == 1, it is recycled to length(cols).
- na.last
Control treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed; if "keep" they are kept with rank NA.
- ties.method
A character string specifying how ties are treated, see Details.
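The relationship between cols, order and the unquoted "-" prefix can be sketched as follows. This is an illustration under the rules stated in the argument descriptions, not an exhaustive test; the table `DT` and the result names are made up for the example.

```r
library(data.table)

# Hypothetical two-column table.
DT <- data.table(x = c(4, 1, 4, NA, 1, NA, 4),
                 y = c(1, 1, 1, 0, NA, 0, 2))

# A length-1 order is recycled to length(cols) ...
r_recycled <- frankv(DT, cols = c("x", "y"), order = -1L, ties.method = "first")

# ... so it should match spelling out one value per column,
r_explicit <- frankv(DT, cols = c("x", "y"), order = c(-1L, -1L), ties.method = "first")

# selecting the columns by number instead of by name,
r_bynum <- frankv(DT, cols = 1:2, order = -1L, ties.method = "first")

# and the unquoted form with a "-" prefix on each column.
r_frank <- frank(DT, -x, -y, ties.method = "first")

identical(r_recycled, r_explicit)
identical(r_recycled, r_bynum)
identical(r_recycled, r_frank)
```

frankv is the form to reach for when column names are held in a character vector; frank is more convenient for interactive use.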
Details
To be consistent with other data.table operations, NAs are considered identical to other NAs (and NaNs to other NaNs), unlike base::rank. Therefore, for na.last=TRUE and na.last=FALSE, NAs (and NaNs) are given identical ranks, whereas base::rank gives each NA a distinct rank.
frank is not limited to vectors. It accepts data.tables (and lists and data.frames) as well. It accepts unquoted column names (with names preceded by a - sign for descending order, even on character vectors), e.g., frank(DT, a, -b, c, ties.method="first") where a, b and c are columns in DT. In frankv, the same ordering is specified through the cols and order arguments.
In addition to the ties.method values supported by base::rank, it also provides the value "dense", which returns the ranks without any gaps in the ranking. See examples.
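The two differences from base::rank described above (tied NAs and "dense" ranking) can be seen side by side in a short sketch. It reuses the vector from the Examples section; the result names are chosen only for illustration.

```r
library(data.table)

x <- c(4, 1, 4, NA, 1, NA, 4)

# base::rank gives each NA its own rank, in order of appearance.
r_base <- rank(x, na.last = TRUE)
r_base
#> [1] 4.0 1.5 4.0 6.0 1.5 7.0 4.0

# frank treats the two NAs as tied, so they share the average rank.
r_frank <- frank(x, na.last = TRUE)
r_frank
#> [1] 4.0 1.5 4.0 6.5 1.5 6.5 4.0

# "min" leaves gaps after a group of ties; "dense" does not.
r_min <- frank(x, ties.method = "min")
r_min
#> [1] 3 1 3 6 1 6 3
r_dense <- frank(x, ties.method = "dense")
r_dense
#> [1] 2 1 2 3 1 3 2
```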
Value
A numeric vector of length equal to NROW(x) (unless na.last = NA, when missing values are removed). The vector is of integer type unless ties.method = "average" when it is of double type (irrespective of ties).
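A quick way to check the return type and length described above is sketched below; the input vectors and result names are made up for the example.

```r
library(data.table)

x <- c(4, 1, 4, NA, 1, NA, 4)

r_avg    <- frank(x)                         # default ties.method = "average"
r_noties <- frank(c(3, 1, 2))                # "average" with no ties present
r_dense  <- frank(x, ties.method = "dense")
r_drop   <- frank(x, na.last = NA)           # NAs are removed from the result

typeof(r_avg)     # "double"
typeof(r_noties)  # "double", even though there are no ties
typeof(r_dense)   # "integer"
length(r_drop)    # 5: NROW(x) minus the two NAs
```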
Examples
# on vectors
x = c(4, 1, 4, NA, 1, NA, 4)
# NAs are considered identical (unlike base R)
# default is average
frankv(x) # na.last=TRUE
#> [1] 4.0 1.5 4.0 6.5 1.5 6.5 4.0
frankv(x, na.last=FALSE)
#> [1] 6.0 3.5 6.0 1.5 3.5 1.5 6.0
# ties.method = min
frankv(x, ties.method="min")
#> [1] 3 1 3 6 1 6 3
# ties.method = dense
frankv(x, ties.method="dense")
#> [1] 2 1 2 3 1 3 2
# on data.table
DT = data.table(x, y=c(1, 1, 1, 0, NA, 0, 2))
frankv(DT, cols="x") # same as frankv(x) from before
#> [1] 4.0 1.5 4.0 6.5 1.5 6.5 4.0
frankv(DT, cols="x", na.last="keep")
#> [1] 4.0 1.5 4.0 NA 1.5 NA 4.0
frankv(DT, cols="x", ties.method="dense", na.last=NA)
#> [1] 2 1 2 1 2
frank(DT, x, ties.method="dense", na.last=NA) # equivalent of above using frank
#> [1] 2 1 2 1 2
# on both columns
frankv(DT, ties.method="first", na.last="keep")
#> [1] 2 1 3 NA NA NA 4
frank(DT, ties.method="first", na.last="keep") # equivalent of above using frank
#> [1] 2 1 3 NA NA NA 4
# descending order on y: "-" prefix in frank, order argument in frankv
frank(DT, x, -y, ties.method="first")
#> [1] 4 1 5 6 2 7 3
# equivalent of above using frankv
frankv(DT, order=c(1L, -1L), ties.method="first")
#> [1] 4 1 5 6 2 7 3