Set operations for data tables
Similar to the base R set functions union, intersect, setdiff and setequal, but for data.tables. An additional all argument controls how duplicated rows are handled. The functions fintersect, fsetdiff (MINUS or EXCEPT in SQL) and funion are meant to provide the functionality of the corresponding SQL operators. Unlike SQL, the data.table functions retain row order.
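As a rough illustration of the correspondence with base R, the sketch below applies the base set functions to two plain vectors and the data.table versions to one-column tables built from the same values. The vectors a and b and the column name v are made up for this example.

```r
library(data.table)

# Hypothetical input vectors; note the duplicated 2 in `a`.
a <- c(3, 1, 2, 2)
b <- c(2, 4)

# Base R set functions operate on vectors and drop duplicates.
union(a, b)      # 3 1 2 4
intersect(a, b)  # 2
setdiff(a, b)    # 3 1

# The data.table versions operate on rows; with the default all = FALSE
# duplicates are dropped as well and the order of rows in x is kept.
xa <- data.table(v = a)
xb <- data.table(v = b)
funion(xa, xb)$v      # 3 1 2 4
fintersect(xa, xb)$v  # 2
fsetdiff(xa, xb)$v    # 3 1
```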
Usage
fintersect(x, y, all = FALSE)
fsetdiff(x, y, all = FALSE)
funion(x, y, all = FALSE)
fsetequal(x, y, all = TRUE)
Arguments
- x, y
data.tables.
- all
Logical. Default is FALSE (TRUE for fsetequal). When FALSE, duplicate rows are removed from the result. When TRUE, if there are xn copies of a particular row in x and yn copies of the same row in y, then:
  - fintersect will return min(xn, yn) copies of that row.
  - fsetdiff will return max(0, xn-yn) copies of that row.
  - funion will return xn+yn copies of that row.
  - fsetequal will return FALSE unless xn == yn.
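The counting rules for all = TRUE can be checked on a small case. The following is a minimal sketch with two hypothetical two-column tables (the column names id and grp are invented here); a row counts as a duplicate only when every column matches.

```r
library(data.table)

# Hypothetical tables for illustration.
# Row (1, "a"): xn = 3, yn = 1.  Row (2, "b"): xn = 1, yn = 2.
x <- data.table(id = c(1L, 1L, 1L, 2L), grp = c("a", "a", "a", "b"))
y <- data.table(id = c(1L, 2L, 2L),     grp = c("a", "b", "b"))

n_intersect <- nrow(fintersect(x, y, all = TRUE))  # min(3, 1) + min(1, 2) = 2
n_setdiff   <- nrow(fsetdiff(x, y, all = TRUE))    # max(0, 3 - 1) + max(0, 1 - 2) = 2
n_union     <- nrow(funion(x, y, all = TRUE))      # (3 + 1) + (1 + 2) = 7

# The copy counts differ, so the tables are not equal when duplicates count ...
eq_all <- fsetequal(x, y)                # FALSE
# ... but both contain the same set of distinct rows.
eq_set <- fsetequal(x, y, all = FALSE)   # TRUE
```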
Details
bit64::integer64 columns are supported. complex and list columns are not supported, except by funion.
Examples
x = data.table(c(1,2,2,2,3,4,4))
x2 = data.table(c(1,2,3,4)) # same set of rows as x
y = data.table(c(2,3,4,4,4,5))
fintersect(x, y) # intersect
#> V1
#> <num>
#> 1: 2
#> 2: 3
#> 3: 4
fintersect(x, y, all=TRUE) # intersect all
#> V1
#> <num>
#> 1: 2
#> 2: 3
#> 3: 4
#> 4: 4
fsetdiff(x, y) # except
#> V1
#> <num>
#> 1: 1
fsetdiff(x, y, all=TRUE) # except all
#> V1
#> <num>
#> 1: 1
#> 2: 2
#> 3: 2
funion(x, y) # union
#> V1
#> <num>
#> 1: 1
#> 2: 2
#> 3: 3
#> 4: 4
#> 5: 5
funion(x, y, all=TRUE) # union all
#> V1
#> <num>
#> 1: 1
#> 2: 2
#> 3: 2
#> 4: 2
#> 5: 3
#> 6: 4
#> 7: 4
#> 8: 2
#> 9: 3
#> 10: 4
#> 11: 4
#> 12: 4
#> 13: 5
fsetequal(x, x2, all=FALSE) # setequal
#> [1] TRUE
fsetequal(x, x2) # setequal all
#> [1] FALSE