Skip to content

peakon/pg_roaringbitmap

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pg_roaringbitmap

RoaringBitmap extension for PostgreSQL.

It's initial based on https://github.com/zeromax007/gpdb-roaringbitmap.

Introduction

Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. In some instances, roaring bitmaps can be hundreds of times faster and they often offer significantly better compression. They can even be faster than uncompressed bitmaps. More information https://github.com/RoaringBitmap/CRoaring.

Build

Requirements

Note: The regression testing before the version release only covers PostgreSQL 13 and above.

Build

su - postgres
make
sudo make install
psql -c "create extension roaringbitmap"

Note:You can use make -f Makefile_native instead of make to let the compiler use the SIMD instructions if it's supported by your CPU. In some scenarios, it may double the performance. But if you copy the pg_roaringbitmap binary which builded on machine with SIMD support to other machine without SIMD and run, you could get a SIGILL crash.

Test

make installcheck

Usage

roaringbitmap

about roaringbitmap data type

Logically, you could think of roaringbitmap data type as bit(4294967296), and it should be noted that the integers added to bitmaps is considered to be unsigned. Within bitmaps, numbers are ordered according to uint32. We order the numbers like 0, 1, ..., 2147483647, -2147483648, -2147483647,..., -1. But we use bigint to reference the range of these integers, that is [0 4294967296).

input and ouput

Support two kind of input/output syntax 'array' and 'bytea', The default output format is 'bytea'.

postgres=# select roaringbitmap('{1,100,10}');
                 roaringbitmap
------------------------------------------------
 \x3a30000001000000000002001000000001000a006400
(1 row)

or

postgres=# select '\x3a30000001000000000002001000000001000a006400'::roaringbitmap;
                 roaringbitmap
------------------------------------------------
 \x3a30000001000000000002001000000001000a006400
(1 row)

output format can changed by roaringbitmap.output_format

postgres=# set roaringbitmap.output_format='bytea';
SET
postgres=# select '{1}'::roaringbitmap;
             roaringbitmap
----------------------------------------
 \x3a3000000100000000000000100000000100
(1 row)

postgres=# set roaringbitmap.output_format='array';
SET
postgres=# select '{1}'::roaringbitmap;
 roaringbitmap
---------------
 {1}
(1 row)

sample of usage

Use bitmap as type of column

CREATE TABLE t1 (id integer, bitmap roaringbitmap);

Build bitmap from integers

INSERT INTO t1 SELECT 1,rb_build(ARRAY[1,2,3,4,5,6,7,8,9,200]);

INSERT INTO t1 SELECT 2,rb_build_agg(e) FROM generate_series(1,100) e;

Bitmap Calculation (OR, AND, XOR, ANDNOT)

SELECT roaringbitmap('{1,2,3}') | roaringbitmap('{3,4,5}');
SELECT roaringbitmap('{1,2,3}') & roaringbitmap('{3,4,5}');
SELECT roaringbitmap('{1,2,3}') # roaringbitmap('{3,4,5}');
SELECT roaringbitmap('{1,2,3}') - roaringbitmap('{3,4,5}');

Bitmap Aggregate (OR, AND, XOR, BUILD)

SELECT rb_or_agg(bitmap) FROM t1;
SELECT rb_and_agg(bitmap) FROM t1;
SELECT rb_xor_agg(bitmap) FROM t1;
SELECT rb_build_agg(e) FROM generate_series(1,100) e;

Calculate cardinality

SELECT rb_cardinality('{1,2,3}');

Convert bitmap to integer array

SELECT rb_to_array(bitmap) FROM t1 WHERE id = 1;

Convert bitmap to SET of integers

SELECT unnest(rb_to_array('{1,2,3}'::roaringbitmap));

or

SELECT rb_iterate('{1,2,3}'::roaringbitmap);

Opperator List

Opperator Input Output Desc Example Result
& roaringbitmap,roaringbitmap roaringbitmap bitwise AND roaringbitmap('{1,2,3}') & roaringbitmap('{3,4,5}') {3}
| roaringbitmap,roaringbitmap roaringbitmap bitwise OR roaringbitmap('{1,2,3}') | roaringbitmap('{3,4,5}') {1,2,3,4,5}
| roaringbitmap,integer roaringbitmap add element to roaringbitmap roaringbitmap('{1,2,3}') | 6 {1,2,3,6}
| integer,roaringbitmap roaringbitmap add element to roaringbitmap 6 | roaringbitmap('{1,2,3}') {1,2,3,6}
# roaringbitmap,roaringbitmap roaringbitmap bitwise XOR roaringbitmap('{1,2,3}') # roaringbitmap('{3,4,5}') {1,2,4,5}
<< roaringbitmap,bigint roaringbitmap bitwise shift left roaringbitmap('{1,2,3}') << 2 {0,1}
>> roaringbitmap,bigint roaringbitmap bitwise shift right roaringbitmap('{1,2,3}') >> 3 {4,5,6}
- roaringbitmap,roaringbitmap roaringbitmap difference(bitwise ANDNOT) roaringbitmap('{1,2,3}') - roaringbitmap('{3,4,5}') {1,2}
- roaringbitmap,integer roaringbitmap remove element from roaringbitmap roaringbitmap('{1,2,3}') - 3 {1,2}
@> roaringbitmap,roaringbitmap bool contains roaringbitmap('{1,2,3}') @> roaringbitmap('{3,4,5}') f
@> roaringbitmap,integer bool contains roaringbitmap('{1,2,3,4,5}') @> 3 t
<@ roaringbitmap,roaringbitmap bool is contained by roaringbitmap('{1,2,3}') <@ roaringbitmap('{3,4,5}') f
<@ integer,roaringbitmap bool is contained by 3 <@ roaringbitmap('{3,4,5}') t
&& roaringbitmap,roaringbitmap bool overlap (have elements in common) roaringbitmap('{1,2,3}') && roaringbitmap('{3,4,5}') t
= roaringbitmap,roaringbitmap bool equal roaringbitmap('{1,2,3}') = roaringbitmap('{3,4,5}') f
<> roaringbitmap,roaringbitmap bool not equal roaringbitmap('{1,2,3}') <> roaringbitmap('{3,4,5}') t

Function List

Function Input Output Desc Example Result
rb_build integer[] roaringbitmap Create roaringbitmap from integer array rb_build('{1,2,3,4,5}') {1,2,3,4,5}
rb_index roaringbitmap,integer bigint Return the 0-based index of element in this roaringbitmap, or -1 if do not exist rb_index('{1,2,3}',3) 2
rb_cardinality roaringbitmap bigint Return cardinality of the roaringbitmap rb_cardinality('{1,2,3,4,5}') 5
rb_and_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the AND of two roaringbitmaps rb_and_cardinality('{1,2,3}','{3,4,5}') 1
rb_or_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the OR of two roaringbitmaps rb_or_cardinality('{1,2,3}','{3,4,5}') 5
rb_xor_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the XOR of two roaringbitmaps rb_xor_cardinality('{1,2,3}','{3,4,5}') 4
rb_andnot_cardinality roaringbitmap,roaringbitmap bigint Return cardinality of the ANDNOT of two roaringbitmaps rb_andnot_cardinality('{1,2,3}','{3,4,5}') 2
rb_is_empty roaringbitmap boolean Check if roaringbitmap is empty. rb_is_empty('{1,2,3,4,5}') f
rb_fill roaringbitmap,range_start bigint,range_end bigint roaringbitmap Fill the specified range (not include the range_end) rb_fill('{1,2,3}',5,7) {1,2,3,5,6}
rb_clear roaringbitmap,range_start bigint,range_end bigint roaringbitmap Clear the specified range (not include the range_end) rb_clear('{1,2,3}',2,3) {1,3}
rb_flip roaringbitmap,range_start bigint,range_end bigint roaringbitmap Negative the specified range (not include the range_end) rb_flip('{1,2,3}',2,10) {1,4,5,6,7,8,9}
rb_range roaringbitmap,range_start bigint,range_end bigint roaringbitmap Return new set with specified range (not include the range_end) rb_range('{1,2,3}',2,3) {2}
rb_range_cardinality roaringbitmap,range_start bigint,range_end bigint bigint Return the cardinality of specified range (not include the range_end) rb_range_cardinality('{1,2,3}',2,3) 1
rb_min roaringbitmap integer Return the smallest offset in roaringbitmap. Return NULL if the bitmap is empty rb_min('{1,2,3}') 1
rb_max roaringbitmap integer Return the greatest offset in roaringbitmap. Return NULL if the bitmap is empty rb_max('{1,2,3}') 3
rb_rank roaringbitmap,integer bigint Return the number of elements that are smaller or equal to the specified offset rb_rank('{1,2,3}',3) 3
rb_jaccard_dist roaringbitmap,roaringbitmap double precision Return the jaccard distance(or the Jaccard similarity coefficient) of two bitmaps rb_jaccard_dist('{1,2,3}','{3,4}') 0.25
rb_select roaringbitmap,bitset_limit bigint,bitset_offset bigint=0,reverse boolean=false,range_start bigint=0,range_end bigint=4294967296 roaringbitmap Return subset [bitset_offset,bitset_offset+bitset_limit) of bitmap between range [range_start,range_end) rb_select('{1,2,3,4,5,6,7,8,9}',5,2) {3,4,5,6,7}
rb_to_array roaringbitmap integer[] Convert roaringbitmap to integer array rb_to_array(roaringbitmap('{1,2,3}')) {1,2,3}
rb_iterate roaringbitmap SET of integer Return set of integer from a roaringbitmap data.
rb_iterate(roaringbitmap('{1,2,3}'))
1
2
3

Aggregation List

Function Input Output Desc Example Result
rb_build_agg integer roaringbitmap Build a roaringbitmap from a integer set
select rb_build_agg(id)
    from (values (1),(2),(3)) t(id)
{1,2,3}
rb_or_agg roaringbitmap roaringbitmap AND Aggregate calculations from a roaringbitmap set
select rb_or_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
{1,2,3,4}
rb_and_agg roaringbitmap roaringbitmap AND Aggregate calculations from a roaringbitmap set
select rb_and_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
{2,3}
rb_xor_agg roaringbitmap roaringbitmap XOR Aggregate calculations from a roaringbitmap set
select rb_xor_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
{1,4}
rb_or_cardinality_agg roaringbitmap bigint OR Aggregate calculations from a roaringbitmap set, return cardinality.
select rb_or_cardinality_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
4
rb_and_cardinality_agg roaringbitmap bigint AND Aggregate calculations from a roaringbitmap set, return cardinality
select rb_and_cardinality_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
2
rb_xor_cardinality_agg roaringbitmap bigint XOR Aggregate calculations from a roaringbitmap set, return cardinality
select rb_xor_cardinality_agg(bitmap)
    from (values (roaringbitmap('{1,2,3}')),
                 (roaringbitmap('{2,3,4}'))
          ) t(bitmap)
2

roaringbitmap64

about roaringbitmap64 data type

​​roaringbitmap64​​ is a 64-bit Roaring bitmap implementation, and its format definition can be found in https://github.com/RoaringBitmap/RoaringFormatSpec. Logically, you could think of roaringbitmap64 data type as bit(18446744073709551615) just like roaringbitmap, and it should be noted that the bigint data added to bitmaps is considered to be unsigned. Within 64 bit bitmaps, numbers are ordered according to uint64. We order the numbers like 0, 1, ..., 9223372036854775807, -9223372036854775808, -9223372036854775807,..., -1.

input and ouput

Support two kind of input/output syntax 'array' and 'bytea', The default output format is 'bytea'.

postgres=# select roaringbitmap64('{1,100,10}');
                            roaringbitmap64
------------------------------------------------------------------------
 \x0100000000000000000000003a30000001000000000002001000000001000a006400
(1 row)

or

postgres=# select '\x0100000000000000000000003a30000001000000000002001000000001000a006400'::roaringbitmap64;
                            roaringbitmap64
------------------------------------------------------------------------
 \x0100000000000000000000003a30000001000000000002001000000001000a006400
(1 row)

output format can changed by roaringbitmap.output_format

postgres=# set roaringbitmap.output_format='bytea';
SET
postgres=# select '{1}'::roaringbitmap64;
                        roaringbitmap64
----------------------------------------------------------------
 \x0100000000000000000000003a3000000100000000000000100000000100
(1 row)

postgres=# set roaringbitmap.output_format='array';
SET
postgres=# select '{1}'::roaringbitmap64;
 roaringbitmap64
-----------------
 {1}
(1 row)

sample of usage

Use bitmap as type of column

CREATE TABLE t1 (id integer, bitmap roaringbitmap64);

Build bitmap from set of bigint

INSERT INTO t1 SELECT 1,rb64_build(ARRAY[1,2,3,4,5,6,7,8,9,200]);

INSERT INTO t1 SELECT 2,rb64_build_agg(e) FROM generate_series(1,100) e;

Bitmap Calculation (OR, AND, XOR, ANDNOT)

SELECT roaringbitmap64('{1,2,3}') | roaringbitmap64('{3,4,5}');
SELECT roaringbitmap64('{1,2,3}') & roaringbitmap64('{3,4,5}');
SELECT roaringbitmap64('{1,2,3}') # roaringbitmap64('{3,4,5}');
SELECT roaringbitmap64('{1,2,3}') - roaringbitmap64('{3,4,5}');

Bitmap Aggregate (OR, AND, XOR, BUILD)

SELECT rb64_or_agg(bitmap) FROM t1;
SELECT rb64_and_agg(bitmap) FROM t1;
SELECT rb64_xor_agg(bitmap) FROM t1;
SELECT rb64_build_agg(e) FROM generate_series(1,100) e;

Calculate cardinality

SELECT rb64_cardinality('{1,2,3}');

Convert bitmap to bigint array

SELECT rb64_to_array(bitmap) FROM t1 WHERE id = 1;

Convert bitmap to SET of bigint

SELECT unnest(rb64_to_array('{1,2,3}'::roaringbitmap64));

or

SELECT rb64_iterate('{1,2,3}'::roaringbitmap64);

Opperator List

Opperator Input Output Desc Example Result
& roaringbitmap64,roaringbitmap64 roaringbitmap64 bitwise AND roaringbitmap64('{1,2,3}') & roaringbitmap64('{3,4,5}') {3}
| roaringbitmap64,roaringbitmap64 roaringbitmap64 bitwise OR roaringbitmap64('{1,2,3}') | roaringbitmap64('{3,4,5}') {1,2,3,4,5}
| roaringbitmap64,bigint roaringbitmap64 add element to roaringbitmap64 roaringbitmap64('{1,2,3}') | 6 {1,2,3,6}
| bigint,roaringbitmap64 roaringbitmap64 add element to roaringbitmap64 6 | roaringbitmap64('{1,2,3}') {1,2,3,6}
# roaringbitmap64,roaringbitmap64 roaringbitmap64 bitwise XOR roaringbitmap64('{1,2,3}') # roaringbitmap64('{3,4,5}') {1,2,4,5}
<< roaringbitmap64,bigint roaringbitmap64 bitwise shift left roaringbitmap64('{1,2,3}') << 2 {0,1}
>> roaringbitmap64,bigint roaringbitmap64 bitwise shift right roaringbitmap64('{1,2,3}') >> 3 {4,5,6}
- roaringbitmap64,roaringbitmap64 roaringbitmap64 difference(bitwise ANDNOT) roaringbitmap64('{1,2,3}') - roaringbitmap64('{3,4,5}') {1,2}
- roaringbitmap64,bigint roaringbitmap64 remove element from roaringbitmap64 roaringbitmap64('{1,2,3}') - 3 {1,2}
@> roaringbitmap64,roaringbitmap64 bool contains roaringbitmap64('{1,2,3}') @> roaringbitmap64('{3,4,5}') f
@> roaringbitmap64,bigint bool contains roaringbitmap64('{1,2,3,4,5}') @> 3 t
<@ roaringbitmap64,roaringbitmap64 bool is contained by roaringbitmap64('{1,2,3}') <@ roaringbitmap64('{3,4,5}') f
<@ bigint,roaringbitmap64 bool is contained by 3 <@ roaringbitmap64('{3,4,5}') t
&& roaringbitmap64,roaringbitmap64 bool overlap (have elements in common) roaringbitmap64('{1,2,3}') && roaringbitmap64('{3,4,5}') t
= roaringbitmap64,roaringbitmap64 bool equal roaringbitmap64('{1,2,3}') = roaringbitmap64('{3,4,5}') f
<> roaringbitmap64,roaringbitmap64 bool not equal roaringbitmap64('{1,2,3}') <> roaringbitmap64('{3,4,5}') t

Function List

Function Input Output Desc Example Result
rb64_build bigint[] roaringbitmap64 Create roaringbitmap64 from bigint array rb64_build('{1,2,3,4,5}') {1,2,3,4,5}
rb64_index roaringbitmap64,bigint bigint Return the 0-based index of element in this roaringbitmap64, or -1 if do not exist rb64_index('{1,2,3}',3) 2
rb64_cardinality roaringbitmap64 bigint Return cardinality of the roaringbitmap64 rb64_cardinality('{1,2,3,4,5}') 5
rb64_and_cardinality roaringbitmap64,roaringbitmap64 bigint Return cardinality of the AND of two roaringbitmaps rb64_and_cardinality('{1,2,3}','{3,4,5}') 1
rb64_or_cardinality roaringbitmap64,roaringbitmap64 bigint Return cardinality of the OR of two roaringbitmaps rb64_or_cardinality('{1,2,3}','{3,4,5}') 5
rb64_xor_cardinality roaringbitmap64,roaringbitmap64 bigint Return cardinality of the XOR of two roaringbitmaps rb64_xor_cardinality('{1,2,3}','{3,4,5}') 4
rb64_andnot_cardinality roaringbitmap64,roaringbitmap64 bigint Return cardinality of the ANDNOT of two roaringbitmaps rb64_andnot_cardinality('{1,2,3}','{3,4,5}') 2
rb64_is_empty roaringbitmap64 boolean Check if roaringbitmap64 is empty. rb64_is_empty('{1,2,3,4,5}') f
rb64_fill roaringbitmap64,range_start bigint,range_end bigint roaringbitmap64 Fill the specified range (not include the range_end),negative range value will be internally converted to unsigned int64, and range_end = 0 means unlimited rb64_fill('{1,2,3}',5,7) {1,2,3,5,6}
rb64_clear roaringbitmap64,range_start bigint,range_end bigint roaringbitmap64 Clear the specified range (not include the range_end),negative range value will be internally converted to unsigned int64, and range_end = 0 means unlimited rb64_clear('{1,2,3}',2,3) {1,3}
rb64_flip roaringbitmap64,range_start bigint,range_end bigint roaringbitmap64 Negative the specified range (not include the range_end),negative range value will be internally converted to unsigned int64, and range_end = 0 means unlimited rb64_flip('{1,2,3}',2,10) {1,4,5,6,7,8,9}
rb64_range roaringbitmap64,range_start bigint,range_end bigint roaringbitmap64 Return new set with specified range (not include the range_end),negative range value will be internally converted to unsigned int64, and range_end = 0 means unlimited rb64_range('{1,2,3}',2,3) {2}
rb64_range_cardinality roaringbitmap64,range_start bigint,range_end bigint bigint Return the cardinality of specified range (not include the range_end),negative range value will be internally converted to unsigned int64, and range_end = 0 means unlimited rb64_range_cardinality('{1,2,3}',2,3) 1
rb64_min roaringbitmap64 bigint Return the smallest offset in roaringbitmap64. Return NULL if the bitmap is empty rb64_min('{1,2,3}') 1
rb64_max roaringbitmap64 bigint Return the greatest offset in roaringbitmap64. Return NULL if the bitmap is empty rb64_max('{1,2,3}') 3
rb64_rank roaringbitmap64,bigint bigint Return the number of elements that are smaller or equal to the specified offset rb64_rank('{1,2,3}',3) 3
rb64_jaccard_dist roaringbitmap64,roaringbitmap64 double precision Return the jaccard distance(or the Jaccard similarity coefficient) of two bitmaps rb64_jaccard_dist('{1,2,3}','{3,4}') 0.25
rb64_select roaringbitmap64,bitset_limit bigint,bitset_offset bigint=0,reverse boolean=false,range_start bigint=0,range_end bigint=-1 roaringbitmap64 Return subset [bitset_offset,bitset_offset+bitset_limit) of bitmap between range [range_start,range_end),negative range value will be internally converted to unsigned int64, and range_end = 0 means unlimited rb64_select('{1,2,3,4,5,6,7,8,9}',5,2) {3,4,5,6,7}
rb64_to_array roaringbitmap64 bigint[] Convert roaringbitmap64 to bigint array rb64_to_array(roaringbitmap64('{1,2,3}')) {1,2,3}
rb64_iterate roaringbitmap64 SET of bigint Return set of bigint from a roaringbitmap64 data.
rb64_iterate(roaringbitmap64('{1,2,3}'))
1
2
3

Aggregation List

Function Input Output Desc Example Result
rb64_build_agg bigint roaringbitmap64 Build a roaringbitmap64 from a bigint set
select rb64_build_agg(id)
    from (values (1),(2),(3)) t(id)
{1,2,3}
rb64_or_agg roaringbitmap64 roaringbitmap64 AND Aggregate calculations from a roaringbitmap64 set
select rb64_or_agg(bitmap)
    from (values (roaringbitmap64('{1,2,3}')),
                 (roaringbitmap64('{2,3,4}'))
          ) t(bitmap)
{1,2,3,4}
rb64_and_agg roaringbitmap64 roaringbitmap64 AND Aggregate calculations from a roaringbitmap64 set
select rb64_and_agg(bitmap)
    from (values (roaringbitmap64('{1,2,3}')),
                 (roaringbitmap64('{2,3,4}'))
          ) t(bitmap)
{2,3}
rb64_xor_agg roaringbitmap64 roaringbitmap64 XOR Aggregate calculations from a roaringbitmap64 set
select rb64_xor_agg(bitmap)
    from (values (roaringbitmap64('{1,2,3}')),
                 (roaringbitmap64('{2,3,4}'))
          ) t(bitmap)
{1,4}
rb64_or_cardinality_agg roaringbitmap64 bigint OR Aggregate calculations from a roaringbitmap64 set, return cardinality.
select rb64_or_cardinality_agg(bitmap)
    from (values (roaringbitmap64('{1,2,3}')),
                 (roaringbitmap64('{2,3,4}'))
          ) t(bitmap)
4
rb64_and_cardinality_agg roaringbitmap64 bigint AND Aggregate calculations from a roaringbitmap64 set, return cardinality
select rb64_and_cardinality_agg(bitmap)
    from (values (roaringbitmap64('{1,2,3}')),
                 (roaringbitmap64('{2,3,4}'))
          ) t(bitmap)
2
rb64_xor_cardinality_agg roaringbitmap64 bigint XOR Aggregate calculations from a roaringbitmap64 set, return cardinality
select rb64_xor_cardinality_agg(bitmap)
    from (values (roaringbitmap64('{1,2,3}')),
                 (roaringbitmap64('{2,3,4}'))
          ) t(bitmap)
2

Cloud Vendor Support

pg_roaringbitmap is supported by the following cloud vendors

To request support for pg_roaringbitmap from other cloud vendors, please see the following:

About

RoaringBitmap extension for PostgreSQL

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C 87.4%
  • C++ 8.0%
  • PLpgSQL 4.4%
  • Other 0.2%