* The Java-based sketches are registered with the <b>Maven Central Repository</b>. For example: [DataSketches-Java](https://search.maven.org/search?q=datasketches-java).
* Extensive documentation with the systems developer in mind.
* Designed for production environments:
-* Available in multiple languages: Java, C++, [Python](https://github.com/apache/datasketches-python)
-* Binary compatible across systems and languages
+* Available in multiple languages: [Java](https://github.com/apache/datasketches-java), [C++](https://github.com/apache/datasketches-cpp), [Python](https://github.com/apache/datasketches-python), and [Go](https://github.com/apache/datasketches-go).
+* Binary compatible across systems and languages. For example, a sketch can be built and loaded in a C++ platform, then serialized and transported to a Java platform where it can be merged with other sketches and queried.
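The build, serialize, transport, and merge flow described in the last bullet can be illustrated with a toy example. This is only a sketch under stated assumptions: `ToyKmv`, its byte layout, and the `hash64` helper are hypothetical and are not the DataSketches API or its serialization format. The point is that a fixed, explicitly little-endian layout lets any platform or language rebuild the same state and merge it.

```python
import hashlib
import struct


def hash64(item: str) -> int:
    # Hypothetical helper: a stable 64-bit hash, identical on every platform.
    return int.from_bytes(hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest(), "little")


class ToyKmv:
    """Toy 'k minimum values' sketch: keeps the k smallest distinct hashes seen."""

    def __init__(self, k=64, values=()):
        self.k = k
        self.values = sorted(set(values))[:k]

    def update(self, item: str) -> None:
        h = hash64(item)
        if h not in self.values:
            self.values = sorted(self.values + [h])[: self.k]

    def merge(self, other: "ToyKmv") -> "ToyKmv":
        # The k smallest hashes of the union summarize the combined stream.
        return ToyKmv(min(self.k, other.k), set(self.values) | set(other.values))

    def to_bytes(self) -> bytes:
        # "<" forces little-endian with no padding, so the layout does not
        # depend on the machine that wrote it.
        return struct.pack(f"<HI{len(self.values)}Q", self.k, len(self.values), *self.values)

    @classmethod
    def from_bytes(cls, data: bytes) -> "ToyKmv":
        k, n = struct.unpack_from("<HI", data, 0)
        values = struct.unpack_from(f"<{n}Q", data, struct.calcsize("<HI"))
        return cls(k, values)


# "Platform A" builds a sketch and ships bytes; "platform B" restores and merges.
a = ToyKmv()
for i in range(100):
    a.update(f"user-{i}")
b = ToyKmv()
for i in range(50, 200):
    b.update(f"user-{i}")

wire = b.to_bytes()
merged = a.merge(ToyKmv.from_bytes(wire))
```

The merged result holds the same hashes as a single sketch fed every item directly, which is the property that makes transporting sketches between systems useful.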
### Built-In, General Purpose Functions
* General purpose [Memory Component]({{site.docs_dir}}/Memory/MemoryComponent.html) for managing data off the Java Heap.
This enables systems designers the ability to manage their own large data heaps with
dedicated processor threads that would otherwise put undue pressure on the Java heap and
-its garbage collection.
+its garbage collection. Starting with Java Version 9.0.0, this functionality is now native to the Java 25 language.
* General purpose implementaion of Austin Appleby's 128-bit MurmurHash3 algorithm,
with a number of useful extensions.
@@ -58,8 +60,7 @@ its garbage collection.
* Reproducible Characterization Studies
* All our published speed and accuracy performance results can be reproduced using the code included in the
*[Quantiles Sketch Overview]({{site.docs_dir}}/Quantiles/QuantilesSketchOverview.html). Get normal or inverse PDFs or CDFs of the distributions of any numeric value from your raw data in a single pass with well defined error bounds on the results.
-
-### Frequent Items
+#### [Four families of Quantile algorithms]({{site.docs_dir}}/QuantilesAll/QuantilesOverview.html)
+Get normal or inverse PDFs or CDFs of the distributions of any numeric value from your raw data in a single pass with well defined error bounds on the results.
+
+### Frequency
*[Frequent Items Sketches]({{site.docs_dir}}/Frequency/FrequencySketchesOverview.html) Get the most frequent items from a stream of items.
+*[CountMin sketch of Cormode and Muthukrishnan](https://github.com/apache/datasketches-java/blob/main/src/main/java/org/apache/datasketches/count/CountMinSketch.java)
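For readers unfamiliar with the Count-Min idea, here is a minimal sketch in Python. It is an illustration only: the class name, the `width` and `depth` parameters, and the salted BLAKE2 hashing are assumptions made for this example and do not reflect the linked Java implementation. Each item increments one counter per row, and the estimate is the minimum of those counters, so with non-negative weights it can overestimate a count but never underestimate it.

```python
import hashlib


class ToyCountMin:
    """Toy Count-Min sketch: `depth` rows of `width` counters."""

    def __init__(self, width: int, depth: int):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # One hash function per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item: str, weight: int = 1) -> None:
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += weight

    def estimate(self, item: str) -> int:
        # Collisions only ever add to a counter, so the minimum is the tightest bound.
        return min(self.table[row][self._index(item, row)] for row in range(self.depth))

    def merge(self, other: "ToyCountMin") -> None:
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must have the same shape to merge")
        for r in range(self.depth):
            for c in range(self.width):
                self.table[r][c] += other.table[r][c]


cms = ToyCountMin(width=256, depth=4)
for word in ["a"] * 5 + ["b"] * 2:
    cms.update(word)
```

Wider rows reduce collisions (tighter estimates) and more rows reduce the chance that every row collides, at the cost of memory.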
*[Reservoir Sampling]({{site.docs_dir}}/Sampling/ReservoirSampling.html) Knuth's well known Reservoir sampling "Algorithm R", but extended to enable merging across different sized reservoirs.
*[Weighted Sampling]({{site.docs_dir}}/Sampling/VarOptSampling.html) Edith Cohen's famous sampling algorithm that enables computing subset sums of weighted samples with optimum variance.
+*[Exact and Bounded Sampling Proportional to Size](https://github.com/apache/datasketches-java/blob/main/src/main/java/org/apache/datasketches/sampling/EbppsItemsSketch.java)
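As a point of reference for the sampling entries above, the classic "Algorithm R" that the Reservoir Sampling item builds on can be sketched in a few lines of Python. This is the textbook single-stream version only; the function name and the optional `rng` parameter are choices made for this example, and the library's extension for merging different sized reservoirs is not shown.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the n-th item with probability k/n by picking a slot in [0, n).
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(10_000), k=10, rng=random.Random(42))
```

Every item in the stream ends up in the sample with the same probability, and only `k` items are held in memory at any time.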