@@ -6,7 +6,136 @@ fast-float
6
6
[ ![ Documentation] ( https://docs.rs/fast-float/badge.svg )] ( https://docs.rs/fast-float )
7
7
[ ![ Apache 2.0] ( https://img.shields.io/badge/License-Apache%202.0-blue.svg )] ( https://opensource.org/licenses/Apache-2.0 )
8
8
[ ![ MIT] ( https://img.shields.io/badge/License-MIT-blue.svg )] ( https://opensource.org/licenses/MIT )
9
- [ ![ Rust 1.47+] ( https://img.shields.io/badge/rustc-1.47+-lightgray.svg )] ( https://blog.rust-lang.org/2020/10/08/Rust-1.47.html )
9
+ [ ![ Rustc 1.37+] ( https://img.shields.io/badge/rustc-1.37+-lightgray.svg )] ( https://blog.rust-lang.org/2019/08/15/Rust-1.37.0.html )
10
+
11
+ This crate provides a super-fast decimal number parser from strings into floats.
12
+
13
+ ``` toml
14
+ [dependencies]
15
+ fast-float = "0.1"
16
+ ```
17
+
18
+ There are no dependencies and the crate can be used in a `no_std` context by disabling the `"std"` feature.
19
+
20
+ *Compiler support: rustc 1.37+.*
21
+
22
+ ## Usage
23
+
24
+ There are two top-level functions provided:
25
+ [ ` parse() ` ] ( https://docs.rs/fast-float/latest/fast_float/fn.parse.html ) and
26
+ [ ` parse_partial() ` ] ( https://docs.rs/fast-float/latest/fast_float/fn.parse_partial.html ) , both taking
27
+ either a string or a byte slice and parsing the input into either `f32` or `f64`:
28
+
29
+ - ` parse() ` treats the whole string as a decimal number and returns an error if there are
30
+ invalid characters or if the string is empty.
31
+ - ` parse_partial() ` tries to find the longest substring at the beginning of the given input
32
+ string that can be parsed as a decimal number and, in the case of success, returns the parsed
33
+ value along with the number of characters processed; an error is returned if the string doesn't
34
+ start with a decimal number or if it is empty. This function is most useful as a building
35
+ block when constructing more complex parsers, or when parsing streams of data.
36
+
37
+ Example:
38
+
39
+ ``` rust
40
+ // Parse the entire string as a decimal number.
+ let s = "1.23e-02";
+ let x: f32 = fast_float::parse(s).unwrap();
+ assert_eq!(x, 0.0123);
+
+ // Parse as many characters as possible as a decimal number.
+ let s = "1.23e-02foo";
+ let (x, n) = fast_float::parse_partial::<f32, _>(s).unwrap();
+ assert_eq!(x, 0.0123);
+ assert_eq!(n, 8);
+ assert_eq!(&s[n..], "foo");
51
+ ```
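To make the "longest prefix" behaviour of `parse_partial()` more concrete, here is a minimal standard-library-only sketch of the idea. It is not the crate's implementation: the helper name `parse_prefix` is hypothetical, it handles only `f64`, it ignores `inf`/`nan`, and it hands the final conversion to `str::parse` instead of the fast algorithm. How the crate treats edge cases such as a dangling `e` is not covered by this sketch.

```rust
/// Hypothetical helper (not part of the crate): find the longest prefix of `s`
/// that looks like `[sign] digits [. digits] [e [sign] digits]`, parse it with
/// the standard library, and return the value with the number of bytes used.
fn parse_prefix(s: &str) -> Option<(f64, usize)> {
    let b = s.as_bytes();
    let mut i = 0;

    // Optional sign.
    if i < b.len() && (b[i] == b'+' || b[i] == b'-') {
        i += 1;
    }

    // Integer digits.
    let int_start = i;
    while i < b.len() && b[i].is_ascii_digit() {
        i += 1;
    }
    let mut digits = i - int_start;

    // Optional fraction: only consumed if the mantissa has at least one digit.
    if i < b.len() && b[i] == b'.' {
        let frac_start = i + 1;
        let mut j = frac_start;
        while j < b.len() && b[j].is_ascii_digit() {
            j += 1;
        }
        digits += j - frac_start;
        if digits > 0 {
            i = j;
        }
    }
    if digits == 0 {
        return None; // no decimal number at the start of the input
    }

    // Optional exponent: only consumed if at least one exponent digit follows.
    if i < b.len() && (b[i] == b'e' || b[i] == b'E') {
        let mut j = i + 1;
        if j < b.len() && (b[j] == b'+' || b[j] == b'-') {
            j += 1;
        }
        let exp_start = j;
        while j < b.len() && b[j].is_ascii_digit() {
            j += 1;
        }
        if j > exp_start {
            i = j;
        }
    }

    // Everything consumed so far is ASCII, so `i` is a valid char boundary.
    s[..i].parse::<f64>().ok().map(|x| (x, i))
}
```

A caller building a larger parser can use the returned length to continue with the rest of the input, for example `&s[n..]`, exactly as in the example above.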
52
+
53
+ ## Details
54
+
55
+ This crate is a direct port of Daniel Lemire's [ ` fast_float ` ] ( https://github.com/fastfloat/fast_float )
56
+ C++ library (valuable discussions with Daniel while porting it helped shape the crate and get it to
57
+ the performance level it's at now), with some Rust-specific tweaks. Please see the original
58
+ repository for many useful details regarding the algorithm and the implementation.
59
+
60
+ The parser is locale-independent. The resulting value is the closest floating-point value (either
61
+ `f32` or `f64`), using the "round to even" convention for values that would otherwise fall right in-between
62
+ two values. That is, we provide exact parsing according to the IEEE standard.
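As a small illustration of what "round to even" means for ties, the sketch below parses decimal strings that sit exactly halfway between two representable floats. To keep it self-contained it uses the standard library's parser as a stand-in for any correctly rounding parser; the helper names `nearest_f32` and `nearest_f64` are hypothetical.

```rust
/// Hypothetical helper: parse with the standard library, used here only as a
/// stand-in for a correctly rounding decimal-to-float parser.
fn nearest_f32(s: &str) -> f32 {
    s.parse::<f32>().expect("valid decimal literal")
}

/// Same as above, for 64-bit floats.
fn nearest_f64(s: &str) -> f64 {
    s.parse::<f64>().expect("valid decimal literal")
}

/// Tie cases for `f32`, which has a 24-bit significand, so above 2^24
/// only even integers are representable.
fn f32_tie_examples() -> (f32, f32) {
    // 16777217 = 2^24 + 1 lies halfway between 16777216 and 16777218;
    // the neighbour with the even significand is 16777216.
    let down = nearest_f32("16777217");
    // 16777219 lies halfway between 16777218 and 16777220;
    // here the neighbour with the even significand is 16777220.
    let up = nearest_f32("16777219");
    (down, up)
}
```

The same reasoning applies to `f64` at 2^53: the string `"9007199254740993"` is a tie and resolves to `9007199254740992.0`.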
63
+
64
+ Infinity and NaN values can be parsed, along with scientific notation.
65
+
66
+ Both little-endian and big-endian platforms are equally supported, with extra optimizations enabled
67
+ on little-endian architectures.
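To give a feel for why byte order can matter, here is a sketch of a well-known "SIMD within a register" trick for converting eight ASCII digits at once, of the kind described in the original fast_float work. Whether this crate uses exactly this form is an assumption, and the function name `parse_eight_digits` is hypothetical. The arithmetic expects the first digit in the least significant byte, which is the natural in-memory layout on little-endian machines; `u64::from_le_bytes` produces that layout on any platform.

```rust
/// Hypothetical helper: convert exactly eight ASCII digits to their value
/// using a few 64-bit operations instead of a per-digit loop.
fn parse_eight_digits(chunk: &[u8; 8]) -> Option<u64> {
    // Plain validity check; real implementations often do this with bit tricks too.
    if !chunk.iter().all(|b| b.is_ascii_digit()) {
        return None;
    }

    const MASK: u64 = 0x0000_00FF_0000_00FF;
    const MUL1: u64 = 0x000F_4240_0000_0064; // 100 + (1_000_000 << 32)
    const MUL2: u64 = 0x0000_2710_0000_0001; // 1 + (10_000 << 32)

    // First digit ends up in the least significant byte.
    let mut v = u64::from_le_bytes(*chunk);
    // Turn each ASCII byte into its digit value 0..=9.
    v -= 0x3030_3030_3030_3030;
    // Even-numbered bytes now hold two-digit pairs: 10 * d[k] + d[k + 1].
    v = v * 10 + (v >> 8);
    // Combine the four pairs; the result accumulates in the upper 32 bits.
    let v1 = (v & MASK).wrapping_mul(MUL1);
    let v2 = ((v >> 16) & MASK).wrapping_mul(MUL2);
    Some(v1.wrapping_add(v2) >> 32)
}
```

On a little-endian target the `from_le_bytes` call compiles down to a plain 8-byte load, while a big-endian target needs an extra byte swap first.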
68
+
69
+ ## Performance
70
+
71
+ The presented parser seems to beat all of the existing C/C++/Rust float parsers known to us at the
72
+ moment by a large margin, in all of the datasets we tested it on so far – see detailed benchmarks
73
+ below (the only exception being the original fast_float C++ library, of course – performance of
74
+ which is within noise bounds of this crate). On modern machines, parsing throughput can reach
75
+ up to 1GB/s.
76
+
77
+ In particular, it is faster than the Rust standard library's `FromStr::from_str()` by a factor of 2-8x
78
+ (larger factor for longer float strings).
79
+
80
+ While various details regarding the algorithm can be found in the repository for the original
81
+ C++ library, here are a few brief notes:
82
+
83
+ - The parser is specialized to work lightning-fast on inputs with at most 19 significant digits
84
+ (which constitutes the so-called "fast-path"). We believe that most real-life inputs should
85
+ fall under this category, and we treat longer inputs as "degenerate" edge cases since it
86
+ inevitably causes overflows and loss of precision.
87
+ - If the significand happens to be longer than 19 digits, the parser falls back to the "slow path",
88
+ in which case its performance roughly matches that of the top Rust/C++ libraries (and still
89
+ beats them most of the time, although not by a lot).
90
+ - On little-endian systems, there are additional optimizations for numbers with more than 8 digits
91
+ after the decimal point.
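The 19-digit limit in the notes above has a simple arithmetic reason: the largest 19-digit number, 10^19 - 1, still fits in a `u64`, while a 20-digit number may not. The following sketch shows digit accumulation under that assumption; the name `accumulate_digits` and the constant `MAX_FAST_DIGITS` are hypothetical and not taken from the crate.

```rust
/// Assumed fast-path limit: 10^19 - 1 < 2^64, so 19 digits never overflow a u64.
const MAX_FAST_DIGITS: usize = 19;

/// Hypothetical helper: accumulate the significant digits of `digits` into a
/// `u64`. Returns the value built from at most the first 19 significant digits
/// and a flag saying whether all significant digits fit (the "fast path").
fn accumulate_digits(digits: &str) -> Option<(u64, bool)> {
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Leading zeros do not count as significant digits.
    let significant = digits.trim_start_matches('0');
    let fits = significant.len() <= MAX_FAST_DIGITS;

    let mut value: u64 = 0;
    for b in significant.bytes().take(MAX_FAST_DIGITS) {
        // Cannot overflow: at most 19 digits are accumulated.
        value = value * 10 + u64::from(b - b'0');
    }
    Some((value, fits))
}
```

When the flag is `false`, the truncated value alone is not enough to guarantee correct rounding, which is why a separate slow path is needed for such inputs.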
92
+
93
+ ## Benchmarks
94
+
95
+ Below is a table of average timings, in nanoseconds, for parsing a single number
96
+ into a 64-bit float.
97
+
98
+ |                  | `canada` | `mesh` | `uniform` | `iidi` | `iei`  | `rec32` |
+ | ---------------- | -------- | ------ | --------- | ------ | ------ | ------- |
+ | fast-float       | 22.08    | 11.10  | 20.04     | 40.77  | 26.33  | 29.84   |
+ | lexical          | 61.63    | 25.10  | 53.77     | 72.33  | 53.39  | 72.40   |
+ | lexical/lossy    | 61.51    | 25.24  | 54.00     | 71.30  | 52.87  | 71.71   |
+ | from_str         | 175.07   | 22.58  | 103.00    | 228.78 | 115.76 | 211.13  |
+ | fast_float (C++) | 22.78    | 10.99  | 20.05     | 41.12  | 27.51  | 30.85   |
+ | abseil (C++)     | 42.66    | 32.88  | 46.01     | 50.83  | 46.33  | 49.95   |
+ | netlib (C++)     | 57.53    | 24.86  | 64.72     | 56.63  | 36.20  | 67.29   |
+ | strtod (C)       | 286.10   | 31.15  | 258.73    | 295.73 | 205.72 | 315.95  |
108
+
109
+ Parsers:
110
+
111
+ - `fast-float` – this very crate
112
+ - ` lexical ` – from ` lexical_core ` crate, v0.7
113
+ - `lexical/lossy` – from `lexical_core` crate, v0.7 (lossy parser)
114
+ - ` from_str ` – Rust standard library, ` FromStr ` trait
115
+ - `fast_float (C++)` – original C++ implementation of the 'fast-float' method
116
+ - ` abseil (C++) ` – Abseil C++ Common Libraries
117
+ - ` netlib (C++) ` – C++ Network Library
118
+ - ` strtod (C) ` – C standard library
119
+
120
+ Datasets:
121
+
122
+ - ` canada ` – numbers in ` canada.txt ` file
123
+ - ` mesh ` – numbers in ` mesh.txt ` file
124
+ - ` uniform ` – uniform random numbers from 0 to 1
125
+ - ` iidi ` – random numbers of format ` %d%d.%d `
126
+ - ` iei ` – random numbers of format ` %de%d `
127
+ - ` rec32 ` – reciprocals of random 32-bit integers
128
+
129
+ Notes:
130
+
131
+ - Test environment: macOS 10.14.6, clang 11.0, Rust 1.49, 3.5 GHz i7-4771 Haswell.
132
+ - The two test files referred to above can be found in
133
+ [ this] ( https://github.com/lemire/simple_fastfloat_benchmark ) repository.
134
+ - The Rust part of the table (along with a few other benchmarks) can be generated via
135
+ the benchmark tool that can be found under ` extras/simple-bench ` of this repo.
136
+ - The C/C++ part of the table (along with a few other benchmarks and parsers) can be
137
+ generated via a C++ utility that can be found in [ this] ( https://github.com/lemire/simple_fastfloat_benchmark )
138
+ repository.
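For readers who want a rough feel for how a per-number average like those in the table can be measured, here is a minimal timing sketch. It is not the `extras/simple-bench` tool: the function name `average_ns_per_parse` is hypothetical, it times the standard library's `str::parse` as a stand-in parser, and it does none of the warm-up or statistical treatment a real benchmark needs.

```rust
use std::time::Instant;

/// Hypothetical helper: parse every input `repeats` times and return the
/// average time per parse in nanoseconds, plus a checksum of the parsed
/// values so the compiler cannot discard the work.
fn average_ns_per_parse(inputs: &[&str], repeats: u32) -> (f64, f64) {
    let mut checksum = 0.0_f64;
    let start = Instant::now();
    for _ in 0..repeats {
        for s in inputs {
            // Stand-in parser; unparsable inputs count as 0.0 in the checksum.
            checksum += s.parse::<f64>().unwrap_or(0.0);
        }
    }
    let elapsed = start.elapsed();
    let total_parses = (inputs.len() as f64) * f64::from(repeats);
    (elapsed.as_nanos() as f64 / total_parses, checksum)
}
```

Numbers obtained this way depend heavily on the machine, the compiler flags, and the input data, so they are only comparable to the table above in a loose sense.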
10
139
11
140
<br >
12
141