-
Notifications
You must be signed in to change notification settings - Fork 546
CXX-3234 refactor bsoncxx::v_noabi to use bsoncxx::v1 #1447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defining error condition mappings for v_noabi::error_code
<-> v1::*::errc
was considered but rejected due to too much effort and complexity for an API (v_noabi) which we intend to deprecate and remove (in favor of v1). It will also be more reasonable to consider defining mappings between component-specific v1::*::errc
error codes and also component-specific v2::*::errc
in the future, than to define mappings to a library-wide v_noabi::error_code
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, there is no way to general-case-extendably arrange for the v_noabi exception class to IS-A the v1 exception class (or for v1 exceptions to inherit v2 exceptions, etc.). As demonstrated by changes in this PR, any v_noabi exception boundaries must manually catch and rethrow potential v1 exceptions as a v_noabi exception instead to honor their documented "exception thrown by <abi>" contract. Users writing against both v_noabi and v1 API will need to catch std::system_error
instead of either v_noabi::exception
or v1::exception
to handle both cases. Perhaps we can/should consider removing these exception classes in favor of throwing generic std::system_error
with error codes instead...
template <bsoncxx::detail::endian = bsoncxx::detail::endian::native> | ||
void construct(float value); | ||
|
||
template <bsoncxx::detail::endian = bsoncxx::detail::endian::native> | ||
float convert() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new v1/detail/bit.hpp header is used to provide endianness checks. Instead of preprocessor macros, the compile-time conditional branch is implemented as template specializations (hopefully eventually replaced with C++17 if constexpr
).
/// @par Includes | ||
/// - @ref bsoncxx/v1/decimal128-fwd.hpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to CXX-3119: all v_noabi headers publically document the inclusion of v1 equivalent headers. This is not only due to the refactor, but also to set precedent for v_noabi headers to continue to provide root namespace redeclarations for users who do not require ABI stability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misunderstanding. Is including the v1 forwarding header needed? Removing the v1 include still compiles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not strictly necessary, yet. This is implementing in advance the header structure necessary to support declaring+providing v1 API via v_noabi headers for when root namespace redeclaration will be updated to v_noabi -> v1
(e.g. bsoncxx::decimal128
-> bsoncxx::v1::decimal128
).
#include <bsoncxx/decimal128-fwd.hpp> | ||
|
||
// | ||
|
||
#include <bsoncxx/v1/decimal128.hpp> | ||
#include <bsoncxx/v1/detail/type_traits.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep the scope of potential breaking changes minimal, this PR refrains from changing (removing) include directives even when undocumented. Following releases which publically document transitive includes, the next API major release will hopefully take the opportunity to perform a more thorough header hygiene cleanup.
That being said, component headers are reordered so they are always included first before others.
auto const deleter = ptr.get_deleter(); // bson_free_deleter | ||
|
||
SECTION("default") { | ||
(void)v_noabi{data, size, deleter}; // Unused. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to reacquire ownership the released data above for consistency with other test sections.
bool operator!=(types::bson_value::view const& v, element const& elem) { | ||
return !(elem == v); | ||
} | ||
// MSVC: `std::is_constructible<T, Args...>` does not work with using-declared conversion functions to class type...? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An odd MSVC bug where a using-declared conversion function to a class type does not seem to inherit the explicit
specifier. See: https://godbolt.org/z/9jTxMan5W
@@ -313,6 +314,38 @@ TEST_CASE("b_types", "[bsoncxx][v1][types][value]") { | |||
#pragma pop_macro("X") | |||
} | |||
|
|||
TEST_CASE("get_bson_value", "[bsoncxx][v1][types][value][internal]") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These test cases were added to document and compare the subtly different requirements concerning empty string behavior by v1's internal implementation vs. what is required by mongocxx's scoped_bson_value
utility (which is refactored to use v1::types::value
's internal bson_value_t
directly) when interfacing with CSFLE API.
// CSFLE API requires empty strings to be not-null. | ||
if (v.value_type == BSON_TYPE_UTF8 && v.value.v_utf8.str == nullptr) { | ||
v.value.v_utf8.str = static_cast<char*>(bson_malloc0(1u)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requirement is both unexpected and, when not satisfied, a major pain to diagnose. This problem manifests as inscrutible CSFLE errors such as:
-------------------------------------------------------------------------------
Corpus
-------------------------------------------------------------------------------
../src/mongocxx/test/v_noabi/client_side_encryption.cpp:1059
...............................................................................
../src/mongocxx/test/v_noabi/client_side_encryption.cpp:979: FAILED:
explicitly with message:
caught an exception for encrypting an allowed field payload=0,algo=rand: BSON
type invalid for encryption: generic server error
with stack trace (libmongocrypt 1.13.0, mongoc 912209d, ignoring mocks) pointing here (with bson_type == BSON_TYPE_EOD
):
_permitted_for_encryption (src/mongocrypt-ctx-encrypt.c:1739)
explicit_encrypt_init (src/mongocrypt-ctx-encrypt.c:1962)
mongocrypt_ctx_explicit_encrypt_init (src/mongocrypt-ctx-encrypt.c)
_mongoc_crypt_explicit_encrypt (src/libmongoc/src/mongoc/mongoc-crypt.c:1772)
mongoc_client_encryption_encrypt (src/libmongoc/src/mongoc/mongoc-client-side-encryption.c:2671)
mongocxx::v_noabi::client_encryption::impl::encrypt (src/mongocxx/lib/mongocxx/private/client_encryption.hh:111)
mongocxx::v_noabi::client_encryption::encrypt (src/mongocxx/lib/mongocxx/v_noabi/mongocxx/client_encryption:43)
_run_corpus_test (src/mongocxx/lib/mongocxx/v_noabi/mongocxx/client_encryption:971)
There may be insufficient input validation by mongoc, libmongocrypt, or both, but atm I am not sure which may be at fault. For now, this PR assumes it is mongocxx's responsibility and applies the necessary workaround to ensure v_utf8.str != nullptr
is always true even for empty strings. This issue is only observable by mongocxx due to accessing bsoncxx::v1::types::value
's bson_value_t
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which may be at fault
I expect this is related to this documented quirk in bson_append_utf8:
Due to legacy behavior, passing
NULL
for value appends a null value, not a UTF8 value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirming: am I correct that this results in no observed behavior change? The previous make_copy_for_libbson appears to also allocate for an empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should not have any observable behavior changes, since mongoc (and mongocrypt) should not assume null-termination for a bson_value_t
UTF-8 with explicit length. The bug described above was due to passing a null pointer despite length == 0
due to the legacy behavior you identified and linked above, which I believe to be the correct root cause. This allocation (formerly unconditional, now only on-demand when passed to CSFLE API) preserves the not-null guarantee to avoid this legacy behavior.
value v{b_string{bsoncxx::stdx::string_view{}}}; | ||
REQUIRE(v.get()->value_type == BSON_TYPE_UTF8); | ||
auto const& v_utf8 = v.get()->value.v_utf8; | ||
CHECK(v_utf8.len == 0u); | ||
CHECK(static_cast<void const*>(v_utf8.str) != nullptr); | ||
CHECK(v_utf8.str[0] == '\0'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These test cases were added to document and compare the subtly different requirements concerning empty string behavior by v1's internal implementation vs. what is required by mongocxx's scoped_bson_value
utility (which is refactored to use v1::types::value
's internal bson_value_t
directly) when interfacing with CSFLE API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Posting initial comments.
v1::document::view _view; | ||
std::size_t _length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the current behavior allows passing a larger buffer length:
// Document is 12 bytes { 'x': 1}. Buffer is 13 bytes (extra 0xFF).
std::uint8_t const data[] = {12, 0, 0, 0, 16, 'x', '\0', 1, 0, 0, 0, 0, 0xFF};
auto v = bsoncxx::v_noabi::document::view(data, sizeof(data));
// Returns the passed length:
REQUIRE(v.length() == 13);
Though I would rather discourage it. Iterating quietly fails:
REQUIRE(v.begin() == v.end());
Consider adding a warning to the v_noabi
view constructors accepting a length:
/// @warning For backward compatibility, this function does NOT check if @p length equals the embedded length in the BSON bytes.
bson_iter_t iter{}; | ||
bson_iter_init_from_data_at_offset(&iter, e.raw(), e.length(), e.offset(), e.keylen()); | ||
(void)bson_iter_init_from_data_at_offset(&iter, e.raw(), e.length(), e.offset(), e.keylen()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I slightly prefer checking the return. I expect the unchecked return could result in quiet failures. Example:
// Document is {'x': 'y'}, except length for 'y' is incorrect.
std::uint8_t const data[] = {
14, 0, 0, 0, // Document length.
2, 'x', '\0', // String and key.
123, 0, 0, 0, // Bad length for value.
'y', 0,
0
};
auto v = bsoncxx::v_noabi::document::view(data, sizeof(data));
REQUIRE(v.begin() == v.end()); // No error indicated!?
Though I would rather fix in a separate PR to avoid including behavior changes with the refactor. Reported: CXX-3333.
// CSFLE API requires empty strings to be not-null. | ||
if (v.value_type == BSON_TYPE_UTF8 && v.value.v_utf8.str == nullptr) { | ||
v.value.v_utf8.str = static_cast<char*>(bson_malloc0(1u)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which may be at fault
I expect this is related to this documented quirk in bson_append_utf8:
Due to legacy behavior, passing
NULL
for value appends a null value, not a UTF8 value.
// CSFLE API requires empty strings to be not-null. | ||
if (v.value_type == BSON_TYPE_UTF8 && v.value.v_utf8.str == nullptr) { | ||
v.value.v_utf8.str = static_cast<char*>(bson_malloc0(1u)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirming: am I correct that this results in no observed behavior change? The previous make_copy_for_libbson appears to also allocate for an empty string.
/// @par Includes | ||
/// - @ref bsoncxx/v1/decimal128-fwd.hpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misunderstanding. Is including the v1 forwarding header needed? Removing the v1 include still compiles.
A closer inspection of abi-compliance-checker's API compatibility report identified some unintended API breaking changes. First, the return type for
This can break existing code in the following manner: auto example(v_noabi::types::bson_value::view const& v)
-> v_noabi::types::b_int32 const&
{
return v.get_int32(); // Before: reference to this->_b_int32.
// After: dangling reference to temporary b_int32. :(
} Therefore, partially reverted the initial changes and refactored the implementation so that The
This one is comparatively trivial in both scope and fix. These are simply changed back to There are two "private" (and undocumented) hidden friend functions which were overlooked and omitted by the initial changes:
These are clearly intended to be private API, and I was misled by their declaration in Due to On the other hand, I do not believe the remaining issues in the API compatibility report should be problematic given our circumstances, as they mostly concern user-provided ctors/dtors and implicit defaulted ctors which are now made trivial (reported as "removed"...?) as well as changes to set of implicit template instantiations and specializations (whose template definitions are visible and/or not owned by us). There is a lot of noise in the v_noabi API compatibility report due to v1 API being included in the unstable ABI report despite the list of headers to analyze being limited to v_noabi headers. The XML descriptors may need to be updated to better filter results to v_noabi API only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
A closer inspection of abi-compliance-checker's API compatibility report
Great catch. I am happy to see the compatibility tasks helped to identify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, with only a few very minor comments.
struct uninit_type {}; | ||
|
||
oid(uninit_type) : _bytes{} {} // Required by bsoncxx::v_noabi::oid::oid(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: rename to zeroinit_type
? This value-initiailizes the _bytes
to all zeroes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The zero-initialization is meant to be an implementation detail that avoids language-level undefined behavior (in the spirit of -ftrivial-auto-var-init=zero
and to avoid triggering static analysis warnings for uninitialized data members), but it's still meant to be library-level undefined behavior to access _bytes
after initializing with uninit_type
, since the elements are expected to be unconditionally overwritten by v_noabi::oid::oid()
anyways. I expect/hope the compiler should be able to see this unconditional write-without-read and optimize accordingly (ignoring exception paths):
v_noabi::oid::oid()
// Zeroed bytes are never read after this initialization.
: _oid{v1::oid::internal::make_oid_without_init()}
{
// Unconditionally overwritten.
_oid = v1::oid{};
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, to avoid the "uninit" confusion, adopted the *_for_overwrite
suffix from std::make_unique_for_overwrite
, which hopefully makes the intention much clearer.
value(std::vector<unsigned char> v, binary_sub_type const sub_type = {}); | ||
/* explicit(false) */ | ||
value(std::vector<unsigned char> v, v_noabi::binary_sub_type const sub_type = {}) | ||
: _value{reinterpret_cast<std::uint8_t const*>(v.data()), v.size(), to_v1(sub_type)} {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be taking the vector parameter by-value, or by const&
?
|
||
template <> | ||
inline void float32::construct<bsoncxx::detail::endian::little>(float value) { | ||
std::memcpy(bytes, &value, sizeof value); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are template specializations required, or can this just be a regular if
branch in a single function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specializations are to avoid "conditional expression is constant" compiler warnings without needing to use warning suppressions. This should really be an if constexpr
, but alas...
template <typename From, typename To> | ||
struct is_explicitly_convertible : bsoncxx::detail::conjunction< | ||
std::is_constructible<To, From>, | ||
bsoncxx::detail::negation<std::is_convertible<From, To>>> {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tweak name: implicit-convertbile should imply is_explicit_convertible
, but this checks that it is also not implicit-convertible. Maybe is_only_explicitly_convertible
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of these type traits is to more accurately assert the presence or absence of explicit
for a given constructor or conversion function while also avoiding the To
vs. From
ordering inconsistency (which I frequently mixed up when writing the relevant static assertions prior to defining these helpers):
struct From {};
struct To { /* explicit(false) */ To(From); };
static_assert(std::is_constructible_v<To, From>, "OK...?"); // Implies nothing about `explicit`.
static_assert(std::is_convertible_v<From, To>, "there is no `explicit`");
static_assert(is_explicitly_convertible_v<From, To>, "there is an `explicit`"); // Fails: a useful check!
static_assert(is_implicitly_convertible_v<From, To>, "there is no `explicit`");
I considered naming them has_(implicit|explicit)_ctor
, but this would not account for conversion functions (indistinguishable from ctors by usage syntax alone). I also considered has_explicit
+ has_no_explicit
, but then there'd be a no_
in a type trait identifier... 😑
Therefore I ended up with the current names for the best symmetry: is_only_explicitly_convertible
would imply an is_explicitly_convertible
("not only explicit")... but that would just be equivalent to std::is_constructible<To, From>
(providing no additional value). I don't think we want/need an is_only_implicitly_convertible
either (is that even possible?).
That being said, added additional notes to their documentation to clarify that std::is_constructible
and std::is_convertible
should be used if the additional implications are not desirable.
Summary
Resolves CXX-3234. Followup to #1430 and #1445.
This is 4 out of an estimated 7 major PRs which in total are expected to resolve CXX-2745 and CXX-3320
This PR refactors the bsoncxx v_noabi interfaces and implementations to:
Important
There should be no API breaking changes in this refactor!
Unlike prior bsoncxx v1 PRs thus far, this PR is the first case where the changes to v_noabi and v1 are not an example of how similar changes would be implemented for v1 and v2, since v1 is a stable ABI namespace. This refactor takes advantage of v_noabi's lack of ABI stability guarantees to reduce code duplication and re-implement v_noabi in terms of v1 as much as possible. However, when v2 API is introduced, v1 will not be able to be refactored to the same extent, although ABI+API compatible refactors would of course still be permitted.
Note
Extending support for
BSONCXX_API_OVERRIDE_DEFAULT_ABI
is not in scope for this PR. All changes to the v_noabi test suite in this PR are additions only (excluding CXX-2120-related error message assertions), meaning there should be no API breaking changes. Extending support forBSONCXX_API_OVERRIDE_DEFAULT_ABI
will necessarily need to be accompanied by corresponding changes to test code to support the API breaking change caused by the changes to root namespace redeclarations.Per local testing, this PR reduces the ABI footprint for
bsoncxx::v_noabi
from 326 symbols to 169 (-48.2%). Excluding the symbols corresponding to components refactored by this PR (which have v1 counterparts), remaining symbols include:bsoncxx::v_noabi::builder::*
: pending CXX-3275bsoncxx::v_noabi::*_json()
+ related: out-of-scope for initial v1 implementation.bsoncxx::v_noabi::*view_or_value
: to be removed in favor of CXX-1546, CXX-1827, etc.bsoncxx::v_noabi::validate
+ related: out-of-scope for initial v1 implementation.bsoncxx::v_noabi::vector
: out-of-scope for initial v1 implementation.