[May 24th 2025] Merge changes from upstream #3723
Open: powerboat9 wants to merge 10,000 commits into Rust-GCC:master from powerboat9:merge-3
Conversation
GitHub appears to be having issues displaying the diff, considering the amount of commits or files involved I'd guess. I'm not sure how we'd fix that, so we might have to just work around it.

Looks like one of the tests is failing -- any ideas?
…targets

Many tests became unsupported on aarch64 when -mcpu=unset was added to several arm_* effective targets, because this flag is only supported on arm. Since these effective targets are used on both arm and aarch64, the patch adds -mcpu=unset on arm only, and restores "" on aarch64.

This re-enables lots of tests:
  advsimd-intrinsics/vqrdmlah
  fp16 tests
  dotprod tests
  i8mm tests
  aarch64/simd/vmmla.c
  bf16 tests
  gcc.dg/vect/complex tests

With this change, a few more failures appear, but should be fixed separately:

FAIL: gcc.dg/vect/complex/fast-math-complex-mls-double.c -flto -ffat-lto-objects scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-double.c scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-float.c -flto -ffat-lto-objects scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-float.c scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-half-float.c -flto -ffat-lto-objects scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-half-float.c scan-tree-dump vect "Found COMPLEX_ADD_ROT270"

gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_effective_target_arm_v8_1a_neon_ok_nocache): Use -mcpu=unset on arm only. (check_effective_target_arm_v8_2a_fp16_scalar_ok_nocache): Likewise. (check_effective_target_arm_v8_2a_fp16_neon_ok_nocache): Likewise. (check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache): Likewise. (check_effective_target_arm_v8_2a_i8mm_ok_nocache): Likewise. (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): Likewise. (check_effective_target_arm_v8_3a_complex_neon_ok_nocache): Likewise. (check_effective_target_arm_v8_3a_fp16_complex_neon_ok_nocache): Likewise.
The kernel developers have requested such a constraint to use csrxchg in inline assembly. gcc/ChangeLog: * doc/md.texi: Document the 'q' constraint for LoongArch.
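For illustration, a minimal sketch of how such a constraint might be used from inline assembly; the operand roles follow the csrxchg encoding, but the wrapper function and the CSR number (0x0, CRMD) are illustrative assumptions, not taken from the kernel patch:

```c++
// Sketch only: 'q' is the new constraint for csrxchg's mask operand;
// the CSR number 0x0 (CRMD) is a hypothetical example.
static inline unsigned long
csrxchg_crmd (unsigned long val, unsigned long mask)
{
  __asm__ volatile ("csrxchg %0, %1, 0x0"
                    : "+r" (val)    // rd: new field bits in, old CSR value out
                    : "q" (mask));  // rj: mask of bits to exchange
  return val;
}
```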
…xtended targets

Shifts are the only special case I'm aware of where the most significant limb (if it has padding bits) is accessed inside of a loop, or accessed outside of a loop but with variable idx. Everything else should access the most significant limb using INTEGER_CST idx and thus can (and should) deal with the needed extension on that access directly. And RSHIFT_EXPR shouldn't really violate the content of the padding bits.

For LSHIFT_EXPR we should IMHO do the following (which fixes the testcase on s390x-linux). The LSHIFT_EXPR is

  /* Lower
       dst = src << n;
     as
       unsigned n1 = n % limb_prec;
       size_t n2 = n / limb_prec;
       size_t n3 = n1 != 0;
       unsigned n4 = (limb_prec - n1) % limb_prec;
       size_t idx;
       size_t p = prec / limb_prec - (prec % limb_prec == 0);
       for (idx = p; (ssize_t) idx >= (ssize_t) (n2 + n3); --idx)
         dst[idx] = (src[idx - n2] << n1) | (src[idx - n2 - n3] >> n4);
       if (n1)
         {
           dst[idx] = src[idx - n2] << n1;
           --idx;
         }
       for (; (ssize_t) idx >= 0; --idx)
         dst[idx] = 0;
   */

as described in the comment (note, the comments are for the little-endian lowering only; I didn't want to complicate it with endianity). As can be seen, the most significant limb can be modified either inside of the loop, or in the if (n1) body if the loop had 0 iterations. In your patch you've modified, I believe, just the loop and not the if body, and made it conditional on every iteration (furthermore through gimplification of COND_EXPR, which is not the way this is done elsewhere in gimple-lower-bitint.cc; there is an if_then helper and it builds gimple_build_cond etc.). I think that is way too expensive. In theory we could peel off the first iteration manually and do the info->extended handling in there, and do it again inside of the if (n1) case if idx == (bitint_big_endian ? size_zero_node : p), but I think just doing the extension after the loops is easier.

Note, we don't need to worry about volatile here: the shift is done into an addressable variable's memory only if it is non-volatile; otherwise it is computed into a temporary and then copied over into the volatile var.

2025-05-22  Jakub Jelinek  <[email protected]>

* gimple-lower-bitint.cc (bitint_extended): New variable. (bitint_large_huge::lower_shift_stmt): For LSHIFT_EXPR with bitint_extended, if lhs has a most significant partial limb, extend it afterwards. * gcc.dg/bitintext.h: New file. * gcc.dg/torture/bitint-82.c: New test.
Because this constructor delegates to vector(a), the object has been fully constructed and the destructor will run if an exception happens. That means we need to set _M_finish == _M_start so that the destructor doesn't try to destroy any elements (see the sketch below). libstdc++-v3/ChangeLog: PR libstdc++/120367 * include/bits/stl_vector.h (_M_range_initialize): Initialize _M_impl._M_finish. * testsuite/23_containers/vector/cons/from_range.cc: Check with a type that throws on construction. Reviewed-by: Patrick Palka <[email protected]>
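A minimal sketch of the hazard, with simplified names and a fixed capacity for brevity; this is not the real libstdc++ layout. Once the delegated-to constructor returns, the object counts as fully constructed, so the destructor runs on a later exception and must find _M_finish already valid:

```c++
struct toy_vector {
  int *_M_start, *_M_finish;
  toy_vector () : _M_start (nullptr), _M_finish (nullptr) { }

  template<typename It>
  toy_vector (It first, It last) : toy_vector ()  // delegation: "fully constructed" now
  {
    _M_start = new int[16];
    _M_finish = _M_start;          // the fix: finish == start before anything can throw
    for (; first != last; ++first)
      *_M_finish++ = *first;       // element construction may throw for real types
  }

  ~toy_vector ()                   // runs even if the constructor body above throws
  {
    while (_M_finish != _M_start)  // garbage in _M_finish here would be UB
      --_M_finish;                 // stands in for destroying *--_M_finish
    delete[] _M_start;
  }
};
```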
These were fixed upstream by: uxlfoundation/oneDPL#534 uxlfoundation/oneDPL#546 libstdc++-v3/ChangeLog: * testsuite/util/pstl/test_utils.h (ForwardIterator::operator++): Fix return type. (BidirectionalIterator::operator++): Likewise. (BidirectionalIterator::operator--): Likewise.
libstdc++-v3/ChangeLog: * include/bits/allocated_ptr.h (_Scoped_allocation): New class template. Co-authored-by: Tomasz Kamiński <[email protected]> Signed-off-by: Tomasz Kamiński <[email protected]>
…ze_comparison. This is the first part of fixing PR target/120372. The current code for canonicalize_comparison uses gen_move_insn and rtx_cost to find out the cost of generating a constant. This is OK in most cases, except that sometimes the comparison instruction can handle different constants than a simple set instruction can. This changes the code to use rtx_cost directly with the outer code being COMPARE, just like how prepare_cmp_insn handles that. Note this is also a small speedup and a small memory improvement, because we are not creating a move for the constant any more. Since we are not creating a pseudo-register any more, this also removes the check on that. Also adds a dump so we can see why one choice was chosen over the other. Built and tested for aarch64-linux-gnu. gcc/ChangeLog: * expmed.cc (canonicalize_comparison): Use rtx_cost directly instead of gen_move_insn. Print out the choice if dump is enabled. Signed-off-by: Andrew Pinski <[email protected]>
The middle-end uses rtx_cost on constants with the outer code being COMPARE to find out the cost of constant formation for a comparison instruction. So for the aarch64 backend, we would just return the cost of constant formation in general. We can improve this by checking whether the outer code is COMPARE and, if the constant fits the constraints of the cmp instruction, setting the cost to that of one instruction. Built and tested for aarch64-linux-gnu. PR target/120372 gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_rtx_costs <case CONST_INSN>): Handle if outer is COMPARE and the constant can be handled by the cmp instruction. gcc/testsuite/ChangeLog: * gcc.target/aarch64/imm_choice_comparison-2.c: New test. Signed-off-by: Andrew Pinski <[email protected]>
Patch is originally from Siarhei Volkau <[email protected]>. RISC-V has a zero register (x0) which we can use to store zero into memory without loading the constant into a distinct register. Adjust the constraints of the 32-bit movdi_32bit pattern to recognize that we can store 0.0 into memory using x0 as the source register. This patch only affects RISC-V. It has been regression tested on riscv64-elf. Jeff has also tested this in his tester (riscv64-elf and riscv32-elf) with no regressions. PR target/70557 gcc/ * config/riscv/riscv.md (movdi_32bit): Add "J" constraint to allow storing 0 directly to memory.
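A sketch of the effect on rv32 (the function and the expected instruction sequence are illustrative, not taken from the patch):

```c++
// Storing 0.0 no longer needs a register pair loaded with zero; with
// the "J" constraint the expected rv32 codegen is roughly:
//     sw zero,0(a0)
//     sw zero,4(a0)
void
clear_double (double *p)
{
  *p = 0.0;
}
```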
So the next step in Shreya's work. In the prior patch we used two shifts to clear bits at the high or low end of an object. In this patch we use 3 shifts to clear bits on both ends. Nothing really special here. With mvconst_internal still in the tree it's of marginal value, though Shreya and I have confirmed the code coming out of expand looks good. It's just that combine reconstitutes the operation via mvconst_internal+and which looks cheaper. When I was playing in this space earlier I definitely saw testsuite cases that need this case handled to not regress with mvconst_internal removed. This has spun in my tester on rv32 and rv64 and it's bootstrap + testing on my BPI with a mere 23 hours to go. Waiting on pre-commit testing to render a verdict before moving forward. gcc/ * config/riscv/riscv.cc (synthesize_and): When profitable, use a three shift sequence to clear bits at both upper and lower bits rather than synthesizing the constant mask.
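As a scalar illustration of the idea (a hypothetical helper, not the synthesize_and code): keeping only bits [8, 47] of a 64-bit value would normally need the mask 0x0000ffffffffff00 synthesized into a register, but three shifts do the same job:

```c++
unsigned long
clear_both_ends (unsigned long x)
{
  x <<= 16;   // drop bits 48..63 off the top
  x >>= 24;   // drop (what were) bits 0..7 off the bottom
  x <<= 8;    // shift the surviving field back into place
  return x;   // == x & 0x0000ffffffffff00UL
}
```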
[aarch64] [vxworks] mark x18 as fixed, adjust tests VxWorks uses x18 as the TCB, so STATIC_CHAIN_REGNUM has long been set (in gcc/config/aarch64/aarch64-vxworks.h) to use x9 instead. This patch marks x18 as fixed if the newly-introduced TARGET_OS_USES_R18 is defined, so that it is not chosen by the register allocator, rejects -fsanitize=shadow-call-stack due to the register conflict, and adjusts tests that depend on x18 or on the static chain register. for gcc/ChangeLog * config/aarch64/aarch64-vxworks.h (TARGET_OS_USES_R18): Define. Update comments. * config/aarch64/aarch64.cc (aarch64_conditional_register_usage): Mark x18 as fixed on VxWorks. (aarch64_override_options_internal): Issue sorry message on -fsanitize=shadow-call-stack if TARGET_OS_USES_R18. for gcc/testsuite/ChangeLog * gcc.dg/cwsc1.c (CHAIN, aarch64): x9 instead of x18 for __vxworks. * gcc.target/aarch64/reg-alloc-4.c: Drop x18-assigned asm operand on vxworks. * gcc.target/aarch64/shadow_call_stack_1.c: Don't expect -ffixed-x18 error on vxworks, but rather the sorry message. * gcc.target/aarch64/shadow_call_stack_2.c: Skip on vxworks. * gcc.target/aarch64/shadow_call_stack_3.c: Likewise. * gcc.target/aarch64/shadow_call_stack_4.c: Likewise. * gcc.target/aarch64/shadow_call_stack_5.c: Likewise. * gcc.target/aarch64/shadow_call_stack_6.c: Likewise. * gcc.target/aarch64/shadow_call_stack_7.c: Likewise. * gcc.target/aarch64/shadow_call_stack_8.c: Likewise. * gcc.target/aarch64/stack-check-prologue-19.c: Likewise. * gcc.target/aarch64/stack-check-prologue-20.c: Likewise.
Since vxworks' libc contains much of libatomic, in not-very-granular modules, building all of libatomic doesn't work very well. However, some expected entry points are not present in libc, so arrange for libatomic to build only those missing bits. for libatomic/ChangeLog * configure.tgt: Set partial_libatomic on *-*-vxworks*. * configure.ac (PARTIAL_VXWORKS): New AM_CONDITIONAL. * Makefile.am (libatomic_la_SOURCES): Select few sources for PARTIAL_VXWORKS. * configure, Makefile.in: Rebuilt.
In cp_fold we do speculative constant evaluation of constexpr calls when inlining is enabled. Let's also do it for always_inline functions. PR c++/120935 gcc/cp/ChangeLog: * cp-gimplify.cc (cp_fold): Check always_inline. gcc/testsuite/ChangeLog: * g++.dg/opt/always_inline2.C: New test. * g++.dg/debug/dwarf2/pubnames-2.C: Suppress -fimplicit-constexpr. * g++.dg/debug/dwarf2/pubnames-3.C: Likewise.
Typo. gcc/testsuite/ChangeLog: * g++.dg/opt/always_inline2.C: Correct PR number.
Bit-fields are stored left-justified for big-endian targets. gcc/ * dwarf2out.cc (loc_list_from_tree_1) <COMPONENT_REF>: Add specific handling of bit-fields for big-endian targets.
It is used to specify which files are compiled with -gnato, but the switch has been the default for at least a decade. gcc/testsuite/ * ada/acats/overflow.lst: Delete. * ada/acats/run_all.sh: Do not process overflow.lst.
This patch fixes an ICE which occurs if a constant char is assigned into an integer array. The fix is to introduce type checking in M2GenGCC.mod:CodeXIndr. gcc/m2/ChangeLog: PR modula2/120389 * gm2-compiler/M2GenGCC.mod (CodeXIndr): Check to see that the type of left is assignment compatible with the type of right. gcc/testsuite/ChangeLog: PR modula2/120389 * gm2/iso/fail/badarray3.mod: New test. Signed-off-by: Gaius Mulley <[email protected]>
Add references to C23 subclauses to the documentation of implementation-defined behavior, and new entries for implementation-defined behavior new in C23; change some references in the text to e.g. "C99 and C11" to encompass C23 as well. Tested with "make info html pdf". * doc/implement-c.texi: Document C23 implementation-defined behavior. (Constant expressions implementation, Types implementation): New nodes.
ChangeLog: * MAINTAINERS: Add myself to write after approval and DCO.
…structions This patch modifies the shift expander to immediately lower constant shifts without unspec. It also modifies the ADR, SRA and ADDHNB patterns to match the lowered forms of the shifts, as the predicate register is not required for these instructions. Bootstrapped and regtested on aarch64-linux-gnu. Signed-off-by: Dhruv Chawla <[email protected]> Co-authored-by: Richard Sandiford <[email protected]> gcc/ChangeLog: * config/aarch64/aarch64-sve.md (@aarch64_adr<mode>_shift): Match lowered form of ashift. (*aarch64_adr<mode>_shift): Likewise. (*aarch64_adr_shift_sxtw): Likewise. (*aarch64_adr_shift_uxtw): Likewise. (<ASHIFT:optab><mode>3): Check amount instead of operands[2] in aarch64_sve_<lr>shift_operand. (v<optab><mode>3): Generate unpredicated shifts for constant operands. (@aarch64_pred_<optab><mode>): Convert to a define_expand. (*aarch64_pred_<optab><mode>): Create define_insn_and_split pattern from @aarch64_pred_<optab><mode>. (*post_ra_v_ashl<mode>3): Rename to ... (aarch64_vashl<mode>3_const): ... this and remove reload requirement. (*post_ra_v_<optab><mode>3): Rename to ... (aarch64_v<optab><mode>3_const): ... this and remove reload requirement. * config/aarch64/aarch64-sve2.md (@aarch64_sve_add_<sve_int_op><mode>): Match lowered form of SHIFTRT. (*aarch64_sve2_sra<mode>): Likewise. (*bitmask_shift_plus<mode>): Match lowered form of lshiftrt.
This patch folds the following pattern:

  lsl <y>, <x>, <shift>
  lsr <z>, <x>, <shift>
  orr <r>, <y>, <z>

to:

  revb/h/w <r>, <x>

when the shift amount is equal to half the bitwidth of the <x> register. Bootstrapped and regtested on aarch64-linux-gnu. Signed-off-by: Dhruv Chawla <[email protected]> Co-authored-by: Richard Sandiford <[email protected]> gcc/ChangeLog: * expmed.cc (expand_rotate_as_vec_perm): Avoid a no-op move if the target already provided the result in the expected register. * config/aarch64/aarch64.cc (aarch64_vectorize_vec_perm_const): Avoid forcing subregs into fresh registers unnecessarily. * config/aarch64/aarch64-sve.md: Add define_split for rotate. (*v_revvnx8hi): New pattern. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/shift_rev_1.c: New test. * gcc.target/aarch64/sve/shift_rev_2.c: Likewise. * gcc.target/aarch64/sve/shift_rev_3.c: Likewise.
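The scalar equivalent for 16-bit elements makes the equivalence easy to see (an illustrative function, not from the patch): rotating by half the element width is exactly a byte swap, hence a single REVB:

```c++
unsigned short
rot8 (unsigned short x)
{
  // lsl + lsr + orr with shift == 8 == width/2 ...
  return (unsigned short) ((x << 8) | (x >> 8));  // ... is a byte reverse
}
```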
…pile [PR118694] OpenMP's 'target teams' is strictly coupled with 'teams'; if the latter exists, the kernel is launched directly with multiple teams. Thus, the host has to know whether the teams construct exists or not. For

  #pragma omp target
  #pragma omp metadirective when (device={arch("nvptx")}: teams loop)

it is simple when 'nvptx' offloading is not supported; otherwise it depends on the default device at runtime, as the user code asks for a single team for host fallback and gcn offload and multiple teams for nvptx offload. In any case, this commit ensures that no FAIL is printed, whatever a future solution might look like. Instead of a dg-bogus combined with an 'xfail offload_target_nvptx', one can also argue that a dg-error for 'target offload_target_nvptx' would be more appropriate. libgomp/ChangeLog: PR middle-end/118694 * testsuite/libgomp.c-c++-common/metadirective-1.c: xfail when compiling (also) for nvptx offloading as an error is then expected.
…oads prop There are two places where forwprop replaces an original load with a few different loads. Both can set the vuse manually instead of relying on update_ssa. One handles a complex load followed only by REAL/IMAG_PART uses, and the other is very similar but for vector loads followed by BIT_FIELD_REF. Since this was the last place that needed to handle updating the SSA form, also remove TODO_update_ssa from the pass. gcc/ChangeLog: * tree-ssa-forwprop.cc (optimize_vector_load): Set the vuse manually on the new load statements. Also remove forward declaration since the definition is before the first use. (pass_forwprop::execute): Likewise for complex loads. (pass_data_forwprop): Remove TODO_update_ssa. Signed-off-by: Andrew Pinski <[email protected]>
Previously, format strings parsed with errors were being cached, such that subsequent uses of the format string were not checked for errors. PR libfortran/119856 libgfortran/ChangeLog: * io/format.c (parse_format_list): Set the fmt->error message for missing comma. (parse_format): Do not cache the parsed format string if a previous error occurred. gcc/testsuite/ChangeLog: * gfortran.dg/pr119856.f90: New test.
Move get_call_rtx_from to final.cc and have it call call_from_call_insn. PR other/120493 * final.cc (call_from_call_insn): Change the argument type to const rtx_call_insn *. (get_call_rtx_from): New. * rtl.h (is_a_helper <const rtx_call_insn *>::test): New. (get_call_rtx_from): Moved to the final.cc section. * rtlanal.cc (get_call_rtx_from): Removed. Signed-off-by: H.J. Lu <[email protected]>
This patch fixes the typo in the test case `param-autovec-mode.c` in the RISC-V autovec testsuite. The option `autovec-mode` is changed to `riscv-autovec-mode` to match the expected parameter name. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/param-autovec-mode.c: Change `autovec-mode` to `riscv-autovec-mode` in dg-options.
Tobias had noted that the C front end was not treating C23 constexprs as constant in the user/condition selector property, which led to missed opportunities to resolve metadirectives at parse time. Additionally neither C nor C++ was permitting the expression to have pointer or floating-point type -- the former being a common idiom in other C/C++ conditional expressions. By using the existing front-end hooks for the implicit conversion to bool in conditional expressions, we also get free support for using a C++ class object that has a bool conversion operator in the user/condition selector. gcc/c/ChangeLog * c-parser.cc (c_parser_omp_context_selector): Call convert_lvalue_to_rvalue and c_objc_common_truthvalue_conversion on the expression for OMP_TRAIT_PROPERTY_BOOL_EXPR. gcc/cp/ChangeLog * cp-tree.h (maybe_convert_cond): Declare. * parser.cc (cp_parser_omp_context_selector): Call maybe_convert_cond and fold_build_cleanup_point_expr on the expression for OMP_TRAIT_PROPERTY_BOOL_EXPR. * pt.cc (tsubst_omp_context_selector): Likewise. * semantics.cc (maybe_convert_cond): Remove static declaration. gcc/testsuite/ChangeLog * c-c++-common/gomp/declare-variant-2.c: Update expected output. * c-c++-common/gomp/metadirective-condition-constexpr.c: New. * c-c++-common/gomp/metadirective-condition.c: New. * c-c++-common/gomp/metadirective-error-recovery.c: Update expected output. * g++.dg/gomp/metadirective-condition-class.C: New. * g++.dg/gomp/metadirective-condition-template.C: New.
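A sketch of the kind of code that now resolves at parse time; the directive spelling follows recent OpenMP versions and is an assumption rather than a quote from the new tests:

```c++
constexpr bool use_parallel = true;  // constant per C++ (or a C23 constexpr in C)

void
f (int *a, int n)
{
  // With the fix, the user/condition selector is folded at parse time,
  // so no runtime metadirective dispatch is emitted here.
  #pragma omp metadirective \
      when (user={condition(use_parallel)}: parallel for) \
      otherwise (nothing)
  for (int i = 0; i < n; i++)
    a[i] += 1;
}
```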
This commit implements a full-featured iterator for riscv_subset_list, so that it can be used in range-based for loops (see the sketch below). This simplifies the code in the future, makes it more readable, and is more compatible with standard C++ containers. gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Use range-based-for-loop. * config/riscv/riscv-subset.h (riscv_subset_list::iterator): New. (riscv_subset_list::const_iterator): New.
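A minimal sketch of what range-based for requires of such an iterator (hypothetical and heavily simplified relative to riscv-subset.h):

```c++
struct subset_list {
  struct node { const char *name; node *next; };
  node *head = nullptr;

  struct iterator {
    node *cur;
    node &operator* () const { return *cur; }
    iterator &operator++ () { cur = cur->next; return *this; }
    bool operator!= (const iterator &other) const { return cur != other.cur; }
  };

  iterator begin () { return {head}; }
  iterator end ()   { return {nullptr}; }
};

// Callers such as riscv_cpu_cpp_builtins can then iterate directly:
//   for (auto &subset : list)
//     ... subset.name ...
```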
`--enable-default-pie` is an option to specify whether to enable position-independent executables by default for `target`. However, c++tools is built for `host`, so it should just follow the `--enable-host-pie` option to determine whether to build as a position-independent executable. NOTE: I checked PR 98324 and built with the same configure options (`--enable-default-pie` and LTO bootstrap) on x86-64 Linux to make sure it won't cause the same problem. c++tools/ChangeLog: * configure.ac: Don't check `--enable-default-pie`. * configure: Regen.
Separate the build rules into compile and link stages to make sure BUILD_LINKERFLAGS and BUILD_LDFLAGS are applied correctly. We hit this issue when building GCC with a non-system-default g++ that uses a newer libstdc++: we then got errors from using the older system libstdc++, which should not happen if we link with -static-libgcc and -static-libstdc++. gcc/ChangeLog: * config/riscv/t-riscv: Adjust build rule for gen-riscv-ext-opt and gen-riscv-ext-texi.
Some tests have 'dg-do link' but currently require 'tls', which is a compile-only check. In some configurations of arm-none-eabi, the 'tls' effective target can succeed although these tests fail to link with an undefined reference to `__aeabi_read_tp'. This patch adds a new tls_link effective target which makes sure we can build an executable (see the illustration below). gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_tls_link): New. * g++.dg/tls/pr102496-1.C: Require tls_link. * g++.dg/tls/pr77285-1.C: Likewise. gcc/ChangeLog: * doc/sourcebuild.texi (tls_link): Add documentation.
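A minimal illustration of why a compile-only check is not enough: this translation unit can pass a compile-time 'tls' check yet still fail at link time on the affected arm-none-eabi configurations.

```c++
// Compiles wherever TLS is accepted, but linking may fail with:
//   undefined reference to `__aeabi_read_tp'
__thread int counter;

int
main ()
{
  return ++counter;  // forces a real TLS access at link time
}
```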
The -mcmodel=large option was originally added to handle generation of large binaries with large PLTs. However, when compiling the Linux kernel with allyesconfig the output binary is so large that the jump instruction's 26-bit immediate is not large enough to store the jump offset to some symbols when linking. Example error: relocation truncated to fit: R_OR1K_INSN_REL_26 against symbol `do_fpe_trap' defined in .text section in arch/openrisc/kernel/traps.o We fix this by forcing jump offsets to registers when -mcmodel=large. Note, to get the Linux kernel allyesconfig config to work with OpenRISC, this patch is needed along with some other patches to the Linux hand coded assembly bits. gcc/ChangeLog: * config/or1k/predicates.md (call_insn_operand): Add condition to not allow symbol_ref operands with TARGET_CMODEL_LARGE. * config/or1k/or1k.opt: Document new -mcmodel=large implications. * doc/invoke.texi: Likewise. gcc/testsuite/ChangeLog: * gcc.target/or1k/call-1.c: New test. * gcc.target/or1k/got-1.c: New test.
On or1k, structs are returned from functions using the memory address passed in r3. In the current version of GCC the struct stores changed from r11 (the return value) to r3, the incoming memory address. Both are valid. Adjust the test to match what GCC produces now. gcc/testsuite/ChangeLog: * gcc.target/or1k/return-2.c: Fix test.
This patch implements C++26 std::polymorphic as specified in P3019, with the amendment to move assignment from LWG 4251. The implementation always allocates the stored object on the heap. The manager function (_M_manager) is similarly kept with the object (polymorphic::_Obj), which reduces the size of polymorphic to the size of a single pointer plus the allocator (which is declared with [[no_unique_address]]). The implementation does not use small-object optimization (SSO). We may consider adding this in the future, as SSO is allowed by the standard. However, storing any polymorphic object will require providing space for two pointers (manager function and vtable pointer) plus user-declared data members. PR libstdc++/119152 libstdc++-v3/ChangeLog: * include/bits/indirect.h (std::polymorphic, pmr::polymorphic) [__glibcxx_polymorphic]: Define. * include/bits/version.def (polymorphic): Define. * include/bits/version.h: Regenerate. * include/std/memory: Define __cpp_lib_polymorphic. * testsuite/std/memory/polymorphic/copy.cc: New test. * testsuite/std/memory/polymorphic/copy_alloc.cc: New test. * testsuite/std/memory/polymorphic/ctor.cc: New test. * testsuite/std/memory/polymorphic/ctor_poly.cc: New test. * testsuite/std/memory/polymorphic/incomplete.cc: New test. * testsuite/std/memory/polymorphic/invalid_neg.cc: New test. * testsuite/std/memory/polymorphic/move.cc: New test. * testsuite/std/memory/polymorphic/move_alloc.cc: New test. Co-authored-by: Tomasz Kamiński <[email protected]> Signed-off-by: Tomasz Kamiński <[email protected]>
This patch adjusts the passing of parameters for move_only_function, copyable_function and function_ref. Types that are declared as being passed by value in the signature template argument are passed by value to the invoker when they are small (at most two pointers), trivially move constructible and trivially destructible. The latter guarantees that passing them by value has no user-visible side effects. In particular, this extends the set of types forwarded by value, previously limited to scalars, to also include specializations of std::span and std::string_view, and similar standard and program-defined types (see the usage sketch below). Checking the suitability of the parameter types requires the types to be complete. As a consequence, the implementation imposes requirements on instantiation of move_only_function and copyable_function. To avoid producing errors from the implementation details, a static assertion was added to the partial specializations of copyable_function, move_only_function and function_ref. The static assertion uses the existing __is_complete_or_unbounded, as array-type parameters are automatically decayed in a function type. The standard already specifies in [res.on.functions] p2.5 that instantiating these partial specializations with incomplete types leads to undefined behavior. libstdc++-v3/ChangeLog: * include/bits/funcwrap.h (__polyfunc::__pass_by_rref): Define. (__polyfunc::__param_t): Update to use __pass_by_rref. * include/bits/cpyfunc_impl.h: Assert that parameter types are complete. * include/bits/funcref_impl.h: Likewise. * include/bits/mofunc_impl.h: Likewise. * testsuite/20_util/copyable_function/call.cc: New test. * testsuite/20_util/function_ref/call.cc: New test. * testsuite/20_util/move_only_function/call.cc: New test. * testsuite/20_util/copyable_function/conv.cc: New test. * testsuite/20_util/function_ref/conv.cc: New test. * testsuite/20_util/move_only_function/conv.cc: New test. * testsuite/20_util/copyable_function/incomplete_neg.cc: New test. * testsuite/20_util/function_ref/incomplete_neg.cc: New test. * testsuite/20_util/move_only_function/incomplete_neg.cc: New test. Reviewed-by: Patrick Palka <[email protected]> Reviewed-by: Jonathan Wakely <[email protected]> Signed-off-by: Tomasz Kamiński <[email protected]>
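A usage sketch: std::span<const int> is two pointers wide, trivially move constructible and trivially destructible, so under this change it reaches the target by value rather than by rvalue reference.

```c++
#include <functional>  // std::move_only_function (C++23)
#include <span>

static int
sum (std::span<const int> s)  // by-value parameter in the signature
{
  int total = 0;
  for (int v : s)
    total += v;
  return total;
}

std::move_only_function<int (std::span<const int>)> f = &sum;

int
use (std::span<const int> data)
{
  return f (data);  // the span is now forwarded by value to sum
}
```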
We don't want the new call to get_dtor to cause function instantiation. PR c++/107600 gcc/cp/ChangeLog: * semantics.cc (trait_expr_value) [CPTK_HAS_TRIVIAL_DESTRUCTOR]: Add cp_unevaluated. gcc/testsuite/ChangeLog: * g++.dg/ext/has_trivial_destructor-3.C: New test.
destructible_expr was wrongly assuming that TO is a class type. When is_xible_helper was added in r8-742 it returned early for abstract class types, which is correct for __is_constructible, but not __is_assignable or (now) __is_destructible. PR c++/107600 gcc/cp/ChangeLog: * method.cc (destructible_expr): Handle non-classes. (constructible_expr): Check for abstract class here... (is_xible_helper): ...not here. gcc/testsuite/ChangeLog: * g++.dg/ext/is_destructible2.C: New test.
PR libgomp/120444 include/ChangeLog: * cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare. libgomp/ChangeLog: * libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare. * libgomp.h (struct gomp_device_descr): Add memset_func. * libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}. * libgomp.texi (Device Memory Routines): Document them. * omp.h.in (omp_target_memset, omp_target_memset_async): Declare. * omp_lib.f90.in (omp_target_memset, omp_target_memset_async): Add interfaces. * omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise. * plugin/cuda-lib.def: Add cuMemsetD8. * plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add hsa_amd_memory_fill_fn. (init_hsa_runtime_functions): DLSYM_OPT_FN load it. (GOMP_OFFLOAD_memset): New. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New. * target.c (omp_target_memset_int, omp_target_memset, omp_target_memset_async_helper, omp_target_memset_async): New. (gomp_load_plugin_for_device): Add DLSYM (memset). * testsuite/libgomp.c-c++-common/omp_target_memset.c: New test. * testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test. * testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test. * testsuite/libgomp.fortran/omp_target_memset.f90: New test. * testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
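A usage sketch; the omp_target_memset signature below is inferred from the declarations this patch adds to omp.h.in, so treat it as an assumption rather than quoted spec text:

```c++
#include <omp.h>
#include <stddef.h>

void
zero_device_buffer (size_t n)
{
  int dev = omp_get_default_device ();
  void *p = omp_target_alloc (n, dev);   // device-side allocation
  if (p)
    {
      // assumed: void *omp_target_memset (void *, int, size_t, int)
      omp_target_memset (p, 0, n, dev);
      omp_target_free (p, dev);
    }
}
```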
The current overload set for __unique_copy handles three cases:

- The input range uses forward iterators, the output range does not. This is the simplest case, and can just compare adjacent elements of the input range.
- Neither the input range nor the output range uses forward iterators. This requires a local variable copied from the input range and updated by assigning each element to the local variable.
- The output range uses forward iterators. For this case we compare the current element from the input range with the element just written to the output range.

There are two problems with this implementation. Firstly, the third case assumes that the value type of the output range can be compared to the value type of the input range, which might not be possible at all, or might be possible but give different results to comparing elements of the input range. This is the problem identified in LWG 2439. Secondly, the third case is used when both ranges use forward iterators, even though the first case could (and should) be used. This means that we compare elements from the output range instead of the input range, with the problems described above (either not well-formed, or might give the wrong results).

The cause of the second problem is that the overload for the first case looks like:

  OutputIterator
  __unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred,
                forward_iterator_tag, output_iterator_tag);

When the output range uses forward iterators this overload cannot be used, because forward_iterator_tag does not inherit from output_iterator_tag, so is not convertible to it.

To fix these problems we need to implement the resolution of LWG 2439 so that the third case is only used when the value types of the two ranges are the same. This ensures that the comparisons are well behaved. We also need to ensure that the first case is used when both ranges use forward iterators.

This change replaces a single step of tag dispatching to choose between three overloads with two steps of tag dispatching, choosing between two overloads at each step. The first step dispatches based on the iterator category of the input range, ignoring the category of the output range. The second step only happens when the input range uses non-forward iterators, and dispatches based on the category of the output range and whether the value types of the two ranges are the same. So now the cases that are handled are:

- The input range uses forward iterators.
- The output range uses non-forward iterators or a different value type.
- The output range uses forward iterators and has the same value type.

For the second case, the old code used __gnu_cxx::__ops::__iter_comp_val to wrap the predicate in another level of indirection. That seems unnecessary, as we can just use a pointer to the local variable instead of an iterator referring to it.

During review of this patch, it was discovered that all known implementations of std::unique_copy and ranges::unique_copy (except cmcstl2) disagree with the specification. The standard (and the SGI STL documentation) say that it uses pred(*i, *(i-1)) but everybody uses pred(*(i-1), *i) instead, and apparently always has done. This patch adjusts ranges::unique_copy to be consistent. In the first __unique_copy overload, the local copy of the iterator is changed to be the previous position not the next one, so that we use ++first as the "next" iterator, consistent with the logic used in the other overloads. This makes it easier to compare them, because we aren't using pred(*first, *next) in one and pred(something, *first) in the others. Instead it's always pred(something, *first), as the demonstration after the ChangeLog shows.

libstdc++-v3/ChangeLog:

PR libstdc++/120386
* include/bits/ranges_algo.h (__unique_copy_fn): Reorder arguments for third case to match the first two cases.
* include/bits/stl_algo.h (__unique_copy): Replace three overloads with two, depending only on the iterator category of the input range. Dispatch to __unique_copy_1 for the non-forward case. (__unique_copy_1): New overloads for the case where the input range uses non-forward iterators. (unique_copy): Only pass the input range category to __unique_copy.
* testsuite/25_algorithms/unique_copy/lwg2439.cc: New test.

Reviewed-by: Tomasz Kamiński <[email protected]>
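A small demonstration of the argument order in question, using only standard APIs; the predicate is deliberately not an equivalence relation, so the output pins down which order is used:

```c++
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int
main ()
{
  std::vector<int> in{1, 2, 2, 3, 1};
  std::vector<int> out;

  // All known implementations call pred(*(i-1), *i), i.e. pred(prev, cur).
  std::unique_copy (in.begin (), in.end (), std::back_inserter (out),
                    [] (int prev, int cur) { return prev >= cur; });

  for (int v : out)
    std::cout << v << ' ';  // prints: 1 2 3  (cur dropped when prev >= cur)
}
```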
We don't use this GCC coding convention in libstdc++. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (basic_string::size): Remove space before parameter list. (basic_string::capacity): Likewise. * include/bits/stl_deque.h (deque::size): Likewise. * include/bits/stl_vector.h (vector::size, vector::capacity): Likewise. * include/bits/vector.tcc (vector::_M_realloc_insert): Likewise. (vector::_M_realloc_append): Likewise.
ChangeLog: * .github/alpine_32bit_log_warnings: Adjust with latest warnings. * .github/glibcxx_ubuntu64b_log_expected_warnings: Likewise. * .github/log_expected_warnings: Likewise. Signed-off-by: Marc Poulhiès <[email protected]>
Config has been updated upstream to correctly indent declaration. gcc/rust/ChangeLog: * rust-attribs.cc (handle_hot_attribute): Remove clang-format comment. Signed-off-by: Marc Poulhiès <[email protected]>
Bump clang-format version to 16. This is needed as upstream has updated the config and clang-format 10 doesn't support it. ChangeLog: * .github/workflows/clang-format.yml: Bump clang-format version. Signed-off-by: Marc Poulhiès <[email protected]>
This should improve our situation with respect to downstreaming. Any merge should be done with the GitHub default merge method, rather than with a rebase-merge.