DO NOT MERGE, pleas review: WIP implement strand::post #1

rodgert · 2018-01-21T17:37:36Z

I've taken a stab at an initial implementation of strand::post()

Basically it posts a _Runnable which also has ownership of the strand's state to the inner executor. The runnable will continue to reschedule itself until the queue is empty.

I believe this also covers the strand dtor TODO as well.

Comments?

Introduced a race condition

Move post of runnable to underlying executor outside lock

…piling for Thumb2 Thumb2 code now uses the Arm implementation of legitimize_address. That code has a case to handle addresses that are absolute CONST_INT values, which is a common use case in deeply embedded targets (eg: void *p = (void*)0x12345678). Since thumb has very limited negative offsets from a constant, we want to avoid forming a CSE base that will then be used with a negative value. This was reported upstream originally in https://gcc.gnu.org/ml/gcc-help/2019-10/msg00122.html For example, void test1(void) { volatile uint32_t * const p = (uint32_t *) 0x43fe1800; p[3] = 1; p[4] = 2; p[1] = 3; p[7] = 4; p[0] = 6; } With the new code, instead of ldr r3, .L2 subw r2, r3, #2035 movs r1, #1 str r1, [r2] subw r2, r3, #2031 movs r1, gcc-mirror#2 str r1, [r2] subw r2, r3, #2043 movs r1, gcc-mirror#3 str r1, [r2] subw r2, r3, #2019 movs r1, gcc-mirror#4 subw r3, r3, #2047 str r1, [r2] movs r2, gcc-mirror#6 str r2, [r3] bx lr We now get ldr r3, .L2 movs r2, #1 str r2, [r3, #2060] movs r2, gcc-mirror#2 str r2, [r3, #2064] movs r2, gcc-mirror#3 str r2, [r3, #2052] movs r2, gcc-mirror#4 str r2, [r3, #2076] movs r2, gcc-mirror#6 str r2, [r3, #2048] bx lr * config/arm/arm.c (arm_legitimize_address): Don't form negative offsets from a CONST_INT address when TARGET_THUMB2. From-SVN: r277677

Made apparent by recent commit dc70315 "openmp: Implement discovery of implicit declare target to clauses": +FAIL: libgomp.c/target-39.c (internal compiler error) +FAIL: libgomp.c/target-39.c (test for excess errors) +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa' enabled (by default). during GIMPLE pass: hsagen source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’: source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault 23 | #pragma omp target map(from:err) | ^~~ [...] GDB: Program received signal SIGSEGV, Segmentation fault. fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 6267 return (fndecl_built_in_p (node, BUILT_IN_NORMAL) (gdb) bt #0 fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304 gcc-mirror#2 0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770 gcc-mirror#3 0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999 gcc-mirror#4 0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596 gcc-mirror#5 0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680 gcc-mirror#6 0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502 [...] (gdb) up #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) (gdb) print function_decl $1 = (tree) 0x0 (gdb) list 5299 if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)) 5300 { 5301 tree function_decl = gimple_call_fndecl (stmt); 5302 /* Prefetch pass can create type-mismatching prefetch builtin calls which 5303 fail the gimple_call_builtin_p test above. Handle them here. */ 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) 5305 return; 5306 5307 if (function_decl == NULL_TREE) 5308 { The problem is present already since 2016-11-23 commit 56b1c60 (r242761) "Merge from HSA branch to trunk", and the fix obvious enough. gcc/ * hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl == NULL_TREE' check earlier. gcc/testsuite/ * c-c++-common/gomp/hsa-indirect-call-1.c: New file.

Since 21cfe72 there's a new OMP_LIST_NONTEMPORAL value, but it was missing in resolve_omp_clauses static array that is defined at the function beginning: ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gfortran.dg/gomp/nontemporal-1.f90 -fopenmp -c ../../gcc/fortran/openmp.c:4737:28: runtime error: index 21 out of bounds for type 'char *[21]' #0 0xbdb956 in resolve_omp_clauses ../../gcc/fortran/openmp.c:4737 #1 0xbeb076 in resolve_omp_do ../../gcc/fortran/openmp.c:6139 gcc-mirror#2 0xbf029a in gfc_resolve_omp_directive(gfc_code*, gfc_namespace*) ../../gcc/fortran/openmp.c:6792 gcc-mirror#3 0xcb6363 in gfc_resolve_code(gfc_code*, gfc_namespace*) ../../gcc/fortran/resolve.c:12185 gcc-mirror#4 0xcef8cf in resolve_codes ../../gcc/fortran/resolve.c:17303 gcc/fortran/ChangeLog: * openmp.c (resolve_omp_clauses): Add NONTEMPORAL to clause names.

PR analyzer/94851 reports various false "NULL dereference" diagnostics. The first case (comment #1) affects GCC 10.2 but no longer affects trunk; I believe it was fixed by the state rewrite of r11-2694-g808f4dfeb3a95f50f15e71148e5c1067f90a126d. The patch adds a regression test for this case. The other cases (comment gcc-mirror#3 and comment gcc-mirror#4) still affect trunk. In both cases, the && in a conditional is optimized to bitwise & _1 = p_4 != 0B; _2 = p_4 != q_6(D); _3 = _1 & _2; and the analyzer fails to fold this for the case where one (or both) of the conditionals is false, and thus erroneously considers the path where "p" is non-NULL despite being passed a NULL value. Fix this by implementing folding for this case. gcc/analyzer/ChangeLog: PR analyzer/94851 * region-model-manager.cc (region_model_manager::maybe_fold_binop): Fold bitwise "& 0" to 0. gcc/testsuite/ChangeLog: PR analyzer/94851 * gcc.dg/analyzer/pr94851-1.c: New test. * gcc.dg/analyzer/pr94851-3.c: New test. * gcc.dg/analyzer/pr94851-4.c: New test.

Made apparent by recent commit dc70315 "openmp: Implement discovery of implicit declare target to clauses": +FAIL: libgomp.c/target-39.c (internal compiler error) +FAIL: libgomp.c/target-39.c (test for excess errors) +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa' enabled (by default). during GIMPLE pass: hsagen source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’: source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault 23 | #pragma omp target map(from:err) | ^~~ [...] GDB: Program received signal SIGSEGV, Segmentation fault. fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 6267 return (fndecl_built_in_p (node, BUILT_IN_NORMAL) (gdb) bt #0 fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304 gcc-mirror#2 0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770 gcc-mirror#3 0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999 gcc-mirror#4 0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596 gcc-mirror#5 0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680 gcc-mirror#6 0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502 [...] (gdb) up #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) (gdb) print function_decl $1 = (tree) 0x0 (gdb) list 5299 if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)) 5300 { 5301 tree function_decl = gimple_call_fndecl (stmt); 5302 /* Prefetch pass can create type-mismatching prefetch builtin calls which 5303 fail the gimple_call_builtin_p test above. Handle them here. */ 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) 5305 return; 5306 5307 if (function_decl == NULL_TREE) 5308 { The problem is present already since 2016-11-23 commit 56b1c60 (r242761) "Merge from HSA branch to trunk", and the fix obvious enough. gcc/ * hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl == NULL_TREE' check earlier. gcc/testsuite/ * c-c++-common/gomp/hsa-indirect-call-1.c: New file. (cherry picked from commit 973bce0)

This PR points out that we accept template<typename T> struct tuple { tuple(T); }; // #1 template<typename T> explicit tuple(T t) -> tuple<T>; // gcc-mirror#2 tuple t = { 1 }; despite the 'explicit' deduction guide in a copy-list-initialization context. That's because in deduction_guides_for we first find the user-defined deduction guide (gcc-mirror#2), and then ctor_deduction_guides_for creates artificial deduction guides: one from the tuple(T) constructor and a copy guide. So we end up with these three guides: (1) template<class T> tuple(T) -> tuple<T> [DECL_NONCONVERTING_P] (2) template<class T> tuple(tuple<T>) -> tuple<T> (3) template<class T> tuple(T) -> tuple<T> Then, in do_class_deduction, we prune this set, and get rid of (1). Then overload resolution selects (3) and we succeed. But [over.match.list]p1 says "In copy-list-initialization, if an explicit constructor is chosen, the initialization is ill-formed." It also goes on to say that this differs from other situations where only converting constructors are considered for copy-initialization. Therefore for list-initialization we consider explicit constructors and complain if one is chosen. E.g. convert_like_internal/ck_user can give an error. So my logic runs that we should not prune the deduction_guides_for guides in a copy-list-initialization context, and only complain if we actually choose an explicit deduction guide. This matches clang++/EDG/msvc++. gcc/cp/ChangeLog: PR c++/90210 * pt.c (do_class_deduction): Don't prune explicit deduction guides in copy-list-initialization. In copy-list-initialization, if an explicit deduction guide was selected, give an error. gcc/testsuite/ChangeLog: PR c++/90210 * g++.dg/cpp1z/class-deduction73.C: New test.

This patch improves block-scope extern handling by always injecting a hidden copy into the enclosing namespace (or using a match already there). This hidden copy will be revealed if the user explicitly declares it later. We can get from the DECL_LOCAL_DECL_P local extern to the alias via DECL_LOCAL_DECL_ALIAS. This fixes several bugs and removes the kludgy per-function extern_decl_map. We only do this pushing for non-dependent local externs -- dependent ones will be pushed during instantiation. User code that expected to be able to handle incompatible local externs in different block-scopes will no longer work. That code is ill-formed. (always was, despite what 31775 claimed). I had to adjust a number of testcases that fell into this. I tried using DECL_VALUE_EXPR, but that didn't work out. Due to constexpr requirements we have to do the replacement very late (it happens in the gimplifier). Consider: extern int l[]; // #1 constexpr bool foo () { extern int l[3]; // this does not complete the type of decl #1 constexpr int *p = &l[2]; // ok return !p; } This requirement, coupled with our use of the common folding machinery makes pr97306 hard to fix, as we end up with an expression containing the two different decls for 'l', and only the c++ FE knows how to reconcile those. I punted on this. gcc/cp/ * cp-tree.h (struct language_function): Delete extern_decl_map. (DECL_LOCAL_DECL_ALIAS): New. * name-lookup.h (is_local_extern): Delete. * name-lookup.c (set_local_extern_decl_linkage): Replace with ... (push_local_extern_decl): ... this new function. (do_pushdecl): Call new function after pushing new decl. Unhide hidden non-functions. (is_local_extern): Delete. * decl.c (layout_var_decl): Do not allow VLA local externs. * decl2.c (mark_used): Also mark DECL_LOCAL_DECL_ALIAS. Drop old local-extern treatment. * parser.c (cp_parser_oacc_declare): Deal with local extern aliases. * pt.c (tsubst_expr): Adjust local extern instantiation. * cp-gimplify.c (cp_genericize_r): Remap DECL_LOCAL_DECLs. gcc/testsuite/ * g++.dg/cpp0x/lambda/lambda-sfinae1.C: Avoid ill-formed local extern * g++.dg/init/pr42844.C: Add expected error. * g++.dg/lookup/extern-redecl1.C: Likewise. * g++.dg/lookup/koenig15.C: Avoid ill-formed. * g++.dg/lto/pr95677.C: New. * g++.dg/other/nested-extern-1.C: Correct expected behabviour. * g++.dg/other/nested-extern-2.C: Likewise. * g++.dg/other/nested-extern.cc: Split ... * g++.dg/other/nested-extern-1.cc: ... here ... * g++.dg/other/nested-extern-2.cc: ... here. * g++.dg/template/scope5.C: Avoid ill-formed * g++.old-deja/g++.law/missed-error2.C: Allow extension. * g++.old-deja/g++.pt/crash3.C: Add expected error.

Prevents the following UBSAN error: ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/torture/pr49770.C -O2 -c /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 2, which is not a valid value for type 'bool' #0 0x1fdb4d1 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482 #1 0x1fcadaa in merge_call_side_effects(modref_summary*, gimple*, modref_summary*, bool) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:511 gcc-mirror#2 0x1fcbadd in analyze_call /home/marxin/Programming/gcc2/gcc/ipa-modref.c:642 gcc-mirror#3 0x1fcc061 in analyze_stmt /home/marxin/Programming/gcc2/gcc/ipa-modref.c:732 gcc-mirror#4 0x1fccf31 in analyze_function /home/marxin/Programming/gcc2/gcc/ipa-modref.c:823 gcc-mirror#5 0x1fd17e5 in execute /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1441 gcc-mirror#6 0x25cca6e in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509 gcc-mirror#7 0x25cd39b in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2597 gcc-mirror#8 0x25cd450 in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2598 gcc-mirror#9 0x25cd4ee in execute_pass_list(function*, opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2608 gcc-mirror#10 0x25c7a5a in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/Programming/gcc2/gcc/passes.c:1726 gcc-mirror#11 0x25cfa3f in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2941 gcc-mirror#12 0x173572d in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2642 gcc-mirror#13 0x17364ee in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777 gcc-mirror#14 0x17372d9 in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022 gcc-mirror#15 0x2a1f00a in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485 gcc-mirror#16 0x2a27dc8 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321 gcc-mirror#17 0x2a283cc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460 gcc-mirror#18 0x54f21cd in main /home/marxin/Programming/gcc2/gcc/main.c:39 gcc-mirror#19 0x7ffff6f0de09 in __libc_start_main ../csu/libc-start.c:314 gcc-mirror#20 0x9eac09 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1plus+0x9eac09) gcc/ChangeLog: * ipa-modref.c (merge_call_side_effects): Clear modref_parm_map fields in the vector.

It fixes: /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 255, which is not a valid value for type 'bool' #0 0x18e5df3 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482 #1 0x18dc180 in ipa_merge_modref_summary_after_inlining(cgraph_edge*) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1779 gcc-mirror#2 0x18c1c72 in inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap, vl_ptr>*, int*, bool, bool*) /home/marxin/Programming/gcc2/gcc/ipa-inline-transform.c:492 gcc-mirror#3 0x4a3589c in inline_small_functions /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2216 gcc-mirror#4 0x4a3b230 in ipa_inline /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2697 gcc-mirror#5 0x4a3d902 in execute /home/marxin/Programming/gcc2/gcc/ipa-inline.c:3096 gcc-mirror#6 0x1edf831 in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509 gcc-mirror#7 0x1ee26af in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2936 gcc-mirror#8 0x103f31b in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2700 gcc-mirror#9 0x103fb40 in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777 gcc-mirror#10 0x104092b in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022 gcc-mirror#11 0x235723b in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485 gcc-mirror#12 0x235fff9 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321 gcc-mirror#13 0x23605fc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460 gcc-mirror#14 0x4e2b93b in main /home/marxin/Programming/gcc2/gcc/main.c:39 gcc-mirror#15 0x7ffff6f0ae09 in __libc_start_main ../csu/libc-start.c:314 gcc-mirror#16 0x9a0be9 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x9a0be9) gcc/ChangeLog: * ipa-modref.c (compute_parm_map): Clear vector.

This is a "canonical types differ for identical types" ICE, which started with r11-4682. It's a bit tricky to explain. Consider: template <typename T> struct S { S<T> bar() noexcept(T::value); // #1 S<T> foo() noexcept(T::value); // gcc-mirror#2 }; template <typename T> S<T> S<T>::foo() noexcept(T::value) {} // gcc-mirror#3 We ICE because gcc-mirror#3 and gcc-mirror#2 have the same type, but their canonical types differ: TYPE_CANONICAL (gcc-mirror#3) == gcc-mirror#2 but TYPE_CANONICAL (gcc-mirror#2) == #1. The member functions #1 and gcc-mirror#2 have the same type. However, since their noexcept-specifier is deferred, when parsing them, we create a variant for both of them, because DEFERRED_PARSE cannot be compared. In other words, build_cp_fntype_variant's tree v = TYPE_MAIN_VARIANT (type); for (; v; v = TYPE_NEXT_VARIANT (v)) if (cp_check_qualified_type (v, type, type_quals, rqual, raises, late)) return v; will *not* find an existing variant when creating a method_type for gcc-mirror#2, so we have to create a new one. But then we perform delayed parsing and call fixup_deferred_exception_variants for #1 and gcc-mirror#2. f_d_e_v will replace TYPE_RAISES_EXCEPTIONS with the newly parsed noexcept-specifier. It also sets TYPE_CANONICAL (gcc-mirror#2) to #1. Both noexcepts turned out to be the same, so now we have two equivalent variants in the list! I.e., +-----------------+ +-----------------+ +-----------------+ | main | | gcc-mirror#2 | | #1 | | S S::<T379>(S*) |----->| S S::<T37c>(S*) |----->| S S::<T37a>(S*) |----->NULL | - | | noex(T::value) | | noex(T::value) | +-----------------+ +-----------------+ +-----------------+ Then we get to gcc-mirror#3. As for #1 and gcc-mirror#2, grokdeclarator calls build_memfn_type, which ends up calling build_cp_fntype_variant, which will use the loop above to look for an existing variant. The first one that matches cp_check_qualified_type will be used, so we use gcc-mirror#2 rather than #1, and the TYPE_CANONICAL mismatch follows. Hopefully that makes sense. As for the fix, I didn't think I could rewrite the method_type gcc-mirror#2 with #1 because the type may have escaped via decltype. So my approach is to elide gcc-mirror#2 from the list, so when looking for a matching variant, we always find #1 (gcc-mirror#2 remains live though, which admittedly sounds sort of dodgy). PR c++/101715 gcc/cp/ChangeLog: * tree.cc (fixup_deferred_exception_variants): Remove duplicate variants after parsing the exception specifications. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/noexcept72.C: New test. * g++.dg/cpp0x/noexcept73.C: New test.

…04617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n#gcc-mirror#2) A(n#gcc-mirror#3) A(n#gcc-mirror#4) A(n#gcc-mirror#5) A(n#gcc-mirror#6) A(n#gcc-mirror#7) A(n#gcc-mirror#8) A(n#gcc-mirror#9) #define C(n) B(n##0) B(n##1) B(n#gcc-mirror#2) B(n#gcc-mirror#3) B(n#gcc-mirror#4) B(n#gcc-mirror#5) B(n#gcc-mirror#6) B(n#gcc-mirror#7) B(n#gcc-mirror#8) B(n#gcc-mirror#9) #define D(n) C(n##0) C(n##1) C(n#gcc-mirror#2) C(n#gcc-mirror#3) C(n#gcc-mirror#4) C(n#gcc-mirror#5) C(n#gcc-mirror#6) C(n#gcc-mirror#7) C(n#gcc-mirror#8) C(n#gcc-mirror#9) #define E(n) D(n##0) D(n##1) D(n#gcc-mirror#2) D(n#gcc-mirror#3) D(n#gcc-mirror#4) D(n#gcc-mirror#5) D(n#gcc-mirror#6) D(n#gcc-mirror#7) D(n#gcc-mirror#8) D(n#gcc-mirror#9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <[email protected]> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive).

Testing cc1 on pr93032-mztools-unsigned-char.c Benchmark #1: (without patch) Time (mean ± σ): 338.8 ms ± 13.6 ms [User: 323.2 ms, System: 14.2 ms] Range (min … max): 326.7 ms … 363.1 ms 10 runs Benchmark gcc-mirror#2: (with patch) Time (mean ± σ): 332.3 ms ± 12.8 ms [User: 316.6 ms, System: 14.3 ms] Range (min … max): 322.5 ms … 357.4 ms 10 runs Summary ./cc1.new ran 1.02 ± 0.06 times faster than ./cc1.old gcc/analyzer/ChangeLog: * store.cc (store::store): Presize m_cluster_map. Signed-off-by: David Malcolm <[email protected]>

DR 2352 changed the definitions of reference-related (so that it uses "similar type" instead of "same type") and of reference-compatible (use a standard conversion sequence). That means that reference-related is now more broad, which means that we will be binding more things directly. The original patch for DR 2352 caused some problems, which were fixed in r276251 by creating a "fake" ck_qual in direct_reference_binding, so that in void f(int *); // #1 void f(const int * const &); // gcc-mirror#2 int *x; int main() { f(x); // call #1 } we call #1. The extra ck_qual in gcc-mirror#2 causes compare_ics to select #1, which is a better match for "int *" because then we don't have to do a qualification conversion. Let's turn to the problem in this PR. We have void f(const int * const &); // #1 void f(const int *); // gcc-mirror#2 int *x; int main() { f(x); } We arrive in compare_ics to decide which one is better. The ICS for #1 looks like ck_ref_bind <- ck_qual <- ck_identity const int *const & const int *const int * and the ICS for gcc-mirror#2 is ck_qual <- ck_rvalue <- ck_identity const int * int * int * We strip the reference and then comp_cv_qual_signature when comparing two ck_quals sees that "const int *" is a proper subset of "const int *const" and we return -1. But that's wrong; presumably the top-level "const" should be ignored and the call should be ambiguous. This patch adjust the type of the "fake" ck_qual so that this problem doesn't arise. PR c++/97296 gcc/cp/ChangeLog: * call.cc (direct_reference_binding): strip_top_quals when creating a ck_qual. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/ref-bind4.C: Add dg-error. * g++.dg/cpp0x/ref-bind8.C: New test.

Here ever since r12-6022-gbb2a7f80a98de3 we stopped deeming the partial specialization gcc-mirror#2 to be more specialized than #1 ultimately because dependent operator expressions now have a DEPENDENT_OPERATOR_TYPE type instead of an empty type, and this made unify stop deducing T(2) == 1 for K during partial ordering for #1 and gcc-mirror#2. This minimal patch fixes this by making the relevant logic in unify treat DEPENDENT_OPERATOR_TYPE like an empty type. PR c++/105425 gcc/cp/ChangeLog: * pt.cc (unify) <case TEMPLATE_PARM_INDEX>: Treat DEPENDENT_OPERATOR_TYPE like an empty type. gcc/testsuite/ChangeLog: * g++.dg/template/partial-specialization13.C: New test.

This patch fixes the second half of 64679. Here we issue a wrong "redefinition of 'int x'" for the following: struct Bar { Bar(int, int, int); }; int x = 1; Bar bar(int(x), int(x), int{x}); // #1 cp_parser_parameter_declaration_list does pushdecl every time it sees a named parameter, so the second "int(x)" causes the error. That's premature, since this turns out to be a constructor call after the third argument! If the first parameter is parenthesized, we can't push until we've established we're looking at a function declaration. Therefore this could be fixed by some kind of lookahead. I thought about introducing a lightweight variant of cp_parser_parameter_declaration_list that would not have any side effects and would return as soon as it figures out whether it's looking at a declaration or expression. Since that would require fairly nontrivial changes, I wanted something simpler. Something like delaying the pushdecl until we've reached the ')' following the parameter-declaration-clause. But we must push the parameters before processing a default argument, as in: Bar bar(int(a), int(b), int c = sizeof(a)); // valid Moreover, this code should still be accepted Bar f(int(i), decltype(i) j = 42); so this patch stashes parameters into a vector when parsing tentatively only when pushdecl-ing a parameter would result in a clash and an error about redefinition/redeclaration. The stashed parameters are pushed at the end of a parameter-declaration-clause if it's followed by a ')', so that we still diagnose redefining a parameter. PR c++/64679 gcc/cp/ChangeLog: * parser.cc (cp_parser_parameter_declaration_clause): Maintain a vector of parameters that haven't been pushed yet. Push them at the end of a valid parameter-declaration-clause. (cp_parser_parameter_declaration_list): Take a new auto_vec parameter. Do not pushdecl while parsing tentatively when pushdecl-ing a parameter would result in a hard error. (cp_parser_cache_defarg): Adjust the call to cp_parser_parameter_declaration_list. gcc/testsuite/ChangeLog: * g++.dg/parse/ambig11.C: New test. * g++.dg/parse/ambig12.C: New test. * g++.dg/parse/ambig13.C: New test. * g++.dg/parse/ambig14.C: New test.

Consider struct A { int x; int y = x; }; struct B { int x = 0; int y = A{x}.y; // #1 }; where for #1 we end up with {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x} that is, two PLACEHOLDER_EXPRs for different types on the same level in a {}. This crashes because our CONSTRUCTOR_PLACEHOLDER_BOUNDARY mechanism to avoid replacing unrelated PLACEHOLDER_EXPRs cannot deal with it. Here's why we wound up with those PLACEHOLDER_EXPRs: When we're performing cp_parser_late_parsing_nsdmi for "int y = A{x}.y;" we use finish_compound_literal on type=A, compound_literal={((struct B *) this)->x}. When digesting this initializer, we call get_nsdmi which creates a PLACEHOLDER_EXPR for A -- we don't have any object to refer to yet. After digesting, we have {.x=((struct B *) this)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x} and since we've created a PLACEHOLDER_EXPR inside it, we marked the whole ctor CONSTRUCTOR_PLACEHOLDER_BOUNDARY. f_c_l creates a TARGET_EXPR and returns TARGET_EXPR <D.2384, {.x=((struct B *) this)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}> Then we get to B b = {}; and call store_init_value, which digests the {}, which produces {.x=NON_LVALUE_EXPR <0>, .y=(TARGET_EXPR <D.2395, {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}>).y} lookup_placeholder in constexpr won't find an object to replace the PLACEHOLDER_EXPR for B, because ctx->object will be D.2395 of type A, and we cannot search outward from D.2395 to find 'b'. The call to replace_placeholders in store_init_value will not do anything: we've marked the inner { } CONSTRUCTOR_PLACEHOLDER_BOUNDARY, and it's only a sub-expression, so replace_placeholders does nothing, so the <P_E struct B> stays even though now is the perfect time to replace it because we have an object for it: 'b'. Later, in cp_gimplify_init_expr the *expr_p is D.2395 = {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x} where D.2395 is of type A, but we crash because we hit <P_E struct B>, which has a different type. My idea was to replace <P_E struct A> with D.2384 after creating the TARGET_EXPR because that means we have an object we can refer to. Then clear CONSTRUCTOR_PLACEHOLDER_BOUNDARY because we no longer have a PLACEHOLDER_EXPR in the {}. Then store_init_value will be able to replace <P_E struct B> with 'b', and we should be good to go. We must be careful not to break guaranteed copy elision, so this replacement happens in digest_nsdmi_init where we can see the whole initializer, and avoid replacing any placeholders in TARGET_EXPRs used in the context of initialization/copy elision. This is achieved via the new function called potential_prvalue_result_of. While fixing this problem, I found PR105550, thus the FIXMEs in the tests. PR c++/100252 gcc/cp/ChangeLog: * typeck2.cc (potential_prvalue_result_of): New. (replace_placeholders_for_class_temp_r): New. (digest_nsdmi_init): Call it. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/nsdmi-aggr14.C: New test. * g++.dg/cpp1y/nsdmi-aggr15.C: New test. * g++.dg/cpp1y/nsdmi-aggr16.C: New test. * g++.dg/cpp1y/nsdmi-aggr17.C: New test. * g++.dg/cpp1y/nsdmi-aggr18.C: New test. * g++.dg/cpp1y/nsdmi-aggr19.C: New test.

As explained in r11-4959-gde6f64f9556ae3, the atom cache assumes two equivalent expressions (according to cp_tree_equal) must use the same template parameters (according to find_template_parameters). This assumption turned out to not hold for TARGET_EXPR, which was addressed by that commit. But this assumption apparently doesn't hold for PARM_DECL either: find_template_parameters walks its DECL_CONTEXT but cp_tree_equal by default doesn't consider DECL_CONTEXT unless comparing_specializations is set. Thus in the first testcase below, the atomic constraints of #1 and gcc-mirror#2 are equivalent according to cp_tree_equal, but according to find_template_parameters the former uses T and the latter uses both T and U (surprisingly). We could fix this assumption violation by setting comparing_specializations in the atom_hasher, which would make cp_tree_equal return false for the two atoms, but that seems overly pessimistic here. Ideally the atoms should continue being considered equivalent and we instead fix find_template_paremeters to return just T for gcc-mirror#2's atom. To that end this patch makes for_each_template_parm_r stop walking the DECL_CONTEXT of a PARM_DECL. This should be safe to do because tsubst_copy / tsubst_decl only substitutes the TREE_TYPE of a PARM_DECL and doesn't bother substituting the DECL_CONTEXT, thus the only relevant template parameters are those used in its type. any_template_parm_r is currently responsible for walking its TREE_TYPE, but I suppose it now makes sense for for_each_template_parm_r to do so instead. In passing this patch also makes for_each_template_parm_r stop walking the DECL_CONTEXT of a VAR_/FUNCTION_DECL since doing so after walking DECL_TI_ARGS is redundant, I think. I experimented with not walking DECL_CONTEXT for CONST_DECL, but the second testcase below demonstrates it's necessary to walk it. PR c++/105797 gcc/cp/ChangeLog: * pt.cc (for_each_template_parm_r) <case FUNCTION_DECL, VAR_DECL>: Don't walk DECL_CONTEXT. <case PARM_DECL>: Likewise. Walk TREE_TYPE. <case CONST_DECL>: Simplify. (any_template_parm_r) <case PARM_DECL>: Don't walk TREE_TYPE. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-decltype4.C: New test. * g++.dg/cpp2a/concepts-memfun3.C: New test.

(In reply to Uroš Bizjak from comment #1) > Instruction does not accept memory operand for operand 3: > > (define_insn_and_split > "*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_ltint" > [(set (match_operand:<ssebytemode> 0 "register_operand" "=Yr,*x,x") > (unspec:<ssebytemode> > [(match_operand:<ssebytemode> 1 "register_operand" "0,0,x") > (match_operand:<ssebytemode> 2 "vector_operand" "YrBm,*xBm,xm") > (subreg:<ssebytemode> > (lt:VI48_AVX > (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x") > (match_operand:VI48_AVX 4 "const0_operand")) 0)] > UNSPEC_BLENDV))] > > The problematic insn is: > > (define_insn_and_split "*avx_cmp<mode>3_ltint_not" > [(set (match_operand:VI48_AVX 0 "register_operand") > (vec_merge:VI48_AVX > (match_operand:VI48_AVX 1 "vector_operand") > (match_operand:VI48_AVX 2 "vector_operand") > (unspec:<avx512fmaskmode> > [(subreg:VI48_AVX > (not:<ssebytemode> > (match_operand:<ssebytemode> 3 "vector_operand")) 0) > (match_operand:VI48_AVX 4 "const0_operand") > (match_operand:SI 5 "const_0_to_7_operand")] > UNSPEC_PCMP)))] > > which gets split to the above pattern. > > In the preparation statements we have: > > if (!MEM_P (operands[3])) > operands[3] = force_reg (<ssebytemode>mode, operands[3]); > operands[3] = lowpart_subreg (<MODE>mode, operands[3], <ssebytemode>mode); > > Which won't fly when operand 3 is memory operand... > gcc/ChangeLog: PR target/105953 * config/i386/sse.md (*avx_cmp<mode>3_ltint_not): Force_reg operands[3]. gcc/testsuite/ChangeLog: * g++.target/i386/pr105953.C: New test.

This patch implements C++23 P2255R2, which adds two new type traits to detect reference binding to a temporary. They can be used to detect code like std::tuple<const std::string&> t("meow"); which is incorrect because it always creates a dangling reference, because the std::string temporary is created inside the selected constructor of std::tuple, and not outside it. There are two new compiler builtins, __reference_constructs_from_temporary and __reference_converts_from_temporary. The former is used to simulate direct- and the latter copy-initialization context. But I had a hard time finding a test where there's actually a difference. Under DR 2267, both of these are invalid: struct A { } a; struct B { explicit B(const A&); }; const B &b1{a}; const B &b2(a); so I had to peruse [over.match.ref], and eventually realized that the difference can be seen here: struct G { operator int(); // #1 explicit operator int&&(); // gcc-mirror#2 }; int&& r1(G{}); // use gcc-mirror#2 (no temporary) int&& r2 = G{}; // use #1 (a temporary is created to be bound to int&&) The implementation itself was rather straightforward because we already have the conv_binds_ref_to_prvalue function. The main function here is ref_xes_from_temporary. I've changed the return type of ref_conv_binds_directly to tristate, because previously the function didn't distinguish between an invalid conversion and one that binds to a prvalue. Since it no longer returns a bool, I removed the _p suffix. The patch also adds the relevant class and variable templates to <type_traits>. PR c++/104477 gcc/c-family/ChangeLog: * c-common.cc (c_common_reswords): Add __reference_constructs_from_temporary and __reference_converts_from_temporary. * c-common.h (enum rid): Add RID_REF_CONSTRUCTS_FROM_TEMPORARY and RID_REF_CONVERTS_FROM_TEMPORARY. gcc/cp/ChangeLog: * call.cc (ref_conv_binds_directly_p): Rename to ... (ref_conv_binds_directly): ... this. Add a new bool parameter. Change the return type to tristate. * constraint.cc (diagnose_trait_expr): Handle CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. * cp-tree.h: Include "tristate.h". (enum cp_trait_kind): Add CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. (ref_conv_binds_directly_p): Rename to ... (ref_conv_binds_directly): ... this. (ref_xes_from_temporary): Declare. * cxx-pretty-print.cc (pp_cxx_trait_expression): Handle CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. * method.cc (ref_xes_from_temporary): New. * parser.cc (cp_parser_primary_expression): Handle RID_REF_CONSTRUCTS_FROM_TEMPORARY and RID_REF_CONVERTS_FROM_TEMPORARY. (cp_parser_trait_expr): Likewise. (warn_for_range_copy): Adjust to call ref_conv_binds_directly. * semantics.cc (trait_expr_value): Handle CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. (finish_trait_expr): Likewise. libstdc++-v3/ChangeLog: * include/std/type_traits (reference_constructs_from_temporary, reference_converts_from_temporary): New class templates. (reference_constructs_from_temporary_v, reference_converts_from_temporary_v): New variable templates. (__cpp_lib_reference_from_temporary): Define for C++23. * include/std/version (__cpp_lib_reference_from_temporary): Define for C++23. * testsuite/20_util/variable_templates_for_traits.cc: Test reference_constructs_from_temporary_v and reference_converts_from_temporary_v. * testsuite/20_util/reference_from_temporary/value.cc: New test. * testsuite/20_util/reference_from_temporary/value2.cc: New test. * testsuite/20_util/reference_from_temporary/version.cc: New test. gcc/testsuite/ChangeLog: * g++.dg/ext/reference_constructs_from_temporary1.C: New test. * g++.dg/ext/reference_converts_from_temporary1.C: New test.

In my previous patches I've been extending our std::move warnings, but this tweak actually dials it down a little bit. As reported in bug 89780, it's questionable to warn about expressions in templates that were type-dependent, but aren't anymore because we're instantiating the template. As in, template <typename T> Dest withMove() { T x; return std::move(x); } template Dest withMove<Dest>(); // #1 template Dest withMove<Source>(); // gcc-mirror#2 Saying that the std::move is pessimizing for #1 is not incorrect, but it's not useful, because removing the std::move would then pessimize gcc-mirror#2. So the user can't really win. At the same time, disabling the warning just because we're in a template would be going too far, I still want to warn for template <typename> Dest withMove() { Dest x; return std::move(x); } because the std::move therein will be pessimizing for any instantiation. So I'm using the suppress_warning machinery to that effect. Problem: I had to add a new group to nowarn_spec_t, otherwise suppressing the -Wpessimizing-move warning would disable a whole bunch of other warnings, which we really don't want. PR c++/89780 gcc/cp/ChangeLog: * pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Maybe suppress -Wpessimizing-move. * typeck.cc (maybe_warn_pessimizing_move): Don't issue warnings if they are suppressed. (check_return_expr): Disable -Wpessimizing-move when returning a dependent expression. gcc/ChangeLog: * diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Handle OPT_Wpessimizing_move and OPT_Wredundant_move. * diagnostic-spec.h (nowarn_spec_t): Add NW_REDUNDANT enumerator. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/Wpessimizing-move3.C: Remove dg-warning. * g++.dg/cpp0x/Wredundant-move2.C: Likewise.

To improve compile times, the C++ library could use compiler built-ins rather than implementing std::is_convertible (and _nothrow) as class templates. This patch adds the built-ins. We already have __is_constructible and __is_assignable, and the nothrow forms of those. Microsoft (and clang, for compatibility) also provide an alias called __is_convertible_to. I did not add it, but it would be trivial to do so. I noticed that our __is_assignable doesn't implement the "Access checks are performed as if from a context unrelated to either type" requirement, therefore std::is_assignable / __is_assignable give two different results here: class S { operator int(); friend void g(); // #1 }; void g () { // #1 doesn't matter static_assert(std::is_assignable<int&, S>::value, ""); static_assert(__is_assignable(int&, S), ""); } This is not a problem if __is_assignable is not meant to be used by the users. This patch doesn't make libstdc++ use the new built-ins, but I had to rename a class otherwise its name would clash with the new built-in. PR c++/106784 gcc/c-family/ChangeLog: * c-common.cc (c_common_reswords): Add __is_convertible and __is_nothrow_convertible. * c-common.h (enum rid): Add RID_IS_CONVERTIBLE and RID_IS_NOTHROW_CONVERTIBLE. gcc/cp/ChangeLog: * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONVERTIBLE and CPTK_IS_NOTHROW_CONVERTIBLE. * cp-objcp-common.cc (names_builtin_p): Handle RID_IS_CONVERTIBLE RID_IS_NOTHROW_CONVERTIBLE. * cp-tree.h (enum cp_trait_kind): Add CPTK_IS_CONVERTIBLE and CPTK_IS_NOTHROW_CONVERTIBLE. (is_convertible): Declare. (is_nothrow_convertible): Likewise. * cxx-pretty-print.cc (pp_cxx_trait_expression): Handle CPTK_IS_CONVERTIBLE and CPTK_IS_NOTHROW_CONVERTIBLE. * method.cc (is_convertible): New. (is_nothrow_convertible): Likewise. * parser.cc (cp_parser_primary_expression): Handle RID_IS_CONVERTIBLE and RID_IS_NOTHROW_CONVERTIBLE. (cp_parser_trait_expr): Likewise. * semantics.cc (trait_expr_value): Handle CPTK_IS_CONVERTIBLE and CPTK_IS_NOTHROW_CONVERTIBLE. (finish_trait_expr): Likewise. libstdc++-v3/ChangeLog: * include/std/type_traits: Rename __is_nothrow_convertible to __is_nothrow_convertible_lib. * testsuite/20_util/is_nothrow_convertible/value_ext.cc: Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Enhance to test __is_convertible and __is_nothrow_convertible. * g++.dg/ext/is_convertible1.C: New test. * g++.dg/ext/is_convertible2.C: New test. * g++.dg/ext/is_nothrow_convertible1.C: New test. * g++.dg/ext/is_nothrow_convertible2.C: New test.

In plenty of image and video processing code it's common to modify pixel values by a widening operation and then scale them back into range by dividing by 255. This patch adds an named function to allow us to emit an optimized sequence when doing an unsigned division that is equivalent to: x = y / (2 ^ (bitsize (y)/2)-1) For SVE2 this means we generate for: void draw_bitmap1(uint8_t* restrict pixel, uint8_t level, int n) { for (int i = 0; i < (n & -16); i+=1) pixel[i] = (pixel[i] * level) / 0xff; } the following: mov z3.b, #1 .L3: ld1b z0.h, p0/z, [x0, x3] mul z0.h, p1/m, z0.h, z2.h addhnb z1.b, z0.h, z3.h addhnb z0.b, z0.h, z1.h st1b z0.h, p0, [x0, x3] inch x3 whilelo p0.h, w3, w2 b.any .L3 instead of: .L3: ld1b z0.h, p1/z, [x0, x3] mul z0.h, p0/m, z0.h, z1.h umulh z0.h, p0/m, z0.h, z2.h lsr z0.h, z0.h, gcc-mirror#7 st1b z0.h, p1, [x0, x3] inch x3 whilelo p1.h, w3, w2 b.any .L3 Which results in significantly faster code. gcc/ChangeLog: * config/aarch64/aarch64-sve2.md (@aarch64_bitmask_udiv<mode>3): New. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve2/div-by-bitmask_1.c: New test.

A friend declaration can only have constraints if it is defined. If multiple instantiations of a class template define the same friend function signature, it's an error, but that shouldn't happen if it's constrained to only be declared in one instantiation. Currently we don't mangle requirements, so the foos all mangle the same and actually instantiating #1 will break, but for now we can test that they're considered distinct. gcc/cp/ChangeLog: * pt.cc (tsubst_friend_function): Check satisfaction. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-friend11.C: New test.

…04617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n#gcc-mirror#2) A(n#gcc-mirror#3) A(n#gcc-mirror#4) A(n#gcc-mirror#5) A(n#gcc-mirror#6) A(n#gcc-mirror#7) A(n#gcc-mirror#8) A(n#gcc-mirror#9) #define C(n) B(n##0) B(n##1) B(n#gcc-mirror#2) B(n#gcc-mirror#3) B(n#gcc-mirror#4) B(n#gcc-mirror#5) B(n#gcc-mirror#6) B(n#gcc-mirror#7) B(n#gcc-mirror#8) B(n#gcc-mirror#9) #define D(n) C(n##0) C(n##1) C(n#gcc-mirror#2) C(n#gcc-mirror#3) C(n#gcc-mirror#4) C(n#gcc-mirror#5) C(n#gcc-mirror#6) C(n#gcc-mirror#7) C(n#gcc-mirror#8) C(n#gcc-mirror#9) #define E(n) D(n##0) D(n##1) D(n#gcc-mirror#2) D(n#gcc-mirror#3) D(n#gcc-mirror#4) D(n#gcc-mirror#5) D(n#gcc-mirror#6) D(n#gcc-mirror#7) D(n#gcc-mirror#8) D(n#gcc-mirror#9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <[email protected]> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive). (cherry picked from commit 2f59f06)

While looking at PR 105549, which is about fixing the ABI break introduced in GCC 9.1 in parameter alignment with bit-fields, we noticed that the GCC 9.1 warning is not emitted in all the cases where it should be. This patch fixes that and the next patch in the series fixes the GCC 9.1 break. We split this into two patches since patch gcc-mirror#2 introduces a new ABI break starting with GCC 13.1. This way, patch #1 can be back-ported to release branches if needed to fix the GCC 9.1 warning issue. The main idea is to add a new global boolean that indicates whether we're expanding the start of a function, so that aarch64_layout_arg can emit warnings for callees as well as callers. This removes the need for aarch64_function_arg_boundary to warn (with its incomplete information). However, in the first patch there are still cases where we emit warnings were we should not; this is fixed in patch gcc-mirror#2 where we can distinguish between GCC 9.1 and GCC.13.1 ABI breaks properly. The fix in aarch64_function_arg_boundary (replacing & with &&) looks like an oversight of a previous commit in this area which changed 'abi_break' from a boolean to an integer. We also take the opportunity to fix the comment above aarch64_function_arg_alignment since the value of the abi_break parameter was changed in a previous commit, no longer matching the description. 2022-11-28 Christophe Lyon <[email protected]> Richard Sandiford <[email protected]> gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix comment. (aarch64_layout_arg): Factorize warning conditions. (aarch64_function_arg_boundary): Fix typo. * function.cc (currently_expanding_function_start): New variable. (expand_function_start): Handle currently_expanding_function_start. * function.h (currently_expanding_function_start): Declare. gcc/testsuite/ChangeLog: * gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning.h: New test. * g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New test. * g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New test. * g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning.h: New test.

This recognises the patterns of the form: while (n & 1) { n >>= 1 } Unfortunately there are currently two issues relating to this patch. Firstly, simplify_using_initial_conditions does not recognise that (n != 0) and ((n & 1) == 0) implies that ((n >> 1) != 0). This preconditions arise following the loop copy-header pass, and the assumptions returned by number_of_iterations_exit_assumptions then prevent final value replacement from using the niter result. I'm not sure what is the best way to fix this - one approach could be to modify simplify_using_initial_conditions to handle this sort of case, but it seems that it basically wants the information that ranger could give anway, so would something like that be a better option? The second issue arises in the vectoriser, which is able to determine that the niter->assumptions are always true. When building with -march=armv8.4-a+sve -S -O3, we get this codegen: foo (unsigned int b) { int c = 0; if (b == 0) return PREC; while (!(b & (1 << (PREC - 1)))) { b <<= 1; c++; } return c; } foo: .LFB0: .cfi_startproc cmp w0, 0 cbz w0, .L6 blt .L7 lsl w1, w0, 1 clz w2, w1 cmp w2, 14 bls .L8 mov x0, 0 cntw x3 add w1, w2, 1 index z1.s, #0, #1 whilelo p0.s, wzr, w1 .L4: add x0, x0, x3 mov p1.b, p0.b mov z0.d, z1.d whilelo p0.s, w0, w1 incw z1.s b.any .L4 add z0.s, z0.s, #1 lastb w0, p1, z0.s ret .p2align 2,,3 .L8: mov w0, 0 b .L3 .p2align 2,,3 .L13: lsl w1, w1, 1 .L3: add w0, w0, 1 tbz w1, gcc-mirror#31, .L13 ret .p2align 2,,3 .L6: mov w0, 32 ret .p2align 2,,3 .L7: mov w0, 0 ret .cfi_endproc In essence, the vectoriser uses the niter information to determine exactly how many iterations of the loop it needs to run. It then uses SVE whilelo instructions to run this number of iterations. The original loop counter is also vectorised, despite only being used in the final iteration, and then the final value of this counter is used as the return value (which is the same as the number of iterations it computed in the first place). This vectorisation is obviously bad, and I think it exposes a latent bug in the vectoriser, rather than being an issue caused by this specific patch. gcc/ChangeLog: * tree-ssa-loop-niter.cc (number_of_iterations_cltz): New. (number_of_iterations_bitcount): Add call to the above. (number_of_iterations_exit_assumptions): Add EQ_EXPR case for c[lt]z idiom recognition. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/cltz-max.c: New test. * gcc.dg/tree-ssa/clz-char.c: New test. * gcc.dg/tree-ssa/clz-int.c: New test. * gcc.dg/tree-ssa/clz-long-long.c: New test. * gcc.dg/tree-ssa/clz-long.c: New test. * gcc.dg/tree-ssa/ctz-char.c: New test. * gcc.dg/tree-ssa/ctz-int.c: New test. * gcc.dg/tree-ssa/ctz-long-long.c: New test. * gcc.dg/tree-ssa/ctz-long.c: New test.

Here the ahead-of-time overload set pruning in finish_call_expr is unintentionally returning a CALL_EXPR whose (pruned) callee is wrapped in an ADDR_EXPR, despite the original callee not being wrapped in an ADDR_EXPR. This ends up causing a bogus declaration mismatch error in the below testcase because the call to min in #1 gets expressed as a CALL_EXPR of ADDR_EXPR of FUNCTION_DECL, whereas the level-lowered call to min in gcc-mirror#2 gets expressed instead as a CALL_EXPR of FUNCTION_DECL. This patch fixes this by stripping the spurious ADDR_EXPR appropriately. Thus the first call to min now also gets expressed as a CALL_EXPR of FUNCTION_DECL, matching the behavior before r12-6075-g2decd2cabe5a4f. PR c++/107461 gcc/cp/ChangeLog: * semantics.cc (finish_call_expr): Strip ADDR_EXPR from the selected callee during overload set pruning. gcc/testsuite/ChangeLog: * g++.dg/template/call9.C: New test.

After r13-5684-g59e0376f607805 the (pruned) callee of a non-dependent CALL_EXPR is a bare FUNCTION_DECL rather than ADDR_EXPR of FUNCTION_DECL. This innocent change revealed that cp_tree_equal doesn't first check dependence of a CALL_EXPR before treating a FUNCTION_DECL callee as a dependent name, which leads to us incorrectly accepting the first two testcases below and rejecting the third: * In the first testcase, cp_tree_equal incorrectly returns true for the two non-dependent CALL_EXPRs f(0) and f(0) (whose CALL_EXPR_FN are different FUNCTION_DECLs) which causes us to treat gcc-mirror#2 as a redeclaration of #1. * Same issue in the second testcase, for f<int*>() and f<char>(). * In the third testcase, cp_tree_equal incorrectly returns true for f<int>() and f<void(*)(int)>() which causes us to conflate the two dependent specializations A<decltype(f<int>()(U()))> and A<decltype(f<void(*)(int)>()(U()))>. This patch fixes this by making called_fns_equal treat two callees as dependent names only if the overall CALL_EXPRs are dependent, via a new convenience function call_expr_dependent_name that is like dependent_name but also checks dependence of the overall CALL_EXPR. PR c++/107461 gcc/cp/ChangeLog: * cp-tree.h (call_expr_dependent_name): Declare. * pt.cc (iterative_hash_template_arg) <case CALL_EXPR>: Use call_expr_dependent_name instead of dependent_name. * tree.cc (call_expr_dependent_name): Define. (called_fns_equal): Adjust to take two CALL_EXPRs instead of CALL_EXPR_FNs thereof. Use call_expr_dependent_name instead of dependent_name. (cp_tree_equal) <case CALL_EXPR>: Adjust call to called_fns_equal. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/overload5.C: New test. * g++.dg/cpp0x/overload5a.C: New test. * g++.dg/cpp0x/overload6.C: New test.

-Wmismatched-tags warns about the (harmless) struct/class mismatch. For, e.g., template<typename T> struct A { }; class A<int> a; it works by adding A<T> to the class2loc hash table while parsing the class-head and then, while parsing the elaborate type-specifier, we add A<int>. At the end of c_parse_file we go through the table and warn about the class-key mismatches. In this PR we crash though; we have template<typename T> struct A { template<typename U> struct W { }; }; struct A<int>::W<int> w; // #1 where while parsing A and #1 we've stashed A<T> A<T>::W<U> A<int>::W<int> into class2loc. Then in class_decl_loc_t::diag_mismatched_tags TYPE is A<int>::W<int>, and specialization_of gets us A<int>::W<U>, which is not in class2loc, so we crash on gcc_assert (cdlguide). But it's OK not to have found A<int>::W<U>, we should just look one "level" up, that is, A<T>::W<U>. It's important to handle class specializations, so e.g. template<> struct A<char> { template<typename U> class W { }; }; where W's class-key is different than in the primary template above, so we should warn depending on whether we're looking into A<char> or into a different instantiation. PR c++/106259 gcc/cp/ChangeLog: * parser.cc (class_decl_loc_t::diag_mismatched_tags): If the first lookup of SPEC didn't find anything, try to look for most_general_template. gcc/testsuite/ChangeLog: * g++.dg/warn/Wmismatched-tags-11.C: New test.

Currently on xstormy16 SImode shifts by a single bit require two instructions, and shifts by other non-zero integer immediate constants require five instructions. This patch implements the obvious optimization that shifts by two bits can be done in four instructions, by using two single-bit sequences. Hence, ashift_2 was previously generated as: mov r7,r2 | shl r2,gcc-mirror#2 | shl r3,gcc-mirror#2 | shr r7,gcc-mirror#14 | or r3,r7 ret and with this patch we now generate: shl r2,#1 | rlc r3,#1 | shl r2,#1 | rlc r3,#1 ret 2023-04-23 Roger Sayle <[email protected]> gcc/ChangeLog * config/stormy16/stormy16.cc (xstormy16_output_shift): Implement SImode shifts by two by performing a single bit SImode shift twice. gcc/testsuite/ChangeLog * gcc.target/xstormy16/shiftsi.c: New test case.

This patch contains some minor tweak to xstormy16's machine description most significantly providing a pattern for HImode rotate left by a single bit that requires only two instructions. unsigned short foo(unsigned short x) { return (x << 1) | (x >> 15); } currently with -O2 generates: foo: mov r7,r2 shr r7,gcc-mirror#15 shl r2,#1 or r2,r7 ret with this patch, GCC now generates: foo: shl r2,#1 | adc r2,#0 ret Additionally neghi2 is converted to a define_insn (so that the RTL optimizers see the negation semantics), and HImode rotations by 8-bits can now be recognized and implemented using swpb. 2023-04-29 Roger Sayle <[email protected]> gcc/ChangeLog * config/stormy16/stormy16.md (neghi2): Convert from a define_expand to a define_insn. (*rotatehi_1): New define_insn for efficient 2 insn sequence. (*rotatehi_8, *rotaterthi_8): New define_insn to emit a swpb. gcc/testsuite/ChangeLog * gcc.target/xstormy16/neghi2.c: New test case. * gcc.target/xstormy16/rotatehi-1.c: Likewise.

I noticed that for member class templates of a class template we were unnecessarily substituting both the template and its type. Avoiding that duplication speeds compilation of this silly testcase from ~12s to ~9s on my laptop. It's unlikely to make a difference on any real code, but the simplification is also nice. We still need to clear CLASSTYPE_USE_TEMPLATE on the partial instantiation of the template class, but it makes more sense to do that in tsubst_template_decl anyway. #define NC(X) \ template <class U> struct X##1; \ template <class U> struct X#gcc-mirror#2; \ template <class U> struct X#gcc-mirror#3; \ template <class U> struct X#gcc-mirror#4; \ template <class U> struct X#gcc-mirror#5; \ template <class U> struct X#gcc-mirror#6; #define NC2(X) NC(X##a) NC(X##b) NC(X##c) NC(X##d) NC(X##e) NC(X##f) #define NC3(X) NC2(X##A) NC2(X##B) NC2(X##C) NC2(X##D) NC2(X##E) template <int I> struct A { NC3(am) }; template <class...Ts> void sink(Ts...); template <int...Is> void g() { sink(A<Is>()...); } template <int I> void f() { g<__integer_pack(I)...>(); } int main() { f<1000>(); } gcc/cp/ChangeLog: * pt.cc (instantiate_class_template): Skip the RECORD_TYPE of a class template. (tsubst_template_decl): Clear CLASSTYPE_USE_TEMPLATE.

…D_EXPR) using SVE. Given this input code: int sum_abs (uint8_t *restrict x, uint8_t *restrict y, int n) { int sum = 0; for (int i = 0; i < n; i++) { sum += __builtin_abs (x[i] - y[i]); } return sum; } The resulting SVE code is: 0000000000000000 <sum_abs>: 0: 7100005f cmp w2, #0x0 4: 5400026d b.le 50 <sum_abs+0x50> 8: d2800003 mov x3, #0x0 // #0 c: 93407c42 sxtw x2, w2 10: 2538c002 mov z2.b, #0 14: 25221fe0 whilelo p0.b, xzr, x2 18: 2538c023 mov z3.b, #1 1c: 2518e3e1 ptrue p1.b 20: a4034000 ld1b {z0.b}, p0/z, [x0, x3] 24: a4034021 ld1b {z1.b}, p0/z, [x1, x3] 28: 0430e3e3 incb x3 2c: 0520c021 sel z1.b, p0, z1.b, z0.b 30: 25221c60 whilelo p0.b, x3, x2 34: 040d0420 uabd z0.b, p1/m, z0.b, z1.b 38: 44830402 udot z2.s, z0.b, z3.b 3c: 54ffff21 b.ne 20 <sum_abs+0x20> // b.any 40: 2598e3e0 ptrue p0.s 44: 04812042 uaddv d2, p0, z2.s 48: 1e260040 fmov w0, s2 4c: d65f03c0 ret 50: 1e2703e2 fmov s2, wzr 54: 1e260040 fmov w0, s2 58: d65f03c0 ret Notice how udot is used inside a fully masked loop. gcc/Changelog: 2019-05-07 Alejandro Martinez <[email protected]> * config/aarch64/aarch64-sve.md (<su>abd<mode>_3): New define_expand. (aarch64_<su>abd<mode>_3): Likewise. (*aarch64_<su>abd<mode>_3): New define_insn. (<sur>sad<vsi2qi>): New define_expand. * config/aarch64/iterators.md: Added MAX_OPP attribute. * tree-vect-loop.c (use_mask_by_cond_expr_p): Add SAD_EXPR. (build_vect_cond_expr): Likewise. gcc/testsuite/Changelog: 2019-05-07 Alejandro Martinez <[email protected]> * gcc.target/aarch64/sve/sad_1.c: New test for sum of absolute differences. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@270975 138bc75d-0d04-0410-961f-82ee72b054a4

Here during overload resolution we have two strictly viable ambiguous candidates #1 and gcc-mirror#2, and two non-strictly viable candidates gcc-mirror#3 and gcc-mirror#4 which we hold on to ever since r14-6522. These latter candidates have an empty second arg conversion since the first arg conversion was deemed bad, and this trips up joust when called on gcc-mirror#3 and gcc-mirror#4 which assumes all arg conversions are there. We can fix this by making joust robust to empty arg conversions, but in this situation we shouldn't need to compare gcc-mirror#3 and gcc-mirror#4 at all given that we have a strictly viable candidate. To that end, this patch makes tourney shortcut considering non-strictly viable candidates upon encountering ambiguity between two strictly viable candidates (taking advantage of the fact that the candidates list is sorted according to viability via splice_viable). PR c++/115239 gcc/cp/ChangeLog: * call.cc (tourney): Don't consider a non-strictly viable candidate as the champ if there was ambiguity between two strictly viable candidates. gcc/testsuite/ChangeLog: * g++.dg/overload/error7.C: New test. Reviewed-by: Jason Merrill <[email protected]>

rodgert added 8 commits January 21, 2018 11:34

WIP implement stand::post

526f87d

WIP Cleaning up strand::_Runnable implementation

5d6e601

Unwinding previous change

d9b6db1

Introduced a race condition

WIP Updating strand ctors, dtor, running_in_this_thread()

d3d7e99

WIP Update _M_running_on each time _Runnable runs

a67f143

WIP Move post() outside of lock in strand::_Runnable

5ea764a

WIP Narrow scope of lock in strand::post()

3e61e0b

Move post of runnable to underlying executor outside lock

Remove local state refs

47ec4ec

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DO NOT MERGE, pleas review: WIP implement strand::post #1

DO NOT MERGE, pleas review: WIP implement strand::post #1

rodgert commented Jan 21, 2018

DO NOT MERGE, pleas review: WIP implement strand::post #1

Are you sure you want to change the base?

DO NOT MERGE, pleas review: WIP implement strand::post #1

Conversation

rodgert commented Jan 21, 2018