Gabriel39
Contributor

What problem does this PR solve?

Starting the BE in local mode crashes with the following SIGSEGV:
*** Query id: d82def8526424e69-b00c82d310ccaa1d ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1757020283 (unix time) try "date -d @1757020283" if you are using GNU date ***
*** Current BE git commitID: 0cbb0bf ***
*** SIGSEGV unknown detail explain (@0x0) received by PID 8317 (TID 10279 OR 0x7fa9b984b640) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:420
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
3# 0x00007FAD203E7520 in /lib/x86_64-linux-gnu/libc.so.6
4# doris::pipeline::PriorityTaskQueue::push(std::shared_ptr) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/task_queue.cpp:129
5# doris::pipeline::MultiCoreTaskQueue::push_back(std::shared_ptr, int) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/task_queue.cpp:204
6# doris::pipeline::MultiCoreTaskQueue::push_back(std::shared_ptr) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/task_queue.cpp:198
7# doris::pipeline::TaskScheduler::submit(std::shared_ptr) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/task_scheduler.cpp:74
8# doris::pipeline::HybridTaskScheduler::submit(std::shared_ptr) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/task_scheduler.cpp:189
9# doris::pipeline::PipelineTask::wake_up(doris::pipeline::Dependency*) at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/pipeline_task.cpp:803
10# doris::pipeline::Dependency::set_ready() at /home/zcp/repo_center/doris_master/doris/be/src/pipeline/dependency.cpp:88
11# doris::vectorized::VDataStreamRecvr::SenderQueue::add_block(std::unique_ptr >, int, long, google::protobuf::Closure**, long, unsigned long) at /home/zcp/repo_center/doris_master/doris/be/src/vec/runtime/vdata_stream_recvr.cpp:194
12# doris::vectorized::VDataStreamRecvr::add_block(std::unique_ptr >, int, int, long, google::protobuf::Closure**, long, unsigned long) at /home/zcp/repo_center/doris_master/doris/be/src/vec/runtime/vdata_stream_recvr.cpp:403
13# doris::vectorized::VDataStreamMgr::transmit_block(doris::PTransmitDataParams const*, google::protobuf::Closure**, long) at /home/zcp/repo_center/doris_master/doris/be/src/vec/runtime/vdata_stream_mgr.cpp:167
14# doris::PInternalService::_transmit_block(google::protobuf::RpcController*, doris::PTransmitDataParams const*, doris::PTransmitDataResult*, google::protobuf::Closure*, doris::Status const&, long) in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
15# doris::PInternalService::transmit_block(google::protobuf::RpcController*, doris::PTransmitDataParams const*, doris::PTransmitDataResult*, google::protobuf::Closure*) at /home/zcp/repo_center/doris_master/doris/be/src/service/internal_service.cpp:1624
16# brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*) in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
17# brpc::ProcessInputMessage(void*) in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
18# bthread::TaskGroup::task_runner(long) in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
19# bthread_make_fcontext in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
172.20.57.180 last coredump sql: 2025-09-05 05:11:54,768 [query] Query d82def8526424e69-b00c82d310ccaa1d 1 times with new query id: fc55dbd3bbf24d9d-b18012e9210c6be0

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Gabriel39
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34210 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b48ed6a6b17bd7e7ecbccac7987a0ff0b8b2b1dc, data reload: false

------ Round 1 ----------------------------------
q1	17595	5246	5112	5112
q2	1982	327	216	216
q3	10245	1270	704	704
q4	10243	1016	530	530
q5	7566	2340	2373	2340
q6	186	169	144	144
q7	944	736	639	639
q8	9336	1349	1111	1111
q9	6971	5304	5168	5168
q10	6951	2377	2020	2020
q11	482	308	287	287
q12	360	367	233	233
q13	17785	3707	3065	3065
q14	259	247	240	240
q15	580	506	481	481
q16	435	440	379	379
q17	606	870	363	363
q18	7594	7154	7025	7025
q19	1447	958	599	599
q20	362	339	237	237
q21	3670	3181	2345	2345
q22	1046	1056	972	972
Total cold run time: 106645 ms
Total hot run time: 34210 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5203	5152	5172	5152
q2	249	331	229	229
q3	2174	2681	2271	2271
q4	1357	1780	1332	1332
q5	4222	4396	4520	4396
q6	219	171	135	135
q7	2087	2018	1891	1891
q8	2701	2609	2638	2609
q9	7383	7412	7257	7257
q10	3099	3327	2923	2923
q11	559	514	506	506
q12	695	817	652	652
q13	3570	3896	3384	3384
q14	286	299	307	299
q15	543	507	648	507
q16	523	491	441	441
q17	1171	1532	1424	1424
q18	7876	7741	7432	7432
q19	908	869	924	869
q20	2048	2055	1861	1861
q21	4728	4316	4362	4316
q22	1089	1048	1013	1013
Total cold run time: 52690 ms
Total hot run time: 50899 ms

@doris-robot

TPC-DS: Total hot run time: 187345 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b48ed6a6b17bd7e7ecbccac7987a0ff0b8b2b1dc, data reload: false

query1	1065	440	424	424
query2	6577	1731	1722	1722
query3	6750	224	224	224
query4	26438	23357	23128	23128
query5	4410	650	528	528
query6	349	253	228	228
query7	4653	524	300	300
query8	333	280	281	280
query9	8676	2894	2889	2889
query10	486	356	294	294
query11	15380	15103	14694	14694
query12	173	127	125	125
query13	1709	571	441	441
query14	9222	5811	5863	5811
query15	218	182	174	174
query16	7732	672	505	505
query17	1255	732	679	679
query18	2046	445	341	341
query19	208	209	174	174
query20	131	162	127	127
query21	216	131	114	114
query22	4100	4301	4072	4072
query23	33952	32935	33191	32935
query24	8191	2379	2422	2379
query25	577	522	475	475
query26	1246	278	172	172
query27	2713	513	368	368
query28	4405	2273	2239	2239
query29	778	600	489	489
query30	298	226	197	197
query31	910	836	735	735
query32	96	81	79	79
query33	581	404	354	354
query34	817	869	544	544
query35	862	840	766	766
query36	993	1015	938	938
query37	131	110	94	94
query38	4142	4038	3992	3992
query39	1487	1478	1441	1441
query40	219	135	127	127
query41	62	62	62	62
query42	140	126	124	124
query43	526	515	470	470
query44	1384	881	870	870
query45	185	181	178	178
query46	868	1014	655	655
query47	1800	1819	1777	1777
query48	390	433	327	327
query49	764	547	415	415
query50	663	700	419	419
query51	4208	4152	4130	4130
query52	120	118	111	111
query53	251	277	212	212
query54	623	615	541	541
query55	103	97	94	94
query56	326	347	353	347
query57	1202	1233	1145	1145
query58	295	287	282	282
query59	2702	2687	2617	2617
query60	372	359	356	356
query61	174	162	156	156
query62	834	764	670	670
query63	239	202	205	202
query64	4473	1193	895	895
query65	4289	4260	4260	4260
query66	1117	453	345	345
query67	15453	15410	15328	15328
query68	7970	951	582	582
query69	506	335	297	297
query70	1213	1170	1094	1094
query71	576	365	322	322
query72	5866	5084	5255	5084
query73	707	655	364	364
query74	8954	9142	8994	8994
query75	3445	3182	2598	2598
query76	3424	1226	746	746
query77	660	422	343	343
query78	9478	9586	8880	8880
query79	2499	817	610	610
query80	648	608	519	519
query81	474	263	232	232
query82	460	145	110	110
query83	290	270	240	240
query84	306	118	93	93
query85	879	469	422	422
query86	394	317	303	303
query87	4373	4359	4184	4184
query88	3655	2260	2253	2253
query89	410	342	308	308
query90	1893	231	231	231
query91	167	169	182	169
query92	92	79	77	77
query93	1987	999	648	648
query94	730	430	328	328
query95	408	339	345	339
query96	489	591	285	285
query97	2612	2695	2586	2586
query98	247	223	224	223
query99	1454	1470	1295	1295
Total cold run time: 275354 ms
Total hot run time: 187345 ms

@doris-robot

ClickBench: Total hot run time: 29.94 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b48ed6a6b17bd7e7ecbccac7987a0ff0b8b2b1dc, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.06
query3	0.26	0.08	0.08
query4	1.60	0.12	0.12
query5	0.44	0.41	0.42
query6	1.19	0.64	0.65
query7	0.04	0.03	0.03
query8	0.06	0.05	0.05
query9	0.62	0.53	0.52
query10	0.57	0.58	0.57
query11	0.17	0.11	0.11
query12	0.15	0.12	0.12
query13	0.64	0.63	0.62
query14	0.81	0.83	0.83
query15	0.88	0.85	0.89
query16	0.40	0.41	0.39
query17	1.04	1.07	1.05
query18	0.22	0.21	0.21
query19	1.99	1.86	1.83
query20	0.02	0.01	0.01
query21	15.40	0.95	0.58
query22	0.80	1.17	0.81
query23	14.76	1.39	0.64
query24	6.84	0.69	1.03
query25	0.57	0.32	0.09
query26	0.55	0.17	0.13
query27	0.06	0.06	0.06
query28	9.87	0.91	0.44
query29	12.59	3.86	3.25
query30	0.28	0.14	0.13
query31	2.83	0.60	0.39
query32	3.24	0.56	0.49
query33	3.06	3.13	3.09
query34	16.21	5.51	4.82
query35	4.93	4.97	4.92
query36	0.69	0.54	0.52
query37	0.11	0.07	0.07
query38	0.06	0.04	0.05
query39	0.04	0.03	0.02
query40	0.19	0.15	0.14
query41	0.09	0.03	0.02
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 104.51 s
Total hot run time: 29.94 s

@doris-robot

BE UT Coverage Report

Increment line coverage 32.35% (11/34) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 51.88% (17230/33213)
Line Coverage 37.28% (157220/421765)
Region Coverage 31.92% (120035/376024)
Branch Coverage 33.29% (52677/158232)

@Gabriel39
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34977 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 781fb859b138ba8ab7a2c61138aab705a05d0848, data reload: false

------ Round 1 ----------------------------------
q1	17627	5365	5194	5194
q2	2025	320	214	214
q3	10250	1371	748	748
q4	10236	1059	515	515
q5	7576	2520	2576	2520
q6	203	187	142	142
q7	1015	768	646	646
q8	9365	1488	1334	1334
q9	6874	5326	5344	5326
q10	7011	2406	1990	1990
q11	508	309	270	270
q12	362	384	230	230
q13	17771	3798	3047	3047
q14	240	249	227	227
q15	600	502	480	480
q16	436	443	392	392
q17	603	908	382	382
q18	7422	7104	7071	7071
q19	1375	1120	610	610
q20	356	358	236	236
q21	3823	3312	2426	2426
q22	1096	1013	977	977
Total cold run time: 106774 ms
Total hot run time: 34977 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5163	5404	5420	5404
q2	259	377	231	231
q3	2217	2697	2296	2296
q4	1370	1803	1335	1335
q5	4437	4531	4525	4525
q6	264	188	140	140
q7	2118	1990	1788	1788
q8	2847	2854	2866	2854
q9	7376	7391	7283	7283
q10	3163	3364	2903	2903
q11	621	525	496	496
q12	776	810	608	608
q13	3639	3946	3301	3301
q14	297	317	287	287
q15	534	500	486	486
q16	469	527	450	450
q17	1231	1708	1485	1485
q18	7980	7582	7508	7508
q19	1113	892	1003	892
q20	2019	2060	1940	1940
q21	4907	4481	4421	4421
q22	1117	1036	972	972
Total cold run time: 53917 ms
Total hot run time: 51605 ms

@doris-robot

TPC-DS: Total hot run time: 185984 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 781fb859b138ba8ab7a2c61138aab705a05d0848, data reload: false

query1	1058	428	406	406
query2	6554	1710	1682	1682
query3	6767	230	221	221
query4	26101	23766	22923	22923
query5	4899	668	554	554
query6	362	257	240	240
query7	4693	535	332	332
query8	336	279	266	266
query9	9251	2986	2983	2983
query10	524	375	303	303
query11	15739	15246	14766	14766
query12	172	122	120	120
query13	1691	556	447	447
query14	9408	5760	5802	5760
query15	213	191	176	176
query16	7703	669	483	483
query17	1219	749	675	675
query18	2037	426	357	357
query19	202	197	184	184
query20	137	130	124	124
query21	204	141	113	113
query22	4278	4097	3999	3999
query23	33613	32925	32761	32761
query24	8059	2343	2394	2343
query25	566	523	450	450
query26	914	283	167	167
query27	2701	510	358	358
query28	4336	2250	2229	2229
query29	714	611	488	488
query30	305	217	199	199
query31	915	792	736	736
query32	88	88	80	80
query33	593	409	374	374
query34	805	865	538	538
query35	808	848	736	736
query36	997	1009	918	918
query37	127	114	93	93
query38	4111	4077	4065	4065
query39	1481	1429	1467	1429
query40	231	145	137	137
query41	74	67	66	66
query42	141	123	123	123
query43	529	529	511	511
query44	1335	861	862	861
query45	185	178	177	177
query46	866	1153	639	639
query47	1799	1797	1756	1756
query48	407	424	329	329
query49	705	505	399	399
query50	639	688	420	420
query51	4650	4293	4418	4293
query52	115	115	108	108
query53	242	273	202	202
query54	605	604	542	542
query55	95	91	101	91
query56	405	348	324	324
query57	1214	1216	1139	1139
query58	290	283	277	277
query59	2552	2647	2629	2629
query60	365	352	351	351
query61	164	159	156	156
query62	841	738	676	676
query63	229	193	189	189
query64	3613	1125	826	826
query65	4278	4225	4177	4177
query66	899	459	347	347
query67	15822	15097	14906	14906
query68	8446	932	583	583
query69	501	328	294	294
query70	1240	1167	1147	1147
query71	580	355	317	317
query72	5931	4968	5099	4968
query73	759	663	361	361
query74	8972	8973	8643	8643
query75	3880	3070	2665	2665
query76	3606	1150	723	723
query77	799	431	341	341
query78	9565	9844	8807	8807
query79	2337	816	600	600
query80	651	631	509	509
query81	504	254	226	226
query82	508	141	111	111
query83	262	268	259	259
query84	263	111	92	92
query85	929	465	424	424
query86	392	342	311	311
query87	4215	4290	4264	4264
query88	3429	2221	2225	2221
query89	399	350	299	299
query90	1913	226	226	226
query91	166	165	145	145
query92	93	75	71	71
query93	1725	975	655	655
query94	674	411	333	333
query95	414	333	337	333
query96	488	585	283	283
query97	2645	2688	2542	2542
query98	246	224	216	216
query99	1330	1426	1286	1286
Total cold run time: 275833 ms
Total hot run time: 185984 ms

@doris-robot

ClickBench: Total hot run time: 29.72 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 781fb859b138ba8ab7a2c61138aab705a05d0848, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.05
query3	0.26	0.08	0.08
query4	1.61	0.12	0.13
query5	0.45	0.42	0.44
query6	1.17	0.65	0.65
query7	0.03	0.03	0.02
query8	0.06	0.04	0.04
query9	0.60	0.54	0.53
query10	0.58	0.58	0.58
query11	0.16	0.12	0.11
query12	0.15	0.12	0.12
query13	0.62	0.63	0.62
query14	0.79	0.83	0.83
query15	0.87	0.84	0.86
query16	0.39	0.40	0.39
query17	1.04	1.05	1.02
query18	0.21	0.21	0.19
query19	1.93	1.82	1.84
query20	0.02	0.01	0.02
query21	15.43	0.97	0.59
query22	0.80	1.14	0.68
query23	14.92	1.38	0.61
query24	6.58	1.16	0.64
query25	0.52	0.26	0.14
query26	0.60	0.16	0.14
query27	0.08	0.05	0.05
query28	10.32	0.94	0.42
query29	12.63	3.91	3.24
query30	0.29	0.13	0.14
query31	2.84	0.60	0.38
query32	3.23	0.59	0.48
query33	3.07	3.13	3.07
query34	16.06	5.45	4.91
query35	4.99	4.86	4.94
query36	0.69	0.50	0.51
query37	0.11	0.07	0.08
query38	0.07	0.05	0.05
query39	0.04	0.03	0.04
query40	0.18	0.16	0.15
query41	0.09	0.04	0.03
query42	0.04	0.03	0.02
query43	0.05	0.04	0.04
Total cold run time: 104.71 s
Total hot run time: 29.72 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 34.38% (11/32) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 51.88% (17230/33213)
Line Coverage 37.27% (157211/421766)
Region Coverage 31.90% (119968/376018)
Branch Coverage 33.29% (52672/158232)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (32/32) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 70.61% (23028/32613)
Line Coverage 56.98% (240167/421490)
Region Coverage 52.31% (199497/381405)
Branch Coverage 54.02% (85963/159132)

if (_on_blocking_scheduler) {
    _tracking.blocking_thread_id = thread_id;
} else {
    _tracking.simple_thread_id = thread_id;
Contributor

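Don't do this. A pipeline task should not itself be aware of which scheduler it is running in; the structure of the hybrid task scheduler may change at any time. Consider taking the core id the task has recorded modulo the number of sub-queues in the target queue on every push, which avoids the core dump.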
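
The suggestion in this review comment (reduce the recorded core id modulo the number of sub-queues in the target queue) could be sketched roughly as below. This is a minimal illustration, not the Doris implementation: `Task`, `MultiQueue`, `pick_sub_queue` and `size_of` are hypothetical names, and the fallback to sub-queue 0 for a task with no recorded core id is an assumption made only to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pipeline task. It only remembers the core id
// it last ran on; it does not know which scheduler owns it.
struct Task {
    int core_id = -1;  // -1 means "no core recorded yet" (assumption)
};

// Hypothetical multi-core queue with one sub-queue per core.
class MultiQueue {
public:
    explicit MultiQueue(std::size_t num_sub_queues) : sub_queues_(num_sub_queues) {
        assert(num_sub_queues > 0);
    }

    // Map any recorded core id into [0, number of sub-queues), so an id that
    // was recorded against a larger queue can never index out of range here.
    std::size_t pick_sub_queue(int recorded_core_id) const {
        if (recorded_core_id < 0) {
            return 0;  // assumption: tasks without a recorded core go to sub-queue 0
        }
        return static_cast<std::size_t>(recorded_core_id) % sub_queues_.size();
    }

    // The queue, not the task, decides where the task lands.
    void push_back(std::shared_ptr<Task> task) {
        const std::size_t idx = pick_sub_queue(task->core_id);
        sub_queues_[idx].push_back(std::move(task));
    }

    std::size_t size_of(std::size_t idx) const { return sub_queues_.at(idx).size(); }

private:
    std::vector<std::deque<std::shared_ptr<Task>>> sub_queues_;
};
```

With this shape the task carries a single core id and each target queue normalizes it on every push, so the task needs no per-scheduler state such as `_on_blocking_scheduler`.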

@Gabriel39 Gabriel39 closed this Sep 15, 2025