app_rpt: Acquire blocklock mutex when hard hanging up link channel. #460

Draft: wants to merge 1 commit into base: master

InterLinked1
Member

Resolves: #459

@mkmer
Collaborator

mkmer commented Jan 20, 2025

I needed to add myrpt to all of the hangup_link_chan() calls. After that, it still crashes with a ton of these (see end of message).
This appears to be a direct result of calling ast_check_hangup() here:

if (!l->chan || ast_check_hangup(chan)) {

If I remove it, this crash goes away and we get a different error/crash (same as case 2 in the issue comments):

[2025-01-20 12:45:04.793] WARNING[756092][C-00000023]: app_rpt/rpt_channel.c:470 send_newkey: Failed to send text !NEWKEY1! on IAX2/149.154.11.243:24141-4553
[2025-01-20 12:45:04.795] WARNING[756092][C-00000023]: app_rpt/rpt_channel.c:470 send_newkey: Failed to send text !NEWKEY1! on IAX2/149.154.11.243:24141-4553
  == Spawn extension (radio-secure, 287893, 1) exited non-zero on 'IAX2/149.154.11.243:24141-4553'
    -- Hungup 'IAX2/149.154.11.243:24141-4553'

Ton of errors to follow:

[2025-01-20 12:34:13.402] ERROR[740647]: channel.c:3091 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.404] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x32b) [0x55d8b0f5b4db]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.404] ERROR[740647]: channel.c:3118 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.405] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x47c) [0x55d8b0f5b62c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.405] ERROR[740647]: channel.c:3120 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.407] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x4ad) [0x55d8b0f5b65d]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.407] ERROR[740647]: channel.c:3039 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.408] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0xbc) [0x55d8b0f5b26c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.408] ERROR[740647]: channel.c:3055 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.410] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x167) [0x55d8b0f5b317]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.410] ERROR[740647]: channel.c:3084 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.411] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x248) [0x55d8b0f5b3f8]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.411] ERROR[740647]: channel.c:3091 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.413] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x32b) [0x55d8b0f5b4db]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.413] ERROR[740647]: channel.c:3118 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.414] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x47c) [0x55d8b0f5b62c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.414] ERROR[740647]: channel.c:3120 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.416] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x4ad) [0x55d8b0f5b65d]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.416] ERROR[740647]: channel.c:3039 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.417] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0xbc) [0x55d8b0f5b26c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.417] ERROR[740647]: channel.c:3055 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.419] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x167) [0x55d8b0f5b317]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.419] ERROR[740647]: channel.c:3084 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.420] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x248) [0x55d8b0f5b3f8]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.420] ERROR[740647]: channel.c:3091 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.422] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x32b) [0x55d8b0f5b4db]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.422] ERROR[740647]: channel.c:3118 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.423] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x47c) [0x55d8b0f5b62c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.423] ERROR[740647]: channel.c:3120 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.424] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x4ad) [0x55d8b0f5b65d]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.425] ERROR[740647]: channel.c:3039 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.426] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0xbc) [0x55d8b0f5b26c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.426] ERROR[740647]: channel.c:3055 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.427] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x167) [0x55d8b0f5b317]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.427] ERROR[740647]: channel.c:3084 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.428] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x248) [0x55d8b0f5b3f8]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.428] ERROR[740647]: channel.c:3091 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.429] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x32b) [0x55d8b0f5b4db]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.429] ERROR[740647]: channel.c:3118 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.430] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x47c) [0x55d8b0f5b62c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.430] ERROR[740647]: channel.c:3120 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.431] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x4ad) [0x55d8b0f5b65d]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.431] ERROR[740647]: channel.c:3039 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.432] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0xbc) [0x55d8b0f5b26c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.432] ERROR[740647]: channel.c:3055 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.432] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x167) [0x55d8b0f5b317]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.432] ERROR[740647]: channel.c:3084 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.433] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x248) [0x55d8b0f5b3f8]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.433] ERROR[740647]: channel.c:3091 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.434] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x32b) [0x55d8b0f5b4db]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.434] ERROR[740647]: channel.c:3118 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.435] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x47c) [0x55d8b0f5b62c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.435] ERROR[740647]: channel.c:3120 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.435] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x4ad) [0x55d8b0f5b65d]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.435] ERROR[740647]: channel.c:3039 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.436] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0xbc) [0x55d8b0f5b26c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.436] ERROR[740647]: channel.c:3055 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.437] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x167) [0x55d8b0f5b317]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.437] ERROR[740647]: channel.c:3084 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.437] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x248) [0x55d8b0f5b3f8]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.437] ERROR[740647]: channel.c:3091 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.438] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x32b) [0x55d8b0f5b4db]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.438] ERROR[740647]: channel.c:3118 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.438] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x47c) [0x55d8b0f5b62c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.438] ERROR[740647]: channel.c:3120 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.439] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x4ad) [0x55d8b0f5b65d]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.439] ERROR[740647]: channel.c:3039 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.440] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0xbc) [0x55d8b0f5b26c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.440] ERROR[740647]: channel.c:3055 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.440] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x167) [0x55d8b0f5b317]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.440] ERROR[740647]: channel.c:3084 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.441] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x248) [0x55d8b0f5b3f8]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.441] ERROR[740647]: channel.c:3091 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.441] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x32b) [0x55d8b0f5b4db]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.441] ERROR[740647]: channel.c:3118 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.442] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x47c) [0x55d8b0f5b62c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.442] ERROR[740647]: channel.c:3120 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.442] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x4ad) [0x55d8b0f5b65d]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.442] ERROR[740647]: channel.c:3039 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.443] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_lock+0x1a3) [0x55d8b0f1a823]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0xbc) [0x55d8b0f5b26c]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

[2025-01-20 12:34:13.443] ERROR[740647]: channel.c:3055 ast_waitfor_nandfds: FRACK!, Failed assertion bad magic number 0x0 for object 0x7ff7a80014f0 (0)
[2025-01-20 12:34:13.443] ERROR[740647]:   Got 7 backtrace records
# 0: /usr/sbin/asterisk(__ao2_unlock+0x13a) [0x55d8b0f1a96a]
# 1: /usr/sbin/asterisk(ast_waitfor_nandfds+0x167) [0x55d8b0f5b317]
# 2: /usr/sbin/asterisk(ast_waitfor_n+0x14) [0x55d8b0f5bb74]
# 3: /usr/lib/asterisk/modules/app_rpt.so(+0x1fdbe) [0x7ff7d4a1fdbe]
# 4: /usr/sbin/asterisk(+0x1bf26c) [0x55d8b106d26c]
# 5: /lib/x86_64-linux-gnu/libc.so.6(+0x891c4) [0x7ff7d86bf1c4]
# 6: /lib/x86_64-linux-gnu/libc.so.6(+0x10985c) [0x7ff7d873f85c]

@InterLinked1
Member Author

What's line 3091 in your channel.c?

@mkmer
Collaborator

mkmer commented Jan 20, 2025

3091 looks like this:

        for (x = 0; x < n; x++) {
                ast_channel_lock(c[x]);
                for (y = 0; y < ast_channel_fd_count(c[x]); y++) {
                        fdmap[max].fdno = y;  /* fd y is linked to this pfds */
                        fdmap[max].chan = x;  /* channel x is linked to this pfds */
                        max += ast_add_fd(&pfds[max], ast_channel_fd(c[x], y));
                }
                CHECK_BLOCKING(c[x]);
                ast_channel_unlock(c[x]);
        }

ast_channel_unlock() is at 3091

@mkmer
Collaborator

mkmer commented Jan 20, 2025

I had a theory that we are trying to hang up the channel twice: once due to return -1; and once due to the softhangup() code.

Commenting out softhangup stops the crash when we execute ast_check_hangup(chan), but we still crash with the return -1;

Wish I understood what the return codes are used for. I did a bit of digging and can't find any docs.

@mkmer
Collaborator

mkmer commented Jan 20, 2025

SO, changing this:

ast_softhangup(chan, AST_SOFTHANGUP_EXPLICIT);

to ast_softhangup_nolock() AND

if (!l->chan || ast_check_hangup(chan)) {

removing ast_check_hangup(chan) but still returning -1 also prevents a crash.

Again, I'm not sure if it's the "right way", but it doesn't crash :)

@mkmer
Collaborator

mkmer commented Jan 20, 2025

This is the set of changes (probably clearer this way):
https://github.com/mkmer/app_rpt/tree/hangup_crash_fix

@InterLinked1
Member Author

SO, changing this:

ast_softhangup(chan, AST_SOFTHANGUP_EXPLICIT);

to ast_softhangup_nolock() AND

if (!l->chan || ast_check_hangup(chan)) {

removing ast_check_hangup(chan) but still returning -1 also prevents a crash.

Again, I'm not sure if it's the "right way", but it doesn't crash :)

I don't think ast_softhangup_nolock is correct. It's intended for use where the channel is already locked, which isn't the case here.

To be clear, the channel may be locked by another thread, which seems probable if it's being serviced by another thread too. But the nolock variant is for when the current thread has already locked the channel, which doesn't appear to be the case here.

ast_check_hangup doesn't acquire the channel lock, so using ast_check_hangup_locked may be the more appropriate thing to do in that case. Could you scrap both of our changes from HEAD thus far and try changing your ast_check_hangup to ast_check_hangup_locked to see if that helps?
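To illustrate the locking convention described above, here is a minimal sketch in C. It is a toy model, not Asterisk's implementation: `struct toy_chan` and every `toy_` function are hypothetical names invented for this example, and a plain pthread mutex stands in for the channel lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a channel: a lock plus a "hangup requested" flag. */
struct toy_chan {
    pthread_mutex_t lock;
    int softhangup; /* non-zero once a hangup has been requested */
};

static void toy_chan_init(struct toy_chan *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->softhangup = 0;
}

/* "nolock" variant: the CALLER must already hold c->lock. */
static void toy_softhangup_nolock(struct toy_chan *c, int cause)
{
    assert(cause != 0);
    c->softhangup |= cause;
}

/* Plain variant: the caller must NOT hold c->lock; this wrapper takes it. */
static void toy_softhangup(struct toy_chan *c, int cause)
{
    pthread_mutex_lock(&c->lock);
    toy_softhangup_nolock(c, cause);
    pthread_mutex_unlock(&c->lock);
}

/* Unlocked check: reads the flag without taking the lock. */
static bool toy_check_hangup(const struct toy_chan *c)
{
    return c->softhangup != 0;
}

/* "locked" variant: takes the lock around the same check. */
static bool toy_check_hangup_locked(struct toy_chan *c)
{
    bool res;

    pthread_mutex_lock(&c->lock);
    res = toy_check_hangup(c);
    pthread_mutex_unlock(&c->lock);
    return res;
}
```

In this sketch, a thread that does not hold the lock would call `toy_softhangup()` and `toy_check_hangup_locked()`; the `_nolock` and unlocked forms are only meant for code that already holds the lock.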

@mkmer
Collaborator

mkmer commented Jan 21, 2025

That didn't work either.
I think there are two crash sources (this is my theory anyway - having a difficult time following the Asterisk side).

#1: We have a channel hangup message, which caused softhangup to execute while rpt_exec is still working. When we return -1; in rpt_exec, the Asterisk side of things tries to hang up the channel and hits a deadlock due to the softhangup? (Or it force hangs up stuff.) This is why return 0; or ast_softhangup_nolock() makes the crash go away (maybe?).

#2: Simply testing the channel with ast_check_hangup crashes. This one really has me lost, unless chan has already become NULL (I don't know). Looking at the Asterisk function, it makes no sense.

Is there a description of the expected return codes from rpt_exec (or any xxx_exec) somewhere? I've hunted a bit and don't see one.

@mkmer
Collaborator

mkmer commented Jan 21, 2025

I think I found the description:

 * \param execute a function callback to execute the application. It should return
 *            non-zero if the channel needs to be hung up.
    

A non-zero return triggers a hangup.
Maybe we shouldn't be hanging up at the softhangup location if the PBX has the channel; then return -1; will do its job?

@mkmer
Collaborator

mkmer commented Jan 21, 2025

This is updated to what may be more appropriate (still has ast_check_hangup() removed):
https://github.com/mkmer/app_rpt/tree/hangup_crash_fix

Essentially, remove the softhangup: if the rpt_exec thread still has control when a hangup message arrives, don't hang up, as the PBX will handle it via return -1;

Still not really sure why ast_check_hangup() is causing issues.
This is a crash report for the above code with the ||ast_check_hangup() included.
core-asterisk-2025-01-21T13-51-44Z-info.txt
core-asterisk-2025-01-21T13-51-44Z-locks.txt
core-asterisk-2025-01-21T13-51-44Z-thread1.txt
core-asterisk-2025-01-21T13-51-44Z-brief.txt
core-asterisk-2025-01-21T13-51-44Z-full.txt

@InterLinked1
Member Author

That didn't work either. I think there are two crash sources (this is my theory anyway - having a difficult time following the asterisk side)

#1: We have a channel hangup message, which caused softhangup to execute while rpt_exec is still working. When we return -1; in rpt_exec, the Asterisk side of things tries to hang up the channel in a deadlock due to softhangup? (or it force hangs up stuff). This is why return 0; or softhangup_nolock() make the crash go away (maybe?).

Returning 0 from a dialplan application (Rpt in this case) indicates dialplan execution can continue.
Returning -1 indicates that dialplan execution should stop, and the channel will then typically be hung up. The KEEPALIVE thing is separate and mostly unique to Rpt: there, the channel would be kept alive after the dialplan application had returned.

A softhangup can be called from any thread to queue a hangup on a channel (typically owned by another thread). It will typically cause internal functions like ast_read to return -1, NULL, a NULL frame, etc. - think of it like closing a file descriptor that another thread is using, or signaling it, then causing it to return and then exit.

The PBX thread normally owns a channel, in which case returning -1 would be sufficient to hang it up. But due to the whole keepalive thing where a separate thread is then responsible for the channel, that changes things there.
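
To illustrate the soft hangup idea outside Asterisk, here is a minimal Python sketch. It is only a model, not Asterisk code: `FakeChannel`, `soft_hangup`, `read_frame`, and `owner_loop` are hypothetical names standing in for a channel, a queued hangup request, a blocking read, and the thread that owns the channel.

```python
import queue
import threading
import time


class FakeChannel:
    """Hypothetical stand-in for a channel; not the real ast_channel."""

    def __init__(self):
        self._frames = queue.Queue()
        self._hangup_requested = threading.Event()

    def queue_frame(self, frame):
        self._frames.put(frame)

    def soft_hangup(self):
        # Callable from any thread: set a flag and wake up a blocked reader.
        self._hangup_requested.set()
        self._frames.put(None)

    def read_frame(self):
        # Models a read that fails (returns None) once a hangup is queued.
        if self._hangup_requested.is_set():
            return None
        frame = self._frames.get()
        if self._hangup_requested.is_set():
            return None
        return frame


def owner_loop(chan, log):
    # The owning thread services the channel until a read fails, then exits.
    while True:
        frame = chan.read_frame()
        if frame is None:
            log.append("hangup seen, owner exits")
            return
        log.append(f"got {frame}")


def run_demo():
    chan = FakeChannel()
    log = []
    owner = threading.Thread(target=owner_loop, args=(chan, log))
    owner.start()
    chan.queue_frame("voice")
    while not log:  # wait until the owner has handled the first frame
        time.sleep(0.001)
    chan.soft_hangup()  # requested from a different thread than the owner
    owner.join(timeout=5)
    return log
```

The point of the model is that the requesting thread never tears the channel down itself; it only asks, and the owner notices on its next read and unwinds on its own.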

#2: Simply testing the channel with ast_check_hangup is crashing - this one really has me lost. Unless chan has already become NULL (I don't know). Looking at the Asterisk function, it makes no sense.

chan wouldn't be NULL (or it would be visible in the backtrace). But if chan itself is invalid for some reason, then that could indeed be an issue.

Is there a description of the expected return codes from rpt_exec (or any xxx_exec) somewhere? I've hunted a bit and don't see one.

As I said above, 0 means continue and -1 means failure. Failure can be handled by the TryExec application, but otherwise the dialplan will terminate and the channel is hung up.

There used to be more return values like AST_PBX_KEEPALIVE, but those have mostly been removed.
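
As a rough illustration of the return-code convention described here, the following Python sketch models a dialplan loop. It is not how pbx.c is written: `run_dialplan` and `try_exec` are hypothetical helpers, and `try_exec` only loosely mirrors the idea that TryExec lets the dialplan continue after a failing application.

```python
def try_exec(app):
    """Hypothetical wrapper: swallow a failure so the dialplan continues."""
    def wrapped():
        app()  # the result is ignored in this simplified model
        return 0
    return wrapped


def run_dialplan(steps):
    """Run (name, app) steps in order; each app returns 0 or non-zero."""
    executed = []
    for name, app in steps:
        executed.append(name)
        if app() != 0:
            # Non-zero: stop executing; the caller then hangs the channel up.
            return executed, f"stopped by {name}, channel hung up"
    return executed, "ran out of steps, channel hung up"
```

For example, `run_dialplan([("Answer", lambda: 0), ("Rpt", lambda: -1), ("Playback", lambda: 0)])` stops at the Rpt step, while wrapping that step in `try_exec` lets Playback run as well.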

@mkmer
Collaborator

mkmer commented Jan 21, 2025

Is this the code that calls rpt_exec() and deals with the return code?
https://github.com/asterisk/asterisk/blob/50a25ac8474d7900ba59a68ed4fd942074082435/apps/app_dial.c#L2082-L2092

@mkmer
Collaborator

mkmer commented Jan 22, 2025

I'm not sure what the next step is on this one. While there are a couple of ways to "not crash", it sounds like they are not the right way to deal with the deadlock on the channel.

@InterLinked1
Member Author

Is this the code that calls rpt_exec() and deals with the return code?
https://github.com/asterisk/asterisk/blob/50a25ac8474d7900ba59a68ed4fd942074082435/apps/app_dial.c#L2082-L2092

Nope, Dial is another application, just like Rpt. The code would be part of pbx.c

I'm not sure what the next step is on this one. While there are a couple of ways to "not crash", it sounds like they are not the right way to deal with the deadlock on the channel.

Yeah, I think what would probably be most helpful, at least to me, is if there is a way I can reliably reproduce this myself. That was instrumental in getting the other change resolved as we could quickly test and iterate. If there's any way to do that here, that would be great.

Unfortunately, I have to deal with moving some equipment, and I'm going to be out of town next week, so I might be a bit slow in being able to look into this.

@mkmer
Collaborator

mkmer commented Jan 22, 2025

You need the server node and a client node.
This Python script (I hacked it together, so never mind the comments; they are probably wrong):
fastconnect.zip
It uses the pyami_asterisk library: https://pypi.org/project/asterisk-ami/

If you have another way to issue AMI commands, just use the commands from the script.
action_string = f"Command\nCommand: rpt cmd {MY_NODE} ilink 13 {TARGET_NODE}"
action_string = f"Command\nCommand: rpt cmd {MY_NODE} ilink 11 {TARGET_NODE}"

Any delay under 0.10 seconds between the two commands will typically trigger the "Failed to send text !NEWKEY1!" error, then soon after a 2 to 0 message (and a crash anywhere in between).
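
For anyone without the script, here is a minimal sketch of the same two commands as raw AMI `Command` actions with a short delay in between. This is not the contents of fastconnect.zip: `build_rpt_command` and `fast_connect_disconnect` are hypothetical helpers, `send` is assumed to be any callable that writes text to an already authenticated AMI connection, and the 0.05 second default is just one value under the 0.10 second threshold.

```python
import time


def build_rpt_command(node, ilink, target):
    """Build a raw AMI Command action that runs 'rpt cmd <node> ilink <n> <target>'."""
    lines = [
        "Action: Command",
        f"Command: rpt cmd {node} ilink {ilink} {target}",
    ]
    # AMI actions are CRLF-separated headers terminated by a blank line.
    return "\r\n".join(lines) + "\r\n\r\n"


def fast_connect_disconnect(send, node, target, delay=0.05):
    """Issue ilink 13 then ilink 11 with only a short pause in between."""
    send(build_rpt_command(node, 13, target))
    time.sleep(delay)
    send(build_rpt_command(node, 11, target))
```

With a socket `sock` that has already logged in to AMI, something like `fast_connect_disconnect(lambda s: sock.sendall(s.encode()), MY_NODE, TARGET_NODE)` would send both commands back to back.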

I found you can't issue them through the command line and you can't create a macro with both connect/disconnect in it - neither is fast enough to get it to fail.

Thinking about it a bit more -> maybe putting a sleep in the right spot in the rpt_exec function could make it stay there long enough to trigger the condition more easily.

I'll be here when you have something :)

@mkmer
Collaborator

mkmer commented Jan 22, 2025

Oh, one more thing: my last iteration (remove softhangup from safe_hangup, and return -1;) has some problem where my remote node becomes disconnected even with Permanent checked - no crash on either end. So whatever is going on, return -1; by itself may be doing something odd and keeping my remote node from connecting long enough that it times out on retries.
Never mind this, it's not related - I accidentally reused my time macro for a disconnect when I was testing - thus it disconnected every hour :(

@mkmer
Collaborator

mkmer commented Jan 23, 2025

Is this the code that calls rpt_exec() and deals with the return code?
https://github.com/asterisk/asterisk/blob/50a25ac8474d7900ba59a68ed4fd942074082435/apps/app_dial.c#L2082-L2092

Nope, Dial is another application, just like Rpt. The code would be part of pbx.c

I'm not sure what the next step is on this one. While there are a couple of ways to "not crash", it sounds like they are not the right way to deal with the deadlock on the channel.

Yeah, I think what would probably be most helpful, at least to me, is if there is a way I can reliably reproduce this myself. That was instrumental in getting the other change resolved as we could quickly test and iterate. If there's any way to do that here, that would be great.

Unfortunately, I have to deal with moving some equipment, and I'm going to be out of town next week, so I might be a bit slow in being able to look into this.

I don't know how I got that link... this looks more like the spot:
https://github.com/asterisk/asterisk/blob/50a25ac8474d7900ba59a68ed4fd942074082435/main/pbx.c#L4353-L4375

If we exit non-zero, it looks like we fall through to the ast_hangup() while waiting on a soft hangup in the other thread.
Look forward to what you figure out.

@mkmer
Collaborator

mkmer commented Jan 24, 2025

[2025-01-20 11:24:06.322] WARNING[692197][C-00000024]: channel.c:2630 ast_hangup: Hard hangup called by thread LWP 692197 on IAX2/149.154.11.243:24501-5692, while blocked by thread LWP 692009 in procedure ast_waitfor_nandfds!  Expect a failure
    -- Hungup 'IAX2/149.154.11.243:24501-5692'

I really believe this is pointing to us using ast_hangup because of the return -1; while the soft hangup is doing its thing.
I don't think rpt_exec has the channel locked, which allows ast_softhangup to get a lock. Then, before it can finish, the return -1; results in the hard hangup code running (the warning quoted above).

@mkmer
Collaborator

mkmer commented Feb 18, 2025

@InterLinked1 Any new thoughts on how we can fix this?

@InterLinked1
Member Author

[2025-01-20 11:24:06.322] WARNING[692197][C-00000024]: channel.c:2630 ast_hangup: Hard hangup called by thread LWP 692197 on IAX2/149.154.11.243:24501-5692, while blocked by thread LWP 692009 in procedure ast_waitfor_nandfds!  Expect a failure
    -- Hungup 'IAX2/149.154.11.243:24501-5692'

I really believe this is pointing to us using ast_hangup because of the return -1; while the soft hangup is doing its thing. I don't think rpt_exec has the channel locked, which allows ast_softhangup to get a lock. Then, before it can finish, the return -1; results in the hard hangup code running (the warning quoted above).

Sorry, it's been a bit since I've looked at this. I think "while blocked by thread LWP 692009 in procedure ast_waitfor_nandfds" is key here: what is that thread doing exactly? It's indeed true that this probably shouldn't be happening at that point in time.

The best way forward would be if we could get a backtrace from that scenario. Is that something you can reproduce easily?

We can force it to crash at that line if you add ast_assert(0) at channel.c, line 2630, whenever that condition is triggered. That should ensure a core is dumped and allow you to get a backtrace from that instant.

@mkmer
Collaborator

mkmer commented Feb 20, 2025

core-asterisk-2025-02-20T00-53-49Z-full.txt
core-asterisk-2025-02-20T00-53-49Z-info.txt
core-asterisk-2025-02-20T00-53-49Z-locks.txt
core-asterisk-2025-02-20T00-53-49Z-thread1.txt
core-asterisk-2025-02-20T00-53-49Z-brief.txt

This is the core dump from adding assert(0); to the hangup routine. Nothing interesting appeared in the debug log file - just the Asterisk restart message.

I placed the assert(0) here:
https://github.com/asterisk/asterisk/blob/d63c3e80fcfc3fc73220482d1794222a9d9b8029/main/channel.c#L2628-L2632
Except I'm on 20.x so it's not exactly at 2632. I hope it's the right spot...

@InterLinked1
Member Author

core-asterisk-2025-02-20T00-53-49Z-full.txt core-asterisk-2025-02-20T00-53-49Z-info.txt core-asterisk-2025-02-20T00-53-49Z-locks.txt core-asterisk-2025-02-20T00-53-49Z-thread1.txt core-asterisk-2025-02-20T00-53-49Z-brief.txt

This is the core dump from adding assert(0); to the hangup routine. Nothing interesting appeared in the debug log file - just the Asterisk restart message.

I placed the assert(0) here: https://github.com/asterisk/asterisk/blob/d63c3e80fcfc3fc73220482d1794222a9d9b8029/main/channel.c#L2628-L2632 Except I'm on 20.x so it's not exactly at 2632. I hope it's the right spot...

This looks suspicious to me, if it's crashing in timing.c. But apparently handle is NULL there and it shouldn't be.

Just to confirm, do you have a timing module loaded? Weird crashes can occur in Asterisk if you don't have one loaded, and there isn't anything that enforces you have one loaded.

@mkmer
Collaborator

mkmer commented Feb 20, 2025

I'm fairly sure it's loaded. I will recompile everything just to make sure I didn't cross pollinate one of the files or something odd like that.
I'm still a bit unsure about where you wanted me to put the assert(0) - the line numbers didn't match up quite right. There is an assert(0) a few lines before related to the "expect problems" error.

@mkmer
Collaborator

mkmer commented Feb 20, 2025

load => res_timing_dahdi.so         ; DAHDI Timing Interface
load => res_timing_timerfd.so       ; Timerfd Timing Interface is preferred for ASL3

One of these right? I didn't see a loading error in the log.

@InterLinked1
Member Author

[2025-01-20 11:24:06.322] WARNING[692197][C-00000024]: channel.c:2630 ast_hangup: Hard hangup called by thread LWP 692197 on IAX2/149.154.11.243:24501-5692, while blocked by thread LWP 692009 in procedure ast_waitfor_nandfds! Expect a failure

Yeah, that should do it, interesting. Just wanted to make sure that wasn't it!

Also, the line is the one containing the error above (Hard hangup)

Development

Successfully merging this pull request may close these issues.

app_rpt: "fast hangup" crashing with latest dev.