Rare SIGSEGV caught #473

Open
quakemmo opened this issue Oct 28, 2020 · 5 comments
quakemmo commented Oct 28, 2020

This thing has been puzzling me for a long time, on a simple map with a bunch of mapobjects and whatnot I'd very rarely get a SIGSEGV on ioquake3 client, on Linux 64, running native cgame / ref / ui, with ref DL on.

It literally happened once a month at most and today I managed to get a backtrace.

Maybe it will be of some use to someone who groks the ioq3 renderer (hint: not me).

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `. ioquake.x86_64 +set fs_game test +set fs_basepath /home/z/testq +set vm_ui 0 +set'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007ff5faf61229 in R_FogFactor (s=-nan(0x400000), t=0.03125) at code/renderergl1/tr_image.c:1150
1150        d = tr.fogTable[ (int)(s * (FOG_TABLE_SIZE-1)) ];
[Current thread is 1 (Thread 0x7ff60e568800 (LWP 9513))]
(gdb) bt 
#0  0x00007ff5faf61229 in R_FogFactor (s=-nan(0x400000), t=0.03125) at code/renderergl1/tr_image.c:1150
#1  0x00007ff5faf83c69 in RB_CalcModulateColorsByFog (colors=0x7ff5fb2bd800 <tess+240000> '\377' <repeats 200 times>...) at code/renderergl1/tr_shade_calc.c:741
#2  0x00007ff5faf80158 in ComputeColors (pStage=0x7ff5fb760828) at code/renderergl1/tr_shade.c:787
#3  0x00007ff5faf80cf8 in RB_IterateStagesGeneric (input=0x7ff5fb282e80 <tess>) at code/renderergl1/tr_shade.c:941
#4  0x00007ff5faf81026 in RB_StageIteratorGeneric () at code/renderergl1/tr_shade.c:1065
#5  0x00007ff5faf816d6 in RB_EndSurface () at code/renderergl1/tr_shade.c:1325
#6  0x00007ff5faf4fd24 in RB_RenderDrawSurfList (drawSurfs=0x7ff5fb2e5048, numDrawSurfs=809) at code/renderergl1/tr_backend.c:663
#7  0x00007ff5faf50b47 in RB_DrawSurfs (data=0x7ff5fb45b4f8) at code/renderergl1/tr_backend.c:923
#8  0x00007ff5faf51189 in RB_ExecuteRenderCommands (data=0x7ff5fb45b4f8) at code/renderergl1/tr_backend.c:1122
#9  0x00007ff5faf589ec in R_IssueRenderCommands (runPerformanceCounters=qtrue) at code/renderergl1/tr_cmds.c:96
#10 0x00007ff5faf593ff in RE_EndFrame (frontEndMsec=0x0, backEndMsec=0x0) at code/renderergl1/tr_cmds.c:474
#11 0x00005641b1f0e5d9 in SCR_UpdateScreen () at code/client/cl_scrn.c:800
#12 0x00005641b1f06812 in CL_Frame (msec=12) at code/client/cl_main.c:3064
#13 0x00005641b1f2b09a in Com_Frame () at code/qcommon/common.c:3278
#14 0x00005641b1fbf20a in main (argc=30, argv=0x7ffdb9cc5ab8) at code/sys/sys_main.c:782
(gdb) 
@timangus
Member

Looks like s is a NaN, and is then being normalised and cast to an int (ugh) to use as a table lookup. As soon as s is a NaN all bets are off. I can't see any obvious path by which s (or t) become NaNs in R_FogFactor, indeed given the value of t, s should be 0. This suggests it's already a NaN when passed in. Long story short, it looks like RB_CalcFogTexCoords is the source, but knowing exactly where and why is harder. It could be that a particular map or scene within a map (involving fog) is responsible.

@ensiform

ensiform commented Oct 30, 2020

I don't have the specific line handy on mobile, but in OpenJK we fixed this by checking for NaN on s and t in the calc-fog-texcoords function, which is called in the chain above (and using 0 instead). Apparently it is inlined, which is why it doesn't appear in the backtrace.

EDIT Got lines on PC:
https://github.com/JACoders/OpenJK/blob/master/codemp/rd-vanilla/tr_shade_calc.cpp#L930-L931

@timangus
Member

I mean, I guess that stops it happening, but it doesn't actually address the cause.

@ensiform

I don't think you can stop it. It's likely map related.

@timangus
Member

Right, but the origin of the NaN is likely to be much further up the food chain, where it's potentially causing other obscure and hard-to-track bugs. It would be better to find where this originates. My bet would be somewhere in tr.world->fogs.
