
[Debug Info] Generate typedef nodes for ptr/ref types (and msvc arrays) #144394


Open
wants to merge 5 commits into
base: master


Walnut356
Contributor

@Walnut356 Walnut356 commented Jul 24, 2025

This kills like 17 birds with 1 stone. It allows displaying the proper &/&mut/*const/*mut type name for *-gnu targets, and fixes a bunch of issues with visualizing *-msvc targets.

In short, none of the debuggers (in their current states) respect the "name" field that is passed to LLVMRustDIBuilderCreatePointerType. That field does appear in the DWARF data, but GDB and LLDB don't care.

This patch wraps the pointer nodes (and msvc array type) in a typedef before finalizing them.
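The effect of the wrapping can be sketched with a toy model. Everything below (`DiNode`, `c_style_name`, `wrap_in_typedef`) is a hypothetical illustration and not rustc's or LLVM's actual API; it only assumes a consumer that derives pointer names from the pointee and ignores the pointer's own name field.

```rust
// Toy model of debug-info nodes. These types are illustrative only and are
// NOT the real rustc/LLVM debug-info API.
#[derive(Debug)]
enum DiNode {
    Base { name: String },
    // `name` stands in for the name passed when the pointer node is created.
    Pointer { name: String, pointee: Box<DiNode> },
    Typedef { name: String, inner: Box<DiNode> },
}

/// How a C-oriented consumer might print a type: pointer names are derived
/// from the pointee, and the pointer's own `name` field is ignored.
fn c_style_name(node: &DiNode) -> String {
    match node {
        DiNode::Base { name } => name.clone(),
        DiNode::Pointer { pointee, .. } => format!("{} *", c_style_name(pointee)),
        DiNode::Typedef { name, .. } => name.clone(),
    }
}

/// Wrap a pointer node in a typedef that carries the Rust-style name.
/// Non-pointer nodes are returned unchanged.
fn wrap_in_typedef(node: DiNode) -> DiNode {
    if let DiNode::Pointer { name, .. } = &node {
        let name = name.clone();
        return DiNode::Typedef { name, inner: Box::new(node) };
    }
    node
}

fn main() {
    let ptr = DiNode::Pointer {
        name: "&mut u32".to_string(),
        pointee: Box::new(DiNode::Base { name: "u32".to_string() }),
    };
    println!("{}", c_style_name(&ptr));
    let wrapped = wrap_in_typedef(ptr);
    println!("{}", c_style_name(&wrapped));
    println!("{:?}", wrapped);
}
```

In this model the bare pointer is printed from its pointee, while the typedef-wrapped node is printed with the name the compiler chose.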

*-gnu

Mainly fixes type-name output. Screenshots should be self-explanatory.

GDB

GDB by default hard-codes pointer types to *mut. This "fixes" that without requiring a code change in GDB.
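For reference, the four pointer-like types in question are distinct types on the Rust side. A small sketch using `std::any::type_name` (whose exact output format is not guaranteed by the standard library, so treat the printed strings as informational):

```rust
use std::any::type_name;

// The four pointer-like types discussed above, all pointing at a u32.
fn pointer_type_names() -> [&'static str; 4] {
    [
        type_name::<&u32>(),
        type_name::<&mut u32>(),
        type_name::<*const u32>(),
        type_name::<*mut u32>(),
    ]
}

fn main() {
    for name in pointer_type_names() {
        println!("{name}");
    }
}
```

A debugger that prints all four as *mut loses the distinction the compiler already knows about.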

image

LLDB

TypeSystemClang ignores the name field of the pointer in the DWARF info. Using a typedef sidesteps that deficiency. We could maybe modify TypeSystemClang so this isn't necessary, but since it relies on clang (read: C/C++) compiler type representations under the hood, I'm not sure if pointers can have names, and it's not really reasonable to change clang itself to accommodate Rust.

image

*-msvc

As opposed to DWARF, the name field does not exist anywhere in the PDB data. There are 2 reasons for this:

  1. Pointer nodes do not contain a name field

  2. Primitive types are unique, special nodes that have an additional unique, special representation for pointer-to-primitive

The issue with this shows up with container types, for example Vec. Vec<T>'s heap pointer is not a *mut T, it's a *mut u8 that is cast to a *mut T when needed using (more or less) PhantomData<T>. From the type's perspective, T only "exists" in the generic parameters. That means the debugger, working from the type's perspective, must look T up by name (e.g. Vec<ref$<u8> > must look up the string "ref$<u8>").

Since those type names aren't in the PDB data, the lookup fails, the debugger cannot cast the heap pointer, and thus cannot visualize the elements of the container.
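A minimal sketch of that layout, assuming a simplified stand-in type (`ErasedBuf` is hypothetical and is not the real liballoc definition): the stored pointer is byte-typed, and T appears only in the generic parameters and a PhantomData marker.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

// Simplified stand-in for a container's buffer; NOT the real liballoc layout.
struct ErasedBuf<'a, T> {
    ptr: NonNull<u8>,              // type-erased pointer to the elements
    len: usize,                    // number of elements of type T
    _marker: PhantomData<&'a [T]>, // T only "exists" here and in the generics
}

impl<'a, T> ErasedBuf<'a, T> {
    fn new(items: &'a [T]) -> Self {
        ErasedBuf {
            ptr: NonNull::from(items).cast::<u8>(),
            len: items.len(),
            _marker: PhantomData,
        }
    }

    fn as_slice(&self) -> &'a [T] {
        // SAFETY: `ptr` and `len` were taken from a valid `&'a [T]` in `new`.
        // The cast back to `*const T` is the step a debugger has to replicate,
        // which requires finding T by name.
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr().cast::<T>(), self.len) }
    }
}

fn main() {
    let data = [10u32, 20, 30];
    let buf = ErasedBuf::new(&data);
    println!("{:?}", buf.as_slice());
}
```

Nothing in the field types mentions T as a pointee, so a visualizer that wants the elements has to resolve T from the container's type name.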

In LLDB, the sole arbiter of "what types exist" when doing a type lookup is the PDB data itself. I'm sure msdia works the same way, but LLDB's native PDB parser checks the type stream, and any pointer-node-to-T is formatted C-style as T *. This problem also affects Microsoft's debugger.

array$<T,N> also needs a typedef, as arrays have a bespoke node whose "name" field is also ignored in favor of the C-style format (e.g. T[N]). If you use Visual Studio's natvis diagnostics logging, you can see errors such as this:

Natvis: C:\Users\ant_b\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\etc\liballoc.natvis(10,23): Error: identifier "array$<u32,7>" is undefined
    Error while evaluating '(array$<u32,7>*)buf.inner.ptr.pointer.pointer' in the context of type 'sample.exe!alloc::vec::Vec<array$<u32,7>,alloc::alloc::Global>'.
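The failed lookup can be modeled roughly as follows. This is a toy type stream with hypothetical names (`Record`, `display_name`, `lookup`, `underlying`); it is not how LLDB or msdia actually store or search PDB records, only an illustration of a by-name lookup over C-style formatted names.

```rust
// Toy PDB-like type stream; illustrative only.
enum Record {
    Primitive(&'static str),
    Pointer(usize),               // index of the pointee; no name field
    Array(usize, usize),          // element index and length; no usable name
    Typedef(&'static str, usize), // explicit name plus index of the target
}

/// Name a consumer would derive for a record, C-style for pointers and arrays.
fn display_name(stream: &[Record], idx: usize) -> String {
    match &stream[idx] {
        Record::Primitive(n) => n.to_string(),
        Record::Pointer(p) => format!("{} *", display_name(stream, *p)),
        Record::Array(e, n) => format!("{}[{}]", display_name(stream, *e), n),
        Record::Typedef(n, _) => n.to_string(),
    }
}

/// Look a type up by name, as a visualizer would when casting a heap pointer.
fn lookup(stream: &[Record], name: &str) -> Option<usize> {
    (0..stream.len()).find(|&i| display_name(stream, i) == name)
}

/// Follow typedefs down to the record they ultimately name.
fn underlying(stream: &[Record], mut idx: usize) -> usize {
    while let Record::Typedef(_, target) = &stream[idx] {
        idx = *target;
    }
    idx
}

fn main() {
    let mut stream = vec![
        Record::Primitive("u32"),
        Record::Array(0, 7),
        Record::Pointer(0),
    ];
    println!("{:?}", lookup(&stream, "array$<u32,7>")); // not found
    stream.push(Record::Typedef("array$<u32,7>", 1));
    let found = lookup(&stream, "array$<u32,7>");
    println!("{:?} -> {:?}", found, found.map(|i| underlying(&stream, i)));
}
```

Once the typedef record exists, the Rust-style name resolves, and following the typedef leads back to the original array record.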

LLDB (via CodeLLDB)

image

CDB via Visual Studio 2022

image

CDB via C/C++ extension in Visual Studio Code

The output is identical to Visual Studio, but I want to make a special note because I had to jump through a few hoops to get it to work correctly. Built with stable, it worked the same as the "Before" image from Visual Studio, but built with the patched compiler the Vec visualizer wasn't working.

Clearly, based on the Visual Studio "After" screenshot, the natvis files still work. If you binary-patch the extension so that it outputs verbose logging info, it appears it never even tried to load liballoc.natvis for some reason.

I manually placed the natvis files in C:\Users\<USER>\.vscode\extensions\ms-vscode.cpptools-1.26.3-win32-x64\debugAdapters\vsdbg\bin\Visualizers\ and it worked fine so iunno. Probably worth someone else testing too. Might also be because I'm only using a stage1 build instead of a full toolchain install? I'm not sure.

Alternatives

I tried some fiddling with using the reference debug info node (which does have a valid counterpart in both DWARF and PDB). The issue is that LLDB uses TypeSystemClang, which is very C/C++-oriented. In Rust, references are borrowing pointers. In C++, references are not objects and are not guaranteed to have a representation at run-time. That means no array-of-refs, no ref-to-ref, no pointer-to-ref. LLDB seems to interpret ref-to-ref incorrectly. That can be worked around, but the hack necessary for that is heinous and infects the visualizers too. It also means that without the visualizers, the type-name output is sorta worse than it is now.
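For contrast, all three shapes that C++ rules out are ordinary Rust. A short sketch (`ref_shapes` is just an illustrative helper name):

```rust
// Array-of-refs, ref-to-ref, and pointer-to-ref are all valid in Rust,
// because a reference is a real value with a run-time representation.
fn ref_shapes() -> (u32, u32) {
    let a = 1u32;
    let b = 2u32;
    let refs: [&u32; 2] = [&a, &b]; // array of references
    let rr: &&u32 = &refs[0];       // reference to a reference
    let pr: *const &u32 = &refs[1]; // raw pointer to a reference
    // SAFETY: `pr` points into `refs`, which is still alive here.
    let through_ptr = unsafe { **pr };
    (**rr, through_ptr)
}

fn main() {
    println!("{:?}", ref_shapes());
    // A thin reference occupies the same space as a raw pointer.
    println!(
        "{}",
        std::mem::size_of::<&u32>() == std::mem::size_of::<*const u32>()
    );
}
```

A type system built around C++ reference semantics has no natural slot for `&&u32` or `[&u32; 2]`, which is why mapping Rust references onto the reference node runs into trouble.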

@rustbot
Collaborator

rustbot commented Jul 24, 2025

r? @oli-obk

rustbot has assigned @oli-obk.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 24, 2025
@Walnut356
Contributor Author

Walnut356 commented Jul 24, 2025

Oh yeah, I'm not sure if we even run the debuginfo test suite anymore. But if we do, this is gonna fail a lot of those tests (in a good way). I can get that cleaned up if/when those failures happen.


@Walnut356
Contributor Author

Ah, so we do still run the debuginfo tests.

One notable problem is that GDB method calls via the T.func() syntax break. GDB will attempt to call the function in the process, and the process (in the case of function-call.rs) segfaults. This can probably be fixed pretty easily on their end though, and using the more verbose function_call::RegularStruct::get_x(&r) works just fine for the time being.
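On the Rust side the two spellings are equivalent. The sketch below uses a hypothetical struct body modeled on the names in the comment above; it is not copied from function-call.rs.

```rust
// Hypothetical stand-in for the struct used by the test; the field is assumed.
struct RegularStruct {
    x: i32,
}

impl RegularStruct {
    fn get_x(&self) -> i32 {
        self.x
    }
}

fn main() {
    let r = RegularStruct { x: 4 };
    let via_method = r.get_x();              // method-call syntax
    let via_path = RegularStruct::get_x(&r); // fully qualified path, explicit &r
    println!("{via_method} {via_path}");
}
```

The path form passes the receiver explicitly as `&r`, which is the spelling that works in GDB for now.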

All the other failures were just minor considerations for "unwrapping" the typedef into its actual type.
