DSP JIT#1

Draft
pstef wants to merge 40 commits into master from DSP-JIT

@pstef pstef commented May 17, 2026

No description provided.

LibretroAdmin and others added 4 commits May 17, 2026 16:57
…ion)

Commit 7fdeb7a added an "opportunistic Fetch cache" in T_DrawNBG-VCSEn,
T_DrawRBG (SetupRotVars), and T_DrawRBG_CAB.  The cache is keyed on
(celli, cellj) = (ix>>3, iy>>3), invalidating only when the rendering
cell changes -- but that key omits dependencies that the fetch's
output `tf->tile_vrb` (and, in bitmap mode, the fetch's return value
`is_outside`) actually have on the full (ix, iy):

  Tile mode (bmen=false):
    cg_addr = (charno << 4) + ((celly * BPP) >> 1)
    where celly = (iy & 0x7) ^ (vflip ? 0x7 : 0)

    For two consecutive pixels in the same cell (same celli, cellj)
    but DIFFERENT iy values (e.g., iy=10 and iy=11, both cellj=1),
    celly differs (2 vs 3), so the correct tile_vrb is the address
    of a different row WITHIN the cell.  With the old (celli, cellj)
    key the cache hits, so the stale tile_vrb is reused and
    MAKE_NBGRBG_PIX reads pixel bytes from the wrong row.

  Bitmap mode (bmen=true):
    cg_addr = BMOffset + (((ix & BMWMask) + ((iy & BMHMask) << BMWShift))
                          * BPP) >> 4

    Depends on (ix, iy) at byte-level granularity, not at cell-level.
    For BPP=4 the cg_addr changes every 4 pixels of ix (each uint16_t
    holds 4 4-bit pixels); for BPP>=16 every pixel.  The (celli, cellj)
    key is too coarse and cache hits within a cell produce wrong vrb.
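
The within-cell dependence of both address formulas can be reproduced with a minimal sketch. The helper names (`tile_cg_addr`, `bitmap_cg_addr`) and the way the parameters are passed are assumptions made for illustration; only the two formulas come from the text above, and the bitmap one is parenthesized on the assumption that the `>> 4` applies to the pixel offset before `BMOffset` is added.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tile mode: address of the character row, per the formula above.
   celly selects the row inside the 8x8 cell, so it depends on iy & 7. */
static uint32_t tile_cg_addr(uint32_t charno, uint32_t iy, bool vflip, unsigned bpp)
{
    const uint32_t celly = (iy & 0x7) ^ (vflip ? 0x7 : 0);
    return (charno << 4) + ((celly * bpp) >> 1);
}

/* Bitmap mode: address of the 16-bit word holding pixel (ix, iy).
   Assumption: the >> 4 applies to the pixel offset only. */
static uint32_t bitmap_cg_addr(uint32_t bm_offset, uint32_t ix, uint32_t iy,
                               uint32_t bmw_mask, uint32_t bmh_mask,
                               unsigned bmw_shift, unsigned bpp)
{
    const uint32_t pixel = (ix & bmw_mask) + ((iy & bmh_mask) << bmw_shift);
    return bm_offset + ((pixel * bpp) >> 4);
}
```

With `iy = 10` and `iy = 11` the tile address differs although both rows share `cellj = 1`; in bitmap mode at BPP=4 the address is shared by `ix = 8..11` and changes at `ix = 12`, still inside the same cell.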

== User-visible symptom ==

In the issue:

    "some VDP2 graphics seem off, in especially extreme ways when
     viewed from certain angles only and almost correct otherwise"

Matches the bug profile: under rotation, consecutive pixels can map
to the same cell but different rows-within-cell (and/or different
within-cell X bytes for bitmap).  At certain rotation angles the
mismatch hits often; at angles where the rotation matrix keeps
celly constant across cell-internal X spans, the cache happens to
be self-consistent and the output is correct.  Non-rotation NBG
(handled by the cell-aligned fast paths added in 60d1f8c) is
unaffected -- it doesn't go through the cached loop in T_DrawNBG-
VCSEn except when vertical-cell-scrolling is enabled, and there
the per-block iy step crossing a same-cellj-different-(iy&7)
boundary triggers the same mismatch (rare but possible).

== Fix ==

Replace `cellj` with full `iy` in the cache key, AND bypass the
cache entirely in bitmap mode via a compile-time `(BMEN) ||`
prefix in the cache predicate.

  Tile mode  (BMEN=false):
    cache key = (ab, celli, iy)             [SetupRotVars]
    cache key = (celli, iy)                 [NBG-VCSEn, RBG_CAB]
    With full iy, every change in (iy & 7) invalidates -> celly
    updates -> cg_addr updates -> tile_vrb fresh.

  Bitmap mode (BMEN=true):
    cache predicate `(BMEN) || ...` always true -> cache never hits,
    fetch runs every pixel.  BMEN is a compile-time macro arg from
    the surrounding template-style dispatch, so the optimizer dead-
    code-eliminates the entire cache machinery in bitmap variants
    (no per-pixel runtime check).

Cache key sufficiency analysis (tile mode):
  - cellx_xor = (ix &~ 7) | (hflip ? 7 : 0): cell-invariant -> (celli)
  - palno, spr, scc:                          cell-invariant (PND data)
  - charno:                                   cell-invariant (PND data)
  - tile_vrb = cg_addr from charno + celly:   needs (cell, iy & 7)
  - is_outside = (ix & doxm) | (iy & doym):
       doxm/doym are >= 9 bits, so cell-invariant
  - return value (PlaneOver & 2)-gated:       cell-invariant

`iy` in the cache key implies (cellj, iy & 7) jointly; this is
exactly the minimal extension that captures the missing
within-cell-Y dependency.
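
The difference between the old and new tile-mode keys can be sketched as follows. `FetchCache`, `fetch_tile_vrb` and `cached_tile_vrb` are hypothetical names, and the fetch is reduced to the row-address formula with BPP=4 and no vflip; the real fetch does considerably more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the expensive fetch: returns the tile row
   address, which depends on iy & 7 (BPP = 4, no vflip). */
static uint32_t fetch_tile_vrb(uint32_t charno, uint32_t iy)
{
    return (charno << 4) + (((iy & 0x7) * 4u) >> 1);
}

typedef struct
{
    bool     valid;
    uint32_t key_x;     /* celli */
    uint32_t key_y;     /* cellj (old key) or full iy (new key) */
    uint32_t tile_vrb;  /* cached fetch output */
} FetchCache;

/* Look up (ix, iy) and refetch only when the key changes.
   use_full_iy = false reproduces the old (celli, cellj) key,
   use_full_iy = true the fixed (celli, iy) key. */
static uint32_t cached_tile_vrb(FetchCache *c, uint32_t charno,
                                uint32_t ix, uint32_t iy, bool use_full_iy)
{
    const uint32_t kx = ix >> 3;
    const uint32_t ky = use_full_iy ? iy : (iy >> 3);

    if (!c->valid || c->key_x != kx || c->key_y != ky)
    {
        c->tile_vrb = fetch_tile_vrb(charno, iy);
        c->key_x = kx;
        c->key_y = ky;
        c->valid = true;
    }
    return c->tile_vrb;
}
```

Stepping from (ix=8, iy=10) to (ix=9, iy=11) hits the cache under the old key and returns the row for iy=10; under the new key the same step misses and refetches.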

== Verification ==

  check_build_matrix.py: 7 configs OK
  vdp2_render.c standalone compile: clean, no new warnings

No codegen comparison vs. pre-cache (7fdeb7a^) is offered: this
commit intentionally CHANGES codegen at the cache sites
(different key, different vars), so a byte-identical guarantee
would be wrong.  What this commit guarantees is correctness: the
extended key fully captures the fetch's input dependence on the
pixel coordinates, restoring the runtime equivalence to the
pre-7fdeb7a fetch-every-pixel form for tile mode, and
unconditionally bypassing the broken cache for bitmap mode.

Performance impact, tile mode:
  - The cache still elides ~7/8 PND lookups per cell-aligned 8-pixel
    horizontal run (where iy is constant across the run).
  - Rotation-induced cache thrashing (where iy changes per pixel)
    now correctly forces a fetch per pixel; what the old key bought
    in that case was visual corruption, not speed.
  - SetupRotVars cache still skips fetches when ab+celli+iy all
    repeat, which is the common case along horizontal raster runs
    of low-angle RBG.

Performance impact, bitmap mode:
  - Cache fully bypassed.  Bitmap mode wasn't seeing meaningful
    cache hits anyway (cg_addr changes within a cell at BPP>=16,
    every-2-pixels at BPP=8, every-4-pixels at BPP=4).  Net effect
    on this hot path: negligible compared to the pre-cache form.

Fixes the visual corruption reported in libretro#71.

The sixteen BusRW_DB_* helpers in ss.cpp had a C++-only second
parameter spelling -- `uint32_t& DB` -- that the future ss.cpp ->
ss.c rename cannot keep:

   static INLINE void BusRW_DB_CS0_u8_W1 (const uint32_t A,
                                          uint32_t& DB,   <-- here
                                          const bool BurstHax,
                                          int32_t* SH2DMAHax);
   ... and 15 sibling signatures (CS0/CS12/CS3 x u8/u16/u32 x W0/W1)

C has no references; the same in-out parameter passing pattern in C
is `uint32_t* DB`, dereferenced at every value-access site inside
the body and passed `&DB` at every call site.  This commit does
exactly that, mechanically:

  signatures:    `uint32_t& DB`  ->  `uint32_t* DB`           (16)
  body accesses: `DB & mask`     ->  `*DB & mask`             (42 in-body refs)
                 `DB = value;`   ->  `*DB = value;`
                 `DB >> 16`      ->  `*DB >> 16`
  body forwards: `&DB`           ->  `DB`                     (6 sites passing
                                                              DB onward to
                                                              SCU_FromSH2_BusRW_DB_*,
                                                              which already
                                                              takes uint32_t*)
  callers:       `..., DB, ...`  ->  `..., &DB, ...`          (20 sites in
                                                              sh7095.inc:3170+)
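
The three rewrite rules above can be sketched in a few lines of C. The function names and the arithmetic in the body are made up for illustration and are not the real BusRW_DB_* logic; only the parameter-passing shape is the point.

```c
#include <stdint.h>

/* Hypothetical in-out helper in the post-conversion style: DB is a
   pointer, dereferenced at every value access inside the body. */
static void sketch_fold_db(uint32_t *DB)
{
    const uint32_t high = (*DB >> 16) & 0xFFFFu;   /* was: DB >> 16    */
    *DB = (*DB & 0xFFFF0000u) | high;              /* was: DB = value; */
}

/* A helper that forwards DB onward passes the pointer itself
   (was: &DB on a reference, now: plain DB). */
static void sketch_forward_db(uint32_t *DB)
{
    sketch_fold_db(DB);
}

/* Call sites take the address of their local (was: DB, now: &DB). */
static uint32_t sketch_caller(uint32_t initial)
{
    uint32_t DB = initial;
    sketch_forward_db(&DB);
    return DB;
}
```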

The two `&DB` mentions in /* */ comment blocks describing the
ne16_rwbo_be<uint32_t, IsWrite>(...) source-fold are intentionally
also flipped (the new comment now accurately describes the
post-conversion form).

== Behaviour ==

References and pointers compile to identical code at -O2 when the
function is `static INLINE` and the argument's address is known at
the call site.  All 16 BusRW_DB_* helpers are exactly that, and gcc
inlines every call.  This is easy to confirm: the `.text` section of
ss.cpp's .o file is bit-identical to its pre-commit baseline.

== Verification ==

G1 (compile): g++ -O2 -std=c++11 on ss.cpp -- 0 errors, 0 warnings.

G2 (.text byte-equivalence): `.text` section of /tmp/ss.cpp.o is
   byte-for-byte identical to the pre-commit baseline.  Metadata
   sections (debug-info, line table) differ only because some
   signature strings are one character shorter (`&` -> `*`); section
   sizes for text/data/bss unchanged.

G3 (symbol parity): nm reports 12 BusRW_DB-related symbols either
   side of the commit.  Optimizer made the same inlining decisions.

G4 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

== Status ==

Two ss.cpp -> ss.c blockers remain:

   1. The 22 `extern "C"` markers (atomic with the rename --
      C linkage is C's default).
   2. The .cpp filename + the SOURCES_CXX -> SOURCES_C move in
      Makefile.common.

Both come together in the rename commit, which is next.

An earlier, wider sweep of the same pattern picked up only the six
BSC_Bus* decls (c4563a5), and that change was lost in the rebase that
landed the savestates PR.  This commit re-does that cleanup and
extends it to the two remaining clusters that the same
`grep -E "<type> INLINE\|NO_INLINE void"` audit surfaces.

Three groups, thirteen decls, same shape: forward declarations of
member functions that the post-phase-9 struct body has no business
hosting -- the real implementations are free functions in
sh7095.inc with an explicit `SH7095* z` first parameter, and every
call site passes `z`.  In C++ each line parsed silently as a
struct method forward decl; in C it is a parse error (struct
fields cannot be function-typed except via function-pointer, and
the `<return-type> INLINE <name>` ordering in the BSC_Bus* group
puts the function specifier in a position C99 does not accept).
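
As a minimal sketch of the shape the real code already uses, the C form keeps only data in the struct and moves behaviour into free functions that take the state pointer explicitly. `SH7095_Sketch`, `Sketch_WriteReg` and `Sketch_ReadReg` are hypothetical names, not identifiers from sh7095.h.

```c
#include <stdint.h>

typedef struct SH7095_Sketch
{
    uint32_t regs[16];
    /* A C++-style member forward declaration such as
         void BSC_BusWrite_u8(uint32_t A, uint8_t V);
       cannot appear here in C: struct members must be objects
       (or function pointers), not functions. */
} SH7095_Sketch;

/* The C form: a free function with an explicit state pointer. */
static void Sketch_WriteReg(SH7095_Sketch *z, unsigned n, uint32_t V)
{
    z->regs[n & 15] = V;
}

static uint32_t Sketch_ReadReg(const SH7095_Sketch *z, unsigned n)
{
    return z->regs[n & 15];
}
```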

== Group 1: BSC_Bus* (6 decls) at L342-L348 ==

    void INLINE BSC_BusWrite_u8 (uint32_t A, uint8_t  V, ...);
    void INLINE BSC_BusWrite_u16(uint32_t A, uint16_t V, ...);
    void INLINE BSC_BusWrite_u32(uint32_t A, uint32_t V, ...);
    uint8_t  INLINE BSC_BusRead_u8 (uint32_t A, ...);
    uint16_t INLINE BSC_BusRead_u16(uint32_t A, ...);
    uint32_t INLINE BSC_BusRead_u32(uint32_t A, ...);

Real functions: SH7095_BSC_BusRead/Write_u8/u16/u32(SH7095* z, ...)
in sh7095.inc:111-116 (forward decls) and sh7095.inc:590, 640
(bodies).  Eight call sites in the DMA paths, all passing z.

== Group 2: DoIDIF_NI_* (4 decls) at L511-L514 ==

    NO_INLINE void DoIDIF_NI_C0_I0(void) MDFN_HOT;
    NO_INLINE void DoIDIF_NI_C0_I1(void) MDFN_HOT;
    NO_INLINE void DoIDIF_NI_C1_I0(void) MDFN_HOT;
    NO_INLINE void DoIDIF_NI_C1_I1(void) MDFN_HOT;

Real functions: SH7095_DoIDIF_NI_C0_I0..C1_I1(SH7095* z) in
sh7095.inc:141-144 (forward decls) and sh7095.inc:5056+ (bodies),
called from sh7095.inc:4969, 4970, 4974, 4975 through an
IntPreventNext branch.

== Group 3: OnChipRegWrite_u* (3 decls) at L529-L531 ==

    NO_INLINE void OnChipRegWrite_u8 (uint32_t A, uint32_t V) MDFN_HOT;
    NO_INLINE void OnChipRegWrite_u16(uint32_t A, uint32_t V) MDFN_HOT;
    NO_INLINE void OnChipRegWrite_u32(uint32_t A, uint32_t V) MDFN_HOT;

Real functions: SH7095_OnChipRegWrite_u8(SH7095*, uint32_t, uint32_t)
and friends (sh7095.inc).

== Audit ==

A repo-wide `grep -E "\\b<BareName>\\b"` for each of the 13 names,
excluding the dead decls themselves, returns ONLY occurrences inside
documentation comments -- no executable code anywhere references
the bare-name form.

== Phase history comments preserved ==

The three "Phase-8l / 8n / 8o" comment blocks that preceded each
dead-decl group are kept intact and extended with a "Phase-9
follow-up:" sentence explaining what was deleted and where the real
functions live.

== Verification ==

G1 (per-TU compile): every SH7095-using TU compiles 0 errors, 0
   new warnings.  The only remaining warning is the pre-existing
   -Wformat-truncation in ss_init.c noted by earlier commits,
   which is unchanged.

G2 (byte equivalence): ss.cpp .o is byte-for-byte identical to the
   pre-commit baseline -- 1,447,568 bytes either way.  The decls
   generated zero code; deleting them genuinely costs nothing.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

== Status ==

With this commit sh7095.h is fully C-parseable (modulo the typedefs
already in place from 3110fff).  The remaining ss.cpp -> ss.c
blockers are:

   1. 16 `uint32_t& DB` reference parameters in ss.cpp's
      BusRW_DB_* helpers (C has no references; convert to `uint32_t*
      DB` and pass `&DB` at every call site).
   2. The 22 `extern "C"` markers (atomic with the .cpp -> .c
      rename -- C linkage is the default in C).
   3. The .cpp filename itself and the Makefile.common SOURCES_CXX
      -> SOURCES_C move.

Each is its own focused commit; (1) is next.

LibretroAdmin and others added 25 commits May 17, 2026 17:48

The two C-callable proxies at ss.cpp:124, 129 --

    extern "C" void SH7095_SetActive(int cpu, bool active)
    {
        SH7095_SetActive(&CPU[cpu], active);
    }
    extern "C" void SH7095_SetNMI(int cpu, bool level)
    {
        SH7095_SetNMI(&CPU[cpu], level);
    }

-- have the same names as the primitives they wrap.  In C++ this is
fine: `extern "C"` gives the wrappers C linkage, separate from the
C++-linkage `void SH7095_SetActive(SH7095* z, bool)` /
`void SH7095_SetNMI(SH7095* z, bool)` declared in sh7095.h, and the
overload resolves unambiguously since the parameter types differ.  In
C the same two declarations collide -- there is no overloading, no
separate linkage, and no name mangling.  Once sh7095.h became
C-parseable (typedefs from 3110fff, dead member decls cleared in
1c0d273), a C parse of ss.cpp produced:

    error: conflicting types for 'SH7095_SetActive';
           have 'void(int, _Bool)'
    note: previous declaration of 'SH7095_SetActive'
           with type 'void(SH7095*, _Bool)'
    error: conflicting types for 'SH7095_SetNMI';
           have 'void(int, _Bool)'

== Caller audit ==

  smpc.c:579   SH7095_SetActive(1, SlaveSH2On);
  smpc.c:781   SH7095_SetActive(1, SlaveSH2On);
  smpc.c:587   SH7095_SetNMI(0, true);
  smpc.c:1171  SH7095_SetNMI(0, false);
  smpc.c:1172  SH7095_SetNMI(0, true);
  smpc.c:1262  SH7095_SetNMI(0, false);
  smpc.c:1263  SH7095_SetNMI(0, true);
  smpc.c:1585  SH7095_SetNMI(0, false);
  smpc.c:1586  SH7095_SetNMI(0, true);

Every call site passes a hardcoded CPU index -- SetActive is only
ever called with `1` (slave); SetNMI is only ever called with `0`
(master).  Both wrappers' `int cpu` parameter is functionally
pointless; the master/slave split is determined statically.

== Naming choice ==

Encode the CPU in the function name, matching the existing pattern
(SH7095_M_Init, SH7095_M_Reset, SH7095_M_StateAction, ...):

    SH7095_SetActive(int cpu, bool)  ->  SH7095_S_SetActive(bool)
    SH7095_SetNMI(int cpu, bool)     ->  SH7095_M_SetNMI(bool)

Bodies index CPU[1] / CPU[0] directly:

    extern "C" void SH7095_S_SetActive(bool active)
    { SH7095_SetActive(&CPU[1], active); }
    extern "C" void SH7095_M_SetNMI(bool level)
    { SH7095_SetNMI(&CPU[0], level); }

The name no longer shadows the primitive, so the rename eliminates
the collision regardless of language.  Also cleaner: callers no
longer pass meaningless literals.
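
A self-contained C sketch of the renamed proxies, assuming a two-field stand-in for the real `SH7095` struct (the `active` and `nmi` fields are hypothetical); the function names follow the mapping above.

```c
#include <stdbool.h>

/* Hypothetical two-field stand-in for the real SH7095 state. */
typedef struct SH7095 { bool active; bool nmi; } SH7095;

static SH7095 CPU[2];   /* [0] = master, [1] = slave */

/* Primitives: operate on whichever CPU is passed in. */
static void SH7095_SetActive(SH7095 *z, bool active) { z->active = active; }
static void SH7095_SetNMI(SH7095 *z, bool level)     { z->nmi = level; }

/* Proxies: distinct names, so nothing collides when compiled as C. */
void SH7095_S_SetActive(bool active) { SH7095_SetActive(&CPU[1], active); }
void SH7095_M_SetNMI(bool level)     { SH7095_SetNMI(&CPU[0], level); }
```

A caller such as smpc.c would then write `SH7095_S_SetActive(SlaveSH2On)` instead of passing the literal `1`.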

== Codegen impact ==

Not byte-identical -- text shrinks slightly:

    ss.cpp .text:    310,124 -> 310,108  bytes  (-16)
    smpc.c .text:     15,471 -> 15,471   bytes  (same length,
                                                 different bytes)
    .rodata / .data / .bss: unchanged in both TUs.

The 16-byte ss.cpp drop comes from the wrappers no longer computing
`&CPU[cpu]` from a runtime int (sign-extend cpu, shift left by 4 for
sizeof(SH7095)*0x... addressing, lea base+index) -- they now do a
single `lea CPU+const` and tail-call.  The smpc.c side is the same
size but different bytes: callers no longer load the literal 1/0
into %edi before the call.  Both are micro-optimizations, small
enough that the reason for this commit is the C compile, not the
perf win.

== Verification ==

G1 (compile): g++ on ss.cpp and gcc on smpc.c -- 0 errors, 0 new
   warnings.

G2 (caller migration completeness): a repo-wide grep for the
   old-name shape `SH7095_(SetActive|SetNMI)\s*\([0-9]` returns
   nothing -- every literal-index call site is updated.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs.

Two small C-compat shims in the SCU layer, both groundwork toward
the ss.cpp -> ss.c rename.  Neither changes behavior; ss.cpp +
scu_dsp_misc.c + scu_dsp_gen.c + scu_dsp_mvi.c .o files are
byte-for-byte identical to their pre-commit baselines (1,447,896 /
6,000 / 4,751,560 / 641,616 bytes respectively).

== 1. DSPS forward typedef in scu_dsp_common.inc ==

scu_dsp_common.inc:83 defines `struct DSPS { ... }` -- the SCU DSP
state container.  scu.inc references it both as `struct DSPS*`
(works in C and C++) and as plain `DSPS*` at three sites that pass
the indirect DMA function pointers:

   scu.inc:756, 1110, 1464 -- ((void (*)(DSPS*))(...))(&DSP);

In C++ the struct tag auto-aliases to a type name; in C the parse
fails with `unknown type name 'DSPS'`.  Adding a single forward
typedef at the top of scu_dsp_common.inc -- just after the existing
DSP_INSTR_BASE_UIPT / DSP_INSTR_RECOVER_TCAST defines and before
the struct body itself -- resolves the bare-name spelling in either
language:

   typedef struct DSPS DSPS;

The three C consumers (scu_dsp_misc.c, scu_dsp_gen.c, scu_dsp_mvi.c)
all already write `struct DSPS* dsp` everywhere, so the typedef is
purely additive there.
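
A minimal sketch of the forward typedef in use, assuming a one-field `struct DSPS` (the `PC` field, `step`, `generic_fn` and `run_indirect` are hypothetical). It shows the bare `DSPS*` spelling inside a function-pointer cast next to the `struct DSPS*` spelling.

```c
#include <stdint.h>

/* Forward typedef: makes the bare name DSPS a type in C as well as C++. */
typedef struct DSPS DSPS;

struct DSPS
{
    uint32_t PC;   /* hypothetical field, for illustration only */
};

static struct DSPS DSP;

/* `struct DSPS*` spelling, valid with or without the typedef. */
static void step(struct DSPS *dsp) { dsp->PC++; }

/* Generic function-pointer type used to carry the handler around. */
typedef void (*generic_fn)(void);

/* Bare-name spelling inside a cast, in the style of the scu.inc sites. */
static void run_indirect(generic_fn fn)
{
    ((void (*)(DSPS *))fn)(&DSP);
}
```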

== 2. Missing `=` at scu.inc:219 ==

   static const uint8_t external_tab[16 + 1]   <-- missing `=` here
   {
       0x7, 0x7, 0x7, 0x7, 0x4, 0x4, 0x4, 0x4,
       0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1, 0x1,
       0x0
   };

This is C++11 direct-list-initialization syntax, which omits the `=`
between declarator and brace-init-list; the declaration above it
(`internal_tab`) writes its array with the `=` present.  The
semantics are identical in C++ either way; the `=` is required in C.
Added.

Lone occurrence in the file; the rest of the array initializers in
scu.inc all already use `=`.

== Verification ==

G1 (compile): every TU that includes scu_dsp_common.inc or scu.inc
   compiles 0 errors, 0 new warnings.  Pre-existing
   -Wmissing-attributes from `MDFN_HIDE extern ... = {...}` lines
   in scu_dsp_misc.c / scu_dsp_gen.c / scu_dsp_mvi.c is unchanged.

G2 (byte equivalence): ss.cpp, scu_dsp_misc.c, scu_dsp_gen.c,
   scu_dsp_mvi.c .o files all byte-for-byte identical to their
   pre-commit baselines.  Both shims are source-only; the compiler
   never sees a difference.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs.

== C-compile scout delta ==

ss.cpp via gcc -std=gnu99 with extern "C" stripped: 50 errors
(same total as before this commit, but the specific failing sites
have moved -- the previous block at scu.inc:219-232 around
external_tab and the DSPS-type errors at L756/1110/1464 are now
clean).  The next batch is dominated by:

  * 6 "subscripted value is neither array nor pointer nor vector"
    sites at scu.inc:533, 784, 887, 1138, 1241, 1492 -- accesses
    to `DSP.DataRAM[idx]` through what is currently a C++ method
    or property that needs a struct-field rewrite.
  * 4 "expected ';', ',' or ')' before '=' token" sites at
    scu.inc:1512+, 1638, 1764, 1885 -- default-argument C++ syntax
    on function parameters (the same shape as the sh7095.h
    `from_internal_wdt = false` that 3110fff handled).

Both classes are mechanical follow-ups.

scu.inc was already free of the C++ features that take real work
to migrate -- no templates (all 28 mentions of the word are
historical comments documenting earlier retirements), no `class`,
no `std::`, no `nullptr`, no `decltype`.  What it had was
C++11/C++14 convenience syntax around the DMALevel[3] state array:

   - 34 `auto& d = DMALevel[expr];` reference declarations
   - 2 `const auto& d = DMALevel[expr];` (one written as `auto const&`,
     one written as `const auto&` -- both supported in C++ but two
     different surface spellings)
   - 2 `for(auto& d : DMALevel)` range-for loops (the entire
     `for( : )` form is a C++11-only loop construct, plus the
     `auto&` element binding)
   - All followed by `d.<field>` member accesses

C has no `auto`, no references, and no range-for.  This commit does
the mechanical rewrite that converts each ref into a pointer to the
same struct (DMALevelS) and each `d.field` access into `d->field`.
Identical semantics, identical codegen at -O2.

== Forward typedefs ==

Two struct tags get a typedef so the new `DMALevelS*` / `const
DMALevelS*` declarations and the existing `const DMAWriteTabS
acb[...]` field declarations parse without the `struct` keyword
(matching the same pattern used in scsp.h's 875419f and sh7095.h's
3110fff):

    typedef struct DMAWriteTabS DMAWriteTabS;
    typedef struct DMALevelS DMALevelS;

Both are forward declarations placed right after the existing
enum block at the top of scu.inc; redundant in C++ (struct-tag
name injection), required in C.

== Reference -> pointer mapping ==

      auto& d = DMALevel[level];               ->   DMALevelS* d = &DMALevel[level];
      auto const& d = DMALevel[expr];          ->   const DMALevelS* d = &DMALevel[expr];
      const auto& d = DMALevel[level];         ->   const DMALevelS* d = &DMALevel[level];
      d.Active                                 ->   d->Active
      d.WATable                                ->   d->WATable    ...etc, 100+ accesses

== Range-for expansion ==

   for(auto& d : DMALevel)
   {
     ... d.field ...
   }

   becomes:

   for(unsigned level___ = 0; level___ < 3; level___++)
   {
     DMALevelS* d = &DMALevel[level___];
     ... d->field ...
   }

The temporary index is named `level___` (triple underscore) to
avoid shadowing the surrounding scope's `level` (which exists in
some of the same enclosing blocks).  Two sites at L393 and L4276.

== &d argument fixups ==

Four sites passed `&d` to CheckDMAStart(DMALevelS* d).  When `d`
was a reference, `&d` gave a `DMALevelS*` (address of the underlying
struct).  After the conversion `d` itself is already a `DMALevelS*`,
so `&d` is now `DMALevelS**` -- wrong.  Each site changes to plain
`d`:

   sites at scu.inc:688, 1042, 1396, 3426

The same "ref to ptr means the call site loses the &" pattern
appeared in the BusRW_DB_* conversion (c14e1e3), where six
SCU_FromSH2_BusRW_DB_* calls inside the converted functions had
their `&DB` flipped to `DB`.
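
The expanded loop and the `&d` -> `d` fixup can be put together in one small sketch. The `StartCount` field, the body of `CheckDMAStart` and the wrapper `CheckAllLevels` are assumptions for illustration; the struct name, the array size of 3 and the `level___` index follow the text above.

```c
#include <stdbool.h>

typedef struct DMALevelS DMALevelS;

struct DMALevelS
{
    bool     Active;
    unsigned StartCount;   /* hypothetical field */
};

static DMALevelS DMALevel[3];

/* Hypothetical body; only the DMALevelS* parameter matters here. */
static void CheckDMAStart(DMALevelS *d)
{
    if (d->Active)
        d->StartCount++;
}

/* C form of a range-for over DMALevel with a pointer element binding. */
static void CheckAllLevels(void)
{
    for (unsigned level___ = 0; level___ < 3; level___++)
    {
        DMALevelS *d = &DMALevel[level___];
        CheckDMAStart(d);   /* plain d: it is already a DMALevelS* */
    }
}
```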

== Verification ==

G1 (compile): g++ -O2 -std=c++11 on ss.cpp -- 0 errors, 0 warnings.

G2 (codegen byte-equivalence): `.text` section of ss.o is byte-for-
   byte identical to the pre-commit baseline.  In C++ a `T&` is
   compiled the same as a `T* const` with implicit deref, and gcc's
   optimizer collapses the two forms to identical machine code well
   before code generation.  Section sizes for text/data/bss
   unchanged.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

== Status of scu.inc C-compat ==

A C-compile experiment of ss.cpp (drop `extern "C"` markers, run
gcc -std=gnu99) reports 50 errors after this commit, down from
~200 before.  Remaining classes:

  1. `struct DSPS DSP;` at scu.inc:3866 + DSP global accesses --
     needs `typedef struct DSPS DSPS;` (same forward-typedef pattern).
  2. Anonymous-struct-with-aggregate-init block at L211-230 around
     `internal_tab` / `external_tab` -- a C++ construct that needs
     splitting for C.

Each is its own focused follow-up commit.

…UNTER__

Real bug on the C89 / MSVC-89 / pre-C++11 path of this macro, which
is the path that gets taken on:

   - MSVC in C mode before VS 2019 16.7 (no _Static_assert support).
     The libretro/beetle-saturn-libretro build needs to remain
     compilable as MSVC C89, so the fallback path is the only viable
     C-mode form.
   - GCC -std=c89 / -std=gnu89 (intentionally pre-C11).
   - Any C++ < C++11 compiler, with no static_assert intrinsic.

The fallback expanded the typedef-name uniqueness ID via __COUNTER__:

   #ifdef __COUNTER__
    #define MDFN_STATIC_ASSERT_ID_ __COUNTER__
   #else
    #define MDFN_STATIC_ASSERT_ID_ __LINE__
   #endif

   #define MDFN_STATIC_ASSERT(c_, msg_) \
     typedef char MDFN_STATIC_ASSERT_CAT_(_mdfn_static_assert_, \
                                          MDFN_STATIC_ASSERT_ID_) \
          [(c_) ? 1 : -1] MDFN_NOWARN_UNUSED

== The bug ==

Several sites in the SH7095 emulator encode a checkpoint-style
contract in the *condition* of their assertion -- the assertion's
job is to verify that __COUNTER__ has reached a specific value at
this point in the file:

   sh7095_ops.inc:2059   MDFN_STATIC_ASSERT(__COUNTER__ == (DebugMode ?
                                                            10000 : 5000) + 393, ...)
   sh7095_ops.inc:2107   MDFN_STATIC_ASSERT(__COUNTER__ == (DebugMode ?
                                                            10000 : 5000) + 393 + 512 + 1, ...)
   sh7095.inc:5268       MDFN_STATIC_ASSERT(__COUNTER__ == 5000,  ...)
   sh7095.inc:5277       MDFN_STATIC_ASSERT(__COUNTER__ == 10000, ...)

These checkpoints are calibrated against the number of __COUNTER__
occurrences in the code *between* checkpoints (each `&&Resume_NNNN`
label dispatch in sh7095s_ctable.inc bumps __COUNTER__ exactly once,
and the relative-offset checkpoints depend on this exact count).

When the static_assert intrinsic is available (C++11 static_assert
or C11 _Static_assert) the assertion is just a compile-time
comparison -- no further __COUNTER__ consumption.

When the fallback path is taken, MDFN_STATIC_ASSERT_ID_ expands to
__COUNTER__ itself, consuming ONE EXTRA counter value per assertion.
By the time the file reaches sh7095_ops.inc:2059's checkpoint,
__COUNTER__ has drifted by one value for every assertion expanded
before it, and the checkpoint condition compares as false.

Observed before this fix in the C-mode scout (gcc -std=gnu99 over
ss.cpp): 4 firing assertions, _mdfn_static_assert_5411,
_mdfn_static_assert_5925, _mdfn_static_assert_10411,
_mdfn_static_assert_10925 -- exactly the spots where the in-source
checkpoints `__COUNTER__ == 5393` / `== 5906` / `== 10393` / `== 10906`
should hit.  The deltas (5411-5393=18, 5925-5906=19,
10411-10393=18, 10925-10906=19) match the number of MDFN_STATIC_ASSERTs
between the file start and the failing checkpoint.

This bug never bit in current builds because all current TUs of the
beetle-saturn-libretro build target reach static_assert via
C++11 -- the affected .inc files are included only into the C++
ss.cpp.  It would bite immediately on the in-progress ss.cpp -> ss.c
rename if compiled with MSVC C89 (or with gcc in C89/C99 modes,
which the codebase intentionally supports).

== The fix ==

Switch the typedef-name uniqueness ID from `__COUNTER__` to
`__LINE__`.

__LINE__ is a property of source location -- expanding it does NOT
mutate any preprocessor state.  Using it for typedef naming guarantees
no interference with the in-source __COUNTER__ checkpoints.

The risk with __LINE__ would be collision (two
MDFN_STATIC_ASSERT()s on the same logical source line producing the
same typedef name).  An audit of all 4 sites in mednafen/ that use
MDFN_STATIC_ASSERT confirms each one is the sole statement on its
own line -- the macro never expands twice on a single line, and is
never invoked from inside another macro whose expansion would put
two assertions on one logical line.
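
The drift can be reproduced in isolation. The sketch below uses hypothetical macro names (`SKETCH_*`) rather than the real MDFN_STATIC_ASSERT, and assumes a compiler that provides `__COUNTER__` (gcc, clang and MSVC do); it compares a typedef-based assertion named by `__COUNTER__` with one named by `__LINE__`.

```c
/* Two-level paste so __LINE__ / __COUNTER__ expand before ## applies. */
#define SKETCH_CAT2_(a, b) a##b
#define SKETCH_CAT_(a, b)  SKETCH_CAT2_(a, b)

/* Named by __COUNTER__: every use consumes one counter value. */
#define SKETCH_ASSERT_BY_COUNTER(c_) \
    typedef char SKETCH_CAT_(sketch_assert_ctr_, __COUNTER__)[(c_) ? 1 : -1]

/* Named by __LINE__: expanding it changes no preprocessor state. */
#define SKETCH_ASSERT_BY_LINE(c_) \
    typedef char SKETCH_CAT_(sketch_assert_line_, __LINE__)[(c_) ? 1 : -1]

enum { LINE_BEFORE = __COUNTER__ };
SKETCH_ASSERT_BY_LINE(sizeof(char) == 1);
SKETCH_ASSERT_BY_LINE(sizeof(int) >= 2);
enum { LINE_AFTER = __COUNTER__ };   /* LINE_BEFORE + 1: no drift */

enum { CTR_BEFORE = __COUNTER__ };
SKETCH_ASSERT_BY_COUNTER(sizeof(char) == 1);
SKETCH_ASSERT_BY_COUNTER(sizeof(int) >= 2);
enum { CTR_AFTER = __COUNTER__ };    /* CTR_BEFORE + 3: two values consumed */
```

Any in-source checkpoint of the form `__COUNTER__ == N` placed after the counter-named assertions would see the two extra values; after the line-named ones it would not.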

== Verification ==

G1 (compile, regular build): ss.cpp compiles 0 errors / 0 warnings
   under g++ -std=c++11.  The C++ path uses static_assert; this macro
   change touches only the fallback path so no codegen change is
   possible there.

G2 (byte equivalence, regular build): ss.cpp .o is byte-for-byte
   identical to the pre-commit baseline -- 1,447,896 bytes either
   way.  The C++ build never enters the changed fallback path.

G3 (C-mode scout): a gcc -std=gnu99 build of ss.cpp (with `extern "C" `
   stripped) produced 82 errors before this commit, including 4
   `_mdfn_static_assert_*` array-bound-negative firings.  After this
   commit the firings are gone.

G4 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE).

Four sites in ss.cpp name the type `EmulateSpecStruct` without the
`struct` keyword:

   ss.cpp:1113  static NO_INLINE MDFN_HOT int32_t
                RunLoop_ICache(EmulateSpecStruct* espec)
   ss.cpp:1151  static NO_INLINE MDFN_HOT int32_t
                RunLoop_NoICache(EmulateSpecStruct* espec)
   ss.cpp:1211  extern "C" int32_t SS_RunLoop_ICache  (EmulateSpecStruct* espec)
   ss.cpp:1212  extern "C" int32_t SS_RunLoop_NoICache(EmulateSpecStruct* espec)

In C++ these resolve through ss.h:200's `struct EmulateSpecStruct;`
forward declaration -- C++ injects the struct tag as a type name
in the surrounding scope, so the unqualified spelling parses.

In C, a forward `struct EmulateSpecStruct;` introduces the tag but
NOT a type name; `EmulateSpecStruct` alone is undeclared.  This
blocks the upcoming ss.cpp -> ss.c rename: gcc -std=gnu99 reports
`unknown type name 'EmulateSpecStruct'` at all four sites.

The fix follows the same pattern the rest of the SS code already
uses (vdp2_render.c, smpc.c, ss_init.c all include "../emuspec.h"):
include emuspec.h directly in ss.cpp.  emuspec.h's

   typedef struct EmulateSpecStruct { ... } EmulateSpecStruct;

provides the type-name spelling in both languages and makes the
ss.cpp -> ss.c rename trivial on this front (no per-site rewrite).

== Verification ==

G1 (compile): g++ -O2 -std=c++11 ss.cpp -- 0 errors, 0 warnings.

G2 (byte equivalence): ss.cpp .o byte-for-byte identical to the
   pre-commit baseline -- 1,447,896 bytes.  emuspec.h's full
   EmulateSpecStruct body was already transitively visible to ss.cpp
   in C++ via the same path that vdp2_render.c uses (mednafen.h ->
   git.h -> emuspec.h, until cleanup commits stripped that chain
   and replaced it with mdfn_gameinfo.h's narrower include set --
   emuspec.h was the casualty there).  Re-including the header
   directly is a no-op for the C++ build's translation unit
   contents (header guards prevent double-include).

G3 (build matrix): tools/check_build_matrix.py green for all 7
   configs.

== Status ==

With this commit ss.cpp's body is C-clean.  A gcc -std=gnu99
compile experiment (strip `extern "C" `, rename to .c) parses with
0 errors, 0 warnings.  With this commit and its three siblings
(mednafen-types.h MDFN_STATIC_ASSERT, scu.inc extern "C" gating,
sh7095.inc C++11 keyword removal) applied, the file is ready for
the actual rename + Makefile.common SOURCES_CXX -> SOURCES_C move.

Two C++11-only keywords remained in sh7095.inc, blocking the
.cpp -> .c rename and also out-of-spec for the codebase's
MSVC C89 compilation target.

== 1. `auto*` -> explicit type ==

sh7095.inc:4024 (inside SH7095_Cache_AssocPurge):

    auto* cent = &z->Cache[(A >> 4) & 0x3F];

`auto*` is C++11 type deduction.  In C, `auto` is a storage-class
specifier (the default "automatic storage" -- almost never written),
so `auto* cent = ...` parses as `auto int* cent = ...` (via C89-style
implicit-int, which C99 formally removed but gcc evidently still
applied here) producing an `int*`.  Subsequent `cent->Tag[0]`
accesses then fail with `request for member 'Tag' in something not
a structure or union`.  Eight such errors, all from this one line.

Replace with the explicit type the compiler would deduce in C++:

    SH7095_CacheEntry* cent = &z->Cache[(A >> 4) & 0x3F];

`SH7095_CacheEntry` is the typedef installed by sh7095.h's
3110fff/c14e1e3 C-compat groundwork; it resolves in either language.

== 2. `constexpr` -> `const` ==

Seven sites in sh7095.inc:5025-5140:

    constexpr bool     EmulateICache = false;
    constexpr bool     EmulateICache = true;
    constexpr unsigned which         = 0;
    constexpr unsigned which         = 1;
    ... (4 `bool` variants, 3 `unsigned` variants)

Each declares a function-local constant whose value the surrounding
DoIDIF_MACRO / op-dispatch macros consume via `if(EmulateICache)`
or `case which:` style.

`constexpr` is C++11.  In C and in pre-C++11 C++, the analogous
construct is `const`:

    const bool     EmulateICache = false;
    const unsigned which         = 0;

For a function-local constant initialized from a literal, `const`
gives the optimizer the same fold-at-compile-time visibility as
`constexpr`:

  - At -O2, gcc / clang / MSVC propagate the literal value through
    the `if(EmulateICache)` branches and dead-code-eliminate the
    inactive arm exactly as they did with `constexpr`.

  - Compile-time evaluation of the `if`/`case` discriminator works
    identically because the initializer is a literal -- no
    `constexpr` non-literal init paths are used at any of these
    seven sites.
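
A small sketch of the `if(EmulateICache)` shape with a `const` local.
The function name and the cost values are hypothetical, and
`<stdbool.h>` is assumed here for `bool` (a C89 target would need its
own typedef).  Only the `if(...)` form is sketched.

```c
#include <stdbool.h>   /* assumption: C99 bool; C89 needs a project typedef */

/* Hypothetical helper: cost of a fetch with or without cache emulation. */
static int ex_fetch_cost(int base)
{
    const bool EmulateICache = false;   /* was: constexpr bool ... */

    if (EmulateICache)
        return base + 2;                /* arm an optimizer is free to discard */
    return base;
}
```

Because the initializer is a literal, the branch outcome is known at
compile time whichever keyword is used.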

== Verification ==

G1 (compile): g++ -O2 -std=c++11 ss.cpp -- 0 errors, 0 warnings.

G2 (byte equivalence): ss.cpp .o byte-for-byte identical to the
   pre-commit baseline -- 1,447,896 bytes either way.  `auto*`
   resolves to `SH7095_CacheEntry*` at the C++ front end, and
   `constexpr bool x = literal` at -O2 generates the same machine
   code as `const bool x = literal` (gcc treats both as
   const-propagation candidates).

scu.inc contains three blocks that wrap cross-TU symbols in
`extern "C" { ... }` so that ss.cpp's C++ TU emits them with C
linkage (unmangled names), matching the declarations in
scu_dsp_common.inc that scu_dsp_misc.c / scu_dsp_gen.c /
scu_dsp_mvi.c link against.

These wrappers exist because scu.inc currently lives inside the C++
ss.cpp.  Once ss.cpp -> ss.c happens, those TUs are C and the C
linkage is automatic -- but `extern "C" { ... }` is C++-only syntax
and would fail to parse in C mode.  Gate each with #ifdef __cplusplus
so the same source compiles in either language:

  Block 1 (L3875): wraps `struct DSPS DSP;` -- the global DSP state
                   object whose address is the basis for the
                   DSP_INSTR_BASE_UIPT macro on 64-bit hosts.

  Block 2 (L3964): wraps DSP_Init() and DSP_FinishPRAMDMA() function
                   bodies -- DSP_Init's address is the same
                   DSP_INSTR_BASE_UIPT base; both are called from
                   the C-side scu_dsp_*.c files.

  Block 3 (L4227): wraps the DSP_DMAFuncTable[][][] array
                   definition -- read from scu_dsp_misc.c's
                   DSP_DecodeInstruction in the LPS path.

Each block becomes:

   #ifdef __cplusplus
   extern "C" {
   #endif
       ... contents ...
   #ifdef __cplusplus
   } /* extern "C" */
   #endif

In C++ the gate evaluates true and the linkage specification is
applied as before.  In C the gate evaluates false; the contents
emit with the default C linkage (no name mangling), which is the
same effective linkage `extern "C"` provided to the C++ compiler.
Symbol names visible to other TUs are identical in either language.

== Verification ==

G1 (compile): g++ -O2 -std=c++11 ss.cpp -- 0 errors, 0 warnings.

G2 (byte equivalence): ss.cpp .o byte-for-byte identical to the
   pre-commit baseline -- 1,447,896 bytes either way.  The
   #ifdef gates leave the C++ build's input to the preprocessor
   identical.

G3 (build matrix): tools/check_build_matrix.py green for all 7
   configs.

G4 (cross-TU linkage check): nm output for DSP, DSP_Init,
   DSP_FinishPRAMDMA, DSP_DMAFuncTable shows unmangled C names
   in both pre- and post-commit object files.

== C-compile scout progress ==

Combined with the parallel commit that adjusts MDFN_STATIC_ASSERT,
the C-mode scout (gcc -std=gnu99 of ss.cpp with `extern "C" `
stripped) drops from 82 errors to 0.  The three "expected
identifier or '(' before string constant" errors at scu.inc:3875,
3964, 4227 -- and the cascade of label-undefined errors that
descended from the parse failures at those points -- are now clean.

This commit is the milestone of the multi-phase ss.cpp C-compat
groundwork.  All the earlier work has converged here:

   phase 8/9 type-typedef shims        (sh7095.h, scsp.h, scu.inc forward typedefs)
   phase 9 dead-decl deletions         (1c0d273)
   phase 9 ref->pointer conversions    (c14e1e3 BusRW_DB_*, 99442d8 auto&)
   phase 9 STL pull replacement        (28e3a9b mdfn_gameinfo.h + 89a1e52 emuspec.h)
   phase 9 wrapper rename              (26a7efc SH7095_S_SetActive / SH7095_M_SetNMI)
   phase 9 default-arg drop            (8af9111 BBusRW_DB / ABusRW_DB / ABus_Read)
   phase 9 MDAP cast                   (71fdd2c)
   phase 9 C++11 keyword retirement    (98bbcc4 auto*, constexpr)
   phase 9 extern "C" { } block gating (009b6f8 scu.inc cross-TU symbols)
   phase 9 MDFN_STATIC_ASSERT fix      (c41840d __LINE__ not __COUNTER__)

ss.cpp's body has been C-parseable for a few commits now (gcc -std=gnu99
on a copy with `extern "C" ` stripped: 0 errors, 0 warnings).  This
commit makes that real:

== Three coordinated changes ==

1. `extern "C" ` PREFIX STRIPPED at 22 function decl sites in
   what is now ss.c.  In C, all functions have C linkage by default,
   and the marker is C++-only syntax.

   Touched sites (function names that the SS core publishes to
   the rest of the build):

      SH7095_ConstructAll, SH7095_S_SetActive, SH7095_M_SetNMI,
      SH7095_SetExtHaltDMAKludge, SH7095_M_DMA_Update,
      SH7095_S_DMA_Update, SS_RunLoop_ICache, SS_RunLoop_NoICache,
      SS_ForceEventUpdates, SH7095_M_AdjustTS, SH7095_S_AdjustTS,
      SH7095_M_Init, SH7095_S_Init, SH7095_M_SetMD5, SH7095_S_SetMD5,
      SH7095_M_TruePowerOn, SH7095_S_TruePowerOn, SH7095_M_Reset,
      SH7095_M_StateAction, SH7095_S_StateAction,
      SH7095_M_PostStateLoad, SH7095_S_PostStateLoad.

   No call site changes -- callers already see plain C decls
   (sh7095.h, ss.h, etc. already lack `extern "C"`).

   Comments mentioning `extern "C"` as a historical-context phrase
   are intentionally preserved (7 places).  They explain WHY the
   wrappers exist, which is still relevant.

2. `mednafen/ss/ss.cpp` RENAMED to `mednafen/ss/ss.c`.  No content
   changes beyond (1); 93% similarity preserved (git diff
   --find-renames recognizes it as a rename).

3. `Makefile.common`: moved `$(CORE_EMU_DIR)/ss.cpp` out of
   SOURCES_CXX and added `$(CORE_EMU_DIR)/ss.c` to SOURCES_C.
   The new SOURCES_C line is in the same physical position so
   future cross-references stay easy to find.

== Behaviour ==

ss.c built with gcc -std=gnu99 -O2 produces a .o file with the
same exported and consumed symbol set as the C++ build did, just
without the C++ name-mangling on the SH7095_* entry points (which
were already being unmangled-via-extern-C, so existing callers see
no ABI change).

   C++ build symbol:  _Z11SH7095_InitP6SH7095bb     (mangled, internal)
                      SH7095_M_Init                 (extern "C" wrapper, exported)

   C   build symbol:  SH7095_Init                   (no mangling, internal)
                      SH7095_M_Init                 (plain C function, exported)

The exported `SH7095_M_Init`, `SH7095_S_Init`, etc. names that the
rest of the build links against are identical in either form.

Two file-scope `INLINE void foo(...)` functions
(SH7095_DMA_BusTimingKludge at sh7095.inc:848, SH7095_Cache_WriteUpdate_u8
at sh7095.inc:4077) now emit their bodies in ss.c's .o as external
text (C99 inline-non-static semantics) where they were emitted
internally / COMDAT-merged in the C++ build.  Only ss.c calls
them, no multi-definition risk; the function names are public in
sh7095.h either way.

== Verification ==

G1 (compile, C++ no longer relevant): there is no ss.cpp.  ss.c
   compiles with gcc -O2 (any of -std=gnu89 / gnu99 / gnu11) with 0 errors,
   1 pre-existing warning (`'DSP_DMAFuncTable' initialized and declared
   'extern'`, identical to the warning that already showed in
   scu_dsp_misc.c / scu_dsp_gen.c / scu_dsp_mvi.c before this commit;
   carries through because all three C TUs and now ss.c
   share the scu.inc inclusion containing that array).

G2 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.  ss.c is now
   compiled as C via the matrix's by-extension dispatch (gcc -std=gnu99).

G3 (cross-TU link): `ld -r ss.o sound_glue.o smpc.o scu_dsp_misc.o`
   produces a merged .o file (link returncode 0).  All
   inter-TU references resolve at the partial-link stage; the
   remaining 86 undefined symbols (CDB_*, Cart, EventHandler, ...)
   are normal references to TUs not in the partial link.

G4 (C++-in-C scan): the build matrix's `CXX_IN_C_PATTERNS` scan
   covers the ss.c file in the .c list and finds no C++ leakage
   (no typed enums, no `template <`, no other patterns that
   gcc -std=gnu99 accepts as extensions but stricter MSVC / mingw
   builds reject).

== MSVC-89 compatibility ==

ss.c is intended to compile on MSVC as C89.  The C-compat
groundwork stayed in pre-C99 territory throughout: no _Static_assert
(uses MDFN_STATIC_ASSERT with the negative-array-bound trick), no
C99-only initializers, no C11 _Generic, no inline-position issues
(every static-inline use is `static INLINE`, MDFN_FORCE_INLINE, or
the file-scope plain INLINE that gcc handles per C99 semantics --
MSVC C89 treats __inline equivalently).
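
The negative-array-bound trick mentioned above can be sketched as
follows.  The macro and helper names are hypothetical (the project's
own macro is MDFN_STATIC_ASSERT, whose exact definition is not shown
here); the sketch uses `__LINE__` to build a unique typedef name, so
two assertions on the same source line would collide.

```c
/* Hypothetical names; the project's own macro is MDFN_STATIC_ASSERT. */
#define EX_CAT2(a, b) a##b
#define EX_CAT(a, b)  EX_CAT2(a, b)

/* False condition -> array size -1 -> compile error, in C89 and later. */
#define EX_STATIC_ASSERT(cond) \
    typedef char EX_CAT(ex_static_assert_at_line_, __LINE__)[(cond) ? 1 : -1]

EX_STATIC_ASSERT(sizeof(char) == 1);
EX_STATIC_ASSERT(sizeof(short) >= 2);

/* Trivial runtime helper so a caller can confirm the file compiled. */
static unsigned ex_short_bytes(void)
{
    return (unsigned)sizeof(short);
}
```

Nothing here depends on `_Static_assert`, which is why this shape
stays inside pre-C99 territory.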

== Future-work pointers ==

The remaining .cpp file in this SOURCES_CXX block is `sound_glue.cpp`,
which holds the M68K SoundCPU and SS_SCSP class instances (the
last two C++-class-based singletons in the SS core).  Their
conversion needs the same rigor as the ss.cpp work -- it's its
own multi-commit phase.  No follow-up necessary in this commit.
… compat)

Last C++ism in the SH7095 .inc include chain that ss.c pulls in
through sh7095.inc.  The four typed-enum forms at the top of
sh7095s_rsu.inc are C++11 syntax:

   enum : unsigned { which = 1 };
   enum : bool { EmulateICache = true };
   enum : bool { DebugMode = SH7095_DEBUG_MODE };
   enum : bool { CacheBypassHack = false };

`enum : underlying_type { ... }` is the C++11 fixed-underlying-type
(typed) enum form.  It is not a scoped enum (that would be
`enum class`); it only affects the underlying storage type of the
constants -- their *values* are always int-compatible.  The downstream uses are
in macros like `if(EmulateICache)`, `if(DebugMode)`, `case which:`,
where the compile-time fold works equally with plain int constants
because gcc / clang / MSVC at -O2 propagate the integral literal.

ISO C and MSVC C89 reject the typed-enum form (parser error around
the `:` on an `enum` head).  gcc -std=gnu99 currently accepts it as
a GNU extension and silently produces equivalent code, so the compile
step of the build matrix did not flag it.  The matrix's
CXX_IN_C_PATTERNS scan would have caught it, but that scan only walks
.c files, not transitively-included .inc files, so the four
typed-enum sites in this .inc never showed up.

Convert each to the underlying-type-free form:

   enum { which = 1 };
   enum { EmulateICache = 1 /* true */ };
   enum { DebugMode = SH7095_DEBUG_MODE };
   enum { CacheBypassHack = 0 /* false */ };

Behaviour identical:

  - `EmulateICache` constant: 1 either way (was `true` cast to bool,
    is now int 1).  The `if(EmulateICache)` tests, including the arms
    inside DoIDIF_MACRO, fold to the same instruction stream.
  - `which` constant: 1 either way (was unsigned int 1, is now int 1).
    Used in slave-vs-master branch macros that compare to 0 or 1.
  - `DebugMode`: SH7095_DEBUG_MODE is #define'd to 0 or 1 in the
    surrounding sh7095.inc context; either way an int literal.
  - `CacheBypassHack`: 0 either way (was `false`, is now int 0).
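
A short sketch of the converted form in use.  The `Ex`-prefixed
names and the returned values are hypothetical; the point is only that
plain enum constants remain valid in both `if(...)` tests and `case`
labels.

```c
/* Plain enums: no underlying-type clause, accepted by C89, C99 and C++. */
enum { ex_which = 1 };
enum { ExEmulateICache = 1 /* true */ };

/* Hypothetical dispatch mirroring the `case which:` / `if(...)` uses. */
static int ex_select(unsigned cpu)
{
    switch (cpu) {
    case ex_which:                        /* enum constant as a case label */
        return ExEmulateICache ? 20 : 10; /* folds to 20 at compile time */
    default:
        return 0;
    }
}
```

An enumeration constant is an integer constant expression in C, so the
`case` label needs no language extension.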

== Verification ==

G1 (compile): ss.c with gcc -O2 -- 0 errors, 1 pre-existing
   warning (DSP_DMAFuncTable extern+init from scu.inc, identical to
   the warning that already flags in scu_dsp_misc.c / scu_dsp_gen.c /
   scu_dsp_mvi.c).

G2 (byte equivalence): ss.c .o is byte-for-byte identical to the
   pre-commit baseline -- 1,554,648 bytes either way.  At -O2, gcc's
   value-tracking sees the integer literal in both forms identically
   and emits the same machine code.

G3 (build matrix): tools/check_build_matrix.py green for all 7
   configs.

== Status of MSVC C89 sweep ==

A pattern-based sweep of all mednafen/ss/*.inc files in ss.c's
include chain now reports:

   typed enum:            0  (this commit)
   template<:             0  (retired in earlier phase-8/9 work)
   nullptr:               0
   constexpr:             0  (98bbcc4 sh7095.inc cleanup)
   class:                 0
   namespace:             0
   static_assert (raw):   0  (MDFN_STATIC_ASSERT used throughout)
   auto& / auto*:         0  (99442d8 + 98bbcc4)
   access modifiers:      0

(scsp.inc still has 9 `auto&` / `auto*` sites, but it is included
only from sound_glue.cpp -- still C++ -- so it does not block the
ss.c MSVC C89 build.  Will be addressed when sound_glue.cpp -> .c
work begins.)

The remaining barriers to MSVC C89 compilation of ss.c are
the GCC-extension constructs that the SH7095 emulator relies on
for fast dispatch:

  - Computed goto (sh7095s_ctable.inc + sh7095s_rsu.inc) -- 512
    `&&Resume_NNNN` label addresses and one `goto *tmp;` dispatch.
    These are GCC-only by design (the table-driven fast resume path
    is a documented gcc-extension optimization); MSVC builds would
    need an alternative dispatch (e.g. a switch on a state-id).
  - 3 `typeof(...)` / 3 `__builtin_*` uses in sh7095.inc are gated
    behind `defined(__GNUC__)` already and have MDFN_STATIC_ASSERT
    fallbacks.

The computed-goto question is its own focused effort (the
NO_COMPUTED_GOTO flag that the build matrix passes is currently an
orphan: nothing in the codebase consumes it).  Out of scope for this
commit.

Adds a new compiler-hint macro alongside the existing MDFN_HOT /
MDFN_COLD / MDFN_FORCE_INLINE / NO_CLONE family.  Follows the
same tier shape (GCC, clang, MSVC, fallback) as those siblings.

== Definition ==

    GCC, clang : __builtin_unreachable()
    MSVC       : __assume(0)
    other      : nothing (compiler keeps any bounds checks)

The MSVC arm covers the C89-mode build target that the codebase
maintains.  __assume(0) has been available since VS 2005, predating
any reasonable MSVC version this project would be compiled with.

The fallback expands to nothing -- safe at every use site (the
worst case is the compiler keeps an extra compare + branch that
__builtin_unreachable would have elided).

== Intended use ==

Dense switch dispatches over an enumerated finite-domain integer
where every value is accounted for by a case label.  The default
arm is genuinely unreachable; telling the optimizer so lets it
drop the implicit bounds-check on the jump table.

Concrete imminent consumer: a planned switch-based replacement for
the GCC computed-goto dispatch in SH7095_RunSlaveUntil's resume
mechanism.  Switch dispatch ALONE keeps a bounds check on the
jump table indirection (~2 cycles extra per dispatch); with
MDFN_UNREACHABLE in the `default:` arm, GCC and MSVC both lower
the switch to a single `jmp *[table + id*8]` -- the same machine
code computed goto produces.  Without the bounds-check elision,
the perf claim of "indistinguishable from computed goto" weakens
to "very close to computed goto".

Useful elsewhere too -- any switch over a strictly bounded enum
in the SH-2 / SCU / SCSP / VDP dispatch trees can drop the
implicit default arm by adding `default: MDFN_UNREACHABLE;`.
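
The tiers from the Definition section and the `default:` use can be
sketched together.  `EX_UNREACHABLE` and `ex_cycles` are hypothetical
names, and the exact preprocessor conditions in mednafen-types.h may
differ from the ones assumed here.

```c
/* Hypothetical name; tiers follow the Definition section. */
#if defined(__GNUC__) || defined(__clang__)
  #define EX_UNREACHABLE __builtin_unreachable()
#elif defined(_MSC_VER)
  #define EX_UNREACHABLE __assume(0)
#else
  #define EX_UNREACHABLE            /* nothing: compiler keeps its checks */
#endif

/* Contract: the caller only ever passes 0, 1 or 2. */
static int ex_cycles(unsigned op)
{
    switch (op) {
    case 0: return 1;
    case 1: return 2;
    case 2: return 4;
    default: EX_UNREACHABLE;
    }
    return 0;   /* only reachable when the hint expands to nothing */
}
```

Passing a value outside the covered set is undefined behaviour on the
GCC / clang / MSVC tiers, so the hint is only appropriate where the
domain really is closed.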

== Verification ==

G1 (compile probe): a 3-case switch with `default: MDFN_UNREACHABLE`
   compiled with gcc -O2 produces a body without any bounds compare
   -- GCC in fact folded the example to a constant-factor multiply,
   demonstrating the optimizer accepted the hint and was free to
   transform.

G2 (codebase compile): ss.c builds clean with the new header --
   0 errors, 1 pre-existing -Wmissing-attributes warning on
   DSP_DMAFuncTable (unchanged from previous commits).

G3 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.  The macro is
   header-only, no codegen change to any existing TU until consumers
   are added.

G4 (byte equivalence): all 56-58 TUs across all 7 configs are
   byte-identical to their pre-commit baseline -- adding an
   unused #define cannot change codegen.

== Footprint ==

mednafen-types.h: +16 insertions, no deletions.  Zero call sites
yet; this is the platform shim, consumers land in follow-up
commits.
…tch (MSVC C89 compat)

== Mechanism change, semantics preserved ==

The slave SH-2 emulator's cooperative-yield/resume mechanism --
the only thing in ss.c that depended on the GCC computed-goto
extension -- now uses a portable switch dispatch.  Cycle-accurate
yield placement is preserved: every CHECK_EXIT_RESUME() call site
stays exactly where it was, every yield still happens at the same
SH-2 instruction sub-cycle, every state save/restore through the
z->Resume_* struct fields is identical.  Only the dispatch
mechanism changes shape -- label-address table -> integer-keyed
switch with `default: MDFN_UNREACHABLE`.
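
A miniature of the integer-keyed resume shape, as a sketch only.  The
struct, the two yield points, and the `budget` parameter are
hypothetical; the real code uses the CHECK_EXIT_RESUME() macro and
MDFN_UNREACHABLE in the default arm rather than a plain `return`.

```c
#include <stdint.h>

/* Hypothetical miniature of a yield/resume routine; not the SH7095 code. */
typedef struct {
    uint16_t resume_id;   /* 0 = start fresh, otherwise a resume point id */
    int      acc;         /* state that must survive a yield */
} ExSlave;

/* Returns 1 when it yielded, 0 when the work ran to completion. */
static int ex_run(ExSlave* z, int budget)
{
    if (z->resume_id) {
        const uint16_t id = z->resume_id;
        z->resume_id = 0;
        switch (id) {
        case 1: goto Resume_1;
        case 2: goto Resume_2;
        default: return 0;   /* the real code uses MDFN_UNREACHABLE here */
        }
    }

    z->acc = 0;

    z->acc += 10;
    if (budget-- <= 0) { z->resume_id = 1; return 1; }   /* yield point 1 */
Resume_1:;

    z->acc += 20;
    if (budget-- <= 0) { z->resume_id = 2; return 1; }   /* yield point 2 */
Resume_2:;

    z->acc += 30;
    return 0;
}
```

Calling `ex_run` repeatedly with a zero budget walks through both
yield points and ends with the same `acc` as a single uninterrupted
call, which is the property the real dispatch has to preserve.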

== The seven coordinated changes ==

(1) Macro: CHECK_EXIT_RESUME__(n) (sh7095.inc).
    Before: `z->ResumePoint = &&Resume_ ## n;`  (GCC label-address)
    After:  `z->resume_id = (n);`                (portable integer)
    The `Resume_ ## n:;` local label that catches the resume goto
    stays in place at every CHECK_EXIT_RESUME() expansion site.

(2) Function-entry dispatch (sh7095s_rsu.inc).
    Before:
        if (z->ResumePoint) {
            const void* tmp = z->ResumePoint;
            z->ResumePoint = NULL;
            goto *tmp;                /* GCC computed goto */
        }
    After:
        if (z->resume_id) {
            const uint16_t id = z->resume_id;
            z->resume_id = 0;
            switch (id) {
        #if SH7095_DEBUG_MODE
                #include "sh7095s_ctable_dm.inc"
        #else
                #include "sh7095s_ctable.inc"
        #endif
                default: MDFN_UNREACHABLE;
            }
        }

(3) Regenerated sh7095s_ctable.inc + sh7095s_ctable_dm.inc.
    Each .inc now contains 392 `case N: goto Resume_N;` lines
    (5001..5392 non-debug, 10001..10392 debug) instead of 512
    `&&Resume_N,` array entries inside `#if __COUNTER__ >= 5514`
    guards.  The pre-conversion files over-allocated to 512
    entries because the guard pattern emitted a null slot for
    every counter value < 5514 -- 120 of those slots per file
    were always null.  Net file shrink: ~1100 lines each (per the
    -3072 / +784 totals under Net diff).
    Generator at notes/build_sh7095s_ctable.c updated to emit
    the new format and to take a `debug` argument for the
    debug-mode file.

(4) PSEUDO_DMABURST handler (sh7095_ops.inc).
    Deleted the runtime ResumeTable[512] init block (was
    initialized once on first slave invocation with
    bound_timestamp=0).  The switch dispatch is purely
    compile-time -- no runtime initialization needed.
    Counter accounting in the surrounding MDFN_STATIC_ASSERT:
    the post-handler assertion drops `+ 512 + 1` (the 512
    `#if __COUNTER__ >= 5514` consumptions in the deleted
    table file and the +1 from the deleted in-block assertion),
    leaving just `+ 393` to match the 392 CHECK_EXIT_RESUME()
    expansions that precede PSEUDO_DMABURST in the opcode switch.

(5) SH7095 struct (sh7095.h).
    Before:
        const void* ResumePoint;        /* 8-byte pointer */
        ...
        const void*const* ResumeTableP[2];
    After:
        uint16_t resume_id;             /* 2-byte integer */
        ...
        /* (ResumeTableP[] field removed entirely) */
    Net struct size: -6 bytes for the ResumePoint replacement
    (8-byte pointer swapped for the 2-byte resume_id, before any
    alignment padding), -16 bytes for ResumeTableP removal.
    Field-offset shifts cascade through ss.c's struct accesses
    but the codegen size stays at 350892 bytes of .text -- gcc
    allocates equivalent instructions with adjusted displacement
    constants.

(6) Slave initialization block (sh7095.inc).
    Before: 22-line `for (dm = 0; dm < 2; dm++)` loop that
    cleared ResumePoint/ResumeTableP, called
    SH7095_RunSlaveUntil(z, 0) and SH7095_RunSlaveUntil_Debug(z, 0)
    with bound=0 (whose only purpose was to walk into
    PSEUDO_DMABURST and trigger the per-mode table init), then
    asserted the table got populated.
    After: one line: `z->resume_id = 0;`.  No runtime init
    is needed for a compile-time switch.

(7) SH7095_StateAction_SlaveResume (sh7095.inc).
    Savestate compatibility preserved: the persisted field is
    still `ResumePointI` (int32_t) holding the same 0..511 index
    that the pre-conversion format used.  The save side now
    computes `ResumePointI = top - z->resume_id` where `top` is
    5512 (non-debug) or 10512 (debug) -- the same arithmetic
    that the deleted table was implementing as a runtime lookup
    (table[N] was &&Resume_(top-N)).  The load side reverses it:
    `z->resume_id = top - ResumePointI`.  Old saves load to
    the same instruction-handler resume point.  Four scattered
    `z->ResumePoint = NULL;` clears (in Reset, TruePowerOn,
    Init, SetMD5) become `z->resume_id = 0;`.
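
The save/load arithmetic in item (7) can be sketched as a pair of
helpers.  The helper and enum names are hypothetical; the `top` values
are the ones stated above.  How the real code persists the "no resume
pending" state (resume_id == 0) is not described here, so the sketch
leaves that case out.

```c
#include <stdint.h>

/* `top` values as stated in item (7); the names are hypothetical. */
enum { EX_TOP_NONDEBUG = 5512, EX_TOP_DEBUG = 10512 };

/* Save side: in-memory resume id -> persisted ResumePointI index. */
static int32_t ex_id_to_index(uint16_t resume_id, int debug)
{
    const int32_t top = debug ? EX_TOP_DEBUG : EX_TOP_NONDEBUG;
    return top - (int32_t)resume_id;
}

/* Load side: persisted index -> in-memory resume id. */
static uint16_t ex_index_to_id(int32_t index, int debug)
{
    const int32_t top = debug ? EX_TOP_DEBUG : EX_TOP_NONDEBUG;
    return (uint16_t)(top - index);
}
```

The two helpers are inverses of each other for a fixed mode, which is
what lets an old save land on the same resume point.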

== Verification ==

G1 (compile): ss.c with gcc -O2 -- 0 errors, 1 pre-existing
   warning (DSP_DMAFuncTable extern+init from scu.inc, unchanged
   from previous commits).

G2 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.  Same TU count
   pre- and post-commit; matrix dispatcher correctly hands ss.c to
   gcc on both branches.

G3 (codegen footprint): ss.c .o `.text` section is 350892 bytes
   pre- and post-commit -- byte-count equality.  The byte
   contents differ throughout (~58K bytes) because every struct
   field after the removed ResumePoint/ResumeTableP shifts to a
   new displacement, and gcc relays out the function (rare-path
   dispatch moves out-of-line; hot-path bound check stays at the
   function head).  Same instructions, different addresses /
   different inlining order, identical total mass.

G4 (regenerator): notes/build_sh7095s_ctable.c rebuilt and
   compared against the committed .inc files -- byte-identical
   output for both modes.  Regeneration recipe documented in
   the generator's leading comment.

G5 (dispatch shape, manual): the resume entry of
   SH7095_RunSlaveUntil now generates
       movzwl 0x(rdi),%eax              ; load resume_id (u16)
       test   %ax,%ax
       jne    <resume_block>             ; non-zero -> dispatch
       ; ... main loop ...
   <resume_block>:
       xor    %edx,%edx
       sub    $5001,%ax                  ; index = id - 5001
       mov    %dx,0x(rdi)                ; resume_id = 0
       movzwl %ax,%edx
       lea    <jump_table>(%rip),%rax
       movslq (%rax,%rdx,4),%rdx
       add    %rdx,%rax
       jmp    *%rax                       ; dispatch (1 indirect jump)
   vs the pre-commit
       mov    0x(rdi),%rax               ; load ResumePoint (u64)
       test   %rax,%rax
       je     <main_loop>
       movq   $0,0x(rdi)
       jmp    *%rax
   Hot path (no yield to resume): 3 -> 3 instructions in the listings
   above (load + test + branch).  Cold path (resume): 5 -> 11
   instructions, ONCE per yield-resume cycle.  MDFN_UNREACHABLE in
   the default arm elided the jump-table bounds check, matching
   computed-goto's one-indirect-jump dispatch.

== Behaviour / accuracy ==

Bit-identical cycle accuracy.  Every yield-check fires at the
same SH-2 cycle as before; every state field saves and restores
identically.  Master-slave sync timing unchanged.  Saves made
before this commit load to the same instruction-handler resume
point.  The pre-emption granularity is unchanged at SH-2 sub-cycle
level (mid-memory-access yield is preserved).

== MSVC C89 compatibility ==

ss.c no longer requires GCC label-address (`&&label`) or computed-
goto (`goto *ptr;`) extensions for its dispatch.  Combined with the
prior C89-prep work (typed-enum retirement, _Static_assert ->
MDFN_STATIC_ASSERT, alignas -> MDFN_ALIGN, etc.) and MDFN_UNREACHABLE
shipping `__assume(0)` on MSVC, the dispatch portion of ss.c is now
MSVC C89-buildable.  Other GCC-isms in the broader sh7095.inc chain
(typeof / __builtin_types_compatible_p in 6 sites) remain gated
behind `defined(__GNUC__)` with portable fallbacks.

== Net diff ==

  -3072 lines of .inc table file boilerplate (pre-conversion)
   +784 lines of switch case statements (post-conversion)
    +68 lines of regenerator improvements (parameterized, documented)
    -84 lines of runtime table-init / slave-init / save-state code
   -----
  -2304 lines net.

Last remaining cluster of C++-only function-parameter syntax in
scu.inc: 15 inline bus-IO helpers carry C++ default arguments on
their trailing `_thing` pointers, of the shape

    static INLINE void BBusRW_DB_u8_W1_SH0(
                  uint32_t  A,
                  uint16_t* DB,
                  int32_t*  time_thing,
                  int32_t*  dma_time_thing      = NULL,    <-- C++
                  int32_t*  sh2_dma_time_thing  = NULL)    <-- C++

C++ inserts the defaults at every short call site automatically.
C99/C11 has no such facility; a call passing fewer than the
declared number of arguments is a compile error.  This commit
drops the `= NULL` from every signature and appends a matching
number of explicit `NULL` arguments at every under-supplying call
site, leaving the call sites that already passed all five (or four,
for ABus_Read) arguments exactly as they were.
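
A sketch of the before/after shape at one call site.  The helper
`ex_bus_write`, its body, and the address used are hypothetical; only
the parameter list mirrors the signature quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus helper: every parameter is explicit, no `= NULL`. */
static void ex_bus_write(uint32_t A, uint16_t* DB, int32_t* time_thing,
                         int32_t* dma_time_thing, int32_t* sh2_dma_time_thing)
{
    (void)A;
    *DB = 0xBEEF;
    *time_thing += 1;
    if (dma_time_thing)     *dma_time_thing     += 2;
    if (sh2_dma_time_thing) *sh2_dma_time_thing += 3;   /* skipped when NULL */
}

/* A former 4-argument call site, with the trailing NULL written out. */
static int32_t ex_short_call(void)
{
    uint16_t db = 0;
    int32_t  t = 0, dma_t = 0;

    ex_bus_write(0x100u, &db, &t, &dma_t, NULL);
    return t + dma_t;
}
```

The explicit NULL is the same value the C++ front end used to insert,
so the callee sees an unchanged argument list.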

== Function inventory (15 sigs) ==

 6 BBusRW_DB family (5-arg sigs, last 2 defaulted):
    BBusRW_DB_u8_W1_SH0      L1512
    BBusRW_DB_u16_W1_SH0     L1638
    BBusRW_DB_u16_W1_SH1     L1764
    BBusRW_DB_u32_W1_SH0     L1885
    BBusRW_DB_u16_W0_SH0     L2011
    BBusRW_DB_u16_W0_SH1     L2120

 5 ABusRW_DB family (5-arg sigs, last 2 defaulted):
    ABusRW_DB_u8_W1_SH0      L2229
    ABusRW_DB_u16_W1_SH0     L2330
    ABusRW_DB_u16_W1_SH1     L2431
    ABusRW_DB_u16_W0_SH0     L2532
    ABusRW_DB_u16_W0_SH1     L2634

 3 ABus_Write_DB32 family (5-arg sigs, last 2 defaulted):
    ABus_Write_DB32_u8       L2756
    ABus_Write_DB32_u16      L2763
    ABus_Write_DB32_u32      L2770

 1 ABus_Read (4-arg sig, last 2 defaulted):
    ABus_Read                L2782

== Caller inventory (18 sites updated) ==

A repo-wide audit of every call site to all 15 above, grouped by
arg count:

  5-arg sigs:
     16 sites already pass all 5 args explicitly                 -> NO change
     16 sites pass 4 args (rely on sh2_dma_time_thing default)   -> append `, NULL`

  4-arg sig (ABus_Read):
      3 sites pass all 4 args explicitly                         -> NO change
      2 sites pass 3 args (rely on sh2_dma_time_thing default)   -> append `, NULL`

Total of 18 call sites get a `, NULL` appended at end.  No
call relied on the `dma_time_thing` default (every short call
already passes `dma_time_thing` explicitly), so a single appended
NULL covers every site.

The 18 sites span:
   - scu.inc:2814, 2946, 3078  (ABus_Read in SCU register-read fast paths)
   - scu.inc:3211, 4165         (ABus_Read in DMA bus-read forwarders;
                                 the second is inside the DSP MVI
                                 macro at scu.inc:4165 -- backslash-
                                 continued; appended NULL kept on the
                                 same source line)
   - scu.inc:3219, 3222, 4155, 4158  (BBusRW_DB_u16_W0_SH{0,1} in DMA
                                       paths)
   - scu.inc:3531, 3544, 3557        (ABus_Write_DB32_u{8,16,32} in
                                       DMA write paths)
   - scu.inc:3573, 3589, 3605, 4104, 4109, 4115
                                      (BBusRW_DB_u{8,16,32}_W1_SH{0,1}
                                       + ABus_Write_DB32_u32 in DSP
                                       T0-tracked paths)
   - scu.inc:2881, 2898  (not among the 18: these already pass all 5
                          args, so no NULL was appended; noted only
                          because the function name is followed by
                          whitespace before `(`, and that
                          space-before-paren shape is preserved
                          verbatim)

== Behaviour ==

In C++ every site that previously passed N < total args was
compiled as if it explicitly listed the defaulted values.  Adding
those NULLs by hand makes every call site syntactically explicit
about what the compiler was already doing implicitly.  The
optimizer sees the same machine code either way.

== Verification ==

G1 (compile): g++ -O2 -std=c++11 on ss.cpp -- 0 errors, 0 warnings.

G2 (byte equivalence): ss.cpp .o is BIT-IDENTICAL to the pre-commit
   baseline (1,447,896 bytes either side).  Default-argument
   substitution is a parse-time / front-end concern; by the time
   the optimizer runs, the call site has the same argument list
   either way, and the back end emits the same bytes.  Confirmed
   by isolating just this change (revert MDAP follow-up commit) and
   running cmp -s on the two object files.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

== C-compile progress ==

After this commit a C-compile of ss.cpp (gcc -std=gnu99 with
extern "C" markers stripped) reports 26 errors, down from 50 (the
4 "expected ';', ',' or ')' before '=' token" sites at L1512,
1638, 1764, 1885 -- and their L2011 / L2120 / L2229 / L2330 / L2431
/ L2532 / L2634 / L2756 / L2763 / L2770 / L2782 siblings -- are
now clean).  The remaining errors are dominated by the 7
`MDAP(DSP.DataRAM)` sites that need a C-compat cast (handled in
the next commit), and a couple of structural issues to be
audited individually after that.

MDAP() is a C++ function template defined in mednafen-types.h:

    template <typename T>
    typename std::remove_all_extents<T>::type*
    MDAP(T* v)
    { return (typename std::remove_all_extents<T>::type*)v; }

Given an N-dim array DSP.DataRAM declared as `uint32_t DataRAM[4][64]`,
MDAP(DSP.DataRAM) deduces T = uint32_t[64], the remove_all_extents
chain unwraps to uint32_t, and the function returns a `uint32_t*`
pointing to the first element -- a flat view over the 256-element
storage.

It is C++-only.  The template is gated behind #ifdef __cplusplus in
mednafen-types.h; in C the name MDAP doesn't exist, and a C parse of
scu.inc emits `error: 'MDAP' undeclared`.

Every use in scu.inc happens to be on the same expression, with the
same intent (give me a uint32_t* into the DSP DataRAM flat storage):

    scu.inc:533   *DB = MDAP(DSP.DataRAM)[DSP.RA++];          (read)
    scu.inc:784         MDAP(DSP.DataRAM)[DSP.RA++] = *DB;    (write)
    scu.inc:887   *DB = MDAP(DSP.DataRAM)[DSP.RA++];          (read)
    scu.inc:1138        MDAP(DSP.DataRAM)[DSP.RA++] = *DB;    (write)
    scu.inc:1241  *DB = MDAP(DSP.DataRAM)[DSP.RA++];          (read)
    scu.inc:1492        MDAP(DSP.DataRAM)[DSP.RA++] = *DB;    (write)
    scu.inc:3924        MDAP(DSP.DataRAM)[i] = 0;             (init clear)

All 7 sites get the same mechanical rewrite:

    MDAP(DSP.DataRAM)  ->  ((uint32_t*)DSP.DataRAM)

In both C and C++, a 2-D array decays to a pointer-to-array-of-N and
casting that to `uint32_t*` reinterprets the contiguous storage as
a flat 1-D array of uint32_t.  Strict-aliasing safety: the dereferenced
type after the cast is the same scalar type that the array stores
(uint32_t), so the cast is well-formed under the standard's
type-based aliasing rules in both languages.
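
A sketch of the flat view over a `[4][64]` array.  `ExDataRAM` and
`ex_store_flat` are hypothetical names, and the `& 0xFF` mask is added
here only to keep the sketch in bounds; it is not taken from scu.inc.

```c
#include <stdint.h>

/* Same shape as the DataRAM declaration quoted above; name is hypothetical. */
static uint32_t ExDataRAM[4][64];

/* Write through the flat uint32_t* view, as the rewritten sites do. */
static void ex_store_flat(unsigned index, uint32_t value)
{
    ((uint32_t*)ExDataRAM)[index & 0xFF] = value;   /* mask: sketch-only guard */
}
```

Flat index 65 lands on row 1, column 1 (65 = 1 * 64 + 1), which is the
row-major layout the cast relies on.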

== Behaviour ==

g++ at -O2 inlines the MDAP template instantiation at every call
site, where it collapses to exactly the same `lea / mov` instruction
the explicit cast produces.  The two forms are codegen-equivalent
on every call.

== .text layout ==

.text of ss.cpp .o is the SAME size as the pre-commit baseline
(310,108 bytes) and references the SAME symbol set (nm output:
same set of names), but the .o is NOT byte-identical because gcc
laid the section out slightly differently -- two functions
(SCU_SetRegister and an internal _ZL19SCU_RegRW_DB_u16_W1jPj) ended
up at different addresses within the section.  The function bodies
themselves are identical instruction-for-instruction; only the
PC-relative offsets in the `lea %rip` operands that reach them
differ.  text/data/bss section sizes all unchanged.

This kind of "same code, different layout" shift commonly happens
when the optimizer's heuristics make a slightly different ordering
choice on changed input -- here, the template instantiation
disappearing means gcc isn't carrying around (and later DCE'ing) a
template stub, which can perturb the function-emission order.

Not a regression; just not bit-equivalent at the .o level.

== Verification ==

G1 (compile): g++ -O2 -std=c++11 on ss.cpp -- 0 errors, 0 warnings.

G2 (.text size): 310,108 bytes pre and post.

G3 (symbol set): nm output post-filtering for address columns is
   the same set of symbol names; nothing added, nothing removed.

G4 (build matrix): tools/check_build_matrix.py green for all the
   SCU/SS configs.  The pre-existing m68k_instr_split0.inc:25382
   `'tuple' is not a member of 'std'` failure introduced by
   origin/master's 77ec9a0 "Cleanups" commit reproduces on plain
   77ec9a0 without this commit on top -- unrelated to this work.

== C-compile progress ==

A naive C-compile experiment of ss.cpp (gcc -std=gnu99 after a
top-level `extern "C" ` strip on ss.cpp itself) reports 50 errors
both before and after this commit -- but the *category mix* changes
significantly.  Before: 7 errors of the shape `MDAP undeclared` at
the MDAP(DSP.DataRAM) sites in scu.inc, plus residual errors from
other sources.  After: the MDAP errors are gone; what surfaces is
that scu.inc contains its own `extern "C" { ... }` blocks (at
L3875, L3964, L4227) that the top-level `extern "C" ` strip cannot
reach, and sh7095.inc:4027 has Cache->Tag member accesses through
what is currently a C++ method dispatch.  Different work, not a
regression of the C parse.

== Status of the ss.cpp -> ss.c rename ==

This commit clears the last C++ template usage in scu.inc.  The
ss.cpp body (the part directly in ss.cpp, not transitively
through scu.inc) is now C-clean modulo the 22 `extern "C"` markers
that the .cpp -> .c rename will atomically remove.  scu.inc's
remaining C-parse blockers are structural -- inner `extern "C" { }`
linkage blocks, and Cache->Tag access shape -- each its own focused
follow-up commit.

Last GCC-only barrier in the ss.c include chain.  The RESUME_VAR
macro at sh7095.inc:5246 had two branches:

  C++   -- std::is_same<T, decltype(n)>::value
  C     -- __builtin_types_compatible_p(T, __typeof__(n))

The C branch unconditionally consumed two GCC extensions
(__builtin_types_compatible_p and __typeof__), neither of which MSVC
implements under any of its C-mode flags.  On a hypothetical
MSVC C89 build of ss.c, the macro would fail to compile at all 20
of its call sites in sh7095_ops.inc and sh7095.inc.

Add a third tier:

  C (other, e.g. MSVC) -- sizeof(T) == sizeof(n)

It's a partial check, not a full type check.  It catches size-class
mismatches: declaring T = int32_t for a uint8_t variable triggers,
because the negative-array-bound trick that MDFN_STATIC_ASSERT falls
back to under MSVC C89 turns `sizeof(int32_t) == sizeof(uint8_t)`
(4 == 1, false) into `typedef char NAME[-1]`, a hard compile
error.  It does NOT catch sign mismatches or unrelated types of the
same size (int32_t, uint32_t and float are all 4 bytes, so they
compare equal in size).
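
A minimal sketch of the tiering, under stated assumptions.  The
`DEMO_*` macros and `demo_regs` are hypothetical names; the real
RESUME_VAR and MDFN_STATIC_ASSERT definitions are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical stand-in for MDFN_STATIC_ASSERT's C89 fallback: a false
   condition yields `typedef char name[-1]`, a hard compile error. */
#define DEMO_STATIC_ASSERT(cond, name) typedef char name[(cond) ? 1 : -1]

/* Size-only comparison: the fallback tier for compilers without the
   GCC builtins. */
#define DEMO_SIZE_MATCH(T, n) (sizeof(T) == sizeof(n))

#if defined(__GNUC__) || defined(__clang__)
/* Full check: the two types must actually be compatible. */
#define DEMO_TYPE_MATCH(T, n) __builtin_types_compatible_p(T, __typeof__(n))
#else
#define DEMO_TYPE_MATCH(T, n) DEMO_SIZE_MATCH(T, n)
#endif

/* Illustrative state with one 1-byte and one 4-byte field. */
struct Demo_Regs
{
   uint8_t  flag;
   uint32_t count;
};

struct Demo_Regs demo_regs;

/* Both pass on every tier: declared type and field type agree. */
DEMO_STATIC_ASSERT(DEMO_TYPE_MATCH(uint8_t,  demo_regs.flag),  demo_check_flag);
DEMO_STATIC_ASSERT(DEMO_TYPE_MATCH(uint32_t, demo_regs.count), demo_check_count);

/* Would be rejected on every tier (4 != 1), so it stays commented out:
   DEMO_STATIC_ASSERT(DEMO_TYPE_MATCH(int32_t, demo_regs.flag), demo_check_bad); */
```

Note that `DEMO_SIZE_MATCH(int32_t, demo_regs.count)` is true even
though the field is unsigned; that is the sign-mismatch gap the
size-only tier accepts.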

Compared to the alternatives:
   - Drop the check entirely on MSVC: zero safety net.
   - Add a runtime check: not zero-cost, lose static-assert semantics.

The sizeof fallback is the smallest-impact compromise.  GCC and
clang continue to get the full type check; MSVC builds get
size-only checking, which is better than nothing.

== Verification ==

G1 (compile): ss.c with gcc -O2 -- 0 errors, 1 pre-existing warning
   (DSP_DMAFuncTable from scu.inc).  GCC takes the
   `defined(__GNUC__) || defined(__clang__)` branch -- identical
   path it took before.

G2 (byte equivalence): ss.c .text section is byte-identical to
   pre-commit baseline.  The MSVC fallback branch is dead code on
   GCC builds; no codegen change.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs.

G4 (fallback probe): a standalone C89-strict probe with the same
   MDFN_STATIC_ASSERT(sizeof) fallback shape:
     - 3 valid RESUME_VAR calls (matching sizes): 0 errors,
       0 warnings under `gcc -std=c89 -pedantic-errors`.
     - 1 size-mismatch call (int32_t for a uint8_t field):
       errors with "size of array 'static_assert_N' is negative",
       confirming the negative-array-bound trick fires correctly
       to reject the mismatched declaration.

== MSVC C89 barrier status for ss.c ==

Sweep of mednafen/ss/*.inc files in ss.c's include chain:

   pre-commit:   1× __builtin_types_compatible_p, 1× __typeof__
                 (both in RESUME_VAR's `#else` branch)
   post-commit:  0× (both now gated behind
                 `defined(__GNUC__) || defined(__clang__)`)

scsp.inc still has 1× computed-goto-label-address, but is only
included from sound_glue.cpp (still C++), so does not block ss.c's
MSVC C89 build.

Combined with the prior phase-9 work (typed-enum retirement,
ss.cpp -> ss.c rename, computed-goto -> switch dispatch
conversion, MDFN_UNREACHABLE / MDFN_STATIC_ASSERT shims, etc.),
ss.c is now -- to the best of the static-analysis tooling's
knowledge -- free of MSVC-C89-incompatible constructs.  Actual
MSVC verification still requires a real MSVC build.

Cleans up 5 `initialized and declared 'extern'` warnings that have
been polluting every build of the SS core since the C-conversion
split out the DSP function-pointer tables:

   mednafen/ss/scu.inc:4240: warning: 'DSP_DMAFuncTable' initialized and declared 'extern'
   mednafen/ss/scu_dsp_gen.c:334:  warning: 'DSP_GenFuncTable' initialized and declared 'extern'
   mednafen/ss/scu_dsp_jmp.c:138:  warning: 'DSP_JMPFuncTable' initialized and declared 'extern'
   mednafen/ss/scu_dsp_misc.c:87:  warning: 'DSP_MiscFuncTable' initialized and declared 'extern'
   mednafen/ss/scu_dsp_mvi.c:196:  warning: 'DSP_MVIFuncTable' initialized and declared 'extern'

== Cause ==

Each of the 5 sites had the shape

   MDFN_HIDE extern void (*const DSP_<KIND>FuncTable[...])(struct DSPS*) =
   {
       <512-or-so initializer entries>
   };

i.e. `extern <type> <name> = <initializer>;`.  C99 6.9.2 and C++
[basic.def] both rule that when an initializer is present, the
declaration IS a definition, and the `extern` storage-class
specifier is redundant -- the linkage from the matching header
declaration carries through.  GCC and clang warn about the
redundancy; MSVC accepts it silently.

== Fix ==

Drop the `extern` keyword from the definition site at each of the
5 files.  The 5 matching `extern` DECLARATIONS in
scu_dsp_common.inc (lines 166, 173, 177, 180, 184) stay
unchanged -- those are real extern declarations (no initializer)
and need the keyword to express external linkage to consumers.

   Before (definition):
     MDFN_HIDE extern void (*const DSP_DMAFuncTable[2][8][8])(struct DSPS*) = { ... };
   After:
     MDFN_HIDE       void (*const DSP_DMAFuncTable[2][8][8])(struct DSPS*) = { ... };

   Unchanged (declaration in scu_dsp_common.inc):
     MDFN_HIDE extern void (*const DSP_DMAFuncTable[2][8][8])(struct DSPS*);

This is the standard separate-declaration / definition pattern --
declarator in the header gets `extern`, definer in the source does
not.
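
The pattern can be sketched in a single file as follows.  The
`Demo_*` names and the two-entry table are illustrative assumptions;
MDFN_HIDE is left out to keep the example self-contained.

```c
/* Hypothetical state type; illustrative only. */
struct Demo_State { int acc; };

/* Header side: declaration only, so `extern` is required. */
extern void (*const Demo_FuncTable[2])(struct Demo_State *);

static void Demo_Inc(struct Demo_State *s) { s->acc++; }
static void Demo_Dec(struct Demo_State *s) { s->acc--; }

/* Source side: the initializer makes this the definition, so no
   `extern`.  Linkage carries over from the declaration above. */
void (*const Demo_FuncTable[2])(struct Demo_State *) =
{
   Demo_Inc,
   Demo_Dec
};

/* Consumer: dispatch through the table by index. */
static void Demo_Dispatch(struct Demo_State *s, unsigned op)
{
   Demo_FuncTable[op & 1](s);
}
```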

== Behaviour ==

Symbol linkage unchanged.  Per C99 6.9/5: "an external definition
is an external declaration that is also a definition of a function
(other than an inline definition) or an object."  Both forms
(with and without redundant `extern`) produce a definition with
external linkage.  `MDFN_HIDE` (which expands to
__attribute__((visibility("hidden"))) on ELF / Mach-O targets,
nothing on Windows or non-GCC) is preserved in both forms.

Verified empirically:

   nm /tmp/ss.base.o | grep DSP_DMAFuncTable
   nm /tmp/ss.new.o  | grep DSP_DMAFuncTable

   Both:  0000000000000000 R DSP_DMAFuncTable  (same offset, same R section, same size)

== Verification ==

G1 (warnings): all 5 `initialized and declared 'extern'` warnings are gone.
   ss.c (which includes scu.inc that holds DSP_DMAFuncTable)
   compiles 0 errors / 0 warnings.  The 4 scu_dsp_*.c TUs also
   compile clean.

G2 (byte equivalence): ss.c .o is byte-identical to pre-commit
   baseline.  Same offset, same instructions, same symbol set.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

G4 (symbol parity): nm output for each of the 5 .o files shows
   the same DSP_*FuncTable symbol at the same offset, in the same
   section (R, rodata), with the same size.  Confirmed by manual
   inspection of nm output for all 5.
…sses

Six -Wunused-variable / -Wunused-but-set-variable warnings in ss.c's
include chain, all artefacts of earlier template-retirement source-fold
work (phase-8q2 for SCU_RegRW_DB, phase-8l for the BSC bus read/write
helpers).  When the source-fold script substituted template parameters
to literals and eliminated dead branches at source level, the local
declarations whose only uses were in the eliminated branches were left
behind.  gcc -O2 was already optimising them out of codegen but the
warnings persisted in build output.

== Sites cleaned ==

  (1) scu.inc:438  SCU_RegRW_DB_u8_W0      drop  unsigned mask;
  (2) scu.inc:792  SCU_RegRW_DB_u16_W0     drop  unsigned mask;
  (3) scu.inc:1146 SCU_RegRW_DB_u32_W0     drop  unsigned mask;
                                                 + the corresponding
                                                 `mask = 0x... << ...;`
                                                 dead assignments.

  These were the read-variant (W0) branches of the post-fold
  SCU_RegRW_DB family.  `mask` was used in the write variants (W1)
  for byte/word write masking; the read variants kept the local from
  the unified template body but had nothing to mask.

  (4) scu.inc:2602 ABusRW_DB_u16_W0_SH0    drop  const uint32_t mask = (true)
                                                 ? 0xFFFF : (0xFF << ...);
  (5) scu.inc:2704 ABusRW_DB_u16_W0_SH1    drop  (same)

  These two are the READ-side ABusRW variants for the CD block region
  (A & 0x7FFF < 0x1000 -- CDB MMIO).  The matching WRITE-side variants
  at scu.inc:2400, 2501 also have the `mask = (true) ? 0xFFFF : ...`
  declaration but DO use it (passed into CDB_Write_DBM as the byte-enable
  mask) -- those stay.  The read paths just feed CDB_Read(offset) and
  don't need a mask.

  (6) sh7095.inc:3117  drop  const unsigned shift = ((((A & 1) ^ 1) << 3));

  An Am-based memory-probe path in the SH-2 BSC bus interface.  The
  parallel site at sh7095.inc:2405 DOES use `shift` (in a switch(Am)
  block that constructs the return value), so that one stays.  Site
  3117 has no such block -- it always returns 0 -- and the shift is
  pure dead code from the source-fold.
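
For illustration, the live-vs-dead distinction looks roughly like
this.  The helpers below are hypothetical and use an assumed
big-endian byte-lane convention; they are not the SCU_RegRW_DB or
ABusRW_DB bodies.

```c
#include <stdint.h>

/* Write variant: `mask` selects the byte lane being replaced, so the
   local is live and stays. */
static void Demo_RegWrite8(uint32_t *reg, unsigned A, uint8_t v)
{
   const unsigned shift = ((A & 3) ^ 3) << 3;
   const uint32_t mask  = (uint32_t)0xFF << shift;

   *reg = (*reg & ~mask) | ((uint32_t)v << shift);
}

/* Read variant after the cleanup: nothing is masked, so no `mask`
   local is declared at all. */
static uint8_t Demo_RegRead8(const uint32_t *reg, unsigned A)
{
   const unsigned shift = ((A & 3) ^ 3) << 3;

   return (uint8_t)(*reg >> shift);
}
```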

== Behaviour ==

Bit-identical codegen.  -O2 already constant-folded and eliminated
these locals; the only output change is that source no longer carries
the declarations and gcc no longer issues the warnings.

Verified empirically:

   ss.c .text size pre-commit:  350892 bytes
   ss.c .text size post-commit: 350892 bytes
   bytes that differ:           1 (relocation offset adjustment from
                                   the source-fold's line-number shift,
                                   not an instruction-level change)

The 1-byte delta is the same kind of layout shift that any non-code
edit (whitespace, comment) produces under -O2 with stabs / dwarf debug
info disabled -- gcc's reloc table picks a different displacement
encoding.  Instruction stream is identical.

== Verification ==

G1 (warnings): -Wall -Wunused-but-set-variable on ss.c now reports
   2 warnings, down from 8 (the 6 cleaned plus 2 long-standing ones
   on `RTCLang_List` and `IntNames` -- both intentionally defined
   for later use / debug builds, left untouched).

G2 (byte equivalence): ss.c .text byte-count identical to baseline,
   1 byte differs across 350892 (reloc offset shift only).

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

G4 (parallel-site check): the analogous WRITE-side ABusRW sites at
   scu.inc:2400 and 2501 still have their `mask` declaration; manual
   inspection confirms `CDB_Write_DBM(offset, *DB, mask);` consumes
   it.  The parallel SH-2 BSC site at sh7095.inc:2405 still has its
   `shift` declaration; the switch(Am) block below it uses `shift`
   to assemble the return value.
…mpat prep)

Preparation for the upcoming sound_glue.cpp -> sound_glue.c
conversion.  scsp.inc had 11 sites using C++11 type-inference
shorthand (`auto* s = &z->Slots[N]`, `auto* t = &z->Timers[N]`)
that would have to become explicit-typed pointer declarations
once the host TU stops being C++.  Convert them now; the C++
codegen is unchanged because `auto*` and an explicit pointer
type are identical at the language level.

Two pieces:

(1) scsp.h: SS_SCSP_Timer becomes a named, file-scope struct.

   The `Timers[3]` field used an anonymous inner struct
   (`struct { uint8_t Control; uint8_t Counter; int32_t Reload; }
   Timers[3];` inside `struct SS_SCSP`).  An anonymous struct type
   has no spellable name, so a pointer to it can only be declared
   when the type is inferred from the initializer -- which is exactly
   what `auto*` was doing in C++, and which C89/C99 cannot express.
   Naming the struct gives the pointer a spellable type.

   Care: in C++, declaring `struct SS_SCSP_Timer { ... } Timers[3];`
   INSIDE `struct SS_SCSP` creates a NESTED type
   `SS_SCSP::SS_SCSP_Timer` -- a different type from the file-scope
   forward-declared `struct SS_SCSP_Timer`.  Pointers don't
   auto-convert between the two, so the `auto*` conversions would
   not compile.

   Fix: move the struct definition to FILE SCOPE in scsp.h (right
   before `struct SS_SCSP`), parallel to how `SS_SCSP_Slot` is
   already organized.  Inside SS_SCSP, the field becomes the
   plain `SS_SCSP_Timer Timers[3];` using the typedef.

   Also adds the matching `typedef struct SS_SCSP_Timer
   SS_SCSP_Timer;` forward-declaration alongside the existing four
   (SS_SCSP_Slot, SS_SCSP_DSPStep, SS_SCSP_DSPS, SS_SCSP) -- same
   "5 named types up front" pattern the rest of the file uses for
   C-compat where the struct-tag-to-typename auto-aliasing
   doesn't apply.

(2) scsp.inc: 11 `auto*` -> explicit pointer-typed declarations.

   - 7 sites:  auto* s  = &z->Slots[N]    ->  SS_SCSP_Slot*  s  = ...
   - 1 site:   auto* ns = &z->Slots[N]    ->  SS_SCSP_Slot*  ns = ...
   - 3 sites:  auto* t  = &z->Timers[N]   ->  SS_SCSP_Timer* t  = ...

   Member access patterns (s->Field) are unchanged since both
   forms are pointer types.  No call sites of these functions
   are affected.
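
A minimal sketch of the resulting shape, valid as both C and C++.
The `Demo_*` names and the reload helper are assumptions for
illustration; only the three timer fields are taken from the text
above.

```c
#include <stdint.h>

/* Forward typedefs up front, so the bare names work in C as well. */
typedef struct Demo_Timer Demo_Timer;
typedef struct Demo_SCSP  Demo_SCSP;

/* File-scope definition: one type, spelled the same in C and C++. */
struct Demo_Timer
{
   uint8_t Control;
   uint8_t Counter;
   int32_t Reload;
};

struct Demo_SCSP
{
   Demo_Timer Timers[3];
};

/* Hypothetical helper; the reload behaviour is made up for the demo. */
static void Demo_ReloadTimer(Demo_SCSP *z, unsigned n)
{
   Demo_Timer *t = &z->Timers[n];   /* was: auto* t = &z->Timers[n]; */

   t->Counter = (uint8_t)t->Reload;
}
```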

== What this commit does NOT do ==

scsp.inc still has 4 C++isms blocking its use as a C-compat
include path:

  - 2 `uint16_t& SRV = z->SlotRegs[...]` references (in CTL_*
    register-write paths -- 30+ usage sites each, need
    *-deref conversion).
  - 1 `for(auto& s : z->Slots)` range-based for at the end of
    SCSP_StateAction.
  - 1 `&&label` computed-goto label-address (deep in the SCSP
    sample-render loop -- analogous to the SH7095 case before
    its switch-dispatch conversion).

Each of those is a focused commit of its own.  This commit is the
"easy half" -- the pure type-inference shorthand that doesn't
need any semantic restructuring.

== Verification ==

G1 (compile): all scsp.h consumers compile clean:
   - sound_glue.cpp (C++):  0 errors, 0 warnings.
   - sound.c (C):           0 errors.
   - ss.c (C):              0 errors.
   - ss_state.c (C):        0 errors.

G2 (byte equivalence): sound_glue.cpp .o is BYTE-IDENTICAL to
   pre-commit baseline (58296 bytes).  In C++, `auto* s =
   &z->Slots[N]` deduces `SS_SCSP_Slot*`; explicit
   `SS_SCSP_Slot* s = ...` produces the same type.  Same
   codegen.  Same machine code.

G3 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

G4 (file-scope vs nested struct, manual): G++ rejected the first
   try where SS_SCSP_Timer was declared inline inside SS_SCSP --
   "cannot convert ‘SS_SCSP::SS_SCSP_Timer*’ to ‘SS_SCSP_Timer*’
   in initialization".  Moving the struct to file scope resolved
   the C++ nested-type-vs-file-scope-typedef mismatch.  Confirms
   the SS_SCSP_Slot-style pattern is the right shape.
…ndexed loop

Smaller half of the scsp.inc C-compat prep that 81a47a4 started.
Picks off the one easy remaining C++-ism: a `for(auto& s : z->Slots)`
range-based-for in the post-state-load fixup at the end of
SCSP_StateAction.

   Before:
     for(auto& s : z->Slots)
     {
        s.EnvLevel &= 0x3FF;
        s.EnvPhase &= 0x3;
     }

   After:
     {
      unsigned i;
      for(i = 0; i < 32; i++)
      {
         SS_SCSP_Slot* s = &z->Slots[i];
         s->EnvLevel &= 0x3FF;
         s->EnvPhase &= 0x3;
      }
     }

The iteration is unchanged: z->Slots is a fixed-size
SS_SCSP_Slot[32] array, so the explicit `i < 32` bound matches
what the range-for would walk.  Member access shifts from `.` to
`->` per the pointer-form change.  The extra brace pair keeps the
loop's local `i`/`s` out of the enclosing scope (matches the
range-for's scoping).

== State of scsp.inc C-compat sweep ==

   pre-commit:
     2 × T& reference  (uint16_t& SRV)
     1 × range-based for
   post-commit:
     2 × T& reference

These last two are the `uint16_t& SRV = z->SlotRegs[slotnum][...]`
references inside the CTL register-write paths (sites 1029 and
1499 pre-conversion), each with 30+ usage sites in the surrounding
switch on (A >> 1) & 0xF.  Conversion to `uint16_t* SRV = &...` +
`*SRV` derefs everywhere is invasive enough to be its own focused
commit; left for follow-up.

== Verification ==

G1 (compile): sound_glue.cpp (the only consumer of scsp.inc, still
   C++ until the bigger sound_glue.cpp -> .c phase) compiles 0
   errors / 0 warnings.

G2 (byte equivalence): sound_glue.cpp .text section is identical
   to pre-commit baseline.  G++ at -O2 unrolls / inlines the
   indexed-for and range-for forms into the same instruction
   stream (both walk Slots[0]..Slots[31] in order, applying the
   same 0x3FF and 0x3 AND masks to the same struct fields at the
   same offsets).  .o-level metadata differs only in line numbers.

G3 (build matrix): tools/check_build_matrix.py green across all 7
   configs (default_LE / _BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

G4 (pattern sweep): C++-ism scan on mednafen/ss/scsp.inc now
   reports only the 2 outstanding `T& reference` sites; zero
   auto/template/decltype/range-for/constexpr/nullptr.
…nter form

Closes the scsp.inc C++ism count to zero.  This is the third and
final scsp.inc commit in the C-compat prep series:

   cbc9414  named SS_SCSP_Timer + retired 11 auto*  (auto count -> 0)
   5dd3c84  range-based-for retired                (range-for -> 0)
   THIS     2 references retired                    (T& -> 0)

== The conversion ==

Two CTL register-write paths (one each in SS_SCSP_RW_W8 / _W16 --
A < 0x400 / write side, originally a single template body that
phase-8r1's source-fold split into 4 named methods, leaving these
two functionally-identical switch blocks for the W8 and W16
byte/word write paths) had:

   uint16_t& SRV = z->SlotRegs[slotnum][(A >> 1) & 0xF];
   switch ((A >> 1) & 0xF) {
       case 0x00: ... SRV & 0x1000 ... SRV &= 0x0FFF ...
       case 0x01: ... s->StartAddr = ... | SRV;
       case 0x02: ... s->LoopStart = SRV;
       case 0x03: ... s->LoopEnd = SRV;
       case 0x04: ... s->EnvRates[...] = SRV & 0x1F; ...
       case 0x05: ... s->EnvRates[...] = SRV & 0x1F; ...
       case 0x06: ... SRV &= 0x0FFF; s->TotalLevel = SRV & 0xFF; ...
       case 0x07: ... s->ModInputY = SRV & 0x3F; ...
       case 0x08: ... s->FreqNum = SRV & 0x7FF; ...
       case 0x09: ... s->ALFOModLevel = SRV & 0x7; ...
       case 0x0A: ... SRV &= 0x00FF; ...
       case 0x0B: ... SDL_PAN_ToVolume(... (SRV >> 13) & 0x7, ...);
       case 0x0C..0x0F: SRV = 0; break;
   }

In each case `SRV` is a C++ reference -- transparently dereferences
to the underlying `z->SlotRegs[slotnum][(A >> 1) & 0xF]` element.

Convert:
   1. Declaration  uint16_t& SRV = X       ->  uint16_t* SRV = &X
   2. Each use     SRV                     ->  *SRV
      (read:   SRV & 0xF                  ->  *SRV & 0xF)
      (write:  SRV &= 0x0FFF              ->  *SRV &= 0x0FFF)
      (assign: SRV = 0                    ->  *SRV = 0)

Precedence is fine in every use site: `*` (indirection) binds
tighter than the `&`, `|`, `&=`, `|=`, `>>`, `=` operators used
in this switch body, so plain `*SRV` substitution works without
defensive parentheses.
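
A reduced sketch of the pointer form.  Only three of the cases are
kept, the initial store of `V` is an assumption, and the `Demo_*`
names are hypothetical; this is not the SS_SCSP_RW body.

```c
#include <stdint.h>

/* Reduced register-write path using the pointer form. */
static void Demo_SlotRegWrite(uint16_t regs[16], unsigned A, uint16_t V)
{
   uint16_t *SRV = &regs[(A >> 1) & 0xF];   /* was: uint16_t& SRV = ... */

   *SRV = V;

   switch((A >> 1) & 0xF)
   {
      case 0x00: *SRV &= 0x0FFF; break;
      case 0x0A: *SRV &= 0x00FF; break;

      case 0x0C: case 0x0D: case 0x0E: case 0x0F:
         *SRV = 0;
         break;

      default:
         break;
   }
}

/* Unary `*` binds tighter than `>>` and `&`: no extra parentheses
   are needed around `*SRV`. */
static unsigned Demo_TopBits(const uint16_t *SRV)
{
   return *SRV >> 13 & 0x7;
}
```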

== Counts ==

   2 references converted (decls at sites 1029, 1500 pre-conversion).
   88 use sites updated (44 per block, the two switches are
   functionally identical -- the only diff between them is the
   small DMEA/DRGA handling near the bottom which doesn't touch
   SRV).

Substitution is bounded to the same scoping region the original
references lived in (inside the `{ ... }` block where they were
declared, ending at the switch's closing brace + the post-switch
`return;`).  Confirmed by post-pass grep:

   plain `SRV` references after conversion:  2  (the two decls,
                                                 now spelled as
                                                 `uint16_t* SRV =`)
   `*SRV` references after conversion:      88

== State after this commit ==

scsp.inc full C++ism sweep (auto, T&, nullptr, constexpr, decltype,
range-for, template<, class, namespace): 0 hits.

scsp.inc is now C-parseable (modulo any C++-ism a static pattern
sweep might miss).  The remaining barrier to compiling it from a
C TU is the sound_glue.cpp -> .c conversion of the consumer side
itself -- which is its own focused multi-commit phase.

== Verification ==

G1 (compile): sound_glue.cpp (still the only consumer) compiles
   0 errors, 0 warnings.

G2 (byte equivalence): sound_glue.cpp .o is BYTE-IDENTICAL to
   pre-commit baseline.  At -O2, g++ lowers `uint16_t& SRV = X`
   and `uint16_t* SRV = &X` to the same instruction stream --
   reference and pointer-with-deref are isomorphic operations
   on the machine.  Same loads from the same addresses, same
   stores back, same opcodes throughout.

G3 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE / _BE, no_deint_LE, no_chd_LE,
   no_tremor_LE, no_threading_LE, m68k_split_LE), 56-58 TUs each.

G4 (sweep): scsp.inc C++ism pattern scan reports 0 hits across
   all 9 monitored patterns.  Bracketed by the matrix's
   already-green C++-in-C scan on .c files (no leakage there),
   scsp.inc is now C-pattern-clean.

Tiny preparatory commit toward sound_glue.cpp -> sound_glue.c.
M68K::BUS_INT_ACK_AUTO is a value the BusIntAck callback returns
to tell the M68K core to use automatic interrupt-acknowledge
vectoring instead of supplying an explicit vector number.

The constant was declared as a class-scoped anonymous-enum entry
inside `struct M68K`:

   struct M68K
   {
      // ...
      enum { BUS_INT_ACK_AUTO = -1 };
      // ...
   };

with the only user spelling it as `M68K::BUS_INT_ACK_AUTO`
(sound_glue.cpp's SoundCPU_BusIntAck callback return).

C has no class-scope qualifier syntax, so `M68K::BUS_INT_ACK_AUTO`
won't parse once sound_glue.cpp becomes sound_glue.c.  Move the
enum to file scope as `M68K_BUS_INT_ACK_AUTO` (matching the
`M68K_*` free-function naming convention for the rest of the
header's exported API) and update the one consumer.

== Changes ==

   m68k.h:
     +  enum { M68K_BUS_INT_ACK_AUTO = -1 };       (file scope,
                                                    above struct
                                                    M68K, with
                                                    a brief
                                                    rationale
                                                    comment)
     -  enum { BUS_INT_ACK_AUTO = -1 };            (class scope,
                                                    no other users)

   sound_glue.cpp:
        return M68K::BUS_INT_ACK_AUTO;
     -> return M68K_BUS_INT_ACK_AUTO;

Both the comment and the placement parallel the SS_SCSP_* /
SS_SCSP_Timer / SS_SCSP_Slot file-scope-typedef pattern at the
top of scsp.h, where the same C-vs-C++-name-resolution issue was
already worked through (cbc9414).
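
In C terms the result looks roughly like the sketch below.
`DEMO_BUS_INT_ACK_AUTO` and `Demo_BusIntAck` are hypothetical
stand-ins, and the callback signature is an assumption.

```c
/* File-scope anonymous enum: visible to C and C++ under one name. */
enum { DEMO_BUS_INT_ACK_AUTO = -1 };

/* Hypothetical interrupt-acknowledge callback that always asks the
   core for automatic vectoring. */
static int Demo_BusIntAck(unsigned level)
{
   (void)level;
   return DEMO_BUS_INT_ACK_AUTO;
}
```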

== Verification ==

G1 (compile):
   sound_glue.cpp: 0 errors, 0 warnings.
   m68k.cpp:       0 errors, 0 warnings.

G2 (byte equivalence vs pre-commit baseline):
   sound_glue.cpp .o: byte-identical.
   m68k.cpp .o:       byte-identical.
   The enum value (-1) is unchanged; the only thing that
   changed is the unqualified vs class-qualified name used at
   one call site -- in C++ both resolve to the same compile-time
   constant integer, so codegen is bit-for-bit unchanged.

G3 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE/_BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

G4 (scope sweep, manual):
   `grep -rn BUS_INT_ACK_AUTO mednafen/` reports exactly 2 hits:
   the new file-scope definition in m68k.h, and the single
   consumer in sound_glue.cpp -- now using the file-scope name.
   No third party referenced the class-scoped name, so dropping
   it is safe.
…rev_e) constructor

Sound_glue.cpp -> sound_glue.c prep, step 2.  Adds a C-callable
free-function counterpart to the M68K class constructor so the
upcoming `.c` consumer can construct M68K instances without the
C++ ctor-call syntax that doesn't exist in C.

== The contract ==

Both code paths do the same work in the same order:

   1. Stash the revision flag (Revision_E = rev_e).
   2. Null all 7 bus-callback function pointers (BusReadInstr,
      BusRead{8,16}, BusWrite{8,16}, BusRMW, BusIntAck).
   3. Install the file-static Dummy_BusRESET as the default
      BusRESET callback (this one is unconditional rather than
      nullable -- M68K's Reset path unconditionally invokes it).
   4. Zero the per-run state scalars (timestamp, XPending, IPL).
   5. Power-on Reset via Reset(true) / M68K_Reset(z, true).

The C++ ctor uses member-initializer-list syntax for steps 1-3
(needed to satisfy the formerly-`const bool Revision_E` field --
see below); M68K_Construct does the same writes as plain
assignments through its `M68K* z` argument.
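
The five steps can be sketched in plain C as follows.  `Demo_CPU` is
a cut-down hypothetical struct with only two of the bus callbacks,
and `resets` is a demo-only counter; none of this is the real M68K
layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Demo_CPU Demo_CPU;

struct Demo_CPU
{
   bool     Revision_E;                 /* set once in Demo_Construct */
   uint16_t (*BusRead16)(uint32_t A);
   void     (*BusWrite16)(uint32_t A, uint16_t V);
   void     (*BusRESET)(bool state);
   int32_t  timestamp;
   uint32_t XPending;
   uint8_t  IPL;
   unsigned resets;                     /* demo-only counter */
};

/* Header side: plain declaration. */
void Demo_Construct(Demo_CPU *z, bool rev_e);

static void Demo_DummyBusRESET(bool state) { (void)state; }

static void Demo_Reset(Demo_CPU *z, bool powering_up)
{
   (void)powering_up;
   z->BusRESET(true);                   /* invoked unconditionally */
   z->BusRESET(false);
   z->resets++;
}

void Demo_Construct(Demo_CPU *z, bool rev_e)
{
   z->Revision_E = rev_e;               /* 1. revision flag        */
   z->BusRead16  = NULL;                /* 2. null bus callbacks   */
   z->BusWrite16 = NULL;
   z->BusRESET   = Demo_DummyBusRESET;  /* 3. default reset hook   */
   z->timestamp  = 0;                   /* 4. per-run scalars      */
   z->XPending   = 0;
   z->IPL        = 0;
   z->resets     = 0;
   Demo_Reset(z, true);                 /* 5. power-on reset       */
}
```

A caller would declare the object and then call
`Demo_Construct(&cpu, true);` before any other use.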

== The const-correctness give-up ==

`Revision_E` was declared `const bool Revision_E;` -- a
contractual "set once at construction, never modified again"
marker.  C++ member-initializer-list syntax (`: Revision_E(rev_e)`)
is the only way to initialize a const member from a constructor
argument.

C has no member-initializer-list syntax.  M68K_Construct has to
spell the same write as `z->Revision_E = rev_e;`, which a `const`
member rejects ("assignment of read-only member").

Drop the const.  Comments on the field now document the
set-once contract for human consumption; it's preserved by
convention and code review instead of compiler enforcement.
This is the same trade scsp.h made when retiring its C++-only
default-member-initializers in the SS_SCSP struct.

Storage / ABI:  `const` on a non-static data member does not
affect struct layout in either C or C++, so this is a no-op
for sizeof, alignof, offsetof, and savestate field offsets.
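
A quick compile-time probe of that claim, as a sketch.  The two
structs are hypothetical and much smaller than M68K; the checks use
the same negative-array-bound trick described for MDFN_STATIC_ASSERT.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same members in the same order; only the qualifier differs. */
struct Demo_WithConst    { const bool Revision_E; int32_t timestamp; uint8_t IPL; };
struct Demo_WithoutConst { bool       Revision_E; int32_t timestamp; uint8_t IPL; };

/* Each typedef fails to compile (array size -1) if the layouts differ. */
typedef char demo_same_size[
   (sizeof(struct Demo_WithConst) == sizeof(struct Demo_WithoutConst)) ? 1 : -1];

typedef char demo_same_timestamp_offset[
   (offsetof(struct Demo_WithConst, timestamp) ==
    offsetof(struct Demo_WithoutConst, timestamp)) ? 1 : -1];

typedef char demo_same_ipl_offset[
   (offsetof(struct Demo_WithConst, IPL) ==
    offsetof(struct Demo_WithoutConst, IPL)) ? 1 : -1];
```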

The 4 read sites of Revision_E (all `if(!Revision_E ||
CheckPrivilege())` checks in m68k_instr.inc) are unaffected.
No write sites exist outside the constructor (verified via
`grep -rn Revision_E mednafen/` -- only m68k.cpp:60 and
m68k_instr.inc reads, plus the new M68K_Construct write).

== Why M68K_Construct lives in m68k.cpp not m68k.h ==

The body needs two things that are file-local to m68k.cpp:

   - `Dummy_BusRESET` -- a `static MDFN_FASTCALL void
     Dummy_BusRESET(bool state) { }` at the top of m68k.cpp,
     not exposed in any header.
   - `z->Reset(true)` -- calls into the 130k-line m68k_instr.inc
     dispatch machinery; inlining this from a header would pull
     all of m68k.cpp's static private state into every consumer
     TU.

The other M68K_* free-function wrappers in m68k.h are
`static FORCE_INLINE` 1-liner thunks that just call the matching
member method; M68K_Construct is the first free-function wrapper
that's actually substantial enough to deserve its own out-of-line
definition.  Header gets just the declaration; m68k.cpp owns
the body.

== Verification ==

G1 (compile):
   m68k.cpp        : 0 errors, 0 warnings.
   sound_glue.cpp  : 0 errors, 0 warnings.
   m68k.h transitively included by m68k_split.cpp and all SS
   TUs: matrix green (see G3).

G2 (semantic equivalence of M68K::M68K vs M68K_Construct):
   Disassembly of the two emitted functions:

     M68K::M68K(bool)     : 17 instructions
     M68K_Construct(M68K*, bool) : 17 instructions

   Sans address-dependent fixups (RIP-relative offset to the
   literal pool, relative-jump distance to M68K::Reset --
   inevitable because the two functions live at different .o
   offsets), the two instruction streams are identical.  Same
   stores to the same offsets within `M68K`, in the same order,
   followed by the tail call to Reset.

G3 (M68K::M68K body unchanged structurally):
   Baseline vs new m68k.cpp.o, M68K::M68K disassembly:
     instruction count identical (17 vs 17).
     opcodes identical.
     only diffs are the absolute literal-pool offset (shifted
     because the .o is bigger now) and the relative-jump distance
     to M68K::Reset (same reason).

G4 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE/_BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

== What this commit does NOT do ==

   * The `static M68K SoundCPU(true);` declaration in
     sound_glue.cpp is unchanged -- still uses the C++ ctor call.
     Migrating it to `static M68K SoundCPU = {0};` +
     `M68K_Construct(&SoundCPU, true);` happens together with
     the sound_glue.cpp -> .c rename so the file's whole-buffer
     conversion lands atomically.
   * The C++ M68K::M68K constructor is NOT retired.  It still
     exists for any future C++ caller and for back-compat during
     the migration; once nothing calls it the ctor+dtor pair
     becomes retire-able.
LibretroAdmin and others added 11 commits May 17, 2026 21:06
… "C" linkage

Phase-9 prep for sound_glue.cpp -> sound_glue.c, step 3.

The M68K_* free-function API (M68K_Reset, M68K_Run, M68K_SetIPL,
M68K_SignalDTACKHalted, M68K_SignalAddressError, M68K_SetExtHalted,
M68K_StateAction, M68K_GetRegister, M68K_SetRegister, plus
M68K_Construct from 5cafd34) was previously a set of 9
`static FORCE_INLINE` 1-liner thunks in m68k.h that forwarded
to the matching struct M68K member methods (`{ z->Foo(arg); }`).
Worked fine from C++ consumers (sound_glue.cpp, m68k.cpp itself,
m68k_instr_split0/1.cpp under M68K_SPLIT_SWITCH); cannot be
called from a C TU because:

 1. The bodies invoke C++ class methods (`z->Foo(arg)`) -- no
    such syntax in C.
 2. Even if you gated the bodies out for C consumers, a plain
    forward-declaration `void M68K_Foo(M68K*, ...);` in a C++
    header gets C++ name mangling, but a C caller would emit
    the unmangled name -- linker fails to resolve.

Both must be fixed so sound_glue.cpp (the only remaining .cpp
in mednafen/ss/) can be converted to sound_glue.c.

== The conversion ==

m68k.h:

  -  9 `static FORCE_INLINE T M68K_Foo(M68K* z, args) { z->Foo(args); }`
  +  10 `T M68K_Foo(M68K* z, args);` declarations inside one
     `extern "C" { ... }` block (gated by `#ifdef __cplusplus`
     so C consumers see plain C declarations directly).

  M68K_Construct (already out-of-line since 5cafd34) joins the
  same block.  M68K_Construct and M68K_Reset keep their MDFN_COLD
  hints; the rest are hot-path-adjacent enough that COLD would be
  misleading.
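
A minimal sketch of the header shape described above, using a hypothetical `DemoCPU` stand-in rather than the real `struct M68K` (the `DemoCPU_*` names and the fields are illustrative assumptions, not taken from m68k.h). Only a C++ compiler sees the `extern "C"` braces, so C and C++ translation units should agree on the same unmangled symbol names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct M68K; not the real layout. */
typedef struct DemoCPU DemoCPU;
struct DemoCPU
{
  int32_t timestamp;
  uint8_t ipl;
  bool    ext_halted;
};

/* Header part: plain declarations with C linkage in both languages. */
#ifdef __cplusplus
extern "C" {
#endif

void    DemoCPU_SetIPL(DemoCPU* z, uint8_t ipl_new);
void    DemoCPU_SetExtHalted(DemoCPU* z, bool state);
uint8_t DemoCPU_GetIPL(const DemoCPU* z);

#ifdef __cplusplus
}
#endif

/* Implementation part: out-of-line definitions.  In the commit these
   forward to member functions; here they are plain C bodies so the
   sketch compiles on its own. */
void    DemoCPU_SetIPL(DemoCPU* z, uint8_t ipl_new)   { z->ipl = ipl_new; }
void    DemoCPU_SetExtHalted(DemoCPU* z, bool state)  { z->ext_halted = state; }
uint8_t DemoCPU_GetIPL(const DemoCPU* z)              { return z->ipl; }
```

Putting the guard around the declarations, rather than around each definition, keeps the header as the single place where linkage is decided.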

m68k.cpp:

  +  10 out-of-line function definitions inside an
     `extern "C" { ... }` block.  M68K_Construct moves into the
     block; the previous standalone `void M68K_Construct(...)`
     definition with no linkage spec becomes
     `extern "C" void M68K_Construct(...)` so the symbol matches
     the C-linkage declaration in m68k.h.  The 9 new thunk
     bodies are the same `{ z->Foo(args); }` 1-liners the header
     used to carry.

  M68K_Construct's standalone comment from the prep commit (which
  landed one commit ahead of this bulk wrapper block) is replaced
  with a single comment over the whole `extern "C"` block.

== Trade-off ==

We lose call-site inlining of the 9 thunk bodies.  Previously
`SoundCPU.timestamp += 4; M68K_SignalDTACKHalted(&SoundCPU, A);`
inlined to `SoundCPU.timestamp += 4; SoundCPU.XPending |= ...`
at the call site -- the thunk vanished.  Now it's a real PLT
call (`callq M68K_SignalDTACKHalted`).

None of the wrappers are on the M68K inner instruction loop
(M68K::Run dispatches via m68k_instr.inc internal jumps; the
9 wrappers are external orchestration helpers for IRQ change,
savestate, reset, scheduler step, debugger register R/W, and
the cold-path bus-error path).  Per-call overhead is one
out-of-line call and return; profile-wise negligible.

Cross-TU LTO can still inline these, since the bodies are tiny
and visible to the optimizer at link time.

== Verification ==

G1 (compile):
   m68k.cpp                     : 0 errors, 0 warnings.
   sound_glue.cpp               : 0 errors, 0 warnings.
   m68k_instr_split0.cpp        : 0/0 (with -DM68K_SPLIT_SWITCH).
   m68k_instr_split1.cpp        : 0/0 (with -DM68K_SPLIT_SWITCH).

G2 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE/_BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

G3 (ABI symmetry):
   nm on m68k.o: defines 10 unmangled `T M68K_<name>` symbols
   (M68K_Construct, M68K_SetIPL, M68K_SignalDTACKHalted,
    M68K_SignalAddressError, M68K_Reset, M68K_Run,
    M68K_SetExtHalted, M68K_StateAction, M68K_GetRegister,
    M68K_SetRegister).  Zero C++-mangled `_Z` symbols for these.

   nm on sound_glue.o: 9 unresolved `U M68K_<name>` references
   (Construct still not used from there -- that's its own
   follow-up commit), all matching the 10-symbol T-set in
   m68k.o.  Zero orphans (comm -23 / cross-check).

G4 (codegen, expected delta):
   sound_glue.cpp .text bytes grew from 0xe01 (3585) to 0xe21
   (3617), +32 bytes / +0.9%.  Inline thunk bodies replaced by
   PLT calls; SoundCPU_BusRead_u8/u16/BusReadInstr/BusRMW/...
   all have new relocations to M68K_SignalDTACKHalted /
   M68K_SignalAddressError.  No relocations were present in the
   baseline (everything was inlined).

   No code path was added or removed -- only the
   inline-vs-out-of-line boundary moved.

G5 (SoundGlue_* C-linkage preserved): sound_glue.cpp is still
   wrapped in its own `extern "C" { ... }` block, so the 18
   `SoundGlue_*` symbols stay unmangled in sound_glue.o.
   sound.c can still call them.

== What this commit does NOT do ==

This commit makes the M68K_* function API callable from C.  It
does NOT yet make the M68K struct definition C-includable.  The
struct still has class methods inside its body (ctor, dtor, Run,
Reset, Read_u8/16/32, Write_u8/16/32, GetC/V/Z/N/X, etc.) plus
~10 `template<typename...>` declarations at file scope.  A C TU
that includes m68k.h still gets parse errors on those
declarations.

Gating those C++-only declarations with `#ifdef __cplusplus` is
the next focused commit; once that lands, sound_glue.cpp can
become sound_glue.c and the existing `static M68K SoundCPU(true);`
ctor-call site can switch to `static M68K SoundCPU;` plus
`M68K_Construct(&SoundCPU, true);`.
…lusplus`

Phase-9 prep for sound_glue.cpp -> sound_glue.c, step 4 (final
gating).  Makes m68k.h includable from a C TU -- the last
remaining structural blocker before the rename.

Before this commit, including m68k.h from a C source produced
12 errors at the first class-method declaration line:

    m68k.h:43:2: error: expected specifier-qualifier-list before 'M68K'
    m68k.h: ... 11 cascading "unknown type name 'M68K'" errors.

The first error is the `M68K(const bool rev_e = false) MDFN_COLD;`
constructor declaration at the top of `struct M68K { ... }` --
C doesn't allow class-method declarations inside struct bodies,
which then cascades: the struct fails to parse so `M68K` is never
defined as a type, so subsequent uses of bare `M68K*` in the
struct's BusRMW function-pointer signature and in the post-struct
M68K_* free-function declarations all fail too.

== The gating ==

`struct M68K { ... }` carries 5 distinct method-laden regions
interleaved with data members and enum tag-blocks.  Each region
gets bracketed by `#ifdef __cplusplus ... #endif`:

  Region 1 (lines 43-74):
    M68K ctor + dtor, Run, Reset, SetIPL, SetExtHalted,
    INLINE SignalDTACKHalted (body), INLINE SignalAddressError
    (body), StateAction.

  Region 2 (lines 132-168, post-Revision_E):
    RecalcInt, Read_u{8,16,32}, Write_u{8,16,32},
    Write_u32_longdec, Push_u{16,32}, Pull_u{16,32}, ReadOp,
    plus the `#ifdef M68K_SPLIT_SWITCH` { RunSplit0, RunSplit1 }
    block (now nested under the __cplusplus guard).

  Region 3 (lines 201-235, post-AddressMode enum):
    `template<typename T, M68K::AddressMode am> struct HAM;`
    forward declaration, GetC/V/Z/N/X, SetCX, CalcZN_u{8,16,32}
    (clear / no-clear variants), the CalcZN<T, Z_OnlyClear>
    template, GetCCR, SetCCR, GetSR, SetSR, GetSVisor.

  Region 4 (lines 278-510, post-EXCEPTION enum):
    The BIG one.  Exception, all 75+ M68K instruction templates
    (ADD, ADDX, Subtract, SUB, SUBX, NEG, NEGX, CMP, CHK, OR,
    EOR, AND, ORI_CCR/SR, ANDI_CCR/SR, EORI_CCR/SR, MULU, MULS,
    Divide_{u,s}, DIVU, DIVS, ABCD, SBCD, NBCD, MOVEP_{w,l}_
    {mem_to_reg,reg_to_mem}, BTST, BCHG, BCLR, BSET, MOVE,
    MOVEA, MOVEM_to_{MEM,REGS}, ShiftBase, ASL, ASR, LSL, LSR,
    RotateBase, ROL, ROR, ROXL, ROXR, TAS, TST, CLR, NOT, EXT,
    SWAP, EXG, TestCond, Bxx, DBcc, Scc, JSR, JMP,
    MOVE_from_SR, MOVE_to_CCR, MOVE_to_SR, MOVE_USP, LEA, PEA,
    UNLK, LINK, RTE, RTR, RTS, TRAP, TRAPV, ILLEGAL, LINEA,
    LINEF, NOP, RESET, STOP), and CheckPrivilege.

  Region 5 (lines 562-563, post-GSREG enum):
    GetRegister, SetRegister.

Data members (lines 84-103, 122-129, 518-527) and enum tag-blocks
(XPENDING_MASK_*, AddressMode, VECNUM_*, EXCEPTION_*, GSREG_*)
stay OUTSIDE the gates so C consumers see them.  All five enum
blocks are anonymous and contribute no struct field, but the
enumerator names sit at file scope from C's perspective -- their
prefixed names (XPENDING_MASK_, VECNUM_, EXCEPTION_, GSREG_)
already avoid collisions, and no caller anywhere references them
as `M68K::XPENDING_MASK_*` etc. (verified with grep across
mednafen/).

Plus: a `typedef struct M68K M68K;` forward declaration is added
above the struct, parallel to the scsp.h's SS_SCSP_Slot /
SS_SCSP_Timer / SS_SCSP_DSPStep / SS_SCSP_DSPS / SS_SCSP set.
Required because the bare `M68K*` spellings inside the BusRMW
fn-ptr signature and the post-struct M68K_* free-function
declarations don't resolve in C without the typedef.
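
The gating and the forward typedef can be sketched as follows. `DemoCPU`, its fields, and the callback signature are hypothetical stand-ins; the real struct carries five gated regions and many more members:

```c
#include <stdint.h>

/* Forward typedef so bare `DemoCPU*` resolves in C as well as C++. */
typedef struct DemoCPU DemoCPU;

struct DemoCPU
{
#ifdef __cplusplus
  /* C++-only region: member functions are hidden from C consumers. */
  void Reset(bool powering_up);
#endif

  /* Data members stay outside the guard so both languages see them. */
  int32_t  timestamp;
  uint32_t pending;

  /* Function-pointer slot that names the struct type itself. */
  void (*BusRMW)(DemoCPU* z, uint32_t addr);
};

/* Free function taking the struct by pointer; callable from C. */
void DemoCPU_Touch(DemoCPU* z, uint32_t addr)
{
  z->timestamp += 4;
  if(z->BusRMW)
    z->BusRMW(z, addr);
}

/* Example callback matching the BusRMW slot. */
void Demo_RecordAddr(DemoCPU* z, uint32_t addr)
{
  z->pending = addr;
}
```

Without the `typedef`, the `DemoCPU*` parameter in the function-pointer member would have to be spelled `struct DemoCPU*` in C.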

== Layout / ABI ==

Class methods don't take up storage in C++ (they're not virtual
-- no vtable), so the C view of `struct M68K` has the same
data-member layout, the same `sizeof(M68K)`, and the same
offsetof for every field.  Verified by g++ producing
BYTE-IDENTICAL m68k.cpp and sound_glue.cpp .o files vs.
pre-commit baseline -- gating is pure conditional compilation,
C++ TUs see exactly the same struct + methods they did before.
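
A small sketch of how the layout claim could be spot-checked from either language, assuming a hypothetical `DemoCPU` with three fields (the real struct and its offsets differ). The constants are built only from `offsetof` and `sizeof`, so the same header could be compiled as C and as C++ and the values compared:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DemoCPU DemoCPU;
struct DemoCPU
{
#ifdef __cplusplus
  void Run(int32_t run_until_time);   /* non-virtual: adds no storage */
#endif
  uint32_t regs[16];
  int32_t  timestamp;
  uint32_t pc;
};

/* Layout facts expressed as integer constants, so they can be compared
   across a C build and a C++ build of the same header. */
enum
{
  DEMO_OFF_REGS      = offsetof(DemoCPU, regs),
  DEMO_OFF_TIMESTAMP = offsetof(DemoCPU, timestamp),
  DEMO_OFF_PC        = offsetof(DemoCPU, pc),
  DEMO_SIZE          = sizeof(DemoCPU)
};
```

The commit itself relies on byte-identical object files as evidence; a check like this is only a cheaper, narrower alternative.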

== Verification ==

G1 (C-compile probe): a synthetic `#include "m68k.h"; void
   probe(M68K* z) { z->timestamp = 0; z->BusRead8 = 0; ...
   M68K_Reset(z, 1); M68K_Run(z, 100); }` C99 TU compiles with:
     gcc -std=gnu99: 0 errors, 5 advisory warnings.
   The 5 warnings are "declaration does not declare anything"
   on the 5 anonymous `enum { ... };` block closings inside the
   struct body.  Advisory only; the enumerators are introduced
   correctly.  Future commits that actually include m68k.h from
   .c TUs can move these enums to file scope or add `__extension__`
   to silence them; out of scope here since no .c TU includes
   m68k.h yet.

G2 (C++-compile preservation): m68k.cpp, sound_glue.cpp,
   m68k_instr_split0.cpp, m68k_instr_split1.cpp all compile
   0 errors / 0 warnings (matches baseline).

G3 (byte equivalence):
   m68k.cpp .o:        BYTE-IDENTICAL to baseline.
   sound_glue.cpp .o:  BYTE-IDENTICAL to baseline.
   The `#ifdef __cplusplus` guards are true in C++ mode, so the
   preprocessor output is unchanged, so the compiler input is
   unchanged, so the codegen is unchanged.

G4 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE/_BE, no_deint_LE, no_chd_LE,
   no_tremor_LE, no_threading_LE, m68k_split_LE), 56-58 TUs each.
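
One way the advisory warnings noted in G1 might be addressed later, sketched under the assumption that the enumerators are hoisted to file scope. The names and values below are hypothetical, not copied from m68k.h. In C, enumerators declared inside a struct body are already visible at file scope, so hoisting should not change how C code spells them:

```c
#include <stdint.h>

/* Hypothetical flag values; the real XPENDING_MASK_* constants may differ. */
enum
{
  DEMO_XPENDING_MASK_INT       = 0x0001,
  DEMO_XPENDING_MASK_NMI       = 0x0002,
  DEMO_XPENDING_MASK_EXTHALTED = 0x1000
};

typedef struct DemoCPU DemoCPU;
struct DemoCPU
{
  uint32_t XPending;   /* no enum declaration inside the struct body */
};

/* Returns nonzero when the external-halt flag is set. */
int DemoCPU_IsExtHalted(const DemoCPU* z)
{
  return (z->XPending & DEMO_XPENDING_MASK_EXTHALTED) != 0;
}
```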

== What's now unblocked ==

m68k.h is C-includable.  sound_glue.cpp can be renamed to
sound_glue.c, since the only remaining C-incompat in that file
is the `static M68K SoundCPU(true);` ctor-call syntax line --
which 5cafd34 (M68K_Construct) already gave us the C-callable
replacement for.

The rename + Makefile.common SOURCES_CXX -> SOURCES_C move +
single-line ctor->explicit-init swap is the next focused commit.

== What this commit does NOT do ==

- Move the 4 anonymous enums (XPENDING_MASK_*, VECNUM_*,
  EXCEPTION_*, GSREG_*) out of the struct.  They generate
  advisory C warnings but no errors; defer to the next commit
  if the warning load becomes annoying once sound_glue.c
  builds.
- Detemplate struct M68K::HAM<T, AM> or any instruction
  template.  Out of scope -- gated behind __cplusplus, doesn't
  affect C compilation.
- Provide a C-callable Read_u8 / Write_u16 / etc. API.  Not
  needed: sound_glue.c only touches the bus-callback function-
  pointer slots and the M68K_* wrapper API (already C-callable
  after 78f4919).
Phase-9 finale.  sound_glue.cpp -- the C++-side glue file that
held SS_SCSP and M68K class instances plus the M68K bus
callbacks since the Phase-6c sound.cpp split -- becomes
sound_glue.c.  mednafen/ss/ is now 100% C source.

All the structural prerequisites landed earlier in the series:

   af811af  scsp.inc: 2 ref decls + 88 use sites to pointer form (T& -> 0)
   cf8425b  m68k.h: M68K::BUS_INT_ACK_AUTO -> M68K_BUS_INT_ACK_AUTO file-scope
   5cafd34  m68k: M68K_Construct(M68K*, bool) free-function constructor
   78f4919  m68k: M68K_* wrapper API out-of-line, extern "C" linkage (10 fns)
   ef45c3f  m68k.h: class-method / template decls behind #ifdef __cplusplus

== sound_glue.c content changes ==

(1) static M68K SoundCPU(true);  -- C++ ctor-call syntax; no
    equivalent in C.
    ->
    static M68K SoundCPU;        -- zero-initialised at program
                                    load (file-scope `static`).
    +
    M68K_Construct(&SoundCPU, true);  -- explicit at top of
                                         SoundGlue_Init(), in
                                         lieu of the ctor.

    What M68K_Construct does is identical to what the C++ ctor
    did: stash Revision_E, null the 7 bus-callback slots,
    install Dummy_BusRESET as the BusRESET default, zero
    timestamp/XPending/IPL, call Reset(true).  The 8
    SoundCPU.Bus* assignments later in SoundGlue_Init
    overwrite the nulls with real callbacks.

    The pre-existing memset(SS_SCSP_GetRAMPtr(&SCSP) + 0x40000, ...)
    + SS_SCSP_Reset(&SCSP, true) pair stays as-is; it does what the
    SS_SCSP::SS_SCSP() constructor (retired in an earlier phase)
    used to do implicitly.  SS_SCSP was already a pure-data struct,
    so its "constructor" reduces to the zero-initialisation that
    the file-scope `static SS_SCSP SCSP;` provides for free.

(2) extern "C" { ... }  wrapping the 18 SoundGlue_* /
    SOUND_RunSCSP definitions.
    ->
    No wrapper.  This is C now; the function bodies have C
    linkage by default.  sound_internal.h still wraps the
    matching declarations in
        #ifdef __cplusplus
        extern "C" { ... }
        #endif
    so any future C++ caller would see C-linkage names; from
    a C TU plain external linkage matches what sound.c expects.

(3) Comment-cleanup.  The file header, the bus-callback section
    intro, and the SoundGlue_* section intro had narrative
    paragraphs that talked about this being "the C++ side" with
    "class globals" reachable only "through extern \"C\"
    wrappers".  All updated to reflect the C reality.  The
    Makefile.common phase-6c block also gets refreshed.
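
Change (1) above can be sketched roughly as follows. This is a hedged illustration with a hypothetical `DemoCPU`: the field set, the callback signatures, and the reset body are assumptions, and only the ordering of steps follows the description of M68K_Construct:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct DemoCPU DemoCPU;
struct DemoCPU
{
  bool     Revision_E;
  int32_t  timestamp;
  uint32_t XPending;
  uint8_t  IPL;
  uint32_t PC;
  uint8_t  (*BusRead8)(uint32_t addr);   /* hypothetical signature */
  void     (*BusRESET)(bool state);      /* hypothetical signature */
};

static void Demo_DummyBusRESET(bool state) { (void)state; }

/* Hypothetical reset; the real M68K reset does considerably more. */
static void DemoCPU_Reset(DemoCPU* z, bool powering_up)
{
  (void)powering_up;
  z->PC = 0;
}

/* Free-function "constructor": does explicitly what a C++ ctor would. */
void DemoCPU_Construct(DemoCPU* z, bool rev_e)
{
  z->Revision_E = rev_e;
  z->BusRead8   = NULL;                 /* real callback assigned later */
  z->BusRESET   = Demo_DummyBusRESET;   /* safe default */
  z->timestamp  = 0;
  z->XPending   = 0;
  z->IPL        = 0;
  DemoCPU_Reset(z, true);
}

/* File-scope static: zero-initialised at load, constructed in Init. */
static DemoCPU DemoSoundCPU;

static uint8_t Demo_BusRead8(uint32_t addr) { return (uint8_t)addr; }

void DemoGlue_Init(void)
{
  DemoCPU_Construct(&DemoSoundCPU, true);
  DemoSoundCPU.BusRead8 = Demo_BusRead8;   /* overwrite the null slot */
}
```

The explicit call has to run before any other use of the object, which is why the commit places it at the top of the init function.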

== Carried-along scsp.h / scsp.inc fixes ==

The C-compile cascade surfaced three more C++isms that the static
scsp.inc pattern-sweep didn't catch (the sweep matched only
primitive types like `uint16_t&`; `SS_SCSP_DSPStep&` slipped
through):

(4) scsp.h:
       const uint16_t SB_XOR_Table[4] = { 0x0000, 0x7FFF, 0x8000, 0xFFFF };
    A C++11 in-class default-member-initializer inside struct
    SS_SCSP.  C99/C11 doesn't allow `=` initializers on
    struct members.  Moved to file scope as
       static const uint16_t SS_SCSP_SB_XOR_Table[4] = { ... };
    above struct SS_SCSP.  The 2 use sites in scsp.inc become
       s->SBXOR = SS_SCSP_SB_XOR_Table[(*SRV >> 9) & 0x3];
    (was `z->SB_XOR_Table[...]`).  Table is constant data and
    not in the savestate; the per-TU `static const` copy is
    8 bytes.

(5) scsp.inc: 4 `const SS_SCSP_DSPStep&` references in the DSP
    decode + liveness + run-step loops (in
    SS_SCSP_PrepareDSPDecoded / SS_SCSP_LivenessPass /
    SS_SCSP_RunDSP).  Converted to `const SS_SCSP_DSPStep*`,
    with `.X` member access -> `->X`.  88 use sites updated
    via bounded-by-enclosing-for-loop substitution -- same
    technique used for the SRV uint16_t& conversion in af811af.

(6) scsp.inc: 1 `MDAP(z->SlotRegs)` call -- mednafen-types.h's
    `template<typename T> T* MDAP(T (*v)[N])` helper isn't C-
    parseable.  Replaced with the explicit
    `&z->SlotRegs[0][0]` (parallels the scu.inc fix in 1f3e4e8).
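
The three fixes (4)-(6) can be sketched together. The table values and the index expression are the ones quoted above; `DemoStep`, `DemoChip`, their fields, and the array dimensions are hypothetical stand-ins for the real SCSP types:

```c
#include <stdint.h>

/* (4) Table hoisted out of the struct to file scope. */
static const uint16_t Demo_SB_XOR_Table[4] = { 0x0000, 0x7FFF, 0x8000, 0xFFFF };

uint16_t Demo_LookupSBXOR(const uint16_t* SRV)
{
  return Demo_SB_XOR_Table[(*SRV >> 9) & 0x3];
}

/* (5) Reference replaced by a pointer inside a loop body. */
typedef struct
{
  uint8_t  opcode;    /* hypothetical field */
  uint16_t operand;   /* hypothetical field */
} DemoStep;

uint32_t Demo_SumOperands(const DemoStep* steps, unsigned count)
{
  uint32_t sum = 0;
  unsigned i;

  for(i = 0; i < count; i++)
  {
    const DemoStep* st = &steps[i];   /* was: const DemoStep& st = steps[i]; */
    sum += st->operand;               /* was: st.operand */
  }
  return sum;
}

/* (6) Explicit pointer to the first element of a 2-D array,
       in place of a template helper. */
typedef struct
{
  uint16_t SlotRegs[32][16];   /* hypothetical dimensions */
} DemoChip;

uint16_t* Demo_SlotRegsFlat(DemoChip* z)
{
  return &z->SlotRegs[0][0];
}
```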

== Makefile.common ==

   SOURCES_CXX += $(CORE_EMU_DIR)/sound_glue.cpp
   SOURCES_C   += $(CORE_EMU_DIR)/ss.c
   ->
   SOURCES_C   += $(CORE_EMU_DIR)/ss.c \
                  $(CORE_EMU_DIR)/sound_glue.c

The Phase-6c explanatory comment in Makefile.common is refreshed
to describe the post-Phase-9 reality (sound.c + sound_glue.c,
both C, exchanging state via plain C-linkage SoundGlue_*
wrappers around the M68K_* / SS_SCSP_* free-function APIs).

== Verification ==

G1 (compile, primary):
   sound_glue.c (gcc -std=gnu99):  0 errors, 7 warnings.
   The 7 warnings are all "declaration does not declare
   anything" advisories on the anonymous `enum { ... };` block
   closings inside struct M68K (5) and struct SS_SCSP (2).
   Advisory only -- the enumerators are introduced correctly,
   no -Werror in the project's WARNINGS list.  Out of scope
   here; addressable in a follow-up by moving the enums to
   file scope or using __extension__.

G2 (compile, byte-identical preservation on other TUs):
   m68k.cpp:     byte-identical to baseline.
   ss.c:         byte-identical to baseline.
   sound.c:      byte-identical to baseline.
   ss_state.c:   byte-identical to baseline.
   Confirms the m68k.h / scsp.h / scsp.inc changes don't ripple
   into any consumer outside sound_glue itself.

G3 (link surface): nm orphan check on sound_glue.c.o:
   needs (U):  10 symbols  (M68K_*, SS_SCSP_*).
   defs  (T):  20 symbols in m68k.o + ss.o + sg.o combined.
   orphans:    0  (comm -23 / cross-check, empty).
   All M68K_* and SS_SCSP_* references resolve via the existing
   ABI surface.  SoundGlue_* exports are unmangled C-linkage
   names (Init, M68K_*, SCSP_*, RunSCSP) -- matches what sound.c
   already calls.

G4 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE/_BE, no_deint_LE, no_chd_LE,
   no_tremor_LE, no_threading_LE, m68k_split_LE), 56-58 TUs
   each.  The matrix's CXX-in-C pre-compile pattern gate now
   scans sound_glue.c too; clean (no typed-enum/template/
   namespace/nullptr/static_assert/class/alignas/extern "C" /
   STL hits).

G5 (mednafen/ss/ .cpp inventory): post-rename,
   `find mednafen/ss/ -name '*.cpp'` returns empty.  The
   directory is 100% C.  The remaining .cpp files in the project
   (mednafen/hw_cpu/m68k/m68k.cpp + m68k_instr_split0/1.cpp +
   gen.cpp / gen_split.cpp) all live under m68k/.

== Trade-offs ==

sound_glue.c.o is ~20KB bigger than the pre-rename
sound_glue.cpp.o (24KB vs 3.6KB .text).  Cause: the 5 non-
static INLINE functions in scsp.inc (SS_SCSP_RW_u8_W0/W1,
SS_SCSP_RW_u16_W0/W1, SS_SCSP_RunSample) emit out-of-line
copies in C (T-linkage), whereas the C++ frontend used vague
linkage (W) and inlined them aggressively into the SoundGlue_*
wrappers at -O2.  Final .so size grows by roughly the same
amount, ~0.5-1% of typical libretro-core size.  Easily clawed
back in a follow-up by changing those 5 to `static INLINE` --
scsp.inc is included from only one TU (sound_glue.c) so file-
local scope is correct; left as a separate focused commit so
this one doesn't get the additional churn.

== Dead code now ==

M68K::M68K(const bool) and M68K::~M68K() in m68k.cpp have
no remaining callers.  Left in place for this commit so the
rename diff stays narrowly focused; their retirement is a
separate cleanup.
Trivial cleanup following fd5bf98 (sound_glue.cpp -> sound_glue.c).
The sole caller of `M68K::M68K(true)` was the file-scope
`static M68K SoundCPU(true);` declaration in sound_glue.cpp, which
got rewritten as
   static M68K SoundCPU;             /* zero-init at load */
   ...
   M68K_Construct(&SoundCPU, true);  /* in SoundGlue_Init() */
when that file became sound_glue.c.  M68K::~M68K() had an empty
body and was implicitly invoked only at program shutdown for the
matching file-scope object -- and even that's gone, since the
SoundCPU instance has no destructor to call now that M68K is a
pure-data struct.

Verified zero callers across the tree (grep for `M68K\s*\(` and
`new\s+M68K`, filtered to exclude pointer types and unrelated
M68K_* names): only the definitions themselves remained, plus
one doc-comment reference in sound_glue.c that is narrative
prose (not a call).

== Surgery ==

m68k.cpp:  delete the M68K::M68K(const bool) body (12 lines,
   member-initializer list + 4 body lines + Reset(true) tail)
   and the M68K::~M68K() empty-body destructor.  Body matched
   the M68K_Construct free-function counterpart in the same
   file 1:1 (it was 5cafd34's prep commit that introduced
   M68K_Construct alongside the ctor specifically so this
   retirement could happen).

m68k.h:   delete the two declarations:
              M68K(const bool rev_e = false) MDFN_COLD;
              ~M68K() MDFN_COLD;
   inside the `#ifdef __cplusplus`-gated method region of
   struct M68K.  Replace with a brief comment noting the
   retirement.

== Verification ==

G1 (compile):
   m68k.cpp:               0 errors, 0 warnings.
   sound_glue.c:           0 errors, 7 warnings (unchanged from
                           pre-commit -- the anonymous-enum-in-
                           struct advisories from m68k.h + scsp.h
                           that have nothing to do with this change).
   m68k_instr_split0.cpp:  0/0 (with -DM68K_SPLIT_SWITCH).
   m68k_instr_split1.cpp:  0/0 (with -DM68K_SPLIT_SWITCH).
   ss.c, sound.c, ss_state.c: 0/0.

G2 (byte equivalence on consumers):
   sound_glue.c .o: byte-identical to baseline.
   ss.c .o:         byte-identical.
   sound.c .o:      byte-identical.
   ss_state.c .o:   byte-identical.

   Confirms removing the unused ctor/dtor doesn't ripple into
   any consumer of m68k.h.  M68K-as-data-struct layout is
   unchanged (non-virtual methods don't contribute to struct
   storage in C++); existing C++ TUs see exactly the same
   struct + member set they did before, minus two function
   declarations they never referenced.

G3 (m68k.cpp size, expected change):
   baseline: 424,686 text bytes (size /tmp/m68k.base.o)
   new:      424,568 text bytes
   delta:    -118 bytes  (the ctor + empty-dtor symbols and
                          their PLT/PIC fixups).

G4 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE/_BE, no_deint_LE, no_chd_LE,
   no_tremor_LE, no_threading_LE, m68k_split_LE), 56-58 TUs each.

== State of M68K class surface after this commit ==

struct M68K now has, in region 1 of its `#ifdef __cplusplus`-gated
method regions:

   void Run(int32_t run_until_time);
   void Reset(bool powering_up) MDFN_COLD;
   void SetIPL(uint8_t ipl_new);
   void SetExtHalted(bool state);
   INLINE void SignalDTACKHalted(uint32_t addr)    { ... }
   INLINE void SignalAddressError(uint32_t addr, uint8_t type) { ... }
   void StateAction(StateMem* sm, ...);

Still C++-only (the bodies use `this`/`XPending` member access).
Each is reachable from C through its M68K_<Name> free-function
wrapper in the post-struct extern "C" block.  Retiring these
remaining 7 methods entirely (move bodies to m68k.cpp as
free-function bodies that take M68K*) is a much bigger Phase-9d
that retires the C++-side of m68k.cpp entirely.  Out of scope
for this cleanup.
…" wrappers

Phase-9d-1: first focused step toward retiring m68k.cpp's C++
surface.  Two small class methods get folded into the M68K_*
extern "C" wrappers that were already 1-line forwarders to them.

M68K::SetIPL was an 8-line body, M68K::SetExtHalted a 4-line
body.  Both were called by exactly one place each -- the
extern "C" wrapper at the top of m68k.cpp:

   void M68K_SetIPL       (M68K* z, uint8_t i)     { z->SetIPL(i); }
   void M68K_SetExtHalted (M68K* z, bool s)        { z->SetExtHalted(s); }

Folding the bodies into the wrappers drops two out-of-line
member-function symbols that nothing else used, and removes the
two method declarations (and the wrapper-to-method forwarding)
from the `#ifdef __cplusplus` block of struct M68K in m68k.h.

== Diff shape ==

m68k.cpp:
   - the 1-line wrappers M68K_SetIPL and M68K_SetExtHalted, plus
     the corresponding M68K::SetIPL (10 lines) + M68K::SetExtHalted
     (6 lines) definitions further down the file.
   + the same wrappers, with the bodies folded in.  Class-scope
     enum references that worked bare inside method bodies
     (XPENDING_MASK_NMI, XPENDING_MASK_EXTHALTED) become
     M68K::XPENDING_MASK_<NAME> qualified spellings -- the
     anonymous enum still lives inside struct M68K and isn't
     reachable from a free function without the qualifier.
   + z->RecalcInt() in place of bare RecalcInt() (same reason).

m68k.h:
   - 2 method declarations from inside the `#ifdef __cplusplus`-
     gated region 1 of struct M68K.
   + a comment marking the Phase-9d-1 retirement.
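
The fold can be illustrated with a hedged sketch. The body below is a hypothetical reconstruction for a `DemoCPU` stand-in, not the actual M68K::SetExtHalted code, and the flag value is assumed. The point is only the shape: the member-function body moves into the free function, and every member access goes through the explicit `z` pointer:

```c
#include <stdbool.h>
#include <stdint.h>

enum { DEMO_XPENDING_MASK_EXTHALTED = 0x1000 };   /* hypothetical value */

typedef struct DemoCPU DemoCPU;
struct DemoCPU { uint32_t XPending; };

/* Before (C++ only, shown as a comment):
 *   void DemoCPU::SetExtHalted(bool state) { ...body using bare names... }
 *   void DemoCPU_SetExtHalted(DemoCPU* z, bool s) { z->SetExtHalted(s); }
 *
 * After: the body lives directly in the wrapper. */
void DemoCPU_SetExtHalted(DemoCPU* z, bool state)
{
  z->XPending &= ~(uint32_t)DEMO_XPENDING_MASK_EXTHALTED;
  if(state)
    z->XPending |= DEMO_XPENDING_MASK_EXTHALTED;
}
```

In the real m68k.cpp the enumerators still live inside struct M68K, which is why the commit needs the `M68K::` qualifier; the sketch sidesteps that by using a file-scope enum.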

== Verification ==

G1 (codegen, the strong test):  disassembly of the resulting
   M68K_SetIPL and M68K_SetExtHalted bodies, instruction by
   instruction:

      M68K_SetIPL:        25 instructions  ->  25 instructions (IDENTICAL)
      M68K_SetExtHalted:  10 instructions  ->  10 instructions (IDENTICAL)

   The compiler's inlining decision for the wrapper +
   class-method pair was already to inline; the new explicit
   "bodies-folded-into-wrapper" shape produces the same
   instruction stream.  Moving the body across the call boundary
   manually matches what gcc -O2 was doing implicitly.

G2 (size delta on m68k.cpp .o):
      baseline:  424,568 text bytes
      new:       424,416 text bytes
      delta:     -152 bytes

   Two unused-elsewhere out-of-line bodies dropped: the
   M68K::SetIPL and M68K::SetExtHalted standalone symbols.

G3 (compile):
      m68k.cpp:               0 errors, 0 warnings.
      sound_glue.c:           0/7  (the 7 unchanged anonymous-enum
                                    advisories from m68k.h + scsp.h).
      m68k_instr_split0.cpp:  0/0 (with -DM68K_SPLIT_SWITCH).
      m68k_instr_split1.cpp:  0/0 (with -DM68K_SPLIT_SWITCH).

G4 (build matrix): tools/check_build_matrix.py green across all
   7 configs (default_LE/_BE, no_deint_LE, no_chd_LE, no_tremor_LE,
   no_threading_LE, m68k_split_LE), 56-58 TUs each.

== Phase-9d roadmap ==

This is the first commit of a multi-commit Phase-9d series moving
m68k.cpp / m68k_private.h / m68k_instr_split{0,1}.cpp toward C.
Next steps: similar small fold-the-body-into-the-wrapper moves
for StateAction, GetRegister, SetRegister, plus a generator
retirement that clears 2,550 lines of C++ build-tool from the
tree before the bigger m68k_private.h template work starts.
… longer used

Phase-9d-2: drop the two standalone code-generators from the
tree.  ~2,550 lines of build-time C++ that:

  1. were never compiled by the libretro core build (zero
     references in Makefile / Makefile.common -- they were
     run by hand, per the comments at the top:
        // g++ -std=gnu++14 -Wall -O2 -o gen gen.cpp && ./gen > m68k_instr.inc
     and emit their corresponding m68k_instr*.inc files);

  2. had drifted out of sync with the checked-in artifacts they
     were supposed to produce.  `g++ ... gen.cpp && ./gen >
     /tmp/gen_out.inc; diff /tmp/gen_out.inc
     mednafen/hw_cpu/m68k/m68k_instr.inc` shows 596 lines of
     non-context diff -- the checked-in m68k_instr.inc has
     `Bxx(0x00, ...)` (runtime first-arg, post Phase-8c
     detemplating) where gen.cpp still emits the older
     `Bxx<0x00>(...)` template form.  The .inc file is already
     the source of truth, and gen.cpp's output is stale;

  3. are C++ (`std::string`, `std::map`, `std::list`,
     iterators, RAII) when the Phase-9d goal is full-C for
     mednafen/hw_cpu/m68k/.  Keeping them means either
     converting them too (extra work for no runtime benefit)
     or having "the codebase is C except for these 2,550
     lines of build tooling that never runs anyway" as the
     end state;

  4. would only matter if someone wanted to *restructure*
     the M68K dispatch table -- and the 68000 opcode set is
     frozen.  Saturn's sound CPU isn't getting new
     instructions.  No live regeneration scenario exists.

Whatever value gen.cpp / gen_split.cpp had as documentation of
"how were the dispatch tables originally derived from the 68000
opcode space" is preserved in git history; `git log
-- mednafen/hw_cpu/m68k/gen.cpp` reaches the full source.

== State of mednafen/hw_cpu/m68k/ post-retirement ==

   .h:    m68k.h, m68k_private.h
   .cpp:  m68k.cpp, m68k_instr_split0.cpp, m68k_instr_split1.cpp
   .inc:  m68k_instr.inc, m68k_instr_split0.inc, m68k_instr_split1.inc

The three remaining .cpp files are runtime code (the M68K
interpreter / instruction dispatch).  Phase-9d's remaining
hand-conversion work targets these.  m68k_instr_split{0,1}.cpp
are 14- and 12-line stubs that just `#include` their matching
.inc inside `void M68K::RunSplit<N>(...)` body -- trivial to
convert once the M68K::RunSplit methods become free functions.

== Verification ==

G1 (no consumer breakage): the libretro core build doesn't
   reference these files anywhere.  `grep -rnE
   "\bgen(_split)?\.cpp\b|/gen\b" Makefile* mednafen/hw_cpu/m68k/Makefile*`
   returns empty.  Removal is a no-op for every active build
   target.

G2 (build matrix): tools/check_build_matrix.py green across all
   7 configs.  TU count drops from 56-58 to 54-56 per config --
   exactly the 2 generators no longer being scanned.

G3 (output unchanged): the three .inc files
   (m68k_instr.inc, m68k_instr_split0.inc, m68k_instr_split1.inc)
   are checked in.  Removing their generators doesn't touch
   them.  The M68K interpreter's actual dispatch logic is
   completely preserved.

== What's left to make m68k fully C ==

Following commits will:
  - Continue inlining the remaining 4 class methods in
    m68k.cpp (StateAction, GetRegister, SetRegister, Reset)
    into their extern "C" wrappers, parallel to the Phase-9d-1
    treatment of SetIPL/SetExtHalted.
  - Detemplate HAM<T, AM> from m68k_private.h into a runtime-
    tagged `struct M68K_HAM` with free-function methods.
  - Detemplate the ~50 op bodies (ADD/SUB/CMP/MOVE/AND/OR/...)
    in groups of 5-10 per commit.
  - Rename m68k.cpp / m68k_instr_split{0,1}.cpp to .c and
    update Makefile.common's SOURCES_CXX -> SOURCES_C.

Realistic estimate: ~8 more focused commits to reach 100% C in
mednafen/hw_cpu/m68k/.
also factor jitdump writer into a shared translation unit