@pepijndevos

We currently load pickle databases into a Python packer, which is becoming a performance bottleneck and has been a thorn in the side of packagers who want simple static C++ binaries without a whole Python runtime and numpy dependencies.

This PR aims to get us there by first replacing pickle with msgpack and then porting the Python pack script to C++.

@pepijndevos pepijndevos force-pushed the msgspec-serialization branch 5 times, most recently from 137aef9 to 08c08ce Compare February 6, 2026 14:48
- Use msgspec for chipdb serialization (faster, typed)
- Update Makefile to produce .msgpack.gz instead of .pickle
- Update CI workflow for new file extension
- Update setup.py and gowin_pack/unpack to use new format

msgspec.msgpack.decode() with type=Device handles all conversions
(list->tuple, list->set) automatically based on dataclass annotations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@pepijndevos pepijndevos force-pushed the msgspec-serialization branch from 08c08ce to 9b6b102 Compare February 6, 2026 14:58
pepijndevos and others added 17 commits February 6, 2026 17:44
Update all grid iteration patterns to work with the new Device
structure where grid stores ttyp indices instead of Tile objects:
- grid[row][col] now returns ttyp (int) directly
- Use Device.__getitem__ accessor (db[row, col]) to get Tile objects
- Update enumerate(dev.grid) patterns to use range() with accessor

This fixes a memory explosion issue where msgspec duplicated tile
objects instead of preserving shared references like pickle did.
With this change, GW2A-18 uses 74 unique tiles instead of 3080.
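
For illustration, a minimal sketch of the indexed-grid layout described above. Only `grid` and the `db[row, col]` accessor come from this commit message; the `tiles` mapping, the simplified `Tile` fields and the `iter_tiles` helper are assumptions made for the example, not the actual chipdb classes.

```python
from dataclasses import dataclass, field


@dataclass
class Tile:
    # Hypothetical stand-in for the real Tile; just enough to show sharing.
    width: int
    height: int


@dataclass
class Device:
    # Assumed layout: one Tile per tile type, grid cells hold ttyp ints.
    tiles: dict = field(default_factory=dict)
    grid: list = field(default_factory=list)

    def __getitem__(self, pos):
        # db[row, col] resolves the ttyp stored in the grid to its Tile.
        row, col = pos
        return self.tiles[self.grid[row][col]]


def iter_tiles(db):
    """Yield (row, col, tile) using range() plus the accessor."""
    for row in range(len(db.grid)):
        for col in range(len(db.grid[row])):
            yield row, col, db[row, col]


db = Device(
    tiles={12: Tile(60, 24), 15: Tile(60, 28)},
    grid=[[12, 12], [15, 12]],
)
```

Because every cell with the same ttyp resolves to the same `Tile` object, serializing `tiles` once and `grid` as plain integers avoids the duplication that a format without shared references would otherwise introduce.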

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use pepijndevos/nextpnr with msgspec-serialization branch until
upstream nextpnr is updated to support the new chipdb format.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
sip_cst stores a list of tuples per package, not a single tuple.
Also iostd can be None for some pin types.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Initial port of gowin_pack.py to C++20 for faster bitstream generation.

Implemented:
- ChipDB loading via msgpack-c with custom adaptors
- Netlist parsing from Nextpnr JSON
- Basic routing (PIPs, clock routing)
- LUT/DFF/ALU placement
- Slice fuse application via shortval tables
- Attribute ID constants (attrids.hpp)
- Bitstream generation framework

Still WIP:
- IOB, PLL, BSRAM, DSP, IOLOGIC, OSC placements are placeholders
- Debugging std::bad_cast during chipdb loading

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Finish all stub BEL placement functions (IOB, PLL, BSRAM, DSP, IOLOGIC,
OSC, BUFS, RAM16SDP, CLKDIV, DCS, DQCE, DHCEN) with full attribute
handling ported from the Python gowin_pack.py.

Key changes:
- attrids.hpp: Add all missing attribute tables (PLL vals, BSRAM, DSP,
  DCS, DLLDLY, DLL, OSC, CFG, GSR, HCLK, IOLOGIC, ADC)
- place.cpp: Implement all BEL placement with proper fuse lookup via
  add_attr_val/get_shortval_fuses/get_longval_fuses pattern
- route.cpp: Add isolate_segments, HCLK pip routing, set_clock_fuses
- bitstream.cpp: Add PackArgs, GSR/CFG fuses via attrids, dual-mode
  pin fuses, proper checksum calculation, frame generation with CRC
- chipdb_types.hpp: Handle Python msgpack format where grid contains
  inline Tile objects (auto-deduplicates by ttyp)
- chipdb_adaptors.hpp: Add BIN->vector<uint8_t> adaptor for Python
  bytes/bytearray serialization
- chipdb.cpp: Clean up loading, remove verbose debug output
- main.cpp: Wire up PackArgs struct with all GPIO flags

Tested: loads chipdb, parses netlist, generates valid .fs bitstream.

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
- Make gowin_pack configurable via GOWIN_PACK variable in both Makefiles
- Add .SECONDARY: to prevent Make from deleting intermediate .json files
- Split CI example job to build C++ gowin_pack alongside yosys/nextpnr
- After running examples with Python gowin_pack, repack with C++ and
  compare md5 checksums to verify identical bitstream output
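
The checksum comparison can be sketched roughly as follows. This is not the CI script itself; the directory arguments, the `*.fs` glob and the function names are assumptions chosen for the example.

```python
import hashlib
from pathlib import Path


def md5_of(path):
    """Return the hex md5 digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def compare_bitstreams(py_dir, cpp_dir, pattern="*.fs"):
    """Compare each reference bitstream with its same-named counterpart.

    Returns a list of (filename, identical) pairs sorted by filename.
    A file missing from cpp_dir counts as a mismatch.
    """
    results = []
    for ref in sorted(Path(py_dir).glob(pattern)):
        other = Path(cpp_dir) / ref.name
        same = other.is_file() and md5_of(ref) == md5_of(other)
        results.append((ref.name, same))
    return results
```

Calling `compare_bitstreams("out_python", "out_cpp")` (both directory names hypothetical) gives one pass/fail entry per example, which is enough to fail the job when any pair differs.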

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
The root .gitignore has *.cpp which prevented these files from being
tracked. Add a local .gitignore with negation patterns and force-add
the bels stub files.

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
Add !gowin_pack_cpp/** negation so C++ sources aren't blocked by the
*.cpp, *.txt, *.json patterns (which exist for FPGA tooling outputs).

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
The C++ gowin_pack should not depend on the Python package for chipdb
files. Download the chipdb artifacts from the build jobs and set
APYCULA_CHIPDB_DIR to point to them. Also improve error message to
show searched paths when chipdb lookup fails.

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
The gowin_pack_cpp/.gitignore was interfering with the negation
pattern !gowin_pack_cpp/** in the root .gitignore.

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
The Python gowin_pack has special handling for XD-prefixed wires in
get_pips(): XD->F pips are skipped, and XD->LUT-input pips are converted
to synthetic pass-through LUT4 BELs. The C++ code was missing this logic,
causing "not found in tile" errors for all XD wire pips.

- Add XD wire detection in get_pips() with pass-through LUT BelInfo generation
- Return pip_bels from route_nets() and feed them into place_cells()
- Fix src/dest variable mapping to match Python's confusing naming convention

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
The !gowin_pack_cpp/** negation (needed to un-ignore *.cpp files)
was overriding the gowin_pack_cpp/build/ ignore. Reorder so the
build/ exclusion comes last and takes effect.

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
Three fixes to make C++ gowin_pack produce identical bitstreams to
the Python implementation (verified with blinky-tangnano on GW1N-1):

1. Fix IOB attribute prefix: nextpnr himbaechel uses '&' as the
   mode_attr_sep (e.g. "&IO_TYPE=LVCMOS33"), not '@'. This caused
   IOBs to miss user-specified IO_TYPE, falling back to LVCMOS18.

2. Add set_iob_default_fuses(): iterates all pins in db.io_cfg and
   sets default IO_TYPE + BANK_VCCIO fuses for every IOB pin. For
   used banks, the IO standard is determined from placed output IOBs.
   For unused banks, defaults to LVCMOS18 (or LVCMOS33 for GW5A).
   Also sets bank-level fuses via get_bank_fuses + get_longval_fuses
   on IOBA/IOBB tables. Matches Python gowin_pack.py lines 3758-3849.

3. Wire set_iob_default_fuses into generate_bitstream after place_cells.
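
To illustrate fix 1, here is a small sketch of separator-prefixed attribute parsing. It assumes the attributes arrive as a flat list of strings such as "&IO_TYPE=LVCMOS33"; the function names and that input shape are hypothetical, and only the '&' separator and the LVCMOS18 fallback come from the text above.

```python
DEFAULT_IO_TYPE = "LVCMOS18"  # fallback named in the commit message


def parse_iob_attrs(flags, sep="&"):
    """Collect NAME=VALUE pairs from flags that start with `sep`."""
    attrs = {}
    for flag in flags:
        if not flag.startswith(sep):
            continue  # not a mode attribute for this separator
        name, eq, value = flag[len(sep):].partition("=")
        if eq and name:
            attrs[name] = value
    return attrs


def io_type(flags, sep="&"):
    """Return the user-specified IO_TYPE, or the default if none is found."""
    return parse_iob_attrs(flags, sep).get("IO_TYPE", DEFAULT_IO_TYPE)
```

With the wrong separator, `io_type(["&IO_TYPE=LVCMOS33"], sep="@")` finds no attributes and silently returns LVCMOS18, which is the failure mode described in item 1.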

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
…pack

Multiple fixes to achieve byte-identical bitstream output for all 21 CI
test examples:

- Fix use-after-move bug: bank IO standard was determined from iob_info
  after std::move, causing empty mode strings and wrong BANK_VCCIO
- Move bank fuse generation to after IOB processing (Step 2c) so that
  accumulated in_bank_attrs includes DRIVE, OPENDRAIN, etc. from all IOBs
- Add DIFF mode entries to default_iostd map (TLVDS/ELVDS defaults)
- Always set IO_TYPE in user_attrs to match Python line 3360
- Skip LVDS DRIVE override: Python line 3662 uses a stale `mode`
  variable from the first pass, making it effectively dead code
- Implement HCLK interbank fuses (BRGMUX0/1_BRGOUT) for routing

All 21 tangnano examples now produce identical output to Python packer.

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu
- Implement RLE bitstream compression for -c flag, matching Python
  bslib.py compressLine() behavior:
  - Find unused byte values for 8-zero, 4-zero, 2-zero run keys
  - Update header lines 0x10 (compress enable) and 0x51 (keys)
  - Compress each frame line before CRC calculation
  - Fall back to stripping extra padding when no unused bytes available
- Sort CI checksum comparison by filename instead of hash for clearer
  diffs when examples fail
- Add pass/fail summary counts to CI output

All 21 GW1N-1 tangnano examples pass with both -c and without -c.
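
A rough sketch of the zero-run substitution idea, for readers unfamiliar with it. This is not the bslib.py or C++ code: the 8-byte chunking, the choice of the lowest unused nonzero byte values as keys, and the function names are all assumptions made for illustration.

```python
def find_keys(frames):
    """Pick three byte values that never occur in any frame line.

    Returns (key8, key4, key2), or None when fewer than three values
    are unused, in which case the caller needs a fallback.
    """
    used = set()
    for line in frames:
        used.update(line)
    # Skip 0 so a key can never be confused with the zeros it replaces.
    unused = [b for b in range(1, 256) if b not in used]
    if len(unused) < 3:
        return None
    return tuple(unused[:3])


def compress_line(line, key8, key4, key2):
    """Replace runs of 8, 4 and 2 zero bytes inside each 8-byte chunk."""
    out = bytearray()
    for i in range(0, len(line), 8):
        chunk = bytes(line[i:i + 8])
        chunk = chunk.replace(bytes(8), bytes([key8]))
        chunk = chunk.replace(bytes(4), bytes([key4]))
        chunk = chunk.replace(bytes(2), bytes([key2]))
        out += chunk
    return bytes(out)
```

The keys must be values absent from the data so that the substitution can be reversed unambiguously, which is why a fallback is required when every byte value is already in use.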

https://claude.ai/code/session_01RfXhHDnRZtk65ShHZSL9qu