ics-jku/RVVTS_RTL_AFC_Ara


RVVTS with Automated Failure Categorization (AFC): Results for PULP Ara RTL implementation

This repository contains the RVVTS test sets, reports, and categorized failure cases discussed in the paper "From Generation to Failure Categorization: An Open-Source automated RTL Verification Framework for RVV" by Manfred Schlägl, Jonas Reichhardt, and Daniel Große, presented at the ACM Great Lakes Symposium on VLSI (GLSVLSI) 2026.

This work is based on the RVVTS RISC-V Vector Test Framework. RVVTS provides the coverage-guided test generation, automated execution, failure minimization, single-instruction isolation, and automated failure categorization used to produce these results.

For each test case, RVVTS ran Spike as the reference and the verilated PULP Ara as the DUT, compared the resulting machine states (registers, CSRs, trap counts, etc.), minimized each detected deviation to the triggering instruction, and assigned each case to one of the predefined failure categories.
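The comparison step described above can be pictured as a group-by-group diff of two state snapshots. The following is a minimal sketch, not RVVTS code: the `MachineState` fields and the `diff_states` helper are hypothetical names chosen for illustration, and the real framework compares more state than shown here.

```python
from dataclasses import dataclass, field


@dataclass
class MachineState:
    """Hypothetical snapshot of architectural state after a test case."""
    xregs: dict = field(default_factory=dict)  # integer registers
    vregs: dict = field(default_factory=dict)  # vector registers
    csrs: dict = field(default_factory=dict)   # e.g. vtype, vl, vstart
    trap_count: int = 0


def diff_states(ref: MachineState, dut: MachineState) -> set:
    """Return the names of the state groups in which REF and DUT differ."""
    differing = set()
    for group in ("xregs", "vregs", "csrs", "trap_count"):
        if getattr(ref, group) != getattr(dut, group):
            differing.add(group)
    return differing


# Illustrative values only: the DUT produces a wrong vector result.
ref = MachineState(vregs={"v1": 0x10}, csrs={"vl": 8})
dut = MachineState(vregs={"v1": 0x11}, csrs={"vl": 8})
print(diff_states(ref, dut))  # {'vregs'}
```

An empty result means the test case passed; a non-empty result is a deviation that would then be minimized and categorized.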

Test Setup

  • Reference model (REF): Spike / riscv-isa-sim, commit 1553a2a896.
  • DUT: PULP Ara, commit a6436df6ad.
  • Test framework: RVVTS, commit 2fe4c04d00 (tag RVVTSv2_AFC_GLSVLSI_2026).
  • Target: RV64 with RVV 1.0 and VLEN = 4096 bits.
  • Test sets: TestSets_RVV_RefSpike_RV64_VLEN4096_v1.
  • VS (Valid Sequences) contains valid, non-trapping sequences; IVS (Invalid+Valid Sequences) contains invalid/trap-triggering sequences for negative testing.
  • RVV test case split: 25 code fragments.

Test Sets pre-generated with RVVTS and applied on PULP Ara

The following table reproduces Table 2 from the paper. The VS/IVS entries link to the corresponding generated test sets. Note: The pre-generated test sets can also be used to test other DUTs with a similar configuration (e.g., VLEN = 4096).

| Test set | Million Instructions | Million RVV Instr. (% w.r.t. all Instr.) | Functional Coverage | Detected Fails | Minimized Cases (% w.r.t. detected) | Isolated Failing Instructions |
|---|---|---|---|---|---|---|
| Valid Sequences (VS) | 2.17 | 0.95 (43.78%) | 31,403 / 33,076 (94.94%) | 48,973 | 46,601 (95.16%) | 584 overall, 13 exclusive |
| Invalid+Valid Sequences (IVS) | 2.04 | 1.00 (49.02%) | 31,950 / 33,076 (96.60%) | 33,630 | 33,391 (99.29%) | 598 overall, 27 exclusive |
| Merged Sequences (MS = VS + IVS) | 4.21 | 1.95 (46.32%) | 31,951 / 33,076 (96.60%) | 82,603 | 79,992 (96.84%) | 611 overall |
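The merged row can be cross-checked from the VS and IVS rows with a few lines of arithmetic. This is only a sanity-check sketch; in particular, reading "exclusive" as "isolated in this test set only" is an assumption made here, not a definition taken from the paper.

```python
# Figures copied from the VS and IVS rows of the table above.
vs = {"detected": 48_973, "minimized": 46_601, "isolated": 584, "exclusive": 13}
ivs = {"detected": 33_630, "minimized": 33_391, "isolated": 598, "exclusive": 27}


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals as in the table."""
    return round(100.0 * part / whole, 2)


ms_detected = vs["detected"] + ivs["detected"]
ms_minimized = vs["minimized"] + ivs["minimized"]
# Assumption: "exclusive" means "isolated in this set only", so the union is
# one set's overall count plus the other set's exclusive count.
ms_isolated = vs["isolated"] + ivs["exclusive"]

print(ms_detected, ms_minimized, pct(ms_minimized, ms_detected), ms_isolated)
# 82603 79992 96.84 611
```

Functional coverage is not additive in the same way, because most coverage points are hit by both test sets.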

Deviation Categories on PULP Ara

The results are too large to host in a GitHub repository. They can be downloaded from an external source by running the download_extract_results.sh script at the top level of the repository (compressed download: 1.5 GiB; decompressed: 25 GiB).
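Given the sizes quoted above, it can be worth checking free disk space before starting the download. The helper below is a hypothetical convenience sketch and not part of the repository; it conservatively assumes that the archive and the extracted data coexist on disk, which the script may or may not require.

```python
import shutil

GIB = 1024 ** 3
# Assumption: archive (1.5 GiB) and extracted results (25 GiB) exist at the same time.
REQUIRED_GIB = 1.5 + 25


def has_enough_space(path: str = ".", required_gib: float = REQUIRED_GIB) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gib * GIB


if has_enough_space():
    print("Enough free space; run ./download_extract_results.sh")
else:
    print(f"Less than {REQUIRED_GIB} GiB free; free up space first")
```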

The following table repeats the deviation categories from the paper and adds a short description of the architectural symptom captured by each AFC rule. The descriptions summarize the intent of the rules; within each category, the failures are further grouped by isolated instruction in the result directories.

| ID | Failure Category | Category Description | VS Detected Fails (% w.r.t. detected VS) | IVS Detected Fails (% w.r.t. detected IVS) | MS Detected Fails (% w.r.t. detected MS) | MS Minimized Cases (% w.r.t. detected MS in Cat.) | MS Isolated Instructions | RVV Instruction Classes in MS (class: #cases / #instr) |
|---|---|---|---|---|---|---|---|---|
| 1 | VREG_ONLY | Only vector register contents differ between reference and DUT. This usually points to wrong vector instruction results or vector register handling. | 34,315 (70.07%) | 3,654 (10.87%) | 37,969 (45.97%) | 37,108 (97.73%) | 448 | integer: 10,405 / 132; mask: 10,290 / 13; fixed-point: 6,324 / 32; floating-point: 4,135 / 89; permutation: 2,478 / 15; reduction: 1,774 / 16; load: 1,702 / 151 |
| 2 | VTYPE_VILL_SET_ERROR | The reference sets the vill bit in vtype, but the DUT does not. The DUT therefore fails to mark an illegal vector configuration as illegal. | 90 (0.18%) | 19,858 (59.05%) | 19,948 (24.15%) | 19,948 (100.00%) | 1 | config: 19,948 / 1 |
| 3 | ARA_HANG | Ara does not complete the test case and the DUT machine state records an invalid last committed PC. This captures deadlocks or execution stalls rather than a normal architectural mismatch. | 8,384 (17.12%) | 1,606 (4.78%) | 9,990 (12.09%) | 8,521 (85.30%) | 291 | permutation: 3,650 / 10; store: 1,404 / 104; load: 1,349 / 134; fixed-point: 1,204 / 6; mask: 514 / 2; integer: 298 / 9; floating-point: 57 / 22; reduction: 45 / 4 |
| 4 | MSTATUS_EXT_DUT | The mstatus.fs/vs extension-state bits differ, with more bits set on the DUT than on the reference. This suggests that the DUT invalidly enables or dirties floating-point or vector extension state. | 6 (0.01%) | 4,303 (12.80%) | 4,309 (5.22%) | 4,297 (99.72%) | 101 | floating-point: 3,277 / 91; reduction: 589 / 6; permutation: 431 / 4 |
| 5 | EXC_INVALID_REJECT | The DUT reports more traps than the reference. The DUT rejected an instruction or configuration that the reference considers valid. | 1,174 (2.40%) | 1,242 (3.69%) | 2,416 (2.92%) | 2,415 (99.96%) | 65 | reduction: 1,043 / 16; mask: 825 / 10; integer: 511 / 27; floating-point: 13 / 8; permutation: 11 / 2; store: 7 / 1; load: 5 / 1 |
| 6 | EXC_INVALID_ACCEPT | The reference reports more traps than the DUT. The DUT accepted an instruction or configuration that the reference considers invalid. | 3 (0.01%) | 2,309 (6.87%) | 2,312 (2.80%) | 2,310 (99.91%) | 413 | floating-point: 750 / 91; load: 367 / 107; integer: 332 / 92; fixed-point: 240 / 32; permutation: 199 / 15; store: 164 / 63; reduction: 148 / 8; mask: 110 / 5 |
| 7 | DMEM_ONLY | Only the dedicated data-memory hash differs. The failure manifests as an unexpected data-memory update, typically from store-side behavior. | 2,159 (4.41%) | 139 (0.41%) | 2,298 (2.78%) | 2,076 (90.34%) | 124 | store: 2,076 / 124 |
| 8 | VCSR_ONLY | Only vector CSR state such as vxrm, vxsat, or vcsr differs. No other architectural state mismatch is present. | 1,442 (2.94%) | 272 (0.81%) | 1,714 (2.07%) | 1,714 (100.00%) | 2 | Zicsr config: 1,714 / 2 |
| 9 | IREG_ONLY | Only integer register contents differ. This can indicate a wrong scalar result or an unintended write to an integer register. | 819 (1.67%) | 147 (0.44%) | 966 (1.17%) | 925 (95.76%) | 3 | permutation: 543 / 1; mask: 382 / 2 |
| 10 | FREG_ONLY | Only floating-point register contents differ. This captures failures whose visible symptom is limited to floating-point register values. | 536 (1.09%) | 26 (0.08%) | 562 (0.68%) | 561 (99.82%) | 1 | permutation: 561 / 1 |
| 11 | VTYPE_INVALID_ACCEPT | vtype differs because the reference marks a non-zero illegal encoding with vill, while the DUT does not. This is an invalid vector-type acceptance symptom. | 4 (0.01%) | 51 (0.15%) | 55 (0.07%) | 55 (100.00%) | 2 | config: 55 / 2 |
| 12 | FCSR_FFLAGS_ONLY | Only fcsr differs and the mismatch is confined to the floating-point exception flags fflags. Other fcsr fields and architectural state match. | 28 (0.06%) | 5 (0.01%) | 33 (0.04%) | 33 (100.00%) | 5 | floating-point: 32 / 4; reduction: 1 / 1 |
| 13 | VREGFCSR_ONLY | Only vector registers and fcsr differ. This combines a vector result mismatch with floating-point status side effects, with no other state deviations. | 10 (0.02%) | 5 (0.01%) | 15 (0.02%) | 15 (100.00%) | 6 | floating-point: 14 / 5; reduction: 1 / 1 |
| 14 | VCSR | A vector CSR mismatch is present together with other architectural deviations. This separates mixed failures from pure VCSR_ONLY cases. | 3 (0.01%) | 6 (0.02%) | 9 (0.01%) | 9 (100.00%) | 7 | fixed-point: 9 / 7 |
| 15 | VSTART_WEXC | vstart differs while matching non-zero exception counts show that a trap occurred. This points to vector restart-state handling around exceptions. | 0 (0.00%) | 5 (0.01%) | 5 (0.01%) | 5 (100.00%) | 5 | load: 5 / 5 |
| 16 | VALREG_ONLY | All deviations are confined to architectural value registers across integer, floating-point, and vector registers. CSRs, memory, and PC-related state match. | 0 (0.00%) | 2 (0.01%) | 2 (0.00%) | 0 (0.00%) | 0 | |
| 17 | PC_DIFF | The program counter or last committed PC differs, but no Ara hang was classified. This captures control-flow or commit-PC mismatches. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 18 | VLENB | The vlenb CSR differs. The DUT and reference therefore disagree on the vector-register byte length reported to software. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 19 | MSTATUS_EXT_REF | The mstatus.fs/vs extension-state bits differ, with more bits set on the reference than on the DUT. This suggests that the DUT failed to enable or dirty expected extension state. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 20 | MSTATUS_DIFF | The mstatus.fs/vs bits differ, but neither side simply has more enabled or dirty extension-state bits. This captures other mstatus.fs/vs pattern mismatches. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 21 | VTYPE_VILL_CLEAR_ERROR | The DUT sets vtype.vill while the reference clears it for an otherwise zero vector type. The DUT marks a legal vector configuration as illegal. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 22 | VTYPE_INVALID_REJECT | vtype differs because the DUT marks a non-zero vector-type encoding as illegal while the reference accepts it. This is the counterpart of invalid vector-type acceptance. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 23 | VTYPE_DIFF | vtype differs in a way not covered by the specific vill set, clear, accept, or reject categories. This captures remaining vector-type CSR mismatches. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 24 | VL_ONLY | Only the vl CSR differs. The DUT and reference agree on all other state but compute or retain a different active vector length. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 25 | VSTART_WOEXC | vstart differs while the matching exception count is zero. This points to an unexpected vstart update or reset during normal, non-trapping execution. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 26 | FCSR_ONLY | Only fcsr differs, but the mismatch is not limited to fflags. This includes rounding-mode or other floating-point CSR differences. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 27 | XMEM_ONLY | Only the dedicated instruction-memory hash differs. The visible symptom is a memory-side effect outside the dedicated data-memory region. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 28 | FREGFCSR_ONLY | Only floating-point registers and fcsr differ. This combines floating-point value deviations with floating-point status side effects. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| 29 | UNKNOWN | Fallback category for deviations that do not match any AFC rule. No Ara result in this data set falls into this category. | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 | |
| Total | | Summary over all listed categories. | 48,973 (100.00%) | 33,630 (100.00%) | 82,603 (100.00%) | 79,992 (96.84%) | 611 | |
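The category descriptions above read like an ordered rule cascade in which the first matching rule wins. The sketch below illustrates that idea for a handful of categories only. It is not the AFC rule set: the `categorize` function, its arguments, the state-group names, and the rule order are all assumptions made for illustration.

```python
# Hypothetical mapping from a single differing state group to its "_ONLY" category.
ONLY_RULES = {
    "vregs": "VREG_ONLY",
    "xregs": "IREG_ONLY",
    "fregs": "FREG_ONLY",
    "dmem": "DMEM_ONLY",
}


def categorize(diff: set, ref_traps: int, dut_traps: int, dut_hang: bool = False) -> str:
    """Assign one category to a deviation; the rule order here is an assumption."""
    if dut_hang:
        return "ARA_HANG"
    if dut_traps > ref_traps:
        return "EXC_INVALID_REJECT"  # DUT trapped where the reference did not
    if ref_traps > dut_traps:
        return "EXC_INVALID_ACCEPT"  # reference trapped where the DUT did not
    if len(diff) == 1:
        (group,) = diff
        if group in ONLY_RULES:
            return ONLY_RULES[group]
    return "UNKNOWN"  # fallback when no rule matches


print(categorize({"vregs"}, ref_traps=0, dut_traps=0))          # VREG_ONLY
print(categorize({"vregs", "csrs"}, ref_traps=1, dut_traps=0))  # EXC_INVALID_ACCEPT
```

Checking the specific rules before the fallback mirrors the role of UNKNOWN in the table, which is reached only when no other rule applies.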

Citation

If you use this material or find it useful, please cite our papers as follows:
