[rocky8_10] History rebuild to kernel-4.18.0-553.50.1.el8_10 #227

PlaidCat · 2025-04-22T23:12:10Z

General Process:

Download all unprocessed src.rpm files.
For each src.rpm:
- Find all commits in the changelog up to the last known tag ... in this case 4.18.0-553
- Replay the commits in reverse order (oldest in the changelog to newest) with git cherry-pick
- After the replay, replace the ENTIRE code in the branch with the output of rpmbuild -bp from the corresponding src.rpm.
- Tag the rebuild branch
Do a local build with https://github.com/ctrliq/kernel-src-tree/wiki/Kernel-Make,-KABI,-Install,-and-Reboot-script
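
The replay ordering above can be sketched in a few lines of Python. This is only an illustration of the "oldest first" rule, not the actual rebuild tooling: the function name `replay_commands` and its parameters are hypothetical, and the `-x`/`-s` flags are an assumption inferred from the "(cherry picked from commit ...)" and Signed-off-by trailers in the commits listed in this PR. The sketch only builds the command lists; it does not run git and does not model the conflict handling that produces the empty commits.

```python
from typing import List


def replay_commands(changelog_shas: List[str],
                    rebuild_branch: str,
                    base_tag: str) -> List[List[str]]:
    """Build the git commands for a history replay (illustrative only).

    changelog_shas is assumed to be ordered as in the rpm changelog,
    newest entry first, so it is reversed before cherry-picking.
    """
    # Start the rebuild branch from the last known tag.
    cmds = [["git", "checkout", "-b", rebuild_branch, base_tag]]
    # Replay oldest to newest.
    for sha in reversed(changelog_shas):
        # -x records the original commit id, -s adds a Signed-off-by line
        # (assumed flags, see the note above).
        cmds.append(["git", "cherry-pick", "-x", "-s", sha])
    return cmds


if __name__ == "__main__":
    # Placeholder identifiers, not real commits or tags.
    for cmd in replay_commands(["newest", "middle", "oldest"],
                               "example_rebuild_branch",
                               "example_base_tag"):
        print(" ".join(cmd))
```

A real run would also need to detect failed cherry-picks and record them, which is what the Empty-Commit entries in this PR reflect.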

Checking Rebuild Commits for potentially missing commits:

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 538898
Number of commits in rpm: 41
Number of commits matched with upstream: 31 (75.61%)
Number of commits in upstream but not in rpm: 538867
Number of commits NOT found in upstream: 10 (24.39%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.50.1.el8_10 for kernel-4.18.0-553.50.1.el8_10
Clean Cherry Picks: 13 (41.94%)
Empty Cherry Picks: 18 (58.06%)
_______________________________

__EMPTY COMMITS__________________________
ce895cf15ab60b93464ebbb515f2fc9e7a8cef9a gfs2: Remove misleading comments in gfs2_evict_inode
03ff3781bf6c149554d88e7b702a3abd5e400dc0 gfs2: gfs2_evict_inode clarification
86934198eefa10a71f35162b06c44c36d85b98ba gfs2: Clear flags when withdraw prevents xmote
9947a06d29c0a30da88cdc6376ca5fd87083e130 gfs2: do_xmote fixes
1e86044402c45b70a9b31beeaefb5cc732a7470c gfs2: Remove and replace gfs2_glock_queue_work
8bbfde0875590b71f012bd8b0c9cb988c9a873b9 gfs2: Add GLF_PENDING_REPLY flag
3774f53d7f0b30a996eab4a1264611489b48f14c gfs2: Replace GIF_DEFER_DELETE with GLF_DEFER_DELETE
0b93bac2271e11beb980fca037a34a9819c7dc37 gfs2: Remove LM_FLAG_PRIORITY flag
bb25b97562e52b2b5808b348db32568b1f5394b5 gfs2: remove dead code in add_to_queue
0360faca5d4dfc18d06644c7661cea1dc2b44dcf gfs2: Remove more dead code in add_to_queue
a431d49243a012738f132054b2303e0815663aac gfs2: Fix request cancelation bug
6cb3b1c2df87a8048ee1d54ec16d2e757af86c7f gfs2: Fix additional unlikely request cancelation race
9136cad723ec3e5ab5ca85a839f151abf1c9a106 gfs2: Prevent inode creation race (2)
b1c2cb86f4a7861480ad54bb9a58df3cbebf8e92 x86/xen: use new hypercall functions instead of hypercall page
7fa0da5373685e7ed249af3fa317ab1e1ba8b0a6 x86/xen: remove hypercall page
0e2bddf9e5f926ce32ed635012d0f8a0b54075d5 ice: add ice_adapter for shared data across PFs on the same NIC
d29a8134c78232213fb88f20d7ae865ec364e367 ice: avoid the PTP hardware semaphore in gettimex64 path
22118810fc7cc98f3afb38919348060ab67ddc5b ice: fold ice_ptp_read_time into ice_ptp_gettimex64

__CHANGES NOT IN UPSTREAM________________
Adding prod certs and changed cert date to 20210620
Adding Rocky secure boot certs
Fixing vmlinuz removal
Fixing UEFI CA path
Porting to 8.10, debranding and Rocky branding
Fixing pesign_key_name values
redhat: drop Y issues from changelog
md/md-bitmap: fix writing non bitmap changes local to RHEL
raid1: update discard granularity when adding new disk
rhel-8.10: gate kernel on kernel-qe tests results not cki ones
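
As a quick sanity check on the numbers in rebuild.details.txt, the counts and percentages can be recomputed from the raw figures. The sketch below is not part of the rebuild tooling; the helper `pct` and the variable names are made up for illustration, and the values are copied from the output above.

```python
def pct(part: int, whole: int) -> str:
    """Format part/whole as a percentage with two decimals."""
    return f"{100 * part / whole:.2f}%"


# Figures copied from rebuild.details.txt.
upstream_range = 538898   # commits in v4.18~1..kernel-mainline
in_rpm = 41               # commits in the rpm changelog
matched = 31              # matched with upstream
not_upstream = 10         # not found in upstream
clean = 13                # clean cherry-picks
empty = 18                # empty cherry-picks

# Every rpm commit is either matched or not found upstream.
assert matched + not_upstream == in_rpm
# Every matched commit was either a clean or an empty cherry-pick.
assert clean + empty == matched
# "In upstream but not in rpm" is the range minus the matched commits.
assert upstream_range - matched == 538867

print(pct(matched, in_rpm), pct(not_upstream, in_rpm))
print(pct(clean, matched), pct(empty, matched))
```

The 18 empty cherry-picks match the 18 entries under EMPTY COMMITS, and the 10 unmatched commits match the 10 entries under CHANGES NOT IN UPSTREAM.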

BUILD

/mnt/code/kernel-src-tree-build
no .config file found, moving on
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky8_10_rebuild-32fa0f457b22"
Making olddefconfig
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h

...skipping...
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 1916s
Making Modules
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL arch/x86/crypto/blowfish-x86_64.ko

  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-rocky8_10_rebuild-32fa0f457b22+
[TIMER]{MODULES}: 21s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-rocky8_10_rebuild-32fa0f457b22+ arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 24s
Checking kABI
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-rocky8_10_rebuild-32fa0f457b22+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 1916s
[TIMER]{MODULES}: 21s
[TIMER]{INSTALL}: 24s
[TIMER]{TOTAL} 1967s
Rebooting in 10 seconds

Boot

Linux r8-sigcloud-builder 4.18.0-rocky8_10_rebuild-32fa0f457b22+ #1 SMP Tue Apr 22 22:08:56 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Kselftest

$ grep '^ok ' 4.18.0-jmaple_sig-cloud-8_4.18.0-553.47.1.el8_10-226faf214012+.keselftest.log | wc -l
206

$ grep '^ok ' kselftest.resf_kernel-4.18.0-553.50.1.el8_10.4.18.0-rocky8_10_rebuild-32fa0f457b22+.log | wc -l
206
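
The two grep commands above count lines that start with "ok " in each kselftest log. A minimal Python equivalent is sketched below, in case the comparison needs to be scripted; the function names are hypothetical and the only assumption about the log format is the leading "ok " that the grep pattern already relies on.

```python
def count_passing(log_text: str) -> int:
    """Count lines beginning with 'ok ', like grep '^ok ' | wc -l."""
    return sum(1 for line in log_text.splitlines() if line.startswith("ok "))


def same_pass_count(baseline_log: str, rebuild_log: str) -> bool:
    """True when both logs report the same number of passing tests."""
    return count_passing(baseline_log) == count_passing(rebuild_log)


if __name__ == "__main__":
    # Tiny made-up logs, not real kselftest output.
    baseline = "ok 1 example_a\nnot ok 2 example_b\nok 3 example_c\n"
    rebuild = "ok 1 example_a\nok 2 example_b\nnot ok 3 example_c\n"
    print(count_passing(baseline), count_passing(rebuild))
    print(same_pass_count(baseline, rebuild))
```

As the made-up example shows, equal counts only indicate that the same number of tests passed, not that the same tests passed.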

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit ce895cf Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/ce895cf1.failed Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit ce895cf) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/super.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 03ff378 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/03ff3781.failed When function evict_should_delete() returns SHOULD_DEFER_EVICTION, gh is never initialized, but that isn't obvious; if it did initialize gh and then return SHOULD_DEFER_EVICTION, gfs2_evict_inode() would fail to release it. To clarify the code, change gfs2_evict_inode() to always check if gh needs to be released, no matter what evict_should_delete() returns. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 03ff378) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/super.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Bob Peterson <[email protected]> commit 865cc3e Before this patch, gfs2 would deadlock because of the following sequence during mount: mount gfs2_fill_super gfs2_make_fs_rw <--- Detects IO error with glock kthread_stop(sdp->sd_quotad_process); <--- Blocked waiting for quotad to finish logd Detects IO error and the need to withdraw calls gfs2_withdraw gfs2_make_fs_ro kthread_stop(sdp->sd_quotad_process); <--- Blocked waiting for quotad to finish gfs2_quotad gfs2_statfs_sync gfs2_glock_wait <---- Blocked waiting for statfs glock to be granted glock_work_func do_xmote <---Detects IO error, can't release glock: blocked on withdraw glops->go_inval glock_blocked_by_withdraw requeue glock work & exit <--- work requeued, blocked by withdraw This patch makes a special exception for the statfs system inode glock, which allows the statfs glock UNLOCK to proceed normally. That allows the quotad daemon to exit during the withdraw, which allows the logd daemon to exit during the withdraw, which allows the mount to exit. Signed-off-by: Bob Peterson <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 865cc3e) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Bob Peterson <[email protected]> commit 8693419 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/86934198.failed There are a couple places in function do_xmote where normal processing is circumvented due to withdraws in progress. However, since we bypass most of do_xmote() we bypass telling dlm to lock the dlm lock, which means dlm will never respond with a completion callback. Since the completion callback ordinarily clears GLF_LOCK, this patch changes function do_xmote to handle those situations more gracefully so the file system may be unmounted after withdraw. A very similar situation happens with the GLF_DEMOTE_IN_PROGRESS flag, which is cleared by function finish_xmote(). Since the withdraw causes us to skip the majority of do_xmote, it therefore also skips the call to finish_xmote() so the DEMOTE_IN_PROGRESS flag needs to be cleared manually. Signed-off-by: Bob Peterson <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 8693419) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 9947a06 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/9947a06d.failed Function do_xmote() is called with the glock spinlock held. Commit 8693419 added a 'goto skip_inval' statement at the beginning of the function to further below where the glock spinlock is expected not to be held anymore. Then it added code there that requires the glock spinlock to be held. This doesn't make sense; fix this up by dropping and retaking the spinlock where needed. In addition, when ->lm_lock() returned an error, do_xmote() didn't fail the locking operation, and simply left the glock hanging; fix that as well. (This is a much older error.) Fixes: 8693419 ("gfs2: Clear flags when withdraw prevents xmote") Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 9947a06) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 1e86044 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/1e860444.failed There are no more callers of gfs2_glock_queue_work() left, so remove that helper. With that, we can now rename __gfs2_glock_queue_work() back to gfs2_glock_queue_work() to get rid of some unnecessary clutter. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 1e86044) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 8bbfde0 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/8bbfde08.failed Introduce a new GLF_PENDING_REPLY flag to indicate that a reply from DLM is expected. Include that flag in glock dumps to show more clearly what's going on. (When the GLF_PENDING_REPLY flag is set, the GLF_LOCK flag will also be set but the GLF_LOCK flag alone isn't sufficient to tell that we are waiting for a DLM reply.) Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 8bbfde0) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 3774f53 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/3774f53d.failed Having this flag attached to the iopen glock instead of the inode is much simpler; it eliminates a protential weird race in gfs2_try_evict(). Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 3774f53) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/incore.h

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 0b93bac Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/0b93bac2.failed The last user of this flag was removed in commit b77b4a4 ("gfs2: Rework freeze / thaw logic"). Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 0b93bac) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Su Hui <[email protected]> commit bb25b97 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/bb25b975.failed clang static analyzer complains that value stored to 'gh' is never read. The code of this line is useless after commit 0b93bac ("gfs2: Remove LM_FLAG_PRIORITY flag"). Remove this code to save space. Signed-off-by: Su Hui <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit bb25b97) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 0360fac Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/0360faca.failed Remove some more dead code in add_to_queue() that commit 0b93bac ("gfs2: Remove LM_FLAG_PRIORITY flag") has rendered obsolete. This is a continuation of commit 3302764610057 ("gfs2: remove dead code in add_to_queue"); no functional change. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 0360fac) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit d838605 In run_queue(), check if the queue of pending requests is empty instead of blindly assuming that it won't be. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit d838605) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit a431d49 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/a431d492.failed In finish_xmote(), when a locking request is canceled, the corresponding holder is moved to the tail of the holders list instead of being dequeued immediately. When there is only a single holder, the canceled locking request is then immediately repeated. This makes no sense; it looks like another remnant of LM_FLAG_PRIORITY support. Instead, dequeue canceled holders and proceed with the next holder in finish_xmote(). We can then easily detect in gfs2_glock_dq() when a holder has been canceled. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit a431d49) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 6cb3b1c Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/6cb3b1c2.failed In gfs2_glock_dq(), we must drop the glock spin lock before calling ->lm_cancel, but this means that in the meantime, the operation we are trying to cancel could complete. If the operation completes unsuccessfully, another holder can end up at the head of the queue and another ->lm_lock operation can get started. In this case, we would end up canceling that second operation by accident. To prevent that, introduce a new GLF_CANCELING flag. Set that flag in gfs2_glock_dq() when trying to cancel an operation. When seeing that flag, finish_xmote() will then keep the GLF_LOCK flag set to prevent other glock operations from taking place. gfs2_glock_dq() then completes the cancelation attempt by clearing GLF_LOCK and GLF_CANCELING. In addition, add a missing GLF_DEMOTE_IN_PROGRESS check in gfs2_glock_dq() to make sure that we won't accidentally cancel a demote request. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 6cb3b1c) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c # fs/gfs2/incore.h # fs/gfs2/trace_gfs2.h

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 9136cad Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/9136cad7.failed In gfs2_try_evict(), we try grabbing the inode to evict, we try to evict it, and then we try grabbing it again to see if it still exists. There is no guarantee that we will end up with the same inode both times; the inode validity check that commit ffd1cf0 ("gfs2: Prevent inode creation race") added to the first grab is actually needed both times. (To avoid code duplication, add a grab_existing_inode() helper.) Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 9136cad) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/glock.c

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit e9e38ed In evict_should_delete(), when gfs2_upgrade_iopen_glock() fails, we detach the iopen glock from the inode without calling glock_clear_object(). This leads to a warning in glock_set_object() when the same inode is recreated and the glock is reused. Fix that by only detaching the iopen glock in gfs2_evict_inode(). In addition, remove the dequeue code from evict_should_delete(); we already perform a conditional dequeue in gfs2_evict_inode(). Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit e9e38ed) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 79fe790 In glock_set_object() and glock_clear_object(), there is no need to print the glock type and number when we dump the entire glock, anyway. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 79fe790) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 41a8e04 In gfs2_evict_inode(), in the unlikely case that we cannot defer deleting the inode, it is not safe to fall back to deleting the inode; the only valid choice we have is to skip the delete. In addition, in evict_should_delete(), if we cannot lock the inode glock exclusively, we are in a bad enough state that skipping the delete is likely a better choice than trying to recover from the failure later. Fixes: c5b7a24 ("gfs2: Only defer deletes when we have an iopen glock") Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 41a8e04) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 cve CVE-2024-53241 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Juergen Gross <[email protected]> commit b1c2cb8 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/b1c2cb86.failed Call the Xen hypervisor via the new xen_hypercall_func static-call instead of the hypercall page. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Co-developed-by: Peter Zijlstra <[email protected]> Co-developed-by: Josh Poimboeuf <[email protected]> (cherry picked from commit b1c2cb8) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # arch/x86/include/asm/xen/hypercall.h

jira LE-2815 cve CVE-2024-53241 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Juergen Gross <[email protected]> commit 7fa0da5 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/7fa0da53.failed The hypercall page is no longer needed. It can be removed, as from the Xen perspective it is optional. But, from Linux's perspective, it removes naked RET instructions that escape the speculative protections that Call Depth Tracking and/or Untrain Ret are trying to achieve. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Andrew Cooper <[email protected]> Reviewed-by: Jan Beulich <[email protected]> (cherry picked from commit 7fa0da5) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # arch/x86/include/asm/xen/hypercall.h # arch/x86/kernel/callthunks.c # arch/x86/kernel/vmlinux.lds.S # arch/x86/xen/enlighten.c # arch/x86/xen/enlighten_pvh.c # arch/x86/xen/xen-head.S

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Christoph Hellwig <[email protected]> commit 59cefee Set BITMAP_WRITE_ERROR directly in write_sb_page instead of propagating the error to the caller and setting it there. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit 59cefee) Signed-off-by: Jonathan Maple <[email protected]>

…_unmap jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Christoph Hellwig <[email protected]> commit 546ac0b Just a small tidyup to prepare for bigger changes. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit 546ac0b) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Christoph Hellwig <[email protected]> commit 9234851 Don't bother allocating an extra buffer in the I/O failure handler and instead use the printk built-in format to print the last 4 path name components. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit 9234851) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Ofir Gal <[email protected]> commit ab99a87 __write_sb_page() rounds up the io size to the optimal io size if it doesn't exceed the data offset, but it doesn't check the final size exceeds the bitmap length. For example: page count - 1 page size - 4K data offset - 1M optimal io size - 256K The final io size would be 256K (64 pages) but md_bitmap_storage_alloc() allocated 1 page, the IO would write 1 valid page and 63 pages that happens to be allocated afterwards. This leaks memory to the raid device superblock. This issue caused a data transfer failure in nvme-tcp. The network drivers checks the first page of an IO with sendpage_ok(), it returns true if the page isn't a slabpage and refcount >= 1. If the page !sendpage_ok() the network driver disables MSG_SPLICE_PAGES. As of now the network layer assumes all the pages of the IO are sendpage_ok() when MSG_SPLICE_PAGES is on. The bitmap pages aren't slab pages, the first page of the IO is sendpage_ok(), but the additional pages that happens to be allocated after the bitmap pages might be !sendpage_ok(). That cause skb_splice_from_iter() to stop the data transfer, in the case below it hangs 'mdadm --create'. The bug is reproducible, in order to reproduce we need nvme-over-tcp controllers with optimal IO size bigger than PAGE_SIZE. Creating a raid with bitmap over those devices reproduces the bug. In order to simulate large optimal IO size you can use dm-stripe with a single device. Script to reproduce the issue on top of brd devices using dm-stripe is attached below (will be added to blktest). I have added some logs to test the theory: ... md: created bitmap (1 pages) for device md127 __write_sb_page before md_super_write offset: 16, size: 262144. pfn: 0x53ee === __write_sb_page before md_super_write. 
logging pages === pfn: 0x53ee, slab: 0 <-- the only page that allocated for the bitmap pfn: 0x53ef, slab: 1 pfn: 0x53f0, slab: 0 pfn: 0x53f1, slab: 0 pfn: 0x53f2, slab: 0 pfn: 0x53f3, slab: 1 ... nvme_tcp: sendpage_ok - pfn: 0x53ee, len: 262144, offset: 0 skbuff: before sendpage_ok() - pfn: 0x53ee skbuff: before sendpage_ok() - pfn: 0x53ef WARNING at net/core/skbuff.c:6848 skb_splice_from_iter+0x142/0x450 skbuff: !sendpage_ok - pfn: 0x53ef. is_slab: 1, page_count: 1 ... Cc: [email protected] Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ofir Gal <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit ab99a87) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Gerd Bayer <[email protected]> commit 2bcae12 Remove the erroneous unmap in case no DMA mapping was established The multi-packet WQE transmit code attempts to obtain a DMA mapping for the skb. This could fail, e.g. under memory pressure, when the IOMMU driver just can't allocate more memory for page tables. While the code tries to handle this in the path below the err_unmap label it erroneously unmaps one entry from the sq's FIFO list of active mappings. Since the current map attempt failed this unmap is removing some random DMA mapping that might still be required. If the PCI function now presents that IOVA, the IOMMU may assumes a rogue DMA access and e.g. on s390 puts the PCI function in error state. The erroneous behavior was seen in a stress-test environment that created memory pressure. Fixes: 5af75c7 ("net/mlx5e: Enhanced TX MPWQE for SKBs") Signed-off-by: Gerd Bayer <[email protected]> Reviewed-by: Zhu Yanjun <[email protected]> Acked-by: Maxim Mikityanskiy <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> (cherry picked from commit 2bcae12) Signed-off-by: Jonathan Maple <[email protected]>

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Michal Schmidt <[email protected]> commit 0e2bddf Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/0e2bddf9.failed There is a need for synchronization between ice PFs on the same physical adapter. Add a "struct ice_adapter" for holding data shared between PFs of the same multifunction PCI device. The struct is refcounted - each ice_pf holds a reference to it. Its first use will be for PTP. I expect it will be useful also to improve the ugliness that is ice_prot_id_tbl. Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Michal Schmidt <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> (cherry picked from commit 0e2bddf) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # drivers/net/ethernet/intel/ice/Makefile # drivers/net/ethernet/intel/ice/ice.h

jira LE-2815 Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10 commit-author Michal Schmidt <[email protected]> commit d29a813 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/d29a8134.failed The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize operations that program the PTP timers. The operations involve issuing commands to the sideband queue. The E810 does not have a hardware sideband queue, so the admin queue is used. The admin queue is slow. I have observed delays in hundreds of milliseconds waiting for ice_sq_done. When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is held by a task performing one of the slow operations, ice_ptp_lock can easily time out. phc2sys gets -EBUSY and the kernel prints: ice 0000:XX:YY.0: PTP failed to get time These messages appear once every few seconds, causing log spam. The E810 datasheet recommends an algorithm for reading the upper 64 bits of the GLTSYN_TIME register. It matches what's implemented in ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not necessarily against the concurrent setting of the register (with GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why ice_ptp_gettimex64 also takes PFTSYN_SEM. The race with time setters can be prevented without relying on the PTP hardware semaphore. Using the "ice_adapter" from the previous patch, we can have a common spinlock for the PFs that share the clock hardware. It will protect the reading and writing to the GLTSYN_TIME register. The writing is performed indirectly, by the hardware, as a result of the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't sure if the ice_flush there is enough to make sure GLTSYN_TIME has been updated, but it works well in my testing. 
My test code can be seen here: https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock-10 It consists of: - kernel threads reading the time in a busy loop and looking at the deltas between consecutive values, reporting new maxima. - a shell script that sets the time repeatedly; - a bpftrace probe to produce a histogram of the measured deltas. Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing. Deltas in the [2G, 4G) range appear in the histograms. With the spinlock added, there is no tearing and the biggest delta I saw was in the range [1M, 2M), that is under 2 ms. Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Michal Schmidt <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> (cherry picked from commit d29a813) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # drivers/net/ethernet/intel/ice/ice_adapter.c # drivers/net/ethernet/intel/ice/ice_adapter.h

jira LE-2815
Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10
commit-author Michal Schmidt <[email protected]>
commit 2211881
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/22118810.failed

This is a cleanup. It is unnecessary to have this function just to call another function.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Michal Schmidt <[email protected]>
Reviewed-by: Sai Krishna <[email protected]>
Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
Reviewed-by: Kalesh AP <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 2211881)
Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/net/ethernet/intel/ice/ice_ptp.c

…ut payload

jira LE-2815
Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10
commit-author Long Li <[email protected]>
commit 87c4b5e

In StorVSC, payload->range.len is used to indicate if this SCSI command carries payload. This data is allocated as part of the private driver data by the upper layer and may get passed to lower driver uninitialized.

For example, the SCSI error handling mid layer may send TEST_UNIT_READY or REQUEST_SENSE while reusing the buffer from a failed command. The private data section may have stale data from the previous command.

If the SCSI command doesn't carry payload, the driver may use this value as is for communicating with host, resulting in possible corruption.

Fix this by always initializing this value.

Fixes: be0cf6c ("scsi: storvsc: Set the tablesize based on the information given by the host")
Cc: [email protected]
Tested-by: Roman Kisel <[email protected]>
Reviewed-by: Roman Kisel <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
Signed-off-by: Long Li <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
(cherry picked from commit 87c4b5e)
Signed-off-by: Jonathan Maple <[email protected]>
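
The stale-data problem in the StorVSC commit message above can be sketched in a few lines. This is an illustration, not the driver: `PrivateData`, `queue_command`, and the `always_init` switch are hypothetical names, and the real fix lives in C in the storvsc driver. The sketch reuses one private-data object across two commands and shows how a length left over from the first command leaks into a second command that carries no payload unless the field is initialized every time.

```python
from dataclasses import dataclass


@dataclass
class PrivateData:
    """Hypothetical stand-in for per-command private driver data that the
    upper layer may hand over again without clearing it."""
    range_len: int = 0


def queue_command(priv, payload_len, always_init):
    """Return the payload length that would be reported to the host."""
    if always_init:
        # The idea of the fix: start from a known value on every command.
        priv.range_len = 0
    if payload_len:
        priv.range_len = payload_len
    return priv.range_len


# Without initialization, a payload-less command inherits the old length.
reused = PrivateData()
first = queue_command(reused, 4096, always_init=False)
stale = queue_command(reused, 0, always_init=False)

# With initialization, the same sequence reports no payload.
fixed = PrivateData()
queue_command(fixed, 4096, always_init=True)
clean = queue_command(fixed, 0, always_init=True)
```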

jira LE-2815
cve CVE-2024-53150
Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10
commit-author Takashi Iwai <[email protected]>
commit a3dd4d6

The current USB-audio driver code doesn't check bLength of each descriptor at traversing for clock descriptors. That is, when a device provides a bogus descriptor with a shorter bLength, the driver might hit out-of-bounds reads.

For addressing it, this patch adds sanity checks to the validator functions for the clock descriptor traversal. When the descriptor length is shorter than expected, it's skipped in the loop.

For the clock source and clock multiplier descriptors, we can just check bLength against the sizeof() of each descriptor type. OTOH, the clock selector descriptor of UAC2 and UAC3 has an array of bNrInPins elements and two more fields at its tail, hence those have to be checked in addition to the sizeof() check.

Reported-by: Benoît Sevens <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/[email protected]
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
(cherry picked from commit a3dd4d6)
Signed-off-by: Jonathan Maple <[email protected]>
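
The length check described in the USB-audio commit message above can be sketched as follows. This is a simplified model rather than the kernel validator: the function names, the 5-byte fixed header, the two one-byte tail fields (a UAC2-style layout), and the subtype value are assumptions for illustration. The sketch walks a descriptor buffer and skips any clock selector whose bLength is too short to hold its bNrInPins array plus the tail.

```python
HEADER_LEN = 5   # assumed fixed part: bLength, bDescriptorType,
                 # bDescriptorSubtype, bClockID, bNrInPins
TAIL_LEN = 2     # assumed: two one-byte fields after the pin array
CLOCK_SELECTOR = 0x0B  # illustrative subtype value


def iter_descriptors(buf):
    """Yield each descriptor as a bytes slice, using its own bLength."""
    pos = 0
    while pos + 2 <= len(buf):
        length = buf[pos]
        if length < 2 or pos + length > len(buf):
            break  # malformed length: stop instead of reading past the end
        yield buf[pos:pos + length]
        pos += length


def valid_clock_selector(desc):
    """True only if the descriptor is long enough for header, pins, and tail."""
    if len(desc) < HEADER_LEN:
        return False
    nr_in_pins = desc[4]
    return len(desc) >= HEADER_LEN + nr_in_pins + TAIL_LEN


def find_clock_selectors(buf):
    """Return clock selector descriptors, skipping the ones that are too short."""
    return [d for d in iter_descriptors(buf)
            if len(d) > 2 and d[2] == CLOCK_SELECTOR and valid_clock_selector(d)]


# A bogus descriptor claiming 2 pins in 6 bytes, followed by a well-formed one.
bogus = bytes([6, 0x24, CLOCK_SELECTOR, 1, 2, 10])
good = bytes([9, 0x24, CLOCK_SELECTOR, 2, 2, 10, 11, 0, 0])
found = find_clock_selectors(bogus + good)
```

Skipping instead of aborting keeps traversal going, which matches the commit message's "it's skipped in the loop".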

…rect values in perf_quiet_option()

jira LE-2815
Rebuild_History Non-Buildable kernel-4.18.0-553.50.1.el8_10
commit-author Yang Jihong <[email protected]>
commit 188ac72

When perf uses quiet mode, perf_quiet_option() sets the 'debug_peo_args' variable to -1, and display_attr() incorrectly determines the value of 'debug_peo_args'. As a result, unexpected information is displayed.

Before:

  # perf record --quiet -- ls > /dev/null
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  ...

After:

  # perf record --quiet -- ls > /dev/null
  #

redirect_to_stderr is a similar problem.

Fixes: f78eaef ("perf tools: Allow to force redirect pr_debug to stderr.")
Fixes: ccd2674 ("perf tool: Provide an option to print perf_event_open args and return value")
Suggested-by: Adrian Hunter <[email protected]>
Reviewed-by: Adrian Hunter <[email protected]>
Signed-off-by: Yang Jihong <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Carsten Haitzler <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
(cherry picked from commit 188ac72)
Signed-off-by: Jonathan Maple <[email protected]>

Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 538898
Number of commits in rpm: 41
Number of commits matched with upstream: 31 (75.61%)
Number of commits in upstream but not in rpm: 538867
Number of commits NOT found in upstream: 10 (24.39%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.50.1.el8_10 for kernel-4.18.0-553.50.1.el8_10
Clean Cherry Picks: 13 (41.94%)
Empty Cherry Picks: 18 (58.06%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.50.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
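
The percentages and the `.failed` file naming in the rebuild summary above can be cross-checked with a short sketch. The helper names `pct` and `failed_ref_path` are hypothetical and are not part of the rebuild tooling; the counts, the directory, and the 8-character prefix rule are taken from the summary and the commit messages in this PR.

```python
def pct(part, whole):
    """Format a share of a total the way the summary prints it."""
    return f"{100 * part / whole:.2f}%"


def failed_ref_path(kernel_tag, upstream_sha):
    """Build the path of the saved failed cherry-pick for an upstream SHA,
    using the first 8 characters of the SHA as the file name."""
    return f"ciq/ciq_backports/{kernel_tag}/{upstream_sha[:8]}.failed"


TAG = "kernel-4.18.0-553.50.1.el8_10"

# 41 commits in the rpm: 31 matched upstream, 10 not found.
matched = pct(31, 41)
not_found = pct(10, 41)

# Of the 31 matched commits: 13 clean cherry-picks, 18 empty ones.
clean = pct(13, 31)
empty = pct(18, 31)

# SHA prefix as quoted in the first empty commit's message.
path = failed_ref_path(TAG, "d29a8134")
```

Clean and empty cherry-picks are shares of the 31 matched commits, not of the 41 rpm changelog entries, which is why they sum to 100%.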

thefossguy-ciq

🚤

bmastbergen

🥌

PlaidCat added 30 commits April 22, 2025 17:49

PlaidCat added 2 commits April 22, 2025 17:49

PlaidCat requested review from jdieter, juphoff, kerneltoast, bmastbergen and thefossguy-ciq April 22, 2025 23:12

PlaidCat self-assigned this Apr 22, 2025

thefossguy-ciq approved these changes Apr 23, 2025

bmastbergen approved these changes Apr 23, 2025

PlaidCat merged commit 32fa0f4 into rocky8_10 Apr 23, 2025
2 checks passed

PlaidCat deleted the rocky8_10_rebuild branch April 23, 2025 14:09
