[rocky9_5] History Rebuild to kernel-5.14.0-503.38.1.el9_5 #225

Merged: 16 commits, Apr 22, 2025

PlaidCat
Collaborator

General Process:

Contains the following: http://download.rockylinux.org/pub/rocky/8.10/BaseOS/source/tree/Packages/k/

Checking Rebuild Commits for potentially missing commits:

commit 9e0a22d560937e6132749dc6290090901196064d (HEAD -> rocky9_5_rebuild, tag: resf_kernel-5.14.0-503.38.1.el9_5, rocky9_5_rebuild_kernel-5.14.0-503.38.1.el9_5)
Author: Jonathan Maple <[email protected]>
Date:   Mon Apr 21 13:48:08 2025 -0400

    Rebuild rocky9_5 with kernel-5.14.0-503.38.1.el9_5

    Rebuild_History BUILDABLE
    Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
    Number of commits in upstream range v5.14~1..kernel-mainline: 295501
    Number of commits in rpm: 17
    Number of commits matched with upstream: 15 (88.24%)
    Number of commits in upstream but not in rpm: 295486
    Number of commits NOT found in upstream: 2 (11.76%)

    Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.38.1.el9_5 for kernel-5.14.0-503.38.1.el9_5
    Clean Cherry Picks: 13 (86.67%)
    Empty Cherry Picks: 2 (13.33%)
    _______________________________

    Full Details Located here:
    ciq/ciq_backports/kernel-5.14.0-503.38.1.el9_5/rebuild.details.txt

    Includes:
    * git commit header above
    * Empty Commits with upstream SHA
    * RPM ChangeLog Entries that could not be matched

    Individual Empty Commit failures contained in the same containing directory.
    The git message for empty commits will have the path for the failed commit.
    File names are the first 8 characters of the upstream SHA
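
The percentages in the summary above follow directly from the counts. A minimal sketch in C, assuming that matched and unmatched commits are measured against the 17 rpm changelog commits and that cherry-pick results are measured against the 15 matched commits. The helper names `pct` and `not_in_rpm` are illustrative and are not part of the rebuild tooling:

```c
#include <assert.h>

/* Percentage of `part` within `whole`, as reported in the summary. */
static double pct(unsigned part, unsigned whole)
{
    assert(whole != 0);          /* avoid dividing by zero */
    return 100.0 * (double)part / (double)whole;
}

/* Upstream commits left over once the matched ones are subtracted. */
static unsigned not_in_rpm(unsigned upstream_total, unsigned matched)
{
    assert(matched <= upstream_total);
    return upstream_total - matched;
}
```

With the numbers above, `pct(15, 17)` is about 88.24, `pct(13, 15)` is about 86.67, and `not_in_rpm(295501, 15)` is 295486.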

Build

The build log was corrupted; the PR check builds perform the same build.

Boot

[maple@r9-sigcloud-builder kernel-src-tree-build]$ uname -r
5.14.0-rocky9_5_rebuild-9e0a22d56093

KSelfTest

$ grep '^ok ' kselftest.5.14.0-rocky9_5_rebuild-9e0a22d56093.log | wc -l
317

PlaidCat added 16 commits April 21, 2025 13:47
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Joshua Washington <[email protected]>
commit 1b9f756

TSO currently fails when the skb's gso_type field has more than one bit
set.

TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a
few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes
virtualization, such as QEMU, a real use-case.

The gso_type and gso_size fields as passed from userspace in
virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type
|= SKB_GSO_DODGY to force the packet to enter the software GSO stack
for verification.

This issue might similarly come up when the CWR bit is set in the TCP
header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit
to be set.

Fixes: a57e5de ("gve: DQO: Add TX path")
	Signed-off-by: Joshua Washington <[email protected]>
	Reviewed-by: Praveen Kaligineedi <[email protected]>
	Reviewed-by: Harshitha Ramamurthy <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
	Suggested-by: Eric Dumazet <[email protected]>
	Acked-by: Andrei Vagin <[email protected]>

v2 - Remove unnecessary comments, remove line break between fixes tag
and signoffs.

v3 - Add back unrelated empty line removal.

Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 1b9f756)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2842
cve CVE-2024-40937
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Ziwei Xiao <[email protected]>
commit 6f4d93b

gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it
is freed with dev_kfree_skb_any(). This can result in a subsequent call
to napi_get_frags returning a dangling pointer.

Fix this by clearing napi->skb before the skb is freed.

Fixes: 9b8dd5e ("gve: DQO: Add RX path")
	Cc: [email protected]
	Reported-by: Shailend Chand <[email protected]>
	Signed-off-by: Ziwei Xiao <[email protected]>
	Reviewed-by: Harshitha Ramamurthy <[email protected]>
	Reviewed-by: Shailend Chand <[email protected]>
	Reviewed-by: Praveen Kaligineedi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 6f4d93b)
	Signed-off-by: Jonathan Maple <[email protected]>
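
The ordering described in commit 6f4d93b above (drop the shared reference, then free) can be sketched in plain C. This is an illustration only: `pkt_buf`, `poll_ctx`, `rx_ctx` and `rx_free_buf` are hypothetical stand-ins for the kernel types and for gve_rx_free_skb, and `free()` stands in for dev_kfree_skb_any().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct pkt_buf { int len; };
struct poll_ctx { struct pkt_buf *skb; };      /* plays the role of napi */
struct rx_ctx { struct pkt_buf *skb_head; };   /* per-queue RX state */

/* Free the in-progress buffer, clearing the shared reference first so
 * that no later reader of napi->skb can see a dangling pointer. */
static void rx_free_buf(struct poll_ctx *napi, struct rx_ctx *rx)
{
    assert(napi && rx);
    if (!rx->skb_head)
        return;
    if (napi->skb == rx->skb_head)
        napi->skb = NULL;          /* clear before the free */
    free(rx->skb_head);
    rx->skb_head = NULL;
}
```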
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Joshua Washington <[email protected]>
commit 03b54ba

In gve_clean_xdp_done, the driver processes the TX completions based on
a 32-bit NIC counter and a 32-bit completion counter stored in the tx
queue.

Fix the for loop so that the counter wraparound is handled correctly.

Fixes: 75eaae1 ("gve: Add XDP DROP and TX support for GQI-QPL format")
	Signed-off-by: Joshua Washington <[email protected]>
	Signed-off-by: Praveen Kaligineedi <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 03b54ba)
	Signed-off-by: Jonathan Maple <[email protected]>
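
The wraparound problem described in commit 03b54ba above can be reproduced with two free-running 32-bit counters. The sketch below is not the driver's code; the function names are hypothetical, and it only contrasts a loop that compares absolute counter values with one that counts iterations.

```c
#include <assert.h>
#include <stdint.h>

/* Outstanding work between two free-running 32-bit counters.
 * Unsigned subtraction stays correct across a wrap. */
static uint32_t pending(uint32_t nic_counter, uint32_t done)
{
    return nic_counter - done;
}

/* Fragile pattern: comparing absolute values fails once done + to_do
 * wraps past UINT32_MAX, because the end value becomes smaller than done. */
static uint32_t clean_compare_abs(uint32_t done, uint32_t to_do)
{
    uint32_t clean_end = done + to_do;   /* may wrap */
    uint32_t cleaned = 0;

    for (; done < clean_end; done++)
        cleaned++;
    return cleaned;
}

/* Wraparound-safe pattern: iterate exactly to_do times and let the
 * counter wrap freely (unsigned overflow is well defined in C). */
static uint32_t clean_count_iters(uint32_t *done, uint32_t to_do)
{
    uint32_t cleaned = 0;

    assert(done);
    for (uint32_t i = 0; i < to_do; i++) {
        (*done)++;
        cleaned++;
    }
    return cleaned;
}
```

Starting from `done = 0xFFFFFFFE` with four completions to process, the first loop exits immediately without cleaning anything, while the second processes all four and leaves the counter at 2.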
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Bailey Forrest <[email protected]>
commit 36e3b94

The NIC requires each TSO segment to not span more than 10
descriptors. NIC further requires each descriptor to not exceed
16KB - 1 (GVE_TX_MAX_BUF_SIZE_DQO).

The descriptors for an skb are generated by
gve_tx_add_skb_no_copy_dqo() for DQO RDA queue format.
gve_tx_add_skb_no_copy_dqo() loops through each skb frag and
generates a descriptor for the entire frag if the frag size is
not greater than GVE_TX_MAX_BUF_SIZE_DQO. If the frag size is
greater than GVE_TX_MAX_BUF_SIZE_DQO, it is split into descriptor(s)
of size GVE_TX_MAX_BUF_SIZE_DQO and a descriptor is generated for
the remainder (frag size % GVE_TX_MAX_BUF_SIZE_DQO).

gve_can_send_tso() checks if the descriptors thus generated for an
skb would meet the requirement that each TSO-segment not span more
than 10 descriptors. However, the current code misses an edge case
when a TSO segment spans multiple descriptors within a large frag.
This change fixes the edge case.

gve_can_send_tso() relies on the assumption that max gso size (9728)
is less than GVE_TX_MAX_BUF_SIZE_DQO and therefore within an skb
fragment a TSO segment can never span more than 2 descriptors.

Fixes: a57e5de ("gve: DQO: Add TX path")
	Signed-off-by: Praveen Kaligineedi <[email protected]>
	Signed-off-by: Bailey Forrest <[email protected]>
	Reviewed-by: Jeroen de Borst <[email protected]>
	Cc: [email protected]
	Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 36e3b94)
	Signed-off-by: Jonathan Maple <[email protected]>
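
The descriptor arithmetic in commit 36e3b94 above can be checked with a small C sketch. It assumes only what the message states: a per-descriptor limit of 16KB - 1 and a maximum gso size of 9728. The macro and function names are hypothetical and the code is not taken from gve_can_send_tso().

```c
#include <assert.h>
#include <stdint.h>

/* 16KB - 1, the per-descriptor limit quoted in the commit message. */
#define MAX_BUF_SIZE ((16u * 1024u) - 1u)

/* Descriptors needed for one frag: full-size chunks plus one for any
 * remainder (frag_size % MAX_BUF_SIZE). */
static uint32_t descs_for_frag(uint32_t frag_size)
{
    assert(frag_size > 0);
    return (frag_size + MAX_BUF_SIZE - 1u) / MAX_BUF_SIZE;
}

/* Descriptors touched by a segment of seg_len bytes that starts at
 * byte offset off inside a single frag. */
static uint32_t descs_spanned(uint32_t off, uint32_t seg_len)
{
    assert(seg_len > 0);
    return (off + seg_len - 1u) / MAX_BUF_SIZE - off / MAX_BUF_SIZE + 1u;
}
```

Because 9728 is smaller than 16383, `descs_spanned()` never returns more than 2 for a segment of at most the maximum gso size, which is the assumption the commit message says gve_can_send_tso() relies on.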
jira LE-2842
cve CVE-2024-57932
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Joshua Washington <[email protected]>
commit ff7c2de

In GVE, dedicated XDP queues only exist when an XDP program is installed
and the interface is up. As such, the NDO XDP XMIT callback should
return early if either of these conditions are false.

In the case of no loaded XDP program, priv->num_xdp_queues=0 which can
cause a divide-by-zero error, and in the case of interface down,
num_xdp_queues remains untouched to persist XDP queue count for the next
interface up, but the TX pointer itself would be NULL.

The XDP xmit callback also needs to synchronize with a device
transitioning from open to close. This synchronization will happen via
the GVE_PRIV_FLAGS_NAPI_ENABLED bit along with a synchronize_net() call,
which waits for any RCU critical sections at call-time to complete.

Fixes: 39a7f4a ("gve: Add XDP REDIRECT support for GQI-QPL format")
	Cc: [email protected]
	Signed-off-by: Joshua Washington <[email protected]>
	Signed-off-by: Praveen Kaligineedi <[email protected]>
	Reviewed-by: Praveen Kaligineedi <[email protected]>
	Reviewed-by: Shailend Chand <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit ff7c2de)
	Signed-off-by: Jonathan Maple <[email protected]>
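
The early-return guard described in commit ff7c2de above can be sketched as follows. The struct, its field names, the modulo-based queue selection and the specific error codes are assumptions made for illustration; they are not copied from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified driver state; field names are illustrative. */
struct dev_state {
    bool napi_enabled;        /* stands in for GVE_PRIV_FLAGS_NAPI_ENABLED */
    bool xdp_prog_loaded;
    uint32_t num_xdp_queues;  /* 0 when no XDP program is installed */
};

/* Pick a TX queue for an XDP transmit, or return a negative errno.
 * Both checks run before the modulo, so a zero queue count can never
 * reach the division. */
static int pick_xdp_queue(const struct dev_state *st, uint32_t cpu)
{
    assert(st);
    if (!st->napi_enabled)
        return -ENETDOWN;     /* interface is down (assumed code) */
    if (!st->xdp_prog_loaded || st->num_xdp_queues == 0)
        return -ENXIO;        /* no dedicated XDP queues (assumed code) */
    return (int)(cpu % st->num_xdp_queues);
}
```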
jira LE-2842
cve CVE-2024-57933
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Joshua Washington <[email protected]>
commit 40338d7

This patch predicates the enabling and disabling of XSK pools on the
existence of queues. As it stands, if the interface is down, disabling
or enabling XSK pools would result in a crash, as the RX queue pointer
would be NULL. XSK pool registration will occur as part of the next
interface up.

Similarly, xsk_wakeup needs to be guarded against queues disappearing
while the function is executing, so a check against the
GVE_PRIV_FLAGS_NAPI_ENABLED flag is added to synchronize with the
disabling of the bit and the synchronize_net() in gve_turndown.

Fixes: fd8e403 ("gve: Add AF_XDP zero-copy support for GQI-QPL format")
	Cc: [email protected]
	Signed-off-by: Joshua Washington <[email protected]>
	Signed-off-by: Praveen Kaligineedi <[email protected]>
	Reviewed-by: Praveen Kaligineedi <[email protected]>
	Reviewed-by: Shailend Chand <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
	Reviewed-by: Larysa Zaremba <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit 40338d7)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Joshua Washington <[email protected]>
commit ba0925c
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.38.1.el9_5/ba0925c3.failed

When busy polling is enabled, xsk_sendmsg for AF_XDP zero copy marks
the NAPI ID corresponding to the memory pool allocated for the socket.
In GVE, this NAPI ID will never correspond to a NAPI ID of one of the
dedicated XDP TX queues registered with the umem because XDP TX is not
set up to share a NAPI with a corresponding RX queue.

This patch moves XSK TX descriptor processing from the TX NAPI to the RX
NAPI, and the gve_xsk_wakeup callback is updated to use the RX NAPI
instead of the TX NAPI, accordingly. The branch on if the wakeup is for
TX is removed, as the NAPI poll should be invoked whether the wakeup is
for TX or for RX.

Fixes: fd8e403 ("gve: Add AF_XDP zero-copy support for GQI-QPL format")
	Cc: [email protected]
	Signed-off-by: Praveen Kaligineedi <[email protected]>
	Signed-off-by: Joshua Washington <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit ba0925c)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/net/ethernet/google/gve/gve.h
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Joshua Washington <[email protected]>
commit fb3a9a1

Commit ba0925c ("gve: process XSK TX descriptors as part of RX NAPI")
moved XSK TX processing to be part of the RX NAPI. However, that commit
did not include triggering the RX NAPI in gve_xsk_wakeup. This is
necessary because the TX NAPI only processes TX completions, meaning
that a TX wakeup would not actually trigger XSK descriptor processing.
Also, the branch on XDP_WAKEUP_TX was supposed to have been removed, as
the NAPI should be scheduled whether the wakeup is for RX or TX.

Fixes: ba0925c ("gve: process XSK TX descriptors as part of RX NAPI")
	Cc: [email protected]
	Signed-off-by: Joshua Washington <[email protected]>
	Signed-off-by: Praveen Kaligineedi <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit fb3a9a1)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Maciej Fijalkowski <[email protected]>
commit 743bbd9

Introduce a new helper ice_put_rx_mbuf() that will go through gathered
frags from current frame and will call ice_put_rx_buf() on them. Current
logic that was supposed to simplify and optimize the driver where we go
through a batch of all buffers processed in current NAPI instance turned
out to be broken for jumbo frames and very heavy load that was coming
from both multi-thread iperf and nginx/wrk pair between server and
client. The delay introduced by approach that we are dropping is simply
too big and we need to take the decision regarding page
recycling/releasing as quick as we can.

While at it, address an error path of ice_add_xdp_frag() - we were
missing buffer putting from day 1 there.

As a nice side effect we get rid of annoying and repetitive three-liner:

	xdp->data = NULL;
	rx_ring->first_desc = ntc;
	rx_ring->nr_frags = 0;

by embedding it within introduced routine.

Fixes: 1dc1a7e ("ice: Centrallize Rx buffer recycling")
Reported-and-tested-by: Xu Du <[email protected]>
	Reviewed-by: Przemek Kitszel <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Co-developed-by: Jacob Keller <[email protected]>
	Signed-off-by: Jacob Keller <[email protected]>
	Signed-off-by: Maciej Fijalkowski <[email protected]>
	Tested-by: Chandan Kumar Rout <[email protected]> (A Contingent Worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 743bbd9)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Maciej Fijalkowski <[email protected]>
commit 11c4aa0

If we store the pgcnt on few fragments while being in the middle of
gathering the whole frame and we stumbled upon DD bit not being set, we
terminate the NAPI Rx processing loop and come back later on. Then on
next NAPI execution we work on previously stored pgcnt.

Imagine that second half of page was used actively by networking stack
and by the time we came back, stack is not busy with this page anymore
and decremented the refcnt. The page reuse algorithm in this case should
be good to reuse the page but given the old refcnt it will not do so and
attempt to release the page via page_frag_cache_drain() with
pagecnt_bias used as an arg. This in turn will result in negative refcnt
on struct page, which was initially observed by Xu Du.

Therefore, move the page count storage from ice_get_rx_buf() to a place
where we are sure that whole frame has been collected, but before
calling XDP program as it internally can also change the page count of
fragments belonging to xdp_buff.

Fixes: ac07533 ("ice: Store page count inside ice_rx_buf")
Reported-and-tested-by: Xu Du <[email protected]>
	Reviewed-by: Przemek Kitszel <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Co-developed-by: Jacob Keller <[email protected]>
	Signed-off-by: Jacob Keller <[email protected]>
	Signed-off-by: Maciej Fijalkowski <[email protected]>
	Tested-by: Chandan Kumar Rout <[email protected]> (A Contingent Worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 11c4aa0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Maciej Fijalkowski <[email protected]>
commit 468a195

Idea behind having ice_rx_buf::act was to simplify and speed up the Rx
data path by walking through buffers that were representing cleaned HW
Rx descriptors. Since it caused us a major headache recently and we
rolled back to old approach that 'puts' Rx buffers right after running
XDP prog/creating skb, this is useless now and should be removed.

Get rid of ice_rx_buf::act and related logic. We still need to take care
of a corner case where XDP program releases a particular fragment.

Make ice_run_xdp() to return its result and use it within
ice_put_rx_mbuf().

Fixes: 2fba7dc ("ice: Add support for XDP multi-buffer on Rx side")
	Reviewed-by: Przemek Kitszel <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
	Signed-off-by: Maciej Fijalkowski <[email protected]>
	Tested-by: Chandan Kumar Rout <[email protected]> (A Contingent Worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 468a195)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Anumula Murali Mohan Reddy <[email protected]>
commit 356983f

t4_set_vf_mac_acl() uses the PF to set the MAC address, but
t4vf_get_vf_mac_acl() uses the port number to get it. This leads to an
error when attempting to set the MAC address on VFs of PF2 and PF3.
This patch fixes the issue by using the port number to set the MAC address.

Fixes: e0cdac6 ("cxgb4vf: configure ports accessible by the VF")
	Signed-off-by: Anumula Murali Mohan Reddy <[email protected]>
	Signed-off-by: Potnuri Bharat Teja <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 356983f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Srinivas Pandruvada <[email protected]>
commit 7e1c3f5
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.38.1.el9_5/7e1c3f58.failed

Prevent intel_pstate from loading when OOB (Out Of Band) P-states mode is
enabled in Emerald Rapids.

The OOB identifying bits are same as for the prior generation CPUs
like Sapphire Rapids servers, so also add Emerald Rapids to the
intel_pstate_cpu_oob_ids[] list.

	Signed-off-by: Srinivas Pandruvada <[email protected]>
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 7e1c3f5)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/cpufreq/intel_pstate.c
jira LE-2842
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Mike Christie <[email protected]>
commit 8604f63

scsi_check_passthrough() is always called, but it doesn't check for if a
command completed successfully. As a result, if a command was successful and
the caller used SCMD_FAILURE_RESULT_ANY to indicate what failures it wanted
to retry, we will end up retrying the command. This will cause delays during
device discovery because of the command being sent multiple times. For some
USB devices it can also cause the wrong device size to be used.

This patch adds a check for if the command was successful. If it is we
return immediately instead of trying to match a failure.

Fixes: 994724e ("scsi: core: Allow passthrough to request midlayer retries")
	Reported-by: Kris Karas <[email protected]>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219652
	Signed-off-by: Mike Christie <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Reviewed-by: Bart Van Assche <[email protected]>
	Reviewed-by: John Garry <[email protected]>
	Signed-off-by: Martin K. Petersen <[email protected]>
(cherry picked from commit 8604f63)
	Signed-off-by: Jonathan Maple <[email protected]>
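
The success check described in commit 8604f63 above can be sketched in C. `FAILURE_ANY` is a hypothetical wildcard that stands in for SCMD_FAILURE_RESULT_ANY, and `failure_rule` and `should_retry` are illustrative names, not the SCSI midlayer's own structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RESULT_OK   0
#define FAILURE_ANY (-1)   /* hypothetical wildcard value */

/* One entry in the caller's list of failures it wants retried. */
struct failure_rule { int result; };

/* Decide whether a completed command should be retried. A successful
 * command returns immediately, so a wildcard rule cannot match it. */
static bool should_retry(int result, const struct failure_rule *rules, size_t n)
{
    assert(rules || n == 0);
    if (result == RESULT_OK)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (rules[i].result == FAILURE_ANY || rules[i].result == result)
            return true;
    }
    return false;
}
```

Without the first check, a wildcard rule would match a result of 0 as well, which is the retry-on-success behavior the commit message describes.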
jira LE-2842
cve CVE-2024-53150
Rebuild_History Non-Buildable kernel-5.14.0-503.38.1.el9_5
commit-author Takashi Iwai <[email protected]>
commit a3dd4d6

The current USB-audio driver code doesn't check bLength of each
descriptor at traversing for clock descriptors.  That is, when a
device provides a bogus descriptor with a shorter bLength, the driver
might hit out-of-bounds reads.

For addressing it, this patch adds sanity checks to the validator
functions for the clock descriptor traversal.  When the descriptor
length is shorter than expected, it's skipped in the loop.

For the clock source and clock multiplier descriptors, we can just
check bLength against the sizeof() of each descriptor type.
OTOH, the clock selector descriptor of UAC2 and UAC3 has an array
of bNrInPins elements and two more fields at its tail, hence those
have to be checked in addition to the sizeof() check.

	Reported-by: Benoît Sevens <[email protected]>
	Cc: <[email protected]>
Link: https://lore.kernel.org/[email protected]
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Takashi Iwai <[email protected]>
(cherry picked from commit a3dd4d6)
	Signed-off-by: Jonathan Maple <[email protected]>
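
The bLength check described in commit a3dd4d6 above can be sketched for the clock selector case. The layout below is a simplified assumption (a five-byte header, bNrInPins one-byte pin IDs, then two one-byte trailing fields); the real UAC2 and UAC3 descriptors differ in their tail sizes, and `clk_selector_valid` is a hypothetical name, not the driver's validator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical clock selector header. */
struct clk_selector_hdr {
    uint8_t bLength;
    uint8_t bDescriptorType;
    uint8_t bDescriptorSubtype;
    uint8_t bClockID;
    uint8_t bNrInPins;
};

#define TAIL_BYTES 2u   /* assumed size of the two trailing fields */

/* Return true only if the descriptor is long enough to hold the header,
 * every pin ID and the trailing fields, and fits in the buffer. */
static bool clk_selector_valid(const uint8_t *buf, size_t avail)
{
    assert(buf);
    if (avail < sizeof(struct clk_selector_hdr))
        return false;

    uint8_t len = buf[0];    /* bLength */
    uint8_t pins = buf[4];   /* bNrInPins */

    if (len < sizeof(struct clk_selector_hdr) || len > avail)
        return false;
    return len >= sizeof(struct clk_selector_hdr) + pins + TAIL_BYTES;
}
```

A caller walking the descriptor list would skip any descriptor for which this returns false, matching the "skipped in the loop" behavior the commit message describes.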
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 295501
Number of commits in rpm: 17
Number of commits matched with upstream: 15 (88.24%)
Number of commits in upstream but not in rpm: 295486
Number of commits NOT found in upstream: 2 (11.76%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.38.1.el9_5 for kernel-5.14.0-503.38.1.el9_5
Clean Cherry Picks: 13 (86.67%)
Empty Cherry Picks: 2 (13.33%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-503.38.1.el9_5/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
Collaborator

@bmastbergen bmastbergen left a comment

🥌

@thefossguy-ciq thefossguy-ciq left a comment

🚤

@PlaidCat PlaidCat merged commit 9e0a22d into rocky9_5 Apr 22, 2025
4 checks passed
@PlaidCat PlaidCat deleted the rocky9_5_rebuild branch April 23, 2025 14:10
4 participants