LinuxLists.cc - [PATCH AUTOSEL 6.6 01/22] scsi: ufs: core: Fix MCQ MAC configuration

2024-04-07 13:24:01

Subject: [PATCH AUTOSEL 6.6 01/22] scsi: ufs: core: Fix MCQ MAC configuration

From: Rohit Ner <[email protected]>

[ Upstream commit 767712f91de76abd22a45184e6e3440120b8bfce ]

As per JEDEC Standard No. 223E Section 5.9.2, the max # active commands
value programmed by the host sw in MCQConfig.MAC should be one less than
the actual value.

Signed-off-by: Rohit Ner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Peter Wang <[email protected]>
Reviewed-by: Can Guo <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/ufs/core/ufs-mcq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 0787456c2b892..c873fd8239427 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -94,7 +94,7 @@ void ufshcd_mcq_config_mac(struct ufs_hba *hba, u32 max_active_cmds)

val = ufshcd_readl(hba, REG_UFS_MCQ_CFG);
val &= ~MCQ_CFG_MAC_MASK;
- val |= FIELD_PREP(MCQ_CFG_MAC_MASK, max_active_cmds);
+ val |= FIELD_PREP(MCQ_CFG_MAC_MASK, max_active_cmds - 1);
ufshcd_writel(hba, val, REG_UFS_MCQ_CFG);
}
EXPORT_SYMBOL_GPL(ufshcd_mcq_config_mac);
--
2.43.0

2024-04-07 13:24:30

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 02/22] scsi: lpfc: Move NPIV's transport unregistration to after resource clean up

From: Justin Tee <[email protected]>

[ Upstream commit 4ddf01f2f1504fa08b766e8cfeec558e9f8eef6c ]

There are cases after NPIV deletion where the fabric switch still believes
the NPIV is logged into the fabric. This occurs when a vport is
unregistered before the Remove All DA_ID CT and LOGO ELS are sent to the
fabric.

Currently fc_remove_host(), which calls dev_loss_tmo for all D_IDs including
the fabric D_ID, removes the last ndlp reference and frees the ndlp rport
object. This sometimes causes the race condition where the final DA_ID and
LOGO are skipped from being sent to the fabric switch.

Fix by moving the fc_remove_host() and scsi_remove_host() calls after DA_ID
and LOGO are sent.

Signed-off-by: Justin Tee <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/lpfc/lpfc_vport.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_vport.c b/drivers/scsi/lpfc/lpfc_vport.c
index 6c7559cf1a4b6..9e0e9e02d2c47 100644
--- a/drivers/scsi/lpfc/lpfc_vport.c
+++ b/drivers/scsi/lpfc/lpfc_vport.c
@@ -683,10 +683,6 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
lpfc_free_sysfs_attr(vport);
lpfc_debugfs_terminate(vport);

- /* Remove FC host to break driver binding. */
- fc_remove_host(shost);
- scsi_remove_host(shost);
-
/* Send the DA_ID and Fabric LOGO to cleanup Nameserver entries. */
ndlp = lpfc_findnode_did(vport, Fabric_DID);
if (!ndlp)
@@ -730,6 +726,10 @@ lpfc_vport_delete(struct fc_vport *fc_vport)

skip_logo:

+ /* Remove FC host to break driver binding. */
+ fc_remove_host(shost);
+ scsi_remove_host(shost);
+
lpfc_cleanup(vport);

/* Remove scsi host now. The nodes are cleaned up. */
--
2.43.0

2024-04-07 13:24:34

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 03/22] scsi: lpfc: Remove IRQF_ONESHOT flag from threaded IRQ handling

From: Justin Tee <[email protected]>

[ Upstream commit 4623713e7ade46bfc63a3eade836f566ccbcd771 ]

IRQF_ONESHOT is found to mask HBA generated interrupts when thread_fn is
running. As a result, some EQEs/CQEs miss timely processing resulting in
SCSI layer attempts to abort commands due to io_timeout. Abort CQEs are
also not processed leading to the observations of hangs and spam of "0748
abort handler timed out waiting for aborting I/O" log messages.

Remove the IRQF_ONESHOT flag. The cmpxchg and xchg atomic operations on
lpfc_queue->queue_claimed already protect potential parallel access to an
EQ/CQ should the thread_fn get interrupted by the primary irq handler.

Signed-off-by: Justin Tee <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/lpfc/lpfc_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 2c336953e56ca..76c883cc66ed6 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -13051,7 +13051,7 @@ lpfc_sli4_enable_msix(struct lpfc_hba *phba)
rc = request_threaded_irq(eqhdl->irq,
&lpfc_sli4_hba_intr_handler,
&lpfc_sli4_hba_intr_handler_th,
- IRQF_ONESHOT, name, eqhdl);
+ 0, name, eqhdl);
if (rc) {
lpfc_printf_log(phba, KERN_WARNING, LOG_INIT,
"0486 MSI-X fast-path (%d) "
--
2.43.0

2024-04-07 13:24:51

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 04/22] scsi: lpfc: Update lpfc_ramp_down_queue_handler() logic

From: Justin Tee <[email protected]>

[ Upstream commit bb011631435c705cdeddca68d5c85fd40a4320f9 ]

Typically when an out of resource CQE status is detected, the
lpfc_ramp_down_queue_handler() logic is called to help reduce I/O load by
reducing an sdev's queue_depth.

However, the current lpfc_rampdown_queue_depth() logic does not help reduce
queue_depth. num_cmd_success is never updated and is always zero, which
means new_queue_depth will always be set to sdev->queue_depth. So,
new_queue_depth = sdev->queue_depth - new_queue_depth always sets
new_queue_depth to zero. And, scsi_change_queue_depth(sdev, 0) is
essentially a no-op.

Change the lpfc_ramp_down_queue_handler() logic to set new_queue_depth
equal to sdev->queue_depth subtracted from number of times num_rsrc_err was
incremented. If num_rsrc_err is >= sdev->queue_depth, then set
new_queue_depth equal to 1. Eventually, the frequency of Good_Status
frames will signal SCSI upper layer to auto increase the queue_depth back
to the driver default of 64 via scsi_handle_queue_ramp_up().

Signed-off-by: Justin Tee <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/lpfc/lpfc.h | 1 -
drivers/scsi/lpfc/lpfc_scsi.c | 13 ++++---------
2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 04d608ea91060..be016732ab2ea 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -1325,7 +1325,6 @@ struct lpfc_hba {
struct timer_list fabric_block_timer;
unsigned long bit_flags;
atomic_t num_rsrc_err;
- atomic_t num_cmd_success;
unsigned long last_rsrc_error_time;
unsigned long last_ramp_down_time;
#ifdef CONFIG_SCSI_LPFC_DEBUG_FS
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index bf879d81846b6..cf506556f3b0b 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -167,11 +167,10 @@ lpfc_ramp_down_queue_handler(struct lpfc_hba *phba)
struct Scsi_Host *shost;
struct scsi_device *sdev;
unsigned long new_queue_depth;
- unsigned long num_rsrc_err, num_cmd_success;
+ unsigned long num_rsrc_err;
int i;

num_rsrc_err = atomic_read(&phba->num_rsrc_err);
- num_cmd_success = atomic_read(&phba->num_cmd_success);

/*
* The error and success command counters are global per
@@ -186,20 +185,16 @@ lpfc_ramp_down_queue_handler(struct lpfc_hba *phba)
for (i = 0; i <= phba->max_vports && vports[i] != NULL; i++) {
shost = lpfc_shost_from_vport(vports[i]);
shost_for_each_device(sdev, shost) {
- new_queue_depth =
- sdev->queue_depth * num_rsrc_err /
- (num_rsrc_err + num_cmd_success);
- if (!new_queue_depth)
- new_queue_depth = sdev->queue_depth - 1;
+ if (num_rsrc_err >= sdev->queue_depth)
+ new_queue_depth = 1;
else
new_queue_depth = sdev->queue_depth -
- new_queue_depth;
+ num_rsrc_err;
scsi_change_queue_depth(sdev, new_queue_depth);
}
}
lpfc_destroy_vport_work_array(phba, vports);
atomic_set(&phba->num_rsrc_err, 0);
- atomic_set(&phba->num_cmd_success, 0);
}

/**
--
2.43.0

2024-04-07 13:25:09

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 05/22] scsi: lpfc: Replace hbalock with ndlp lock in lpfc_nvme_unregister_port()

From: Justin Tee <[email protected]>

[ Upstream commit d11272be497e48a8e8f980470eb6b70e92eed0ce ]

The ndlp object update in lpfc_nvme_unregister_port() should be protected
by the ndlp lock rather than hbalock.

Signed-off-by: Justin Tee <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/lpfc/lpfc_nvme.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 96e11a26c297e..a7479258e8055 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -2614,9 +2614,9 @@ lpfc_nvme_unregister_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
/* No concern about the role change on the nvme remoteport.
* The transport will update it.
*/
- spin_lock_irq(&vport->phba->hbalock);
+ spin_lock_irq(&ndlp->lock);
ndlp->fc4_xpt_flags |= NVME_XPT_UNREG_WAIT;
- spin_unlock_irq(&vport->phba->hbalock);
+ spin_unlock_irq(&ndlp->lock);

/* Don't let the host nvme transport keep sending keep-alives
* on this remoteport. Vport is unloading, no recovery. The
--
2.43.0

2024-04-07 13:25:50

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 07/22] scsi: lpfc: Use a dedicated lock for ras_fwlog state

From: Justin Tee <[email protected]>

[ Upstream commit f733a76ea0a9a84aee4ac41b81fad4d610ecbd8e ]

To reduce usage of and contention for hbalock, a separate dedicated lock is
used to protect ras_fwlog state.

Signed-off-by: Justin Tee <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/lpfc/lpfc.h | 1 +
drivers/scsi/lpfc/lpfc_attr.c | 4 ++--
drivers/scsi/lpfc/lpfc_bsg.c | 20 ++++++++++----------
drivers/scsi/lpfc/lpfc_debugfs.c | 12 ++++++------
drivers/scsi/lpfc/lpfc_init.c | 3 +++
drivers/scsi/lpfc/lpfc_sli.c | 20 ++++++++++----------
6 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index be016732ab2ea..9670cb2bf198e 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -1429,6 +1429,7 @@ struct lpfc_hba {
struct timer_list inactive_vmid_poll;

/* RAS Support */
+ spinlock_t ras_fwlog_lock; /* do not take while holding another lock */
struct lpfc_ras_fwlog ras_fwlog;

uint32_t iocb_cnt;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index b1c9107d34083..79b45ea5fdb5e 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -5864,9 +5864,9 @@ lpfc_ras_fwlog_buffsize_set(struct lpfc_hba *phba, uint val)
if (phba->cfg_ras_fwlog_func != PCI_FUNC(phba->pcidev->devfn))
return -EINVAL;

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
state = phba->ras_fwlog.state;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

if (state == REG_INPROGRESS) {
lpfc_printf_log(phba, KERN_ERR, LOG_SLI, "6147 RAS Logging "
diff --git a/drivers/scsi/lpfc/lpfc_bsg.c b/drivers/scsi/lpfc/lpfc_bsg.c
index 595dca92e8db5..593b1cf78979e 100644
--- a/drivers/scsi/lpfc/lpfc_bsg.c
+++ b/drivers/scsi/lpfc/lpfc_bsg.c
@@ -5070,12 +5070,12 @@ lpfc_bsg_get_ras_config(struct bsg_job *job)
bsg_reply->reply_data.vendor_reply.vendor_rsp;

/* Current logging state */
- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
if (ras_fwlog->state == ACTIVE)
ras_reply->state = LPFC_RASLOG_STATE_RUNNING;
else
ras_reply->state = LPFC_RASLOG_STATE_STOPPED;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

ras_reply->log_level = phba->ras_fwlog.fw_loglevel;
ras_reply->log_buff_sz = phba->cfg_ras_fwlog_buffsize;
@@ -5132,13 +5132,13 @@ lpfc_bsg_set_ras_config(struct bsg_job *job)

if (action == LPFC_RASACTION_STOP_LOGGING) {
/* Check if already disabled */
- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
if (ras_fwlog->state != ACTIVE) {
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);
rc = -ESRCH;
goto ras_job_error;
}
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

/* Disable logging */
lpfc_ras_stop_fwlog(phba);
@@ -5149,10 +5149,10 @@ lpfc_bsg_set_ras_config(struct bsg_job *job)
* FW-logging with new log-level. Return status
* "Logging already Running" to caller.
**/
- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
if (ras_fwlog->state != INACTIVE)
action_status = -EINPROGRESS;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

/* Enable logging */
rc = lpfc_sli4_ras_fwlog_init(phba, log_level,
@@ -5268,13 +5268,13 @@ lpfc_bsg_get_ras_fwlog(struct bsg_job *job)
goto ras_job_error;

/* Logging to be stopped before reading */
- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
if (ras_fwlog->state == ACTIVE) {
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);
rc = -EINPROGRESS;
goto ras_job_error;
}
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

if (job->request_len <
sizeof(struct fc_bsg_request) +
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index ea9b42225e629..20662b4f339eb 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -2196,12 +2196,12 @@ static int lpfc_debugfs_ras_log_data(struct lpfc_hba *phba,

memset(buffer, 0, size);

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
if (phba->ras_fwlog.state != ACTIVE) {
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);
return -EINVAL;
}
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

list_for_each_entry_safe(dmabuf, next,
&phba->ras_fwlog.fwlog_buff_list, list) {
@@ -2252,13 +2252,13 @@ lpfc_debugfs_ras_log_open(struct inode *inode, struct file *file)
int size;
int rc = -ENOMEM;

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
if (phba->ras_fwlog.state != ACTIVE) {
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);
rc = -EINVAL;
goto out;
}
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

if (check_mul_overflow(LPFC_RAS_MIN_BUFF_POST_SIZE,
phba->cfg_ras_fwlog_buffsize, &size))
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 76c883cc66ed6..416816d74ea1c 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -7698,6 +7698,9 @@ lpfc_setup_driver_resource_phase1(struct lpfc_hba *phba)
"NVME" : " "),
(phba->nvmet_support ? "NVMET" : " "));

+ /* ras_fwlog state */
+ spin_lock_init(&phba->ras_fwlog_lock);
+
/* Initialize the IO buffer list used by driver for SLI3 SCSI */
spin_lock_init(&phba->scsi_buf_list_get_lock);
INIT_LIST_HEAD(&phba->lpfc_scsi_buf_list_get);
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 9dab33686a931..5af669b930193 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -6844,9 +6844,9 @@ lpfc_ras_stop_fwlog(struct lpfc_hba *phba)
{
struct lpfc_ras_fwlog *ras_fwlog = &phba->ras_fwlog;

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
ras_fwlog->state = INACTIVE;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

/* Disable FW logging to host memory */
writel(LPFC_CTL_PDEV_CTL_DDL_RAS,
@@ -6889,9 +6889,9 @@ lpfc_sli4_ras_dma_free(struct lpfc_hba *phba)
ras_fwlog->lwpd.virt = NULL;
}

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
ras_fwlog->state = INACTIVE;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);
}

/**
@@ -6993,9 +6993,9 @@ lpfc_sli4_ras_mbox_cmpl(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
goto disable_ras;
}

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
ras_fwlog->state = ACTIVE;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);
mempool_free(pmb, phba->mbox_mem_pool);

return;
@@ -7027,9 +7027,9 @@ lpfc_sli4_ras_fwlog_init(struct lpfc_hba *phba,
uint32_t len = 0, fwlog_buffsize, fwlog_entry_count;
int rc = 0;

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
ras_fwlog->state = INACTIVE;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);

fwlog_buffsize = (LPFC_RAS_MIN_BUFF_POST_SIZE *
phba->cfg_ras_fwlog_buffsize);
@@ -7090,9 +7090,9 @@ lpfc_sli4_ras_fwlog_init(struct lpfc_hba *phba,
mbx_fwlog->u.request.lwpd.addr_lo = putPaddrLow(ras_fwlog->lwpd.phys);
mbx_fwlog->u.request.lwpd.addr_hi = putPaddrHigh(ras_fwlog->lwpd.phys);

- spin_lock_irq(&phba->hbalock);
+ spin_lock_irq(&phba->ras_fwlog_lock);
ras_fwlog->state = REG_INPROGRESS;
- spin_unlock_irq(&phba->hbalock);
+ spin_unlock_irq(&phba->ras_fwlog_lock);
mbox->vport = phba->pport;
mbox->mbox_cmpl = lpfc_sli4_ras_mbox_cmpl;

--
2.43.0

2024-04-07 13:25:57

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 08/22] gfs2: Fix invalid metadata access in punch_hole

From: Andrew Price <[email protected]>

[ Upstream commit c95346ac918c5badf51b9a7ac58a26d3bd5bb224 ]

In punch_hole(), when the offset lies in the final block for a given
height, there is no hole to punch, but the maximum size check fails to
detect that. Consequently, punch_hole() will try to punch a hole beyond
the end of the metadata and fail. Fix the maximum size check.

Signed-off-by: Andrew Price <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/gfs2/bmap.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index ef7017fb69512..2b578615607e4 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1715,7 +1715,8 @@ static int punch_hole(struct gfs2_inode *ip, u64 offset, u64 length)
struct buffer_head *dibh, *bh;
struct gfs2_holder rd_gh;
unsigned int bsize_shift = sdp->sd_sb.sb_bsize_shift;
- u64 lblock = (offset + (1 << bsize_shift) - 1) >> bsize_shift;
+ unsigned int bsize = 1 << bsize_shift;
+ u64 lblock = (offset + bsize - 1) >> bsize_shift;
__u16 start_list[GFS2_MAX_META_HEIGHT];
__u16 __end_list[GFS2_MAX_META_HEIGHT], *end_list = NULL;
unsigned int start_aligned, end_aligned;
@@ -1726,7 +1727,7 @@ static int punch_hole(struct gfs2_inode *ip, u64 offset, u64 length)
u64 prev_bnr = 0;
__be64 *start, *end;

- if (offset >= maxsize) {
+ if (offset + bsize - 1 >= maxsize) {
/*
* The starting point lies beyond the allocated metadata;
* there are no blocks to deallocate.
--
2.43.0

2024-04-07 13:28:10

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 14/22] net: mark racy access on sk->sk_rcvbuf

From: linke li <[email protected]>

[ Upstream commit c2deb2e971f5d9aca941ef13ee05566979e337a4 ]

sk->sk_rcvbuf in __sock_queue_rcv_skb() and __sk_receive_skb() can be
changed by other threads. Mark this as benign using READ_ONCE().

This patch is aimed at reducing the number of benign races reported by
KCSAN in order to focus future debugging effort on harmful races.

Signed-off-by: linke li <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/core/sock.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 383e30fe79f41..790e847d86ec3 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -485,7 +485,7 @@ int __sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
unsigned long flags;
struct sk_buff_head *list = &sk->sk_receive_queue;

- if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf) {
+ if (atomic_read(&sk->sk_rmem_alloc) >= READ_ONCE(sk->sk_rcvbuf)) {
atomic_inc(&sk->sk_drops);
trace_sock_rcvqueue_full(sk, skb);
return -ENOMEM;
@@ -555,7 +555,7 @@ int __sk_receive_skb(struct sock *sk, struct sk_buff *skb,

skb->dev = NULL;

- if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
+ if (sk_rcvqueues_full(sk, READ_ONCE(sk->sk_rcvbuf))) {
atomic_inc(&sk->sk_drops);
goto discard_and_relse;
}
--
2.43.0

2024-04-07 13:28:36

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 06/22] scsi: lpfc: Release hbalock before calling lpfc_worker_wake_up()

From: Justin Tee <[email protected]>

[ Upstream commit ded20192dff31c91cef2a04f7e20e60e9bb887d3 ]

lpfc_worker_wake_up() calls the lpfc_work_done() routine, which takes the
hbalock. Thus, lpfc_worker_wake_up() should not be called while holding the
hbalock to avoid potential deadlock.

Signed-off-by: Justin Tee <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/lpfc/lpfc_els.c | 20 ++++++++++----------
drivers/scsi/lpfc/lpfc_hbadisc.c | 5 ++---
drivers/scsi/lpfc/lpfc_sli.c | 14 +++++++-------
3 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 18b8325fd419e..44d3ada9fbbcb 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -4432,23 +4432,23 @@ lpfc_els_retry_delay(struct timer_list *t)
unsigned long flags;
struct lpfc_work_evt *evtp = &ndlp->els_retry_evt;

+ /* Hold a node reference for outstanding queued work */
+ if (!lpfc_nlp_get(ndlp))
+ return;
+
spin_lock_irqsave(&phba->hbalock, flags);
if (!list_empty(&evtp->evt_listp)) {
spin_unlock_irqrestore(&phba->hbalock, flags);
+ lpfc_nlp_put(ndlp);
return;
}

- /* We need to hold the node by incrementing the reference
- * count until the queued work is done
- */
- evtp->evt_arg1 = lpfc_nlp_get(ndlp);
- if (evtp->evt_arg1) {
- evtp->evt = LPFC_EVT_ELS_RETRY;
- list_add_tail(&evtp->evt_listp, &phba->work_list);
- lpfc_worker_wake_up(phba);
- }
+ evtp->evt_arg1 = ndlp;
+ evtp->evt = LPFC_EVT_ELS_RETRY;
+ list_add_tail(&evtp->evt_listp, &phba->work_list);
spin_unlock_irqrestore(&phba->hbalock, flags);
- return;
+
+ lpfc_worker_wake_up(phba);
}

/**
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index 5154eeaee0ec3..93703ab6ce037 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -257,7 +257,9 @@ lpfc_dev_loss_tmo_callbk(struct fc_rport *rport)
if (evtp->evt_arg1) {
evtp->evt = LPFC_EVT_DEV_LOSS;
list_add_tail(&evtp->evt_listp, &phba->work_list);
+ spin_unlock_irqrestore(&phba->hbalock, iflags);
lpfc_worker_wake_up(phba);
+ return;
}
spin_unlock_irqrestore(&phba->hbalock, iflags);
} else {
@@ -275,10 +277,7 @@ lpfc_dev_loss_tmo_callbk(struct fc_rport *rport)
lpfc_disc_state_machine(vport, ndlp, NULL,
NLP_EVT_DEVICE_RM);
}
-
}
-
- return;
}

/**
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 4dfadf254a727..9dab33686a931 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -1217,9 +1217,9 @@ lpfc_set_rrq_active(struct lpfc_hba *phba, struct lpfc_nodelist *ndlp,
empty = list_empty(&phba->active_rrq_list);
list_add_tail(&rrq->list, &phba->active_rrq_list);
phba->hba_flag |= HBA_RRQ_ACTIVE;
+ spin_unlock_irqrestore(&phba->hbalock, iflags);
if (empty)
lpfc_worker_wake_up(phba);
- spin_unlock_irqrestore(&phba->hbalock, iflags);
return 0;
out:
spin_unlock_irqrestore(&phba->hbalock, iflags);
@@ -11369,18 +11369,18 @@ lpfc_sli_post_recovery_event(struct lpfc_hba *phba,
unsigned long iflags;
struct lpfc_work_evt *evtp = &ndlp->recovery_evt;

+ /* Hold a node reference for outstanding queued work */
+ if (!lpfc_nlp_get(ndlp))
+ return;
+
spin_lock_irqsave(&phba->hbalock, iflags);
if (!list_empty(&evtp->evt_listp)) {
spin_unlock_irqrestore(&phba->hbalock, iflags);
+ lpfc_nlp_put(ndlp);
return;
}

- /* Incrementing the reference count until the queued work is done. */
- evtp->evt_arg1 = lpfc_nlp_get(ndlp);
- if (!evtp->evt_arg1) {
- spin_unlock_irqrestore(&phba->hbalock, iflags);
- return;
- }
+ evtp->evt_arg1 = ndlp;
evtp->evt = LPFC_EVT_RECOVER_PORT;
list_add_tail(&evtp->evt_listp, &phba->work_list);
spin_unlock_irqrestore(&phba->hbalock, iflags);
--
2.43.0

2024-04-07 13:28:43

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 16/22] scsi: bnx2fc: Remove spin_lock_bh while releasing resources after upload

From: Saurav Kashyap <[email protected]>

[ Upstream commit c214ed2a4dda35b308b0b28eed804d7ae66401f9 ]

The session resources are used by FW and driver when session is offloaded,
once session is uploaded these resources are not used. The lock is not
required as these fields won't be used any longer. The offload and upload
calls are sequential, hence lock is not required.

This will suppress following BUG_ON():

[ 449.843143] ------------[ cut here ]------------
[ 449.848302] kernel BUG at mm/vmalloc.c:2727!
[ 449.853072] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 449.858712] CPU: 5 PID: 1996 Comm: kworker/u24:2 Not tainted 5.14.0-118.el9.x86_64 #1
Rebooting.
[ 449.867454] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.3.4 11/08/2016
[ 449.876966] Workqueue: fc_rport_eq fc_rport_work [libfc]
[ 449.882910] RIP: 0010:vunmap+0x2e/0x30
[ 449.887098] Code: 00 65 8b 05 14 a2 f0 4a a9 00 ff ff 00 75 1b 55 48 89 fd e8 34 36 79 00 48 85 ed 74 0b 48 89 ef 31 f6 5d e9 14 fc ff ff 5d c3 <0f> 0b 0f 1f 44 00 00 41 57 41 56 49 89 ce 41 55 49 89 fd 41 54 41
[ 449.908054] RSP: 0018:ffffb83d878b3d68 EFLAGS: 00010206
[ 449.913887] RAX: 0000000080000201 RBX: ffff8f4355133550 RCX: 000000000d400005
[ 449.921843] RDX: 0000000000000001 RSI: 0000000000001000 RDI: ffffb83da53f5000
[ 449.929808] RBP: ffff8f4ac6675800 R08: ffffb83d878b3d30 R09: 00000000000efbdf
[ 449.937774] R10: 0000000000000003 R11: ffff8f434573e000 R12: 0000000000001000
[ 449.945736] R13: 0000000000001000 R14: ffffb83da53f5000 R15: ffff8f43d4ea3ae0
[ 449.953701] FS: 0000000000000000(0000) GS:ffff8f529fc80000(0000) knlGS:0000000000000000
[ 449.962732] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 449.969138] CR2: 00007f8cf993e150 CR3: 0000000efbe10003 CR4: 00000000003706e0
[ 449.977102] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 449.985065] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 449.993028] Call Trace:
[ 449.995756] __iommu_dma_free+0x96/0x100
[ 450.000139] bnx2fc_free_session_resc+0x67/0x240 [bnx2fc]
[ 450.006171] bnx2fc_upload_session+0xce/0x100 [bnx2fc]
[ 450.011910] bnx2fc_rport_event_handler+0x9f/0x240 [bnx2fc]
[ 450.018136] fc_rport_work+0x103/0x5b0 [libfc]
[ 450.023103] process_one_work+0x1e8/0x3c0
[ 450.027581] worker_thread+0x50/0x3b0
[ 450.031669] ? rescuer_thread+0x370/0x370
[ 450.036143] kthread+0x149/0x170
[ 450.039744] ? set_kthread_struct+0x40/0x40
[ 450.044411] ret_from_fork+0x22/0x30
[ 450.048404] Modules linked in: vfat msdos fat xfs nfs_layout_nfsv41_files rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver dm_service_time qedf qed crc8 bnx2fc libfcoe libfc scsi_transport_fc intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp dcdbas rapl intel_cstate intel_uncore mei_me pcspkr mei ipmi_ssif lpc_ich ipmi_si fuse zram ext4 mbcache jbd2 loop nfsv3 nfs_acl nfs lockd grace fscache netfs irdma ice sd_mod t10_pi sg ib_uverbs ib_core 8021q garp mrp stp llc mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt mxm_wmi fb_sys_fops cec crct10dif_pclmul ahci crc32_pclmul bnx2x drm ghash_clmulni_intel libahci rfkill i40e libata megaraid_sas mdio wmi sunrpc lrw dm_crypt dm_round_robin dm_multipath dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_zero dm_mod linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid6_pq libcrc32c crc32c_intel raid1 raid0 iscsi_ibft squashfs be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls
[ 450.048497] libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi edd ipmi_devintf ipmi_msghandler
[ 450.159753] ---[ end trace 712de2c57c64abc8 ]---

Reported-by: Guangwu Zhang <[email protected]>
Signed-off-by: Saurav Kashyap <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/bnx2fc/bnx2fc_tgt.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_tgt.c b/drivers/scsi/bnx2fc/bnx2fc_tgt.c
index 2c246e80c1c4d..d91659811eb3c 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_tgt.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_tgt.c
@@ -833,7 +833,6 @@ static void bnx2fc_free_session_resc(struct bnx2fc_hba *hba,

BNX2FC_TGT_DBG(tgt, "Freeing up session resources\n");

- spin_lock_bh(&tgt->cq_lock);
ctx_base_ptr = tgt->ctx_base;
tgt->ctx_base = NULL;

@@ -889,7 +888,6 @@ static void bnx2fc_free_session_resc(struct bnx2fc_hba *hba,
tgt->sq, tgt->sq_dma);
tgt->sq = NULL;
}
- spin_unlock_bh(&tgt->cq_lock);

if (ctx_base_ptr)
iounmap(ctx_base_ptr);
--
2.43.0

2024-04-07 13:29:32

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 19/22] drm/amdkfd: range check cp bad op exception interrupts

From: Jonathan Kim <[email protected]>

[ Upstream commit 0cac183b98d8a8c692c98e8dba37df15a9e9210d ]

Due to a CP interrupt bug, bad packet garbage exception codes are raised.
Do a range check so that the debugger and runtime do not receive garbage
codes.
Update the user api to guard exception code type checking as well.

Signed-off-by: Jonathan Kim <[email protected]>
Tested-by: Jesse Zhang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 3 ++-
.../gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 3 ++-
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 3 ++-
include/uapi/linux/kfd_ioctl.h | 17 ++++++++++++++---
4 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c
index a7697ec8188e0..f85ca6cb90f56 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c
@@ -336,7 +336,8 @@ static void event_interrupt_wq_v10(struct kfd_node *dev,
break;
}
kfd_signal_event_interrupt(pasid, context_id0 & 0x7fffff, 23);
- } else if (source_id == SOC15_INTSRC_CP_BAD_OPCODE) {
+ } else if (source_id == SOC15_INTSRC_CP_BAD_OPCODE &&
+ KFD_DBG_EC_TYPE_IS_PACKET(KFD_DEBUG_CP_BAD_OP_ECODE(context_id0))) {
kfd_set_dbg_ev_from_interrupt(dev, pasid,
KFD_DEBUG_DOORBELL_ID(context_id0),
KFD_EC_MASK(KFD_DEBUG_CP_BAD_OP_ECODE(context_id0)),
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
index 2a65792fd1162..3ca9c160da7c2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
@@ -325,7 +325,8 @@ static void event_interrupt_wq_v11(struct kfd_node *dev,
/* CP */
if (source_id == SOC15_INTSRC_CP_END_OF_PIPE)
kfd_signal_event_interrupt(pasid, context_id0, 32);
- else if (source_id == SOC15_INTSRC_CP_BAD_OPCODE)
+ else if (source_id == SOC15_INTSRC_CP_BAD_OPCODE &&
+ KFD_DBG_EC_TYPE_IS_PACKET(KFD_CTXID0_CP_BAD_OP_ECODE(context_id0)))
kfd_set_dbg_ev_from_interrupt(dev, pasid,
KFD_CTXID0_DOORBELL_ID(context_id0),
KFD_EC_MASK(KFD_CTXID0_CP_BAD_OP_ECODE(context_id0)),
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
index 27cdaea405017..8a6729939ae55 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
@@ -385,7 +385,8 @@ static void event_interrupt_wq_v9(struct kfd_node *dev,
break;
}
kfd_signal_event_interrupt(pasid, sq_int_data, 24);
- } else if (source_id == SOC15_INTSRC_CP_BAD_OPCODE) {
+ } else if (source_id == SOC15_INTSRC_CP_BAD_OPCODE &&
+ KFD_DBG_EC_TYPE_IS_PACKET(KFD_DEBUG_CP_BAD_OP_ECODE(context_id0))) {
kfd_set_dbg_ev_from_interrupt(dev, pasid,
KFD_DEBUG_DOORBELL_ID(context_id0),
KFD_EC_MASK(KFD_DEBUG_CP_BAD_OP_ECODE(context_id0)),
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index eeb2fdcbdcb70..cd924c959d732 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -909,14 +909,25 @@ enum kfd_dbg_trap_exception_code {
KFD_EC_MASK(EC_DEVICE_NEW))
#define KFD_EC_MASK_PROCESS (KFD_EC_MASK(EC_PROCESS_RUNTIME) | \
KFD_EC_MASK(EC_PROCESS_DEVICE_REMOVE))
+#define KFD_EC_MASK_PACKET (KFD_EC_MASK(EC_QUEUE_PACKET_DISPATCH_DIM_INVALID) | \
+ KFD_EC_MASK(EC_QUEUE_PACKET_DISPATCH_GROUP_SEGMENT_SIZE_INVALID) | \
+ KFD_EC_MASK(EC_QUEUE_PACKET_DISPATCH_CODE_INVALID) | \
+ KFD_EC_MASK(EC_QUEUE_PACKET_RESERVED) | \
+ KFD_EC_MASK(EC_QUEUE_PACKET_UNSUPPORTED) | \
+ KFD_EC_MASK(EC_QUEUE_PACKET_DISPATCH_WORK_GROUP_SIZE_INVALID) | \
+ KFD_EC_MASK(EC_QUEUE_PACKET_DISPATCH_REGISTER_INVALID) | \
+ KFD_EC_MASK(EC_QUEUE_PACKET_VENDOR_UNSUPPORTED))

/* Checks for exception code types for KFD search */
+#define KFD_DBG_EC_IS_VALID(ecode) (ecode > EC_NONE && ecode < EC_MAX)
#define KFD_DBG_EC_TYPE_IS_QUEUE(ecode) \
- (!!(KFD_EC_MASK(ecode) & KFD_EC_MASK_QUEUE))
+ (KFD_DBG_EC_IS_VALID(ecode) && !!(KFD_EC_MASK(ecode) & KFD_EC_MASK_QUEUE))
#define KFD_DBG_EC_TYPE_IS_DEVICE(ecode) \
- (!!(KFD_EC_MASK(ecode) & KFD_EC_MASK_DEVICE))
+ (KFD_DBG_EC_IS_VALID(ecode) && !!(KFD_EC_MASK(ecode) & KFD_EC_MASK_DEVICE))
#define KFD_DBG_EC_TYPE_IS_PROCESS(ecode) \
- (!!(KFD_EC_MASK(ecode) & KFD_EC_MASK_PROCESS))
+ (KFD_DBG_EC_IS_VALID(ecode) && !!(KFD_EC_MASK(ecode) & KFD_EC_MASK_PROCESS))
+#define KFD_DBG_EC_TYPE_IS_PACKET(ecode) \
+ (KFD_DBG_EC_IS_VALID(ecode) && !!(KFD_EC_MASK(ecode) & KFD_EC_MASK_PACKET))

/* Runtime enable states */
--
2.43.0

2024-04-07 13:29:51

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 20/22] bpf: Check bloom filter map value size

From: Andrei Matei <[email protected]>

[ Upstream commit a8d89feba7e54e691ca7c4efc2a6264fa83f3687 ]

This patch adds a missing check to bloom filter creating, rejecting
values above KMALLOC_MAX_SIZE. This brings the bloom map in line with
many other map types.

The lack of this protection can cause kernel crashes for value sizes
that overflow int's. Such a crash was caught by syzkaller. The next
patch adds more guard-rails at a lower level.

Signed-off-by: Andrei Matei <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/bpf/bloom_filter.c | 13 +++++++++++++
.../selftests/bpf/prog_tests/bloom_filter_map.c | 6 ++++++
2 files changed, 19 insertions(+)

diff --git a/kernel/bpf/bloom_filter.c b/kernel/bpf/bloom_filter.c
index addf3dd57b59b..35e1ddca74d21 100644
--- a/kernel/bpf/bloom_filter.c
+++ b/kernel/bpf/bloom_filter.c
@@ -80,6 +80,18 @@ static int bloom_map_get_next_key(struct bpf_map *map, void *key, void *next_key
return -EOPNOTSUPP;
}

+/* Called from syscall */
+static int bloom_map_alloc_check(union bpf_attr *attr)
+{
+ if (attr->value_size > KMALLOC_MAX_SIZE)
+ /* if value_size is bigger, the user space won't be able to
+ * access the elements.
+ */
+ return -E2BIG;
+
+ return 0;
+}
+
static struct bpf_map *bloom_map_alloc(union bpf_attr *attr)
{
u32 bitset_bytes, bitset_mask, nr_hash_funcs, nr_bits;
@@ -191,6 +203,7 @@ static u64 bloom_map_mem_usage(const struct bpf_map *map)
BTF_ID_LIST_SINGLE(bpf_bloom_map_btf_ids, struct, bpf_bloom_filter)
const struct bpf_map_ops bloom_filter_map_ops = {
.map_meta_equal = bpf_map_meta_equal,
+ .map_alloc_check = bloom_map_alloc_check,
.map_alloc = bloom_map_alloc,
.map_free = bloom_map_free,
.map_get_next_key = bloom_map_get_next_key,
diff --git a/tools/testing/selftests/bpf/prog_tests/bloom_filter_map.c b/tools/testing/selftests/bpf/prog_tests/bloom_filter_map.c
index d2d9e965eba59..f79815b7e951b 100644
--- a/tools/testing/selftests/bpf/prog_tests/bloom_filter_map.c
+++ b/tools/testing/selftests/bpf/prog_tests/bloom_filter_map.c
@@ -2,6 +2,7 @@
/* Copyright (c) 2021 Facebook */

#include <sys/syscall.h>
+#include <limits.h>
#include <test_progs.h>
#include "bloom_filter_map.skel.h"

@@ -21,6 +22,11 @@ static void test_fail_cases(void)
if (!ASSERT_LT(fd, 0, "bpf_map_create bloom filter invalid value size 0"))
close(fd);

+ /* Invalid value size: too big */
+ fd = bpf_map_create(BPF_MAP_TYPE_BLOOM_FILTER, NULL, 0, INT32_MAX, 100, NULL);
+ if (!ASSERT_LT(fd, 0, "bpf_map_create bloom filter invalid value too large"))
+ close(fd);
+
/* Invalid max entries size */
fd = bpf_map_create(BPF_MAP_TYPE_BLOOM_FILTER, NULL, 0, sizeof(value), 0, NULL);
if (!ASSERT_LT(fd, 0, "bpf_map_create bloom filter invalid max entries size"))
--
2.43.0

2024-04-07 13:30:29

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 18/22] drm/amdkfd: Check cgroup when returning DMABuf info

From: Mukul Joshi <[email protected]>

[ Upstream commit 9d7993a7ab9651afd5fb295a4992e511b2b727aa ]

Check cgroup permissions when returning DMA-buf info and
based on cgroup info return the GPU id of the GPU that have
access to the BO.

Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index c37f1fcd2165b..c7933d7d11b10 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1516,7 +1516,7 @@ static int kfd_ioctl_get_dmabuf_info(struct file *filep,

/* Find a KFD GPU device that supports the get_dmabuf_info query */
for (i = 0; kfd_topology_enum_kfd_devices(i, &dev) == 0; i++)
- if (dev)
+ if (dev && !kfd_devcgroup_check_permission(dev))
break;
if (!dev)
return -EINVAL;
@@ -1538,7 +1538,7 @@ static int kfd_ioctl_get_dmabuf_info(struct file *filep,
if (xcp_id >= 0)
args->gpu_id = dmabuf_adev->kfd.dev->nodes[xcp_id]->id;
else
- args->gpu_id = dmabuf_adev->kfd.dev->nodes[0]->id;
+ args->gpu_id = dev->id;
args->flags = flags;

/* Copy metadata buffer to user mode */
--
2.43.0

2024-04-07 13:30:52

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 21/22] selftests/ftrace: Fix event filter target_func selection

From: Mark Rutland <[email protected]>

[ Upstream commit 8ecab2e64572f1aecdfc5a8feae748abda6e3347 ]

The event filter function test has been failing in our internal test
farm:

| # not ok 33 event filter function - test event filtering on functions

Running the test in verbose mode indicates that this is because the test
erroneously determines that kmem_cache_free() is the most common caller
of kmem_cache_free():

# # + cut -d: -f3 trace
# # + sed s/call_site=([^+]*)+0x.*/1/
# # + sort
# # + uniq -c
# # + sort
# # + tail -n 1
# # + sed s/^[ 0-9]*//
# # + target_func=kmem_cache_free

.. and as kmem_cache_free() doesn't call itself, setting this as the
filter function for kmem_cache_free() results in no hits, and
consequently the test fails:

# # + grep kmem_cache_free trace
# # + grep kmem_cache_free
# # + wc -l
# # + hitcnt=0
# # + grep kmem_cache_free trace
# # + grep -v kmem_cache_free
# # + wc -l
# # + misscnt=0
# # + [ 0 -eq 0 ]
# # + exit_fail

This seems to be because the system in question has tasks with ':' in
their name (which a number of kernel worker threads have). These show up
in the trace, e.g.

test:.sh-1299 [004] ..... 2886.040608: kmem_cache_free: call_site=putname+0xa4/0xc8 ptr=000000000f4d22f4 name=names_cache

.. and so when we try to extact the call_site with:

cut -d: -f3 trace | sed 's/call_site=$[^+]*$+0x.*/\1/'

.. the 'cut' command will extrace the column containing
'kmem_cache_free' rather than the column containing 'call_site=...', and
the 'sed' command will leave this unchanged. Consequently, the test will
decide to use 'kmem_cache_free' as the filter function, resulting in the
failure seen above.

Fix this by matching the 'call_site=<func>' part specifically to extract
the function name.

Signed-off-by: Mark Rutland <[email protected]>
Reported-by: Aishwarya TCV <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../selftests/ftrace/test.d/filter/event-filter-function.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
index 2de7c61d1ae30..3f74c09c56b62 100644
--- a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
+++ b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
@@ -24,7 +24,7 @@ echo 0 > events/enable
echo "Get the most frequently calling function"
sample_events

-target_func=`cut -d: -f3 trace | sed 's/call_site=$[^+]*$+0x.*/\1/' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
+target_func=`cat trace | grep -o 'call_site=$[^+]*$' | sed 's/call_site=//' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
if [ -z "$target_func" ]; then
exit_fail
fi
--
2.43.0

2024-04-07 13:31:10

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 22/22] kbuild: Disable KCSAN for autogenerated *.mod.c intermediaries

From: "Borislav Petkov (AMD)" <[email protected]>

[ Upstream commit 54babdc0343fff2f32dfaafaaa9e42c4db278204 ]

When KCSAN and CONSTRUCTORS are enabled, one can trigger the

"Unpatched return thunk in use. This should not happen!"

catch-all warning.

Usually, when objtool runs on the .o objects, it does generate a section
return_sites which contains all offsets in the objects to the return
thunks of the functions present there. Those return thunks then get
patched at runtime by the alternatives.

KCSAN and CONSTRUCTORS add this to the object file's .text.startup
section:

-------------------
Disassembly of section .text.startup:

...

0000000000000010 <_sub_I_00099_0>:
10: f3 0f 1e fa endbr64
14: e8 00 00 00 00 call 19 <_sub_I_00099_0+0x9>
15: R_X86_64_PLT32 __tsan_init-0x4
19: e9 00 00 00 00 jmp 1e <__UNIQUE_ID___addressable_cryptd_alloc_aead349+0x6>
1a: R_X86_64_PLT32 __x86_return_thunk-0x4
-------------------

which, if it is built as a module goes through the intermediary stage of
creating a <module>.mod.c file which, when translated, receives a second
constructor:

-------------------
Disassembly of section .text.startup:

0000000000000010 <_sub_I_00099_0>:
10: f3 0f 1e fa endbr64
14: e8 00 00 00 00 call 19 <_sub_I_00099_0+0x9>
15: R_X86_64_PLT32 __tsan_init-0x4
19: e9 00 00 00 00 jmp 1e <_sub_I_00099_0+0xe>
1a: R_X86_64_PLT32 __x86_return_thunk-0x4

...

0000000000000030 <_sub_I_00099_0>:
30: f3 0f 1e fa endbr64
34: e8 00 00 00 00 call 39 <_sub_I_00099_0+0x9>
35: R_X86_64_PLT32 __tsan_init-0x4
39: e9 00 00 00 00 jmp 3e <__ksymtab_cryptd_alloc_ahash+0x2>
3a: R_X86_64_PLT32 __x86_return_thunk-0x4
-------------------

in the .ko file.

Objtool has run already so that second constructor's return thunk cannot
be added to the .return_sites section and thus the return thunk remains
unpatched and the warning rightfully fires.

Drop KCSAN flags from the mod.c generation stage as those constructors
do not contain data races one would be interested about.

Debugged together with David Kaplan <[email protected]> and Nikolay
Borisov <[email protected]>.

Reported-by: Paul Menzel <[email protected]>
Closes: https://lore.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Tested-by: Paul Menzel <[email protected]> # Dell XPS 13
Reviewed-by: Nikolay Borisov <[email protected]>
Reviewed-by: Marco Elver <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
scripts/Makefile.modfinal | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index b3a6aa8fbe8cb..1979913aff682 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -23,7 +23,7 @@ modname = $(notdir $(@:.mod.o=))
part-of-module = y

quiet_cmd_cc_o_c = CC [M] $@
- cmd_cc_o_c = $(CC) $(filter-out $(CC_FLAGS_CFI) $(CFLAGS_GCOV), $(c_flags)) -c -o $@ $<
+ cmd_cc_o_c = $(CC) $(filter-out $(CC_FLAGS_CFI) $(CFLAGS_GCOV) $(CFLAGS_KCSAN), $(c_flags)) -c -o $@ $<

%.mod.o: %.mod.c FORCE
$(call if_changed_dep,cc_o_c)
--
2.43.0

2024-04-07 13:32:04

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 15/22] scsi: mpi3mr: Avoid memcpy field-spanning write WARNING

From: Shin'ichiro Kawasaki <[email protected]>

[ Upstream commit 429846b4b6ce9853e0d803a2357bb2e55083adf0 ]

When the "storcli2 show" command is executed for eHBA-9600, mpi3mr driver
prints this WARNING message:

memcpy: detected field-spanning write (size 128) of single field "bsg_reply_buf->reply_buf" at drivers/scsi/mpi3mr/mpi3mr_app.c:1658 (size 1)
WARNING: CPU: 0 PID: 12760 at drivers/scsi/mpi3mr/mpi3mr_app.c:1658 mpi3mr_bsg_request+0x6b12/0x7f10 [mpi3mr]

The cause of the WARN is 128 bytes memcpy to the 1 byte size array "__u8
replay_buf[1]" in the struct mpi3mr_bsg_in_reply_buf. The array is intended
to be a flexible length array, so the WARN is a false positive.

To suppress the WARN, remove the constant number '1' from the array
declaration and clarify that it has flexible length. Also, adjust the
memory allocation size to match the change.

Suggested-by: Sathya Prakash Veerichetty <[email protected]>
Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/mpi3mr/mpi3mr_app.c | 2 +-
include/uapi/scsi/scsi_bsg_mpi3mr.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpi3mr/mpi3mr_app.c b/drivers/scsi/mpi3mr/mpi3mr_app.c
index 9dacbb8570c93..aa5b535e6662b 100644
--- a/drivers/scsi/mpi3mr/mpi3mr_app.c
+++ b/drivers/scsi/mpi3mr/mpi3mr_app.c
@@ -1345,7 +1345,7 @@ static long mpi3mr_bsg_process_mpt_cmds(struct bsg_job *job, unsigned int *reply
if ((mpirep_offset != 0xFF) &&
drv_bufs[mpirep_offset].bsg_buf_len) {
drv_buf_iter = &drv_bufs[mpirep_offset];
- drv_buf_iter->kern_buf_len = (sizeof(*bsg_reply_buf) - 1 +
+ drv_buf_iter->kern_buf_len = (sizeof(*bsg_reply_buf) +
mrioc->reply_sz);
bsg_reply_buf = kzalloc(drv_buf_iter->kern_buf_len, GFP_KERNEL);

diff --git a/include/uapi/scsi/scsi_bsg_mpi3mr.h b/include/uapi/scsi/scsi_bsg_mpi3mr.h
index 907d345f04f93..353183e863e47 100644
--- a/include/uapi/scsi/scsi_bsg_mpi3mr.h
+++ b/include/uapi/scsi/scsi_bsg_mpi3mr.h
@@ -382,7 +382,7 @@ struct mpi3mr_bsg_in_reply_buf {
__u8 mpi_reply_type;
__u8 rsvd1;
__u16 rsvd2;
- __u8 reply_buf[1];
+ __u8 reply_buf[];
};

/**
--
2.43.0

2024-04-07 13:32:38

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.6 17/22] btrfs: return accurate error code on open failure in open_fs_devices()

From: Anand Jain <[email protected]>

[ Upstream commit 2f1aeab9fca1a5f583be1add175d1ee95c213cfa ]

When attempting to exclusive open a device which has no exclusive open
permission, such as a physical device associated with the flakey dm
device, the open operation will fail, resulting in a mount failure.

In this particular scenario, we erroneously return -EINVAL instead of the
correct error code provided by the bdev_open_by_path() function, which is
-EBUSY.

Fix this, by returning error code from the bdev_open_by_path() function.
With this correction, the mount error message will align with that of
ext4 and xfs.

Reviewed-by: Boris Burkov <[email protected]>
Signed-off-by: Anand Jain <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/btrfs/volumes.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 722a1dde75636..0bf08672caaa2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1224,23 +1224,30 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
struct btrfs_device *device;
struct btrfs_device *latest_dev = NULL;
struct btrfs_device *tmp_device;
+ int ret = 0;

list_for_each_entry_safe(device, tmp_device, &fs_devices->devices,
dev_list) {
- int ret;
+ int ret2;

- ret = btrfs_open_one_device(fs_devices, device, flags, holder);
- if (ret == 0 &&
+ ret2 = btrfs_open_one_device(fs_devices, device, flags, holder);
+ if (ret2 == 0 &&
(!latest_dev || device->generation > latest_dev->generation)) {
latest_dev = device;
- } else if (ret == -ENODATA) {
+ } else if (ret2 == -ENODATA) {
fs_devices->num_devices--;
list_del(&device->dev_list);
btrfs_free_device(device);
}
+ if (ret == 0 && ret2 != 0)
+ ret = ret2;
}
- if (fs_devices->open_devices == 0)
+
+ if (fs_devices->open_devices == 0) {
+ if (ret)
+ return ret;
return -EINVAL;
+ }

fs_devices->opened = 1;
fs_devices->latest_dev = latest_dev;
--
2.43.0