Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp1344098ybc; Tue, 12 Nov 2019 19:21:11 -0800 (PST) X-Google-Smtp-Source: APXvYqxk73KDFhypGrNsVuiTrnzSPTEt3mpDNN6tf2q0M0GfACCIaSvjYy3UTglueo6JvNw5g1AN X-Received: by 2002:a05:6402:1299:: with SMTP id w25mr1313086edv.10.1573615271154; Tue, 12 Nov 2019 19:21:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573615271; cv=none; d=google.com; s=arc-20160816; b=Dt+NPPipAUCdMVpUROSZ2QDXXyPPcVJyvkJiJdzbKMnbt8CUfPcrqufStGMIv49q1D VslbIuwl0XxBeGJgJHgyFQ2+OHpWIQUX27CzDcFVb/jKls+hWfJqjbRpzLY/CULEYA4f VLMuyhjdIMdyAyz/xzFujqxqgVjhWpVrUZTMnnH5Rs1jwzfLWUYJHU3Qj8nj8l395tCL YsTyn9zwtOm1dGVJhmrx5qXDF37c2Xbr7IGOLFODNkE7shp1WoQdw1jG5PkqvSbcAmKo uTwdcCnD8OyhsYznwNenP4xUe+WX4i33fASN1Q7GN3aarg9hMHBLmJXvyNTfWFPWKunY 5jvw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:message-id :in-reply-to:date:references:organization:from:subject:cc:to :dkim-signature; bh=sZV6H98v0HPlnhfbrGjgaNH6H6SgCrfRRbDmLCVnuUM=; b=J/5Oy0RR68h5m67ofLhfMQyPiO4IPmuL+TTekN3bhSMmiCYxk9eNqiZPh6p0ep/j+7 eHTteOSjFELAztzz4NNgZkW0MvQ56bOXAoF0eXJaTpQJci7uEAmVWJn25VivC7XRDfOh xwAnk4v1Lc3Mqw2NYMb/OEPNyHxriX8Q1X8EGn14QfeM86fkIHPHX1+jEIb92va+X4tJ AymCKb1mYSO54rrjnWw/KFHHyGtNwAcTR5jyX+BueagtGrsWwOYErAG3sfmycLQdozZZ c0BncwRGZNto2uCgCJWaU9hb+b7x7igKr8Xp3Zqcp9ypsF8z0zumGUpF71dJTiqsoIS6 B58w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2019-08-05 header.b=bvh348oJ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t6si366054eji.80.2019.11.12.19.20.47; Tue, 12 Nov 2019 19:21:11 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2019-08-05 header.b=bvh348oJ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727275AbfKMDRL (ORCPT + 99 others); Tue, 12 Nov 2019 22:17:11 -0500 Received: from userp2130.oracle.com ([156.151.31.86]:38394 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727100AbfKMDRK (ORCPT ); Tue, 12 Nov 2019 22:17:10 -0500 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id xAD39ubg106447; Wed, 13 Nov 2019 03:17:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=to : cc : subject : from : references : date : in-reply-to : message-id : mime-version : content-type; s=corp-2019-08-05; bh=sZV6H98v0HPlnhfbrGjgaNH6H6SgCrfRRbDmLCVnuUM=; b=bvh348oJKNxw0e2166NUEHOHJW/hnhEIJsJqCroLg/mdp4hMYmFpGO+nr8RYHOhbaP5T f5TBNdxf435Rg0RkddIPcN1sHXPpB7l1U3EKikYycVpIKXH9IUDC4QyvnFMA5luRxWEF vAXMTrBGwJaK5zp83Q5wAjqBaG6VmPPyKRkAIUihbBXM+mGRHe0dVkZgL4Oy4nKwP/Xt RQKqNssYC3hGQ4ZZlhVKa1CoRwCB+gqgM+8Qx+NXX7HrLtM4rG1+w2pUjvljjjo2O3Rt zynT4UgGRy6txkvpLvVvRzLNsx/f2yPEqyxVyFcplYzTXTmLdsS/GZ41FJOQtW+QWJ5C GQ== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by userp2130.oracle.com with ESMTP id 2w5mvts866-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 13 Nov 2019 03:17:05 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id xAD39D5u179641; Wed, 13 Nov 2019 03:15:04 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userp3020.oracle.com with ESMTP id 2w7vbc1wa5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 13 Nov 2019 03:15:04 +0000 Received: from abhmp0012.oracle.com (abhmp0012.oracle.com [141.146.116.18]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id xAD3F3IB001063; Wed, 13 Nov 2019 03:15:03 GMT Received: from ca-mkp.ca.oracle.com (/10.159.214.123) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 12 Nov 2019 19:15:03 -0800 To: James Smart , Dick Kennedy Cc: Daniel Wagner , linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] scsi: lpfc: Move work items to a stack list From: "Martin K. Petersen" Organization: Oracle Corporation References: <20191105080855.16881-1-dwagner@suse.de> Date: Tue, 12 Nov 2019 22:15:00 -0500 In-Reply-To: <20191105080855.16881-1-dwagner@suse.de> (Daniel Wagner's message of "Tue, 5 Nov 2019 09:08:55 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1.92 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9439 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1910280000 definitions=main-1911130026 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9439 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1910280000 definitions=main-1911130026 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org James/Dick: Please review! > Move all work items to a list which lives on the stack while holding > the corresponding lock. > > With this we avoid a race between testing if the list is empty and > extracting an element of the list. Although, the list_remove_head() > macro tests will return an NULL pointer if the list is empty the two > functions lpfc_sli_handle_slow_ring_event_s4() and > lpfc_sli4_els_xri_abort_event_proc() do not test the return element if > it's NULL. > > Instead adding another test if the pointer is NULL just avoid this > access pattern by using the stack list. This also avoids toggling the > interrupts on/off for every item. > > Fixes: 4f774513f7b3 ("[SCSI] lpfc 8.3.2 : Addition of SLI4 Interface - Queues") > Cc: James Smart > Cc: Dick Kennedy > Signed-off-by: Daniel Wagner > --- > > Hi, > > While trying to understand what's going on in the Oops below I figured > that it could be the result of the invalid pointer access. The patch > still needs testing by our customer but indepent of this I think the > patch fixes a real bug. > > [ 139.392029] general protection fault: 0000 [#1] SMP PTI > [ 139.397862] CPU: 5 PID: 998 Comm: kworker/5:13 Tainted: G 4.12.14-226.g94364da-default #1 SLE15-SP1 (unreleased) > [ 139.410962] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.9.1 12/04/2018 > [ 139.419339] Workqueue: lpfc_wq lpfc_sli4_hba_process_cq [lpfc] > [ 139.425847] task: ffff95c996051440 task.stack: ffffaa038601c000 > [ 139.432459] RIP: 0010:lpfc_set_rrq_active+0xa6/0x2a0 [lpfc] > [ 139.438676] RSP: 0018:ffffaa038601fcf8 EFLAGS: 00010046 > [ 139.444504] RAX: 0000000000000292 RBX: ffff95c5a9a0a000 RCX: 000000000000ffff > [ 139.452466] RDX: ffff95c5accbb7b8 RSI: 0064695f74726f70 RDI: ffff95c5a9a0b160 > [ 139.460427] RBP: ffff95c5a9a0b160 R08: 0000000000000001 R09: 0000000000000002 > [ 139.468389] R10: ffffaa038601fdd8 R11: 61c8864680b583eb R12: 0000000000000001 > [ 139.476350] R13: 000000000000ffff R14: 00000000000002bb R15: 0064695f74726f70 > [ 139.484311] FS: 0000000000000000(0000) GS:ffff95c9bfc80000(0000) knlGS:0000000000000000 > [ 139.493340] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 139.499749] CR2: 0000560354607098 CR3: 00000007a000a003 CR4: 00000000001606e0 > [ 139.507711] Call Trace: > [ 139.510451] lpfc_sli4_io_xri_aborted+0x1a7/0x250 [lpfc] > [ 139.516386] lpfc_sli4_sp_handle_abort_xri_wcqe.isra.56+0xa0/0x180 [lpfc] > [ 139.523964] ? __switch_to_asm+0x40/0x70 > [ 139.528338] ? __switch_to_asm+0x34/0x70 > [ 139.532718] ? lpfc_sli4_fp_handle_cqe+0xc3/0x450 [lpfc] > [ 139.538649] lpfc_sli4_fp_handle_cqe+0xc3/0x450 [lpfc] > [ 139.544383] ? __switch_to_asm+0x34/0x70 > [ 139.548762] __lpfc_sli4_process_cq+0xea/0x220 [lpfc] > [ 139.554393] ? lpfc_sli4_sp_handle_abort_xri_wcqe.isra.56+0x180/0x180 [lpfc] > [ 139.562557] __lpfc_sli4_hba_process_cq+0x29/0xc0 [lpfc] > [ 139.568486] process_one_work+0x1da/0x400 > [ 139.572959] worker_thread+0x2b/0x3f0 > [ 139.577044] ? process_one_work+0x400/0x400 > [ 139.581710] kthread+0x113/0x130 > [ 139.585310] ? kthread_create_worker_on_cpu+0x50/0x50 > [ 139.590945] ret_from_fork+0x35/0x40 > > Thanks, > Daniel > > drivers/scsi/lpfc/lpfc_sli.c | 32 ++++++++++++++++++++++---------- > 1 file changed, 22 insertions(+), 10 deletions(-) > > diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c > index 294f041961a8..cbeb1f408ccc 100644 > --- a/drivers/scsi/lpfc/lpfc_sli.c > +++ b/drivers/scsi/lpfc/lpfc_sli.c > @@ -3903,16 +3903,16 @@ lpfc_sli_handle_slow_ring_event_s4(struct lpfc_hba *phba, > struct lpfc_cq_event *cq_event; > unsigned long iflag; > int count = 0; > + LIST_HEAD(work_list); > > spin_lock_irqsave(&phba->hbalock, iflag); > phba->hba_flag &= ~HBA_SP_QUEUE_EVT; > + list_splice_init(&phba->sli4_hba.sp_queue_event, &work_list); > spin_unlock_irqrestore(&phba->hbalock, iflag); > - while (!list_empty(&phba->sli4_hba.sp_queue_event)) { > + while (!list_empty(&work_list)) { > /* Get the response iocb from the head of work queue */ > - spin_lock_irqsave(&phba->hbalock, iflag); > - list_remove_head(&phba->sli4_hba.sp_queue_event, > - cq_event, struct lpfc_cq_event, list); > - spin_unlock_irqrestore(&phba->hbalock, iflag); > + list_remove_head(&work_list, cq_event, > + struct lpfc_cq_event, list); > > switch (bf_get(lpfc_wcqe_c_code, &cq_event->cqe.wcqe_cmpl)) { > case CQE_CODE_COMPL_WQE: > @@ -3941,6 +3941,17 @@ lpfc_sli_handle_slow_ring_event_s4(struct lpfc_hba *phba, > if (count == 64) > break; > } > + > + /* > + * If the limit stops the processing of events, move the > + * remaining events back to the main event queue. > + */ > + spin_lock_irqsave(&phba->hbalock, iflag); > + if (!list_empty(&work_list)) { > + phba->hba_flag |= HBA_SP_QUEUE_EVT; > + list_splice(&work_list, &phba->sli4_hba.sp_queue_event); > + } > + spin_unlock_irqrestore(&phba->hbalock, iflag); > } > > /** > @@ -12877,18 +12888,19 @@ lpfc_sli_intr_handler(int irq, void *dev_id) > void lpfc_sli4_els_xri_abort_event_proc(struct lpfc_hba *phba) > { > struct lpfc_cq_event *cq_event; > + LIST_HEAD(work_list); > > /* First, declare the els xri abort event has been handled */ > spin_lock_irq(&phba->hbalock); > phba->hba_flag &= ~ELS_XRI_ABORT_EVENT; > + list_splice_init(&phba->sli4_hba.sp_els_xri_aborted_work_queue, > + &work_list); > spin_unlock_irq(&phba->hbalock); > /* Now, handle all the els xri abort events */ > - while (!list_empty(&phba->sli4_hba.sp_els_xri_aborted_work_queue)) { > + while (!list_empty(&work_list)) { > /* Get the first event from the head of the event queue */ > - spin_lock_irq(&phba->hbalock); > - list_remove_head(&phba->sli4_hba.sp_els_xri_aborted_work_queue, > - cq_event, struct lpfc_cq_event, list); > - spin_unlock_irq(&phba->hbalock); > + list_remove_head(&work_list, cq_event, > + struct lpfc_cq_event, list); > /* Notify aborted XRI for ELS work queue */ > lpfc_sli4_els_xri_aborted(phba, &cq_event->cqe.wcqe_axri); > /* Free the event processed back to the free pool */ -- Martin K. Petersen Oracle Linux Engineering