Received: by 2002:a05:6a10:1287:0:0:0:0 with SMTP id d7csp2262131pxv; Sat, 17 Jul 2021 08:30:29 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxYIgpyTjWm6erqU/u+GDUpMidyQn5W2Y1hTreTLLbnEkZUaZ8UCGJ5CwuEvTr5POcyTMkF X-Received: by 2002:a5d:850d:: with SMTP id q13mr11243799ion.3.1626535829181; Sat, 17 Jul 2021 08:30:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1626535829; cv=none; d=google.com; s=arc-20160816; b=eS8L42S7Kuv+K7OWz783k5PVeftnmZHIslAgoJlipSmDocELeWXCPHr3+7gadocWl/ /aGTdFOniPA3gNm7i3aYCJN+tH2p6Ir9bzpZ1g4/P8sFhByYt67USEkFhuWWTDmmqnrg 0jv3QcHpRnuXTQBCWh0LDjrCJtxGn567EdForOI+GJ2vtsbbXg+PZtbpnd0IYBrdqAQ/ c+Egyb9dC5HwLst1nXwqMD3aiX8Qxo4jcHAvhqgoombx46IAOFvjOPzeZ20777Ri8cSP xZosMLBjfcBoAsvDuc9hkUHsi3GDnEd+upZ1tO6z/UN6ydu2xk/KnLAmiyWS1hCH9L7O 7YIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-disposition :mime-version:user-agent:in-reply-to:subject:cc:to:from:message-id :date; bh=Cfp4Rm2qdEnwSezp0ZVGt6ZhiTui+6YPaLajwL7g1uw=; b=navYhtfbfv35kKPBTXYMebeB3jC/iVFw6wh4v8JsLht5jjG8RjkDM4KmD7SnLuSKBq j8ClngdmdE2V9aeSxc6pUixRpylFUSxVmwYaVX5ochTdg7a6Zj6ZTd5cU461JJXsjnyL 3CFbMFAQxyodDawxRirPYck2GiDlBzTn6lzG1NmqT/C/SS48NrSy+8rkCF91Tcm8Te7N 0bClXm3pdb8JxNQkQIhr++qDrmQd9Zzl+W2CJvy0EYn/pJRnh/TmA0SPLXYeihTwsQsS gh967mtPqTDnZLES1O4mK74XmRynM0UbvLl2nHE0u//2t1nURdcDb4iM/h/zlde1+28A H97w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j3si9487431ilu.69.2021.07.17.08.30.13; Sat, 17 Jul 2021 08:30:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234609AbhGQPcY convert rfc822-to-8bit (ORCPT + 99 others); Sat, 17 Jul 2021 11:32:24 -0400 Received: from pegase1.c-s.fr ([93.17.236.30]:31512 "EHLO pegase1.c-s.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234255AbhGQPcX (ORCPT ); Sat, 17 Jul 2021 11:32:23 -0400 Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4GRsTx3TnMzB6G4; Sat, 17 Jul 2021 17:29:25 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CsnSEz9Ns0sb; Sat, 17 Jul 2021 17:29:25 +0200 (CEST) Received: from vm-hermes.si.c-s.fr (vm-hermes.si.c-s.fr [192.168.25.253]) by pegase1.c-s.fr (Postfix) with ESMTP id 4GRsTx2MlBzB6FJ; Sat, 17 Jul 2021 17:29:25 +0200 (CEST) Received: by vm-hermes.si.c-s.fr (Postfix, from userid 33) id C1C925EF; Sat, 17 Jul 2021 17:34:35 +0200 (CEST) Received: from 37-171-38-5.coucou-networks.fr (37-171-38-5.coucou-networks.fr [37.171.38.5]) by messagerie.c-s.fr (Horde Framework) with HTTP; Sat, 17 Jul 2021 17:34:35 +0200 Date: Sat, 17 Jul 2021 17:34:35 +0200 Message-ID: <20210717173435.Horde.Yjk9m3mjnYfLI-Xv6-IIdg8@messagerie.c-s.fr> From: Christophe Leroy To: Tyrel Datwyler Cc: linuxppc-dev@lists.ozlabs.org, brking@linux.ibm.com, stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, martin.petersen@oracle.com, james.bottomley@hansenpartnership.com Subject: Re: [PATCH] ibmvfc: fix command state accounting and stale response detection In-Reply-To: <20210716205220.1101150-1-tyreld@linux.ibm.com> User-Agent: Internet Messaging Program (IMP) H5 (6.2.3) Content-Type: text/plain; charset=UTF-8; format=flowed; DelSp=Yes MIME-Version: 1.0 Content-Disposition: inline Content-Transfer-Encoding: 8BIT Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Tyrel Datwyler a écrit : > Prior to commit 1f4a4a19508d ("scsi: ibmvfc: Complete commands outside > the host/queue lock") responses to commands were completed sequentially > with the host lock held such that a command had a basic binary state of > active or free. It was therefore a simple affair of ensuring the > assocaiated ibmvfc_event to a VIOS response was valid by testing that it > was not already free. The lock relexation work to complete commands > outside the lock inadverdently made it a trinary command state such that > a command is either in flight, received and being completed, or > completed and now free. This breaks the stale command detection logic as > a command may be still marked active and been placed on the delayed > completion list when a second stale response for the same command > arrives. This can lead to double completions and list corruption. This > issue was exposed by a recent VIOS regression were a missing memory > barrier could occasionally result in the ibmvfc client receiveing a > duplicate response for the same command. > > Fix the issue by introducing the atomic ibmvfc_event.active to track the > trinary state of a command. The state is explicitly set to 1 when a > command is successfully sent. The CRQ response handlers use > atomic_dec_if_positive() to test for stale responses and correctly > transition to the completion state when a active command is received. > Finally, atomic_dec_and_test() is used to sanity check transistions > when commands are freed as a result of a completion, or moved to the > purge list as a result of error handling or adapter reset. > > Cc: stable@vger.kernel.org > Fixes: 1f4a4a19508d ("scsi: ibmvfc: Complete commands outside the > host/queue lock") > Signed-off-by: Tyrel Datwyler > --- > drivers/scsi/ibmvscsi/ibmvfc.c | 19 +++++++++++++++++-- > drivers/scsi/ibmvscsi/ibmvfc.h | 1 + > 2 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c > index bee1bec49c09..935b01ee44b7 100644 > --- a/drivers/scsi/ibmvscsi/ibmvfc.c > +++ b/drivers/scsi/ibmvscsi/ibmvfc.c > @@ -807,6 +807,13 @@ static int ibmvfc_init_event_pool(struct > ibmvfc_host *vhost, > for (i = 0; i < size; ++i) { > struct ibmvfc_event *evt = &pool->events[i]; > > + /* > + * evt->active states > + * 1 = in flight > + * 0 = being completed > + * -1 = free/freed > + */ > + atomic_set(&evt->active, -1); > atomic_set(&evt->free, 1); > evt->crq.valid = 0x80; > evt->crq.ioba = cpu_to_be64(pool->iu_token + (sizeof(*evt->xfer_iu) * i)); > @@ -1017,6 +1024,7 @@ static void ibmvfc_free_event(struct ibmvfc_event *evt) > > BUG_ON(!ibmvfc_valid_event(pool, evt)); > BUG_ON(atomic_inc_return(&evt->free) != 1); > + BUG_ON(atomic_dec_and_test(&evt->active)); Avoid new BUG_ONs. See https://www.kernel.org/doc/html/latest/process/deprecated.html > > spin_lock_irqsave(&evt->queue->l_lock, flags); > list_add_tail(&evt->queue_list, &evt->queue->free); > @@ -1072,6 +1080,12 @@ static void ibmvfc_complete_purge(struct > list_head *purge_list) > **/ > static void ibmvfc_fail_request(struct ibmvfc_event *evt, int error_code) > { > + /* > + * Anything we are failing should still be active. Otherwise, it > + * implies we already got a response for the command and are doing > + * something bad like double completing it. > + */ > + BUG_ON(!atomic_dec_and_test(&evt->active)); Same