Subject: Re: [RFC PATCH] scsi: libsas: fix WARN on device removal
From: John Garry
To: Dan Williams
CC: "Martin K. Petersen", wangyijing, linux-scsi, John Garry, "linux-kernel@vger.kernel.org", Tejun Heo, Jinpu Wang
Date: Mon, 21 Nov 2016 15:16:26 +0000
Message-ID: <7d4e4aa5-0d15-ca8c-243f-24c60e1378ed@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

>>>> @Maintainers, would you be willing to accept this patch as an interim fix
>>>> for the dastardly WARN while we try to fix the flutter issue?
>>>
>>>
>>> To me this adds a bug to quiet a benign, albeit noisy, warning.
>>>
>>
>> What is the bug which is being added?
>
> The bug where we queue a port teardown, but see a port formation event
> in the meantime.

As I understand it, this vulnerability already exists:
http://marc.info/?l=linux-scsi&m=143801026028006&w=2

I actually don't understand how libsas dealt with flutter (which I take
to mean a burst of up and down events) before these changes, as it can
queue only one up and one down event per port at a time. So, if we get a
flutter, the events are lost and we end up in an indeterminate state.

>
>> And it's a very noisy warning, as in 6K lines on the console when an
>> expander is unplugged.
>
> Does something like this modulate the failure?
>
> diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
> index 60b651bfaa01..11401e5c88ba 100644
> --- a/drivers/scsi/scsi_transport_sas.c
> +++ b/drivers/scsi/scsi_transport_sas.c
> @@ -262,9 +262,10 @@ static void sas_bsg_remove(struct Scsi_Host *shost, struct sas_rphy *rphy
>  {
>  	struct request_queue *q;
>
> -	if (rphy)
> +	if (rphy) {
>  		q = rphy->q;
> -	else
> +		rphy->q = NULL;
> +	} else
>  		q = to_sas_host_attrs(shost)->q;
>
>  	if (!q)
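
To make the event-loss concern above concrete (only one up and one down event can be queued per port at a time), here is a toy user-space model. It is a sketch under stated assumptions, not the libsas code: `struct port_model`, `queue_event()`, `complete_event()` and `replay_burst()` are all hypothetical names, and the "pending bit per event kind" scheme is only my reading of the behaviour described in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds for the model; not the libsas enum. */
enum port_event { PORT_EV_UP = 0, PORT_EV_DOWN = 1, PORT_EV_COUNT };

/* Assumed model: one pending bit per event kind, per port. */
struct port_model {
	unsigned long pending;	/* bit set while that event kind is queued */
	int queued;		/* events accepted for later handling */
	int dropped;		/* events discarded because the bit was set */
};

/* Try to queue an event. Returns 1 if queued, 0 if it was dropped. */
static int queue_event(struct port_model *p, enum port_event ev)
{
	unsigned long bit;

	assert(ev < PORT_EV_COUNT);
	bit = 1UL << ev;

	if (p->pending & bit) {
		/* Same kind already pending: this occurrence is lost. */
		p->dropped++;
		return 0;
	}
	p->pending |= bit;
	p->queued++;
	return 1;
}

/* The worker has handled one event kind: allow it to be queued again. */
static void complete_event(struct port_model *p, enum port_event ev)
{
	p->pending &= ~(1UL << ev);
}

/* Deliver a burst of events before any worker has had a chance to run. */
static void replay_burst(struct port_model *p,
			 const enum port_event *burst, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		queue_event(p, burst[i]);
}
```

In this model, replaying the burst down, up, down, up before any worker runs leaves two events queued and two dropped, and nothing records the order in which the two surviving events arrived. That is the indeterminate state referred to above.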