Date: Wed, 21 Dec 2022 17:28:23 +0800
Subject: Re: [PATCH V2] scsi: libsas: Directly kick-off EH when ATA device fell off
From: yangxingui
To: John Garry
References: <20221216100327.7386-1-yangxingui@huawei.com> <565fcf28-ec53-8d74-00a3-94be8e5b60e4@oracle.com> <9b8da72d-f251-9c1b-0727-28254d7007c3@oracle.com>
In-Reply-To: <9b8da72d-f251-9c1b-0727-28254d7007c3@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/12/19 22:53, John Garry wrote:
> On 19/12/2022 12:59, yangxingui wrote:
>>> Firstly, I think that there is a bug in sas_ata_device_link_abort()
>>> -> ata_link_abort() code in that the host lock is not grabbed, as the
>>> comment in ata_port_abort() mentions. Having said that, libsas had
>>> already some dodgy host locking usage - specifically dropping the
>>> lock for the queuing path (that's something else to be fixed up ... I
>>> think that is due to queue command CB calling task_done() in some
>>> cases), but I still think that sas_ata_device_link_abort() should be
>>> fixed (to grab the host lock).
>> ok, I agree with you very much for this, I had doubts about whether we
>> needed to grab lock before.
>
> ok, I hope that you can fix this up separately.
>
>>>
>>> Secondly, this just seems like a half solution to the age-old problem
>>> - that is, EH eventually kicking in only after 30 seconds when a disk
>>> is removed with active IO. I say half solution as SAS disks still
>>> have this issue for libsas. Can we instead push to try to solve both
>>> of them now?
>>
>> Jason said you must have such an opinion "a half solution".
>> As libsas does not have any interface to mark all outstanding commands
>> as failed for SAS disks currently and SAS disks support I/O resumable
>> transmission after intermittent disconnections
>
> I don't know what you mean by "resumable transmission after intermittent
> disconnections".
>
>> , so I want to optimize SATA disks first.
>> If we want to achieve a complete solution, perhaps we need to define
>> such an interface in libsas and implement it in the LLDD. My current idea
>> is to call sas_abort_task() for all outstanding commands in the LLDD. I
>> wonder if you approve of this?
>
> Are you sure you mean sas_abort_task()? That is for the LLDD to issue an
> abort TMF. I assume that you mean sas_task_abort(). If so, I am not too
> keen on the idea of libsas calling into the LLDD to inform of such an
> event.

I've implemented this solution. Verification seems to be OK for both SAS
and SATA devices. I'll send an updated version. Could you please have a look?

Thanks,
Xingui

> Note that maybe a tagset iter function could be used by libsas to
> abort each active IO, but I don't like libsas messing with such a thing;
> in addition, there may be some conflict between libsas aborting the IO
> and the IO completing with error in the LLDD.
>
> Please note that I need to refresh my memory on this whole EH topic...
>
> Thanks,
> John