Subject: Re: [REQUEST DISCUSS]: speed up SCSI error handle for host with massive devices
To: Hannes Reinecke, Steffen Maier, "linux-kernel@vger.kernel.org", "James E.J. Bottomley", "Martin K. Petersen", Mike Christie, Lee Duncan, John Garry
CC: Wu Bo, Feilong Lin
From: Wenchao Hao
Date: Wed, 30 Mar 2022 18:59:42 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/3/30 17:32, Hannes Reinecke wrote:
> On 3/30/22 11:11, Wenchao Hao wrote:
>> On 2022/3/30 2:56, Hannes Reinecke wrote:
>>> On 3/29/22 14:40, Wenchao Hao wrote:
>>>> On 2022/3/29 18:56, Steffen Maier wrote:
>>>>> On 3/29/22 11:06, Wenchao Hao wrote:
>>>>>> SCSI timeout would call scsi_eh_scmd_add() on some conditions, host would be set
>>>>>> to SHOST_RECOVERY state. Once host enter SHOST_RECOVERY, IOs submitted to all
>>>>>> devices in this host would not succeed until the scsi_error_handler() finished.
>>>>>> The scsi_error_handler() might takes long time to be done, it's unbearable when
>>>>>> host has massive devices.
>>>>>>
>>>>>> I want to ask is anyone applying another error handler flow to address this
>>>>>> phenomenon?
>>>>>>
>>>>>> I think we can move some operations(like scsi get sense, scsi send startunit
>>>>>> and scsi device reset) out of scsi_unjam_host(), to perform these operations
>>>>>> without setting host to SHOST_RECOVERY? It would reduce the time of block the
>>>>>> whole host.
>>>>>>
>>>>>> Waiting for your discussion.
>>>>>
>>>>> We already have "async" aborts before even entering scsi_eh. So your use case seems to imply that those aborts fail and we enter scsi_eh?
>>>>>
>>>>
>>>> Yes, I mean when scsi_abort_command() failed and scsi_eh_scmd_add() is called.
>>>>
>>>>> There's eh_deadline for limiting the time spent in escalation of scsi_eh, and instead directly go to host reset. Would this help?
>>>>>
>>>>>
>>>>
>>>> The deadline seems not helpful. What we want to see is a single LUN's command error
>>>> would not stop other LUNs which share the same host. So my plan is to move reset LUN out
>>>> from scsi_unjam_host() which run with host set to SHOST_RECOVERY.
>>>
>>> Nope. One of the key points of scsi_unjam_host() is that is has to stop all I/O before proceeding. Without doing so basically all SCSI parallel HBAs will fail EH as they _require_ I/O to be stopped.
>>>
>>
>> I still can not understand why we must stop all I/O. In my comprehension, stopping all I/O
>> is because we might reset host during scsi_error_handler() and we must wait host's number of
>> failed command equal to number of busy command then we can wake up scsi_error_handler().
>>
>> If move reset LUN out of scsi_error_handler(), and perform single LUN reset, we only need
>> stop I/O of this single LUN, this would not affect other LUNs. If single LUN reset failed,
>> we can then call in the large scale error handle.
>>
> I know the EH flow.
>
> Problem here is the way parallel SCSI operates. Remember, parallel SCSI is a _bus_, and there can be only one command at a time on the bus.
> So if one command on the bus misfires and you have to start EH you have to stop all I/O on the bus to ensure that your EH command is the only one active on the bus.
>

Thank you for your explanation; it's clear to me now.

> For modern HBAs we sure can device other ways and means of error recovery, but I can't really see how we would do that on legacy HBAs.
>

How about defining a new return value for scsi_host_template's eh_timed_out
callback which indicates that the timeout is handled entirely by the LLD?
Something like the following:

--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -359,6 +359,8 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
 		}
+	} else if (rtn == EH_HANDLED_BY_DRIVERS) {
+		return BLK_EH_DONE;
 	}

Or, if scsi_host_template's eh_timed_out should not do this, could we define
another callback? In the LLD's timeout handler callback, we would apply the
"single LUN reset first" flow described in my previous mail.

Anyway, what we need is a way to reduce the time the host spends in
SHOST_RECOVERY.

>> Here is a brief flow:
>>
>> abort command
>>     ||
>>     || failed
>>     ||
>>     \/
>> stop single LUN's I/O (need to wait LUN's failed command number equal to busy command number)
>>     ||
>>     || failed  (according to our statistic, 90% reset LUN would succeed)
>>     ||
>>     \/
>
> Interesting. This does not match up with my experience, where 99% of the errors were due to a command timeout.
>
> So which errors do you see here? What are the causes?

These error statistics are from our consumers' environments; they told me that
about 90% of timeout-triggered errors can be handled by a LUN reset.

Thanks.
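
To make the "single LUN reset first" idea concrete, here is a minimal
user-space sketch of the escalation order discussed in this thread. It is only
a model under stated assumptions, not kernel code: every name in it
(sketch_host, sketch_lun, sketch_handle_timeout, the lun_eh_capable flag and
the SKETCH_* results) is hypothetical, and the abort and reset outcomes are
simulated with plain booleans. The lun_eh_capable flag stands in for the
distinction between modern HBAs and legacy parallel SCSI HBAs, which still go
straight to host-wide EH.

```c
#include <stdbool.h>

/* All names below are hypothetical and exist only in this sketch. */

enum sketch_eh_result {
	SKETCH_ABORTED,        /* command abort succeeded, nothing else to do */
	SKETCH_LUN_WAIT,       /* LUN blocked, other commands still in flight */
	SKETCH_LUN_RESET_DONE, /* single LUN reset recovered the device */
	SKETCH_HOST_EH,        /* escalated to host-wide error handling */
};

struct sketch_host {
	bool lun_eh_capable; /* false for legacy HBAs that need the whole bus quiet */
	bool in_recovery;    /* stands in for SHOST_RECOVERY */
};

struct sketch_lun {
	int busy;          /* commands outstanding on this LUN */
	int failed;        /* commands on this LUN that failed or timed out */
	bool blocked;      /* new I/O to this LUN is held back */
	bool abort_ok;     /* simulated result of the command abort */
	bool lun_reset_ok; /* simulated result of the LUN reset */
};

/* Fall back to the existing behaviour: block the whole host. */
static enum sketch_eh_result sketch_escalate(struct sketch_host *host)
{
	host->in_recovery = true;
	return SKETCH_HOST_EH;
}

enum sketch_eh_result sketch_handle_timeout(struct sketch_host *host,
					    struct sketch_lun *lun)
{
	/* Step 1: abort the timed-out command. */
	if (lun->abort_ok)
		return SKETCH_ABORTED;

	/* Legacy HBAs require all I/O on the bus to stop, so skip step 2. */
	if (!host->lun_eh_capable)
		return sketch_escalate(host);

	/* Step 2: stop I/O on this LUN only; other LUNs keep running. */
	lun->blocked = true;
	if (lun->failed < lun->busy)
		return SKETCH_LUN_WAIT; /* wait until failed == busy */

	if (lun->lun_reset_ok) {
		lun->busy = 0;
		lun->failed = 0;
		lun->blocked = false;
		return SKETCH_LUN_RESET_DONE;
	}

	/* Step 3: LUN reset failed, use the large-scale error handler. */
	return sketch_escalate(host);
}
```

In this model the host only enters recovery when the LUN reset fails or the
HBA cannot do per-LUN recovery at all. A LUN that returns SKETCH_LUN_WAIT is
expected to be handled again once its remaining commands have failed or
completed.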