Date: Thu, 30 Apr 2020 13:40:43 +0800
From: Can Guo
To: Bart Van Assche
Cc: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org,
    rnayak@codeaurora.org, stanley.chu@mediatek.com, alim.akhtar@samsung.com,
    beanhuo@micron.com, Avri.Altman@wdc.com, bjorn.andersson@linaro.org,
    linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com,
    salyzyn@google.com, "James E.J. Bottomley", "Martin K. Petersen", open list
Subject: Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
In-Reply-To: <9e15123e-4315-15cd-3d23-2df6144bd376@acm.org>
References: <1588219805-25794-1-git-send-email-cang@codeaurora.org> <9e15123e-4315-15cd-3d23-2df6144bd376@acm.org>
Message-ID: <1ef85ee212bee679f7b2927cbbc79cba@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Bart,

On 2020-04-30 13:08, Bart Van Assche wrote:
> On 2020-04-29 21:10, Can Guo wrote:
>> During system resume, scsi_resume_device() decreases a request queue's
>> pm_only counter if the scsi device was quiesced before. But after that,
>> if the scsi device's RPM status is RPM_SUSPENDED, the pm_only counter is
>> still held (non-zero). Current scsi resume hook only sets the RPM status
>> of the scsi device and its request queue to RPM_ACTIVE, but leaves the
>> pm_only counter unchanged.
>> This may make the request queue's pm_only
>> counter remain non-zero after resume hook returns, hence those who are
>> waiting on the mq_freeze_wq would never be woken up. Fix this by calling
>> blk_post_runtime_resume() if pm_only is non-zero to balance the pm_only
>> counter which is held by the scsi device's RPM ops.
>
> How was this issue discovered? How has this patch been tested?
>
> Thanks,
>
> Bart.

The issue was discovered during system suspend/resume testing, and it is
very easy to reproduce. After the system resumes, if this issue hits some
scsi devices, all bios sent to their request queues are blocked, which may
cause a system hang if those scsi devices are vital to system functionality.

To verify the patch, we ran system suspend/resume tests and confirmed that
no system hang occurs due to request queues being blocked by an imbalanced
pm_only counter.

Thanks,

Can Guo.
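
The counter imbalance described in the quoted patch text can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `ToyQueue`, `runtime_suspend()`, `quiesce()`, `system_resume()` and the `balance` flag are hypothetical names invented for the illustration, and the model reduces the request queue to a counter plus an RPM status. With `balance=True` it drops the reference held by the RPM ops, which is the role the patch gives to blk_post_runtime_resume().

```python
from dataclasses import dataclass

RPM_ACTIVE = "RPM_ACTIVE"
RPM_SUSPENDED = "RPM_SUSPENDED"


@dataclass
class ToyQueue:
    """Hypothetical stand-in for a request queue; not a kernel structure."""
    pm_only: int = 0               # how many holders keep the queue PM-only
    rpm_status: str = RPM_ACTIVE
    quiesced: bool = False

    def blocked(self) -> bool:
        # In this model, normal I/O waits while pm_only is non-zero.
        return self.pm_only > 0


def runtime_suspend(q: ToyQueue) -> None:
    # The device's RPM ops take one pm_only reference and keep it
    # for as long as the device stays runtime suspended.
    q.pm_only += 1
    q.rpm_status = RPM_SUSPENDED


def quiesce(q: ToyQueue) -> None:
    # Quiescing the device takes its own pm_only reference.
    if not q.quiesced:
        q.pm_only += 1
        q.quiesced = True


def system_resume(q: ToyQueue, balance: bool) -> None:
    # Step 1: drop the reference taken by quiesce, if the device was quiesced.
    if q.quiesced:
        q.pm_only -= 1
        q.quiesced = False
    # Step 2: the resume hook marks a runtime-suspended device as active.
    if q.rpm_status == RPM_SUSPENDED:
        q.rpm_status = RPM_ACTIVE
        # Assumed fix path: also release the reference held by the RPM ops.
        if balance and q.pm_only > 0:
            q.pm_only -= 1


def resume_runtime_suspended_device(balance: bool) -> ToyQueue:
    # Scenario from the thread: the device is runtime suspended when the
    # system resumes.
    q = ToyQueue()
    runtime_suspend(q)
    system_resume(q, balance)
    return q
```

Calling `resume_runtime_suspended_device(balance=False)` leaves the queue marked RPM_ACTIVE while `blocked()` still returns True, which corresponds to the hang described above. With `balance=True` the counter returns to zero and the queue is no longer blocked.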