Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp505133pxb; Tue, 3 Nov 2020 05:30:13 -0800 (PST) X-Google-Smtp-Source: ABdhPJwVyxO6B198lDGnKts8WhkCWxC0agQ+ab9UJoOrqjd/uSGNOXyu0KM0Qhq3X/9078VTeNl5 X-Received: by 2002:aa7:dc52:: with SMTP id g18mr22011050edu.369.1604410213256; Tue, 03 Nov 2020 05:30:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1604410213; cv=none; d=google.com; s=arc-20160816; b=DAe9gF+6qJGC0EKiOKxxJg8wwtawHlCuZNoqMB4b5Zay/kqM7wbiQnNhhpQGLPHWkd 5NlbJXrkUVBlbLYxtyDGDe1JNbAs1RC2mCLsB4oecxFHscvIxUrBOyMZOIuWbtnZbP2h DTm6Oi70QeHPBAHXfEQzlKzqkputMwQAAZ6HcNHRd+PP6bT1/LVLI9z/jwyjZtbnXfvD TUhL3VIeFZ6pK0YynKGd/oS3ZnqN8pBhLMGUc5dLWgBmiw6D+WYz4PARavPbN1a4Yje8 q9UM2yhjeCXj3jhbZNLNh9rA0yzc0l1551wX/pFmxGCFYI6jTZpI1Ii+3vrpAgQvamun qNaw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:message-id:references:in-reply-to :subject:cc:to:from:date:content-transfer-encoding:mime-version :sender:dkim-signature; bh=JLjIZJZw1u9HJgVMWdn4LPLwRFKRMpegIpRTqok6leY=; b=a1dNlgLLoqHrc48DkZinHmwfkjJkoY2OYT9g4yv81i+f2FicU2y8U6S8fbKkUs/HF3 X/aj8p1xxkLcN6PBt223djGQ8OvOJjet3ic4re60TdxW4unnGfbd6j5PuFN2qyFSUR5Q N1Fk0uiWnLpEkgOCcVRVGnd1KLs8BWvq0j8q2f5yD9scJ/0lhjh6oM8vZ/kekXJSxJpp cBccCliRuAKj/t/R+9WsbcanKAzGTHOeZnafkpT5KC9gQzrfm26gSR7QtTJMPvJnoItW itNLcF2JiD4VxeQW/yAA9aBw21mQcU5jUQ5xrhjluyShdKYmGjl2Q3v70Yytf44ZLG7T lclQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@mg.codeaurora.org header.s=smtp header.b=ZKaYr7+x; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id gj23si13394043ejb.600.2020.11.03.05.29.49; Tue, 03 Nov 2020 05:30:13 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@mg.codeaurora.org header.s=smtp header.b=ZKaYr7+x; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729250AbgKCN22 (ORCPT + 99 others); Tue, 3 Nov 2020 08:28:28 -0500 Received: from m42-4.mailgun.net ([69.72.42.4]:63763 "EHLO m42-4.mailgun.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729219AbgKCN2Z (ORCPT ); Tue, 3 Nov 2020 08:28:25 -0500 DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=mg.codeaurora.org; q=dns/txt; s=smtp; t=1604410104; h=Message-ID: References: In-Reply-To: Subject: Cc: To: From: Date: Content-Transfer-Encoding: Content-Type: MIME-Version: Sender; bh=JLjIZJZw1u9HJgVMWdn4LPLwRFKRMpegIpRTqok6leY=; b=ZKaYr7+xRV3N4/VH1IXQud8PUdj/C+BUrxUKgmt36+23Z5IrIk+ciYjNrNmpVnfd2Z9iM3jS YD9wfcwob+kQQh5/LZnCrR8RqNyID2xNVth9EVJpbX9bVVMAdhN5KDeFuasSwocp5pSFIJkm zfaHBlCLSiQ8AaFWgXywuJQVR/E= X-Mailgun-Sending-Ip: 69.72.42.4 X-Mailgun-Sid: WyI0MWYwYSIsICJsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnIiwgImJlOWU0YSJd Received: from smtp.codeaurora.org (ec2-35-166-182-171.us-west-2.compute.amazonaws.com [35.166.182.171]) by smtp-out-n06.prod.us-west-2.postgun.com with SMTP id 5fa15af7875877e3edaa822e (version=TLS1.2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256); Tue, 03 Nov 2020 13:28:23 GMT Sender: cang=codeaurora.org@mg.codeaurora.org Received: by smtp.codeaurora.org (Postfix, from userid 1001) id A0073C38591; Tue, 3 Nov 2020 10:01:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-caf-mail-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=ALL_TRUSTED,BAYES_00, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.codeaurora.org (localhost.localdomain [127.0.0.1]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: cang) by smtp.codeaurora.org (Postfix) with ESMTPSA id 69B04C38586; Tue, 3 Nov 2020 10:01:02 +0000 (UTC) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Tue, 03 Nov 2020 18:01:01 +0800 From: Can Guo To: Stanley Chu Cc: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org, rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com, salyzyn@google.com, Alim Akhtar , Avri Altman , "James E.J. Bottomley" , "Martin K. Petersen" , Bean Huo , Bart Van Assche , open list Subject: Re: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold() In-Reply-To: <1604387262.13152.2.camel@mtkswgap22> References: <1604384682-15837-1-git-send-email-cang@codeaurora.org> <1604384682-15837-2-git-send-email-cang@codeaurora.org> <1604387262.13152.2.camel@mtkswgap22> Message-ID: <09c5d4d31a0bd9bed99815cfbf51aaad@codeaurora.org> X-Sender: cang@codeaurora.org User-Agent: Roundcube Webmail/1.3.9 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2020-11-03 15:07, Stanley Chu wrote: > Hi Can, > > On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote: >> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be >> decreased back in ufshcd_ungate_work() in a paired way. However, if >> specific ufshcd_hold/release sequences are met, it is possible that >> scsi_block_reqs_cnt is increased twice but only one ungate work is >> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() >> and > > Just curious that how could this be possible? Would you have some > failed > examples? > [1] One gate_work() is in the workqueue, not yet executed, now clk state == REQ_CLKS_OFF. [2] ufshcd_queuecommand() calls ufshcd_hold(async == ture) -> active_req++ -> scsi_block_reqs_cnt++ -> REQ_CLKS_ON -> queue ungate work -> active_req-- -> return -EAGAIN. [3] Now gate_work() starts to run, but since the clk state is REQ_CLKS_ON, gate_work() just sets clk state to CLKS_ON and bail. [3] Someone calls ufshcd_hold(async == false) -> do something -> ufshcd_release() -> clk state is changed to REQ_CLKS_OFF. Note that, till now, ungate_work() is still in the work queue, not executed yet. [4] Now, if someone calls ufshcd_hold(), we will hit the issue. Above sequence is a very common clk gate/ungate sequence. The issue is because ungate_work is queued but cannot be executed in time. In my case, I see the ungate_work is somehow delayed for about 150ms. This change has been tested by customers on multiple platforms. And you can tell from the code that it won't break anything. :) Thanks, Can Guo. >> ufshcd_ungate_work() in a paired way, increase it only if queue_work() >> returns true. >> >> Signed-off-by: Can Guo >> Reviewed-by: Hongwu Su >> --- >> drivers/scsi/ufs/ufshcd.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >> index 847f355..efa7d86 100644 >> --- a/drivers/scsi/ufs/ufshcd.c >> +++ b/drivers/scsi/ufs/ufshcd.c >> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool >> async) >> */ >> /* fallthrough */ >> case CLKS_OFF: >> - ufshcd_scsi_block_requests(hba); >> hba->clk_gating.state = REQ_CLKS_ON; >> trace_ufshcd_clk_gating(dev_name(hba->dev), >> hba->clk_gating.state); >> - queue_work(hba->clk_gating.clk_gating_workq, >> - &hba->clk_gating.ungate_work); >> + if (queue_work(hba->clk_gating.clk_gating_workq, >> + &hba->clk_gating.ungate_work)) >> + ufshcd_scsi_block_requests(hba); >> /* >> * fall through to check if we should wait for this >> * work to be done or not. > > Thanks, > Stanley Chu