From: Can Guo
To: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org, ziqichen@codeaurora.org, rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com, salyzyn@google.com, cang@codeaurora.org
Cc: Alim Akhtar, Avri Altman, "James E.J. Bottomley", "Martin K.
Petersen", Stanley Chu, Bean Huo, Bart Van Assche, Satya Tangirala, linux-kernel@vger.kernel.org (open list)
Subject: [PATCH v3 2/3] scsi: ufs: Fix a racing problem between ufshcd_abort and eh_work
Date: Mon, 16 Nov 2020 23:04:18 -0800
Message-Id: <1605596660-2987-3-git-send-email-cang@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1605596660-2987-1-git-send-email-cang@codeaurora.org>
References: <1605596660-2987-1-git-send-email-cang@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

In the current task abort routine, if a task abort targets the device
W-LU, the code jumps directly to ufshcd_eh_host_reset_handler() to
perform a full reset and restore, then returns FAIL or SUCCESS. Commands
sent to the device W-LU are most likely SSU cmds sent during UFS PM
operations. If such an SSU cmd enters the task abort routine, a race
occurs when ufshcd_eh_host_reset_handler() flushes eh_work, because
err_handler is serialized with PM operations.

Since the whole point of aborting a cmd sent to the device W-LU is to
perform a full reset and restore, resolve the race by merely cleaning up
the lrb taken by this cmd, queuing the eh_work and aborting the cmd.
Because the cmd has been aborted, the PM operation that sent it simply
errors out, so err_handler is not blocked by ongoing PM operations. As a
further benefit, err_handler can also recover from the PM error, if any.

Because such a cmd is aborted before it is actually cleared from HW, set
the lrb->in_use flag to prevent subsequent cmds, including SCSI cmds and
dev cmds, from taking the lrb released by this cmd. The lrb->in_use flag
is eventually cleared in __ufshcd_transfer_req_compl(), which is invoked
by the full reset and restore from err_handler.

Signed-off-by: Can Guo
---
 drivers/scsi/ufs/ufshcd.c | 55 ++++++++++++++++++++++++++++++++++++-----------
 drivers/scsi/ufs/ufshcd.h |  2 ++
 2 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 7e764e8..cd7394e 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2539,6 +2539,14 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 		(hba->clk_gating.state != CLKS_ON));
 
 	lrbp = &hba->lrb[tag];
+	if (unlikely(lrbp->in_use)) {
+		if (hba->pm_op_in_progress)
+			set_host_byte(cmd, DID_BAD_TARGET);
+		else
+			err = SCSI_MLQUEUE_HOST_BUSY;
+		ufshcd_release(hba);
+		goto out;
+	}
 
 	WARN_ON(lrbp->cmd);
 	lrbp->cmd = cmd;
@@ -2781,6 +2789,11 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba,
 
 	init_completion(&wait);
 	lrbp = &hba->lrb[tag];
+	if (unlikely(lrbp->in_use)) {
+		err = -EBUSY;
+		goto out;
+	}
+
 	WARN_ON(lrbp->cmd);
 	err = ufshcd_compose_dev_cmd(hba, lrbp, cmd_type, tag);
 	if (unlikely(err))
@@ -2797,6 +2810,7 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba,
 
 	err = ufshcd_wait_for_dev_cmd(hba, lrbp, timeout);
 
+out:
 	ufshcd_add_query_upiu_trace(hba, tag,
 			err ? "query_complete_err" : "query_complete");
 
@@ -4932,6 +4946,7 @@ static void __ufshcd_transfer_req_compl(struct ufs_hba *hba,
 
 	for_each_set_bit(index, &completed_reqs, hba->nutrs) {
 		lrbp = &hba->lrb[index];
+		lrbp->in_use = false;
 		lrbp->compl_time_stamp = ktime_get();
 		cmd = lrbp->cmd;
 		if (cmd) {
@@ -6374,8 +6389,12 @@ static int ufshcd_issue_devman_upiu_cmd(struct ufs_hba *hba,
 
 	init_completion(&wait);
 	lrbp = &hba->lrb[tag];
-	WARN_ON(lrbp->cmd);
+	if (unlikely(lrbp->in_use)) {
+		err = -EBUSY;
+		goto out;
+	}
 
+	WARN_ON(lrbp->cmd);
 	lrbp->cmd = NULL;
 	lrbp->sense_bufflen = 0;
 	lrbp->sense_buffer = NULL;
@@ -6447,6 +6466,7 @@ static int ufshcd_issue_devman_upiu_cmd(struct ufs_hba *hba,
 		}
 	}
 
+out:
 	blk_put_request(req);
 out_unlock:
 	up_read(&hba->clk_scaling_lock);
@@ -6696,16 +6716,6 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
 		BUG();
 	}
 
-	/*
-	 * Task abort to the device W-LUN is illegal. When this command
-	 * will fail, due to spec violation, scsi err handling next step
-	 * will be to send LU reset which, again, is a spec violation.
-	 * To avoid these unnecessary/illegal step we skip to the last error
-	 * handling stage: reset and restore.
-	 */
-	if (lrbp->lun == UFS_UPIU_UFS_DEVICE_WLUN)
-		return ufshcd_eh_host_reset_handler(cmd);
-
 	ufshcd_hold(hba, false);
 	reg = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL);
 	/* If command is already aborted/completed, return SUCCESS */
@@ -6726,7 +6736,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
 	 * to reduce repeated printouts. For other aborted requests only print
 	 * basic details.
 	 */
-	scsi_print_command(hba->lrb[tag].cmd);
+	scsi_print_command(cmd);
 	if (!hba->req_abort_count) {
 		ufshcd_update_reg_hist(&hba->ufs_stats.task_abort, 0);
 		ufshcd_print_host_regs(hba);
@@ -6745,6 +6755,27 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
 		goto cleanup;
 	}
 
+	/*
+	 * Task abort to the device W-LUN is illegal. When this command
+	 * will fail, due to spec violation, scsi err handling next step
+	 * will be to send LU reset which, again, is a spec violation.
+	 * To avoid these unnecessary/illegal steps, first we clean up
+	 * the lrb taken by this cmd and mark the lrb as in_use, then
+	 * queue the eh_work and bail.
+	 */
+	if (lrbp->lun == UFS_UPIU_UFS_DEVICE_WLUN) {
+		spin_lock_irqsave(host->host_lock, flags);
+		if (lrbp->cmd) {
+			__ufshcd_transfer_req_compl(hba, (1UL << tag));
+			__set_bit(tag, &hba->outstanding_reqs);
+			lrbp->in_use = true;
+			hba->force_reset = true;
+			ufshcd_schedule_eh_work(hba);
+		}
+		spin_unlock_irqrestore(host->host_lock, flags);
+		goto out;
+	}
+
 	/* Skip task abort in case previous aborts failed and report failure */
 	if (lrbp->req_abort_skip)
 		err = -EIO;
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 1e680bf..66e5338 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -163,6 +163,7 @@ struct ufs_pm_lvl_states {
  * @crypto_key_slot: the key slot to use for inline crypto (-1 if none)
  * @data_unit_num: the data unit number for the first block for inline crypto
  * @req_abort_skip: skip request abort task flag
+ * @in_use: indicates that this lrb is still in use
  */
 struct ufshcd_lrb {
 	struct utp_transfer_req_desc *utr_descriptor_ptr;
@@ -192,6 +193,7 @@ struct ufshcd_lrb {
 #endif
 
 	bool req_abort_skip;
+	bool in_use;
 };
 
 /**
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
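
For readers following the in_use handshake described in the commit message, here is a minimal user-space sketch of the slot life cycle. It is a toy model under stated assumptions, not the driver code: struct toy_lrb, struct toy_hba, NUTRS and every toy_*() helper are hypothetical names invented for illustration, and locking, the doorbell register and the SCSI mid-layer are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NUTRS 32 /* hypothetical number of transfer request slots */

/* Toy stand-in for struct ufshcd_lrb: only the fields this sketch needs. */
struct toy_lrb {
	const void *cmd; /* command attached to the slot, or NULL */
	bool in_use;     /* set while an aborted cmd is still pending in HW */
};

/* Toy stand-in for struct ufs_hba. */
struct toy_hba {
	struct toy_lrb lrb[NUTRS];
	unsigned long outstanding_reqs; /* one bit per slot still owned by HW */
	bool eh_scheduled;              /* models "eh_work has been queued" */
};

/* Issue path: refuse a slot that an abort released before HW cleared it. */
static int toy_issue(struct toy_hba *hba, unsigned int tag, const void *cmd)
{
	assert(tag < NUTRS);
	if (hba->lrb[tag].in_use)
		return -EBUSY;
	hba->lrb[tag].cmd = cmd;
	hba->outstanding_reqs |= 1UL << tag;
	return 0;
}

/* Completion path: the only place in this model where in_use is cleared. */
static void toy_complete(struct toy_hba *hba, unsigned int tag)
{
	assert(tag < NUTRS);
	hba->lrb[tag].in_use = false;
	hba->lrb[tag].cmd = NULL;
	hba->outstanding_reqs &= ~(1UL << tag);
}

/* Abort path for a W-LU cmd: give the cmd back, but keep the slot reserved. */
static void toy_abort_wlun(struct toy_hba *hba, unsigned int tag)
{
	assert(tag < NUTRS);
	if (!hba->lrb[tag].cmd)
		return;
	toy_complete(hba, tag);              /* cmd is returned to its sender */
	hba->outstanding_reqs |= 1UL << tag; /* HW has not cleared it yet */
	hba->lrb[tag].in_use = true;         /* block reuse until the reset */
	hba->eh_scheduled = true;            /* defer the reset to the handler */
}

/* Error handler: the full reset completes every outstanding slot. */
static void toy_reset_and_restore(struct toy_hba *hba)
{
	for (unsigned int tag = 0; tag < NUTRS; tag++)
		if (hba->outstanding_reqs & (1UL << tag))
			toy_complete(hba, tag);
	hba->eh_scheduled = false;
}
```

In this model, toy_issue() on an aborted tag returns -EBUSY until toy_reset_and_restore() has run. That mirrors the intent stated above: the abort path never waits for the error handler, and the error handler is the one that finally releases the slot.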