Message-ID: <1552521077.45180.119.camel@acm.org>
Subject: Re: [RFC PATCH] scsi: fix oops in scsi_uninit_cmd()
From: Bart Van Assche
To: Jason Yan , Christoph Hellwig
Cc: martin.petersen@oracle.com, jejb@linux.vnet.ibm.com, Jens Axboe ,
 linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, hare@suse.com,
 dan.j.williams@intel.com, jthumshirn@suse.de, Steffen Maier
Date: Wed, 13 Mar 2019 16:51:17 -0700
In-Reply-To: <10ea95ec-e259-3511-44c4-58e4d255eb9f@huawei.com>
References: <20190219072743.13606-1-yanaijie@huawei.com>
 <1550595388.31902.133.camel@acm.org>
 <20190220151836.GA11695@infradead.org>
 <10ea95ec-e259-3511-44c4-58e4d255eb9f@huawei.com>

On Thu, 2019-02-21 at 16:53 +0800, Jason Yan wrote:
> On 2019/2/20 23:18, Christoph Hellwig wrote:
> > [fullquote removed, please follow proper mail etiquette]
> >
> > On Tue, Feb 19, 2019 at 08:56:28AM -0800, Bart Van Assche wrote:
> > > regression in the SCSI sd driver due to the switch from the legacy block
> > > layer to scsi-mq. The above patch introduces two atomic operations in the
> > > hot path and hence would introduce a performance regression. I think this
> > > can be avoided by making sure that sd_uninit_command() gets called before
> > > the request tag is freed. What changes would be required to make the block
> > > layer core call sd_uninit_command() before the request tag is freed? Would
> > > introducing prep_rq_fn and unprep_rq_fn callbacks in struct blk_mq_ops and
> > > making sure that the SCSI core sets these callback function pointers
> > > appropriately be sufficient? Would such a change allow to simplify the NVMe
> > > initiator driver? Are there any alternatives to this approach that are more
> > > elegant?
> >
> > Additional indirect calls in the I/O fast path is something I'd rather
> > avoid. But I don't fully understand the problem yet - where do
> > we release a disk reference from blk_update_request?
>
> When userspace close the fd after blk_update_request() and before
> scsi_mq_uninit_cmd(), a disk reference will be released. It is not the
> blk_update_request() directly released it.
>
> close
>   ->sd_release
>     ->scsi_disk_put
>       ->scsi_disk_release
>         ->disk->private_data = NULL;
>
> The userspace can close the fd because blk_update_request() returned the
> last IO , the userspace application does not have to stuck on read() or
> write(). The window is very small, but it can be reproduce every day
> in our testcases. So I'm very curious why. One possible explanation is
> that we enabled kernel preempt(CONFIG_PREEMPT).
> > And why can't we move that release to __blk_mq_end_request?

Hi Jason,

What is the current status of this issue?

Thanks,

Bart.