From: Ming Lei
Date: Fri, 15 Mar 2019 15:56:56 +0800
Subject: Re: [RFC PATCH] scsi: fix oops in scsi_uninit_cmd()
To: Jason Yan
Cc: Christoph Hellwig, Bart Van Assche, "Martin K. Petersen", James Bottomley, Jens Axboe, Linux SCSI List, Linux Kernel Mailing List, Hannes Reinecke, Dan Williams, Johannes Thumshirn, Steffen Maier
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 21, 2019 at 4:55 PM Jason Yan wrote:
>
> Hi, Christoph
>
> On 2019/2/20 23:18, Christoph Hellwig wrote:
> > [fullquote removed, please follow proper mail etiquette]
> >
> > On Tue, Feb 19, 2019 at 08:56:28AM -0800, Bart Van Assche wrote:
> >> regression in the SCSI sd driver due to the switch from the legacy block
> >> layer to scsi-mq. The above patch introduces two atomic operations in the
> >> hot path and hence would introduce a performance regression. I think this
> >> can be avoided by making sure that sd_uninit_command() gets called before
> >> the request tag is freed. What changes would be required to make the block
> >> layer core call sd_uninit_command() before the request tag is freed? Would
> >> introducing prep_rq_fn and unprep_rq_fn callbacks in struct blk_mq_ops and
> >> making sure that the SCSI core sets these callback function pointers
> >> appropriately be sufficient? Would such a change allow to simplify the NVMe
> >> initiator driver? Are there any alternatives to this approach that are more
> >> elegant?
> >
> > Additional indirect calls in the I/O fast path is something I'd rather
> > avoid. But I don't fully understand the problem yet - where do
> > we release a disk reference from blk_update_request?
>
> When userspace close the fd after blk_update_request() and before
> scsi_mq_uninit_cmd(), a disk reference will be released. It is not the
> blk_update_request() directly released it.
>
> close
>   ->sd_release
>     ->scsi_disk_put
>       ->scsi_disk_release
>         ->disk->private_data = NULL;
>
> The userspace can close the fd because blk_update_request() returned the
> last IO , the userspace application does not have to stuck on read() or
> write(). The window is very small, but it can be reproduce every day
> in our testcases. So I'm very curious why. One possible explanation is
> that we enabled kernel preempt(CONFIG_PREEMPT).

Another solution is to drain in-flight FS IO in scsi_disk_release(). One
counter is needed to track in-flight passthrough IO, so that
sdev->device_busy - sdev->passthrough_ios gives the number of in-flight
FS IOs to drain.

> > And why can't
> > we move that release to __blk_mq_end_request?

I think it is doable, but then ending the bios needs to be moved out of
blk_update_request(). For example, add a list, rq->done_bio, to track
completed bios, then complete them all when the request is freed.

For partial completion, the completed bios can still be ended in
blk_update_request(), since the remaining bios will cause the fs to keep
holding the disk.

Thanks,
Ming Lei
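
To make the drain idea concrete, here is a rough userspace model, not kernel code. Only device_busy, passthrough_ios and private_data come from the mail above (and passthrough_ios is itself the proposed new counter); every model_* name is hypothetical. A real scsi_disk_release() would presumably wait for the count to reach zero, while this single-threaded sketch just reports whether the release may proceed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-device accounting. */
struct model_sdev {
	int device_busy;	/* all in-flight requests, FS and passthrough */
	int passthrough_ios;	/* proposed counter: in-flight passthrough only */
};

/* Hypothetical stand-in for the disk; private_data is what release clears. */
struct model_disk {
	struct model_sdev *sdev;
	void *private_data;
};

/* In-flight FS IO is whatever is busy and is not passthrough. */
static int model_fs_inflight(const struct model_sdev *sdev)
{
	return sdev->device_busy - sdev->passthrough_ios;
}

static void model_start_io(struct model_sdev *sdev, bool passthrough)
{
	sdev->device_busy++;
	if (passthrough)
		sdev->passthrough_ios++;
}

static void model_finish_io(struct model_sdev *sdev, bool passthrough)
{
	if (passthrough)
		sdev->passthrough_ios--;
	sdev->device_busy--;
}

/*
 * Model of the release path: refuse to clear private_data while FS IO is
 * still in flight. Returns true once the release actually happened.
 * Passthrough IO is deliberately ignored, as only FS IO is drained.
 */
static bool model_disk_try_release(struct model_disk *disk)
{
	if (model_fs_inflight(disk->sdev) > 0)
		return false;	/* a real implementation would wait here */
	disk->private_data = NULL;
	return true;
}
```

With one FS IO and one passthrough IO started, model_disk_try_release() refuses; after the FS IO finishes it succeeds even though the passthrough IO is still in flight.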
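
The rq->done_bio idea can be sketched the same way, again as a userspace model under assumptions rather than a patch. Only the done_bio list name comes from the mail; the model_* types, the event log, and the exact point where the command uninit runs are hypothetical. The sketch only shows the ordering: on the final completion the bios are parked on done_bio and ended after the uninit step, while on a partial completion they are ended immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_EVENTS 16

enum model_event { EV_BIO_END, EV_UNINIT_CMD };

struct model_bio {
	struct model_bio *next;
	unsigned int bytes;	/* bytes still outstanding in this bio */
	bool ended;		/* set when the submitter is notified */
};

struct model_rq {
	struct model_bio *bio;		/* pending bios, in order */
	struct model_bio *done_bio;	/* completed bios, not yet ended */
	enum model_event log[MODEL_MAX_EVENTS];
	size_t nr_events;
};

static void model_log(struct model_rq *rq, enum model_event ev)
{
	if (rq->nr_events < MODEL_MAX_EVENTS)
		rq->log[rq->nr_events++] = ev;
}

static void model_bio_end(struct model_rq *rq, struct model_bio *bio)
{
	bio->ended = true;
	model_log(rq, EV_BIO_END);
}

static unsigned int model_pending_bytes(const struct model_rq *rq)
{
	unsigned int total = 0;

	for (const struct model_bio *b = rq->bio; b; b = b->next)
		total += b->bytes;
	return total;
}

/* Returns true while the request still has pending bios. */
static bool model_update_request(struct model_rq *rq, unsigned int nr_bytes)
{
	bool last = nr_bytes >= model_pending_bytes(rq);
	struct model_bio **tail = &rq->done_bio;

	while (*tail)
		tail = &(*tail)->next;

	while (rq->bio && nr_bytes >= rq->bio->bytes) {
		struct model_bio *bio = rq->bio;

		nr_bytes -= bio->bytes;
		rq->bio = bio->next;
		bio->next = NULL;
		if (last) {
			/* Final completion: defer ending until free. */
			*tail = bio;
			tail = &bio->next;
		} else {
			/* Partial completion: remaining bios pin the disk. */
			model_bio_end(rq, bio);
		}
	}
	if (rq->bio)
		rq->bio->bytes -= nr_bytes;	/* partially completed bio */
	return rq->bio != NULL;
}

/* Uninit the command first, then end the deferred bios. */
static void model_free_request(struct model_rq *rq)
{
	model_log(rq, EV_UNINIT_CMD);
	while (rq->done_bio) {
		struct model_bio *bio = rq->done_bio;

		rq->done_bio = bio->next;
		bio->next = NULL;
		model_bio_end(rq, bio);
	}
}
```

For a request with two bios completed in two steps, the first bio is ended right away, and the second is ended only inside model_free_request(), after the uninit event has been logged.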