Received: by 10.223.176.46 with SMTP id f43csp1798373wra; Wed, 24 Jan 2018 23:53:21 -0800 (PST) X-Google-Smtp-Source: AH8x224fe8WDJWaEy+IIO7g3pQQTJxeiFrHeooGBv598J9e7g7o/kueLL2F6P9zSQtvvLmGcAHrR X-Received: by 10.98.138.21 with SMTP id y21mr15359371pfd.147.1516866801091; Wed, 24 Jan 2018 23:53:21 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516866801; cv=none; d=google.com; s=arc-20160816; b=PMXDJ6o2KywAQXixNO187pYNo9jxxEdN9LC555ayLovi0qEvNVcImYXpfnLmFpTdaq msuP6P08ek1T4UFilJbNwleNzrhSXYhSSGG4NXikfeOjfsAlgOvBlyMKNG4PC+jHojp+ C4wolHYN5Gz8QrWfb4aYNyA9Tfvyk+bvyZ2jsdbqJ/CupRbcf2xddbIQNSctHnYbZjAy e5KBLGC4Vnda6XzJspxglOyHi1/BFvkJw7qEkVlRcOjEt2nvmYY11SNaT1cL++JZxDYl ZfZkb6SH+D1YcptSf8mY/u9DWJekmhU7sDbKWnvIzaJAktvf4ujVsBnJ3B1I1+3hc3Fh a1vg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=DyNqthhmcO+VOxEp9FE/geB1BqXkT2D1yHlVPPOXbbI=; b=S9rc4pwiOtgnJVg9RWt650XgKHtfUUX9X3+ugAgVEuQcCDi0CwakhY3TqewOEJmMnH Jq+3zddZNsbCG1S+d0fbn2dtOHFrAMOgZoyK44pg2aD3piSBVFXgG0QeIODBsshurZVM uvE6S/0aUx+CEQ0OX1FbFrF+Q9R6KeDJZJ+Ew/XOTfxcTrWyZU/IbjfnX+6lBilnS2iR Uq7d/kSKjfw5PfKBakPChul96yx+LP6tTfSfMF7Pr4HTDVwA9jb6b2n5eOWCY3Ly7f3d /FfJzV8jJjaI5tUiWz2LhTb/chNZS0OYRZA1OyYXDi6E5I5F785VkOaadyMWB49XSnUn ELxQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=QzlUIBcd; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l4-v6si1603871pln.619.2018.01.24.23.53.06; Wed, 24 Jan 2018 23:53:21 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=QzlUIBcd; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751311AbeAYHwa (ORCPT + 99 others); Thu, 25 Jan 2018 02:52:30 -0500 Received: from mail-io0-f196.google.com ([209.85.223.196]:43055 "EHLO mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751056AbeAYHw2 (ORCPT ); Thu, 25 Jan 2018 02:52:28 -0500 Received: by mail-io0-f196.google.com with SMTP id 72so7686468iom.10; Wed, 24 Jan 2018 23:52:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding:content-language; bh=DyNqthhmcO+VOxEp9FE/geB1BqXkT2D1yHlVPPOXbbI=; b=QzlUIBcdpHWU+nupyQpnme985VS5NPHB0e7Oln1ZkoyfcxQkiO8OOosN2hRgCH3RU5 ml6BhJtivD+SBaAgMNJ9FrmsyzqDStHQyTKmMVvhb+kdbnaURZG0C+UeILREU0xUWzm2 gCCiDuA4JPSqSeB/UboK13F4F1UfZZLIBoO1PE7SxSBjxPPXSKAkLmAdqhhdvcLE7xxf uawfXLWi7KPsz2xipnoDgZPXwvFRS8rrGBRSI04YMEmg+lSdN6cxrXqLNaVKIZZnzBq+ gnsSbOLTM7nKITGXdJY2wT+PzqBw9QcwU7uYA3JaJzFsKVI6vDdYPlLinWK8n5KIXdAv kPGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=DyNqthhmcO+VOxEp9FE/geB1BqXkT2D1yHlVPPOXbbI=; b=qhMg6YX6ULyTpXkxBAPGA7iTOiNIfrIHGRWdzY/svi3upFdgVBC/7yEk43TyYAKdn4 2sW+hPite4o2RgL31iScyktylEobJYwWukc0/iBj484RzaoH+N91pJ5pQf3G5fwVhyn6 omOVrMcxaRGPVeiwL5HDT5uZqkyLUCL6aYaXVyCwdvfnYF0N3YMSwNv5oRWk+MDymu0j G3VljTmUfr7rHliY/suM55xNAOYL1nzCQVIthrju+WNExFFkizEq5U7i28hcQBcmp4Sg n7GUlfwGzg8YjrEJ9eRjENwIIrpE+U967dcAMk9DaTjYR0f9QCAdodQrdAJPK7SoGbej /S/g== X-Gm-Message-State: AKwxytcQeMEraw9udscsQuqz+I5unFCGb/BXPKOz/VDSFRFhhAOHhVR/ UuuGElGuEaf+n2guOCyCVGHEOFCn X-Received: by 10.107.173.144 with SMTP id m16mr12815643ioo.122.1516866747909; Wed, 24 Jan 2018 23:52:27 -0800 (PST) Received: from ?IPv6:2402:f000:1:1501:200:5efe:166.111.70.14? ([2402:f000:1:1501:200:5efe:a66f:460e]) by smtp.gmail.com with ESMTPSA id w125sm530763itb.31.2018.01.24.23.52.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 24 Jan 2018 23:52:27 -0800 (PST) Subject: Re: [PATCH] block: blk-mq-sched: Replace GFP_ATOMIC with GFP_KERNEL in blk_mq_sched_assign_ioc To: Al Viro Cc: axboe@kernel.dk, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org References: <1516848386-5720-1-git-send-email-baijiaju1990@gmail.com> <20180125025811.GT13338@ZenIV.linux.org.uk> <70efc238-9517-7cfa-03ce-ba9c3ba0ebd4@gmail.com> <20180125041605.GV13338@ZenIV.linux.org.uk> From: Jia-Ju Bai Message-ID: <12bc3f97-6405-697a-87a6-5d8e66c120fa@gmail.com> Date: Thu, 25 Jan 2018 15:52:06 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: <20180125041605.GV13338@ZenIV.linux.org.uk> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2018/1/25 12:16, Al Viro wrote: > On Thu, Jan 25, 2018 at 11:13:56AM +0800, Jia-Ju Bai wrote: > >> I have checked the given call chain, and find that nvme_dev_disable in >> nvme_timeout calls mutex_lock that can sleep. >> Thus, I suppose this call chain is not in atomic context. > ... or it is broken. > >> Besides, how do you find that "function (nvme_timeout()) strongly suggests >> that it *is* meant to be called from bloody atomic context"? >> I check the comments in nvme_timeout, and do not find related description... > Anything that reads registers for controller state presumably won't be > happy if it can happen in parallel with other threads poking the same > hardware. Not 100% guaranteed, but it's a fairly strong sign that there's > some kind of exclusion between whatever's submitting requests / handling > interrupts and the caller of that thing. And such exclusion is likely > to be spin_lock_irqsave()-based. > > Again, that does not _prove_ it's called from atomic contexts, but does > suggest such possibility. > > Looking through the callers of that method, blk_abort_request() certainly > *is* called from under queue lock. Different drivers, though. No idea > if nvme_timeout() blocking case is broken - I'm not familiar with that > code. Question should go to nvme maintainers... > > However, digging through other call chains, there's this: > drivers/md/dm-mpath.c:530: clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC); > in multipath_clone_and_map(), aka. ->clone_and_map_rq(), called at > drivers/md/dm-rq.c:480: r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone); > in map_request(), which is called from dm_mq_queue_rq(), aka ->queue_rq(), > which is called from blk_mq_dispatch_rq_list(), called from > blk_mq_do_dispatch_sched(), called from blk_mq_sched_dispatch_requests(), > called under rcu_read_lock(). Not a context where you want GFP_KERNEL > allocations... > >> By the way, do you mean that I should add "My tool has proved that this >> function is never called in atomic context" in the description? > I mean that proof itself should be at least outlined. Crediting the tool > for finding the proof is fine *IF* it's done along wiht the proof itself. > > You want to convince the people applying the patch that it is correct. > Leaving out something trivial to verify is fine - "foo_bar_baz() has > no callers" doesn't require grep output + quoting the area around each > instance to prove that all of them are in the comments, etc.; readers > can bloody well check the accuracy of that claim themselves. This > kind of analysis, however, is decidedly *NOT* trivial to verify from > scratch. > > Moreover, if what you've proven is that for each call chain leading > to that place there's a blocking operation nearby, there is still > a possibility that some of those *are* called while in non-blocking > context. In that case you've found real bugs, and strictly speaking > your change doesn't break correct code. However, it does not make > the change itself correct - if you have something like > enter non-blocking section > ..... > in very unusual cases grab a mutex (or panic, or...) > ..... > do GFP_ATOMIC allocation > ..... > leave non-blocking section > changing that to GFP_KERNEL will turn "we deadlock in very hard to > hit case" into "we deadlock easily"... > > At the very least, I'd like to see those cutoffs - i.e. the places > that already could block on the callchains. You might very well > have found actual bugs there. Okay, thanks for your detailed explanation :) I admit that my report here is not correct, and I will improve my tool. Thanks, Jia-Ju Bai