Date: Mon, 10 Dec 2018 17:20:10 +0100
From: Michal Hocko
To: Peter Zijlstra
Cc: Daniel Vetter, Intel Graphics Development, DRI Development, LKML,
        linux-mm@kvack.org, Andrew Morton, David Rientjes,
        Christian König, Jérôme Glisse, Daniel Vetter
Subject: Re: [PATCH 2/4] kernel.h: Add non_block_start/end()
Message-ID: <20181210162010.GS1286@dhcp22.suse.cz>
References: <20181210103641.31259-1-daniel.vetter@ffwll.ch>
 <20181210103641.31259-3-daniel.vetter@ffwll.ch>
 <20181210141337.GQ1286@dhcp22.suse.cz>
 <20181210144711.GN5289@hirez.programming.kicks-ass.net>
 <20181210150159.GR1286@dhcp22.suse.cz>
 <20181210152253.GP5289@hirez.programming.kicks-ass.net>
In-Reply-To: <20181210152253.GP5289@hirez.programming.kicks-ass.net>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 10-12-18 16:22:53, Peter Zijlstra wrote:
> On Mon, Dec 10, 2018 at 04:01:59PM +0100, Michal Hocko wrote:
> > On Mon 10-12-18 15:47:11, Peter Zijlstra wrote:
> > > On Mon, Dec 10, 2018 at 03:13:37PM +0100, Michal Hocko wrote:
> > > > I do not see any scheduler guys Cced and it would be really great to
> > > > get their opinion here.
> > > > 
> > > > On Mon 10-12-18 11:36:39, Daniel Vetter wrote:
> > > > > In some special cases we must not block, but there's no spinlock,
> > > > > preempt-off, irqs-off or similar critical section already in place
> > > > > that arms the might_sleep() debug checks. Add a non_block_start/end()
> > > > > pair to annotate these.
> > > > > 
> > > > > This will be used in the oom paths of mmu-notifiers, where blocking
> > > > > is not allowed in order to guarantee forward progress.
> > > > 
> > > > Considering that the only alternative would be to abuse
> > > > preempt_{disable,enable}, and that really has a different semantic, I
> > > > think this makes some sense. The context is preemptible but we do not
> > > > want the notifier to sleep on any locks, WQ etc.
> > > 
> > > I'm confused... what is this supposed to do?
> > > 
> > > And what does 'block' mean here? Without preempt_disable/IRQ-off we're
> > > subject to regular preemption and execution can stall for arbitrary
> > > amounts of time.
> > 
> > The notifier is called from quite a restricted context - oom_reaper -
> > which shouldn't depend on any locks or sleepable conditionals.
> 
> You want to exclude spinlocks too? We could maybe frob something with
> lockdep if you need that?

Spinlocks are less of a problem because you cannot have an (in)direct
dependency on the page allocator that would deadlock. Spinlocks, or
preemption-disabled sections in general, should be short enough to
guarantee forward progress.

> > The code should be swift as well, but mostly we care about it making
> > forward progress. Checking for a sleepable context is the best thing we
> > could come up with that would describe these demands at least partially.
> 
> OK, no real objections to the thing. Just so long we're all on the same
> page as to what it does and doesn't do ;-)

I am not really sure whether there are other potential users besides this
one and whether the check as such is justified.

> I suppose you could extend the check to include schedule_debug() as
> well, maybe something like:

Do you mean to make the check cheaper?
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f66920173370..b1aaa278f1af 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3278,13 +3278,18 @@ static noinline void __schedule_bug(struct task_struct *prev)
>  /*
>   * Various schedule()-time debugging checks and statistics:
>   */
> -static inline void schedule_debug(struct task_struct *prev)
> +static inline void schedule_debug(struct task_struct *prev, bool preempt)
>  {
>  #ifdef CONFIG_SCHED_STACK_END_CHECK
>  	if (task_stack_end_corrupted(prev))
>  		panic("corrupted stack end detected inside scheduler\n");
>  #endif
>  
> +#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> +	if (!preempt && prev->state && prev->non_block_count)
> +		// splat
> +#endif
> +
>  	if (unlikely(in_atomic_preempt_off())) {
>  		__schedule_bug(prev);
>  		preempt_count_set(PREEMPT_DISABLED);
> @@ -3391,7 +3396,7 @@ static void __sched notrace __schedule(bool preempt)
>  	rq = cpu_rq(cpu);
>  	prev = rq->curr;
>  
> -	schedule_debug(prev);
> +	schedule_debug(prev, preempt);
>  
>  	if (sched_feat(HRTICK))
>  		hrtick_clear(rq);

-- 
Michal Hocko
SUSE Labs
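
For reference, a minimal sketch of what the non_block_start()/non_block_end()
pair being discussed could look like. This is an illustration only, not the
text of Daniel's patch: it assumes task_struct gains an "int non_block_count"
under CONFIG_DEBUG_ATOMIC_SLEEP (the same counter Peter's schedule_debug()
hunk above tests) and that the annotations compile away otherwise.

/*
 * Sketch only, not the patch itself.  current->non_block_count is the
 * assumed per-task counter; a non-zero value means "we must not block",
 * which might_sleep()-style checks can then test.
 */
#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
# define non_block_start() \
	do { current->non_block_count++; } while (0)
# define non_block_end() \
	do { WARN_ON(current->non_block_count-- == 0); } while (0)
#else
# define non_block_start() do { } while (0)
# define non_block_end() do { } while (0)
#endif

A caller such as the oom reaper would then bracket the mmu-notifier
invocation with non_block_start()/non_block_end(), so that the debug checks
fire if the notifier ever ends up blocking.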
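
The "// splat" placeholder in Peter's hunk is deliberately left open; one
purely illustrative expansion would be a small helper in the spirit of
__schedule_bug(), for example (name and message are hypothetical, not from
the thread):

/*
 * Hypothetical example of what the "// splat" above could call; the
 * helper name and message text are illustrative only.
 */
static noinline void __non_block_bug(struct task_struct *prev)
{
	printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d, count: %d\n",
	       prev->comm, task_pid_nr(prev), prev->non_block_count);
	dump_stack();
}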