Date: Thu, 15 Aug 2019 15:15:09 -0700
From: Andrew Morton
To: Michal Hocko
Cc: Daniel Vetter, LKML, linux-mm@kvack.org, DRI Development, Intel Graphics Development, Jason Gunthorpe, Peter Zijlstra, Ingo Molnar, David Rientjes, Christian König, Jérôme Glisse, Masahiro Yamada, Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang, Kees Cook, Randy Dunlap, Daniel Vetter
Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end()
Message-Id: <20190815151509.9ddbd1f11fb9c4c3e97a67a5@linux-foundation.org>
In-Reply-To: <20190815084429.GE9477@dhcp22.suse.cz>
References: <20190814202027.18735-1-daniel.vetter@ffwll.ch> <20190814202027.18735-3-daniel.vetter@ffwll.ch> <20190814134558.fe659b1a9a169c0150c3e57c@linux-foundation.org> <20190815084429.GE9477@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 15 Aug 2019 10:44:29 +0200 Michal Hocko wrote:

> > I continue to struggle with this. It introduces a new kernel state
> > "running preemptibly but must not call schedule()". How does this make
> > any sense?
> >
> > Perhaps a much, much more detailed description of the oom_reaper
> > situation would help out.
>
> The primary point here is that there is a demand of non blockable mmu
> notifiers to be called when the oom reaper tears down the address space.
> As the oom reaper is the primary guarantee of the oom handling forward
> progress it cannot be blocked on anything that might depend on blockable
> memory allocations. These are not really easy to track because they
> might be indirect - e.g. notifier blocks on a lock which other context
> holds while allocating memory or waiting for a flusher that needs memory
> to perform its work. If such a blocking state happens that we can end up
> in a silent hang with an unusable machine.
> Now we hope for reasonable implementations of mmu notifiers (strong
> words I know ;) and this should be relatively simple and effective catch
> all tool to detect something suspicious is going on.
>
> Does that make the situation more clear?

Yes, thanks, much.
Maybe a code comment along the lines of

  This is on behalf of the oom reaper, specifically when it is
  calling the mmu notifiers. The problem is that if the notifier were
  to block on, for example, mutex_lock() and if the process which holds
  that mutex were to perform a sleeping memory allocation, the oom
  reaper is now blocked on completion of that memory allocation.

btw, do we need task_struct.non_block_count? Perhaps the oom reaper
thread could set a new PF_NONBLOCK (which would be more general than
PF_OOM_REAPER). If we run out of PF_ flags, use (current == oom_reaper_th).
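
To make the counter-versus-flag question concrete, here is a small userspace model of the two options. It is only a sketch and not kernel code: struct task_model, PF_NONBLOCK_SKETCH and every function name below are made up for illustration, and the flag value is arbitrary. It models nothing about how the patch or the scheduler actually checks the state. What it may help show is that a counter keeps track of nested start/end pairs, while a single flag bit is cleared by the first end.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit for this model; not a real kernel PF_ value. */
#define PF_NONBLOCK_SKETCH 0x1u

/* Userspace stand-in for the two task_struct fields under discussion. */
struct task_model {
	unsigned int flags;   /* models task_struct.flags */
	int non_block_count;  /* models task_struct.non_block_count */
};

/* Counter variant: every start must be paired with an end; pairs may nest. */
void counter_start(struct task_model *t)
{
	t->non_block_count++;
}

void counter_end(struct task_model *t)
{
	/* An end without a matching start is a caller bug in this model. */
	assert(t->non_block_count > 0);
	t->non_block_count--;
}

bool counter_must_not_block(const struct task_model *t)
{
	return t->non_block_count > 0;
}

/* Flag variant: one bit, so a nested section cannot be told apart. */
void flag_start(struct task_model *t)
{
	t->flags |= PF_NONBLOCK_SKETCH;
}

void flag_end(struct task_model *t)
{
	t->flags &= ~PF_NONBLOCK_SKETCH;
}

bool flag_must_not_block(const struct task_model *t)
{
	return (t->flags & PF_NONBLOCK_SKETCH) != 0;
}
```

In this model, two starts followed by one end leave counter_must_not_block() true but flag_must_not_block() false. Whether nesting matters depends on whether only the oom reaper thread would ever enter such a section, which is the open question above.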