Date: Wed, 20 Jun 2018 15:07:46 +0200
From: Michal Hocko
To: Tetsuo Handa
Cc: linux-mm@kvack.org, rientjes@google.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.
Message-ID: <20180620130746.GN13685@dhcp22.suse.cz>
References: <1529493638-6389-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp> <20180620115531.GL13685@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 20-06-18 21:21:21, Tetsuo Handa wrote:
> On 2018/06/20 20:55, Michal Hocko wrote:
> > On Wed 20-06-18 20:20:38, Tetsuo Handa wrote:
> >> Sleeping with oom_lock held can cause AB-BA lockup bug because
> >> __alloc_pages_may_oom() does not wait for oom_lock. Since
> >> blocking_notifier_call_chain() in out_of_memory() might sleep, sleeping
> >> with oom_lock held is currently an unavoidable problem.
> >
> > Could you be more specific about the potential deadlock? Sleeping while
> > holding oom lock is certainly not nice but I do not see how that would
> > result in a deadlock assuming that the sleeping context doesn't sleep on
> > the memory allocation obviously.
>
> "A" is "owns oom_lock" and "B" is "owns CPU resources". It was demonstrated
> at "mm,oom: Don't call schedule_timeout_killable() with oom_lock held." proposal.

This is not a deadlock but merely a resource starvation AFAIU.

> But since you don't accept preserving the short sleep which is a heuristic for
> reducing the possibility of AB-BA lockup, the only way we would accept will be
> wait for the owner of oom_lock (e.g. by s/mutex_trylock/mutex_lock/ or whatever)
> which is free of heuristic and free of AB-BA lockup.
>
> >
> >> As a preparation for not to sleep with oom_lock held, this patch brings
> >> OOM notifier callbacks to outside of OOM killer, with two small behavior
> >> changes explained below.
> >
> > Can we just eliminate this ugliness and remove it altogether? We do not
> > have that many notifiers.
> > Is there anything fundamental that would
> > prevent us from moving them to shrinkers instead?
> >
>
> For long term, it would be possible. But not within this patch. For example,
> I think that virtio_balloon wants to release memory only when we have no
> choice but OOM kill. If virtio_balloon trivially releases memory, it will
> increase the risk of killing the entire guest by OOM-killer from the host
> side.

I would _prefer_ to think long term here. The sleep inside the oom lock is
not something real workloads are seeing out there AFAICS. Adding quite some
code to address such a case doesn't justify the inclusion IMHO.
-- 
Michal Hocko
SUSE Labs