Date: Mon, 16 Mar 2020 11:14:49 +0100
From: Michal Hocko
To: Tetsuo Handa
Cc: David Rientjes, Andrew Morton, Vlastimil Babka,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems
Message-ID: <20200316101449.GG11482@dhcp22.suse.cz>
References: <993e7783-60e9-ba03-b512-c829b9e833fd@i-love.sakura.ne.jp>
    <202003120012.02C0CEUB043533@www262.sakura.ne.jp>
    <20200312153238.c8d25ea6994b54a2c4d5ae1f@linux-foundation.org>
    <20200316093152.GE11482@dhcp22.suse.cz>
    <3be371a0-5b1e-7115-8659-186612ad5fb0@i-love.sakura.ne.jp>
In-Reply-To: <3be371a0-5b1e-7115-8659-186612ad5fb0@i-love.sakura.ne.jp>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 16-03-20 19:04:44, Tetsuo Handa wrote:
> On 2020/03/16 18:31, Michal Hocko wrote:
> >> What happens if the allocator has SCHED_FIFO?
> >
> > The same thing as a SCHED_FIFO running in a tight loop in the userspace.
> >
> > As long as a high priority context depends on a resource held by a low
> > priority task then we have a priority inversion problem and the page
> > allocator is no real exception here. But I do not see the allocator
> > is much different from any other code in the kernel. We do not add
> > random sleeps here and there to push a high priority FIFO or RT tasks
> > out of the execution context. We do cond_resched to help !PREEMPT
> > kernels but priority related issues are really out of scope of that
> > facility.
> >
>
> Spinning with realtime priority in userspace is a userspace's bug.
> Spinning with realtime priority in kernelspace until watchdog fires is
> a kernel's bug. We are not responsible for userspace's bug, and I'm
> asking whether the memory allocator kernel code can give enough CPU
> time to other threads even if current thread has realtime priority.

We've been through that discussion many times and the core point is
that this requires major surgery to work properly. It is not just a
matter of adding a sleep to the page allocator and being done with it.
The page allocator cannot really do much on its own. It relies on many
other contexts to make forward progress. What you are really asking for
is far from trivial. Maybe you are looking for something much closer to
the RT kernel than what the other preemption modes can currently offer.

Right now, you really have to be careful when running FIFO/RT processes
and plan their resources very carefully. Is that ideal? Not really, but
considering that this has been the status quo for many years, it seems
that the use cases tend to find their way around that restriction.
-- 
Michal Hocko
SUSE Labs