Date: Wed, 1 Apr 2020 18:46:54 +0200
From: Michal Hocko
To: Pavel Tatashin
Cc: Daniel Jordan, Vlastimil Babka, David Hildenbrand, Shile Zhang, Andrew Morton, Kirill Tkhai, linux-mm, LKML
Subject: Re: [PATCH v3] mm: fix tick timer stall during deferred page init
Message-ID: <20200401164654.GY22681@dhcp22.suse.cz>

On Wed 01-04-20 12:41:13, Pavel Tatashin wrote:
> On Wed, Apr 1, 2020 at 12:26 PM Michal Hocko wrote:
> >
> > On Wed 01-04-20 12:18:10, Daniel Jordan wrote:
> > > On Wed, Apr 01, 2020 at 06:12:43PM +0200, Michal Hocko wrote:
> > > > On Wed 01-04-20 12:09:29, Daniel Jordan wrote:
> > > > > On Wed, Apr 01, 2020 at 06:00:48PM +0200, Michal Hocko wrote:
> > > > > > On Wed 01-04-20 17:50:22, David Hildenbrand wrote:
> > > > > > > On 01.04.20 17:42, Michal Hocko wrote:
> > > > > > > > This needs a double checking but I strongly believe that the lock can be
> > > > > > > > simply dropped in this path.
> > > > >
> > > > > This is what my fix does, it limits the time the resize lock is held.
> > > >
> > > > Just remove it from the deferred initialization and add a comment that we
> > > > are deliberately not taking the lock here because abc
> > >
> > > I think it has to be a little more involved because of the window where
> > > interrupts might allocate during deferred init, as Vlastimil pointed out a few
> > > years ago when the change was made.
> >
> > I do not remember any details but do we have any actual real allocation
> > failure or was this mostly a theoretical concern? Vlastimil? For your
> > context we are talking about 3a2d7fa8a3d5 ("mm: disable interrupts while
> > initializing deferred pages")
>
> I do not remember seeing any real failures, this was a theoretical
> window. So, we could potentially simply remove these locks until we
> see a real boot failure in some interrupt thread. The allocation has
> to be rather large as well.

Yes please! We are really great at over complicating and over
engineering stuff based on theoretical issues and build on top of that
and make the code even more complex because nobody dares to re-evaluate
and so on.
-- 
Michal Hocko
SUSE Labs