Date: Wed, 5 Dec 2018 13:14:37 -0800 (PST)
From: David Rientjes
To: Michal Hocko
cc: Vlastimil Babka, Linus Torvalds, Andrea Arcangeli, ying.huang@intel.com,
    s.priebe@profihost.ag, mgorman@techsingularity.net,
    Linux List Kernel Mailing, alex.williamson@redhat.com, lkp@01.org,
    kirill@shutemov.name, Andrew Morton, zi.yan@cs.rutgers.edu
Subject: Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions
In-Reply-To: <20181205203243.GX1286@dhcp22.suse.cz>
References: <20181205090554.GX1286@dhcp22.suse.cz> <20181205203243.GX1286@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 5 Dec 2018, Michal Hocko wrote:

> > As we've been over countless times, this is the desired effect for
> > workloads that fit on a single node.
> > We want local pages of the native page size because they (1) are
> > accessed faster than remote hugepages and (2) are candidates for
> > collapse by khugepaged.
> >
> > For applications that do not fit in a single node, we have discussed
> > possible ways to extend the API to allow remote faulting of hugepages,
> > absent remote fragmentation as well, then the long-standing behavior is
> > preserved and large applications can use the API to increase their thp
> > success rate.
>
> OK, I just give up. This doesn't lead anywhere. You keep repeating the
> same stuff over and over, neglect other usecases and actually force them
> to do something special just to keep your very specific usecase which
> you clearly refuse to abstract into a form other people can experiment
> with or at least provide more detailed broken down numbers for a more
> serious analyses. Fault latency is only a part of the picture which is
> much more complex. Look at Mel's report to get an impression of what
> might be really useful for a _productive_ discussion.

The other usecases are part of patch 2/2 in this series, which is
functionally similar to the __GFP_COMPACT_ONLY patch that Andrea proposed.
We can also work to extend the API to allow remote thp allocations.

Patch 1/2 reverts the behavior of commit ac5b2c18911f ("mm: thp: relax
__GFP_THISNODE for MADV_HUGEPAGE mappings") which added NUMA locality on
top of an already conflated madvise mode. Prior to this commit that was
merged for 4.20, *all* thp faults were constrained to the local node; this
has been the case for three years and even prior to that in other kernels.
It turns out that allowing remote allocations introduces access latency in
the presence of local fragmentation.

The solution is not to conflate MADV_HUGEPAGE with any semantic that
suggests it allows remote thp allocations, especially when that changes
long-standing behavior, regresses my usecase, and regresses the kernel
test robot.
I'll change patch 1/2 to not touch new_page() so that we are only addressing thp faults and post a v2.
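
For anyone following along who has not used the madvise mode in question, here is a minimal userspace sketch of how an application opts a region in with MADV_HUGEPAGE. It is only an illustration, not code from either patch: the helper names (round_up_pow2, map_with_thp_hint) and the 2 MiB hugepage size are assumptions, and nothing here controls or reveals which NUMA node the kernel allocates from.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2 MiB hugepages, the usual THP size on x86-64. */
#define ASSUMED_THP_SIZE (2UL << 20)

/* Round len up to a multiple of align; align must be a power of two. */
size_t round_up_pow2(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: map an anonymous region and hint that it should be
 * backed by transparent hugepages.
 *
 * Returns the mapping, or NULL if mmap() fails. *mapped_len receives the
 * rounded-up length. *hinted is set to 1 only if madvise(MADV_HUGEPAGE)
 * succeeded; the call is advisory, so a failure is not treated as fatal.
 * The mapping itself is not guaranteed to be hugepage-aligned.
 */
void *map_with_thp_hint(size_t len, size_t *mapped_len, int *hinted)
{
    size_t size = round_up_pow2(len, ASSUMED_THP_SIZE);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;

    *hinted = 0;
#ifdef MADV_HUGEPAGE
    if (madvise(p, size, MADV_HUGEPAGE) == 0)
        *hinted = 1;
#endif
    *mapped_len = size;
    return p;
}
```

The allocation decision happens later, when the region is first touched and faults. Whether that fault is constrained to the local node or may fall back to a remote one is exactly the policy being debated in this thread, and it is not something the caller can observe from the madvise() return value.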