From: Joonsoo Kim
Date: Fri, 4 May 2018 16:31:59 +0900
Subject: Re: [PATCH] mm/page_alloc: use ac->high_zoneidx for classzone_idx
To: Vlastimil Babka
Cc: Andrew Morton, Mel Gorman, Michal Hocko, Linux Memory Management List, LKML, Johannes Weiner, Minchan Kim, Ye Xiaolong, Joonsoo Kim
In-Reply-To: <8b06973c-ef82-17d2-a83d-454368de75e6@suse.cz>
References: <1525408246-14768-1-git-send-email-iamjoonsoo.kim@lge.com> <8b06973c-ef82-17d2-a83d-454368de75e6@suse.cz>

2018-05-04 16:03 GMT+09:00 Vlastimil Babka:
> On 05/04/2018 06:30 AM, js1304@gmail.com wrote:
>> From: Joonsoo Kim
>>
>> Currently, we use the zone index of preferred_zone which represents
>> the best matching zone for allocation, as classzone_idx. It has a problem
>> on NUMA system with ZONE_MOVABLE.
>>
>> In NUMA system, it can be possible that each node has different populated
>> zones. For example, node 0 could have DMA/DMA32/NORMAL/MOVABLE zone and
>> node 1 could have only NORMAL zone. In this setup, allocation request
>> initiated on node 0 and the one on node 1 would have different
>> classzone_idx, 3 and 2, respectively, since their preferred_zones are
>> different. If they are handled by only their own node, there is no problem.
>> However, if they are sometimes handled by the remote node, the problem
>> would happen.
>>
>> In the following setup, allocation initiated on node 1 will have some
>> precedence over allocation initiated on node 0 when the former allocation
>> is processed on node 0 due to not enough memory on node 1. They will have
>> different lowmem reserve due to their different classzone_idx, thus
>> the watermark bars are also different.
>>
> ...
>
>>
>> min watermark for NORMAL zone on node 0
>> allocation initiated on node 0: 750 + 4096 = 4846
>> allocation initiated on node 1: 750 + 0 = 750
>>
>> This watermark difference could cause too many numa_miss allocations
>> in some situations and then performance could be downgraded.
>>
>> Recently, there was a regression report about this problem on CMA patches
>> since CMA memory is placed in ZONE_MOVABLE by those patches. I checked
>> that the problem disappears with this fix that uses high_zoneidx
>> for classzone_idx.
>>
>> http://lkml.kernel.org/r/20180102063528.GG30397@yexl-desktop
>>
>> Using high_zoneidx for classzone_idx is a more consistent way than the
>> previous approach because the system's memory layout doesn't affect it.
>
> So to summarize;
> - ac->high_zoneidx is computed via the arcane gfp_zone(gfp_mask) and
>   represents the highest zone the allocation can use
> - classzone_idx was supposed to be the highest zone that the allocation
>   can use, that is actually available in the system. Somehow that became
>   the highest zone that is available on the preferred node (in the default
>   node-order zonelist), which causes the watermark inconsistencies you
>   mention.

Yes! Thanks for the summary!

> I don't see a problem with your change. I would be worried about
> inflated reserves when e.g. ZONE_MOVABLE doesn't exist, but that doesn't
> seem to be the case. My laptop has empty ZONE_MOVABLE and the
> ZONE_NORMAL protection for movable is 0.

Yes!
The protection value is calculated from the number of managed pages in
the upper zones. If the upper zones hold no memory, the protection is 0.

> But there had to be some reason for classzone_idx to be like this and
> not simple high_zoneidx. Maybe Mel remembers? Maybe it was important
> then, but is not anymore? Sigh, it seems to be pre-git.

Based on my code inspection, changing the classzone_idx implementation
with this patch should not cause any problem. I also tried to find the
reason for the current classzone_idx implementation by searching the git
history, but I could not find it. As you said, it seems to be pre-git.

It would be really helpful if someone who remembers the reason for the
current classzone_idx implementation could explain it.

Thanks.
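
For anyone following the arithmetic, the watermark example quoted in this
thread can be reproduced with a small toy model. This is only a sketch, not
kernel code: old_classzone_idx() and min_watermark_bar() are hypothetical
helpers, the reserve array simply reuses the 0 and 4096 values from the
example, and both requests are assumed to be movable allocations, so that
high_zoneidx is ZONE_MOVABLE on both nodes.

```python
# Toy model of the watermark arithmetic discussed in this thread.
# Not kernel code: the helper names are hypothetical, and the numbers
# 750 and 4096 are the ones quoted in the example.

DMA, DMA32, NORMAL, MOVABLE = range(4)  # zone indexes used in the example


def old_classzone_idx(populated_zones, high_zoneidx):
    """Old behaviour: index of the preferred zone, i.e. the highest zone
    populated on the local node that the request is allowed to use."""
    return max(z for z in populated_zones if z <= high_zoneidx)


def min_watermark_bar(min_wmark, lowmem_reserve, classzone_idx):
    """Bar an allocation must clear in a zone: the min watermark plus
    that zone's lowmem reserve for the request's classzone_idx."""
    return min_wmark + lowmem_reserve[classzone_idx]


# NORMAL zone on node 0: no reserve up to NORMAL, 4096 against MOVABLE.
node0_normal_reserve = [0, 0, 0, 4096]
node0_zones = {DMA, DMA32, NORMAL, MOVABLE}
node1_zones = {NORMAL}
high_zoneidx = MOVABLE  # assumption: both requests are movable allocations

old_node0 = old_classzone_idx(node0_zones, high_zoneidx)
old_node1 = old_classzone_idx(node1_zones, high_zoneidx)

# Bars on node 0's NORMAL zone for requests from (node 0, node 1).
old_bars = (min_watermark_bar(750, node0_normal_reserve, old_node0),
            min_watermark_bar(750, node0_normal_reserve, old_node1))
new_bars = (min_watermark_bar(750, node0_normal_reserve, high_zoneidx),
            min_watermark_bar(750, node0_normal_reserve, high_zoneidx))

print(old_bars)  # (4846, 750)
print(new_bars)  # (4846, 4846)
```

With the old scheme the request that originated on node 1 clears a much
lower bar on node 0 than a local request does. Using high_zoneidx gives
both requests the same bar in this model.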