From: Ralph Siemsen <ralph.siemsen@linaro.org>
To: akpm@linux-foundation.org
Cc: charante@codeaurora.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    mgorman@techsingularity.net, vinmenon@codeaurora.org, stable@vger.kernel.org,
    Linus Torvalds, Ralph Siemsen
Subject: Re: [PATCH] mm, page_alloc: skip ->watermark_boost for atomic order-0 allocations-fix
Date: Mon, 19 Oct 2020 14:40:17 -0400
Message-Id: <20201019184017.6340-1-ralph.siemsen@linaro.org>
In-Reply-To: <20200617171518.96211e345de65c54b9343a3a@linux-foundation.org>
References: <20200617171518.96211e345de65c54b9343a3a@linux-foundation.org>

Hi,

Please consider applying the patch from this thread to 5.8.y:

    commit f80b08fc44536a311a9f3182e50f318b79076425

The fix should also go into 5.4.y; however, the patch needs some minor
adjustments due to surrounding context differences. Attached below is a
version I have tested against 5.4.71.

This solves a "page allocation failure" error that can be reproduced both
on physical hardware and under qemu-system-arm. The test consists of
repeatedly running md5sum on a large file. In my tests the file contains
1GB of random data, while the system has only 256MB of RAM. No other
tasks are running or consuming significant memory. After some time
(between 1 and 200 iterations) the kernel reports a page allocation
failure, and additional failures occur fairly quickly thereafter. The
md5sum is correctly computed in each case, and the OOM killer is not
invoked. The backtrace shows that an order-0 GFP_ATOMIC allocation was
requested and failed, even though quite a bit of memory was available.

A similar error also occurs when "md5sum" is replaced by "scp" or "nc".
The backtrace again shows a failing order-0 GFP_ATOMIC allocation, with
plenty of memory available according to the Mem-Info dump.

The problem does not occur under 4.9.y or 4.19.y. Bisection shows that
the problem started with 688fcbfc06e4 ("mm/vmalloc: modify struct
vmap_area to reduce its size") during the 5.4 development cycle.

I can provide additional logs and details if interested.
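As a rough illustration of the access pattern the test exercises, the
sketch below streams a file much larger than RAM in a loop, which is
essentially what repeatedly running "md5sum <file>" does. It is only an
approximation of the reproducer: the file name, chunk size, and the
omitted hashing step are illustrative, not part of the actual test.

/*
 * Rough userspace approximation of the reproducer: repeatedly stream a
 * file much larger than RAM, as "md5sum <file>" in a loop would.
 * The file path and chunk size are illustrative only.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	static char buf[64 * 1024];
	const char *path = "random-1g.bin";	/* ~1GB of random data */

	for (unsigned long iter = 1; ; iter++) {
		int fd = open(path, O_RDONLY);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		while (read(fd, buf, sizeof(buf)) > 0)
			;	/* md5sum would hash each chunk here */
		close(fd);
		fprintf(stderr, "iteration %lu\n", iter);
		/* failures were seen within 1..200 iterations on 256MB RAM */
	}
}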
Thanks,
Ralph

Below is the f80b08fc445 commit, tweaked to apply to 5.4.y.

From: Charan Teja Reddy
Subject: [PATCH] mm, page_alloc: skip ->watermark_boost for atomic order-0 allocations

[upstream commit f80b08fc44536a311a9f3182e50f318b79076425 with context
adjusted to match linux-5.4.y]

When boosting is enabled, the rate of atomic order-0 allocation failures
is observed to be high, because free levels in the system are checked
with the ->watermark_boost offset applied. This is not a problem for
sleepable allocations, but for atomic allocations it looks like a
regression.

This problem is seen frequently on an Android kernel running on
Snapdragon hardware with 4GB of RAM. When no extfrag event has occurred
in the system, the ->watermark_boost factor is zero, so the watermark
configuration is:

   _watermark = (
          [WMARK_MIN] = 1272,   --> ~5MB
          [WMARK_LOW] = 9067,   --> ~36MB
          [WMARK_HIGH] = 9385), --> ~38MB
   watermark_boost = 0

After launching some memory-hungry applications in Android that cause
enough extfrag events, ->watermark_boost can reach its maximum, i.e. the
default boost factor makes it 150% of the high watermark:

   _watermark = (
          [WMARK_MIN] = 1272,   --> ~5MB
          [WMARK_LOW] = 9067,   --> ~36MB
          [WMARK_HIGH] = 9385), --> ~38MB
   watermark_boost = 14077,     --> ~57MB

With the default system configuration, ~2MB of free memory is enough for
an atomic order-0 allocation to succeed. But boosting raises the min_wmark
to ~61MB, so an atomic order-0 allocation now needs a minimum of ~23MB of
free memory (from the calculations in zone_watermark_ok(),
min = 3/4(min/2)). Yet failures are observed even though the system has
~20MB of free memory.

In testing, this is reproducible as early as the first 300 seconds after
boot, and with lower-RAM configurations (<2GB) it is observed as early as
the first 150 seconds after boot.

These failures can be avoided by excluding ->watermark_boost from the
watermark calculations for atomic order-0 allocations.

[akpm@linux-foundation.org: fix comment grammar, reflow comment]
[charante@codeaurora.org: fix suggested by Mel Gorman]
Link: http://lkml.kernel.org/r/31556793-57b1-1c21-1a9d-22674d9bd938@codeaurora.org
Signed-off-by: Charan Teja Reddy
Signed-off-by: Andrew Morton
Acked-by: Vlastimil Babka
Cc: Vinayak Menon
Cc: Mel Gorman
Link: http://lkml.kernel.org/r/1589882284-21010-1-git-send-email-charante@codeaurora.org
Signed-off-by: Linus Torvalds
Signed-off-by: Ralph Siemsen
---
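As a sanity check on the watermark arithmetic quoted in the commit
message above, the small userspace sketch below reproduces the ~2MB and
~23MB figures. It is an illustration only, not part of the patch: it
assumes 4 KiB pages and that the atomic order-0 allocation receives both
the ALLOC_HIGH (min -= min/2) and ALLOC_HARDER (min -= min/4) reductions
applied in __zone_watermark_ok(), which is where "min = 3/4(min/2)"
comes from.

/*
 * Illustration of the thresholds in the commit message (not kernel code).
 * Assumes 4 KiB pages; watermark and boost values are the ones quoted
 * above for the 4GB Snapdragon setup.
 */
#include <stdio.h>

static long atomic_order0_min(long wmark_pages)
{
	long min = wmark_pages;

	min -= min / 2;		/* ALLOC_HIGH reduction   */
	min -= min / 4;		/* ALLOC_HARDER reduction */
	return min;
}

int main(void)
{
	const long wmark_min = 1272;	/* pages, ~5MB  */
	const long boost_max = 14077;	/* pages, ~57MB */
	const double mb_per_page = 4.0 / 1024.0;

	long plain = atomic_order0_min(wmark_min);
	long boosted = atomic_order0_min(wmark_min + boost_max);

	printf("no boost:  need > %ld free pages (~%.1f MB)\n",
	       plain, plain * mb_per_page);
	printf("max boost: need > %ld free pages (~%.1f MB)\n",
	       boosted, boosted * mb_per_page);
	return 0;
}

This prints roughly 477 pages (~1.9MB) without boost and 5757 pages
(~22.5MB) at maximum boost, matching the ~2MB and ~23MB figures above.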
 mm/page_alloc.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aff0bb4629bd..b0e9ea4c220e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3484,7 +3484,8 @@ bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
 }
 
 static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
-		unsigned long mark, int classzone_idx, unsigned int alloc_flags)
+		unsigned long mark, int classzone_idx,
+		unsigned int alloc_flags, gfp_t gfp_mask)
 {
 	long free_pages = zone_page_state(z, NR_FREE_PAGES);
 	long cma_pages = 0;
@@ -3505,8 +3506,23 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
 	if (!order && (free_pages - cma_pages) > mark + z->lowmem_reserve[classzone_idx])
 		return true;
 
-	return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
-					free_pages);
+	if (__zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
+					free_pages))
+		return true;
+	/*
+	 * Ignore watermark boosting for GFP_ATOMIC order-0 allocations
+	 * when checking the min watermark. The min watermark is the
+	 * point where boosting is ignored so that kswapd is woken up
+	 * when below the low watermark.
+	 */
+	if (unlikely(!order && (gfp_mask & __GFP_ATOMIC) && z->watermark_boost
+		&& ((alloc_flags & ALLOC_WMARK_MASK) == WMARK_MIN))) {
+		mark = z->_watermark[WMARK_MIN];
+		return __zone_watermark_ok(z, order, mark, classzone_idx,
+					alloc_flags, free_pages);
+	}
+
+	return false;
 }
 
 bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
@@ -3647,7 +3663,8 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 
 		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
 		if (!zone_watermark_fast(zone, order, mark,
-				       ac_classzone_idx(ac), alloc_flags)) {
+				       ac_classzone_idx(ac), alloc_flags,
+				       gfp_mask)) {
 			int ret;
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
-- 
2.17.1