Date: Sat, 25 Apr 2020 17:27:06 -0700
From: Andrew Morton
To: David Rientjes
Cc: Vlastimil Babka, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon
Message-Id: <20200425172706.26b5011293e8dc77b1dccaf3@linux-foundation.org>
On Fri, 24 Apr 2020 13:48:06 -0700 (PDT) David Rientjes wrote:

> If GFP_ATOMIC allocations will start failing soon because the amount of
> free memory is substantially under per-zone min watermarks, it is better
> to oom kill a process rather than continue to reclaim.
>
> This intends to significantly reduce the number of page allocation
> failures that are encountered when the demands of user and atomic
> allocations overwhelm the ability of reclaim to keep up.  We can see
> this with a high ingress of networking traffic where memory allocated
> in irq context can overwhelm the ability to reclaim fast enough such
> that user memory consistently loops.  In that case, we have reclaimable
> memory, and

"user memory allocation", I assume?  Or maybe "blockable memory
allocations".

> reclaiming is successful, but we've fully depleted memory reserves that
> are allowed for non-blockable allocations.
>
> Commit 400e22499dd9 ("mm: don't warn about allocations which stall for
> too long") removed evidence of user allocations stalling because of
> this, but the situation can apply anytime we get "page allocation
> failures" where reclaim is happening but per-zone min watermarks are
> starved:
>
> Node 0 Normal free:87356kB min:221984kB low:416984kB high:611984kB active_anon:123009936kB inactive_anon:67647652kB active_file:429612kB inactive_file:209980kB unevictable:112348kB writepending:260kB present:198180864kB managed:195027624kB mlocked:81756kB kernel_stack:24040kB pagetables:11460kB bounce:0kB free_pcp:940kB local_pcp:96kB free_cma:0kB
> lowmem_reserve[]: 0 0 0 0
> Node 1 Normal free:105616kB min:225568kB low:423716kB high:621864kB active_anon:122124196kB inactive_anon:74112696kB active_file:39172kB inactive_file:103696kB unevictable:204480kB writepending:180kB present:201326592kB managed:198174372kB mlocked:204480kB kernel_stack:11328kB pagetables:3680kB bounce:0kB free_pcp:1140kB local_pcp:0kB free_cma:0kB
> lowmem_reserve[]: 0 0 0 0
>
> Without this patch, there is no guarantee that user memory allocations
> will ever be successful when non-blockable allocations overwhelm the
> ability to get above per-zone min watermarks.
>
> This doesn't solve page allocation failures entirely since it's a
> preemptive measure based on watermarks that requires concurrent
> blockable allocations to trigger the oom kill.  To completely solve
> page allocation failures, it would be possible to do the same watermark
> check for non-blockable allocations and then queue a worker to
> asynchronously oom kill if it finds watermarks to be sufficiently low
> as well.
>

Well, what's really going on here?  Is networking potentially consuming
an unbounded amount of memory?  If so, then killing a process will just
cause networking to consume more memory, then hit against the same
thing.

So presumably the answer is "no, the watermarks are inappropriately set
for this workload".
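For concreteness, the preemptive check the patch describes could look
roughly like this.  This is a toy sketch, not the patch's code: the
struct, the function names, and the 1/2 margin are all illustrative
assumptions; the real code would operate on struct zone and NR_FREE_PAGES
under the zone lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal snapshot of the two numbers the proposed check needs. */
struct zone_snapshot {
	unsigned long free_pages;	/* current free pages in the zone */
	unsigned long min_wmark;	/* per-zone min watermark, in pages */
};

/*
 * Return true when free memory is so far under the min watermark that
 * GFP_ATOMIC allocations (which are allowed to dip below min) will soon
 * start failing too, i.e. reclaim should give up and oom kill instead
 * of looping.  The 1/2 margin is an assumed threshold for illustration.
 */
static bool should_stop_reclaim(const struct zone_snapshot *z)
{
	return z->free_pages < z->min_wmark / 2;
}
```

With the Node 0 numbers quoted above (free 87356kB vs. min 221984kB),
such a check would fire: free memory is well under half of min.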
So would it not be sensible to dynamically adjust the watermarks in
response to this condition?  Maintain a larger pool of memory for these
allocations?  Or possibly push back on networking and tell it to reduce
its queue sizes?  So that stuff doesn't keep on getting oom-killed?
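On adjusting the watermarks: the spacing visible in the dump above comes
from vm.min_free_kbytes plus vm.watermark_scale_factor.  A simplified
single-zone sketch of the arithmetic the kernel does in
__setup_per_zone_wmarks (struct and function names here are illustrative,
and the per-zone proportional split of min_free_kbytes is elided):

```c
#include <assert.h>

struct wmarks {
	unsigned long min, low, high;	/* in pages */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * low and high sit above min by a gap of
 * max(min/4, managed_pages * watermark_scale_factor / 10000),
 * so raising watermark_scale_factor widens the gap and makes kswapd
 * start reclaiming earlier.
 */
static struct wmarks compute_wmarks(unsigned long min_pages,
				    unsigned long managed_pages,
				    unsigned long watermark_scale_factor)
{
	unsigned long gap = max_ul(min_pages / 4,
			managed_pages * watermark_scale_factor / 10000);
	struct wmarks w = {
		.min  = min_pages,
		.low  = min_pages + gap,
		.high = min_pages + 2 * gap,
	};
	return w;
}
```

Raising vm.min_free_kbytes lifts min (and so the atomic reserves);
raising vm.watermark_scale_factor widens the min-to-low gap, waking
kswapd sooner — both are sysctl-tunable at runtime.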