In-Reply-To: <20240606152802.28a38817@yea>
From: Yosry Ahmed
Date: Thu, 6 Jun 2024 09:42:21 -0700
Subject: Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)
To: Erhard Furtner
Cc: Yu Zhao, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Johannes Weiner, Nhat Pham, Chengming Zhou, Sergey Senozhatsky, Minchan Kim, "Vlastimil Babka (SUSE)"

On Thu, Jun 6, 2024 at 6:28 AM Erhard Furtner wrote:
>
> On Wed, 5 Jun 2024 16:58:11 -0700
> Yosry Ahmed wrote:
>
> > On Wed, Jun 5, 2024 at 4:53 PM Yu Zhao wrote:
> > >
> > > On Wed, Jun 5, 2024 at 5:42 PM Yosry Ahmed wrote:
> > > >
> > > > On Wed, Jun 5, 2024 at 4:04 PM Erhard Furtner wrote:
> > > > >
> > > > > On Tue, 4 Jun 2024 20:03:27 -0700
> > > > > Yosry Ahmed wrote:
> > > > >
> > > > > > Could you check if the attached patch helps? It basically changes the
> > > > > > number of zpools from 32 to min(32, nr_cpus).
> > > > >
> > > > > Thanks! The patch does not fix the issue but it helps.
> > > > >
> > > > > Means I still get to see the 'kswapd0: page allocation failure' in the dmesg, a 'stress-ng-vm: page allocation failure' later on, another kswapd0 error later on, etc. _but_ the machine keeps running the workload, stays usable via VNC and I get no hard crash any longer.
> > > > >
> > > > > Without patch kswapd0 error and hard crash (need to power-cycle) <3min. With patch several kswapd0 errors but running for 2 hrs now. I double checked this to be sure.
> > > >
> > > > Thanks for trying this out. This is interesting, so even two zpools is
> > > > too much fragmentation for your use case.
> > >
> > > Now I'm a little bit skeptical that the problem is due to fragmentation.
> > >
> > > > I think there are multiple ways to go forward here:
> > > > (a) Make the number of zpools a config option, leave the default as
> > > > 32, but allow special use cases to set it to 1 or similar.
> > > > This is probably not preferable because it is not clear to users how to set
> > > > it, but the idea is that no one will have to set it except special use
> > > > cases such as Erhard's (who will want to set it to 1 in this case).
> > > >
> > > > (b) Make the number of zpools scale linearly with the number of CPUs.
> > > > Maybe something like nr_cpus/4 or nr_cpus/8. The problem with this
> > > > approach is that with a large number of CPUs, too many zpools will
> > > > start having diminishing returns. Fragmentation will keep increasing,
> > > > while the scalability/concurrency gains will diminish.
> > > >
> > > > (c) Make the number of zpools scale logarithmically with the number of
> > > > CPUs. Maybe something like 4log2(nr_cpus). This will keep the number
> > > > of zpools from increasing too much and close to the status quo. The
> > > > problem is that at a small number of CPUs (e.g. 2), 4log2(nr_cpus)
> > > > will actually give a nr_zpools > nr_cpus. So we will need to come up
> > > > with a more fancy magic equation (e.g. 4log2(nr_cpus/4)).
> > > >
> > > > (d) Make the number of zpools scale linearly with memory. This makes
> > > > more sense than scaling with CPUs because increasing the number of
> > > > zpools increases fragmentation, so it makes sense to limit it by the
> > > > available memory. This is also more consistent with other magic
> > > > numbers we have (e.g. SWAP_ADDRESS_SPACE_SHIFT).
> > > >
> > > > The problem is that unlike zswap trees, the zswap pool is not
> > > > connected to the swapfile size, so we don't have an indication for how
> > > > much memory will be in the zswap pool. We can scale the number of
> > > > zpools with the entire memory on the machine during boot, but this
> > > > seems like it would be difficult to figure out, and will not take into
> > > > consideration memory hotplugging and the zswap global limit changing.
> > > >
> > > > (e) A creative mix of the above.
> > > >
> > > > (f) Something else (probably simpler).
> > > >
> > > > I am personally leaning toward (c), but I want to hear the opinions of
> > > > other people here. Yu, Vlastimil, Johannes, Nhat? Anyone else?
> > >
> > > I double checked that commit and didn't find anything wrong. If we are
> > > all in the mood of getting to the bottom, can we try using only 1
> > > zpool while there are 2 available? I.e.,
> >
> > Erhard, do you mind checking if Yu's diff below to use a single zpool
> > fixes the problem completely? There is also an attached patch that
> > does the same thing if this is easier to apply for you.
>
> No, setting ZSWAP_NR_ZPOOLS to 1 does not fix the problem unfortunately (that being the only patch applied on v6.10-rc2).

This confirms Yu's theory that the zpools fragmentation is not the
main reason for the problem. As Vlastimil said, the setup is already
tight on memory and that commit may have just pushed it over the edge.
Since setting ZSWAP_NR_ZPOOLS to 1 (which effectively reverts the
commit) does not help in v6.10-rc2, then something else that came
after the commit would have pushed it over the edge anyway.

>
> Trying to alter the lowmem and virtual mem limits next as Michael suggested.

I saw that this worked. So it seems like we don't need to worry about
the number of zpools, for now at least :)

Thanks for helping with the testing, and thanks to everyone else who
helped on this thread.

>
> Regards,
> Erhard
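
For anyone comparing the scaling rules quoted above, here is a rough
userspace sketch in C of three of them: the tested patch's
min(32, nr_cpus), option (b)'s nr_cpus/4, and option (c)'s
4log2(nr_cpus). This is not kernel code. The function names, the
floor-based integer log2, and the choice to raise a result of 0 up to
1 zpool are all assumptions made for illustration; only the formulas
themselves come from the thread.

```c
#include <assert.h>

/* Default zpool count mentioned in the thread. */
#define DEFAULT_NR_ZPOOLS 32u

/* floor(log2(n)) for n >= 1 (illustrative helper, not a kernel API). */
static unsigned int ilog2_floor(unsigned int n)
{
    unsigned int r = 0;

    assert(n >= 1);
    while (n >>= 1)
        r++;
    return r;
}

/* Assumption: never return fewer than one zpool. */
static unsigned int at_least_one(unsigned int n)
{
    return n < 1 ? 1 : n;
}

/* The tested patch: min(32, nr_cpus). */
static unsigned int zpools_min(unsigned int nr_cpus)
{
    return nr_cpus < DEFAULT_NR_ZPOOLS ? nr_cpus : DEFAULT_NR_ZPOOLS;
}

/* Option (b): linear in the CPU count, here nr_cpus / 4. */
static unsigned int zpools_linear(unsigned int nr_cpus)
{
    return at_least_one(nr_cpus / 4);
}

/* Option (c): logarithmic in the CPU count, 4 * log2(nr_cpus). */
static unsigned int zpools_log2(unsigned int nr_cpus)
{
    return at_least_one(4 * ilog2_floor(nr_cpus));
}
```

With nr_cpus = 2, zpools_log2() gives 4, which is the
nr_zpools > nr_cpus case called out under option (c), while
zpools_min() gives 2 and zpools_linear() gives 1. With nr_cpus = 256,
zpools_log2() lands back on 32, i.e. the status quo.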