X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240229235134.2447718-1-axelrasmussen@google.com>
From: Axel Rasmussen
Date: Fri, 8 Mar 2024 11:18:28 -0800
Subject: Re: MGLRU premature memcg OOM on slow writes
To: Chris Down
Cc: cgroups@vger.kernel.org, hannes@cmpxchg.org, kernel-team@fb.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, yuzhao@google.com

On Thu, Feb 29, 2024 at 4:30 PM Chris Down wrote:
>
> Axel Rasmussen writes:
> >A couple of dumb questions. In your test, do you have any of the following
> >configured / enabled?
> >
> >/proc/sys/vm/laptop_mode
> >memory.low
> >memory.min
>
> None of these are enabled. The issue is trivially reproducible by writing to
> any slow device with memory.max enabled, but from the code it looks like MGLRU
> is also susceptible to this on global reclaim (although it's less likely due to
> page diversity).
>
> >Besides that, it looks like the place non-MGLRU reclaim wakes up the
> >flushers is in shrink_inactive_list() (which calls wakeup_flusher_threads()).
> >Since MGLRU calls shrink_folio_list() directly (from evict_folios()), I agree it
> >looks like it simply will not do this.
> >
> >Yosry pointed out [1], where MGLRU used to call this but stopped doing that. It
> >makes sense to me at least that doing writeback every time we age is too
> >aggressive, but doing it in evict_folios() makes some sense to me, basically to
> >copy the behavior the non-MGLRU path (shrink_inactive_list()) has.
>
> Thanks! We may also need reclaim_throttle(), depending on how you implement it.
> Current non-MGLRU behaviour on slow storage is also highly suspect in terms of
> (lack of) throttling after moving away from VMSCAN_THROTTLE_WRITEBACK, but one
> thing at a time :-)

Hmm, so I have a patch which I think will help with this situation,
but I'm having some trouble reproducing the problem on 6.8-rc7 (so
then I can verify the patch fixes it).

If I understand the issue right, all we should need to do is get a
slow filesystem, and then generate a bunch of dirty file pages on it,
while running in a tightly constrained memcg. To that end, I tried the
following script. But, in reality I seem to get little or no
accumulation of dirty file pages.

I thought maybe fio does something different than rsync which you said
you originally tried, so I also tried rsync (copying /usr/bin into
this loop mount) and didn't run into an OOM situation either.

Maybe some dirty ratio settings need tweaking or something to get the
behavior you see? Or maybe my test has a dumb mistake in it. :)

#!/usr/bin/env bash

echo 0 > /proc/sys/vm/laptop_mode || exit 1
echo y > /sys/kernel/mm/lru_gen/enabled || exit 1

echo "Allocate disk image"
IMAGE_SIZE_MIB=1024
IMAGE_PATH=/tmp/slow.img
dd if=/dev/zero of=$IMAGE_PATH bs=1024k count=$IMAGE_SIZE_MIB || exit 1

echo "Setup loop device"
LOOP_DEV=$(losetup --show --find $IMAGE_PATH) || exit 1
LOOP_BLOCKS=$(blockdev --getsize $LOOP_DEV) || exit 1

echo "Create dm-slow"
DM_NAME=dm-slow
DM_DEV=/dev/mapper/$DM_NAME
echo "0 $LOOP_BLOCKS delay $LOOP_DEV 0 100" | dmsetup create $DM_NAME || exit 1

echo "Create fs"
mkfs.ext4 "$DM_DEV" || exit 1

echo "Mount fs"
MOUNT_PATH="/tmp/$DM_NAME"
mkdir -p "$MOUNT_PATH" || exit 1
mount -t ext4 "$DM_DEV" "$MOUNT_PATH" || exit 1

echo "Generate dirty file pages"
systemd-run --wait --pipe --collect -p MemoryMax=32M \
        fio -name=writes -directory=$MOUNT_PATH -readwrite=randwrite \
        -numjobs=10 -nrfiles=90 -filesize=1048576 \
        -fallocate=posix \
        -blocksize=4k -ioengine=mmap \
        -direct=0 -buffered=1 -fsync=0 -fdatasync=0 -sync=0 \
        -runtime=300 -time_based
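
One way to check whether dirty file pages really fail to accumulate, and which dirty thresholds are in effect, might be a pair of small helpers like the ones below. This is only a sketch and not part of the test script above. It assumes cgroup v2, where memory.stat reports file_dirty and file_writeback in bytes. The function names dirty_stats and dirty_knobs are made up for illustration.

```shell
#!/usr/bin/env bash

# Print the file_dirty and file_writeback counters (in bytes) from a
# cgroup v2 memory.stat file. The path is supplied by the caller.
dirty_stats() {
    local stat_file=$1
    awk '$1 == "file_dirty" || $1 == "file_writeback" { print $1, $2 }' "$stat_file"
}

# Print the global dirty-page writeback thresholds. The directory
# defaults to /proc/sys/vm and can be overridden (e.g. for testing).
dirty_knobs() {
    local dir=${1:-/proc/sys/vm}
    local knob
    for knob in dirty_ratio dirty_background_ratio dirty_bytes dirty_background_bytes; do
        if [ -r "$dir/$knob" ]; then
            printf '%s %s\n' "$knob" "$(cat "$dir/$knob")"
        fi
    done
}
```

As a usage example, if the fio job were started with a fixed unit name such as "systemd-run --unit=slowwrite ..." (a hypothetical name), running "dirty_stats /sys/fs/cgroup/system.slice/slowwrite.service/memory.stat" in a loop would show whether file_dirty grows toward the 32M limit. The exact cgroup path depends on the systemd setup, so treat that path as an assumption.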