From: Yu Zhao
Date: Fri, 31 May 2024 14:31:17 -0600
Subject: Re: [PATCH v4 2/7] mm: multi-gen LRU: Have secondary MMUs participate in aging
To: Oliver Upton
Cc: James Houghton, Andrew Morton, Paolo Bonzini, Albert Ou,
    Ankit Agrawal, Anup Patel, Atish Patra, Axel Rasmussen, Bibo Mao,
    Catalin Marinas, David Matlack, David Rientjes, Huacai Chen,
    James Morse, Jonathan Corbet, Marc Zyngier, Michael Ellerman,
    Nicholas Piggin, Palmer Dabbelt, Paul Walmsley,
    Raghavendra Rao Ananta, Ryan Roberts, Sean Christopherson,
    Shaoqin Huang, Shuah Khan, Suzuki K Poulose, Tianrui Zhao,
    Will Deacon, Zenghui Yu, kvm-riscv@lists.infradead.org,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev,
    linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    linux-mips@vger.kernel.org, linux-mm@kvack.org,
    linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
    loongarch@lists.linux.dev
References: <20240529180510.2295118-1-jthoughton@google.com>
    <20240529180510.2295118-3-jthoughton@google.com>

On Fri, May 31, 2024 at 1:24 AM Oliver Upton wrote:
>
> On Wed, May 29, 2024 at 03:03:21PM -0600, Yu Zhao wrote:
> > On Wed, May 29, 2024 at 12:05 PM James Houghton wrote:
> > >
> > > Secondary MMUs are currently consulted for access/age information at
> > > eviction time, but before then, we don't get accurate age information.
> > > That is, pages that are mostly accessed through a secondary MMU (like
> > > guest memory, used by KVM) will always just proceed down to the oldest
> > > generation, and then at eviction time, if KVM reports the page to be
> > > young, the page will be activated/promoted back to the youngest
> > > generation.
> >
> > Correct, and as I explained offline, this is the only reasonable
> > behavior if we can't locklessly walk secondary MMUs.
> >
> > Just for the record, the (crude) analogy I used was:
> > Imagine a large room with many bills ($1, $5, $10, ...) on the floor,
> > but you are only allowed to pick up 10 of them (and put them in your
> > pocket). A smart move would be to survey the room *first and then*
> > pick up the largest ones.
> > But if you are carrying a 500 lbs backpack,
> > you would just want to pick up whichever that's in front of you rather
> > than walk the entire room.
> >
> > MGLRU should only scan (or lookaround) secondary MMUs if it can be
> > done lockless. Otherwise, it should just fall back to the existing
> > approach, which existed in previous versions but is removed in this
> > version.
>
> Grabbing the MMU lock for write to scan sucks, no argument there. But
> can you please be specific about the impact of read lock v. RCU in the
> case of arm64? I had asked about this before and you never replied.
>
> My concern remains that adding support for software table walkers
> outside of the MMU lock entirely requires more work than just deferring
> the deallocation to an RCU callback. Walkers that previously assumed
> 'exclusive' access while holding the MMU lock for write must now cope
> with volatile PTEs.
>
> Yes, this problem already exists when hardware sets the AF, but the
> lock-free walker implementation needs to be generic so it can be applied
> for other PTE bits.

Direct reclaim is multi-threaded, and on arm64 each reclaimer can take
the MMU lock either for read (testing the A-bit) or for write
(unmapping before paging out). The fundamental problem with using a
readers-writer lock in this case is priority inversion: the readers
have lower priority than the writers, so ideally we don't want the
readers to block the writers at all.

Using my previous (crude) analogy: picking up the bill right in front
of you (the writers) profits immediately, whereas searching for the
largest bill (the readers) can be futile.

As I said earlier, I prefer we drop the arm64 support for now, but I
will not object to taking the MMU lock for read when clearing the
A-bit, as long as we fully understand the problem here and document it
clearly.
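
To make the priority-inversion point concrete, here is a toy model, not kernel code and not anything from the patch series. It is single-threaded and only tracks lock state; ToyRWLock and writer_stall are hypothetical names made up for this illustration, and "ticks" are an arbitrary unit. It counts how long a reclaimer that wants the lock for write has to wait while one aging scan is in flight, first with the scan holding the lock for read and then with a lockless scan.

```python
from dataclasses import dataclass


@dataclass
class ToyRWLock:
    """State-only model of a readers-writer lock (hypothetical, no real blocking)."""
    readers: int = 0
    writer: bool = False

    def try_read(self) -> bool:
        # A reader gets in unless a writer currently holds the lock.
        if self.writer:
            return False
        self.readers += 1
        return True

    def read_unlock(self) -> None:
        assert self.readers > 0
        self.readers -= 1

    def try_write(self) -> bool:
        # A writer needs the lock to be completely free.
        if self.writer or self.readers:
            return False
        self.writer = True
        return True

    def write_unlock(self) -> None:
        assert self.writer
        self.writer = False


def writer_stall(scan_ticks: int, lockless: bool) -> int:
    """Ticks a reclaimer waits for the write lock while one aging scan runs.

    The scan (a low-priority reader) starts at tick 0 and lasts scan_ticks.
    With lockless=True the scan never touches the lock at all.
    """
    if scan_ticks < 1:
        raise ValueError("scan_ticks must be >= 1")
    lock = ToyRWLock()
    if not lockless:
        assert lock.try_read()  # the aging scan takes the lock for read
    waited = 0
    remaining = scan_ticks
    while not lock.try_write():  # the reclaimer keeps retrying
        waited += 1
        remaining -= 1
        if remaining == 0:
            lock.read_unlock()  # the scan finishes and drops the lock
    lock.write_unlock()
    return waited


if __name__ == "__main__":
    print("read-locked scan:", writer_stall(5, lockless=False), "ticks stalled")
    print("lockless scan:   ", writer_stall(5, lockless=True), "ticks stalled")
```

In this model the writer's stall grows with the length of the read-side scan and is zero when the scan is lockless. The model deliberately ignores everything that makes the real problem hard (multiple reclaimers, lock fairness, volatile PTEs), so it only illustrates the shape of the argument.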