Date: Wed, 29 May 2024 15:58:02 -0700
From: Sean Christopherson
To: Yu Zhao
Cc: James Houghton, Andrew Morton, Paolo Bonzini, Albert Ou, Ankit Agrawal,
 Anup Patel, Atish Patra, Axel Rasmussen, Bibo Mao, Catalin Marinas,
 David Matlack, David Rientjes, Huacai Chen, James Morse, Jonathan Corbet,
 Marc Zyngier, Michael Ellerman, Nicholas Piggin, Oliver Upton,
 Palmer Dabbelt, Paul Walmsley, Raghavendra Rao Ananta, Ryan Roberts,
 Shaoqin Huang, Shuah Khan, Suzuki K Poulose, Tianrui Zhao, Will Deacon,
 Zenghui Yu, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mips@vger.kernel.org,
 linux-mm@kvack.org, linux-riscv@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev
Subject: Re: [PATCH v4 2/7] mm: multi-gen LRU: Have secondary MMUs participate in aging
References: <20240529180510.2295118-1-jthoughton@google.com>
 <20240529180510.2295118-3-jthoughton@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 29, 2024, Yu Zhao wrote:
> On Wed, May 29, 2024 at 3:59 PM Sean Christopherson wrote:
> >
> > On Wed, May 29, 2024, Yu Zhao wrote:
> > > On Wed, May 29, 2024 at 12:05 PM James Houghton wrote:
> > > >
> > > > Secondary MMUs are currently consulted for access/age information at
> > > > eviction time, but before then, we don't get accurate age information.
> > > > That is, pages that are mostly accessed through a secondary MMU (like
> > > > guest memory, used by KVM) will always just proceed down to the oldest
> > > > generation, and then at eviction time, if KVM reports the page to be
> > > > young, the page will be activated/promoted back to the youngest
> > > > generation.
> > >
> > > Correct, and as I explained offline, this is the only reasonable
> > > behavior if we can't locklessly walk secondary MMUs.
> > >
> > > Just for the record, the (crude) analogy I used was:
> > > Imagine a large room with many bills ($1, $5, $10, ...) on the floor,
> > > but you are only allowed to pick up 10 of them (and put them in your
> > > pocket). A smart move would be to survey the room *first and then*
> > > pick up the largest ones. But if you are carrying a 500 lbs backpack,
> > > you would just want to pick up whichever that's in front of you rather
> > > than walk the entire room.
> > >
> > > MGLRU should only scan (or lookaround) secondary MMUs if it can be
> > > done lockless. Otherwise, it should just fall back to the existing
> > > approach, which existed in previous versions but is removed in this
> > > version.
> >
> > IIUC, by "existing approach" you mean completely ignore secondary MMUs that
> > don't implement a lockless walk?
>
> No, the existing approach only checks secondary MMUs for LRU folios,
> i.e., those at the end of the LRU list. It might not find the best
> candidates (the coldest ones) on the entire list, but it doesn't pay
> as much for the locking. MGLRU can *optionally* scan MMUs (secondary
> included) to find the best candidates, but it can only be a win if the
> scanning incurs a relatively low overhead, e.g., done locklessly for
> the secondary MMU. IOW, this is a balance between the cost of
> reclaiming not-so-cold (warm) folios and that of finding the coldest
> folios.

Gotcha.

I tend to agree with Yu, driving the behavior via a Kconfig may generate simpler
_code_, but I think it increases the overall system complexity.  E.g. distros
will likely enable the Kconfig, and in my experience people using KVM with a
distro kernel usually aren't kernel experts, i.e. likely won't know that there's
even a decision to be made, let alone be able to make an informed decision.

Having an mmu_notifier hook that is conditionally implemented doesn't seem overly
complex, e.g. even if there's a runtime aspect at play, it'd be easy enough for
KVM to nullify its mmu_notifier hook during initialization.  The hardest part is
likely going to be figuring out the threshold for how much overhead is too much.
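
For illustration only, the "conditionally implemented hook" idea can be
sketched in plain userspace C. This is not kernel code and not taken from the
patch series: struct sketch_notifier_ops, its two callbacks, sketch_init() and
sketch_pick_mode() are all hypothetical names invented for this sketch, and the
real struct mmu_notifier_ops looks different. The sketch only shows the shape
of the idea: the optional fast hook is cleared at initialization when a
lockless walk is unavailable, and the caller falls back to checking only at
eviction time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table; NOT the kernel's struct mmu_notifier_ops. */
struct sketch_notifier_ops {
	/* Always implemented; assumed to take a lock (the expensive path). */
	bool (*test_young)(unsigned long addr);
	/* Optional; NULL means "no cheap, lockless walk is available". */
	bool (*test_clear_young_fast)(unsigned long addr);
};

/* How the (hypothetical) aging code consults the secondary MMU. */
enum sketch_aging_mode {
	SKETCH_EVICTION_ONLY,	/* check only folios at the tail of the LRU */
	SKETCH_LOOKAROUND,	/* also scan the secondary MMU while aging */
};

/* Dummy callbacks: pretend that odd addresses were recently accessed. */
static bool sketch_test_young(unsigned long addr)
{
	return (addr & 1UL) != 0;
}

static bool sketch_test_clear_young_fast(unsigned long addr)
{
	return (addr & 1UL) != 0;
}

/* Start with both hooks populated. */
static struct sketch_notifier_ops sketch_default_ops(void)
{
	struct sketch_notifier_ops ops = {
		.test_young = sketch_test_young,
		.test_clear_young_fast = sketch_test_clear_young_fast,
	};
	return ops;
}

/* Init-time decision: nullify the optional hook if the walk needs a lock. */
static void sketch_init(struct sketch_notifier_ops *ops, bool lockless_walk_ok)
{
	if (!lockless_walk_ok)
		ops->test_clear_young_fast = NULL;
}

/* The caller scans only when the fast hook exists; otherwise it falls back. */
static enum sketch_aging_mode
sketch_pick_mode(const struct sketch_notifier_ops *ops)
{
	return ops->test_clear_young_fast ? SKETCH_LOOKAROUND
					  : SKETCH_EVICTION_ONLY;
}
```

With this shape the decision is made once, at initialization, rather than by
whoever builds the kernel. It does not answer the open question above, i.e.
where the threshold for "too much overhead" should sit.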