From: Byungchul Park
Date: Sat, 1 Jun 2024 03:04:24 +0900
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering tlb flush when folios get unmapped
To: Dave Hansen
Cc: Byungchul Park, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
 vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com,
 willy@infradead.org, david@redhat.com, peterz@infradead.org,
 luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, rjgolo@gmail.com
References: <20240531092001.30428-1-byungchul@sk.com> <20240531092001.30428-10-byungchul@sk.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Dave Hansen wrote:
>
> On 5/31/24 02:19, Byungchul Park wrote:
> ..
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 0283cf366c2a..03683bf66031 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2872,6 +2872,12 @@ static inline void file_end_write(struct file *file)
> >  	if (!S_ISREG(file_inode(file)->i_mode))
> >  		return;
> >  	sb_end_write(file_inode(file)->i_sb);
> > +
> > +	/*
> > +	 * XXX: If needed, can be optimized by avoiding luf_flush() if
> > +	 * the address space of the file has never been involved by luf.
> > +	 */
> > +	luf_flush();
> >  }
> ..
> > +void luf_flush(void)
> > +{
> > +	unsigned long flags;
> > +	unsigned short int ugen;
> > +
> > +	/*
> > +	 * Obtain the latest ugen number.
> > +	 */
> > +	spin_lock_irqsave(&luf_lock, flags);
> > +	ugen = luf_gen;
> > +	spin_unlock_irqrestore(&luf_lock, flags);
> > +
> > +	check_luf_flush(ugen);
> > +}
>
> Am I reading this right?  There's now an unconditional global spinlock

Until v11, splitting the lock into several locks, as RCU does, looked
like *too much*.  However, this code, introduced in v11, looks
problematic.

> acquired in the sys_write() path?  How can this possibly scale?

I should find a better way.

> So, yeah, I think an optimization is absolutely needed.  But, on a more
> fundamental level, I just don't believe these patches are being tested.
> Even a simple microbenchmark should show a pretty nasty regression on
> any decently large system:
>
> > https://github.com/antonblanchard/will-it-scale/blob/master/tests/write1.c
>
> Second, I was just pointing out sys_write() as an example of how the
> page cache could change.  Couldn't a separate, read/write mmap() of the
> file do the same thing and *not* go through sb_end_write()?
>
> So:
>
> 	fd = open("foo");
> 	ptr1 = mmap(fd, PROT_READ);
> 	ptr2 = mmap(fd, PROT_READ|PROT_WRITE);
>
> 	foo = *ptr1; // populate the page cache
> 	... page cache page is reclaimed and LUF'd
> 	*ptr2 = bar; // new page cache page is allocated and written to

I think this part would work, but I'm not convinced.  I will check
again.

> 	printk("*ptr1: %d\n", *ptr1);
>
> Doesn't the printk() see stale data?
>
> I think tglx would call all of this "tinkering".  The approach to this
> series is to "fix" narrow, specific cases that reviewers point out, make
> it compile, then send it out again, hoping someone will apply it.

Sorry for the imperfect work and for bothering you, but you know what?
I can see what is happening in this community too.  Of course, I bet
you would post better-quality mm patches than mine from the first
version, but maybe not in other subsystems.

> So, for me, until the approach to this series changes: NAK, for x86.

I understand why you got mad, and I'm sorry, but I did not anticipate
the regression you mentioned above.  And I admit the patches have had
problems that I couldn't find in advance, until you, Hildenbrand and
Ying pointed them out.  I will do better.

> Andrew, please don't take this series.  Or, if you do, please drop the
> patch enabling it on x86.

I don't want to ask for a merge either while there are still issues.

> I also have the feeling our VFS friends won't take kindly to having

That is what I thought, too.  What should I do then?  I don't believe
you disagree with the concept itself.  The thing is, the current
version is not good enough.  I will do my best with what I can do.

> random luf_foo() hooks in their hot paths, optimized or not.  I don't
> see any of them on cc.

Yes.  I should've cc'd them.  I will.

	Byungchul