Date: Wed, 27 Mar 2024 12:31:49 +0000
From: Matthew Wilcox
To: Zhaoyang Huang
Cc: 黄朝阳 (Zhaoyang Huang), Andrew Morton, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", 康纪滨 (Steve Kang)
Subject: Re: summarize all information again at bottom//reply: reply: [PATCH] mm: fix a race scenario in folio_isolate_lru

On Wed, Mar 27, 2024 at 09:25:59AM +0800, Zhaoyang Huang wrote:
> > Ignoring any other thread, you're basically saying that there's a
> > refcount imbalance here. Which means we'd hit an assert (that folio
> > refcount went below zero) in the normal case where another thread wasn't
> > simultaneously trying to do anything.
> Theoretically Yes but it is rare in practice as aops->readahead will
> launch all pages to IO under most scenarios.

Rare, but this path has been tested.

> read_pages
>     aops->readahead[1]
>     ...
>     while (folio = readahead_folio)[2]
>         filemap_remove_folio
>
> IMO, according to the comments of readahead_page, the refcnt
> represents page cache dropped in [1] makes sense for two reasons, '1.
> The folio is going to do IO and is locked until IO done;2. The refcnt
> will be added back when found again from the page cache and then serve
> for PTE or vfs' while it doesn't make sense in [2] as the refcnt of
> page cache will be dropped in filemap_remove_folio
>
>  * Context: The page is locked and has an elevated refcount. The caller
>  * should decreases the refcount once the page has been submitted for I/O
>  * and unlock the page once all I/O to that page has completed.
>  * Return: A pointer to the next page, or %NULL if we are done.

Follow the refcount through.

In page_cache_ra_unbounded():

	folio = filemap_alloc_folio(gfp_mask, 0);
(folio has refcount 1)
	ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
(folio has refcount 2)

Then we call read_pages()
First we call ->readahead() which for some reason stops early.
Then we call readahead_folio() which calls folio_put()
(folio has refcount 1)
Then we call folio_get()
(folio has refcount 2)
Then we call filemap_remove_folio()
(folio has refcount 1)
Then we call folio_unlock()
Then we call folio_put()
(folio has refcount 0 and is freed)

Yes, other things can happen in there to increment the refcount, so this
folio_put() might not be the last put, but we hold the folio locked the
entire time, so many things which might be attempted will block on the
folio lock. In particular, nobody can remove it from the page cache,
so its refcount cannot reach 0 until the last folio_put() of the sequence.
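
For anyone who wants to check the arithmetic, the sequence above can be replayed with a userspace toy model. This is only a sketch and not kernel code: struct toy_folio, every toy_* helper and simulate_readahead_bailout() are hypothetical names invented for illustration. They model nothing except the counter, the lock bit and page cache membership, and they assume no other thread touches the folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio: only the state the walk-through uses. */
struct toy_folio {
	int refcount;
	bool locked;
	bool in_page_cache;
	bool freed;
};

static void toy_folio_get(struct toy_folio *f)
{
	assert(!f->freed);
	f->refcount++;
}

static void toy_folio_put(struct toy_folio *f)
{
	assert(f->refcount > 0);		/* an imbalance would trip here */
	if (--f->refcount == 0) {
		assert(!f->in_page_cache);	/* the page cache still holds a ref otherwise */
		f->freed = true;
	}
}

/* Models filemap_alloc_folio(): the caller owns the first reference. */
static void toy_alloc(struct toy_folio *f)
{
	*f = (struct toy_folio){ .refcount = 1 };
}

/* Models filemap_add_folio(): the folio is locked and the page cache takes a ref. */
static void toy_add_to_page_cache(struct toy_folio *f)
{
	f->locked = true;
	f->in_page_cache = true;
	toy_folio_get(f);
}

/* Models filemap_remove_folio(): needs the lock, drops the page cache's ref. */
static void toy_remove_from_page_cache(struct toy_folio *f)
{
	assert(f->locked && f->in_page_cache);
	f->in_page_cache = false;
	toy_folio_put(f);
}

#define TRACE_STEPS 6

/* Replays the sequence from the mail, recording the refcount after each step. */
static int simulate_readahead_bailout(int trace[TRACE_STEPS])
{
	struct toy_folio f;
	size_t n = 0;

	toy_alloc(&f);			trace[n++] = f.refcount;
	toy_add_to_page_cache(&f);	trace[n++] = f.refcount;
	/* ->readahead() stops early, so read_pages() cleans up itself. */
	toy_folio_put(&f);		trace[n++] = f.refcount; /* readahead_folio() */
	toy_folio_get(&f);		trace[n++] = f.refcount;
	toy_remove_from_page_cache(&f);	trace[n++] = f.refcount;
	f.locked = false;		/* folio_unlock() */
	toy_folio_put(&f);		trace[n++] = f.refcount;

	return f.freed ? 0 : -1;
}
```

Calling simulate_readahead_bailout(trace) from a small main() fills trace with 1, 2, 1, 2, 1, 0 and returns 0. None of the assertions fire, which is the point: in this model the puts and gets balance, and the count only reaches zero at the final put.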