Received: by 2002:a89:2d5:0:b0:1ef:f8eb:5d24 with SMTP id d21csp124666lqs; Sun, 17 Mar 2024 13:46:57 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCV8YXqYD9RniLORaXMXyoayEH8WznTzEJRDkvtYD3huMLZzR5T/PDjwVMnNCJt8MIo49xgSbv/1mOayhVfPvckn01ih/e1Zxj+gGMCHSg== X-Google-Smtp-Source: AGHT+IHzUvny5PqGpV9+nHfvWnPFoOvJjKE1GlFFHNfNrvAFBRL8J8bLaP6RZX+fJhglg7WCEWnd X-Received: by 2002:a0c:f3c5:0:b0:690:afbf:56d1 with SMTP id f5-20020a0cf3c5000000b00690afbf56d1mr11668213qvm.8.1710708417532; Sun, 17 Mar 2024 13:46:57 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1710708417; cv=pass; d=google.com; s=arc-20160816; b=uj1wEZTvxNGfp3b/wW71QPgtkT501yk4kfqGPCuE37ZgqDtXm/ah/ckkHN8o0EoN5r N8Pvf0v64wBP+zUUfIPMtV6kJMdMkNJFDEqF7UF768yD5q4fufOsaYqmV/3xlnEUPK9H 5IAukCMgr6Uhb6XiYWO/YnMu3d2zB1+FFBKp10bj6D5ChtCDYM2hp3TXB7bNDDhGc3AH 38J5Nuy2c8uarTY68xOjP+btXuMpTj6rcoFEfnxB9FY75xZuYjSCqi766bEKPlWF3KQU sSQMABSYKfO/vC5mu2M7c/ILvi0xmUdmgWS73YTCbXSkNrdJU3glnjN0/jAibot429uB 2Q5w== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:date:dkim-signature; bh=v6PWbj/mXZq7FDrSc1NJ3BLmb2CI422t7EcGhIN71c0=; fh=TLld1Q412t9S/orOK3RsaoQPfD9IBBaDKbvTHow21tU=; b=XDd9+K2gm9RF902yQtJ+hrx6RqvwNnytBj0q4dDe9nVgmzCrWlyguPRTYr3jKuS7lZ W3Wjv5UgmLhiK5AbnEjIaWYyUrLj8UPLOFyNz5EtYd1xGXro2+/NPk4dMhNvvYYejPx3 pmwAwAJuAe5XKJR56i5ZruPu8U3amVdsFQcS2g2UIEdczwFtdeU8a8C1Ox7ff+EDA1CZ pRg/bQkul0gHS+nbRFnQKdgIT9vVqW7SZeCAGkinUYMHXhp2WCWmHJoH0V8R5qLgoMJU yIxEl8BPB606tc70FE6kIBPTvZ4b+rF+DLLrpQG/3u7roSGvAnojY8U1GrWvl35lyr4M EVwg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=b92MQ+KJ; arc=pass (i=1 dkim=pass dkdomain=infradead.org); spf=pass (google.com: domain of linux-kernel+bounces-105553-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-105553-linux.lists.archive=gmail.com@vger.kernel.org" Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id gv10-20020a056214262a00b006911cf57909si7351356qvb.463.2024.03.17.13.46.57 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 17 Mar 2024 13:46:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-105553-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=b92MQ+KJ; arc=pass (i=1 dkim=pass dkdomain=infradead.org); spf=pass (google.com: domain of linux-kernel+bounces-105553-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-105553-linux.lists.archive=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 217141C20D55 for ; Sun, 17 Mar 2024 20:46:57 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id C34D5200CB; Sun, 17 Mar 2024 20:46:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="b92MQ+KJ" Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5EEED1EF0D for ; Sun, 17 Mar 2024 20:46:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=90.155.50.34 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710708409; cv=none; b=PzzGBCfC5Z8Rt8Fahfo46eSGGj/FCzo5kIBoA6QiWxHSeVPIo8YtbMNE5P0B+fnt5E9JAbwSD+RvYop42ynYd6to8Bd6/HnGMIqSTwOBp8b8vxT70BiklFAFqZi3VBr/4HqJz41h9Ehqd23U3+bEYRnAtiUPRnTn39x+a+YH2rA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710708409; c=relaxed/simple; bh=MR/e1rNpFem2d28l9M1/gFdtZHul48yHvn/tMsBQq9g=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=uGPi8F1+GK1hR5cP+RxxIONpO8Ei/M6ozuqVC/qAJs13/Ww63xTkJ6B4GaJ+5gXBiXG4N5i0QQGRNTh5u3tQaPLt2nJEkqsHaoolY71GhZfpEWX4BqW3gKwNy4bgN4ViYZr38MfTYLVmUJpFFfHUoxIIsxxQ7lYljvu1P4LQNA8= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=b92MQ+KJ; arc=none smtp.client-ip=90.155.50.34 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=v6PWbj/mXZq7FDrSc1NJ3BLmb2CI422t7EcGhIN71c0=; b=b92MQ+KJGg4tspLYmBItbm4l83 IkbCPvs2rjvKRjIPRJwzYv1dmMvG7WpuxZ/EEl3BaJYW9DUo1U7ZVIunZHp+7f/J16AhmNhl/VYli E7GYw/x3YBvKCcDtb6EFui+Tc6rgX3kI2+fNWRUNLWTiwmxG9ZrmoJ5tNgjzGbn09zr/nVTRuSDe9 MgA4ci1rda8vUT3xsrQUJ/2mxgxOKsRGknRsch8ER4w3EDmU/2G8k1WUB6cQXOy9MZmLjTa1RtnFk TQgLY+pw7DChRU0ziEFKl9IjBaUPtd2iE8UMq6h+zlG7LJ8yHaj5jrLESfL+kUbncOwxW6AhoQdXS 10xMDw+w==; Received: from willy by casper.infradead.org with local (Exim 4.97.1 #2 (Red Hat Linux)) id 1rlxOt-0000000FZ76-21n6; Sun, 17 Mar 2024 20:46:35 +0000 Date: Sun, 17 Mar 2024 20:46:35 +0000 From: Matthew Wilcox To: Zhaoyang Huang Cc: "zhaoyang.huang" , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, steve.kang@unisoc.com Subject: Re: [PATCH] mm: fix a race scenario in folio_isolate_lru Message-ID: References: <20240314083921.1146937-1-zhaoyang.huang@unisoc.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Sun, Mar 17, 2024 at 12:07:40PM +0800, Zhaoyang Huang wrote: > Could it be this scenario, where folio comes from pte(thread 0), local > fbatch(thread 1) and page cache(thread 2) concurrently and proceed > intermixed without lock's protection? Actually, IMO, thread 1 also > could see the folio with refcnt==1 since it doesn't care if the page > is on the page cache or not. > > madivise_cold_and_pageout does no explicit folio_get thing since the > folio comes from pte which implies it has one refcnt from pagecache Mmm, no. It's implicit, but madvise_cold_or_pageout_pte_range() does guarantee that the folio has at least one refcount. Since we get the folio from vm_normal_folio(vma, addr, ptent); we know that there is at least one mapcount on the folio. refcount is always >= mapcount. Since we hold pte_offset_map_lock(), we know that mapcount (and therefore refcount) cannot be decremented until we call pte_unmap_unlock(), which we don't do until we have called folio_isolate_lru(). Good try though, took me a few minutes of looking at it to convince myself that it was safe. Something to bear in mind is that if the race you outline is real, failing to hold a refcount on the folio leaves the caller susceptible to the VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio); if the other thread calls folio_put(). I can't understand any of the scenarios you outline below. Please try again without relying on indentation. > #thread 0(madivise_cold_and_pageout) #1 > (lru_add_drain->fbatch_release_pages) > #2(read_pages->filemap_remove_folios) > refcnt == 1(represent page cache) > > refcnt==2(another one represent LRU) > folio comes from page cache > folio_isolate_lru > release_pages > filemap_free_folio > > > refcnt==1(decrease the one of page cache) > > folio_put_testzero == true > > > > list_add(folio->lru, pages_to_free) //current folio will break LRU's > integrity since it has not been deleted > > In case of gmail's wrap, split above chart to two parts > > #thread 0(madivise_cold_and_pageout) #1 > (lru_add_drain->fbatch_release_pages) > refcnt == 1(represent page cache) > > refcnt==2(another one represent LRU) > folio_isolate_lru release_pages > > folio_put_testzero == true > > > > list_add(folio->lru, pages_to_free) > > //current folio will break LRU's integrity since it has not been > deleted > > #1 (lru_add_drain->fbatch_release_pages) > #2(read_pages->filemap_remove_folios) > refcnt==2(another one represent LRU) > folio comes from page cache > release_pages > filemap_free_folio > > refcnt==1(decrease the one of page cache) > folio_put_testzero == true > > list_add(folio->lru, pages_to_free) > //current folio will break LRU's integrity since it has not been deleted > > > > > #0 folio_isolate_lru #1 release_pages > > > BUG_ON(!folio_refcnt) > > > if (folio_put_testzero()) > > > folio_get(folio) > > > if (folio_test_clear_lru())