From: alexander.levin@verizon.com
To: Jan Kara
CC: Johannes Weiner, Andrew Morton, Tejun Heo, Hugh Dickins,
    Michel Lespinasse, "Kirill A. Shutemov", "linux-mm@kvack.org",
    "linux-fsdevel@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: [patch 1/3] mm: protect set_page_dirty() from ongoing truncation
Date: Mon, 10 Apr 2017 15:07:58 +0000
Message-ID: <20170410150755.kd2gjqyfmvschtxd@sasha-lappy>
References: <1417791166-32226-1-git-send-email-hannes@cmpxchg.org>
    <20170410022230.xe5sukvflvoh4ula@sasha-lappy>
    <20170410120638.GD3224@quack2.suse.cz>
In-Reply-To: <20170410120638.GD3224@quack2.suse.cz>
User-Agent: Mutt/1.6.2-neo (2016-08-21)

On Mon, Apr 10, 2017 at 02:06:38PM +0200, Jan Kara wrote:
> On Mon 10-04-17 02:22:33, alexander.levin@verizon.com wrote:
> > On Fri, Dec 05, 2014 at 09:52:44AM -0500, Johannes Weiner wrote:
> > > Tejun, while reviewing the code, spotted the following race condition
> > > between the dirtying and truncation of a page:
> > >
> > > __set_page_dirty_nobuffers()       __delete_from_page_cache()
> > >   if (TestSetPageDirty(page))
> > >                                      page->mapping = NULL
> > >                                      if (PageDirty())
> > >                                        dec_zone_page_state(page, NR_FILE_DIRTY);
> > >                                        dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
> > >     if (page->mapping)
> > >       account_page_dirtied(page)
> > >         __inc_zone_page_state(page, NR_FILE_DIRTY);
> > >         __inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
> > >
> > > which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.
> > >
> > > Dirtiers usually lock out truncation, either by holding the page lock
> > > directly, or in case of zap_pte_range(), by pinning the mapcount with
> > > the page table lock held. The notable exception to this rule, though,
> > > is do_wp_page(), for which this race exists. However, do_wp_page()
> > > already waits for a locked page to unlock before setting the dirty
> > > bit, in order to prevent a race where clear_page_dirty() misses the
> > > page bit in the presence of dirty ptes. Upgrade that wait to a fully
> > > locked set_page_dirty() to also cover the situation explained above.
> > >
> > > Afterwards, the code in set_page_dirty() dealing with a truncation
> > > race is no longer needed. Remove it.
> > >
> > > Reported-by: Tejun Heo
> > > Signed-off-by: Johannes Weiner
> > > Cc:
> > > Acked-by: Kirill A. Shutemov
> >
> > Hi Johannes,
> >
> > I'm seeing the following while fuzzing with trinity on linux-next (I've changed
> > the WARN to a VM_BUG_ON_PAGE for some extra page info).
>
> But this looks more like a bug in 9p which allows v9fs_write_end() to dirty
> a !Uptodate page?

I thought that 77469c3f5 ("9p: saner ->write_end() on failing copy into
non-uptodate page") prevented that from happening, but that's actually the
change that's causing it (I ended up misreading it last night).

Will fix it as follows:

diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index adaf6f6..be84c0c 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -310,9 +310,13 @@ static int v9fs_write_end(struct file *filp, struct address_space *mapping,
 
 	p9_debug(P9_DEBUG_VFS, "filp %p, mapping %p\n", filp, mapping);
 
-	if (unlikely(copied < len && !PageUptodate(page))) {
-		copied = 0;
-		goto out;
+	if (!PageUptodate(page)) {
+		if (unlikely(copied < len)) {
+			copied = 0;
+			goto out;
+		} else {
+			SetPageUptodate(page);
+		}
 	}
 	/*
 	 * No need to use i_size_read() here, the i_size

-- 

Thanks,
Sasha