Received: by 2002:ab2:7855:0:b0:1f9:5764:f03e with SMTP id m21csp341444lqp; Wed, 22 May 2024 06:22:10 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVfQT4PlBGCQlxS0vZJy4kgLx7hHD+LsiCIydSNB4kkqCP7bsM5m3gn5mm2kWA8EXWj+6aXPUytPRvCvzGOzFpX/bWpI+oZ5+tMbdz9zQ== X-Google-Smtp-Source: AGHT+IG0/eVv1eiRe07d93YTcjPPe20u/q/q1WwTEkrAhahlcbSDcMFLuAYULZ1aSyQKlccM7dNg X-Received: by 2002:a05:622a:148b:b0:43e:1eaa:131d with SMTP id d75a77b69052e-43f9e1d84c6mr27188191cf.60.1716384130371; Wed, 22 May 2024 06:22:10 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1716384130; cv=pass; d=google.com; s=arc-20160816; b=KsKFwdH4ke/+wJ4LsQDEDUsLaIMdWtxHwz7MtowGmjJ/sr6rrEt9o6tz6xM3tVJAX6 6NXITBS/jlhFD+hqW5dUnyx5vlKphA6dqxTM9MQMV7Q/MfaWSkgNGEQsw9+2dOncChFV OKpkJ8lPQbNYax9Km5sV0IW1FRVdzszzY3iyEo9rw9RWUlQYLMsV64bLEPqDeZnriGPH QBCoGBtuxzucq7H/NxCSY28iFBzG1/jYvDV55h1/dHaLBlxu/vyLmJW3UeOh9IuHDv9T 9QmztzWSEgDJpD1vD4sM/t7YIaJWSKT9b4qqZo1Rgp+UD1N7beLGyYnTIWaRB3+2yNaO U5YQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:list-unsubscribe:list-subscribe:list-id:precedence :message-id:date:references:in-reply-to:subject:cc:to:from :dkim-signature; bh=w3U75sxnOeZdFGOwo1OVSElYeJTqXT6eZMDMxenHS3Y=; fh=q5AlReaN5GPQUDd+5Fo/EWCdTegVs4rKugmMlidGiBE=; b=Vm/RQVmqHCyirEzJd+/7kM6Fxo+i+PmO3tCq/klsNXuf1pDjUWPHAiLawpUhe6v6Md WT7I6vhJcXo4lSJGc1Mdnc0diftYMiB9/SpiS7Rf2tdb7cnDVE+0XXwRwY2eqawxIvZI 25tFrgF1ukEpy9C54KSfxKudZxMZ1/rvABas2brTIMKFtqf6RO3PCX+sCeq9+2CgPEU5 lScGdovSXV/tjedrmBp2OGoYS8rjIlN82HJxT6SDJzQmGgXKT+XLmZhRlnoLSdn+IhV1 lGn1wuR3mXlr/tOqqBUoLe1U7baYpTyOZOq2mgdiNT+MczUJ86uASXCYMBRwqwU3s5S4 4C+Q==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=vOt2aZ2Y; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-ext4+bounces-2626-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-ext4+bounces-2626-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id d75a77b69052e-43e3908fca7si170558381cf.513.2024.05.22.06.22.10 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 May 2024 06:22:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4+bounces-2626-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=vOt2aZ2Y; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-ext4+bounces-2626-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-ext4+bounces-2626-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 0EC1B1C20D79 for ; Wed, 22 May 2024 13:22:10 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 279D013F011; Wed, 22 May 2024 13:21:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="vOt2aZ2Y" X-Original-To: linux-ext4@vger.kernel.org Received: from out-172.mta0.migadu.com (out-172.mta0.migadu.com [91.218.175.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8B9D313E8A0 for ; Wed, 22 May 2024 13:21:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716384115; cv=none; b=nE6B7AX9W5FCcSIE+wg4+UiJlF9w/+XlYVJPF+9OIKc/JAdp4USUycNX3jH3VGsuGSG1iX8cwiGjzBrPYwveoiiW1V6hr+p8iKQJU8iS6acF+HbaY7N5/hVT9skiEREOjPu+ul/sKQi6XlQXoknXgZkmB2aEOtUVMGETAxO0hFg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716384115; c=relaxed/simple; bh=mYSAmhsQtJ0PDxq3WdA7gK5lRtv9WoXaj1HB0gLlhTM=; h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID: MIME-Version:Content-Type; b=YzxRFINaIEUE7ej6/IdWuevOgiMlaKHianuUfL+GW8FzNxjOW7zdC+HC3WQM0lb9i4Y6XUMELq/uHloe6LqGnlbUmye25vxoxpqmKK7Ud3IRKfpX89/QdLHbyDDH97ADYJ7ywtZbUGVItvVo4XfKpUyfKYkWEiqPA1IIGB2p8r0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=vOt2aZ2Y; arc=none smtp.client-ip=91.218.175.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev X-Envelope-To: jack@suse.cz DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1716384110; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=w3U75sxnOeZdFGOwo1OVSElYeJTqXT6eZMDMxenHS3Y=; b=vOt2aZ2Y9/UJ1YxmIkQijMwxaEiEiRznlwvMdfhDBB9j0LYCWaPqhK4XOPVPyJVG36jIwT ahgBcq3oeBLlOJuBfd7AaWVeLe7jaxhxPsIDHOQygAFn/Snna7Zcb4HsplYNZSoch7vFpI y7JHs0igR4dZm4KpBXYdwnUAGxyp1F4= X-Envelope-To: tytso@mit.edu X-Envelope-To: linux-kernel@vger.kernel.org X-Envelope-To: luis.henriques@linux.dev X-Envelope-To: jack@suse.com X-Envelope-To: adilger@dilger.ca X-Envelope-To: linux-ext4@vger.kernel.org X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Luis Henriques To: Jan Kara Cc: "Luis Henriques (SUSE)" , Theodore Ts'o , Andreas Dilger , Jan Kara , linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH 1/2] ext4: fix fast commit inode enqueueing during a full journal commit In-Reply-To: <20240522103545.ypmmoyxvls52i6yl@quack3> (Jan Kara's message of "Wed, 22 May 2024 12:35:45 +0200") References: <20240521154535.12911-1-luis.henriques@linux.dev> <20240521154535.12911-2-luis.henriques@linux.dev> <20240522103545.ypmmoyxvls52i6yl@quack3> Date: Wed, 22 May 2024 14:21:47 +0100 Message-ID: <87pltedlsk.fsf@brahms.olymp> Precedence: bulk X-Mailing-List: linux-ext4@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain X-Migadu-Flow: FLOW_OUT On Wed 22 May 2024 12:35:45 PM +02, Jan Kara wrote; > On Tue 21-05-24 16:45:34, Luis Henriques (SUSE) wrote: >> When a full journal commit is on-going, any fast commit has to be enqueued >> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing >> is done only once, i.e. if an inode is already queued in a previous fast >> commit entry it won't be enqueued again. However, if a full commit starts >> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to >> be done into FC_Q_STAGING. And this is not being done in function >> ext4_fc_track_template(). > > Ah, good catch. > >> This patch fixes the issue by simply re-enqueuing the inode from the MAIN >> into the STAGING queue. >> >> This bug was found using fstest generic/047. This test creates several 32k >> bytes files, sync'ing each of them after it's creation, and then shutting >> down the filesystem. Some data may be loss in this operation; for example a >> file may have it's size truncated to zero. >> >> Signed-off-by: Luis Henriques (SUSE) >> --- >> fs/ext4/fast_commit.c | 19 +++++++++++++------ >> 1 file changed, 13 insertions(+), 6 deletions(-) >> >> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c >> index 87c009e0c59a..337b5289cf11 100644 >> --- a/fs/ext4/fast_commit.c >> +++ b/fs/ext4/fast_commit.c >> @@ -396,12 +396,19 @@ static int ext4_fc_track_template( >> return ret; >> >> spin_lock(&sbi->s_fc_lock); >> - if (list_empty(&EXT4_I(inode)->i_fc_list)) >> - list_add_tail(&EXT4_I(inode)->i_fc_list, >> - (sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING || >> - sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) ? >> - &sbi->s_fc_q[FC_Q_STAGING] : >> - &sbi->s_fc_q[FC_Q_MAIN]); >> + if (sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING || >> + sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) { >> + if (list_empty(&EXT4_I(inode)->i_fc_list)) >> + list_add_tail(&EXT4_I(inode)->i_fc_list, >> + &sbi->s_fc_q[FC_Q_STAGING]); >> + else >> + list_move_tail(&EXT4_I(inode)->i_fc_list, >> + &sbi->s_fc_q[FC_Q_STAGING]); > > So I'm not sure this is actually safe. I'm concerned about the following > race: > > Task1 Task2 > > handle = ext4_journal_start(..) > modify inode_X > ext4_fc_track_inode(inode_X) > ext4_fsync(inode_X) > ext4_fc_commit() > jbd2_fc_begin_commit() > journal->j_flags |= JBD2_FAST_COMMIT_ONGOING; > ... > jbd2_journal_lock_updates() > blocks waiting for handle of Task2 > modify inode_X > ext4_fc_track_inode(inode_X) > - moves inode out of FC_Q_MAIN > ext4_journal_stop() > fast commit proceeds but skips inode_X... Hmm... I see, the problem is deeper that I thought. > How we deal with a similar issue in jbd2 for ordinary buffers is that we > just mark the buffer as *also* belonging to the next transaction (by > setting jh->b_next_transaction) and during commit cleanup we move the bh to > the appropriate list of the next transaction. Here, we could mark the inode > as also being part of the next fast commit and during fastcommit cleanup we > could move it to FC_Q_STAGING which is then spliced back to FC_Q_MAIN. Yeah, I guess that would work. I'll need to add a new field to flag the 'next commit' in struct ext4_inode_info. I'll need to play a bit with it and see what I can came up with. Thanks for the suggestion. > Also Harshad has recently posted changes to fast commit code that modify > how fast commits are serialized (in particular jbd2_journal_lock_updates() > is gone). I didn't read them yet but your change definitely needs a careful > verification against those changes to make sure we don't introduce new data > integrity issues. > Right, I saw his patchset only after sending my RFC (and I should have probably included him on the CC as well; probably get_maintainer.pl isn't picking his email). I'll need to look at those changes too, which will probably take me some time as most of that code isn't familiar to me. Thanks a lot for your review, Jan. Much appreciated. Cheers, -- Luis >> + } else { >> + if (list_empty(&EXT4_I(inode)->i_fc_list)) >> + list_add_tail(&EXT4_I(inode)->i_fc_list, >> + &sbi->s_fc_q[FC_Q_MAIN]); >> + } >> spin_unlock(&sbi->s_fc_lock); >> >> return ret; > > Honza > -- > Jan Kara > SUSE Labs, CR