Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752777AbcCVWHZ (ORCPT ); Tue, 22 Mar 2016 18:07:25 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:52398 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751368AbcCVWHW (ORCPT ); Tue, 22 Mar 2016 18:07:22 -0400 Subject: Re: [PATCH 2/6] writeback: wb_start_writeback() should use WB_SYNC_ALL for WB_REASON_SYNC To: Dave Chinner References: <1458669320-6819-1-git-send-email-axboe@fb.com> <1458669320-6819-3-git-send-email-axboe@fb.com> <20160322213427.GR11812@dastard> <56F1BBCC.8050004@fb.com> <20160322220431.GT11812@dastard> CC: , , From: Jens Axboe Message-ID: <56F1C216.5050103@fb.com> Date: Tue, 22 Mar 2016 16:07:18 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <20160322220431.GT11812@dastard> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [192.168.54.13] X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-03-22_09:,, signatures=0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2811 Lines: 64 On 03/22/2016 04:04 PM, Dave Chinner wrote: > On Tue, Mar 22, 2016 at 03:40:28PM -0600, Jens Axboe wrote: >> On 03/22/2016 03:34 PM, Dave Chinner wrote: >>> On Tue, Mar 22, 2016 at 11:55:16AM -0600, Jens Axboe wrote: >>>> If you call sync, the initial call to wakeup_flusher_threads() ends up >>>> calling wb_start_writeback() with reason=WB_REASON_SYNC, but >>>> wb_start_writeback() always uses WB_SYNC_NONE as the writeback mode. >>>> Ensure that we use WB_SYNC_ALL for a sync operation. >>> >>> This seems wrong to me. We want background write to happen as >>> quickly as possible and /not block/ when we first kick sync. >> >> It's not going to block. wakeup_flusher_threads() async queues >> writeback work through wb_start_writeback(). > > The flusher threads block, not the initial wakeup. e.g. they will > now block waiting for data writeback to complete before writing the > inode. i.e. this code in __writeback_single_inode() is now triggered > by the background flusher: > > /* > * Make sure to wait on the data before writing out the metadata. > * This is important for filesystems that modify metadata on data > * I/O completion. We don't do it for sync(2) writeback because it has a > * separate, external IO completion path and ->sync_fs for guaranteeing > * inode metadata is written back correctly. > */ > if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync) { > int err = filemap_fdatawait(mapping); > if (ret == 0) > ret = err; > } Yeah, that's not ideal, for this case we'd really like something that WB_SYNC_ALL_NOWAIT... > It also changes the writeback chunk size in write_cache_pages(), so > instead of doing a bit of writeback from all dirty inodes, it tries > to write everything from each inode in turn (see > writeback_chunk_size()) which will further exacerbate the wait > above. But that part is fine, if it wasn't for the waiting. >>> The latter blocking passes of sync use WB_SYNC_ALL to ensure that we >>> block waiting for all remaining IO to be issued and waited on, but >>> the background writeback doesn't need to do this. >> >> That's fine, they can get to wait on the previously issued IO, which >> was properly submitted with WB_SYNC_ALL. >> >> Maybe I'm missing your point? > > Making the background flusher block and wait for data makes it > completely ineffective in speeding up sync() processing... Agree, we should not wait on the pages individually, we want them submitted and then waited on. And I suppose it's no differently than handling the normal buffered write from an application, which then gets waited on with fsync() or similar. So I can drop this patch, it'll work fine without it. -- Jens Axboe