Date: Thu, 13 Jun 2019 10:27:31 -0400
From: Josef Bacik
To: Naohiro Aota
Cc: linux-btrfs@vger.kernel.org, David Sterba, Chris Mason, Josef Bacik,
 Qu Wenruo, Nikolay Borisov, linux-kernel@vger.kernel.org, Hannes Reinecke,
 linux-fsdevel@vger.kernel.org, Damien Le Moal, Matias Bjørling,
 Johannes Thumshirn, Bart Van Assche
Subject: Re: [PATCH 17/19] btrfs: shrink delayed allocation size in HMZONED mode
Message-ID: <20190613142731.mgitehmuyz2566no@MacBook-Pro-91.local>
References: <20190607131025.31996-1-naohiro.aota@wdc.com> <20190607131025.31996-18-naohiro.aota@wdc.com>
In-Reply-To: <20190607131025.31996-18-naohiro.aota@wdc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 07, 2019 at 10:10:23PM +0900, Naohiro Aota wrote:
> In a write heavy workload, the following scenario can occur:
>
> 1. mark page #0 to page #2 (and their corresponding extent region) as dirty
>    and candidate for delayed allocation
>
>    pages    0 1 2 3 4
>    dirty    o o o - -
>    towrite  - - - - -
>    delayed  o o o - -
>    alloc
>
> 2. extent_write_cache_pages() mark dirty pages as TOWRITE
>
>    pages    0 1 2 3 4
>    dirty    o o o - -
>    towrite  o o o - -
>    delayed  o o o - -
>    alloc
>
> 3. Meanwhile, another write dirties page #3 and page #4
>
>    pages    0 1 2 3 4
>    dirty    o o o o o
>    towrite  o o o - -
>    delayed  o o o o o
>    alloc
>
> 4. find_lock_delalloc_range() decide to allocate a region to write page #0
>    to page #4
> 5. but, extent_write_cache_pages() only initiate write to TOWRITE tagged
>    pages (#0 to #2)
>
> So the above process leaves page #3 and page #4 behind. Usually, the
> periodic dirty flush kicks write IOs for page #3 and #4. However, if we try
> to mount a subvolume at this timing, mount process takes s_umount write
> lock to block the periodic flush to come in.
>
> To deal with the problem, shrink the delayed allocation region to have only
> expected to be written pages.
>
> Signed-off-by: Naohiro Aota
> ---
>  fs/btrfs/extent_io.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index c73c69e2bef4..ea582ff85c73 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -3310,6 +3310,33 @@ static noinline_for_stack int writepage_delalloc(struct inode *inode,
>  			delalloc_start = delalloc_end + 1;
>  			continue;
>  		}
> +
> +		if (btrfs_fs_incompat(btrfs_sb(inode->i_sb), HMZONED) &&
> +		    (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages) &&
> +		    ((delalloc_start >> PAGE_SHIFT) <
> +		     (delalloc_end >> PAGE_SHIFT))) {
> +			unsigned long i;
> +			unsigned long end_index = delalloc_end >> PAGE_SHIFT;
> +
> +			for (i = delalloc_start >> PAGE_SHIFT;
> +			     i <= end_index; i++)
> +				if (!xa_get_mark(&inode->i_mapping->i_pages, i,
> +						 PAGECACHE_TAG_TOWRITE))
> +					break;
> +
> +			if (i <= end_index) {
> +				u64 unlock_start = (u64)i << PAGE_SHIFT;
> +
> +				if (i == delalloc_start >> PAGE_SHIFT)
> +					unlock_start += PAGE_SIZE;
> +
> +				unlock_extent(tree, unlock_start, delalloc_end);
> +				__unlock_for_delalloc(inode, page, unlock_start,
> +						      delalloc_end);
> +				delalloc_end = unlock_start - 1;
> +			}
> +		}
> +

Helper please.  Really for all this hmzoned stuff I want it segregated as much
as possible so when I'm debugging or cleaning other stuff up I want to easily be
able to say "oh this is for zoned devices, it doesn't matter."  Thanks,

Josef
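
Editorial note: as an illustration of the kind of helper requested above, here
is a minimal userspace sketch of the range-trimming logic from the quoted hunk.
It is not code from the thread. The function name
hmzoned_shrink_delalloc_end(), its signature, and the towrite[] array (which
stands in for the xa_get_mark(..., PAGECACHE_TAG_TOWRITE) lookup) are all
assumptions made for the example, and PAGE_SHIFT is fixed at 12 here. A real
kernel helper would also have to call unlock_extent() and
__unlock_for_delalloc() on the trimmed tail, which this model leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                     /* assumed 4 KiB pages for this model */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/*
 * Hypothetical helper: return the (possibly shrunk) end of the delalloc
 * range so that it covers only the leading run of TOWRITE-tagged pages.
 * towrite[i] models the per-page TOWRITE tag lookup done in the patch.
 */
static uint64_t hmzoned_shrink_delalloc_end(const bool *towrite,
					    uint64_t delalloc_start,
					    uint64_t delalloc_end)
{
	uint64_t first = delalloc_start >> PAGE_SHIFT;
	uint64_t last = delalloc_end >> PAGE_SHIFT;
	uint64_t unlock_start;
	uint64_t i;

	/* A range within a single page is never shrunk. */
	if (first >= last)
		return delalloc_end;

	/* Find the first page in the range that is not tagged TOWRITE. */
	for (i = first; i <= last; i++)
		if (!towrite[i])
			break;

	/* Every page is tagged: keep the range as is. */
	if (i > last)
		return delalloc_end;

	unlock_start = i << PAGE_SHIFT;
	/* As in the patch, always keep at least the first page. */
	if (i == first)
		unlock_start += PAGE_SIZE;

	return unlock_start - 1;
}
```

With the scenario from the commit message (pages #0 to #2 tagged TOWRITE,
pages #3 and #4 not) and a range covering pages #0 to #4, this model trims the
range to end at the last byte of page #2. The call site in
writepage_delalloc() would then reduce to a single guarded call, which keeps
the zoned-device logic in one place.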