From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Dmitry Monakhov, Theodore Tso, Sasha Levin
Subject: [PATCH 5.4 001/168] ext4: fix extent_status fragmentation for plain files
Date: Tue, 28 Apr 2020 20:22:55 +0200
Message-Id: <20200428182231.887191135@linuxfoundation.org>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200428182231.704304409@linuxfoundation.org>
References: <20200428182231.704304409@linuxfoundation.org>
User-Agent: quilt/0.66
X-stable: review
X-Patchwork-Hint: ignore
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Dmitry Monakhov

[ Upstream commit 4068664e3cd2312610ceac05b74c4cf1853b8325 ]

Extents are cached in read_extent_tree_block(); as a result, extents
are not cached for inodes with depth == 0 when we try to find the
extent using ext4_find_extent(). The result of the lookup is cached
in ext4_map_blocks() but is only a subset of the extent on disk. As a
result, the contents of extents status cache can get very badly
fragmented for certain workloads, such as a random 4k read workload.

File size of /mnt/test is 33554432 (8192 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..    8191:      40960..     49151:   8192:             last,eof

$ perf record -e 'ext4:ext4_es_*' /root/bin/fio --name=t --direct=0 --rw=randread --bs=4k --filesize=32M --size=32M --filename=/mnt/test
$ perf script | grep ext4_es_insert_extent | head -n 10
fio 131 [000] 13.975421: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [494/1) mapped 41454 status W
fio 131 [000] 13.975939: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [6064/1) mapped 47024 status W
fio 131 [000] 13.976467: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [6907/1) mapped 47867 status W
fio 131 [000] 13.976937: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [3850/1) mapped 44810 status W
fio 131 [000] 13.977440: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [3292/1) mapped 44252 status W
fio 131 [000] 13.977931: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [6882/1) mapped 47842 status W
fio 131 [000] 13.978376: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [3117/1) mapped 44077 status W
fio 131 [000] 13.978957: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [2896/1) mapped 43856 status W
fio 131 [000] 13.979474: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [7479/1) mapped 48439 status W

Fix this by caching the extents for inodes with depth == 0 in
ext4_find_extent().

[ Renamed ext4_es_cache_extents() to ext4_cache_extents() since this
  newly added function is not in extents_cache.c, and to avoid
  potential visual confusion with ext4_es_cache_extent(). -TYT ]

Signed-off-by: Dmitry Monakhov
Link: https://lore.kernel.org/r/20191106122502.19986-1-dmonakhov@gmail.com
Signed-off-by: Theodore Ts'o
Signed-off-by: Sasha Levin
---
 fs/ext4/extents.c | 47 +++++++++++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 164dbfd40c52d..9bd44588eb77c 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -498,6 +498,30 @@ int ext4_ext_check_inode(struct inode *inode)
 	return ext4_ext_check(inode, ext_inode_hdr(inode), ext_depth(inode), 0);
 }
 
+static void ext4_cache_extents(struct inode *inode,
+			       struct ext4_extent_header *eh)
+{
+	struct ext4_extent *ex = EXT_FIRST_EXTENT(eh);
+	ext4_lblk_t prev = 0;
+	int i;
+
+	for (i = le16_to_cpu(eh->eh_entries); i > 0; i--, ex++) {
+		unsigned int status = EXTENT_STATUS_WRITTEN;
+		ext4_lblk_t lblk = le32_to_cpu(ex->ee_block);
+		int len = ext4_ext_get_actual_len(ex);
+
+		if (prev && (prev != lblk))
+			ext4_es_cache_extent(inode, prev, lblk - prev, ~0,
+					     EXTENT_STATUS_HOLE);
+
+		if (ext4_ext_is_unwritten(ex))
+			status = EXTENT_STATUS_UNWRITTEN;
+		ext4_es_cache_extent(inode, lblk, len,
+				     ext4_ext_pblock(ex), status);
+		prev = lblk + len;
+	}
+}
+
 static struct buffer_head *
 __read_extent_tree_block(const char *function, unsigned int line,
 			 struct inode *inode, ext4_fsblk_t pblk, int depth,
@@ -532,26 +556,7 @@ __read_extent_tree_block(const char *function, unsigned int line,
 	 */
 	if (!(flags & EXT4_EX_NOCACHE) && depth == 0) {
 		struct ext4_extent_header *eh = ext_block_hdr(bh);
-		struct ext4_extent *ex = EXT_FIRST_EXTENT(eh);
-		ext4_lblk_t prev = 0;
-		int i;
-
-		for (i = le16_to_cpu(eh->eh_entries); i > 0; i--, ex++) {
-			unsigned int status = EXTENT_STATUS_WRITTEN;
-			ext4_lblk_t lblk = le32_to_cpu(ex->ee_block);
-			int len = ext4_ext_get_actual_len(ex);
-
-			if (prev && (prev != lblk))
-				ext4_es_cache_extent(inode, prev,
-						     lblk - prev, ~0,
-						     EXTENT_STATUS_HOLE);
-
-			if (ext4_ext_is_unwritten(ex))
-				status = EXTENT_STATUS_UNWRITTEN;
-			ext4_es_cache_extent(inode, lblk, len,
-					     ext4_ext_pblock(ex), status);
-			prev = lblk + len;
-		}
+		ext4_cache_extents(inode, eh);
 	}
 	return bh;
 errout:
@@ -899,6 +904,8 @@ ext4_find_extent(struct inode *inode, ext4_lblk_t block,
 	path[0].p_bh = NULL;
 
 	i = depth;
+	if (!(flags & EXT4_EX_NOCACHE) && depth == 0)
+		ext4_cache_extents(inode, eh);
 	/* walk through the tree */
 	while (i) {
 		ext_debug("depth %d: num %d, max %d\n",
-- 
2.20.1
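
For readers who want to follow the loop in ext4_cache_extents() without a kernel tree at hand, here is a minimal userspace sketch of the same walk. It is only a model under stated assumptions: the model_* types and model_cache_extents() are hypothetical names invented for this illustration, entries go into a plain array rather than the real extent status tree, and endianness conversion and the kernel's handling of overlapping entries are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these exist in the kernel. */
enum model_status { MODEL_WRITTEN, MODEL_UNWRITTEN, MODEL_HOLE };

struct model_extent {		/* one leaf entry of the extent tree */
	uint32_t lblk;		/* first logical block */
	uint32_t len;		/* length in blocks */
	uint64_t pblk;		/* first physical block */
	int unwritten;		/* non-zero for an unwritten extent */
};

struct model_es {		/* one entry of the modelled status cache */
	uint32_t lblk;
	uint32_t len;
	uint64_t pblk;
	enum model_status status;
};

/*
 * Walk the leaf entries in order, as the loop in the patch does: record one
 * cache entry per extent, plus one hole entry for each gap between the end
 * of the previous extent and the start of the current one.
 * Returns the number of entries written to out.
 */
static size_t model_cache_extents(const struct model_extent *ex, size_t n,
				  struct model_es *out, size_t cap)
{
	uint32_t prev = 0;	/* block just past the previous extent */
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		/* Same test as the patch: a gap is recorded only when prev != 0. */
		if (prev && prev != ex[i].lblk) {
			assert(used < cap);
			out[used++] = (struct model_es){
				prev, ex[i].lblk - prev, ~0ULL, MODEL_HOLE };
		}

		assert(used < cap);
		out[used++] = (struct model_es){
			ex[i].lblk, ex[i].len, ex[i].pblk,
			ex[i].unwritten ? MODEL_UNWRITTEN : MODEL_WRITTEN };
		prev = ex[i].lblk + ex[i].len;
	}
	return used;
}
```

Fed the single extent from the commit message (logical 0, length 8192, physical 40960), the model produces one entry covering the whole file, rather than the one-block entries visible in the perf trace. Because prev starts at 0, a gap before the first extent is not recorded as a hole, which mirrors the `prev &&` test in the patch.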