Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp241690rwl; Sat, 25 Mar 2023 01:15:49 -0700 (PDT) X-Google-Smtp-Source: AKy350aBo9FlEj526HLLJJcq/Kx3wR7oitOdeyy6JiK89ncQuUakg8RXQxE9wLjMan5jSfz4dPIP X-Received: by 2002:a17:906:ae53:b0:8b1:7ac6:3186 with SMTP id lf19-20020a170906ae5300b008b17ac63186mr5425022ejb.68.1679732149316; Sat, 25 Mar 2023 01:15:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679732149; cv=none; d=google.com; s=arc-20160816; b=MTWNRN044DxUrk8lx8tdzugqoZ5SS6s34v//wxGRqmhgybE6xFAtlVpxxJWWPo+x9j 2ysCMy0FQHJtVHqHDkeAs+Ekz+7T+oqGweZNMhGkHiLRFTJsW4gWX1qA1A5pi3SntBxM NvJaykh7jryGqdwUTb3PE31XX8y6YCQzckvjKNJ5iDNjlfHoJdHssAHtpe8jRAEMHCri dzXTG7thGTDfi7KpE4AewjwGQN2InIKQiXyt206GGv0SqMkEhCEYZG9834qzyHIESPWa l829f29/AFezqSrlqqK73Ou1491ryn48kpd+zV1e/XMssO4AoQad+EFM5ATNnzeGe0J0 AWug== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:content-transfer-encoding :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=137IwdRd25qQg7zap7GpaN8RZvC7sCCGJZQuGgmDSWA=; b=q1/+sVTmGs7B0RJaMY512EPrNSw3RP+W/Uyjs56z0M3T89MfsIMvkCxaHv6YlwQZa+ hvjKItbpTZa9PUAZkd7lGm0ukiOLDVpnz0WvFbrAqfQDfYHVk3opVw4whO8zhK+Z+v9M i23xVDwhPwG97lyDKQMbpghbFWFaznrBqstaVRb6y/4JhycrxJnJf7OWeXrCMrL+jI+D 2OlcTkN5lKVozRK+h+YOHdSPLpXFugJ53pjqTs0MxT2C/7bHF8yhY9urJaKJIks8i0xp CuFZpE1Z+pkQXrnKLJ+hJN1lQRsR/ADbySw6o3DVcE6fR2JmdYpAT2v9DrFaEFI7qMx+ JBzg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=pr4ppiY4; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i17-20020a056402055100b00501da7aff57si11670735edx.10.2023.03.25.01.15.25; Sat, 25 Mar 2023 01:15:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=pr4ppiY4; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231968AbjCYIOg (ORCPT + 99 others); Sat, 25 Mar 2023 04:14:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54982 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229722AbjCYIOY (ORCPT ); Sat, 25 Mar 2023 04:14:24 -0400 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 738FCFF08; Sat, 25 Mar 2023 01:14:11 -0700 (PDT) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 32P7KIOc009886; Sat, 25 Mar 2023 08:14:07 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : mime-version; s=pp1; bh=137IwdRd25qQg7zap7GpaN8RZvC7sCCGJZQuGgmDSWA=; b=pr4ppiY45utNjorXp0gnbnsy6lpLwRHTq8kldvMo9QkM6vLTIfbDjTBp6iaPJ2Wq2xfq ipAMRN+dqX6bu38ykxsFCHxFzBR0VxtZBzRUKyDs3B0VflFevsbqTACkjYqEFwXb3t2t PVPy3zE+Fpuj6OTIRQ/+h9p51wNnSBWnp+SV1CYZEFXT+KpI5d2mG4uneh5idChMqeHa T2WyEX1TuoHlsGeMiAT7f/Lq0pN1mrnvdvP6J0Gfmmt3FcyLy4E66i/ahwia7dUZ7AXH SmCBk10b5K77oL5gfXNF91REaAvnGfIsBtGQG/UK0oAjNKirejvVfam3hA01ySGWwp2k 0Q== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3phvfgrmh1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sat, 25 Mar 2023 08:14:07 +0000 Received: from m0098404.ppops.net (m0098404.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 32P8CFE0018015; Sat, 25 Mar 2023 08:14:06 GMT Received: from ppma01fra.de.ibm.com (46.49.7a9f.ip4.static.sl-reverse.com [159.122.73.70]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3phvfgrmgp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sat, 25 Mar 2023 08:14:06 +0000 Received: from pps.filterd (ppma01fra.de.ibm.com [127.0.0.1]) by ppma01fra.de.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 32P2tKaN015705; Sat, 25 Mar 2023 08:14:04 GMT Received: from smtprelay04.fra02v.mail.ibm.com ([9.218.2.228]) by ppma01fra.de.ibm.com (PPS) with ESMTPS id 3phrk6g78u-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sat, 25 Mar 2023 08:14:04 +0000 Received: from smtpav02.fra02v.mail.ibm.com (smtpav02.fra02v.mail.ibm.com [10.20.54.101]) by smtprelay04.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 32P8E12R47776366 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sat, 25 Mar 2023 08:14:01 GMT Received: from smtpav02.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9B55420040; Sat, 25 Mar 2023 08:14:01 +0000 (GMT) Received: from smtpav02.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A4ABD20043; Sat, 25 Mar 2023 08:13:59 +0000 (GMT) Received: from li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com (unknown [9.43.63.61]) by smtpav02.fra02v.mail.ibm.com (Postfix) with ESMTP; Sat, 25 Mar 2023 08:13:59 +0000 (GMT) From: Ojaswin Mujoo To: linux-ext4@vger.kernel.org, "Theodore Ts'o" Cc: Ritesh Harjani , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara , rookxu , Ritesh Harjani Subject: [PATCH v6 6/9] ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa() Date: Sat, 25 Mar 2023 13:43:39 +0530 Message-Id: X-Mailer: git-send-email 2.31.1 In-Reply-To: References: X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: xji1ey5cfpT1jojSzSI5T0_193ahK7dY X-Proofpoint-GUID: pwNUWzkSlNPWvDyXwRr2afakyUmcwWSo Content-Transfer-Encoding: 8bit X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-03-24_11,2023-03-24_01,2023-02-09_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 clxscore=1015 bulkscore=0 mlxscore=0 priorityscore=1501 suspectscore=0 lowpriorityscore=0 adultscore=0 spamscore=0 impostorscore=0 malwarescore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303200000 definitions=main-2303250065 X-Spam-Status: No, score=-0.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org When the length of best extent found is less than the length of goal extent we need to make sure that the best extent atleast covers the start of the original request. This is done by adjusting the ac_b_ex.fe_logical (logical start) of the extent. While doing so, the current logic sometimes results in the best extent's logical range overflowing the goal extent. Since this best extent is later added to the inode preallocation list, we have a possibility of introducing overlapping preallocations. This is discussed in detail here [1]. As per Jan's suggestion, to fix this, replace the existing logic with the below logic for adjusting best extent as it keeps fragmentation in check while ensuring logical range of best extent doesn't overflow out of goal extent: 1. Check if best extent can be kept at end of goal range and still cover original start. 2. Else, check if best extent can be kept at start of goal range and still cover original start. 3. Else, keep the best extent at start of original request. Also, add a few extra BUG_ONs that might help catch errors faster. [1] https://lore.kernel.org/r/Y+OGkVvzPN0RMv0O@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com Suggested-by: Jan Kara Signed-off-by: Ojaswin Mujoo Reviewed-by: Ritesh Harjani (IBM) Reviewed-by: Jan Kara --- fs/ext4/mballoc.c | 49 ++++++++++++++++++++++++++++++----------------- 1 file changed, 31 insertions(+), 18 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 37bf6507cbfd..1304c95d8c59 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4328,6 +4328,7 @@ static void ext4_mb_use_inode_pa(struct ext4_allocation_context *ac, BUG_ON(start < pa->pa_pstart); BUG_ON(end > pa->pa_pstart + EXT4_C2B(sbi, pa->pa_len)); BUG_ON(pa->pa_free < len); + BUG_ON(ac->ac_b_ex.fe_len <= 0); pa->pa_free -= len; mb_debug(ac->ac_sb, "use %llu/%d from inode pa %p\n", start, len, pa); @@ -4666,10 +4667,8 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) pa = ac->ac_pa; if (ac->ac_b_ex.fe_len < ac->ac_g_ex.fe_len) { - int winl; - int wins; - int win; - int offs; + int new_bex_start; + int new_bex_end; /* we can't allocate as much as normalizer wants. * so, found space must get proper lstart @@ -4677,26 +4676,40 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) BUG_ON(ac->ac_g_ex.fe_logical > ac->ac_o_ex.fe_logical); BUG_ON(ac->ac_g_ex.fe_len < ac->ac_o_ex.fe_len); - /* we're limited by original request in that - * logical block must be covered any way - * winl is window we can move our chunk within */ - winl = ac->ac_o_ex.fe_logical - ac->ac_g_ex.fe_logical; + /* + * Use the below logic for adjusting best extent as it keeps + * fragmentation in check while ensuring logical range of best + * extent doesn't overflow out of goal extent: + * + * 1. Check if best ex can be kept at end of goal and still + * cover original start + * 2. Else, check if best ex can be kept at start of goal and + * still cover original start + * 3. Else, keep the best ex at start of original request. + */ + new_bex_end = ac->ac_g_ex.fe_logical + + EXT4_C2B(sbi, ac->ac_g_ex.fe_len); + new_bex_start = new_bex_end - EXT4_C2B(sbi, ac->ac_b_ex.fe_len); + if (ac->ac_o_ex.fe_logical >= new_bex_start) + goto adjust_bex; - /* also, we should cover whole original request */ - wins = EXT4_C2B(sbi, ac->ac_b_ex.fe_len - ac->ac_o_ex.fe_len); + new_bex_start = ac->ac_g_ex.fe_logical; + new_bex_end = + new_bex_start + EXT4_C2B(sbi, ac->ac_b_ex.fe_len); + if (ac->ac_o_ex.fe_logical < new_bex_end) + goto adjust_bex; - /* the smallest one defines real window */ - win = min(winl, wins); + new_bex_start = ac->ac_o_ex.fe_logical; + new_bex_end = + new_bex_start + EXT4_C2B(sbi, ac->ac_b_ex.fe_len); - offs = ac->ac_o_ex.fe_logical % - EXT4_C2B(sbi, ac->ac_b_ex.fe_len); - if (offs && offs < win) - win = offs; +adjust_bex: + ac->ac_b_ex.fe_logical = new_bex_start; - ac->ac_b_ex.fe_logical = ac->ac_o_ex.fe_logical - - EXT4_NUM_B2C(sbi, win); BUG_ON(ac->ac_o_ex.fe_logical < ac->ac_b_ex.fe_logical); BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len); + BUG_ON(new_bex_end > (ac->ac_g_ex.fe_logical + + EXT4_C2B(sbi, ac->ac_g_ex.fe_len))); } pa->pa_lstart = ac->ac_b_ex.fe_logical; -- 2.31.1