Received: by 2002:a05:6a10:9848:0:0:0:0 with SMTP id x8csp877758pxf; Thu, 8 Apr 2021 15:12:18 -0700 (PDT) X-Google-Smtp-Source: ABdhPJySZULDjiIrjBuzymlP6Q9Kt+kqYkoiJYs8TmQrAZGcbSgZAUBg4aDqt10KOPzkqeNhrjMA X-Received: by 2002:a65:6648:: with SMTP id z8mr10089956pgv.268.1617919938575; Thu, 08 Apr 2021 15:12:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1617919938; cv=none; d=google.com; s=arc-20160816; b=fyURC2SEz7LXcZ4AKzGbjhwQi4SJF1yQVRcEe/YGoXTd97QoCtpAibxUnPpvvJsBvi qKbg0YH6XjfdHSl0mjlSV38opSTR1xNpE4HeFrGJe4L4v9XYH6ZbpsrBBeMZJ/jgd2GO 53m9rCfrkvwD/aZxSSVllIiSylL4CtQSqFwp/txlAXMhP5Ucdl+GCo9lANfLd0o8/0JS zsC0y31rhTZkU0bhdQmNX5mNq65upraHzdIou+ZZ+fRWQlyS8A7m7nJlrqt2rBN9NYYu KpUUrUDJ1SeFl39RTYvYvasoKsIZvoRBoWUf9qwSgeLgrRCutFxmdOwj9RkVWCfmkCkk bTCw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:references:to:cc:in-reply-to:date:subject :mime-version:message-id:from:dkim-signature; bh=C80Mo1mdqkfUGzCOjeM9jPvl7FbavRoAbteSG3nDKMU=; b=on3G87AVs7kr0j5tOMS5feBh8MIhDcdmVTW1/i78i7zKxJ1opTh1QMWKnVZ8eCe/ur n8KkPHrP9jP8/GsXmhK1Q24cHydTYTKQj/HmAwFEOpZab0DjGI+qXRJDu3gmnAsZueft P0pTt7Rf1Lvm/e5QLwmbptADqyVYH339Mf0Bh2MRAQL7ksWT9lRCUWt7l4HpOb4jKhjR TjZUi989fA1782ZB2JQL1IbvsVTnmMFb7E+ffHfuVQ6prTNM2twshOBh1psAmw9QnKzd cK1cpWTmIF30r+IUzPVzf1WvcjaNWDAv2zkc4HIOVDeZB6mcqxo/MDRw067SVbPheTXf oKZA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@dilger-ca.20150623.gappssmtp.com header.s=20150623 header.b="rS/7l9KU"; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id l5si752360pji.45.2021.04.08.15.12.03; Thu, 08 Apr 2021 15:12:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@dilger-ca.20150623.gappssmtp.com header.s=20150623 header.b="rS/7l9KU"; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232774AbhDHWLx (ORCPT + 99 others); Thu, 8 Apr 2021 18:11:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40702 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232788AbhDHWLw (ORCPT ); Thu, 8 Apr 2021 18:11:52 -0400 Received: from mail-pf1-x429.google.com (mail-pf1-x429.google.com [IPv6:2607:f8b0:4864:20::429]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0768CC061761 for ; Thu, 8 Apr 2021 15:11:40 -0700 (PDT) Received: by mail-pf1-x429.google.com with SMTP id a12so2867934pfc.7 for ; Thu, 08 Apr 2021 15:11:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dilger-ca.20150623.gappssmtp.com; s=20150623; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=C80Mo1mdqkfUGzCOjeM9jPvl7FbavRoAbteSG3nDKMU=; b=rS/7l9KUqbfhS64Mq/5Qz9Jhdr8BRjcDNm+1amDclct1i2KnmhNjXcUh8aN7mX3u8M wjfDVLZQqCuiCNFc5c0Slf7gUKqlgjMtaQU3Zc0spp+8aA32iBOdj/OV+4FQACsikJMZ mUHw0I00Nd/k1Lqjq3q9aMcyAqqKI4ZiDzC5+oOMB4dldqPf5Y3o6B67YN730ibF5acE jQDA4JoXl3Wrh9hr7OHCEPasvgjIzS8B3A/p4BJ4co47vaLcZsNq5Mwr/bQn4Dspy3o9 tQ2za3fY6QjFRWrwq5KXcPQ33yJ5CvelYARobVFQHZdU7Izyu6wpWi2rJyohVr1ASYrB sttg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=C80Mo1mdqkfUGzCOjeM9jPvl7FbavRoAbteSG3nDKMU=; b=dY+yGjcjjdIybwkSaXten/OAZW23GRDCS2DLkVJMnxWOQdw9lpNzp12fhsXgKEEn3b wfR/3327jhcC+hb6RmtLU679kWydb9CtNlzbT3FS+D1VV/+fuOD5Am/SsaZ+l2346VYv pAmUB0DXRKQ3iVQDrkHEJScLUA30TX4n2hxJDAx5SS+tlCZmBDscb30jIVHWQE138vjq iXvvfqm+RzbisXvGbO2NN2DQGjIIQPilP0cbof4yUE5S2Y7jAl0Up2fgYYtCE0UXaQUO NGG/kNqOEq2fNYM0t2wuorJKLwgbbRgtGA5rgIsgMk/bZvYDpYAEIyteldMi/QYmQ3Mf eHqQ== X-Gm-Message-State: AOAM532oadV3RnOQ2lHWYtm/Dlq+EYDEx0VwlRDNTcdfxRGYyf6Huno2 P9qvdnpbA5CH7IzZqniNy3kGXg== X-Received: by 2002:a63:da0a:: with SMTP id c10mr10029888pgh.255.1617919899485; Thu, 08 Apr 2021 15:11:39 -0700 (PDT) Received: from cabot.adilger.int (S01061cabc081bf83.cg.shawcable.net. [70.77.221.9]) by smtp.gmail.com with ESMTPSA id w17sm368136pfu.29.2021.04.08.15.11.38 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 08 Apr 2021 15:11:38 -0700 (PDT) From: Andreas Dilger Message-Id: Content-Type: multipart/signed; boundary="Apple-Mail=_1B86BAB9-1EB2-4BD6-AD04-3C4C7927E3D8"; protocol="application/pgp-signature"; micalg=pgp-sha256 Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: [PATCH] ext4: add a configurable parameter to prevent endless loop in ext4_mb_discard_group_p Date: Thu, 8 Apr 2021 16:11:35 -0600 In-Reply-To: Cc: riteshh@linux.ibm.com, tytso@mit.edu, linux-ext4@vger.kernel.org, jack@suse.cz, linux-kernel@vger.kernel.org, baoyou.xie@alibaba-inc.com To: Wen Yang References: X-Mailer: Apple Mail (2.3273) Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org --Apple-Mail=_1B86BAB9-1EB2-4BD6-AD04-3C4C7927E3D8 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On Apr 8, 2021, at 12:50 PM, Wen Yang wrote: >=20 > Hi Ritesh and Andreas, >=20 > Thank you for your reply. Since there is still a faulty machine, we = have analyzed it again and found it is indeed a very special case: >=20 >=20 > crash> struct ext4_group_info ffff8813bb5f72d0 > struct ext4_group_info { > bb_state =3D 0, > bb_free_root =3D { > rb_node =3D 0x0 > }, > bb_first_free =3D 1681, > bb_free =3D 0, > bb_fragments =3D 0, > bb_largest_free_order =3D -1, > bb_prealloc_list =3D { > next =3D 0xffff880268291d78, > prev =3D 0xffff880268291d78 ---> *** The list is empty > }, > alloc_sem =3D { > count =3D { > counter =3D 0 > }, > wait_list =3D { > next =3D 0xffff8813bb5f7308, > prev =3D 0xffff8813bb5f7308 > }, > wait_lock =3D { > raw_lock =3D { > { > val =3D { > counter =3D 0 > }, > { > locked =3D 0 '\000', > pending =3D 0 '\000' > }, > { > locked_pending =3D 0, > tail =3D 0 > } > } > } > }, > osq =3D { > tail =3D { > counter =3D 0 > } > }, > owner =3D 0x0 > }, > bb_counters =3D 0xffff8813bb5f7328 > } > crash> >=20 >=20 > crash> list 0xffff880268291d78 -l ext4_prealloc_space.pa_group_list = -s ext4_prealloc_space.pa_count > ffff880268291d78 > pa_count =3D { > counter =3D 1 ---> ****pa->pa_count > } > ffff8813bb5f72f0 > pa_count =3D { > counter =3D -30701 > } >=20 >=20 > crash> struct -xo ext4_prealloc_space > struct ext4_prealloc_space { > [0x0] struct list_head pa_inode_list; > [0x10] struct list_head pa_group_list; > union { > struct list_head pa_tmp_list; > struct callback_head pa_rcu; > [0x20] } u; > [0x30] spinlock_t pa_lock; > [0x34] atomic_t pa_count; > [0x38] unsigned int pa_deleted; > [0x40] ext4_fsblk_t pa_pstart; > [0x48] ext4_lblk_t pa_lstart; > [0x4c] ext4_grpblk_t pa_len; > [0x50] ext4_grpblk_t pa_free; > [0x54] unsigned short pa_type; > [0x58] spinlock_t *pa_obj_lock; > [0x60] struct inode *pa_inode; > } > SIZE: 0x68 >=20 >=20 > Before "goto repeat", it is necessary to check whether = grp->bb_prealloc_list is empty. This patch may fix it. > Please kindly give us your comments and suggestions. > Thanks. This patch definitely looks more reasonable than the previous one, but I don't think it is correct? It isn't clear how the system could get into this state, or stay there. If bb_prealloc_list is empty, then "busy" should be 0, since busy =3D 1 is only set inside "list_for_each_entry_safe(bb_prealloc_list)", and that implies the list is *not* empty? The ext4_lock_group() is held over this code, so the list should not be changing in this case, and even if the list changed, it _should_ only affect the loop one time. > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index 99bf091..8082e2d 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -4290,7 +4290,7 @@ static void ext4_mb_new_preallocation(struct = ext4_allocation_context *ac) > free_total +=3D free; >=20 > /* if we still need more blocks and some PAs were used, try = again */ > - if (free_total < needed && busy) { > + if (free_total < needed && busy && = !list_empty(&grp->bb_prealloc_list)) { > ext4_unlock_group(sb, group); > cond_resched(); > busy =3D 0; > goto repeat; Minor suggestion - moving "busy =3D 0" right after "repeat:" and = removing the "int busy =3D 0" initializer would make this code a bit more clear. Cheers, Andreas --Apple-Mail=_1B86BAB9-1EB2-4BD6-AD04-3C4C7927E3D8 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- Comment: GPGTools - http://gpgtools.org iQIzBAEBCAAdFiEEDb73u6ZejP5ZMprvcqXauRfMH+AFAmBvf5cACgkQcqXauRfM H+CJaRAAsJW5zFKob3DjTlJ4M08Edj43NqNjQANTCX2ftnwWuPYLKyssYkM98pSJ yMWR+Hy34sOUBnB5gk2TCAsyQm0QE5yz/7AY/yA0ddctd01Q07X1/qmzL+p7O/sQ TO0dUl4ogApeOnJ+bYi1HkqX7akYaX4AnQ609t5NzmjsHobMdptBk2GTB3M9NKex Lpho0po1tPful85luF9wg2JFVNduFrYngktU4UL24KSzavwb7C02+eXXh716UGCZ 7oqtBaknmMO7lJWwmYJH+/XAVoU6+jPtYdV/YvEDcSUStx1MpA1TNS9jQ4c6tskc n+uiiG7kjE8OIdL6Rt7uLcxY9jVYNPCAsTFU5hg/CeYS5X6PoKDHSea9tabOxvJV ssbz1nKXSVhSF5eceMvsygewQHzpKiHwnAdWN/Pj8i6K3WjwUHlWNOcnS8roQo53 UDpn3ZJ3xLuB1afxZCCYZVqOYKY5nC4dm63dGnKilpFeBD7zN4YaYUg0nrKZzlQS C8njKgq6iZhHb8e1mYWSdx/SoCn0kndENDS0UkWGySGiaepzIEWNfNJQfD7XQpXV s7/rluDQMVLBJjQd5soLy16lgDNxfhae0HMq6mvN/Tqp4RrGQ3GAwD+su0lqoH92 nTA/IXlTaKvm4r30KSWS4FHj0nMZMND62oknIvywHdpsph7AIoY= =eiw9 -----END PGP SIGNATURE----- --Apple-Mail=_1B86BAB9-1EB2-4BD6-AD04-3C4C7927E3D8--