From: Yu Kuai <yukuai1@huaweicloud.com>
To: song@kernel.org, akpm@osdl.org, neilb@suse.de
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH -next v2 1/7] md/raid10: prevent soft lockup while flushing writes
Date: Wed, 26 Apr 2023 16:20:25 +0800
Message-Id: <20230426082031.1299149-2-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com>
References: <20230426082031.1299149-1-yukuai1@huaweicloud.com>

From: Yu Kuai <yukuai3@huawei.com>

Currently, there is no limit on the number of plugged bios for
raid1/raid10. While flushing writes, raid1 calls cond_resched() but
raid10 does not, so too many writes can cause a soft lockup.

The following soft lockup can be triggered easily with a writeback
test for raid10 backed by ramdisks:

 watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
 Call Trace:
  call_rcu+0x16/0x20
  put_object+0x41/0x80
  __delete_object+0x50/0x90
  delete_object_full+0x2b/0x40
  kmemleak_free+0x46/0xa0
  slab_free_freelist_hook.constprop.0+0xed/0x1a0
  kmem_cache_free+0xfd/0x300
  mempool_free_slab+0x1f/0x30
  mempool_free+0x3a/0x100
  bio_free+0x59/0x80
  bio_put+0xcf/0x2c0
  free_r10bio+0xbf/0xf0
  raid_end_bio_io+0x78/0xb0
  one_write_done+0x8a/0xa0
  raid10_end_write_request+0x1b4/0x430
  bio_endio+0x175/0x320
  brd_submit_bio+0x3b9/0x9b7 [brd]
  __submit_bio+0x69/0xe0
  submit_bio_noacct_nocheck+0x1e6/0x5a0
  submit_bio_noacct+0x38c/0x7e0
  flush_pending_writes+0xf0/0x240
  raid10d+0xac/0x1ed0

Fix the problem by adding cond_resched() to raid10, as raid1 already
does.

Note that unlimited plugged bios still need to be optimized: for
example, during heavy dirty-page writeback they consume a lot of
memory and I/O latency becomes quite bad.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid10.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 32fb4ff0acdb..6b31f848a6d9 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf)
 			else
 				submit_bio_noacct(bio);
 			bio = next;
+			cond_resched();
 		}
 		blk_finish_plug(&plug);
 	} else
@@ -1145,6 +1146,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
 			else
 				submit_bio_noacct(bio);
 			bio = next;
+			cond_resched();
 		}
 		kfree(plug);
 	}
-- 
2.39.2
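
For readers following along, here is a minimal sketch of the technique
this patch applies: calling cond_resched() inside an unbounded
bio-submission loop so that, on non-preemptible kernels, the scheduler
can run other tasks before the soft-lockup watchdog fires. The helper
submit_plugged_bios() and its context are hypothetical, for
illustration only; the actual change is the two hunks above.

	#include <linux/bio.h>
	#include <linux/blkdev.h>
	#include <linux/sched.h>

	/*
	 * Hypothetical helper illustrating the fix: the bio list handed
	 * to a raid daemon thread is not bounded, so submitting it in one
	 * tight loop can keep the CPU busy long enough to trip the
	 * soft-lockup watchdog. cond_resched() after each submission
	 * gives the scheduler a chance to run other tasks when needed.
	 */
	static void submit_plugged_bios(struct bio *bio)
	{
		while (bio) {
			struct bio *next = bio->bi_next;

			bio->bi_next = NULL;
			submit_bio_noacct(bio);
			bio = next;
			cond_resched();
		}
	}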