From: Yu Kuai
To: song@kernel.org, neilb@suse.de, akpm@osdl.org
Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
 yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
 yangerkun@huawei.com
Subject: [PATCH -next v3 1/7] md/raid10: prevent soft lockup while flush writes
Date: Mon, 29 May 2023 21:11:00 +0800
Message-Id: <20230529131106.2123367-2-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com>
References: <20230529131106.2123367-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Currently, there is no limit on the number of plugged bios for
raid1/raid10. While flushing writes, raid1 calls cond_resched() but
raid10 does not, so submitting too many writes at once can cause a
soft lockup.

The following soft lockup can easily be triggered by a writeback test
on raid10 backed by ramdisks:

 watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
 Call Trace:
  call_rcu+0x16/0x20
  put_object+0x41/0x80
  __delete_object+0x50/0x90
  delete_object_full+0x2b/0x40
  kmemleak_free+0x46/0xa0
  slab_free_freelist_hook.constprop.0+0xed/0x1a0
  kmem_cache_free+0xfd/0x300
  mempool_free_slab+0x1f/0x30
  mempool_free+0x3a/0x100
  bio_free+0x59/0x80
  bio_put+0xcf/0x2c0
  free_r10bio+0xbf/0xf0
  raid_end_bio_io+0x78/0xb0
  one_write_done+0x8a/0xa0
  raid10_end_write_request+0x1b4/0x430
  bio_endio+0x175/0x320
  brd_submit_bio+0x3b9/0x9b7 [brd]
  __submit_bio+0x69/0xe0
  submit_bio_noacct_nocheck+0x1e6/0x5a0
  submit_bio_noacct+0x38c/0x7e0
  flush_pending_writes+0xf0/0x240
  raid10d+0xac/0x1ed0

Fix the problem by adding cond_resched() to raid10, as raid1 already
does. Note that the unlimited number of plugged bios still needs to be
optimized: for example, when many dirty pages are written back, the
plug holds a lot of memory and I/O stays plugged for a long time, so
I/O latency suffers.
Signed-off-by: Yu Kuai
---
 drivers/md/raid10.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 32fb4ff0acdb..6b31f848a6d9 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf)
 			else
 				submit_bio_noacct(bio);
 			bio = next;
+			cond_resched();
 		}
 		blk_finish_plug(&plug);
 	} else
@@ -1145,6 +1146,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
 		else
 			submit_bio_noacct(bio);
 		bio = next;
+		cond_resched();
 	}
 	kfree(plug);
 }
-- 
2.39.2
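
For readers less familiar with this code path, the change applies the usual
"yield inside a long submission loop" idiom. A simplified sketch of the fixed
loop is below; it is illustrative only (context around the loop is omitted and
surrounding locking/plugging is not shown), not the verbatim raid10 code:

	/* Illustrative sketch of the fixed submission loop, not verbatim kernel code. */
	while (bio) {				/* submit pending writes */
		struct bio *next = bio->bi_next;	/* save the chain before submitting */

		submit_bio_noacct(bio);		/* hand the bio to the block layer */
		bio = next;
		cond_resched();			/* yield so a very long bio chain cannot
						 * monopolize the CPU and trip the soft
						 * lockup watchdog */
	}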