Subject: Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition
From: Donald Buczek
To: Guoqing Jiang, Song Liu, linux-raid@vger.kernel.org, Linux Kernel Mailing List, it+raid@molgen.mpg.de
Date: Tue, 1 Dec 2020 10:29:18 +0100
Message-ID: <77244d60-1c2d-330e-71e6-4907d4dd65fc@molgen.mpg.de>
In-Reply-To: <95fbd558-5e46-7a6a-43ac-bcc5ae8581db@cloud.ionos.com>

On 30.11.20 at 03:06, Guoqing Jiang wrote:
>
>
> On 11/28/20 13:25, Donald Buczek wrote:
>> Dear Linux mdraid people,
>>
>> We are using raid6 on several servers. Occasionally we have had failures where an mdX_raid6 process seems to go into a busy loop and all I/O to the md device blocks. We've seen this on various kernel versions.
>>
>> The last time this happened (in this case with Linux 5.10.0-rc4), I collected some data.
>>
>> The triggering event seems to be the mdcheck cron job trying to pause the ongoing check operation in the morning with
>>
>>      echo idle > /sys/devices/virtual/block/md1/md/sync_action
>>
>> This doesn't complete. Here's /proc/<pid>/stack of this process:
>>
>>      root@done:~/linux_problems/mdX_raid6_looping/2020-11-27# ps -fp 23333
>>      UID        PID  PPID  C STIME TTY          TIME CMD
>>      root     23333 23331  0 02:00 ?        00:00:00 /bin/bash /usr/bin/mdcheck --continue --duration 06:00
>>      root@done:~/linux_problems/mdX_raid6_looping/2020-11-27# cat /proc/23333/stack
>>      [<0>] kthread_stop+0x6e/0x150
>>      [<0>] md_unregister_thread+0x3e/0x70
>>      [<0>] md_reap_sync_thread+0x1f/0x1e0
>>      [<0>] action_store+0x141/0x2b0
>>      [<0>] md_attr_store+0x71/0xb0
>>      [<0>] kernfs_fop_write+0x113/0x1a0
>>      [<0>] vfs_write+0xbc/0x250
>>      [<0>] ksys_write+0xa1/0xe0
>>      [<0>] do_syscall_64+0x33/0x40
>>      [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Note that md0 had been paused successfully just before.
>
> What is the personality of md0? Is it also raid6?

Yes.

>
>>
>>      2020-11-27T02:00:01+01:00 done CROND[23333]: (root) CMD (/usr/bin/mdcheck --continue --duration "06:00")
>>      2020-11-27T02:00:01+01:00 done root: mdcheck continue checking /dev/md0 from 10623180920
>>      2020-11-27T02:00:01.382994+01:00 done kernel: [378596.606381] md: data-check of RAID array md0
>>      2020-11-27T02:00:01+01:00 done root: mdcheck continue checking /dev/md1 from 11582849320
>>      2020-11-27T02:00:01.437999+01:00 done kernel: [378596.661559] md: data-check of RAID array md1
>>      2020-11-27T06:00:01.842003+01:00 done kernel: [392996.625147] md: md0: data-check interrupted.
>>      2020-11-27T06:00:02+01:00 done root: pause checking /dev/md0 at 13351127680
>>      2020-11-27T06:00:02.338989+01:00 done kernel: [392997.122520] md: md1: data-check interrupted.
>>      [ nothing related following.... ]
>>
>> After that, we see md1_raid6 in a busy loop:
>>
>>      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>>      2376 root     20   0       0      0      0 R 100.0  0.0   1387:38 md1_raid6
>
> Seems the reap sync thread was trying to stop md1_raid6 while md1_raid6 was being triggered again and again.
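
That diagnosis matches the hung write to sync_action above: kthread_stop() does not return until the thread function itself exits. A generic sketch of the kthread contract (illustrative only -- this is not md's actual thread function, and do_pending_work() is a made-up helper):

    #include <linux/kthread.h>
    #include <linux/sched.h>

    /* kthread_stop() sets a should-stop flag, wakes the thread, and
     * then waits for the thread function to return.  A thread that
     * always finds new work and never gets back to the
     * kthread_should_stop() check keeps the stopper -- here, the
     * write to sync_action -- blocked forever.
     */
    static int worker_fn(void *data)
    {
            while (!kthread_should_stop()) {
                    do_pending_work(data);   /* made-up work function */
                    set_current_state(TASK_INTERRUPTIBLE);
                    if (!kthread_should_stop())
                            schedule();      /* sleep until next wakeup */
                    __set_current_state(TASK_RUNNING);
            }
            return 0;   /* only now can kthread_stop() return */
    }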
>
>> Also, all processes doing I/O to the md device block.
>>
>> This is /proc/mdstat:
>>
>>      Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [multipath]
>>      md1 : active raid6 sdk[0] sdj[15] sdi[14] sdh[13] sdg[12] sdf[11] sde[10] sdd[9] sdc[8] sdr[7] sdq[6] sdp[5] sdo[4] sdn[3] sdm[2] sdl[1]
>>            109394518016 blocks super 1.2 level 6, 512k chunk, algorithm 2 [16/16] [UUUUUUUUUUUUUUUU]
>>            [==================>..]  check = 94.0% (7350290348/7813894144) finish=57189.3min speed=135K/sec
>>            bitmap: 0/59 pages [0KB], 65536KB chunk
>>      md0 : active raid6 sds[0] sdah[15] sdag[16] sdaf[13] sdae[12] sdad[11] sdac[10] sdab[9] sdaa[8] sdz[7] sdy[6] sdx[17] sdw[4] sdv[3] sdu[2] sdt[1]
>>            109394518016 blocks super 1.2 level 6, 512k chunk, algorithm 2 [16/16] [UUUUUUUUUUUUUUUU]
>>            bitmap: 0/59 pages [0KB], 65536KB chunk
>>
>
> So the RECOVERY_CHECK flag should be set; not sure if the simple change below helps, but you may give it a try.

Thanks. I've copied the original condition block to execute before the modified one and added some logging to it, to see if the change actually triggers. I will pause and unpause the check frequently on a busy machine to get this code executed more often.
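
For illustration, such an instrumented block inside md_check_recovery() might look like the following sketch. This is an assumption, not Donald's actual debug patch (which is not shown in the thread); the pr_info() wording is invented, while the two conditions are taken from Guoqing's diff quoted below.

    /* Keep the original condition first, so behavior is unchanged
     * whenever it fires.  The second block is then only reached when
     * MD_RECOVERY_DONE is already set but MD_RECOVERY_CHECK is still
     * set -- exactly the case where the proposed change below makes
     * a difference.
     */
    if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
        !test_bit(MD_RECOVERY_DONE, &mddev->recovery)) {
            /* resync/recovery still happening (original behavior) */
            clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
            goto unlock;
    }
    if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
        test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
            /* invented log line, to confirm the new condition fires */
            pr_info("md: %s: RECOVERY_CHECK condition triggered\n",
                    mdname(mddev));
            clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
            goto unlock;
    }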
Best
Donald

>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 98bac4f..e2697d0 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -9300,7 +9300,8 @@ void md_check_recovery(struct mddev *mddev)
>                         md_update_sb(mddev, 0);
>
>                 if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
> -                   !test_bit(MD_RECOVERY_DONE, &mddev->recovery)) {
> +                   (!test_bit(MD_RECOVERY_DONE, &mddev->recovery) ||
> +                    test_bit(MD_RECOVERY_CHECK, &mddev->recovery))) {
>                         /* resync/recovery still happening */
>                         clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>                         goto unlock;
>
> Thanks,
> Guoqing

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433