From: Wolfgang Walter
To: Guoqing Jiang
Cc: Jens Axboe, NeilBrown, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: linux 4.19.19: md0_raid:1317 blocked for more than 120 seconds.
Date: Thu, 14 Feb 2019 16:27:22 +0100
Message-ID: <3877135.KJXZSZYZ1L@stwm.de>
In-Reply-To: <0c832f67-de10-8872-d3db-6a9f11c97454@suse.com>
References: <2131016.q2kFhguZXe@stwm.de> <3057098.nBgIypvgED@stwm.de> <0c832f67-de10-8872-d3db-6a9f11c97454@suse.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thursday, 14 February 2019, 10:09:56, Guoqing Jiang wrote:
> On 2/12/19 7:20 PM, Wolfgang Walter wrote:
> > On Tuesday, 12 February 2019, 16:20:11, Guoqing Jiang wrote:
> >> On 2/11/19 11:12 PM, Wolfgang Walter wrote:
> >>> With 4.19.19 we sometimes see the following issue (practically only
> >>> with blk_mq, though):
> >>>
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060165] INFO: task md0_raid1:317 blocked for more than 120 seconds.
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060188]       Not tainted 4.19.19-debian64.all+1.1 #1
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060197] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060207] md0_raid1       D    0   317      2 0x80000000
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060211] Call Trace:
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060222]  ? __schedule+0x2a2/0x8c0
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060226]  ? _raw_spin_unlock_irqrestore+0x20/0x40
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060229]  schedule+0x32/0x90
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060241]  md_super_wait+0x69/0xa0 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060247]  ? finish_wait+0x80/0x80
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060255]  md_bitmap_wait_writes+0x8e/0xa0 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060263]  ? md_bitmap_get_counter+0x42/0xd0 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060271]  md_bitmap_daemon_work+0x1e8/0x380 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060278]  ? md_rdev_init+0xb0/0xb0 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060285]  md_check_recovery+0x26/0x540 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060290]  raid1d+0x5c/0xf00 [raid1]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060294]  ? preempt_count_add+0x79/0xb0
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060298]  ? lock_timer_base+0x67/0x80
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060302]  ? _raw_spin_unlock_irqrestore+0x20/0x40
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060304]  ? try_to_del_timer_sync+0x4d/0x80
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060306]  ? del_timer_sync+0x35/0x40
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060309]  ? schedule_timeout+0x17a/0x3b0
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060312]  ? preempt_count_add+0x79/0xb0
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060315]  ? _raw_spin_lock_irqsave+0x25/0x50
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060321]  ? md_rdev_init+0xb0/0xb0 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060327]  ? md_thread+0xf9/0x160 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060330]  ? r1bio_pool_alloc+0x20/0x20 [raid1]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060336]  md_thread+0xf9/0x160 [md_mod]
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060340]  ? finish_wait+0x80/0x80
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060344]  kthread+0x112/0x130
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060346]  ? kthread_create_worker_on_cpu+0x70/0x70
> >>> Feb  4 20:04:46 tettnang kernel: [252300.060350]  ret_from_fork+0x35/0x40
> >>>
> >>> I saw that there was a similar problem with raid10 and an upstream
> >>> patch
> >>>
> >>> e820d55cb99dd93ac2dc949cf486bb187e5cd70d
> >>> md: fix raid10 hang issue caused by barrier
> >>> by Guoqing Jiang
> >>>
> >>> I wonder if there is a similar fix needed for raid1?
> >>
> >> Seems not. The call trace shows that the previous superblock write
> >> I/O did not finish as expected. There is a report for raid5 with a
> >> similar problem in md_super_wait at link [1]. Maybe you can disable
> >> blk-mq to narrow down the issue as well.
> >
> > I already did for 4 weeks. I didn't see this with blk-mq disabled (for
> > scsi and md), though this may be by luck.
>
> Then I guess it may be related to blk-mq. Which scheduler are you using
> with blk-mq? And maybe you can switch it to see whether it is caused by
> a specific scheduler or not.

mq-deadline for SCSI and none for md and dm.

>
> >> [1] https://bbs.archlinux.org/viewtopic.php?id=243520
> >
> > I found this bug report in Debian:
> >
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904822
>
> Thanks, the bug report also said it didn't happen after disabling
> blk-mq.
>

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
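
For anyone wanting to answer the scheduler question from this thread on their own machine, here is a minimal sketch (not part of the original mail) that reads the active I/O scheduler per block device from /sys/block/<dev>/queue/scheduler. It assumes the usual sysfs convention that the active scheduler is shown in square brackets, and that a device without a selectable scheduler may report a single bare word such as "none". The helper names active_scheduler and report are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def active_scheduler(text: str) -> str:
    """Return the active scheduler named in a sysfs 'scheduler' file.

    The active entry is normally bracketed, e.g. "[mq-deadline] kyber none".
    """
    names = text.split()
    for name in names:
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    # Assumption: a lone unbracketed word (e.g. "none") is the only choice.
    return names[0] if len(names) == 1 else ""


def report(sys_block: str = "/sys/block") -> dict:
    """Map each block device name to its active scheduler."""
    root = Path(sys_block)
    if not root.is_dir():
        return {}
    result = {}
    for dev in sorted(root.iterdir()):
        try:
            result[dev.name] = active_scheduler(
                (dev / "queue" / "scheduler").read_text()
            )
        except OSError:
            # Device has no readable scheduler attribute; skip it.
            continue
    return result


if __name__ == "__main__":
    for dev, name in report().items():
        print(f"{dev}: {name}")
```

On a setup like the one described above, the expectation would be mq-deadline for the SCSI disks and none for the md and dm devices, but the actual output depends on the machine.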