Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp1883003imm; Fri, 7 Sep 2018 07:35:37 -0700 (PDT) X-Google-Smtp-Source: ANB0VdZSkABBnfe5QhK6x1pe38OZUtKHWbICCZJXigkM2sYJPAQcsP7pmYNn0VVaN8O91ehIFypK X-Received: by 2002:a62:cc83:: with SMTP id j3-v6mr8812289pfk.255.1536330937795; Fri, 07 Sep 2018 07:35:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1536330937; cv=none; d=google.com; s=arc-20160816; b=g9367W0/pq0Dgm/CX8Wx2ZQymCCwwK7zAY099Lr32ppklEdDuc0AsH2SJ31zTjadtU b/LJoFXU2HPHamfJ7YqSBtLv3hdllwJ0/jpROwoLhrtIzKHzlL/zN4Mr43HERciwSn6L z3oQ9XlRT2A+sDyPoHrilDe2TMMswQGcDdjXa+HixhcAJ3jIxqU2wpqjosZ2bpsSvgy8 R3rMhq6B8zQAGWQEc8Y7qBLjJt6mHvwIF8WgoLH947zVpc8s0J/iQE9XqD6RCG3QLsH+ XG/ad3PWpLSZWfrqmycR88RoRwrmV/5VnRs1crt2ssZ3UeNDOkEHwfnQ56uwTvjCHuIB RjzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:date:cc:to:from:subject:message-id :dkim-signature; bh=lJ1b8oDYuqnyKXjPHvRTWs7zSA5DsNYYe8sviNyyVMI=; b=uDcUxGDFoBIzAS2G0D96EjuTWkduYskposQmFZUmIgSpeuIBgKu0D4m33pKB/HKtat H4Vryz8kOycpqLSDwUE1ErsLnAMLAOYP3Mi1GN5c0lHVfUZ1zAp6j2SFulB9WKYSVJNc X2vuFj0CKpbtSoujT+1SiTz/alettES1bMBcRzdUc/10ufArKJnXIl9pO5ShzmCkeLiP o7I3eUT/YNijT8hBcTajAnc8CVvj4AJh3WnCc4d8ng2mNRrKwbYVp/c3zzJZ9ocDNSl4 lg/iWwYs3SMkoHOZbQEC9gU4EjszA/0najYFuT2MadNMT95emNL4m0hYrRJZ8ztTxd8L QY3A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@arista.com header.s=googlenew header.b="M6tZWC/T"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=REJECT dis=NONE) header.from=arista.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b1-v6si6576731plz.476.2018.09.07.07.35.22; Fri, 07 Sep 2018 07:35:37 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@arista.com header.s=googlenew header.b="M6tZWC/T"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=REJECT dis=NONE) header.from=arista.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728683AbeIGPwk (ORCPT + 99 others); Fri, 7 Sep 2018 11:52:40 -0400 Received: from mail-ed1-f68.google.com ([209.85.208.68]:33746 "EHLO mail-ed1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726452AbeIGPwk (ORCPT ); Fri, 7 Sep 2018 11:52:40 -0400 Received: by mail-ed1-f68.google.com with SMTP id d8-v6so11285175edv.0 for ; Fri, 07 Sep 2018 04:12:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=message-id:subject:from:to:cc:date:in-reply-to:references :mime-version:content-transfer-encoding; bh=lJ1b8oDYuqnyKXjPHvRTWs7zSA5DsNYYe8sviNyyVMI=; b=M6tZWC/TWn5poqk0W7JAyOeTdsFd4k+KqjoU8u46UUc5EBsG9nPacShNhdnsGwvY5Y eUAW2QU1Kr8h0CZsnQaICdaUlg1+sGF3awtICi2YJfvLqYQnhoM9yP4HJSZal8ctjwV9 8Zgi0YXHuFQAhTjR/sEwKO2EnpEeBf/GRmXFpg/aYt+Gg7byyBcZ5G5+jJW2/i2Rqrgf wfVg3dLfW6RtFOm0vurK9M5PAGaxDuuRHSUTkVFgdI+GvByBCQoyGVehA/+G0QzOfj3a tKrmJvqvS/NVvh6W9KHVDK+zk0ms2+jG6GUi1YMolbH1/jajTfxkm4vl7LfJirEB2eYD 0F4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:mime-version:content-transfer-encoding; bh=lJ1b8oDYuqnyKXjPHvRTWs7zSA5DsNYYe8sviNyyVMI=; b=WTSl+qpwGuliFC6bqO2Ph/cV5/uOReMv5RhyQboR2wjaFS8ELmhO/Jk0F76ilQw34N dQ2Kq+QWVNm8H3zqpTZOcfKFDgTMj+fExkYGD4uZTIgE1l/j3LgamiEJUgeTzcRbhdVO 4qyU7Ycn+qhDaOnUOrxfmYyIhxkdVGATYFpV8QphQBAadAPNRkzDxvu90dV1MIrSm8+2 /6F/By1Ym+3C+Jd4vREC6zOsNQrKgpFwllAqfQiuk4yOU8ZwGyl5urJcQVwD4vng5K11 2nLDPQ0k13Fj3urdAg+CA/r4QMUkW7yUIoAJIivldFbESSM+rhYWrMnFC7TzdPulLIFF 4Kig== X-Gm-Message-State: APzg51DM+e7Pg7asBFNVS0RwOqMaQAoYtiUC3fyY0luK/uDbYiw3guSa yaJOjl7fk9F6LipgxR3eIaiRJA== X-Received: by 2002:a50:a267:: with SMTP id 94-v6mr8480754edl.189.1536318730281; Fri, 07 Sep 2018 04:12:10 -0700 (PDT) Received: from dhcp.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id w22-v6sm4509732eda.34.2018.09.07.04.12.08 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 07 Sep 2018 04:12:09 -0700 (PDT) Message-ID: <1536318728.23560.98.camel@arista.com> Subject: Re: [LKP] [tty] 0b4f83d510: INFO:task_blocked_for_more_than#seconds From: Dmitry Safonov To: Jiri Slaby , kernel test robot Cc: lkp@01.org, Daniel Axtens , Dmitry Safonov <0x7f454c46@gmail.com>, Sergey Senozhatsky , Dmitry Vyukov , Tan Xiaojun , Peter Hurley , Pasi =?ISO-8859-1?Q?K=E4rkk=E4inen?= , Greg Kroah-Hartman , Michael Neuling , Mikulas Patocka , linux-kernel@vger.kernel.org, stable@vger.kernel.org Date: Fri, 07 Sep 2018 12:12:08 +0100 In-Reply-To: References: <20180907045041.GF1110@shao2-debian> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.24.6 (3.24.6-1.fc26) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 2018-09-07 at 08:39 +0200, Jiri Slaby wrote: > On 09/07/2018, 06:50 AM, kernel test robot wrote: > > FYI, we noticed the following commit (built with gcc-7): > > > > commit: 0b4f83d510f6fef6bb9da25f122c8d733d50516f ("[PATCH 2/4] tty: > > Hold tty_ldisc_lock() during tty_reopen()") > > url: https://github.com/0day-ci/linux/commits/Dmitry-Safonov/tty-Ho > > ld-write-ldisc-sem-in-tty_reopen/20180829-165618 > > base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/tty.git > > tty-testing > > > > in testcase: trinity > > with following parameters: > > > > runtime: 300s > > > > test-description: Trinity is a linux system call fuzz tester. > > test-url: http://codemonkey.org.uk/projects/trinity/ > > > > > > on test machine: qemu-system-x86_64 -enable-kvm -m 256M > > > > caused below changes (please refer to attached dmesg/kmsg for > > entire log/backtrace): > > > > > > +--------------------------------------------------+------------+ > > ------------+ > > > | 58dd163974 | > > > 0b4f83d510 | > > > > +--------------------------------------------------+------------+ > > ------------+ > > > boot_successes | 14 | > > > 4 | > > > boot_failures | 0 | > > > 6 | > > > INFO:task_blocked_for_more_than#seconds | 0 | > > > 6 | > > > Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | > > > 6 | > > > > +--------------------------------------------------+------------+ > > ------------+ > > > > > > > > [ 244.816801] INFO: task validate_data:655 blocked for more than > > 120 seconds. > > [ 244.818833] Not tainted 4.18.0-11684-g0b4f83d #1 > > [ 244.820028] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > > disables this message. > > [ 244.826965] validate_data D 0 655 623 0x20020000 > > [ 244.828279] Call Trace: > > [ 244.828958] ? __schedule+0x843/0x950 > > [ 244.830173] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.834903] schedule+0x31/0x70 > > [ 244.835665] schedule_timeout+0x34/0x760 > > [ 244.836613] ? ftrace_likely_update+0x35/0x60 > > [ 244.837683] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.838818] ? ftrace_likely_update+0x35/0x60 > > [ 244.840127] ? ftrace_likely_update+0x35/0x60 > > [ 244.845947] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.847882] __ldsem_down_read_nested+0x23a/0x3b0 > > [ 244.849886] ? tty_ldisc_ref_wait+0x25/0x50 > > [ 244.853807] tty_ldisc_ref_wait+0x25/0x50 > > [ 244.854946] tty_compat_ioctl+0x8a/0x120 > > [ 244.855928] ? this_tty+0x80/0x80 > > [ 244.856742] __ia32_compat_sys_ioctl+0xc28/0x1ce0 > > [ 244.857981] do_int80_syscall_32+0x1d2/0x5f0 > > [ 244.859003] entry_INT80_compat+0x88/0xa0 > > [ 244.859972] INFO: task dnsmasq:668 blocked for more than 120 > > seconds. > > [ 244.868315] Not tainted 4.18.0-11684-g0b4f83d #1 > > [ 244.869583] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > > disables this message. > > [ 244.871744] dnsmasq D 0 668 1 0x20020000 > > [ 244.873063] Call Trace: > > [ 244.873697] ? __schedule+0x843/0x950 > > [ 244.874572] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.875725] schedule+0x31/0x70 > > [ 244.876576] schedule_timeout+0x34/0x760 > > [ 244.877573] ? ftrace_likely_update+0x35/0x60 > > [ 244.878660] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.879872] ? ftrace_likely_update+0x35/0x60 > > [ 244.890522] ? ftrace_likely_update+0x35/0x60 > > [ 244.891572] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.892746] __ldsem_down_read_nested+0x23a/0x3b0 > > [ 244.893861] ? tty_ldisc_ref_wait+0x25/0x50 > > [ 244.894841] tty_ldisc_ref_wait+0x25/0x50 > > [ 244.895911] tty_compat_ioctl+0x8a/0x120 > > [ 244.896916] ? this_tty+0x80/0x80 > > [ 244.897717] __ia32_compat_sys_ioctl+0xc28/0x1ce0 > > [ 244.898821] do_int80_syscall_32+0x1d2/0x5f0 > > [ 244.899830] entry_INT80_compat+0x88/0xa0 > > [ 244.909466] INFO: task dropbear:734 blocked for more than 120 > > seconds. > > [ 244.911173] Not tainted 4.18.0-11684-g0b4f83d #1 > > [ 244.912394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > > disables this message. > > [ 244.914176] dropbear D 0 734 1 0x20020000 > > [ 244.915446] Call Trace: > > [ 244.916068] ? __schedule+0x843/0x950 > > [ 244.916945] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.918076] schedule+0x31/0x70 > > [ 244.918832] schedule_timeout+0x34/0x760 > > [ 244.919781] ? ftrace_likely_update+0x35/0x60 > > [ 244.921104] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.922304] ? ftrace_likely_update+0x35/0x60 > > [ 244.923347] ? ftrace_likely_update+0x35/0x60 > > [ 244.924369] ? __ldsem_down_read_nested+0x1c4/0x3b0 > > [ 244.925496] __ldsem_down_read_nested+0x23a/0x3b0 > > [ 244.926598] ? tty_ldisc_ref_wait+0x25/0x50 > > [ 244.927578] tty_ldisc_ref_wait+0x25/0x50 > > [ 244.928526] tty_compat_ioctl+0x8a/0x120 > > [ 244.929449] ? this_tty+0x80/0x80 > > [ 244.930240] __ia32_compat_sys_ioctl+0xc28/0x1ce0 > > [ 244.940083] do_int80_syscall_32+0x1d2/0x5f0 > > [ 244.941310] entry_INT80_compat+0x88/0xa0 > > [ 244.944070] > > [ 244.944070] Showing all locks held in the system: > > [ 244.945558] 1 lock held by khungtaskd/18: > > [ 244.946495] #0: (____ptrval____) (rcu_read_lock){....}, at: > > debug_show_all_locks+0x16/0x190 > > [ 244.948503] 2 locks held by askfirst/235: > > [ 244.949439] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x50 > > [ 244.951762] #1: (____ptrval____) (&ldata- > > >atomic_read_lock){+.+.}, at: n_tty_read+0x13d/0xa00 > > Here, it just seems to wait for input from the user. > > > [ 244.953799] 1 lock held by validate_data/655: > > [ 244.954814] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x50 > > [ 244.956764] 1 lock held by dnsmasq/668: > > [ 244.957649] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x50 > > [ 244.959598] 1 lock held by dropbear/734: > > [ 244.967564] #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x50 > > Hmm, I assume there is another task waiting for write_ldsem and that > one > prevents these readers to proceed. Unfortunately, due to the defunct > __ptrval__ pointer hashing crap, we do not see who is waiting for > what. > But I am guessing all are the same locks. Hmm, and the writer is not able to grab it because there is already reader who uses console, I suppose. I wonder if we can add re-acquiring for the reader side.. That will make write acquire less likely to fail under the load. I'll try to reproduce it. > > So I think, we are forced to limit the waiting to 5 seconds in reopen > in > the end too (the same as we do for new open in tty_init_dev). > > Dmitry, could you add the limit and handle the return value of > tty_ldisc_lock now? Yeah, will do. -- Thanks, Dmitry