Received: by 2002:a05:7412:bbc7:b0:fc:a2b0:25d7 with SMTP id kh7csp869503rdb; Fri, 2 Feb 2024 06:35:54 -0800 (PST) X-Google-Smtp-Source: AGHT+IHGy9nfvlRerKKTovmpkx+jarr278srLsduNuFwsbu+yLWZSZ4XhegrsNmd4siIWmX4kfb8 X-Received: by 2002:a92:507:0:b0:363:b662:9028 with SMTP id q7-20020a920507000000b00363b6629028mr984948ile.27.1706884554178; Fri, 02 Feb 2024 06:35:54 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706884554; cv=pass; d=google.com; s=arc-20160816; b=e3LW8EyfYwB8ujK8A8WjF0A/ysjkA5E9Ay/DB6BDdnt6I0xM9x+US2Cc7nVb4PQ2aN 6GoRQcG7/+RO/F57lkrHjsX8KMIzABAPgePViVCLJdmV3wEQGUL0hPTpdJv4OUroEce+ 6gi4I/8o15cX0uq0+pjZJORWvCWKNDeh1gVf509dsH8vd2ushhiZxnWOY94IXSviDNt0 7xl3q2jxJNHgJGb9nZXbOB4Y3H/30nols085FUYLOhIvGaCbc0VnTBfFTQgEkB9YE/GP umUmhoC5/34vXjvD2pkHKMtroqLJfWVQv+I/R6Zuqte+KIhcizzuhFjjXiZqSzpd0oJb E0Gg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:date:dkim-signature; bh=QTXW/IgMSjYbmMUv1HG2gfGoIaBkbzHe0l/jl6UZwLo=; fh=OQGx3CvxoIvmAYEe66acYllyOsoAZRhXmsNIVnKpyk4=; b=m5T92cGRfKqKx5sNx/uIT3OLXiYjqQSWCqny1Q2lVGjHQnCd8fH42GnMz5+V3r7Z/2 QQulceee68orWJrffrc0Ea8tWPT5rELK2tPbRih+jner07GKfaWkT3xPnoidb7QO3dw+ LcLk4w1p6M7gaH4sMGp6iIrizWZxhdmx3baOYQYQx8P7KC3kK3WfbXlXzIt/ckjPyRab s3IbcSFBuB2d4iimU4ik8bachwT03vuPr3iGxetQaqjpsS6J2i9Zwtn8St9HKKo4zhTs Ftje3tzC8uX+NrnOqhB5I9BTKft7WIom9lzzpdv4Hx22xalX+WL/3kJuccX+qqEabiLJ 5SUA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@suse.com header.s=google header.b="UjO9/ecV"; arc=pass (i=1 spf=pass spfdomain=suse.com dkim=pass dkdomain=suse.com dmarc=pass fromdomain=suse.com); spf=pass (google.com: domain of linux-kernel+bounces-49958-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-49958-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com X-Forwarded-Encrypted: i=1; AJvYcCUO5IcV64id3Ejvs8KVw3OIQybelyPk8p4mwswR1bNlajAQn9FS2xlGNBufK+eoUr/661HgE2UBPASZY+lKB7otK3ckKiXLZHGhxdcL2Q== Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [139.178.88.99]) by mx.google.com with ESMTPS id 193-20020a6300ca000000b005dbd2086a0dsi1676844pga.331.2024.02.02.06.35.53 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Feb 2024 06:35:54 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-49958-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.com header.s=google header.b="UjO9/ecV"; arc=pass (i=1 spf=pass spfdomain=suse.com dkim=pass dkdomain=suse.com dmarc=pass fromdomain=suse.com); spf=pass (google.com: domain of linux-kernel+bounces-49958-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-49958-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id CB03E2881D7 for ; Fri, 2 Feb 2024 14:35:53 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5FC5014533C; Fri, 2 Feb 2024 14:35:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=suse.com header.i=@suse.com header.b="UjO9/ecV" Received: from mail-ej1-f47.google.com (mail-ej1-f47.google.com [209.85.218.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 02F072B9BB for ; Fri, 2 Feb 2024 14:35:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.218.47 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706884532; cv=none; b=RV9yfkVkeaY3S1oGU4CRvUwWV6DcAUjioW8racca2MuCr16yyIAlb8blsn5zchVtSjmKji+EIhGyfBE3U+NqHms4cyo+pDK7VCyJePUIsgVrGodajMuT+V0TIO0KCLdTJUOiiw/Dyn6qx3RZgEmn0oxFNE2FswGbgzkmNHkV1rY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706884532; c=relaxed/simple; bh=4HTPW99u3wqlNp6Rv9yUxE8nx6RD09MRp/DTp8XNPyo=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=HNeDvy9tV7GL/2oze+pYityGWFrot5DNhEJEzuilEn1aQKWd5a7R9sSG3hmC5FHkHf1YClMkr0cy7Zi7UlEmtnUPcRkTkj28lQoUku+LlZYwuW8rBSnamMg1u6T5m4we2FLdfFTHOjA/JoPI7IZb8TewVu2cBtMtYel3Pg52umo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com; spf=pass smtp.mailfrom=suse.com; dkim=pass (2048-bit key) header.d=suse.com header.i=@suse.com header.b=UjO9/ecV; arc=none smtp.client-ip=209.85.218.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.com Received: by mail-ej1-f47.google.com with SMTP id a640c23a62f3a-a370e63835cso80723466b.1 for ; Fri, 02 Feb 2024 06:35:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1706884528; x=1707489328; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=QTXW/IgMSjYbmMUv1HG2gfGoIaBkbzHe0l/jl6UZwLo=; b=UjO9/ecVNR/RUm04u4QZKxgCHIA9a2v2SWleCJevZdtdE9voJWl1BHAMxwx+An0qaT dQWWgU6dIJ8MQt2CgoWMJL+YRq0/jxRyTM+qgZ1Onf4wVLe31m6zvFFRUgYmQxwCv0YN bPchINApGtEpAVoSEysAI6ZVB74NNKH1BqgU4tJsNRJugbBAVVfPIyqB0b/7h9w/FpPj QZvn6rGBnWIkv9Bsvdpypz7F8S3La5f5SnYrUS4pflsRaNPApQB5uduf1YhgC69WtgCn WQ2m4ngqgL1iakKye3WKhaBsopHA9vm+ZuA2GtlveOrDy1Wk5SMIvy1Dh9JiFAUVDeMz 7mfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706884528; x=1707489328; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=QTXW/IgMSjYbmMUv1HG2gfGoIaBkbzHe0l/jl6UZwLo=; b=ZdKhFPB2CFUzt+BYdD2r1MrqeQ4+gzIIPRAXuTwmv6uHjWOBWrEcoOnzamixUc/229 XSQTXazaUe/crj2lYYdGubZabkCSMFpYDzBqVKfoIy7Wr9MWol59xvd0tiaFCQg+P3sC m429dzbFz79z/7wAckJjqQlXEs5n29tQOPlvDrb/2nAqUCp0jOjOhx6ES1mKKuyO4jG5 NPvLkw8u/cVKLmkct9fvoixQI0eGPd6bN5+JyJ+jKWeQ/wbvaXY8XyTOPXLa3Ik0sEF/ ra3CWWge9TuB2hwN4j06lzel0wrORkR8yzcbh0ohNhTT6Hvg6ep9ImepJYOzg6KRr/F9 O+gQ== X-Gm-Message-State: AOJu0YxyyvOpICugCrHgxP/EhsQpbwfGcEVsYpNrrbvrqFw1s5lXlHF2 2qpF+woukqgcmXSY619FX5tIuAtwBXniW1kDt2fpOOIiU9uZ1bPP52iX4P/eOrI= X-Received: by 2002:a17:906:5399:b0:a36:fc06:6f7b with SMTP id g25-20020a170906539900b00a36fc066f7bmr1294273ejo.44.1706884528064; Fri, 02 Feb 2024 06:35:28 -0800 (PST) X-Forwarded-Encrypted: i=0; AJvYcCV3+2fQtHpjtsymH97+zaX2cVRDP0JvZtZ3prHAs/d0++TLMhL5qi+ITYLMFuhkLHeZ2QxJsvPpVrxMXdyNXt1U1yuU59BZkcijCPKsj1xAeB4j0NfUej06nU+S3NEO8ptT2KRNdG9s5v6/dkuRBKjLgjtZE0YGY5wC8eTwCeI77tedMJ7ndPZhX1sAnk/ro28jxc0KYcFbLI0NFy8wg2PlyhbDazwT+2j1+WJsctddRoip0+cNvcA5tnCgcdzZHNjTM9UkFNe2BUkfZfhzFveoD2kLh+CqjLbW Received: from alley ([176.114.240.50]) by smtp.gmail.com with ESMTPSA id f14-20020a17090624ce00b00a3683b565fbsm924750ejb.187.2024.02.02.06.35.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Feb 2024 06:35:27 -0800 (PST) Date: Fri, 2 Feb 2024 15:35:26 +0100 From: Petr Mladek To: "Paul E. McKenney" Cc: Tejun Heo , Lai Jiangshan , Steven Rostedt , John Ogness , Sergey Senozhatsky , Stephen Rothwell , linux-kernel@vger.kernel.org, rcu@vger.kernel.org Subject: Re: [BUG] workqueues and printk not playing nice since next-20240130 Message-ID: References: <410d6a87-bf34-457e-b714-1e6149d48532@paulmck-laptop> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <410d6a87-bf34-457e-b714-1e6149d48532@paulmck-laptop> On Fri 2024-02-02 05:04:45, Paul E. McKenney wrote: > Hello! > > Starting with next-20240130 (and perhaps a bit earlier), rcutorture gets > what initially looked like early-boot hangs, but only when running on > dual-socket x86 systems [1], as it it works just fine on my x86 laptop [2]. > But when running on dual-socket systems, this happens all the time, > perhaps because rcutorture works hard to split each guest OS across a > socket boundary. > > This is the reproducer: > > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 1m --configs "10*TREE01" --trust-make > > By "looked like early-boot hangs" I mean that qemu was quite happy, > but there was absolutely no console output. > > Bisection identified this commit: > > 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for unbound workqueues") > > Reverting this commit made the problem go away. Except that it is really > hard to imagine this commit having any effect whatsoever on early boot > execution. Of course, this might be a failure of imagination on my part, > so I enlisted the aid of gdb: > > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 1m --configs "TREE01" --trust-make --gdb > > After following the resulting gdb startup instructions and waiting for > about ten seconds, I hit control-C on the gdb window and then: > > (gdb) bt > #0 default_idle () at arch/x86/kernel/process.c:743 > #1 0xffffffff81e94d34 in default_idle_call () at kernel/sched/idle.c:97 > #2 0xffffffff810d746d in cpuidle_idle_call () at kernel/sched/idle.c:170 > #3 do_idle () at kernel/sched/idle.c:312 > #4 0xffffffff810d76a4 in cpu_startup_entry (state=state@entry=CPUHP_ONLINE) > at kernel/sched/idle.c:410 > #5 0xffffffff81e95417 in rest_init () at init/main.c:730 > #6 0xffffffff8329adf2 in start_kernel () at init/main.c:1067 > #7 0xffffffff832a5038 in x86_64_start_reservations ( > real_mode_data=real_mode_data@entry=0x13d50 ) at arch/x86/kernel/head64.c:555 > #8 0xffffffff832a513c in x86_64_start_kernel ( > real_mode_data=0x13d50 ) at arch/x86/kernel/head64.c:536 > #9 0xffffffff810001d2 in secondary_startup_64 () > at arch/x86/kernel/head_64.S:461 > #10 0x0000000000000000 in ?? () > (gdb) print jiffies > $1 = 4294676330 > (gdb) print system_state > $2 = SYSTEM_RUNNING > > In other words, the system really has booted, and at least one CPU is > happily idling in the idle loop. And another CPU is (maybe not quite > so happily) running rcutorture: > > (gdb) thread 6 > [Switching to thread 6 (Thread 1.6)] > #0 0xffffffff8111160b in rcu_torture_one_read ( > trsp=trsp@entry=0xffffc900004abe90, myid=myid@entry=4) > at kernel/rcu/rcutorture.c:2003 > 2003 completed = cur_ops->get_gp_seq(); > (gdb) bt > #0 0xffffffff8111160b in rcu_torture_one_read ( > trsp=trsp@entry=0xffffc900004abe90, myid=myid@entry=4) > at kernel/rcu/rcutorture.c:2003 > #1 0xffffffff81111bef in rcu_torture_reader (arg=0x4 ) > at kernel/rcu/rcutorture.c:2097 > #2 0xffffffff810af3e0 in kthread (_create=0xffff8880047aa480) > at kernel/kthread.c:388 > #3 0xffffffff8103af1f in ret_from_fork (prev=, > regs=0xffffc900004abf58, fn=0xffffffff810af300 , > fn_arg=0xffff8880047aa480) at arch/x86/kernel/process.c:147 > #4 0xffffffff8100247a in ret_from_fork_asm () at arch/x86/entry/entry_64.S:242 > #5 0x0000000000000000 in ?? () > > So the system really did boot and is running just fine. It is just that > there is no console output. Details, details! > > Is there anything I can do to some combination of workqueues and printk > to help debug this? Or that I can do to anything else, as I am not > feeling all that picky. ;-) It really sounds strange. Console drivers should not use workqueues at least for console->write() callbacks. The main reason is that workqueues do not work in panic(). But it might bring many problems even when the system is running "normally" You wrote above that there was absolutely no console output so the problem was from the very beginning. I looks like that the consoles get blocked either during registration of the first console or when trying to flush the first message. Consoles, except for early consoles, are registered by console_initcall(). And init calls are proceed when the workqueues subsystem is already initialized. So, workqueues might somehow affect console driver initialization and registration. My first though was that the workqueues change might somehow block synchronize_srcu() which is used when updating console_list. But it not called during console registration. It is called only during suspend, resume, and console unregistration. So, it should not block the first boot messages. My current theory is that the change in workqueues might somehow block a console driver initialization. I think that some console drivers might use workqueues for some tasks even though they could not use them in the write() callback. Could you please provide console log when the problematic patch is disabled? I wonder what consoles drivers are registered and when. Just to be sure. Does the rcu torture configuration modify the behavior of srcu_read_lock_nmisafe() which is used by console_srcu_read_lock()? Best Regards, Petr