From: Thomas Gleixner
To: Ingo Molnar, Leon Romanovsky
Cc: Ingo Molnar, Borislav Petkov, "H.
 Peter Anvin", x86, Suresh Siddha, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/apic: Fix circular locking dependency between console and hrtimer locks
In-Reply-To: <20200414062454.GA84326@gmail.com>
References: <20200407170925.1775019-1-leon@kernel.org> <20200414054836.GA956407@unreal> <20200414062454.GA84326@gmail.com>
Date: Mon, 27 Apr 2020 13:09:49 +0200
Message-ID: <87tv15qj5u.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Ingo Molnar writes:
> * Leon Romanovsky wrote:
> The fix definitely looks legit, lockdep is right that we shouldn't take
> the console_sem.lock even under trylock.
>
> It's only a printk_once(), yet I'm wondering why in the last ~8 years
> this never triggered. Nobody ever ran lockdep and debug console level
> enabled on such hardware, or did something else change?
>
> One possibility would be that apic_check_deadline_errata() marked almost
> all Intel systems as broken and the TSC-deadline hardware never actually
> got activated. In that case you have triggered rarely tested code and
> might see other weirdnesses. Just saying. :-)
>
> Or a bootup with "debug" specified is much more rare in production
> systems, hence the 8 years old bug.

None of this makes any sense at all.

The local APIC timer (in this case the TSC deadline timer) is set up
during early boot on the boot CPU (before SMP setup) with this call
chain:

smp_prepare_cpus()
  native_smp_prepare_cpus()
    x86_init.timers.setup_percpu_clockev()
      setup_boot_APIC_clock()
        setup_APIC_timer()
          clockevents_config_and_register()
            tick_check_new_device()
              tick_setup_device()
                tick_setup_oneshot()
                  clockevents_switch_state()
                    lapic_timer_set_oneshot()
                      __setup_APIC_LVTT()
                        printk_once(...)

Nothing holds hrtimer.base_lock in this call chain.

But the lockdep splat clearly says:

[  735.324357] stack backtrace:
[  735.324360] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.0-for-upstream-dbg-2020-04-03_10-44-43-70 #1
...

So how can that be the first invocation of that printk_once()?

While the patch looks innocent, it papers over the underlying problem,
and wild theories are not really helping here.

Here is a boot log excerpt with lockdep enabled and 'debug' on the
command line:

[    0.000000] Linux version 5.7.0-rc3 ...
...
[    3.992125] TSC deadline timer enabled
[    3.995820] smpboot: CPU0: Intel(R) ....
...
[    4.050766] smp: Bringing up secondary CPUs ...

No splat, nothing.

The real question is WHY this triggers on Leon's machine 735 seconds
after boot and on CPU3.

Thanks,

        tglx
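
The argument above rests on printk_once() emitting its message only on
the very first call, wherever that call comes from. A minimal userspace
sketch of that once-only pattern follows. It is an illustration under
stated assumptions, not the kernel's implementation: print_once(),
setup_lvtt(), messages_printed and demo_boot_then_late_call() are
hypothetical names, and the sketch ignores locking and SMP ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many messages were actually emitted (illustration only). */
static int messages_printed;

/*
 * Hypothetical userspace stand-in for printk_once(), NOT the kernel macro:
 * a static flag per call site lets the message through exactly once.
 */
#define print_once(...)                          \
	do {                                     \
		static bool already_printed;     \
		if (!already_printed) {          \
			already_printed = true;  \
			messages_printed++;      \
			printf(__VA_ARGS__);     \
		}                                \
	} while (0)

/* Hypothetical stand-in for the last function in the call chain. */
static void setup_lvtt(int cpu)
{
	(void)cpu; /* the flag is shared, so the calling CPU does not matter */
	print_once("TSC deadline timer enabled\n");
}

/* The boot CPU calls first; a later call from another CPU prints nothing. */
static int demo_boot_then_late_call(void)
{
	setup_lvtt(0);
	assert(messages_printed == 1);
	setup_lvtt(3);
	return messages_printed;
}
```

With a guard like this, the first call at boot consumes the flag, so a
later call from another CPU is silent. That is why a first-time message
on CPU3 at 735 seconds is the thing that needs explaining.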