Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp2129209ybt; Mon, 15 Jun 2020 19:57:45 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyP0m37DWD7OKyl1TnT73yYNxSdMrlkXrDVWPfnlZUAkMYjiS+BCuYxNJIhf8fkvUb9jYgH X-Received: by 2002:a50:cdc6:: with SMTP id h6mr662227edj.111.1592276265441; Mon, 15 Jun 2020 19:57:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1592276265; cv=none; d=google.com; s=arc-20160816; b=mR27+EpHnUDqRxGsCOY65+7auB51fj9JsJgS1SfNpjc+e36NNulelRIRnUVDnUt6ZY 0AIdIGv5lO7QQV0MemkDedJuEtqwQlMuIa92N21ZNZ7yvmrXbTieWuYwDNQiQd5OUnZ3 QWsvJICtwTh08gYwg49MAPn7x8F88lXlVBe3n45jFVvCW7ruCJwX+b08BOUBStXF5B9+ 88V2kGt3U3F/9LHMLOuebvy55TUcVbO+MRp1r85kkHMGJCLdiRfqx2tlTPyqfDzzg96X PPKyTR71doH/PBna+LRO+6hM69+pOzAZjzT10DUk2DOBhX42/lR524OyWolPu/NfKHup 583Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:ironport-sdr:ironport-sdr; bh=8DIi77ra3yFPjj3u80MneNwkkGAFSppEMSSs36blUPU=; b=d6EJ8/OW3lYh0Y/uORhEKCee7lllcTQh6KO1hsqvTns8DbTsJHPZ3KyZuy8UCRC3jv 4h+spiwF5V/J+G16lWyFuXjMYKpnDOl6qKkLKehM88IjDlQZqoZOwcdQLGz6xK9zbG3c h/XK07xZwIZTEj3Hd+zxwA8ZsFC2JdLO03fwpQJwxLxp5s0Q6aisZvcXf0qzD13i9vev wWDJUM582BJyVavtWd+GQr8JoQ1PnLkjF6veRvtrUfoM4BHkjtP0IHosvK7ln460LQk8 v5yPYTpDEDqvoI9QyIhcNixxwqkA/x7eHDncC0tg2tCRgSaVrSvDC6AoBazhU0VaZTaI +6hA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id w11si10282345edl.171.2020.06.15.19.57.22; Mon, 15 Jun 2020 19:57:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726646AbgFPCu2 (ORCPT + 99 others); Mon, 15 Jun 2020 22:50:28 -0400 Received: from mga17.intel.com ([192.55.52.151]:22554 "EHLO mga17.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726044AbgFPCu2 (ORCPT ); Mon, 15 Jun 2020 22:50:28 -0400 IronPort-SDR: mFmmbHKG9SBBc40uMzDjCJR0OCap/SchHUTMi2ZuL1Y+AZJXMOJYLzj4UMOej+hzYEi+IJmXZE 9kh46i7dgXlA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jun 2020 19:50:27 -0700 IronPort-SDR: EooRz/LA+ATGqtDCneDKtobTGFprX0pDFQt4zaWlIbGdGmDApuNBSJoM7oNEJMR3+VMawXixMT XZpPyokdgYdQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,517,1583222400"; d="scan'208";a="261998340" Received: from shao2-debian.sh.intel.com (HELO [10.239.13.3]) ([10.239.13.3]) by orsmga007.jf.intel.com with ESMTP; 15 Jun 2020 19:50:23 -0700 Subject: Re: [kmemleak] b751c52bb5: BUG:kernel_hang_in_boot_stage To: Catalin Marinas Cc: Nicolas Boichat , Linus Torvalds , Dmitry Vyukov , Masahiro Yamada , Kees Cook , Petr Mladek , Thomas Gleixner , Tetsuo Handa , Joe Lawrence , Uladzislau Rezki , Andy Shevchenko , Stephen Rothwell , Andrey Ryabinin , Andrew Morton , LKML , lkp@lists.01.org References: <20200610075156.GE12456@shao2-debian> <20200610105640.GB26099@gaia> From: Rong Chen Message-ID: <9bc70c0d-00b3-ff5a-e8bd-d4315e367fad@intel.com> Date: Tue, 16 Jun 2020 10:50:05 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <20200610105640.GB26099@gaia> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 6/10/20 6:56 PM, Catalin Marinas wrote: > On Wed, Jun 10, 2020 at 03:51:56PM +0800, kernel test robot wrote: >> FYI, we noticed the following commit (built with gcc-7): >> >> commit: b751c52bb587ae66f773b15204ef7a147467f4c7 ("kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K") >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master >> >> in testcase: boot >> >> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G > [...] >> BUG: kernel hang in boot stage >> >> To reproduce: >> >> # build kernel >> cd linux >> cp config-5.3.0-11789-gb751c52bb587a .config >> make HOSTCC=gcc-7 CC=gcc-7 ARCH=i386 olddefconfig prepare modules_prepare bzImage > I've never tried kmemleak on i386. > > Anyway, I'm not sure what caused the hang (or whether it's a hang at > all) but I suspect prior to the above commit, kmemleak probably just > disabled itself (early log buffer exceeded). So the bug may have been > there already, only that kmemleak started working and tripped over it > when the log buffer increased. Hi, Sorry for the late, I can reproduce the problem with command "bin/lkp qemu -k job-script", and the kernel hangs: [    0.333897] ----------------------------------------------------- [    0.334561]                                  |block | try |context| [    0.335170] ----------------------------------------------------- [    0.335760]                           context:  ok  |  ok  |  ok | [    0.337995]                               try:  ok  |  ok  |  ok | [    0.340089]                             block:  ok  |  ok  |  ok | [    0.342175]                          spinlock:  ok  |  ok  |  ok | [    0.344481] ------------------------------------------------------- [    0.345068] Good, all 261 testcases passed! | [    0.345514] --------------------------------- KVM internal error. Suberror: 3 extra data[0]: 80000b0e extra data[1]: 31 extra data[2]: 182 extra data[3]: bfff0 EAX=00000000 EBX=00200297 ECX=00000000 EDX=ffffffff ESI=d2e997c0 EDI=d2e997f0 EBP=d2bbb038 ESP=c00bfff4 EIP=f4dccd57 EFL=00210046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =007b 00000000 ffffffff 00c0f300 DPL=3 DS   [-WA] CS =0060 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA] SS =0068 00000000 ffffffff 00c09300 DPL=0 DS   [-WA] DS =007b 00000000 ffffffff 00c0f300 DPL=3 DS   [-WA] FS =00d8 23331000 ffffffff 00809300 DPL=0 DS16 [-WA] GS =00e0 f6422900 00000018 00409100 DPL=0 DS   [--A] LDT=0000 00000000 ffffffff 00c00000 TR =0080 ff403000 0000206b 00008b00 DPL=0 TSS32-busy GDT=     ff401000 000000ff IDT=     ff400000 000007ff CR0=80050033 CR2=00000000 CR3=130fc000 CR4=00000690 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000fffe0ff0 DR7=0000000000000400 EFER=0000000000000000 Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > Is there a chance that the kernel got much slower with kmemleak enabled > and the test scripts timed out? no, the parent commit log is: [    0.313845] ----------------------------------------------------- [    0.314608]                                  |block | try |context| [    0.315314] ----------------------------------------------------- [    0.315974]                           context:  ok  |  ok  |  ok | [    0.318261]                               try:  ok  |  ok  |  ok | [    0.320478]                             block:  ok  |  ok  |  ok | [    0.322562]                          spinlock:  ok  |  ok  |  ok | [    0.324825] ------------------------------------------------------- [    0.325403] Good, all 261 testcases passed! | [    0.325809] --------------------------------- [    0.326221] kmemleak: Early log buffer exceeded (401), please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE [    0.327065] ACPI: Core revision 20190816 [    0.327585] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns [    0.328545] APIC: Switch to symmetric I/O mode setup [    0.329009] Enabling APIC mode:  Flat.  Using 1 I/O APICs [    0.329572] masked ExtINT on CPU#0 [    0.330686] ENABLING IO-APIC IRQs [    0.331001] init IO_APIC IRQs [    0.331274]  apic 0 pin 0 not connected > > Does this problem still exist with the latest mainline? yes, still in v5.7. Best Regards, Rong Chen