Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp21284774ybl; Mon, 6 Jan 2020 01:25:25 -0800 (PST) X-Google-Smtp-Source: APXvYqwO1lH3YPKW8ZWQFguf7cBrBsEt1L4PrNTWyn4RNxD0D9WaNTkWXqDWUxr6wlAtDDy0bHek X-Received: by 2002:a9d:2c68:: with SMTP id f95mr116460778otb.33.1578302725144; Mon, 06 Jan 2020 01:25:25 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1578302725; cv=none; d=google.com; s=arc-20160816; b=deorCRFs4gGpACp33W5Ru3AnYOnMiKEnr0VUxp+5mRjKmoerZBnMuY/J+RVixjMDsK 4n7bCua4yMZbW+Dpn93yWASAWtY2R8EjL/9kcXlmK6IDqt6IYUJKbkcHoNIo5pMoBPmt /Rzkd4jlGSaHm6uV2SQ/U/cSaIqjPDVGknqAcX1AwJCcVng7k+OjBhAbGxUB6G9qrdMn LwH5IeLj8+ihCUEeFy4Dt2AtwpGgEGwsHHHBD0LUqXmBER3P0OBaO/m8zG6J7QxxabDn yFIzDwdv4xceyhAs2kLI8tMDVY3jiT64+qqDhrE7mfpAj+gLQ+oGm1nXu6pjxXoqFH9r H65A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:content-transfer-encoding :dlp-reaction:dlp-version:dlp-product:content-language :accept-language:in-reply-to:references:message-id:date:thread-index :thread-topic:subject:cc:to:from; bh=EiC90NYYu8R96xdk6evJVYIR1MwSfliJezfKfSwHwFI=; b=w55uatrBORWcT5yswcgYrN/tebbUT6Fli12uRHSufWSp98QzaKbrSrKWcmGwWgXoR5 6N2YC4Xi9fgO/em8Uza4opvwnOANmHDWTDA2SVwsbqZQKJhdTJMv3hqKNAjsaY2HN4nw lsbnrT8Db/ZRFReV3s3mAGP/nhvvLkbLIoagqJ8LivbJABXvWWVEeyfqkYaxUXaAKlRv PsSLqKtkQE84UJHD/79AIAtqM+vl02181395NiCnxXQf2JohJr76VlOdWldLBj9v6MeJ XBA541/xYiA6aFAhY38ESLB12FFbVwPP8wDUBgEo1UNVhL7V6Vy7D1HDez1nHcMkdqPl m7gg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t28si35365064otr.16.2020.01.06.01.25.12; Mon, 06 Jan 2020 01:25:25 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726281AbgAFJY3 convert rfc822-to-8bit (ORCPT + 99 others); Mon, 6 Jan 2020 04:24:29 -0500 Received: from mga01.intel.com ([192.55.52.88]:33548 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726155AbgAFJY3 (ORCPT ); Mon, 6 Jan 2020 04:24:29 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 06 Jan 2020 01:24:29 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,402,1571727600"; d="scan'208";a="420649735" Received: from fmsmsx105.amr.corp.intel.com ([10.18.124.203]) by fmsmga005.fm.intel.com with ESMTP; 06 Jan 2020 01:24:28 -0800 Received: from fmsmsx112.amr.corp.intel.com (10.18.116.6) by FMSMSX105.amr.corp.intel.com (10.18.124.203) with Microsoft SMTP Server (TLS) id 14.3.439.0; Mon, 6 Jan 2020 01:24:28 -0800 Received: from shsmsx103.ccr.corp.intel.com (10.239.4.69) by FMSMSX112.amr.corp.intel.com (10.18.116.6) with Microsoft SMTP Server (TLS) id 14.3.439.0; Mon, 6 Jan 2020 01:24:27 -0800 Received: from shsmsx101.ccr.corp.intel.com ([169.254.1.30]) by SHSMSX103.ccr.corp.intel.com ([169.254.4.245]) with mapi id 14.03.0439.000; Mon, 6 Jan 2020 17:24:26 +0800 From: "Liu, Chuansheng" To: Ingo Molnar , Borislav Petkov CC: "linux-kernel@vger.kernel.org" , "Luck, Tony" , "tglx@linutronix.de" , "mingo@redhat.com" , "hpa@zytor.com" Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work Thread-Topic: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work Thread-Index: AQHVxFzlxQRcQtSYTE6XKuL77zdvEafcsTeAgAAA4ICAAKrY0A== Date: Mon, 6 Jan 2020 09:24:25 +0000 Message-ID: <27240C0AC20F114CBF8149A2696CBE4A615FC310@SHSMSX101.ccr.corp.intel.com> References: <20200106064155.64-1-chuansheng.liu@intel.com> <20200106070759.GB12238@zn.tnic> <20200106071107.GA95725@gmail.com> In-Reply-To: <20200106071107.GA95725@gmail.com> Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.2.0.6 dlp-reaction: no-action x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > -----Original Message----- > From: Ingo Molnar On Behalf Of Ingo Molnar > Sent: Monday, January 6, 2020 3:11 PM > To: Borislav Petkov > Cc: Liu, Chuansheng ; linux-kernel@vger.kernel.org; > Luck, Tony ; tglx@linutronix.de; mingo@redhat.com; > hpa@zytor.com > Subject: Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized > therm_work > > * Borislav Petkov wrote: > > > On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote: > > > In ICL platform, it is easy to hit bootup failure with panic > > > in thermal interrupt handler during early bootup stage. > > > > > > Such issue makes my platform almost can not boot up with > > > latest kernel code. > > > > > > The call stack is like: > > > kernel BUG at kernel/timer/timer.c:1152! > > > > > > Call Trace: > > > __queue_delayed_work > > > queue_delayed_work_on > > > therm_throt_process > > > intel_thermal_interrupt > > > ... > > > > > > When one CPU is up, the irq is enabled prior to CPU UP > > > notification which will then initialize therm_worker. > > > > You mean the unmasking of the thermal vector at the end of > > intel_init_thermal()? > > > > If so, why don't you move that to the end of the notifier and unmask it > > only after all the necessary work like setting up the workqueues etc, is > > done, and save yourself adding yet another silly bool? > > A debugging WARN_ON_ONCE() when the workqueue is not initialized yet > would also be useful I suspect. This would turn any remaining race-crash > boot failure in this area into a warning. > Thanks Ingo. That's a good suggestion, I can try to make another patch about it later. Best Regards Chuansheng