Message-ID: <7f115f0476618d34b24ddec772acbbd7c0c4a572.camel@linux.intel.com>
Subject: Re: Bug: d0e936adbd22 crashes at boot
From: Srinivas Pandruvada
To: Jens Axboe, LKML, "Rafael J. Wysocki", Len Brown, linux-pm@vger.kernel.org
Date: Fri, 03 Sep 2021 07:38:41 -0700
In-Reply-To: <3ac87893-55ba-f2d4-bb1e-382868f12d4c@kernel.dk>
References: <942f4041-e4e7-1b08-3301-008ab37ff5b8@kernel.dk> <3ac87893-55ba-f2d4-bb1e-382868f12d4c@kernel.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2021-09-03 at 08:15 -0600, Jens Axboe wrote:
> On 9/3/21 8:13 AM, Srinivas Pandruvada wrote:
> > Hi Axboe,
> >
> > Thanks for reporting.
> > On Fri, 2021-09-03 at 07:36 -0600, Jens Axboe wrote:
> > > Hi,
> > >
> > > Booting Linus's tree causes a crash on my laptop, an x1 gen9. This was
> > > a bit difficult to pin down as it crashes before the display is up,
> > > but I managed to narrow it down to:
> > >
> > > commit d0e936adbd2250cb03f2e840c6651d18edc22ace
> > > Author: Srinivas Pandruvada
> > > Date:   Thu Aug 19 19:40:06 2021 -0700
> > >
> > >     cpufreq: intel_pstate: Process HWP Guaranteed change notification
> > >
> > > which crashes with a NULL pointer deref in notify_hwp_interrupt() ->
> > > queue_delayed_work_on().
> > >
> > > Reverting this change makes the laptop boot fine again.
> > >
> > Does this change fixes your issue?
>
> I would assume so, as it's crashing on cpudata == NULL :-)
>
> But why is it NULL? Happy to test patches, but the below doesn't look
> like a real fix and more of a work-around.

This platform is sending an HWP interrupt on a CPU that we have not yet
brought up for pstate control.
So the firmware somehow decided to send it very early during boot;
previously we would have ignored it.

Actually, try this instead. It adds more safeguards:

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index b4ffe6c8a0d0..6ee88d7640ea 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1645,12 +1645,24 @@ void notify_hwp_interrupt(void)
 	if (!hwp_active || !boot_cpu_has(X86_FEATURE_HWP_NOTIFY))
 		return;
 
-	rdmsrl(MSR_HWP_STATUS, value);
+	rdmsrl_safe(MSR_HWP_STATUS, &value);
 	if (!(value & 0x01))
 		return;
 
+	/*
+	 * After hwp_active is set and all_cpu_data is allocated, there
+	 * is small window.
+	 */
+	if (!all_cpu_data) {
+		wrmsrl_safe(MSR_HWP_STATUS, 0);
+		return;
+	}
+
 	cpudata = all_cpu_data[this_cpu];
-	schedule_delayed_work_on(this_cpu, &cpudata->hwp_notify_work, msecs_to_jiffies(10));
+	if (cpudata)
+		schedule_delayed_work_on(this_cpu, &cpudata->hwp_notify_work, msecs_to_jiffies(10));
+	else
+		wrmsrl_safe(MSR_HWP_STATUS, 0);
 }
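
For anyone who wants to reason about the ordering outside the kernel, the guard logic in the patch can be modelled roughly as the userspace sketch below. This is only an illustration under stated assumptions: `fake_hwp_status`, `notify_hwp_interrupt_sim()`, `enum notify_result`, and the `work_scheduled` counter are hypothetical stand-ins for the MSR, the handler, and the delayed work, and are not kernel APIs. Only the check order (status bit, then `all_cpu_data`, then the per-CPU entry) follows the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4  /* arbitrary size for the simulation */

/* Hypothetical stand-in for the per-CPU driver data. */
struct cpudata {
	int work_scheduled;  /* counts simulated schedule_delayed_work_on() calls */
};

/* NULL until the (simulated) driver has allocated its per-CPU table. */
static struct cpudata **all_cpu_data;

/* Simulated per-CPU MSR_HWP_STATUS values; bit 0 is the bit the patch tests. */
static uint64_t fake_hwp_status[NR_CPUS];

enum notify_result { NOTIFY_IGNORED, NOTIFY_CLEARED, NOTIFY_SCHEDULED };

/*
 * Mirrors the check order in the patched handler: test the status bit,
 * then the table pointer, then the per-CPU entry. If the CPU is not set
 * up yet, the pending status is cleared instead of dereferencing NULL.
 */
static enum notify_result notify_hwp_interrupt_sim(int this_cpu)
{
	struct cpudata *cpudata;

	assert(this_cpu >= 0 && this_cpu < NR_CPUS);

	if (!(fake_hwp_status[this_cpu] & 0x01))
		return NOTIFY_IGNORED;

	if (!all_cpu_data) {
		fake_hwp_status[this_cpu] = 0;  /* models wrmsrl_safe(MSR_HWP_STATUS, 0) */
		return NOTIFY_CLEARED;
	}

	cpudata = all_cpu_data[this_cpu];
	if (!cpudata) {
		fake_hwp_status[this_cpu] = 0;  /* CPU not brought up for pstate yet */
		return NOTIFY_CLEARED;
	}

	cpudata->work_scheduled++;  /* models scheduling hwp_notify_work */
	return NOTIFY_SCHEDULED;
}
```

Setting `fake_hwp_status[cpu] = 1` before the table exists and calling `notify_hwp_interrupt_sim(cpu)` models the early interrupt seen on this platform: the status is cleared and nothing is dereferenced.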