Date: Fri, 21 Oct 2016 12:56:30 +0200
From: Sebastian Andrzej Siewior
To: "Charles (Chas) Williams"
Cc: linux-kernel@vger.kernel.org, rt@linutronix.de
Subject: Re: [PREEMPT-RT] Oops in rapl_cpu_prepare()
Message-ID: <20161021105630.y2iym7smtdpyo54z@linutronix.de>

On 2016-10-20 16:27:55 [-0400], Charles (Chas) Williams wrote:
> Recent 4.8 kernels have been oopsing when running under VMWare:

Can you reproduce this on bare metal?

> [ 2.270203] BUG: unable to handle kernel NULL pointer dereference at 0000000000000408
> [ 2.270325] IP: [] rapl_cpu_online+0x59/0x70
…
>
> gdb tells me:
>
> (gdb) info line *(rapl_cpu_online+0x59)
> Line 595 of "arch/x86/events/intel/rapl.c" starts at address 0xffffffff81012bb9
> and ends at 0xffffffff81012bbe.
>
> Which is:
>
>
>         target = cpumask_any_and(&rapl_cpu_mask, topology_core_cpumask(cpu));
>         if (target < nr_cpu_ids)
>                 return 0;
>
>         cpumask_set_cpu(cpu, &rapl_cpu_mask);
>         pmu->cpu = cpu;       <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Can you check if pmu is NULL?

>         return 0;
…
> Is there a particular order guaranteed by the callbacks?  Will
> rapl_cpu_prepare() always happen before online/offline?

Yes, see include/linux/cpuhotplug.h. On CPU-up the callbacks are invoked in
the order of the state ids, from CPUHP_OFFLINE up to CPUHP_ONLINE.

> rapl_cpu_prepare() can fail to allocate pmu,

Error codes returned by callbacks are handled.

…
> But rapl_cpu_online() would have no idea about this.  What should be
> done in this case?

If a callback (such as CPUHP_PERF_X86_RAPL_PREP) fails, we roll back to the
starting point (in the case of CPU-up that would be CPUHP_OFFLINE).

Sebastian
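
The ordering and rollback described above can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not the kernel's hotplug code: every name prefixed with toy_ or TOY_ is hypothetical, the "early" step is invented so that there is something to undo, and the real state list lives in include/linux/cpuhotplug.h. The model walks the states in ascending order and, if a startup callback returns an error, runs the teardown callbacks of the already completed states in reverse, so the online step never runs after a failed prepare step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not the kernel's API. */
#define TOY_ENOMEM 12

enum toy_state {
    TOY_OFFLINE,       /* starting point for CPU-up */
    TOY_EARLY_PREP,    /* invented earlier step, so rollback has work to do */
    TOY_RAPL_PREP,     /* stands in for the prepare step (allocates pmu) */
    TOY_RAPL_ONLINE,   /* stands in for the online step (uses pmu) */
    TOY_ONLINE,        /* end point for CPU-up */
    TOY_NR_STATES
};

struct toy_step {
    int  (*startup)(unsigned int cpu);
    void (*teardown)(unsigned int cpu);
};

static char toy_trace[32];          /* one letter per callback invoked */
static size_t toy_trace_len;
static int toy_fail_prep;           /* non-zero: simulate allocation failure */
static int toy_pmu_allocated;
static enum toy_state toy_cur_state;

static void toy_log(char c)
{
    if (toy_trace_len + 1 < sizeof(toy_trace)) {
        toy_trace[toy_trace_len++] = c;
        toy_trace[toy_trace_len] = '\0';
    }
}

static int toy_early_up(unsigned int cpu) { (void)cpu; toy_log('E'); return 0; }
static void toy_early_down(unsigned int cpu) { (void)cpu; toy_log('e'); }

static int toy_prep_up(unsigned int cpu)
{
    (void)cpu;
    if (toy_fail_prep)
        return -TOY_ENOMEM;         /* the error is reported, not swallowed */
    toy_pmu_allocated = 1;
    toy_log('P');
    return 0;
}

static void toy_prep_down(unsigned int cpu)
{
    (void)cpu;
    toy_pmu_allocated = 0;
    toy_log('p');
}

static int toy_online_up(unsigned int cpu)
{
    (void)cpu;
    /* In this model the online step is only reached after a successful prep. */
    assert(toy_pmu_allocated);
    toy_log('O');
    return 0;
}

static void toy_online_down(unsigned int cpu) { (void)cpu; toy_log('o'); }

static const struct toy_step toy_steps[TOY_NR_STATES] = {
    [TOY_EARLY_PREP]  = { toy_early_up,  toy_early_down  },
    [TOY_RAPL_PREP]   = { toy_prep_up,   toy_prep_down   },
    [TOY_RAPL_ONLINE] = { toy_online_up, toy_online_down },
};

void toy_reset(int fail_prep)
{
    memset(toy_trace, 0, sizeof(toy_trace));
    toy_trace_len = 0;
    toy_pmu_allocated = 0;
    toy_cur_state = TOY_OFFLINE;
    toy_fail_prep = fail_prep;
}

/* Walk the states in ascending order; on error undo the completed ones. */
int toy_cpu_up(unsigned int cpu)
{
    int st, ret;

    for (st = TOY_OFFLINE + 1; st <= TOY_ONLINE; st++) {
        if (!toy_steps[st].startup)
            continue;
        ret = toy_steps[st].startup(cpu);
        if (ret) {
            /* Roll back: tear down earlier states in reverse order. */
            for (st--; st > TOY_OFFLINE; st--)
                if (toy_steps[st].teardown)
                    toy_steps[st].teardown(cpu);
            toy_cur_state = TOY_OFFLINE;
            return ret;
        }
    }
    toy_cur_state = TOY_ONLINE;
    return 0;
}
```

Calling toy_reset(0) followed by toy_cpu_up(0) leaves the trace "EPO" and the state at TOY_ONLINE. With toy_reset(1) the prepare step fails, toy_cpu_up(0) returns -TOY_ENOMEM, the trace is "Ee", and the state is back at TOY_OFFLINE without the online step ever having run.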