Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755616AbcJNAbw (ORCPT ); Thu, 13 Oct 2016 20:31:52 -0400
Received: from mail-pa0-f42.google.com ([209.85.220.42]:34276 "EHLO
	mail-pa0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752415AbcJNAbo (ORCPT ); Thu, 13 Oct 2016 20:31:44 -0400
Date: Thu, 13 Oct 2016 19:29:02 -0400
From: Tejun Heo
To: zijun_hu
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	zijun_hu@htc.com, cl@linux.com
Subject: Re: [RFC v2 PATCH] mm/percpu.c: fix panic triggered by BUG_ON() falsely
Message-ID: <20161013232902.GD32534@mtj.duckdns.org>
References: <57FCF07C.2020103@zoho.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <57FCF07C.2020103@zoho.com>
User-Agent: Mutt/1.7.0 (2016-08-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1462
Lines: 40

On Tue, Oct 11, 2016 at 10:00:28PM +0800, zijun_hu wrote:
> From: zijun_hu
>
> as shown by pcpu_build_alloc_info(), the number of units within a percpu
> group is educed by rounding up the number of CPUs within the group to
> @upa boundary, therefore, the number of CPUs isn't equal to the units's
> if it isn't aligned to @upa normally. however, pcpu_page_first_chunk()
> uses BUG_ON() to assert one number is equal the other roughly, so a panic
> is maybe triggered by the BUG_ON() falsely.
>
> in order to fix this issue, the number of CPUs is rounded up then compared
> with units's, the BUG_ON() is replaced by warning and returning error code
> as well to keep system alive as much as possible.

I really can't decode what the actual issue is here.  Can you please
give an example of a concrete case?
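
To make the arithmetic in the quoted description easier to follow, here
is a minimal standalone sketch of the rounding it describes.  It is not
kernel code: round_up_to() and units_match_cpus() are hypothetical
helpers, and the CPU count and @upa values are made up for illustration.
Whether such values can actually reach pcpu_page_first_chunk() is the
open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: round n up to the next multiple of align. */
static unsigned int round_up_to(unsigned int n, unsigned int align)
{
	assert(align > 0);
	return ((n + align - 1) / align) * align;
}

/*
 * Hypothetical helper: true when the unit count obtained by rounding
 * the group's CPU count up to @upa still equals the CPU count, i.e.
 * when a strict "units == CPUs" assertion would hold.
 */
static bool units_match_cpus(unsigned int nr_cpus, unsigned int upa)
{
	return round_up_to(nr_cpus, upa) == nr_cpus;
}
```

With these made-up numbers, 6 CPUs and @upa == 4 give 8 units, so a
strict equality check fails; with @upa == 1 the two counts always agree.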

> @@ -2113,21 +2120,22 @@ int __init pcpu_page_first_chunk(size_t reserved_size,
>
>  	/* allocate pages */
>  	j = 0;
> -	for (unit = 0; unit < num_possible_cpus(); unit++)
> +	for (unit = 0; unit < num_possible_cpus(); unit++) {
> +		unsigned int cpu = ai->groups[0].cpu_map[unit];
>  		for (i = 0; i < unit_pages; i++) {
> -			unsigned int cpu = ai->groups[0].cpu_map[unit];
>  			void *ptr;
>
>  			ptr = alloc_fn(cpu, PAGE_SIZE, PAGE_SIZE);
>  			if (!ptr) {
>  				pr_warn("failed to allocate %s page for cpu%u\n",
> -					psize_str, cpu);
> +						psize_str, cpu);

And stop making gratuitous changes?

Thanks.

-- 
tejun