Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755952AbbHQQcx (ORCPT ); Mon, 17 Aug 2015 12:32:53 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:36328 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755795AbbHQQcv (ORCPT );
	Mon, 17 Aug 2015 12:32:51 -0400
Date: Mon, 17 Aug 2015 09:32:50 -0700
From: Greg KH
To: Bharata B Rao
Cc: Nathan Fontenot, linux-kernel@vger.kernel.org,
	david@gibson.dropbear.id.au
Subject: Re: [RFC PATCH] driver: base: memory: Maintain correct
	mem->end_section_nr when memory block is partially filled
Message-ID: <20150817163250.GA28254@kroah.com>
References: <1439457422-10565-1-git-send-email-bharata@linux.vnet.ibm.com>
	<55CE08F9.4040106@linux.vnet.ibm.com>
	<20150817062653.GA9449@in.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150817062653.GA9449@in.ibm.com>
User-Agent: Mutt/1.5.23+102 (2ca89bed6448) (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4845
Lines: 81

On Mon, Aug 17, 2015 at 11:56:53AM +0530, Bharata B Rao wrote:
> On Fri, Aug 14, 2015 at 10:27:53AM -0500, Nathan Fontenot wrote:
> > On 08/13/2015 04:17 AM, Bharata B Rao wrote:
> > > Last section of memory block is always initialized to
> > >
> > > mem->start_section_nr + sections_per_block - 1
> > >
> > > which will not be true for a section that doesn't contain sections_per_block
> > > sections due to the memory size specified. This causes the following
> > > kernel crash when memory blocks under a node are registered during reboot
> > > that follows a memory hotplug operation on pseries guest.
> > >
> > > Unable to handle kernel paging request for data at address 0xf0000000003f0020
> > > Faulting instruction address: 0xc0000000007657cc
> > > Oops: Kernel access of bad area, sig: 11 [#1]
> > > SMP NR_CPUS=1024 NUMA pSeries
> > >
> > > Modules linked in:
> > >
> > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc6+ #48
> > > task: c0000000ba3c0000 ti: c00000013c580000 task.ti: c00000013c580000
> > > NIP: c0000000007657cc LR: c000000000592dbc CTR: 0000000000000400
> > > REGS: c00000013c5836f0 TRAP: 0300 Not tainted (4.2.0-rc6+)
> > > MSR: 8000000000009032 <> CR: 48000048 XER: 00000000
> > > CFAR: 00003fff990f50ec DAR: f0000000003f0020 DSISR: 40000000 SOFTE: 1
> > > GPR00: c000000000592dbc c00000013c583970 c0000000014f0300 00000000003f0000
> > > GPR04: 0000000000000000 c0000000f43b2900 c0000000ba324668 0000000000000001
> > > GPR08: c000000001540300 f000000000000000 f0000000003f0000 0000000000000001
> > > GPR12: 0000000024000084 c00000000ff20000 c00000000000b5b0 0000000000000000
> > > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > GPR24: c00000000188c380 0000000000000000 0000000000014000 c0000000018b54e8
> > > GPR28: c00000013c06e800 000000000000ffff 0000000000000000 000000000000fc00
> > >
> > > NIP [c0000000007657cc] .get_nid_for_pfn+0x2c/0x60
> > > LR [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> > > Call Trace:
> > > [c00000013c583970] [c00000000056e44c] .put_device+0x2c/0x50
> > > [c00000013c5839f0] [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> > > [c00000013c583a80] [c0000000005932b4] .register_one_node+0x2c4/0x380
> > > [c00000013c583b30] [c000000000c882b8] .topology_init+0x44/0x1e0
> > > [c00000013c583bf0] [c00000000000ad30] .do_one_initcall+0x110/0x270
> > > [c00000013c583ce0] [c000000000c845d4] .kernel_init_freeable+0x278/0x360
> > > [c00000013c583db0] [c00000000000b5d4] .kernel_init+0x24/0x130
> > > [c00000013c583e30] [c0000000000094e8] .ret_from_kernel_thread+0x58/0x70
> > >
> > > Fix this by updating the memory block to always contain the right
> > > number of sections instead of assuming sections_per_block.
> > >
> > > Signed-off-by: Bharata B Rao
> > > Cc: Nathan Fontenot
> > > ---
> > >  drivers/base/memory.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > > index 2804aed..7f3ce2e 100644
> > > --- a/drivers/base/memory.c
> > > +++ b/drivers/base/memory.c
> > > @@ -645,6 +645,7 @@ static int add_memory_block(int base_section_nr)
> > >  	if (ret)
> > >  		return ret;
> > >  	mem->section_count = section_count;
> > > +	mem->end_section_nr = mem->start_section_nr + section_count -1;
> >
> > I think this change may be correct but makes me wonder if we need to update
> > code elsewhere. There are places (at least in drivers/base/memory.c) that assume
> > a memory block contains sections_per_block sections.
> >
> > Also, I think you may need to cc GregKH for this patch.
>
> Hi Greg - Do you think the above is the right fix to the problem that is
> described here ?

I have no idea, sorry, I didn't write this code :)
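
For readers following the thread, the arithmetic behind the quoted one-line patch can be sketched in standalone C. This is only an illustration, not kernel code: the helper names (end_assuming_full_block, end_from_section_count, phantom_sections) are hypothetical, and the numbers in the note below are invented rather than taken from the reported system.

```c
#include <assert.h>

/*
 * Illustration only: these helpers are hypothetical and do not exist in
 * drivers/base/memory.c. They mirror the two formulas quoted in the thread.
 */

/* Formula the patch description calls wrong: assume every block is full. */
unsigned long end_assuming_full_block(unsigned long start_section_nr,
				      unsigned long sections_per_block)
{
	return start_section_nr + sections_per_block - 1;
}

/* Formula the patch adds: use the number of sections actually present. */
unsigned long end_from_section_count(unsigned long start_section_nr,
				     unsigned long section_count)
{
	assert(section_count >= 1);
	return start_section_nr + section_count - 1;
}

/*
 * How many trailing section numbers the full-block formula claims beyond
 * the sections that are really present in a partially filled block.
 */
unsigned long phantom_sections(unsigned long sections_per_block,
			       unsigned long section_count)
{
	assert(section_count >= 1 && section_count <= sections_per_block);
	return end_assuming_full_block(0, sections_per_block) -
	       end_from_section_count(0, section_count);
}
```

With made-up values of 16 sections per block, a block starting at section 48 and only 15 sections present, the first formula yields an end section of 63 while the second yields 62. A caller that walks up to the end section would, under the first formula, touch one section that is not there, which is the kind of out-of-range access the patch description attributes the oops to.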