From: Ashutosh Dixit
To: Toshi Kani
Cc: linux-kernel@vger.kernel.org, "Dutt, Sudeep", "Rao, Nikhil",
    "Williams, Dan J"
Subject: Re: Regression in v4.2-rc1: vmalloc_to_page with ioremap
Date: Mon, 20 Jul 2015 11:33:02 -0700
In-Reply-To: <1437407942.3214.159.camel@hp.com> (Toshi Kani's message of
    "Mon, 20 Jul 2015 09:59:02 -0600")

On Mon, Jul 20 2015 at 08:59:02 AM, Toshi Kani wrote:

> Can you also try the 'nohugeiomap' kernel option to the 4.2-rc1 kernel
> to see if the problem goes away?

Yes, the problem goes away with 'nohugeiomap'.

> Yes, vmalloc_to_page() assumes 4KB mappings. But ioremap with huge
> pages was enabled in v4.1, while v4.2-rc1 has update for the check with
> MTRRs.

Yes, we had bisected the failure down to the following commit; the
problem goes away if this commit is reverted. So we think that without
this commit ioremap actually maps device memory with 4K pages, which is
why we don't see the issue there.

    commit b73522e0c1be58d3c69b124985b8ccf94e3677f7
    Author: Toshi Kani
    Date:   Tue May 26 10:28:10 2015 +0200

        x86/mm/mtrr: Enhance MTRR checks in kernel mapping helpers

> Can you send me outputs of the following files?
> If the driver fails to load in v4.2-rc1, you can obtain the info in
> v4.1.
>
> /proc/mtrr
> /proc/iomem
> /proc/vmallocinfo
> /sys/kernel/debug/kernel_page_tables (need CONFIG_X86_PTDUMP set)

Since the outputs are large, I have sent them to you in a separate mail
outside the mailing list.

> Also, does the driver map a regular memory range with ioremap? If not,
> how does 'struct page' get allocated for the range (since
> vmalloc_to_page returns a page pointer)?

No, the driver does not map regular memory with ioremap, only device
memory. vmalloc_to_page was returning a valid 'struct page' in this
case too. It appears it can do this correctly using pte_page as long as
all four page-table levels (pgd, pud, pmd, pte) are present; the
problem seems to be happening because, with huge pages, they are not.
For us the BAR size is 8 GB, so we think the new ioremap maps the BARs
using 1 GB pages.

Here is the call stack for the crash:

[47360.050724] BUG: unable to handle kernel paging request at ffffc47e00000000
[47360.050965] IP: [] vmalloc_to_page+0x6c/0xb0
[47360.051136] PGD 0
[47360.051288] Oops: 0000 [#1] SMP
[47360.059112] Workqueue: SCIF INTR 2 scif_intr_bh_handler [scif]
[47360.059481] task: ffff88042d659d80 ti: ffff88042d664000 task.ti: ffff88042d664000
[47360.059986] RIP: 0010:[]  [] vmalloc_to_page+0x6c/0xb0
[47360.060568] RSP: 0018:ffff88042d667bb8  EFLAGS: 00010206
[47360.060863] RAX: 00003c7e00000000 RBX: ffffc90040000000 RCX: 00003ffffffff000
[47360.061165] RDX: 00003c7e00000000 RSI: ffff880000000000 RDI: ffffc90040000000
[47360.061466] RBP: ffff88042d667bb8 R08: 0000000000000000 R09: ffff880679100000
[47360.061766] R10: 0000000000000000 R11: 0000000000000008 R12: 0000000000040000
[47360.062066] R13: 0000000000008000 R14: ffff880679100000 R15: ffff880679100000
[47360.062367] FS:  0000000000000000(0000) GS:ffff88083f600000(0000) knlGS:0000000000000000
[47360.062873] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[47360.063169] CR2: ffffc47e00000000 CR3: 0000000001c0e000 CR4: 00000000000407e0
[47360.063468] Stack:
[47360.063754]  ffff88042d667c08 ffffffffa054c141 0000000000000000 0000000000040000
[47360.064565]  ffff88042d667be8 ffff88082a6ebd20 0000000000000020 0000000000008000
[47360.065389]  ffff88067f140228 ffff88082993a000 ffff88042d667c88 ffffffffa054d068
[47360.066212] Call Trace:
[47360.066530]  [] scif_p2p_setsg+0xb1/0xd0 [scif]
[47360.066835]  [] scif_init_p2p_info+0xe8/0x290 [scif]
[47360.067171]  [] scif_init+0x16a/0x2c0 [scif]
[47360.067487]  [] ? pick_next_task_fair+0x42a/0x4f0
[47360.067817]  [] scif_nodeqp_msg_handler+0x42/0x80 [scif]
[47360.068122]  [] scif_nodeqp_intrhandler+0x32/0x70 [scif]
[47360.068453]  [] scif_intr_bh_handler+0x25/0x40 [scif]
[47360.068759]  [] process_one_work+0x143/0x410
[47360.069087]  [] worker_thread+0x11b/0x4b0
[47360.069387]  [] ? __schedule+0x309/0x880
[47360.069713]  [] ? process_one_work+0x410/0x410
[47360.070013]  [] ? process_one_work+0x410/0x410
[47360.070334]  [] kthread+0xcc/0xf0
[47360.070639]  [] ? schedule_tail+0x1e/0xc0
[47360.070961]  [] ? kthread_freezable_should_stop+0x70/0x70
[47360.071266]  [] ret_from_fork+0x3f/0x70
[47360.071584]  [] ? kthread_freezable_should_stop+0x70/0x70
[47360.077579] RIP  [] vmalloc_to_page+0x6c/0xb0
[47360.077979]  RSP

And gdb points to the following source:

(gdb) list *vmalloc_to_page+0x6c
0xffffffff8119e50c is in vmalloc_to_page (mm/vmalloc.c:246).
241              */
242             VIRTUAL_BUG_ON(!is_vmalloc_or_module_addr(vmalloc_addr));
243
244             if (!pgd_none(*pgd)) {
245                     pud_t *pud = pud_offset(pgd, addr);
246                     if (!pud_none(*pud)) {
247                             pmd_t *pmd = pmd_offset(pud, addr);
248                             if (!pmd_none(*pmd)) {
249                                     pte_t *ptep, pte;
250

Thanks,
Ashutosh