Date: Wed, 29 Jul 2015 09:51:02 -0400
From: Johannes Weiner
To: Josh Boyer
Cc: Ming Lei, Tejun Heo, Jens Axboe, "Linux-Kernel@Vger. Kernel. Org"
Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848
Message-ID: <20150729135102.GA11889@cmpxchg.org>

On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
> Hi All,
>
> We've gotten a report[1] that any of the upcoming Fedora 23 install
> images are all failing on 32-bit VMs/machines. Looking at the first
> instance of the oops, it seems to be a bad page state where a page is
> still charged to a group and it is trying to be freed. The oops
> output is below.
>
> Has anyone seen this in their 32-bit testing at all? Thus far nobody
> can recreate this on a 64-bit machine/VM.
>
> josh
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>
> [    9.026738] systemd[1]: Switching root.
> [    9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
> [    9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
> [    9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
> [    9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
> [    9.087284] page dumped because: page still charged to cgroup
> [    9.088772] bad because of flags:
> [    9.089731] flags: 0x21(locked|lru)
> [    9.090818] page->mem_cgroup:f2c3e400

It's also still locked and on the LRU. This page shouldn't have been
freed.

> [    9.117848] Call Trace:
> [    9.118738] [] dump_stack+0x41/0x52
> [    9.120034] [] bad_page.part.80+0xaa/0x100
> [    9.121461] [] free_pages_prepare+0x3b9/0x3f0
> [    9.122934] [] free_hot_cold_page+0x22/0x160
> [    9.124400] [] ? copy_to_iter+0x1af/0x2a0
> [    9.125750] [] ? mempool_free_slab+0x13/0x20
> [    9.126840] [] __free_pages+0x37/0x50
> [    9.127849] [] mempool_free_pages+0xd/0x10
> [    9.128908] [] mempool_free+0x26/0x80
> [    9.129895] [] bounce_end_io+0x56/0x80

The page state looks completely off for a bounce buffer page. Did
somebody mess with a bounce bio's bv_page?

> [    9.130923] [] bounce_end_io_read+0x32/0x40
> [    9.131973] [] bio_endio+0x56/0x90
> [    9.132953] [] blk_update_request+0x87/0x310
> [    9.134042] [] ? kvm_clock_read+0x17/0x20
> [    9.135103] [] ? sched_clock+0x8/0x10
> [    9.136100] [] blk_mq_end_request+0x16/0x60
> [    9.136912] [] __blk_mq_complete_request+0x9d/0xd0
> [    9.137730] [] blk_mq_complete_request+0x15/0x20
> [    9.138515] [] loop_handle_cmd.isra.23+0x5d/0x8c0 [loop]
> [    9.139390] [] ? pick_next_task_fair+0xa63/0xbb0
> [    9.140202] [] loop_queue_read_work+0x10/0x12 [loop]
> [    9.141043] [] process_one_work+0x145/0x380
> [    9.141779] [] worker_thread+0x39/0x430
> [    9.142524] [] ? process_one_work+0x380/0x380
> [    9.143303] [] kthread+0xa6/0xc0
> [    9.143936] [] ret_from_kernel_thread+0x21/0x30
> [    9.144742] [] ? kthread_worker_fn+0x130/0x130
> [    9.145529] Disabling lock debugging due to kernel taint
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/