Date: Mon, 5 Jun 2017 14:44:01 -0700
From: Andrew Morton
To: Larry Finger
Cc: LKML, linux-mm@kvack.org
Subject: Re: Sleeping BUG in khugepaged for i586
Message-Id: <20170605144401.5a7e62887b476f0732560fa0@linux-foundation.org>
In-Reply-To: <968ae9a9-5345-18ca-c7ce-d9beaf9f43b6@lwfinger.net>
References: <968ae9a9-5345-18ca-c7ce-d9beaf9f43b6@lwfinger.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 3 Jun 2017 14:24:26 -0500 Larry Finger wrote:

> I recently turned on locking diagnostics for a Dell Latitude D600 laptop, which
> requires a 32-bit kernel. In the log I found the following:
>
> BUG: sleeping function called from invalid context at mm/khugepaged.c:655
> in_atomic(): 1, irqs_disabled(): 0, pid: 20, name: khugepaged
> 1 lock held by khugepaged/20:
>  #0: (&mm->mmap_sem){++++++}, at: [] collapse_huge_page.isra.47+0x439/0x1240
> CPU: 0 PID: 20 Comm: khugepaged Tainted: G W 4.12.0-rc1-wl-12125-g952a068 #80
> Hardware name: Dell Computer Corporation Latitude D600 /03U652, BIOS A05 05/29/2003
> Call Trace:
>  dump_stack+0x76/0xb2
>  ___might_sleep+0x174/0x230
>  collapse_huge_page.isra.47+0xacf/0x1240
>  khugepaged_scan_mm_slot+0x41e/0xc00
>  ? _raw_spin_lock+0x46/0x50
>  khugepaged+0x277/0x4f0
>  ? prepare_to_wait_event+0xe0/0xe0
>  kthread+0xeb/0x120
>  ? khugepaged_scan_mm_slot+0xc00/0xc00
>  ? kthread_create_on_node+0x30/0x30
>  ret_from_fork+0x21/0x30
>
> I have no idea when this problem was introduced. Of course, I will test any
> proposed fixes.
>

Odd. There's nothing wrong with cond_resched() while holding mmap_sem.
It looks like khugepaged forgot to do a spin_unlock somewhere and we
leaked a preempt_count.
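
To make the reasoning above concrete, here is a minimal userspace sketch of how a leaked preempt count could produce this kind of report. It is a toy model, not kernel code: every toy_* name is hypothetical, and the only kernel behaviour it assumes is that taking a spinlock raises the preempt count, that a sleeping lock such as mmap_sem does not, and that the might-sleep check complains whenever the count is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the per-task preempt counter (hypothetical, not the kernel's). */
static int toy_preempt_count;

/* A spinlock keeps preemption disabled for as long as it is held. */
static void toy_spin_lock(void)
{
    toy_preempt_count++;
}

static void toy_spin_unlock(void)
{
    assert(toy_preempt_count > 0);
    toy_preempt_count--;
}

/* A sleeping lock (modelled on mmap_sem) leaves the counter untouched. */
static void toy_down_read(void) { }
static void toy_up_read(void) { }

static bool toy_in_atomic(void)
{
    return toy_preempt_count != 0;
}

/* Returns true if sleeping is allowed here; otherwise prints a warning. */
static bool toy_might_sleep(const char *where)
{
    if (toy_in_atomic()) {
        printf("BUG: sleeping function called from invalid context at %s\n",
               where);
        printf("in_atomic(): %d\n", toy_in_atomic());
        return false;
    }
    return true;
}

/* Case 1: only the sleeping lock is held, so the check passes. */
bool toy_resched_under_rwsem(void)
{
    toy_down_read();
    bool ok = toy_might_sleep("toy_resched_under_rwsem");
    toy_up_read();
    return ok;
}

/*
 * Case 2: some earlier path took a spinlock and never released it
 * (toy_spin_unlock() is deliberately never called here). The counter
 * stays at 1, so a later check fires even though the only lock visibly
 * held at that point is the sleeping one.
 */
bool toy_resched_after_leak(void)
{
    toy_spin_lock();   /* hypothetical path that forgets toy_spin_unlock() */

    toy_down_read();
    bool ok = toy_might_sleep("toy_resched_after_leak");
    toy_up_read();
    return ok;
}
```

Calling toy_resched_under_rwsem() and then toy_resched_after_leak() from a main() shows the pattern in the report: the first call is silent, while the second prints the warning with in_atomic() at 1 although the only lock still held is the sleeping one. The sketch says nothing about where khugepaged actually drops the unlock.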