Subject: Re: Sleeping BUG in khugepaged for i586
To: Andrew Morton, Larry Finger
Cc: LKML, linux-mm@kvack.org
From: Vlastimil Babka
Date: Tue, 6 Jun 2017 16:02:10 +0200
In-Reply-To: <20170605144401.5a7e62887b476f0732560fa0@linux-foundation.org>

On 06/05/2017 11:44 PM, Andrew Morton wrote:
> On Sat, 3 Jun 2017 14:24:26 -0500 Larry Finger wrote:
>
>> I recently turned on locking diagnostics for a Dell Latitude D600 laptop, which
>> requires a 32-bit kernel. In the log I found the following:
>>
>> BUG: sleeping function called from invalid context at mm/khugepaged.c:655
>> in_atomic(): 1, irqs_disabled(): 0, pid: 20, name: khugepaged
>> 1 lock held by khugepaged/20:
>> #0: (&mm->mmap_sem){++++++}, at: []
>> collapse_huge_page.isra.47+0x439/0x1240
>> CPU: 0 PID: 20 Comm: khugepaged Tainted: G W

W means there was a WARN earlier. Could be related... Got logs?

>> 4.12.0-rc1-wl-12125-g952a068 #80

What is "wl-12125-g952a068"? What patches are on top of mainline?

>> Hardware name: Dell Computer Corporation Latitude D600
>> /03U652, BIOS A05 05/29/2003
>> Call Trace:
>> dump_stack+0x76/0xb2
>> ___might_sleep+0x174/0x230
>> collapse_huge_page.isra.47+0xacf/0x1240
>> khugepaged_scan_mm_slot+0x41e/0xc00
>> ? _raw_spin_lock+0x46/0x50
>> khugepaged+0x277/0x4f0
>> ? prepare_to_wait_event+0xe0/0xe0
>> kthread+0xeb/0x120
>> ? khugepaged_scan_mm_slot+0xc00/0xc00
>> ? kthread_create_on_node+0x30/0x30
>> ret_from_fork+0x21/0x30
>>
>> I have no idea when this problem was introduced. Of course, I will test any
>> proposed fixes.
>>
>
> Odd. There's nothing wrong with cond_resched() while holding mmap_sem.
> It looks like khugepaged forgot to do a spin_unlock somewhere and we
> leaked a preempt_count.

Hmm, I'd expect such a spin lock to be reported together with mmap_sem in
the debugging "locks held" message?
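
To make the reasoning above concrete, here is a toy userspace model of the accounting involved. It is only a sketch, not kernel code: the class ToyCpu, the lock name "some_spinlock" and the three scenario functions are made up for illustration, and the use of preempt_disable() in the last scenario is an assumption about the kind of path that could raise the count, not a diagnosis of this bug. The model shows that a leaked spinlock would appear in the "locks held" list next to mmap_sem, whereas a path that raises preempt_count without taking a tracked lock produces a report listing mmap_sem alone, which is the shape of the log quoted above.

```python
# Toy userspace model of preempt_count and "locks held" accounting.
# NOT kernel code; all names below are hypothetical and for illustration only.


class ToyCpu:
    def __init__(self):
        self.preempt_count = 0  # nonzero means "atomic context"
        self.held = []          # locks that the debugging code knows about

    def spin_lock(self, name):
        self.preempt_count += 1  # taking a spinlock disables preemption
        self.held.append(name)

    def spin_unlock(self, name):
        self.held.remove(name)
        self.preempt_count -= 1

    def down_read(self, name):
        self.held.append(name)   # sleeping lock: preempt_count is untouched

    def up_read(self, name):
        self.held.remove(name)

    def preempt_disable(self):
        self.preempt_count += 1  # raises the count, but is not a tracked lock

    def preempt_enable(self):
        self.preempt_count -= 1

    def might_sleep(self):
        """Return a report string in atomic context, otherwise None."""
        if self.preempt_count != 0:
            return "BUG: in_atomic(): 1, locks held: " + ", ".join(self.held)
        return None


def scenario_mmap_sem_only():
    # Sleeping while holding only mmap_sem is fine.
    cpu = ToyCpu()
    cpu.down_read("mmap_sem")
    return cpu.might_sleep()


def scenario_leaked_spinlock():
    # A forgotten spin_unlock: the spinlock stays in the held list.
    cpu = ToyCpu()
    cpu.down_read("mmap_sem")
    cpu.spin_lock("some_spinlock")
    return cpu.might_sleep()


def scenario_bare_preempt_disable():
    # The count is raised without any tracked lock being taken.
    cpu = ToyCpu()
    cpu.down_read("mmap_sem")
    cpu.preempt_disable()
    return cpu.might_sleep()


if __name__ == "__main__":
    for fn in (scenario_mmap_sem_only,
               scenario_leaked_spinlock,
               scenario_bare_preempt_disable):
        print(fn.__name__, "->", fn())
```

In this model only the third scenario reports an atomic context while listing mmap_sem as the single held lock.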