Date: Thu, 8 Jun 2017 10:05:57 -0700
From: Matthew Wilcox
To: Michal Hocko
Cc: David Rientjes, Vlastimil Babka, Larry Finger, Andrew Morton, LKML, linux-mm@kvack.org
Subject: Re: Sleeping BUG in khugepaged for i586
Message-ID: <20170608170557.GA8118@bombadil.infradead.org>

On Thu, Jun 08, 2017 at 04:48:31PM +0200, Michal Hocko wrote:
> On Wed 07-06-17 13:56:01, David Rientjes wrote:
> > I agree it's probably going to bisect to 338a16ba15495 since it's the
> > cond_resched() at the line number reported, but I think there must be
> > something else going on. I think the list of locks held by khugepaged is
> > correct because it matches with the implementation. The preempt_count(),
> > as suggested by Andrew, does not. If this is reproducible, I'd like to
> > know what preempt_count() is.
>
> collapse_huge_page
>   pte_offset_map
>     kmap_atomic
>       kmap_atomic_prot
>         preempt_disable
>   __collapse_huge_page_copy
>   pte_unmap
>     kunmap_atomic
>       __kunmap_atomic
>         preempt_enable
>
> I suspect, so cond_resched seems indeed inappropriate on 32b systems.
Then why doesn't it trigger on 64-bit systems too?

#ifndef ARCH_HAS_KMAP
...
static inline void *kmap_atomic(struct page *page)
{
	preempt_disable();
	pagefault_disable();
	return page_address(page);
}
#define kmap_atomic_prot(page, prot)	kmap_atomic(page)

... oh, wait, I see.  Because pte_offset_map() doesn't call kmap_atomic()
on 64-bit.  Indeed, it doesn't necessarily call kmap_atomic() on 32-bit
either; only with CONFIG_HIGHPTE enabled.

How much of a performance penalty would it be to call kmap_atomic()
unconditionally on 64-bit to make sure that this kind of problem doesn't
show on 32-bit systems only?
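
To make the asymmetry concrete, here is a minimal userspace sketch of the
reasoning above. It is not kernel code: every model_* name is hypothetical,
the preempt counter is a plain integer, and the highpte flag merely stands in
for CONFIG_HIGHPTE. It assumes, as the quoted call chain suggests, that
cond_resched() runs between pte_offset_map() and pte_unmap().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's per-task preempt counter. */
static int model_preempt_count;

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/*
 * Models pte_offset_map(): only the highpte case goes through
 * kmap_atomic() and therefore disables preemption.
 */
static void model_pte_offset_map(bool highpte)
{
	if (highpte)
		model_preempt_disable();
}

/* Models pte_unmap(): undoes whatever the map step did. */
static void model_pte_unmap(bool highpte)
{
	if (highpte)
		model_preempt_enable();
}

/*
 * Models the check behind the "sleeping function called from invalid
 * context" report: a reschedule point is only legal when the counter
 * is zero.
 */
static bool model_cond_resched_would_warn(void)
{
	return model_preempt_count != 0;
}

/* Map, hit a reschedule point, unmap; report whether it would warn. */
static bool model_collapse_copy_warns(bool highpte)
{
	bool warned;

	model_pte_offset_map(highpte);
	warned = model_cond_resched_would_warn();
	model_pte_unmap(highpte);
	return warned;
}
```

In this model, model_collapse_copy_warns(true) reports a warning and
model_collapse_copy_warns(false) does not, which mirrors why the bug only
shows up on configurations where the PTE mapping goes through kmap_atomic().
Calling kmap_atomic() unconditionally would correspond to always taking the
highpte branch, so the same mistake would be caught everywhere.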