Subject: Re: [PATCH v2] mm/ksm: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm
From: Suzuki K Poulose
To: Andrew Morton, Jia He
Cc: Andrea Arcangeli, Minchan Kim, Claudio Imbrenda, Arvind Yadav, Mike Rapoport, linux-mm@kvack.org, linux-kernel@vger.kernel.org, jia.he@hxt-semitech.com, Hugh Dickins
References: <20180503124415.3f9d38aa@p-imbrenda.boeblingen.de.ibm.com> <1525403506-6750-1-git-send-email-hejianet@gmail.com> <20180509163101.02f23de1842a822c61fc68ff@linux-foundation.org> <2cd6b39b-1496-bbd5-9e31-5e3dcb31feda@arm.com>
Message-ID: <6c417ab1-a808-72ea-9618-3d76ec203684@arm.com>
Date: Thu, 24 May 2018 09:44:16 +0100
In-Reply-To:
 <2cd6b39b-1496-bbd5-9e31-5e3dcb31feda@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 14/05/18 10:45, Suzuki K Poulose wrote:
> On 10/05/18 00:31, Andrew Morton wrote:
>> On Fri,  4 May 2018 11:11:46 +0800 Jia He wrote:
>>
>>> In our armv8a server(QDF2400), I noticed lots of WARN_ON caused by PAGE_SIZE
>>> unaligned for rmap_item->address under memory pressure tests(start 20 guests
>>> and run memhog in the host).
>>>
>>> ...
>>>
>>> In rmap_walk_ksm, the rmap_item->address might still have the STABLE_FLAG,
>>> then the start and end in handle_hva_to_gpa might not be PAGE_SIZE aligned.
>>> Thus it will cause exceptions in handle_hva_to_gpa on arm64.
>>>
>>> This patch fixes it by ignoring(not removing) the low bits of address when
>>> doing rmap_walk_ksm.
>>>
>>> Signed-off-by: jia.he@hxt-semitech.com
>>
>> I assumed you wanted this patch to be committed as
>> From:jia.he@hxt-semitech.com rather than From:hejianet@gmail.com, so I
>> made that change.  Please let me know if this was inappropriate.
>>
>> You can do this yourself by adding an explicit From: line to the very
>> start of the patch's email text.
>>
>> Also, a storm of WARN_ONs is pretty poor behaviour.  Is that the only
>> misbehaviour which this bug causes?  Do you think the fix should be
>> backported into earlier kernels?
>>

Jia, Andrew,

What is the status of this patch?

Suzuki

>
> I think its just not the WARN_ON(). We do more than what is probably
> intended with an unaligned address. i.e, We could be modifying the
> flags for other pages that were not affected.
>
> e.g :
>
> In the original report [0], the trace looked like :
>
>
> [  800.511498] [] kvm_age_hva_handler+0xcc/0xd4
> [  800.517324] [] handle_hva_to_gpa+0xec/0x15c
> [  800.523063] [] kvm_age_hva+0x5c/0xcc
> [  800.528194] [] kvm_mmu_notifier_clear_flush_young+0x54/0x90
> [  800.535324] [] __mmu_notifier_clear_flush_young+0x6c/0xa8
> [  800.542279] [] page_referenced_one+0x1e0/0x1fc
> [  800.548279] [] rmap_walk_ksm+0x124/0x1a0
> [  800.553759] [] rmap_walk+0x94/0x98
> [  800.558717] [] page_referenced+0x120/0x180
> [  800.564369] [] shrink_active_list+0x218/0x4a4
> [  800.570281] [] shrink_node_memcg+0x58c/0x6fc
> [  800.576107] [] shrink_node+0xe4/0x328
> [  800.581325] [] do_try_to_free_pages+0xe4/0x3b8
> [  800.587324] [] try_to_free_pages+0x124/0x234
> [  800.593150] [] __alloc_pages_nodemask+0x564/0xf7c
> [  800.599412] [] khugepaged_alloc_page+0x38/0xb8
> [  800.605411] [] collapse_huge_page+0x74/0xd70
> [  800.611238] [] khugepaged_scan_mm_slot+0x654/0xa98
> [  800.617585] [] khugepaged+0x2bc/0x49c
> [  800.622803] [] kthread+0x124/0x150
> [  800.627762] [] ret_from_fork+0x10/0x1c
> [  800.633066] ---[ end trace 944c130b5252fb01 ]---
>
> Now, the ksm wants to mark *a page* as referenced via page_referenced_one(),
> passing it an unaligned address. This could eventually turn out to be
> one of :
>
> ptep_clear_flush_young_notify(address, address + PAGE_SIZE)
>
> or
>
> pmdp_clear_flush_young_notify(address, address + PMD_SIZE)
>
> which now spans two pages/pmds and the notifier consumer might
> take an action on the second page as well, which is not something
> intended. So, I do think that old behavior is wrong and has other
> side effects as mentioned above.
>
> [0] https://lkml.kernel.org/r/1525244911-5519-1-git-send-email-hejianet@gmail.com
>
> Suzuki
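
For illustration only, the range arithmetic described in the quoted text can be sketched in userspace C. This is not the patch and not kernel code. The 4 KiB page size is an assumption, the 0x200 flag value is what I recall mm/ksm.c using for STABLE_FLAG and should be treated as an assumption too, and the SKETCH_* macros and both helper names are hypothetical.

```c
#include <assert.h>

/* Assumed for illustration: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_SIZE   (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK   (~(SKETCH_PAGE_SIZE - 1))

/* Assumed flag value kept in the low bits of rmap_item->address. */
#define SKETCH_STABLE_FLAG 0x200UL

/* Hypothetical helper: number of pages overlapped by [start, start + len). */
static unsigned long sketch_pages_spanned(unsigned long start, unsigned long len)
{
	assert(len > 0);

	unsigned long first = start >> SKETCH_PAGE_SHIFT;
	unsigned long last = (start + len - 1) >> SKETCH_PAGE_SHIFT;

	return last - first + 1;
}

/*
 * Hypothetical helper: ignore the low flag bits for the walk without
 * modifying the stored value, which is the idea the patch describes.
 */
static unsigned long sketch_walk_address(unsigned long rmap_address)
{
	return rmap_address & SKETCH_PAGE_MASK;
}
```

With a page-aligned address that still carries the flag, sketch_pages_spanned(addr, SKETCH_PAGE_SIZE) reports two pages, while the same call on sketch_walk_address(addr) reports one. That is the "second page" a notifier consumer could end up acting on.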