Date: Mon, 23 Mar 2020 15:07:06 -0300
From: Jason Gunthorpe
To: Mike Kravetz
Cc: "Longpeng (Mike)", akpm@linux-foundation.org, kirill.shutemov@linux.intel.com, linux-kernel@vger.kernel.org, arei.gonglei@huawei.com, weidong.huang@huawei.com, weifuqiang@huawei.com, kvm@vger.kernel.org, linux-mm@kvack.org, Matthew Wilcox, Sean Christopherson, stable@vger.kernel.org
Subject: Re: [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset()
Message-ID: <20200323180706.GC20941@ziepe.ca>

On Mon, Mar 23, 2020 at 10:27:48AM -0700, Mike Kravetz wrote:

> >  	pgd = pgd_offset(mm, addr);
> > -	if (!pgd_present(*pgd))
> > +	if (!pgd_present(READ_ONCE(*pgd)))
> >  		return NULL;
> >  	p4d = p4d_offset(pgd, addr);
> > -	if (!p4d_present(*p4d))
> > +	if (!p4d_present(READ_ONCE(*p4d)))
> >  		return NULL;
> >
> >  	pud = pud_offset(p4d, addr);
>
> One would argue that pgd and p4d can not change from present to !present
> during the execution of this code. To me, that seems like the issue which
> would cause an issue. Of course, I could be missing something.

This I am not sure of. I think it must be true under the read side of
the mmap_sem, but it is probably not guaranteed under RCU.

In any case, it doesn't matter: the fact that *p4d can change at all
is problematic. Unwinding the above inlines we get:

  p4d = p4d_offset(pgd, addr);
  if (!p4d_present(*p4d))
      return NULL;
  pud = (pud_t *)p4d_page_vaddr(*p4d) + pud_index(addr);

According to our memory model the compiler/CPU is free to execute this
as:

  p4d = p4d_offset(pgd, addr);
  p4d_for_vaddr = *p4d;
  if (!p4d_present(*p4d))
      return NULL;
  pud = (pud_t *)p4d_page_vaddr(p4d_for_vaddr) + pud_index(addr);

In the case where the p4d goes from !present -> present (i.e.
handle_mm_fault()):

p4d_for_vaddr == p4d_none, and p4d_present(*p4d) == true, meaning the
p4d_page_vaddr() will crash.

Basically the problem here is not just the missing READ_ONCE, but that
the p4d is read multiple times at all. It should be written like
gup_fast does, to guarantee a single CPU read of the unstable data:

  p4d = READ_ONCE(*p4d_offset(pgdp, addr));
  if (!p4d_present(p4d))
      return NULL;
  pud = pud_offset(&p4d, addr);

At least this is what I've been able to figure out :\

> > Also, the remark about pmd_offset() seems accurate. The
> > get_user_fast_pages() pattern seems like the correct one to emulate:
> >
> >   pud = READ_ONCE(*pudp);
> >   if (pud_none(pud))
> >        ..
> >   if (!pud_'is a pmd pointer')
> >        ..
> >   pmdp = pmd_offset(&pud, address);
> >   pmd = READ_ONCE(*pmd);
> >   [...]
> >
> > Passing &pud in avoids another de-reference of the pudp.
> > Honestly all these APIs that take in page table pointers and internally
> > de-reference them seem very hard to use correctly when the page table
> > access isn't fully locked against write.

And the same protocol for the PUD, etc.

> > It looks like at least the p4d read from the pgd is also unlocked here
> > as handle_mm_fault() writes to it??
>
> Yes, there is no locking required to call huge_pte_offset().

None? Not RCU or read mmap_sem?

Jason
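
For illustration only, the "read the entry once, then work from the
snapshot" pattern discussed above can be sketched in user-space C. This
is not kernel code and not the proposed patch: entry_t, ENTRY_PRESENT,
entry_present(), entry_table() and walk_one_level() are hypothetical
names made up for the sketch, and a C11 atomic load is only assumed to
be a rough stand-in for READ_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a page-table entry:
 * bit 0 means "present", the remaining bits hold the next table's address. */
typedef uintptr_t entry_t;
#define ENTRY_PRESENT ((entry_t)1)

static int entry_present(entry_t e) { return (e & ENTRY_PRESENT) != 0; }
static long *entry_table(entry_t e) { return (long *)(e & ~ENTRY_PRESENT); }

/* Walk one level. The shared slot is loaded exactly once; the presence
 * check and the next-level pointer are both derived from that snapshot,
 * so a concurrent writer cannot make them disagree. */
static long *walk_one_level(_Atomic entry_t *slot, size_t index)
{
	entry_t snap = atomic_load_explicit(slot, memory_order_acquire);

	if (!entry_present(snap))
		return NULL;
	return entry_table(snap) + index;
}

/* Small single-threaded usage example. */
void example(void)
{
	static long table[4] = {10, 20, 30, 40};
	_Atomic entry_t slot = 0;	/* starts out !present */

	assert(walk_one_level(&slot, 2) == NULL);

	/* A writer installs the table, as a fault handler might. */
	atomic_store_explicit(&slot, (entry_t)table | ENTRY_PRESENT,
			      memory_order_release);
	assert(walk_one_level(&slot, 2) == &table[2]);
}
```

The point of the sketch is that walk_one_level() never goes back to
*slot after taking the snapshot, which is the same reason for passing
&p4d (a local copy) rather than the page-table pointer to the next
level's helper.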