Date: Tue, 17 Aug 2021 13:03:50 -0500
From: Segher Boessenkool
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    userm57@yahoo.com, fthain@linux-m68k.org,
    linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP
Message-ID: <20210817180350.GH1583@gate.crashing.org>
References: <1d28441dd80845e6428d693c0724cb6457247466.1629211378.git.christophe.leroy@csgroup.eu>
    <20210817162239.GF1583@gate.crashing.org>
    <0426a0d3-bdc6-1a34-1018-71b34282a6c6@csgroup.eu>
In-Reply-To: <0426a0d3-bdc6-1a34-1018-71b34282a6c6@csgroup.eu>
User-Agent: Mutt/1.4.2.3i
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi!

On Tue, Aug 17, 2021 at 07:13:44PM +0200, Christophe Leroy wrote:
> On 17/08/2021 at 18:22, Segher Boessenkool wrote:
> >On Tue, Aug 17, 2021 at 02:43:15PM +0000, Christophe Leroy wrote:
> >>Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
> >>removed the 'isync' instruction after adding/removing NX bit in user
> >>segments. The reasoning behind this change was that when setting the
> >>NX bit we don't mind it taking effect with delay as the kernel never
> >>executes text from userspace, and when clearing the NX bit this is
> >>to return to userspace and then the 'rfi' should synchronise the
> >>context.
> >>
> >>However, it looks like on book3s/32 having a hash page table, at least
> >>on the G3 processor, we get an unexpected fault from userspace, then
> >>this is followed by something wrong in the verification of MSR_PR
> >>at end of another interrupt.
> >>
> >>This is fixed by adding back the removed isync() following update
> >>of NX bit in user segment registers. Only do it for cores with an
> >>hash table, as 603 cores don't exhibit that problem and the two isync
> >>increase ./null_syscall selftest by 6 cycles on an MPC 832x.
> >>
> >>First problem: unexpected PROTFAULT
> >>
> >> [ 62.896426] WARNING: CPU: 0 PID: 1660 at
> >> arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0
> >> [ 62.918111] Modules linked in:
> >> [ 62.923350] CPU: 0 PID: 1660 Comm: Xorg Not tainted
> >> 5.13.0-pmac-00028-gb3c15b60339a #40
> >> [ 62.943476] NIP: c001b5c8 LR: c001b6f8 CTR: 00000000
> >> [ 62.954714] REGS: e2d09e40 TRAP: 0700 Not tainted
> >> (5.13.0-pmac-00028-gb3c15b60339a)
> >
> >That is not a protection fault. What causes this?
>
> That's the WARN_ON(error_code & DSISR_PROTFAULT) at
>
> https://elixir.bootlin.com/linux/v5.13/source/arch/powerpc/mm/fault.c#L354

Ah okay.
How confusing :-/

> >A CSI (like isync) is required both before and after mtsr. It may work
> >on some cores without -- what part of that is luck, if there is anything
> >that guarantees it, is anyone's guess :-/
>
> kuep_lock() is called when entering interrupts, it means we recently got an
> 'rfi' to re-enable MMU.
> kuep_unlock() is called when exit interrupts, it means we are soon going to
> call 'rfi' to go back to user.
>
> In between, nobody is going to exec any userspace code, so who minds that
> the 'mtsr' changing user segments is not completely finished ?

Hey, that is my question! :-) So why does this not work on 750 then?

> >>@@ -28,6 +30,8 @@ static inline void kuep_lock(void)
> >>         return;
> >>
> >>     update_user_segments(mfsr(0) | SR_NX);
> >>+    if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
> >>+        isync();    /* Context sync required after mtsr() */
> >> }
> >
> >This needs a comment why you are not doing this for systems without
> >hardware page table walk, at the least?
>
> Ok, will add a comment tomorrow.

Thanks!


Segher