Date: Mon, 11 Oct 2021 10:38:18 +0100
From: Mark Rutland
To: Zhaoyang Huang
Cc: Will Deacon, Catalin Marinas, Suzuki K Poulose, Ionela Voinescu,
    Quentin Perret, Vladimir Murzin, linux-arm-kernel@lists.infradead.org,
    Zhaoyang Huang, LKML, Ke Wang, ping.zhou1@unisoc.com
Subject: Re: [RFC PATCH] arch: ARM64: add isb before enable pan
Message-ID: <20211011093803.GA1421@C02TD0UTHF1T.local>
References: <1633673269-15048-1-git-send-email-huangzhaoyang@gmail.com>
    <20211008080113.GA441@willie-the-truck>

Hi,

On Fri, Oct 08, 2021 at 04:34:12PM +0800, Zhaoyang Huang wrote:
> On Fri, Oct 8, 2021 at 4:01 PM Will
> Deacon wrote:
> > On Fri, Oct 08, 2021 at 02:07:49PM +0800, Huangzhaoyang wrote:
> > > From: Zhaoyang Huang
> > >
> > > set_pstate_pan failure is observed in an ARM64 system occasionally on a reboot
> > > test, which can be worked around by a msleep on the sw context. We assume
> > > suspicious on disorder of previous instr of disabling SW_PAN and add an isb here.
> > >
> > > PS:
> > > The bootup test failed with an invalid TTBR1_EL1 that equals 0x34000000, which is
> > > alike racing between on chip PAN and SW_PAN.
> >
> > Sorry, but I'm struggling to understand the problem here. Please could you
> > explain it in more detail?
> >
> >   - Why does a TTBR1_EL1 value of `0x34000000` indicate a race?
> >   - Can you explain the race that you think might be occurring?
> >   - Why does an ISB prevent the race?
> Please find panic logs[1], related codes[2], sample of debug patch[3]
> below. TTBR1_EL1 equals 0x34000000 when panic

Just to check, how do you know the value of TTBR1_EL1 was 0x34000000? That
isn't in the log sample below -- was that from the output of show_pte(), an
external debugger, or something else?

I'm assuming from the "(ptrval)" bits below that can't have been from
show_pte().

> and can NOT be captured
> by the debug patch during retest (all entrances that msr ttbr1_el1 are
> under watch) which should work. Adding ISB here to prevent race on
> TTBR1 from previous access of sysregs which can affect the msr
> result(the test is still ongoing). Could the race be
> ARM64_HAS_PAN(automated by core) and SW_PAN.
>
> [1]
> [ 0.348000] [0: migration/0: 11] Synchronous External Abort:
> level 1 (translation table walk) (0x96000055) at 0xffffffc000e06004
> [ 0.352000] [0: migration/0: 11] Internal error: : 96000055
> [#1] PREEMPT SMP
> [ 0.352000] [0: migration/0: 11] Modules linked in:
> [ 0.352000] [0: migration/0: 11] Process migration/0 (pid:
> 11, stack limit = 0x (ptrval))
> [ 0.352000] [0: migration/0: 11] CPU: 0 PID: 11 Comm:
> migration/0 Tainted: G S

Assuming I've read the `taint_flags` table correctly, that 'S' is
`TAINT_CPU_OUT_OF_SPEC`, for which we should dump warnings at boot time.
The 'G' indicates the absence of proprietary modules.

Can you provide a full dmesg for a failed boot, please?

Have you made any changes to arch/arm64/kernel/cpufeature.c?

Are you able to test with a mainline kernel?

> 4.14.199-22631304-abA035FXXU0AUJ4_T4 #2
>
> [ 0.352000] [0: migration/0: 11] Hardware name: Spreadtrum
> UMS9230 1H10 SoC (DT)
> [ 0.352000] [0: migration/0: 11] task: (ptrval)
> task.stack: (ptrval)
> [ 0.352000] [0: migration/0: 11] pc : patch_alternative+0x68/0x27c
> [ 0.352000] [0: migration/0: 11] lr :
> __apply_alternatives.llvm.7450387295891320208+0x60/0x160
>
> [2]
> __apply_alternatives
>     for()
>         patch_alternative <----panic here in the 2nd round of loop
> after invoking flush_icache_range
>         flush_icache_range
>
> [3]
>     sub \tmp1, \tmp1, #SWAPPER_DIR_SIZE
> +   tst \tmp1, #0xffff80000000 // check ttbr1_el1 valid
> +   b.le .

What are you trying to detect here? This is testing both the ASID and
BADDR[47] bits, so I don't understand the rationale.

Thanks,
Mark.
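
For reference, the syndrome value printed in the panic log above (0x96000055) can be unpacked by hand. What follows is a minimal sketch in C, assuming the Armv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25) and, for data aborts, WnR in ISS bit 6 and DFSC in ISS bits [5:0]. The helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; these are not kernel APIs. */

/* Extract bits [hi:lo] of a 32-bit syndrome value. */
static inline uint32_t esr_field(uint32_t esr, unsigned int hi, unsigned int lo)
{
	assert(hi < 32 && lo <= hi);

	uint32_t width = hi - lo + 1;
	uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);

	return (esr >> lo) & mask;
}

/* Exception Class: ESR_ELx bits [31:26]. */
static inline uint32_t esr_ec(uint32_t esr)
{
	return esr_field(esr, 31, 26);
}

/* Instruction Length bit: ESR_ELx bit 25. */
static inline uint32_t esr_il(uint32_t esr)
{
	return esr_field(esr, 25, 25);
}

/* Data aborts only: Write-not-Read, ISS bit 6. */
static inline uint32_t esr_wnr(uint32_t esr)
{
	return esr_field(esr, 6, 6);
}

/* Data aborts only: Data Fault Status Code, ISS bits [5:0]. */
static inline uint32_t esr_dfsc(uint32_t esr)
{
	return esr_field(esr, 5, 0);
}
```

Applied to 0x96000055 this gives EC = 0x25 (data abort taken without a change in exception level), IL = 1, WnR = 1 and DFSC = 0x15 (synchronous external abort on translation table walk, level 1), which agrees with the "level 1 (translation table walk)" text in the log.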
>     msr ttbr1_el1, \tmp1 // set reserved ASID
> > >
> > > Signed-off-by: Zhaoyang Huang
> > > ---
> > >  arch/arm64/kernel/cpufeature.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index efed283..3c0de0d 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -1663,6 +1663,7 @@ static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
> > >       WARN_ON_ONCE(in_interrupt());
> > >
> > >       sysreg_clear_set(sctlr_el1, SCTLR_EL1_SPAN, 0);
> > > +     isb();
> > >       set_pstate_pan(1);
> >
> > SCTLR_EL1.SPAN only affects the PAN behaviour on taking an exception, which
> > is itself a context-synchronizing event, so I can't see why the ISB makes
> > any difference here (at least, for the purposes of PAN).
> >
> > Thanks,
> >
> > Will
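
The `sysreg_clear_set(sctlr_el1, SCTLR_EL1_SPAN, 0)` call in the quoted diff is a read-modify-write of the register: clear the named bits, set the others, and write back only if the value changed. A rough userspace model in C follows, assuming an ordinary variable stands in for the system register (the kernel accesses the real register with mrs/msr) and that SPAN is SCTLR_EL1 bit 23. The `model_` names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: SCTLR_EL1.SPAN is bit 23. */
#define MODEL_SCTLR_EL1_SPAN (UINT64_C(1) << 23)

/*
 * Userspace model of a clear-then-set register update. 'reg' stands in
 * for the system register. Returns 1 if a write-back was needed and 0
 * if the register already held the requested value.
 */
static inline int model_sysreg_clear_set(uint64_t *reg, uint64_t clear,
					 uint64_t set)
{
	assert(reg != NULL);

	uint64_t old = *reg;
	uint64_t updated = (old & ~clear) | set;

	if (updated == old)
		return 0;	/* nothing to write back */

	*reg = updated;
	return 1;
}
```

With SPAN initially set, the first call clears it and reports a write; a second identical call is a no-op. The model says nothing about ordering or synchronization, which is the point Will raises above: SPAN only takes effect on exception entry, and exception entry is itself context-synchronizing.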