From: zhuqiuer
Subject: Re: [PATCH] ARM: Add a memory clobber to the fmrx instruction
Date: Wed, 10 Apr 2024 10:41:25 +0800
Message-ID: <20240410024126.21589-1-zhuqiuer1@huawei.com>
In-Reply-To: <20240409164641.GC3219862@dev-arch.thelio-3990X>
References: <20240409164641.GC3219862@dev-arch.thelio-3990X>
X-Mailing-List: linux-kernel@vger.kernel.org

> > Instruction fmrx is used throughout the kernel,
> > where it is sometimes expected to be skipped
> > by incrementing the program counter, such as in vfpmodule.c:vfp_init().
> > Therefore, the instruction should not be reordered when that is not intended.
> > Adding a barrier() before and after this call cannot prevent
> > reordering by the compiler, as the fmrx instruction is constrained
> > only by '=r', meaning it operates on a general-purpose register but not
> > on memory. To preserve the ordering of the instruction after compilation,
> > a memory clobber is necessary.
> >
> > Below is the code snippet disassembled from the function
> > vfpmodule.c:vfp_init(), compiled with LLVM.
> >
> > Before the patch:
> > xxxxx: xxxxx	bl	c010c688
> > xxxxx: xxxxx	mov	r0, r4
> > xxxxx: xxxxx	bl	c010c6e4
> > ...
> > xxxxx: xxxxx	bl	c0791c8c
> > xxxxx: xxxxx	movw	r5, #23132	; 0x5a5c
> > xxxxx: xxxxx	vmrs	r4, fpsid	<- this is the fmrx instruction
> >
> > After the patch:
> > xxxxx: xxxxx	bl	c010c688
> > xxxxx: xxxxx	mov	r0, r4
> > xxxxx: xxxxx	vmrs	r5, fpsid	<- this is the fmrx instruction
> > xxxxx: xxxxx	bl	c010c6e4
> >
> > Signed-off-by: zhuqiuer
> > ---
> >  arch/arm/vfp/vfpinstr.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
> > index 3c7938fd40aa..e70129e10b8e 100644
> > --- a/arch/arm/vfp/vfpinstr.h
> > +++ b/arch/arm/vfp/vfpinstr.h
> > @@ -68,7 +68,7 @@
> >  	u32 __v;					\
> >  	asm(".fpu	vfpv2\n"			\
> >  	    "vmrs	%0, " #_vfp_			\
> > -	    : "=r" (__v) : : "cc");			\
> > +	    : "=r" (__v) : : "memory", "cc");		\
> >  	__v;						\
> >  })
> >
> > --
> > 2.12.3
> >
>
> This seems like the same issue that Ard was addressing with this patch
> at https://lore.kernel.org/20240318093004.117153-2-ardb+git@google.com/;
> does that change work for your situation as well? I do not really have a
> strong preference between the two approaches. Ard also mentioned using
> *current in the asm constraints as another option.

Sorry for not reading Ard's thread first. Yes, using "asm volatile" also
worked for our case, and it was our previous solution. However, we later
switched to the memory clobber for the same reason you mentioned in Ard's
thread: we believe a memory clobber is robust enough to prevent the
reordering described above.

v1 -> v2: Add a memory clobber to the fmxr instruction.
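For reference, here is a minimal sketch contrasting the three variants
discussed in this thread. It is not the kernel macro itself; the
read_fpsid_* helper names are made up for illustration, and it assumes a
32-bit ARM target with a VFP unit.

/* Minimal sketch, assuming 32-bit ARM with VFP; helper names are
 * illustrative only and not part of the kernel.
 */
#include <linux/types.h>

/* Plain "=r" output: the compiler treats this asm as having no memory
 * side effects, so it may move it across the surrounding calls, as seen
 * in the vfp_init() disassembly above, or drop it if the result looks
 * unused.
 */
static inline u32 read_fpsid_plain(void)
{
	u32 v;

	asm(".fpu	vfpv2\n"
	    "vmrs	%0, fpsid"
	    : "=r" (v) : : "cc");
	return v;
}

/* This patch's approach: the "memory" clobber tells the compiler the asm
 * may read or write memory, so it cannot be reordered across other memory
 * accesses such as the function calls around it in vfp_init().
 */
static inline u32 read_fpsid_clobber(void)
{
	u32 v;

	asm(".fpu	vfpv2\n"
	    "vmrs	%0, fpsid"
	    : "=r" (v) : : "memory", "cc");
	return v;
}

/* The alternative from Ard's thread: "asm volatile" keeps the asm from
 * being deleted or merged when its output looks unused, but the compiler
 * may still move a volatile asm relative to ordinary, non-volatile code
 * around it.
 */
static inline u32 read_fpsid_volatile(void)
{
	u32 v;

	asm volatile(".fpu	vfpv2\n"
		     "vmrs	%0, fpsid"
		     : "=r" (v) : : "cc");
	return v;
}

This is only meant to make the reordering argument concrete; the actual
change is the one-line diff quoted above.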