From: Sasha Levin
To: linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org
Cc: hpa@zytor.com, dave.hansen@intel.com, tony.luck@intel.com, ak@linux.intel.com, ravi.v.shankar@intel.com, chang.seok.bae@intel.com, Sasha Levin, Randy Dunlap, Jonathan Corbet
Subject: [PATCH v12 18/18] Documentation/x86/64: Add documentation for GS/FS addressing mode
Date: Mon, 11 May 2020
 00:53:11 -0400
Message-Id: <20200511045311.4785-19-sashal@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200511045311.4785-1-sashal@kernel.org>
References: <20200511045311.4785-1-sashal@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Thomas Gleixner

Explain how the GS/FS based addressing can be utilized in user space
applications along with the differences between the generic prctl() based
GS/FS base control and the FSGSBASE version available on newer CPUs.

Originally-by: Andi Kleen
Signed-off-by: Thomas Gleixner
Signed-off-by: Chang S. Bae
Signed-off-by: Sasha Levin
Reviewed-by: Tony Luck
Reviewed-by: Randy Dunlap
Cc: Thomas Gleixner
Cc: Borislav Petkov
Cc: Andy Lutomirski
Cc: H. Peter Anvin
Cc: Dave Hansen
Cc: Tony Luck
Cc: Andi Kleen
Cc: Randy Dunlap
Cc: Jonathan Corbet
---
 Documentation/x86/x86_64/fsgs.rst  | 199 +++++++++++++++++++++++++++++
 Documentation/x86/x86_64/index.rst |   1 +
 2 files changed, 200 insertions(+)
 create mode 100644 Documentation/x86/x86_64/fsgs.rst

diff --git a/Documentation/x86/x86_64/fsgs.rst b/Documentation/x86/x86_64/fsgs.rst
new file mode 100644
index 0000000000000..50960e09e1f66
--- /dev/null
+++ b/Documentation/x86/x86_64/fsgs.rst
@@ -0,0 +1,199 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Using FS and GS segments in user space applications
+===================================================
+
+The x86 architecture supports segmentation. Instructions which access
+memory can use a segment register based addressing mode. The following
+notation is used to address a byte within a segment:
+
+  Segment-register:Byte-address
+
+The segment base address is added to the Byte-address to compute the
+resulting virtual address which is accessed. This allows access to multiple
+instances of data with the identical Byte-address, i.e. with the same code. The
+selection of a particular instance is purely based on the base-address in
+the segment register.
+
+In 32-bit mode the CPU provides 6 segments, which also support segment
+limits. The limits can be used to enforce address space protections.
+
+In 64-bit mode the CS/SS/DS/ES segments are ignored and the base address is
+always 0 to provide a full 64-bit address space. The FS and GS segments are
+still functional in 64-bit mode.
+
+Common FS and GS usage
+----------------------
+
+The FS segment is commonly used to address Thread Local Storage (TLS). FS
+is usually managed by runtime code or a threading library. Variables
+declared with the '__thread' storage class specifier are instantiated per
+thread and the compiler emits the FS: address prefix for accesses to these
+variables. Each thread has its own FS base address so common code can be
+used without complex address offset calculations to access the per thread
+instances. Applications should not use FS for other purposes when they use
+runtimes or threading libraries which manage the per thread FS.
+
+The GS segment has no common use and can be used freely by
+applications. GCC and Clang support GS based addressing via address space
+identifiers.
+
+Reading and writing the FS/GS base address
+------------------------------------------
+
+There exist two mechanisms to read and write the FS/GS base address:
+
+ - the arch_prctl() system call
+
+ - the FSGSBASE instruction family
+
+Accessing FS/GS base with arch_prctl()
+--------------------------------------
+
+ The arch_prctl(2) based mechanism is available on all 64-bit CPUs and all
+ kernel versions.
+
+ Reading the base:
+
+   arch_prctl(ARCH_GET_FS, &fsbase);
+   arch_prctl(ARCH_GET_GS, &gsbase);
+
+ Writing the base:
+
+   arch_prctl(ARCH_SET_FS, fsbase);
+   arch_prctl(ARCH_SET_GS, gsbase);
+
+ The ARCH_SET_GS prctl may be disabled depending on kernel configuration
+ and security settings.
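Illustration only, not part of the patch: a minimal sketch of the
arch_prctl() calls listed above, assuming x86-64 Linux with the kernel UAPI
headers installed. The fsgs_* helper names are hypothetical. The sketch goes
through syscall(2) rather than assuming that the C library declares an
arch_prctl() prototype.

```c
#define _GNU_SOURCE
#include <asm/prctl.h>   /* ARCH_GET_FS, ARCH_GET_GS, ARCH_SET_GS */
#include <sys/syscall.h> /* SYS_arch_prctl */
#include <unistd.h>      /* syscall() */

/* Hypothetical helper: issue the raw arch_prctl system call. */
static inline long fsgs_arch_prctl(int code, unsigned long arg)
{
    return syscall(SYS_arch_prctl, code, arg);
}

/* Store the calling thread's FS base in *base. Returns 0 on success. */
static inline int fsgs_get_fs_base(unsigned long *base)
{
    return (int)fsgs_arch_prctl(ARCH_GET_FS, (unsigned long)base);
}

/* Store the calling thread's GS base in *base. Returns 0 on success. */
static inline int fsgs_get_gs_base(unsigned long *base)
{
    return (int)fsgs_arch_prctl(ARCH_GET_GS, (unsigned long)base);
}

/*
 * Set the calling thread's GS base. Returns 0 on success and -1 when the
 * kernel refuses the request, which the text above says can happen.
 */
static inline int fsgs_set_gs_base(unsigned long base)
{
    return (int)fsgs_arch_prctl(ARCH_SET_GS, base);
}
```

A caller would typically read the current base first, then set GS and check
the return value before relying on GS based accesses. ARCH_SET_FS is left
out on purpose because FS is normally owned by the threading library.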
+
+Accessing FS/GS base with the FSGSBASE instructions
+---------------------------------------------------
+
+ With the Ivy Bridge CPU generation Intel introduced a new set of
+ instructions to access the FS and GS base registers directly from user
+ space. These instructions are also supported on AMD Family 17H CPUs. The
+ following instructions are available:
+
+  =============== ===========================
+  RDFSBASE %reg   Read the FS base register
+  RDGSBASE %reg   Read the GS base register
+  WRFSBASE %reg   Write the FS base register
+  WRGSBASE %reg   Write the GS base register
+  =============== ===========================
+
+ The instructions avoid the overhead of the arch_prctl() syscall and allow
+ more flexible usage of the FS/GS addressing modes in user space
+ applications. This does not prevent conflicts between threading libraries
+ and runtimes which utilize FS and applications which want to use it for
+ their own purpose.
+
+FSGSBASE instructions enablement
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ The instructions are enumerated in CPUID leaf 7, bit 0 of EBX. If
+ available, /proc/cpuinfo shows 'fsgsbase' in the flags entry of the CPUs.
+
+ The availability of the instructions does not enable them
+ automatically. The kernel has to enable them explicitly in CR4. The
+ reason for this is that older kernels make assumptions about the values in
+ the GS register and enforce them when GS base is set via
+ arch_prctl(). Allowing user space to write arbitrary values to GS base
+ would violate these assumptions and cause malfunction.
+
+ On kernels which do not enable FSGSBASE the execution of the FSGSBASE
+ instructions will fault with a #UD exception.
+
+ The kernel provides reliable information about the enabled state in the
+ ELF AUX vector. If the HWCAP2_FSGSBASE bit is set in the AUX vector, the
+ kernel has FSGSBASE instructions enabled and applications can use them.
+ The following code example shows how this detection works::
+
+   #include <sys/auxv.h>
+   #include <elf.h>
+
+   /* Will be eventually in asm/hwcap.h */
+   #ifndef HWCAP2_FSGSBASE
+   #define HWCAP2_FSGSBASE        (1 << 1)
+   #endif
+
+   ....
+
+   unsigned long val = getauxval(AT_HWCAP2);
+
+   if (val & HWCAP2_FSGSBASE)
+        printf("FSGSBASE enabled\n");
+
+FSGSBASE instructions compiler support
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+GCC version 4.6.4 and newer provide intrinsics for the FSGSBASE
+instructions. Clang 5 supports them as well.
+
+  =================== ===========================
+  _readfsbase_u64()   Read the FS base register
+  _readgsbase_u64()   Read the GS base register
+  _writefsbase_u64()  Write the FS base register
+  _writegsbase_u64()  Write the GS base register
+  =================== ===========================
+
+To utilize these intrinsics <immintrin.h> must be included in the source
+code and the compiler option -mfsgsbase has to be added.
+
+Compiler support for FS/GS based addressing
+-------------------------------------------
+
+GCC version 6 and newer provide support for FS/GS based addressing via
+Named Address Spaces. GCC implements the following address space
+identifiers for x86:
+
+  ========= ====================================
+  __seg_fs  Variable is addressed relative to FS
+  __seg_gs  Variable is addressed relative to GS
+  ========= ====================================
+
+The preprocessor symbols __SEG_FS and __SEG_GS are defined when these
+address spaces are supported. Code which implements fallback modes should
+check whether these symbols are defined. Usage example::
+
+  #ifdef __SEG_GS
+
+  long data0 = 0;
+  long data1 = 1;
+
+  long __seg_gs *ptr;
+
+  /* Check whether FSGSBASE is enabled by the kernel (HWCAP2_FSGSBASE) */
+  ....
+
+  /* Set GS base to point to data0 */
+  _writegsbase_u64((unsigned long)&data0);
+
+  /* Access offset 0 of GS */
+  ptr = 0;
+  printf("data0 = %ld\n", *ptr);
+
+  /* Set GS base to point to data1 */
+  _writegsbase_u64((unsigned long)&data1);
+  /* ptr still addresses offset 0! */
+  printf("data1 = %ld\n", *ptr);
+
+
+Clang does not provide the GCC address space identifiers, but it provides
+address spaces via an attribute based mechanism in Clang 2.6 and newer
+versions:
+
+  ==================================== =====================================
+  __attribute__((address_space(256)))  Variable is addressed relative to GS
+  __attribute__((address_space(257)))  Variable is addressed relative to FS
+  ==================================== =====================================
+
+FS/GS based addressing with inline assembly
+-------------------------------------------
+
+In case the compiler does not support address spaces, inline assembly can
+be used for FS/GS based addressing mode::
+
+	mov %fs:offset, %reg
+	mov %gs:offset, %reg
+
+	mov %reg, %fs:offset
+	mov %reg, %gs:offset
diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index d6eaaa5a35fcd..a56070fc8e77a 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -14,3 +14,4 @@ x86_64 Support
    fake-numa-for-cpusets
    cpu-hotplug-spec
    machinecheck
+   fsgs
-- 
2.20.1
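
Illustration only, not part of the patch: the inline assembly forms from the
"FS/GS based addressing with inline assembly" section of fsgs.rst could be
wrapped roughly as follows. This is a sketch assuming x86-64 Linux and
GCC/Clang extended asm. The gs_* helper names are hypothetical. The offset
is passed in a register here instead of the immediate "offset" form shown in
the document, and the GS base is set with arch_prctl() via syscall(2) so the
sketch does not depend on FSGSBASE being enabled.

```c
#define _GNU_SOURCE
#include <asm/prctl.h>   /* ARCH_SET_GS */
#include <sys/syscall.h> /* SYS_arch_prctl */
#include <unistd.h>      /* syscall() */

/* Point the GS base at 'base'. Returns 0 on success, -1 if refused. */
static inline int gs_set_base(const void *base)
{
    return (int)syscall(SYS_arch_prctl, ARCH_SET_GS, (unsigned long)base);
}

/* Load a long from GS:offset using the %gs segment override. */
static inline long gs_read_long(unsigned long offset)
{
    long val;

    __asm__ volatile("movq %%gs:(%1), %0"
                     : "=r" (val)
                     : "r" (offset)
                     : "memory");
    return val;
}

/* Store a long to GS:offset using the %gs segment override. */
static inline void gs_write_long(unsigned long offset, long val)
{
    __asm__ volatile("movq %1, %%gs:(%0)"
                     :
                     : "r" (offset), "r" (val)
                     : "memory");
}
```

With the GS base pointing at an array of longs, gs_read_long(0) reads the
first element and gs_read_long(sizeof(long)) the second. The same offsets
address a different instance once the base is changed, which is the property
the document describes.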