Date: Thu, 14 Mar 2024 10:36:48 -0700
From: John Baldwin
Subject: Re: [PATCH 1/1] x86/elf: Add a new .note section containing
 Xfeatures information to x86 core files
To: Dave Hansen, Vignesh Balasubramanian, linux-kernel@vger.kernel.org,
 linux-toolchains@vger.kernel.org
Cc: mpe@ellerman.id.au, npiggin@gmail.com, christophe.leroy@csgroup.eu,
 aneesh.kumar@kernel.org, naveen.n.rao@linux.ibm.com, ebiederm@xmission.com,
 keescook@chromium.org, x86@kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-mm@kvack.org, bpetkov@amd.com, jinisusan.george@amd.com, matz@suse.de,
 binutils@sourceware.org, felix.willgerodt@intel.com
References: <20240314112359.50713-1-vigbalas@amd.com>
 <20240314112359.50713-2-vigbalas@amd.com>
 <971d21b7-0309-439e-91b6-234f84da959d@FreeBSD.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/14/24 10:10 AM, Dave Hansen wrote:
> On 3/14/24 09:45, John Baldwin wrote:
>> On 3/14/24 8:37 AM, Dave Hansen wrote:
>>> On 3/14/24 04:23, Vignesh Balasubramanian wrote:
>>>> Add a new .note section containing type, size, offset and flags of
>>>> every xfeature that is present.
>>>
>>> Mechanically, I'd much rather have all of that info in the cover letter
>>> in the actual changelog instead.
>>>
>>> I'd also love to see a practical example of what an actual example core
>>> dump looks like on two conflicting systems:
>>>
>>>     * Total XSAVE size
>>>     * XCR0 value
>>>     * XSTATE_BV from the core dump
>>>     * XFEATURE offsets for each feature
>>
>> I noticed this when I bought an AMD Ryzen 9 5900X based system for
>> my desktop running FreeBSD and found that the XSAVE core dump notes
>> were not recognized by GDB (FreeBSD dumps an XSAVE register set note
>> that matches the same layout of NT_X86_XSTATE used by Linux).
>
> I just want to make sure that you heard what I asked.  I'd like to see a
> practical example of how the real-world enumeration changes between two
> real world systems.
>
> Is that possible?
>
> Here's the raw CPUID data from the XSAVE region on my laptop:
>
>> 0x0000000d 0x00: eax=0x000002e7 ebx=0x00000a88 ecx=0x00000a88 edx=0x00000000
>> 0x0000000d 0x01: eax=0x0000000f ebx=0x00000998 ecx=0x00003900 edx=0x00000000
>> 0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
>> 0x0000000d 0x05: eax=0x00000040 ebx=0x00000440 ecx=0x00000000 edx=0x00000000
>> 0x0000000d 0x06: eax=0x00000200 ebx=0x00000480 ecx=0x00000000 edx=0x00000000
>> 0x0000000d 0x07: eax=0x00000400 ebx=0x00000680 ecx=0x00000000 edx=0x00000000
>> 0x0000000d 0x08: eax=0x00000080 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
>> 0x0000000d 0x09: eax=0x00000008 ebx=0x00000a80 ecx=0x00000000 edx=0x00000000
>> 0x0000000d 0x0b: eax=0x00000010 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
>> 0x0000000d 0x0c: eax=0x00000018 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
>> 0x0000000d 0x0d: eax=0x00000008 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
>
> Could we get that for an impacted AMD system, please?
>
>     cpuid -1 --raw | grep " 0x0000000d "
>
> should do it.

0x0000000d 0x00: eax=0x00000207 ebx=0x00000988 ecx=0x00000988 edx=0x00000000
0x0000000d 0x01: eax=0x0000000f ebx=0x00000348 ecx=0x00001800 edx=0x00000000
0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
0x0000000d 0x09: eax=0x00000008 ebx=0x00000980 ecx=0x00000000 edx=0x00000000
0x0000000d 0x0b: eax=0x00000010 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
0x0000000d 0x0c: eax=0x00000018 ebx=0x00000000 ecx=0x00000001 edx=0x00000000

I think the ebx value for the 0x09 leaf (PKRU) is the relevant difference
here: it is 0xa80 on your laptop and 0x980 on the AMD CPU.  (This is the
missing MPX gap on AMD.)

>>> This is pretty close to just a raw dump of the XSAVE CPUID leaves.
>>> Rather than come up with an XSAVE-specific ABI that depends on CPUID
>>> *ANYWAY* (because it dumps the "flags" register aka. ECX), maybe we
>>> should just bite the bullet and dump out (some of) the raw CPUID space.
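
To make the offset comparison above concrete, here is a minimal sketch
(not part of the patch or of any GDB/FreeBSD change) that parses the two
"cpuid --raw" listings and reports which user-state components sit at
different standard-format offsets.  The helper names are hypothetical.
It relies only on the documented meaning of CPUID leaf 0x0d sub-leaves
>= 2: EAX is the component size, EBX is its offset in the non-compacted
XSAVE area, and ECX bit 0 marks supervisor state.

```python
import re

# Raw CPUID leaf 0x0d lines, copied from the two listings in this message.
LAPTOP = """\
0x0000000d 0x00: eax=0x000002e7 ebx=0x00000a88 ecx=0x00000a88 edx=0x00000000
0x0000000d 0x01: eax=0x0000000f ebx=0x00000998 ecx=0x00003900 edx=0x00000000
0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
0x0000000d 0x05: eax=0x00000040 ebx=0x00000440 ecx=0x00000000 edx=0x00000000
0x0000000d 0x06: eax=0x00000200 ebx=0x00000480 ecx=0x00000000 edx=0x00000000
0x0000000d 0x07: eax=0x00000400 ebx=0x00000680 ecx=0x00000000 edx=0x00000000
0x0000000d 0x08: eax=0x00000080 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
0x0000000d 0x09: eax=0x00000008 ebx=0x00000a80 ecx=0x00000000 edx=0x00000000
0x0000000d 0x0b: eax=0x00000010 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
0x0000000d 0x0c: eax=0x00000018 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
0x0000000d 0x0d: eax=0x00000008 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
"""

AMD = """\
0x0000000d 0x00: eax=0x00000207 ebx=0x00000988 ecx=0x00000988 edx=0x00000000
0x0000000d 0x01: eax=0x0000000f ebx=0x00000348 ecx=0x00001800 edx=0x00000000
0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
0x0000000d 0x09: eax=0x00000008 ebx=0x00000980 ecx=0x00000000 edx=0x00000000
0x0000000d 0x0b: eax=0x00000010 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
0x0000000d 0x0c: eax=0x00000018 ebx=0x00000000 ecx=0x00000001 edx=0x00000000
"""

LINE_RE = re.compile(
    r"0x0000000d 0x([0-9a-f]+): eax=0x([0-9a-f]+) ebx=0x([0-9a-f]+) "
    r"ecx=0x([0-9a-f]+) edx=0x([0-9a-f]+)"
)


def parse_leaf_0d(text):
    """Return {subleaf: (eax, ebx, ecx, edx)} for every leaf 0x0d line."""
    leaves = {}
    for match in LINE_RE.finditer(text):
        subleaf, *regs = (int(group, 16) for group in match.groups())
        leaves[subleaf] = tuple(regs)
    return leaves


def user_layout(leaves):
    """Return {feature_bit: (offset, size)} for user-state components.

    Sub-leaves 0 and 1 describe the area as a whole, and components with
    ECX bit 0 set are supervisor state with no standard-format offset,
    so both are skipped.
    """
    return {
        subleaf: (ebx, eax)
        for subleaf, (eax, ebx, ecx, _edx) in leaves.items()
        if subleaf >= 2 and not (ecx & 1)
    }


def offset_differences(a, b):
    """Return {feature_bit: (offset_a, offset_b)} where the offsets differ."""
    return {
        bit: (a[bit][0], b[bit][0])
        for bit in sorted(a.keys() & b.keys())
        if a[bit][0] != b[bit][0]
    }


laptop_layout = user_layout(parse_leaf_0d(LAPTOP))
amd_layout = user_layout(parse_leaf_0d(AMD))
diffs = offset_differences(laptop_layout, amd_layout)
for bit, (laptop_off, amd_off) in diffs.items():
    print(f"feature {bit}: offset {laptop_off:#x} vs {amd_off:#x}")
```

With the data above, the only component common to both systems whose
offset differs is feature bit 9 (PKRU).  A consumer that hard-codes one
layout keyed on XCR0 alone would read PKRU from the wrong place on one
of the two machines.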
>>
>> So the current note I initially proposed and implemented for FreeBSD
>> (https://reviews.freebsd.org/D42136) and an initial patch set for GDB
>> (https://sourceware.org/pipermail/gdb-patches/2023-October/203083.html)
>> do indeed dump a raw set of CPUID leaves.  The version I have for FreeBSD
>> only dumps the raw leaf values for leaf 0x0d though the note format is
>> extensible should additional leaves be needed in the future.  One of the
>> questions if we wanted to use a CPUID leaf note is which leaves to dump
>> (e.g. do you dump all of them, or do you just dump the subset that is
>> currently needed).
>
> You dump what is needed and add to the dump over time.

That is what I started with, yes, but I am attempting to anticipate future
problems in my list of caveats.

>> Another quirky question is what to do about systems with heterogeneous
>> cores (E vs P for example).
> That's irrelevant for now.  The cores may be heterogeneous but the
> userspace ISA (and thus XSAVE formats) are identical.  If they're
> not, then we have bigger problems on our hands.

Yes, I agree on the bigger problems and hope we don't have to solve them.

>> Currently those systems use the same XSAVE layout across all cores,
>> but other CPUID leaves do already vary across cores on those systems.
>
> There shouldn't be any CPUID leaves that differ _and_ matter to
> userspace and thus core dumps.

Today that is true, yes.  I'm fine with making that tradeoff (along with
only dumping a subset of leaves) so long as the consensus is that it is an
acceptable tradeoff to make.

>> However, there are other wrinkles with the leaf approach.  Namely, one
>> of the use cases that I currently have an ugly hack for in GDB is if
>> you are using gdb against a remote host running gdbserver and then use
>> 'gcore' to generate a core dump.  GDB needs to write out a NT_X86_XSTATE
>> note, but that note requires a layout.  What GDB does today is just pick
>> a known Intel layout based on the XCR0 mask.  However, GDB should ideally
>> start writing out whatever new note we adopt here, so if we dump raw
>> CPUID leaves it means extending the GDB remote protocol so we can query
>> the CPUID leaves from the remote host.  On the other hand, if we choose a
>> more abstract format as proposed in this patch, the local GDB (or LLDB
>> or whatever) can generate whatever synthetic layout it wants to write
>> the local NT_X86_XSTATE.  (NB: A relevant detail here is that the GDB
>> remote protocol does not pass the entire XSAVE state across as a block,
>> instead gdbserver parses individual register values for AVX, etc.
>> registers and those decoded register values are passed over the
>> protocol.)
>
> So the gdb side says, "Give me PKRU" and the remote side parses the
> XSAVE image, finds PKRU, and sends it over the wire?

Yes.

>> Another question is potentially supporting the compacted XSAVE format
>> for NT_X86_XSTATE.  Today Linux has some complicated code to re-expand
>> the compacted XSAVE format back out to the standard layout for ptrace()
>> and process core dumps.
>
> Yeah, but supporting the compacted format in NT_X86_XSTATE doesn't help
> us at all.  We still intermingle user and supervisor state and that
> needs to get repacked _anyway_.

Fair enough.

> In other words, no matter what we do, it's going to be complicated
> because the userspace buffer can't have supervisor state and the kernel
> buffer does have it.  The compacted format mismatch is the least of our
> problems.
>
>>   (FreeBSD doesn't yet make use of XSAVEC so we
>> haven't yet dealt with that problem.)
>
> ... or XSAVES, which is actually the most relevant here.
>
> Backing up... there are two approaches here:
>
>  1. Dump out raw x86-specific gunk, aka. CPUID contents itself.  There
>     are a billion ways to do this and lots of complications, including
>     the remote protocol implications
> or
>  2. Define an abstract format that works anywhere, not just on x86 and
>     not just for XSAVE.
>
> There's no (sane) middle ground.  The implementation here (in this
> patch) is fundamentally x86-specific and pretends to be some kind of
> abstracted x86-independent format.

Well, are there other register notes that could benefit from an approach
like this?  Most other register notes I'm aware of on various architectures
either have a fixed layout (like the typical general purpose register
notes), or they have a fixed set of registers but the size of individual
registers can vary (things like SME or RISC-V's vector extension).  XSAVE
is the only one I'm aware of that packs multiple register sets into a
single note.

To step back a bit, another approach that could be taken (and I'm not sure
it is worth it at this point) would be to stop dumping a single XSAVE note
and dump a separate register note for each feature.  That is, dump a note
for AVX (the upper bits of ymmX), a note for PKRU, etc.  I think if I had
to pick a strategy at the very beginning that's what I would choose now,
but this isn't the very beginning and that sort of change is likely too
disruptive.  (This approach is what happens on other arches today in
effect, e.g. on AArch64.)

-- 
John Baldwin