From: Len Brown
Date: Tue, 30 Mar 2021 16:42:23 -0400
Subject: Re: Candidate
 Linux ABI for Intel AMX and hypothetical new related features
To: Andy Lutomirski
Cc: Dave Hansen, Andy Lutomirski, Greg KH, "Bae, Chang Seok", X86 ML, LKML, libc-alpha, Florian Weimer, Rich Felker, Kyle Huey, Keno Fischer, Linux API
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 30, 2021 at 4:20 PM Andy Lutomirski wrote:
>
> > On Mar 30, 2021, at 12:12 PM, Dave Hansen wrote:
> >
> > On 3/30/21 10:56 AM, Len Brown wrote:
> >> On Tue, Mar 30, 2021 at 1:06 PM Andy Lutomirski wrote:
> >>>> On Mar 30, 2021, at 10:01 AM, Len Brown wrote:
> >>>> Is it required (by the "ABI") that a user program has everything
> >>>> on the stack for user-space XSAVE/XRESTOR to get back
> >>>> to the state of the program just before receiving the signal?
> >>> The current Linux signal frame format has XSTATE in uncompacted format,
> >>> so everything has to be there.
> >>> Maybe we could have an opt in new signal frame format, but the details would need to be worked out.
> >>>
> >>> It is certainly the case that a signal should be able to be delivered, run “async-signal-safe” code,
> >>> and return, without corrupting register contents.
> >> And so an an acknowledgement:
> >>
> >> We can't change the legacy signal stack format without breaking
> >> existing programs. The legacy is uncompressed XSTATE. It is a
> >> complete set of architectural state -- everything necessary to
> >> XRESTOR. Further, the sigreturn flow allows the signal handler to
> >> *change* any of that state, so that it becomes active upon return from
> >> signal.
> >
> > One nit with this: XRSTOR itself can work with the compacted format or
> > uncompacted format. Unlike the XSAVE/XSAVEC side where compaction is
> > explicit from the instruction itself, XRSTOR changes its behavior by
> > reading XCOMP_BV. There's no XRSTORC.
> >
> > The issue with using the compacted format is when legacy software in the
> > signal handler needs to go access the state. *That* is what can't
> > handle a change in the XSAVE buffer format (either optimized/XSAVEOPT,
> > or compacted/XSAVEC).
>
> The compacted format isn’t compact enough anyway. If we want to keep AMX and AVX512 enabled in XCR0 then we need to further muck with the format to omit the not-in-use features. I *think* we can pull this off in a way that still does the right thing wrt XRSTOR.

Agreed. Compacted format doesn't save any space when INIT=0 (that is,
when the state is live rather than in its initial configuration), so
it is only a half-step forward.

> If we go this route, I think we want a way for sigreturn to understand a pointer to the state instead of inline state to allow programs to change the state. Or maybe just to have a way to ask sigreturn to skip the restore entirely.

The legacy approach puts all architectural state on the signal stack
in XSTATE format.

If we make the signal stack smaller with a new fast-signal scheme, we
need to find another place for that state to live.

It can't live in the task context switch buffer. If we put it there
and then take an interrupt while running the signal handler, then we'd
overwrite the signaled thread's state with the signal handler's state.

Can we leave it in live registers? That would be the speed-of-light
signal handler approach. But we'd need to teach the signal handler to
not clobber it. Perhaps that could be part of the contract that a
fast signal handler signs? INIT=0 AMX state could simply sit
patiently in the AMX registers for the duration of the signal handler.
You can't get any faster than doing nothing :-)

Of course, part of the contract for the fast signal handler is that it
knows it cannot rely on XRSTOR of what is on the stack to get back to
the state of the signaled thread. Even assuming we used XSTATE format
on the fast signal handler stack, that frame would be missing the
contents of the AMX registers in this example.

Len Brown, Intel Open Source Technology Center
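
A rough sketch of the size argument above, for illustration only and
not kernel code. It shows how XRSTOR tells the two formats apart (bit
63 of XCOMP_BV) and how the size of a frame that stores only selected
components changes with and without AMX. The names xcomps, frame_size,
xsave_is_compacted and the MASK_* macros are hypothetical, the
component sizes are assumed values (real code should enumerate them
with CPUID leaf 0xD), and the 64-byte alignment that the compacted
format can require is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 63 of XCOMP_BV in the XSAVE header marks the compacted format. */
#define XCOMP_BV_COMPACTED (1ULL << 63)

/* Hypothetical mask names for this sketch; bit positions follow XCR0. */
#define MASK_AVX    (1ULL << 2)
#define MASK_AVX512 ((1ULL << 5) | (1ULL << 6) | (1ULL << 7))
#define MASK_AMX    ((1ULL << 17) | (1ULL << 18))
#define MASK_ALL    (MASK_AVX | MASK_AVX512 | MASK_AMX)

/* 512-byte legacy region plus 64-byte XSAVE header, always present. */
#define LEGACY_AND_HEADER 576u

struct xcomp {
    unsigned bit;   /* state-component number */
    size_t size;    /* assumed size in bytes; query CPUID 0xD in real code */
};

static const struct xcomp xcomps[] = {
    {  2,  256 },   /* AVX: upper halves of YMM0-15 */
    {  5,   64 },   /* AVX-512 opmask registers */
    {  6,  512 },   /* AVX-512 ZMM_Hi256 */
    {  7, 1024 },   /* AVX-512 Hi16_ZMM */
    { 17,   64 },   /* AMX TILECFG */
    { 18, 8192 },   /* AMX TILEDATA */
};

/* XRSTOR picks the format by reading XCOMP_BV; there is no XRSTORC. */
static int xsave_is_compacted(uint64_t xcomp_bv)
{
    return (xcomp_bv & XCOMP_BV_COMPACTED) != 0;
}

/*
 * Size of a frame holding only the components named in 'present',
 * packed back to back after the legacy region and header.
 * Alignment padding is deliberately ignored in this sketch.
 */
static size_t frame_size(uint64_t present)
{
    size_t total = LEGACY_AND_HEADER;

    for (size_t i = 0; i < sizeof(xcomps) / sizeof(xcomps[0]); i++)
        if (present & (1ULL << xcomps[i].bit))
            total += xcomps[i].size;
    return total;
}
```

With these assumed sizes, frame_size(MASK_ALL) comes to 10688 bytes
and frame_size(MASK_ALL & ~MASK_AMX) to 2432 bytes. Omitting AMX from
the frame only helps while AMX is in its initial state; once the tile
data is live, the frame is back to full size.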