From: Andy Lutomirski
Subject: Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
Date: Wed, 31 Mar 2021 09:53:50 -0700
To: Len Brown
Cc: David Laight, Dave Hansen, Andy Lutomirski, Greg KH, "Bae, Chang Seok", X86 ML, LKML, libc-alpha, Florian Weimer, Rich Felker, Kyle Huey, Keno Fischer, Linux API
X-Mailing-List: linux-kernel@vger.kernel.org

> On Mar 31, 2021, at 9:31 AM, Len Brown wrote:
>
> On Tue, Mar 30, 2021 at 6:01 PM David Laight wrote:
>
>>> Can we leave it in live registers? That would be the speed-of-light
>>> signal handler approach. But we'd need to teach the signal handler to
>>> not clobber it. Perhaps that could be part of the contract that a
>>> fast signal handler signs?
>>> INIT=0 AMX state could simply sit
>>> patiently in the AMX registers for the duration of the signal handler.
>>> You can't get any faster than doing nothing :-)
>>>
>>> Of course part of the contract for the fast signal handler is that it
>>> knows that it can't possibly use XRSTOR of the stuff on the stack to
>>> necessarily get back to the state of the signaled thread (assuming we
>>> even used XSTATE format on the fast signal handler stack, it would
>>> forget the contents of the AMX registers, in this example)
>>
>> gcc will just use the AVX registers for 'normal' code within
>> the signal handler.
>> So it has to have its own copy of all the registers.
>> (Well, maybe you could make the TMX instructions fault,
>> but that would need a nested signal delivered.)
>
> This is true, by default, but it doesn't have to be true.
>
> Today, gcc has an annotation for user-level interrupts
> https://gcc.gnu.org/onlinedocs/gcc/x86-Function-Attributes.html#x86-Function-Attributes
>
> An analogous annotation could be created for fast signals.
> gcc can be told exactly what registers and instructions it can use for
> that routine.
>
> Of course, this begs the question about what routines that handler calls,
> and that would need to be constrained too.
>
> Today signal-safety(7) advises programmers to limit what legacy signal handlers
> can call. There is no reason that a fast-signal-safety(7) could not be created
> for the fast path.
>
>> There is also the register save buffer that you need in order
>> to long-jump out of a signal handler.
>> Unfortunately that is required to work.
>> I'm pretty sure the original setjmp/longjmp just saved the stack
>> pointer - but that really doesn't work any more.
>>
>> OTOH most signal handlers don't care - but there isn't a flag
>> to sigset() (etc) to ask for a specific register layout.
>
> Right, the idea is to optimize for *most* signal handlers,
> since making any changes to *all* signal handlers is intractable.
>
> So the idea is that opting-in to a fast signal handler would opt-out
> of some legacy signal capabilities. Complete state is one of them,
> and thus long-jump is not supported, because the complete state
> may not automatically be available.

Long jump is probably the easiest problem of all: sigsetjmp() is a *function*, following ABI, so sigsetjmp() is expected to clobber most or all of the extended state.

But this whole annotation thing will require serious compiler support. We already have problems with compilers inlining functions and getting confused about attributes.

An API like:

if (get_amx()) {
    use AMX;
} else {
    don’t;
}

avoids this problem. And making XCR0 dynamic, for all its faults, at least helps force a degree of discipline on user code.

>
> thanks,
> Len Brown, Intel Open Source Technology Center
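
For illustration only, the explicit opt-in pattern from the reply above might be sketched in C roughly as follows. Everything here is an assumption made for the sketch: get_amx() is a hypothetical stand-in (no such function exists in the kernel ABI or in libc) that is stubbed to always refuse, and dot(), dot_generic() and enum dot_path are invented names. The point is only that the AMX decision is an ordinary runtime branch in user code rather than a per-function attribute the compiler has to track across inlining.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* HYPOTHETICAL: stand-in for a "please give me AMX" request.
 * Not a real kernel or libc call; this stub always refuses. */
static bool get_amx(void)
{
    return false;
}

/* Which code path a call ended up taking. */
enum dot_path { DOT_PATH_GENERIC, DOT_PATH_AMX };

/* Plain C fallback, usable whether or not AMX was granted. */
static long dot_generic(const int *a, const int *b, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (long)a[i] * b[i];
    return sum;
}

/* Branch explicitly on the result of the request and report the
 * path taken through *taken. */
static long dot(const int *a, const int *b, size_t n, enum dot_path *taken)
{
    assert(a != NULL && b != NULL && taken != NULL);
    if (get_amx()) {
        /* A real AMX kernel would go here; the sketch reuses the
         * fallback so that it stays runnable on any machine. */
        *taken = DOT_PATH_AMX;
        return dot_generic(a, b, n);
    }
    *taken = DOT_PATH_GENERIC;
    return dot_generic(a, b, n);
}
```

With the stub as written, a caller always lands on the generic path; swapping in a real request mechanism, whatever form it eventually takes, would change only get_amx().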