References: <5F98327E-8EC4-455E-B9E1-74D2F13578C5@amacapital.net>
In-Reply-To:
<5F98327E-8EC4-455E-B9E1-74D2F13578C5@amacapital.net>
From: Len Brown
Date: Mon, 29 Mar 2021 18:38:13 -0400
Subject: Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
To: Andy Lutomirski
Cc: Greg KH, Andy Lutomirski, "Bae, Chang Seok", Dave Hansen, X86 ML, LKML, libc-alpha, Florian Weimer, Rich Felker, Kyle Huey, Keno Fischer, Linux API
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 29, 2021 at 2:16 PM Andy Lutomirski wrote:
>
>
> > On Mar 29, 2021, at 8:47 AM, Len Brown wrote:
> >
> > On Sat, Mar 27, 2021 at 5:58 AM Greg KH wrote:
> >>> On Fri, Mar 26, 2021 at 11:39:18PM -0400, Len Brown wrote:
> >>> Hi Andy,
> >>> Say a mainline links with a math library that uses AMX without the
> >>> knowledge of the mainline.
> >
> > sorry for the confusion.
> >
> > mainline = main().
> >
> > ie. the part of the program written by you, and not the library you linked with.
> >
> > In particular, the library may use instructions that main() doesn't know exist.
>
> If we pretend for a bit that AMX were a separate device instead of a part of the CPU, this would be a no brainer: something would be responsible for opening a device node or otherwise requesting access to the device.
>
> Real AMX isn’t so different. Programs acquire access either by syscall or by a fault, they use it, and (hopefully) they release it again using TILERELEASE. The only thing special about it is that, supposedly, acquiring and releasing access (at least after the first time) is quite fast. But holding access is *not* free -- despite all your assertions to the contrary, the kernel *will* correctly context switch it to avoid blowing up power consumption, and this will have overhead.
>
> We’ve seen the pattern of programs thinking that, just because something is a CPU insn, it’s free and no thought is needed before using it.
> This happened with AVX and AVX512, and it will happen again with AMX. We *still* have a significant performance regression in the kernel due to screwing up the AVX state machine, and the only way I know about any of the details is that I wrote silly test programs to try to reverse engineer the nonsensical behavior of the CPUs.
>
> I might believe that Intel has figured out how to make a well behaved XSTATE feature after Intel demonstrates at least once that it’s possible. That means full documentation of all the weird issues, no new special cases, and the feature actually making sense in the context of XSTATE. This has not happened. Let’s list all of them:
>
> - SSE. Look for all the MXCSR special cases in the pseudocode and tell me with a straight face that this one works sensibly.
>
> - AVX. Also has special cases in the pseudocode. And has transition issues that are still problems and still not fully documented.
>
> - AVX2. Horrible undocumented performance issues. Otherwise maybe okay?
>
> - MPX: maybe the best example, but the compat mode part got flubbed and it’s MPX.
>
> - PKRU: Should never have been in XSTATE. (Also, having WRPKRU in the ISA was a major mistake, now unfixable, that seriously limits the usefulness of the whole feature. I suppose Intel could release PKRU2 with a better ISA and deprecate the original PKRU, but I’m not holding my breath.)
>
> - AVX512: Yet more uarch-dependent horrible performance issues, and Intel has still not responded about documentation. The web is full of people speculating differently about when, exactly, using AVX512 breaks performance. This is NAKked in kernel until docs arrive. Also, it broke old user programs. If we had noticed a few years ago, AVX512 enablement would have been reverted.
>
> - AMX: This mess.
>
> The current system of automatic user enablement does not work. We need something better.

Hi Andy,

Can you provide a concise definition of the exact problem(s) this thread is attempting to address?
Thanks ahead of time for excluding "blow up power consumption", since that paranoia is not grounded in fact.

thanks,
-Len
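
For readers following the thread, the acquire/use/release model described in the quoted text can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the kernel's behavior or any proposed ABI: the names Task, tile_access, context_switch, library_matmul and demo are all hypothetical, and the only hardware fact used is that AMX tile data is 8 tiles of 1 KiB each. The sketch shows a library taking tile access without main() knowing, and the per-context-switch cost applying only while access is held.

```python
from contextlib import contextmanager
from dataclasses import dataclass

TILE_STATE_BYTES = 8 * 1024  # AMX tile data: 8 tiles x 1 KiB


@dataclass
class Task:
    """Hypothetical stand-in for a thread's view of tile state."""
    name: str
    holds_tiles: bool = False    # True between acquire and release
    switch_bytes_saved: int = 0  # tile bytes saved across context switches


@contextmanager
def tile_access(task):
    """Hypothetical explicit request for tile access, dropped on exit."""
    task.holds_tiles = True       # stands in for a syscall or first-use fault
    try:
        yield task
    finally:
        task.holds_tiles = False  # stands in for TILERELEASE


def context_switch(task):
    """Toy scheduler hook: tile state is saved only while it is held."""
    if task.holds_tiles:
        task.switch_bytes_saved += TILE_STATE_BYTES


def library_matmul(task):
    """A library routine that uses tiles without main() knowing."""
    with tile_access(task):
        context_switch(task)      # preempted while tiles are live


def demo():
    task = Task("main")
    context_switch(task)          # nothing held: no extra state to save
    library_matmul(task)
    context_switch(task)          # released: switches are cheap again
    return task
```

In this toy model only the switch that happens inside library_matmul() pays for tile state, which is the distinction being argued over: acquiring and releasing may be cheap, but holding access across a context switch is not free.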