From: Len Brown
Date: Fri, 21 May 2021 19:31:36 -0400
Subject: Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
To: Thomas Gleixner
Cc: Andy Lutomirski, Florian Weimer, Dave Hansen, Dave Hansen via Libc-alpha, Rich Felker, Linux API, "Bae, Chang Seok", "the arch/x86 maintainers", Linux Kernel Mailing List, Kyle Huey, Borislav Petkov, Keno Fischer, Arjan van de Ven, Willy Tarreau
X-Mailing-List: linux-kernel@vger.kernel.org

With this proposed API, we seem to be combining two requirements, and I wonder if we should be treating them independently.

Requirement 1: "Fine-grained control". We want the kernel to be able to prohibit a program from using AMX. The foundation for this is a system call to which the kernel can say "No". It may deny access for whatever reason it wants, including inability to allocate a buffer, or some TBD administrator-invoked hook in the system call, say membership or lack of membership of the process in an empowered cgroup.

Requirement 2: Ability to synchronously fail upon buffer allocation. I agree that pthread_create() returning an error code is a friendlier way to kill a program than a SIGSEGV when touching AMX state for the first time. But the reality is that the program is almost certainly going to exit either way.
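
To make Requirement 1 concrete, here is a minimal sketch of a permission call that can say "No" for either of the reasons mentioned above. Everything in it is hypothetical: fake_amx_request() stands in for the proposed system call, struct fake_kernel_policy models the kernel-side decision, and the choice of EPERM and ENOMEM as error codes is an assumption, not something this thread has settled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of what the kernel consults before granting AMX. */
struct fake_kernel_policy {
    bool cgroup_allows_amx;  /* administrator-controlled hook (still TBD) */
    bool buffer_available;   /* could a state buffer be allocated? */
};

/*
 * Stand-in for the proposed system call (not a real kernel interface).
 * Returns 0 on success or a negative errno value when the kernel says "No".
 */
static int fake_amx_request(const struct fake_kernel_policy *policy)
{
    if (!policy->cgroup_allows_amx)
        return -EPERM;   /* assumed code for a policy refusal */
    if (!policy->buffer_available)
        return -ENOMEM;  /* assumed code for an allocation failure */
    return 0;
}
```

The point is only that both kinds of refusal arrive through the same synchronous return path, so the caller learns why it may not use AMX before it ever touches AMX state.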
So the 1st question is whether the system call requesting permission should be on a per-process basis or a per-task basis.

A. per-task. If we do it this way, then we will likely wind up mandating a GET at the start of every routine in every library that touches AMX, and potentially also a PUT. This is because the library has no idea what thread called it. The plus is that this will address the "used once and sits on a buffer for the rest of the process lifetime" scenario. The minus is that high-performance users will be executing thousands of unnecessary system calls that have zero value.

B. per-process. If we do it this way, then the run-time linker can do a single system call on behalf of the entire process, and there is no need to sprinkle system calls throughout the library. Presumably the startup code would query CPUID, query XCR0, query this system call, and set a global variable to be read by all threads going forward. The plus is that permission makes more sense on a process basis than on a task basis. Why would the kernel give one thread in a process permission, and not another thread -- and if that happened, would a process actually be able to figure out what to do?

If we do per-process, I don't see that the PUT call would be useful, and I would skip it.

Neither A nor B has an advantage in the situation where a thread is created long after initialization and faces memory allocation failure. A synchronously fails in the new system call, and B synchronously fails in pthread_create().

The 2nd question is whether "successful permission" implies synchronous allocation, or whether it allows "please enable on-demand dynamic allocation".

X. Synchronous allocation. Allocation failures return a synchronous error code, explaining why the program needs to exit. The downside is that it is likely that in both cases A and B, every thread in the program will allocate a buffer, whether or not it ever uses it.
Indeed, it is possible that the API we have invented to manage AMX buffer use will actually *increase* AMX buffer use...

Y. Enable on-demand allocation. Here the system call arranges for XFD not to kill the process but instead, on first use, to allocate a buffer for the thread that is actually touching AMX. The benefit is that if you have a program with many threads, only the ones that actually use AMX will allocate buffers. Of course the downside is that this program is exposed to a SIGSEGV if vmalloc fails in that run-time allocation, rather than a friendly pthread_create() error return killing the program.

And, of course, we can have our cake and eat it too, by having the syscall tell the kernel whether it wants (X) or (Y). The question is whether it is worth the complexity of having two options.

Thoughts?

-Len
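
As a rough illustration of the buffer-count difference between (X) and (Y), here is a toy model, not kernel code. All names (enum amx_mode, struct sim_thread, the sim_* helpers) are made up for the example, and allocation failure is deliberately left out so that only the number of buffers is compared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mode flag the syscall could pass to the kernel. */
enum amx_mode { AMX_SYNC_ALLOC /* X */, AMX_ON_DEMAND /* Y */ };

struct sim_thread {
    bool touches_amx;  /* does this thread ever execute AMX instructions? */
    bool has_buffer;   /* has a state buffer been allocated for it? */
};

/* Model thread creation: in mode X the buffer is allocated up front. */
static void sim_create(struct sim_thread *t, enum amx_mode mode, bool touches_amx)
{
    t->touches_amx = touches_amx;
    t->has_buffer = (mode == AMX_SYNC_ALLOC);
}

/* Model running the thread: in mode Y the first AMX use allocates. */
static void sim_run(struct sim_thread *t)
{
    if (t->touches_amx && !t->has_buffer)
        t->has_buffer = true;
}

/* Buffers in use after n threads run, of which the first n_users touch AMX. */
static size_t sim_count_buffers(enum amx_mode mode, size_t n, size_t n_users)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        struct sim_thread t;
        sim_create(&t, mode, i < n_users);
        sim_run(&t);
        if (t.has_buffer)
            count++;
    }
    return count;
}
```

With 8 threads of which 2 touch AMX, the model ends up with 8 buffers under (X) and 2 under (Y). What it does not capture is the cost of (Y): the allocation on first use is exactly where the SIGSEGV exposure comes from.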