From: Thomas Gleixner
To: Peter Zijlstra, Andy Lutomirski
Cc: Tony Luck, Fenghua Yu, Ingo Molnar, Borislav Petkov, Dave Hansen,
    Lu Baolu, Joerg Roedel, Josh Poimboeuf, Dave Jiang, Jacob Jun Pan,
    Raj Ashok, "Shankar, Ravi V", iommu@lists.linux-foundation.org,
    the arch/x86 maintainers, Linux Kernel Mailing List
Subject: Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
References: <20210920192349.2602141-1-fenghua.yu@intel.com>
    <20210920192349.2602141-6-fenghua.yu@intel.com> <87y27nfjel.ffs@tglx>
    <87o88jfajo.ffs@tglx> <87k0j6dsdn.ffs@tglx>
Date: Wed, 29 Sep 2021 14:28:09 +0200
Message-ID: <87r1d78t2e.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 29 2021 at 11:54, Peter Zijlstra wrote:
> On Fri, Sep 24, 2021 at 04:03:53PM -0700, Andy Lutomirski wrote:
>> I think the perfect and the good are a bit confused here. If we go for
>> "good", then we have an mm owning a PASID for its entire lifetime. If
>> we want "perfect", then we should actually do it right: teach the
>> kernel to update an entire mm's PASID setting all at once. This isn't
>> *that* hard -- it involves two things:
>>
>> 1. The context switch code needs to resync PASID. Unfortunately, this
>> adds some overhead to every context switch, although a static_branch
>> could minimize it for non-PASID users.
>
>> 2. A change to an mm's PASID needs to send an IPI, but that IPI can't
>> touch FPU state. So instead the IPI should use task_work_add() to
>> make sure PASID gets resynced.
>
> What do we need 1 for? Any PASID change can be achieved using 2 no?
>
> Basically, call task_work_add() on all relevant tasks [1], then IPI
> spray the current running of those and presto.
>
> [1] it is nigh on impossible to find all tasks sharing an mm in any sane
> way due to CLONE_MM && !CLONE_THREAD.

Why would we want any of that at all?

  Process starts, no PASID assigned.

  bind to device -> PASID is allocated and assigned to the mm

  some task of the process issues ENQCMD -> #GP -> write PASID MSR

After that the PASID is saved and restored as part of the XSTATE and
there is no extra overhead in context switch or return to user space.

All tasks of the process which never used ENQCMD don't care and their
PASID xstate is in init state.

There is absolutely no point in enforcing that all tasks of the process
have the PASID activated immediately when it is assigned. If they need
it they get it via the #GP fixup and everything just works.

Looking at that patch again, none of this muck in fpu__pasid_write() is
required at all. The whole exception fixup is:

	if (!user_mode(regs))
		return false;

	if (!current->mm->pasid)
		return false;

	if (current->pasid_activated)
		return false;

	wrmsrl(MSR_IA32_PASID, current->mm->pasid);
	current->pasid_activated = true;
	return true;

There is zero requirement to look at TIF_NEED_FPU_LOAD or
fpregs_state_valid() simply because the #GP comes straight from user
space, which means the FPU registers contain the current task's user
space state.

If TIF_NEED_FPU_LOAD were set or fpregs_state_valid() were false after
the user_mode() check, then this would simply be a bug somewhere else
and would have nothing to do with this PASID fixup.

So no need for magic update_one_xstate_feature() wrappers, no
concurrency concerns, nothing.

It's that simple, really. Anything more complex is just a purely
academic exercise which creates more problems than it solves.

Thanks,

        tglx
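
The lazy activation flow described above (bind assigns the PASID to the mm only, the first ENQCMD of a task takes a #GP, and the fixup writes the MSR once per task) can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: struct sim_mm, struct sim_task, sim_bind_device(), sim_fixup_pasid() and sim_enqcmd() are hypothetical names, the MSR is modeled as a plain per-task field where nonzero means usable, and the real MSR layout and the XSTATE save/restore are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct sim_mm {
	uint32_t pasid;		/* 0 means no PASID assigned to this mm */
};

struct sim_task {
	struct sim_mm *mm;
	bool pasid_activated;	/* set once the fixup has written the MSR */
	uint32_t pasid_msr;	/* models the per-task PASID MSR state; 0 = init state */
};

/* Models "bind to device": the PASID is assigned to the mm, no task is touched. */
static void sim_bind_device(struct sim_mm *mm, uint32_t pasid)
{
	assert(pasid != 0);
	mm->pasid = pasid;
}

/*
 * Models the #GP fixup from the mail. Returns true if the fault was fixed up
 * and the faulting instruction can be retried, false if the #GP is genuine.
 */
static bool sim_fixup_pasid(struct sim_task *cur, bool from_user_mode)
{
	if (!from_user_mode)
		return false;

	if (!cur->mm->pasid)
		return false;

	if (cur->pasid_activated)
		return false;

	/* Stands in for wrmsrl(MSR_IA32_PASID, current->mm->pasid). */
	cur->pasid_msr = cur->mm->pasid;
	cur->pasid_activated = true;
	return true;
}

/*
 * Models a user-space ENQCMD: it works if the task's PASID state is set,
 * otherwise it faults, the fixup runs, and the instruction is retried once.
 */
static bool sim_enqcmd(struct sim_task *cur)
{
	if (cur->pasid_msr != 0)
		return true;

	if (!sim_fixup_pasid(cur, true))
		return false;	/* genuine #GP, delivered to the task */

	return cur->pasid_msr != 0;
}
```

With two tasks sharing one sim_mm, calling sim_enqcmd() on the first task after sim_bind_device() activates the PASID for that task only. The second task stays in init state until it issues its own ENQCMD, which is the point of the argument: nothing has to push the PASID to all tasks when it is assigned.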