Date: Thu, 28 May 2020 14:23:25 -0400
From: Joel Fernandes
To: Peter Zijlstra
Cc: Phil Auld, Nishanth Aravamudan, Julien Desfossez, Tim Chen,
    mingo@kernel.org, tglx@linutronix.de, pjt@google.com,
    torvalds@linux-foundation.org, vpillai, linux-kernel@vger.kernel.org,
    fweisbec@gmail.com, keescook@chromium.org, Aaron Lu, Aubrey Li,
    aubrey.li@linux.intel.com, Valentin Schneider, Mel Gorman,
    Pawan Gupta, Paolo Bonzini
Subject: Re: [PATCH RFC] sched: Add a per-thread core scheduling interface
Message-ID: <20200528182325.GA176149@google.com>
In-Reply-To: <20200528170128.GN2483@worktop.programming.kicks-ass.net>

Hi Peter,

On Thu, May 28, 2020 at 07:01:28PM +0200, Peter Zijlstra wrote:
> On Sun, May 24, 2020 at 10:00:46AM -0400, Phil Auld wrote:
> > On Fri, May 22, 2020 at 05:35:24PM -0400 Joel Fernandes wrote:
> > > On Fri, May 22, 2020 at 02:59:05PM +0200, Peter Zijlstra wrote:
> > > [..]
> > > > > > It doesn't allow tasks to form their own groups (by for example setting
> > > > > > the key to that of another task).
> > > > >
> > > > > So for this, I was thinking of making the prctl pass in an integer. And 0
> > > > > would mean untagged. Does that sound good to you?
> > > >
> > > > A TID, I think. If you pass your own TID, you tag yourself as
> > > > not-sharing. If you tag yourself with another task's TID, you can do
> > > > ptrace tests to see if you're allowed to observe their junk.
> > >
> > > But that would require a bunch of tasks agreeing on which TID to tag with.
> > > For example, if 2 tasks tag with each other's TID, then they would have
> > > different tags and not share.
>
> Well, don't do that then ;-)

We could also guard it with a mutex. The first task to set the TID wins;
the other thread just reuses the cookie of the TID that won.

But I think we cannot just use the TID value as the cookie, because TIDs
wrap around and get reused. Otherwise we could accidentally group 2
unrelated tasks. Instead, I suggest we keep the TID as the interface, per
your suggestion, and do the needed ptrace checks, but convert the TID to
the task_struct pointer value and use that as the cookie for the group of
tasks sharing a core.

Thoughts?

thanks,

 - Joel

> > > What's wrong with passing in an integer instead? In any case, we would do the
> > > CAP_SYS_ADMIN check to limit who can do it.
>
> So the actual permission model can be different depending on how broken
> the hardware is.
>
> > > Also, one thing CGroup interface allows is an external process to set the
> > > cookie, so I am wondering if we should use sched_setattr(2) instead of, or in
> > > addition to, the prctl(2). That way, we can drop the CGroup interface
> > > completely. How do you feel about that?
> > >
> >
> > I think it should be an arbitrary 64-bit value, in both interfaces to avoid
> > any potential reuse security issues.
> >
> > I think the cgroup interface could be extended not to be a boolean but take
> > the value. With 0 being untagged as now.
>
> How do you avoid reuse in such a huge space? That just creates yet
> another problem for the kernel to keep track of who is who.
>
> With random u64 numbers, it even becomes hard to determine if you're
> sharing at all or not.
>
> Now, with the current SMT+MDS trainwreck, any sharing is bad because it
> allows leaking kernel privates. But under a less severe threat scenario,
> say where only user data would be at risk, the ptrace() tests make
> sense, but those become really hard with random u64 numbers too.
>
> What would the purpose of random u64 values be for cgroups? That only
> replicates the problem of determining uniqueness there. Then you can get
> two cgroups unintentionally sharing because you got lucky.
>
> Also, fundamentally, we cannot have more threads than TID space, it's a
> natural identifier.
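
To make the "first setter wins" idea from my reply concrete, here is a toy
user-space model in Python. It is only a sketch under stated assumptions,
not kernel code: Task, ToyScheduler, spawn(), exit() and tag() are all
hypothetical names, a never-reused counter stands in for the task_struct
pointer value, and the ptrace permission checks are left out entirely.

```python
import itertools
import threading


class Task:
    """Toy stand-in for a task; NOT the kernel's task_struct."""

    def __init__(self, tid):
        self.tid = tid
        self.cookie = 0  # 0 means untagged, as proposed in the thread


class ToyScheduler:
    """Hypothetical model of TID-based tagging guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()  # the mutex guarding tag updates
        self._by_tid = {}              # live TID -> Task
        # Assumption: a counter that is never reused stands in for the
        # task_struct pointer value suggested as the cookie.
        self._next_cookie = itertools.count(1)

    def spawn(self, tid):
        task = Task(tid)
        self._by_tid[tid] = task
        return task

    def exit(self, task):
        # The TID becomes free and may be handed to a new task later.
        del self._by_tid[task.tid]

    def tag(self, caller, target_tid):
        """Caller asks to share a core with the current owner of target_tid."""
        with self._lock:
            target = self._by_tid[target_tid]
            if target.cookie == 0:
                # First setter wins: the target gets a fresh cookie.
                target.cookie = next(self._next_cookie)
            # Everyone else just reuses the cookie that is already there.
            caller.cookie = target.cookie


sched = ToyScheduler()

# Two tasks tagging with each other's TID still end up in one group.
a, b = sched.spawn(100), sched.spawn(101)
sched.tag(a, 101)
sched.tag(b, 100)
mutual_share = a.cookie == b.cookie

# TID 100 is reused by an unrelated task after 'a' exits.
sched.exit(a)
c, d = sched.spawn(100), sched.spawn(102)
sched.tag(d, 100)
reuse_isolated = d.cookie != b.cookie
```

In this model the mutual-tagging case converges on a single cookie, and a
task that tags a reused TID lands in a new group rather than the old one.
With the raw TID as the cookie, both groups would carry the value 100 and
could be scheduled together by accident.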