Subject: Re: [RFC PATCH v1] pin_on_cpu: Introduce thread CPU pinning system call
From: "H. Peter Anvin"
To: Mathieu Desnoyers, Chris Lameter
Cc: Jann Horn, Peter Zijlstra, Thomas Gleixner, linux-kernel,
    Joel Fernandes, Ingo Molnar, Catalin Marinas, Dave Watson, Will Deacon,
    shuah, Andi Kleen, linux-kselftest, Russell King, Michael Kerrisk, Paul,
    Paul Turner, Boqun Feng, Josh Triplett, rostedt, Ben Maurer, linux-api,
    Andy Lutomirski
Date: Wed, 22 Jan 2020 23:53:54 -0800
In-Reply-To: <1648013936.596672.1579655468604.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020-01-21 17:11, Mathieu Desnoyers wrote:
> ----- On Jan 21, 2020, at 4:44 PM, Chris Lameter cl@linux.com wrote:
>
>> These scenarios are all pretty complex and will be difficult to understand
>> for the user of these APIs.
>>
>> I think the easiest solution (and most comprehensible) is for the user
>> space process that does per cpu operations to get some sort of signal. If
>> it's not able to handle that then terminate it. The code makes a basic
>> assumption after all that the process is running on a specific cpu. If
>> this is no longer the case then it's better to abort if the process cannot
>> handle moving to a different processor.
>
> The point of pin_on_cpu() is to allow threads to access per-cpu data
> structures belonging to a given CPU even if they cannot run on that
> CPU (because it is offline).
>
> I am not sure what scenario your signal delivery proposal aims to cover.
>
> Just to try to put this into the context of a specific scenario to see
> if I understand your point, is the following what you have in mind?
>
> 1. Thread A issues pin_on_cpu(5),
> 2. Thread B issues sched_setaffinity removing cpu 5 from thread A's
>    affinity mask,
> 3. Noticing that it would generate an invalid combination, rather than
>    failing sched_setaffinity, it would send a SIGSEGV (or other) signal
>    to thread A.
>
> Or do you have something entirely different in mind?
>

I would agree that this seems like the only sane option, or you will be in
a world of hurt because of conflicting semantics.
It is not just offlining, but what happens if a policy manager calls
sched_setaffinity() on another thread -- and now the universe breaks
because a library is updated to use this new system call which collides
with the expectations of the policy manager.

There doesn't seem to be any way to get this to be a local event which
doesn't break assumptions elsewhere in the system without making this an
abort event of some type.

However, signals are painful in their own right, mostly because of the lack
of any infrastructure for allocating signals to libraries in user space. I
was actually thinking about exactly that issue just this weekend.

	-hpa
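
For readers following the thread, the collision discussed above can be
sketched as a small user-space toy model. This is only an illustration under
stated assumptions: ModelThread, the "fail" and "signal" policy names, and the
EINVAL return values are hypothetical and are not taken from the RFC patch or
from the kernel. The model compares the two outcomes being debated when
another thread removes the pinned CPU from the affinity mask.

```python
import errno
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ModelThread:
    """Toy stand-in for a thread; not a real kernel structure."""
    name: str
    affinity: Set[int]
    pinned_cpu: Optional[int] = None  # set by the modelled pin_on_cpu()
    pending_signals: List[str] = field(default_factory=list)


def pin_on_cpu(thread: ModelThread, cpu: int) -> int:
    """Model of the proposed call: record the pin.

    Rejecting a CPU outside the affinity mask is an assumption of this
    model, not something confirmed by the patch.
    """
    if cpu not in thread.affinity:
        return -errno.EINVAL
    thread.pinned_cpu = cpu
    return 0


def set_affinity(thread: ModelThread, new_mask, policy: str) -> int:
    """Model of sched_setaffinity() issued by some other thread.

    policy="fail":   reject a mask that excludes the pinned CPU.
    policy="signal": apply the mask, drop the pin, queue an abort-style signal.
    """
    new_mask = set(new_mask)
    conflict = thread.pinned_cpu is not None and thread.pinned_cpu not in new_mask
    if conflict and policy == "fail":
        return -errno.EINVAL  # the policy manager sees an unexpected failure
    thread.affinity = new_mask
    if conflict:
        thread.pinned_cpu = None
        thread.pending_signals.append("SIGSEGV")  # "or other", per the thread
    return 0


# Scenario from the quoted message: A pins to CPU 5, B removes CPU 5.
thread_a = ModelThread("A", affinity={4, 5, 6})
pin_on_cpu(thread_a, 5)
result = set_affinity(thread_a, {4, 6}, policy="signal")
```

Under the "fail" policy the surprise lands on the caller of
sched_setaffinity(); under the "signal" policy it lands on the pinned thread,
which is the abort-style event argued for above.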