Received: by 2002:a05:7412:1e0b:b0:fc:a2b0:25d7 with SMTP id kr11csp13983rdb; Wed, 14 Feb 2024 11:07:15 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVJ3yUolqrmawLefjm5F/YWx/BuXfWy+1vIMa3uMdaVyDdDSeWjDrb0XouH9/PB2jqPuOoIQe7G2m7O8bFiONbVkyBGkCmGzW7crFG/lQ== X-Google-Smtp-Source: AGHT+IHJCBnqVrequae2HWPJOWlP2S2CKJXA1e82ffj7pvyYlnz4/gnrCO6fk9SQMtQIP8un8Gn1 X-Received: by 2002:a05:620a:1087:b0:787:1ccd:fec2 with SMTP id g7-20020a05620a108700b007871ccdfec2mr3957810qkk.63.1707937635307; Wed, 14 Feb 2024 11:07:15 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1707937635; cv=pass; d=google.com; s=arc-20160816; b=CYQ4m0c6sJpwzmEVVOQBEe9O5kofyNHs+gu/t4ZjeXaAw7E4jcfao7oSsmWhyVG293 q+QM+/J6/GRJNxG/s3wVX/oiE6wlqVX7j4Dazg94Osw741/g71U0N5JRPIdubAICaxRL Z5VKfx2ibU+K/Jk/xqMb5L2EOElkl5txcoKd4xtBilk3XOI26K+YvHhpp/bSLj8i/ToA fFtyCiRC0S1nbE5TBcFlUlF0l70u1PIdJ7K8ssc7PiBc1YpmSP5ROaCP0dl8pIblUyNB 38KE9iqc6zaFBhShlbXQdqHMQ2qQkCghxKr1VFLGh18UOslguMq6wBQyrBDl0JYqu7m+ t7ZA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:date:dkim-signature; bh=9W6Z7oRvVu2ZzbktRJ5MM+rsUMP98UB3Mb0fkhog7OI=; fh=WNAuzBuZwkcCTvNu7kT4dyrSW2GmpBU/BbURjIdUkws=; b=SJ1ztUI2n7wlRWagTzG5YhYzMKKkzJcXBFfL0LI0lHEoAEmEO8pKBp30WpDWQh3DcU p39FvpD0WIYC1KSx8B2032XlgnjuvHGhL5j/Ytjcd0bfC9eeAnDvAoWsraGsUKO1k/aB XSlb85PwOiTAma4Qabxo+SF+WBraM4yRRYKT1cV4iw/tocC9M4AFD+DZQ0TBpBgPuo0E d1Wzv5jX2g8edp5LXoLACl9ThUTmBQzn1lQR6MHwvdcFlxc56teZrX+oSeZlbbTZ7+Gh h4te/ka6T8vwOPhLqEtNrcYcn18pIQH94uWmSSkYEyQY/ARdaG23qgg1AYovqn7uVuHW U88A==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=OXFiZn1a; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-65860-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-65860-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com X-Forwarded-Encrypted: i=2; AJvYcCXZbxgcw7zPoQj8CMcqKPkO5sU9P9W+dz0tpi7+WPVUYggt8gUquSsTg/th5BcShmnTpNg15e+Mw77bZw4t6ojiv9PRP1T3HYwOoqWKZQ== Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id d2-20020a05620a240200b00783e08c89f7si1507486qkn.749.2024.02.14.11.07.15 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 14 Feb 2024 11:07:15 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-65860-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=OXFiZn1a; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-65860-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-65860-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 3EC4F1C23E4F for ; Wed, 14 Feb 2024 19:06:15 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 2BF6613A887; Wed, 14 Feb 2024 19:06:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="OXFiZn1a" Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 58EBD12D768 for ; Wed, 14 Feb 2024 19:06:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707937565; cv=none; b=M+aTX6tsp/dZL0GBMqXH5KeVxgOUAeGFQQhngEgkePLUgniraoNepJhFtn9oSWtFQKrnKB8YEP0vUasqxuyzHPCL68XRHCibjqxX2wOBZAig3ETeSlLsUmlGSQfrQKt3Y4FpKMAtkKJunI9++XfwP0xBtMmyC8RU5um2g/TQtcM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707937565; c=relaxed/simple; bh=1eBXhi+aN2hxjql0LiYY7CLtvXQBl0vtMMfB1UDrWnc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=A9T4/2xCjblLrqB+nnASOOj1g8RL29ezslD5rEphkpXGTwJvdz1u7LgWzKfI0hFezehXVpwojRcxthX9j2rreWTMbOsTt8K+46cVakxybYmTUFKpVMQcS03BYe3tKBZO0cOEQ0bRynqy/85MM5eoR9xIhXKWiQd3EZe13txPgeM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=OXFiZn1a; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707937562; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=9W6Z7oRvVu2ZzbktRJ5MM+rsUMP98UB3Mb0fkhog7OI=; b=OXFiZn1azc8kNdYDLhRvtmt0U4oKiiSARik2IFG0TkuZm9MZ4k7KE4WlfybRH//CLwNIVH 5iPsxfGXTTi9rBUHi57T1af6DTrcwc+EHuicUN4mUqQmMX/ROV8xhA/a6riSUrGA0L/g2J tmjwIGo+D90EUXMSzzYOPsSpCzodvQk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-351-dPQyhSKhM7i5mtN3w2TTgg-1; Wed, 14 Feb 2024 14:05:58 -0500 X-MC-Unique: dPQyhSKhM7i5mtN3w2TTgg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DB27D185A783; Wed, 14 Feb 2024 19:05:57 +0000 (UTC) Received: from tpad.localdomain (unknown [10.96.133.6]) by smtp.corp.redhat.com (Postfix) with ESMTPS id DE65EC185C1; Wed, 14 Feb 2024 19:05:56 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 3DD66401E1F13; Wed, 14 Feb 2024 15:58:49 -0300 (-03) Date: Wed, 14 Feb 2024 15:58:49 -0300 From: Marcelo Tosatti To: Thomas Gleixner Cc: linux-kernel@vger.kernel.org, Daniel Bristot de Oliveira , Juri Lelli , Valentin Schneider , Frederic Weisbecker , Leonardo Bras , Peter Zijlstra Subject: Re: [patch 04/12] clockevent unbind: use smp_call_func_single_fail Message-ID: References: <20240206184911.248214633@redhat.com> <20240206185709.928420669@redhat.com> <87jzngmqsw.ffs@tglx> <87plx3l6wc.ffs@tglx> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87plx3l6wc.ffs@tglx> X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.8 On Sun, Feb 11, 2024 at 09:52:35AM +0100, Thomas Gleixner wrote: > On Wed, Feb 07 2024 at 09:51, Marcelo Tosatti wrote: > > On Wed, Feb 07, 2024 at 12:55:59PM +0100, Thomas Gleixner wrote: > > > > OK, so the problem is the following: due to software complexity, one is > > often not aware of all operations that might take place. > > The problem is that people throw random crap on their systems and avoid > proper system engineering and then complain that their realtime > constraints are violated. So you are proliferating bad engineering > practices and encourage people not to care. Its more of a practicality and cost concern: one usually does not have resources to fully review software before using that software. > > Now think of all possible paths, from userspace, that lead to kernel > > code that ends up in smp_call_function_* variants (or other functions > > that cause IPIs to isolated CPUs). > > So you need to analyze every possible code path and interface and add > your magic functions there after figuring out whether that's valid or > not. "A magic function", yes. > > The alternative, from blocking this in the kernel, would be to validate all > > userspace software involved in your application, to ensure it won't end > > up in the kernel sending IPIs. Which is impractical, isnt it ? > > It's absolutely not impractical. It's part of proper system > engineering. The wet dream that you can run random docker containers and > everything works magically is just a wet dream. Unfortunately that is what people do. I understand that "full software review" would be the ideal, but in most situations it does not seem to happen. > > (or rather, with such option in the kernel, it would be possible to run > > applications which have not been validated, since the kernel would fail > > the operation that results in IPI to isolated CPU). > > That's a fallacy because you _cannot_ define with a single CPU mask > which interface is valid in a particular configuration to end up with an > IPI and which one is not. There are legitimate reasons in realtime or > latency constraint systems to invoke selective functionality which > interferes with the overall system constraints. > > How do you cover that with your magic CPU mask? You can't. > > Aside of that there is a decent chance that you are subtly breaking user > space that way. Just look at that hwmon/coretemp commit you pointed to: > > "Temperature information from the housekeeping cores should be > sufficient to infer die temperature." > > That's just wishful thinking for various reasons: > > - The die temperature on larger packages is not evenly distributed and > you can run into situations where the housekeeping cores are sitting > "far" enough away from the worker core which creates the heat spot I know. > - Some monitoring applications just stop to work when they can't read > the full data set, which means that they break subtly and you can > infer exactly nothing. > > > So the idea would be an additional "isolation mode", which when enabled, > > would disallow the IPIs. Its still possible for root user to disable > > this mode, and retry the operation. > > > > So lets say i want to read MSRs on a given CPU, as root. > > > > You'd have to: > > > > 1) readmsr on given CPU (returns -EPERM or whatever), since the > > "block interference" mode is enabled for that CPU. > > > > 2) Disable that CPU in the block interference cpumask. > > > > 3) readmsr on the given CPU (success). > > > > 4) Re-enable CPU in block interference cpumask, if desired. > > That's just wrong. Why? > > Once you enable it just to read the MSR you enable the operation for > _ALL_ other non-validated crap too. So while the single MSR read might > be OK under certain circumstances the fact that you open up a window for > all other interfaces to do far more interfering operations is a red > flag. > > This whole thing is a really badly defined policy mechanism of very > dubious value. > > Thanks, OK, fair enough. From your comments, it seems that per-callsite toggling would be ideal, for example: /sys/kernel/interference_blocking/ directory containing one sub-directory per call site. Inside each sub-directory, a "enabled" file, controlling a boolean to enable or disable interference blocking for that particular callsite. Also a "cpumask" file on each directory, by default containing the same cpumask as the nohz_full CPUs, to control to which CPUs to block the interference for. How does that sound?