Date: Wed, 24 May 2023 10:53:23 -0300
From: Marcelo Tosatti
To: Michal Hocko
Cc: Christoph Lameter, Aaron Tomlin, Frederic Weisbecker, Andrew Morton,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Russell King,
 Huacai Chen, Heiko Carstens, x86@kernel.org, Vlastimil Babka
Subject: Re: [PATCH v8 00/13] fold per-CPU vmstats remotely
References: <20230515180015.016409657@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 24, 2023 at 02:51:55PM +0200, Michal Hocko wrote:
> [Sorry for the late response, but I was conferencing the last two weeks
> and am now catching up]
>
> On Mon 15-05-23 15:00:15, Marcelo Tosatti wrote:
> [...]
> > v8
> > - Add summary of discussion on -v7 to cover letter
>
> Thanks, this is very useful! It helps to frame the further discussion.
> > I believe the most important question to answer is this in fact
> >
> > I think what needs to be done is to avoid new queue_work_on()
> > users from being introduced in the tree (the number of
> > existing ones is finite and can therefore be fixed).
> >
> > Agree with the criticism here; however, I can't see options other
> > than the following:
> >
> > 1) Given an activity, which contains a sequence of instructions
> > to execute on a CPU, change the algorithm
> > to execute that code remotely (and therefore avoid interrupting the CPU),
> > or avoid the interruption somehow (which must be dealt with
> > on a case-by-case basis).
> >
> > 2) Block that activity from happening in the first place,
> > for the sites where it can be blocked (those that return errors to
> > userspace, for example).
> >
> > 3) Completely isolate the CPU from the kernel (off-line it).
>
> I agree that a reliable CPU isolation implementation needs to address
> the queue_work_on problem. And it has to do that _reliably_. This cannot
> be achieved by an endless game of whack-a-mole, chasing each new
> instance. There must be a more systematic approach. One way would be to
> change the semantics of schedule_work_on and fail the call for an
> isolated CPU. The caller would then have a way to fall back and handle
> the operation by other means. E.g. vmstat could simply skip folding pcp
> data, because an imprecision shouldn't really matter. Other callers
> might choose to do the operation remotely. This is a lot of work, no
> doubt about that, but it is a long-term maintainable solution that
> doesn't give you new surprises with every newly released kernel. There
> are likely other remote interfaces that would need to follow the same
> scheme.
>
> If cpu isolation is not planned to be worth that time investment,
> then I do not think it is worth de-optimizing the highly optimized
> vmstat code either. These stats are invoked from many hot paths and the
> per-cpu implementation has been optimized for that case.
It is exactly the same code, but now with a "LOCK" prefix on the CMPXCHG
instruction, which should not cost much thanks to cache locking (these
are per-CPU variables anyway, so the cache line is rarely contended).

> If your workload would like to avoid the disturbance, then there is
> already the quiet_vmstat precedent, so find a way to use it for your
> workload instead.
>
> --
> Michal Hocko
> SUSE Labs

OK, so an alternative solution is to completely disable vmstat updates
for isolated CPUs. Are you OK with that?