From: Luigi Rizzo
Date: Tue, 20 Apr 2021 12:41:08 +0200
Subject: Re: [PATCH] smp: add a best_effort version of smp_call_function_many()
To: Peter Zijlstra
Cc: linux-kernel, axboe@kernel.dk, paulmck@kernel.org
References: <20210419184455.2987243-1-lrizzo@google.com> <20210419191712.GB26214@worktop.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 20, 2021 at 11:14 AM Peter Zijlstra wrote:
>
> On Mon, Apr 19, 2021 at 11:07:08PM +0200, Luigi Rizzo wrote:
> > On Mon, Apr 19, 2021 at 9:17 PM Peter Zijlstra wrote:
> > >
> > > On Mon, Apr 19, 2021 at 11:44:55AM -0700, Luigi Rizzo wrote:
> > > > Regardless of the 'wait' argument, smp_call_function_many() must spin
> > > > if any of the target CPUs have their csd busy waiting to be processed
> > > > for a previous call. This may cause high tail latencies e.g. when some
> > > > of the target CPUs are running functions that disable interrupts for a
> > > > long time; getrusage() is one possible culprit.
> > > >
> > > > Here we introduce a variant, __smp_call_function_many(), that adds
> > > > a third 'best_effort' mode to the two existing ones (nowait, wait).
> > > > In best effort mode, the call will skip CPUs whose csd is busy, and if
> > > > any CPU is skipped it returns -EBUSY and the set of busy in the mask.
> > > > This allows the caller to decide how to proceed, e.g. it might retry at
> > > > a later time, or use a private csd, etc..
> > > >
> > > > The new function is a compromise to avoid touching existing callers of
> > > > smp_call_function_many(). If the feature is considered interesting, we
> > > > could even replace the 'wait' argument with a ternary 'mode' in all
> > > > smp_call_function_*() and derived methods.
> > >
> > > I don't see a user of this...
> >
> > This is actually something for which I was looking for feedback:
> >
> > my use case is similar to a periodic garbage collect request:
> > the caller tells targets that it may be time to do some work,
> > but it does not matter if the request is dropped because the
> > caller knows who was busy and will reissue pending requests later.
...
> > Any possible candidates that people can think of ?
>
> We mostly try and avoid using this stuff wherever possible. Only when
> no other choice is left do we send IPIs.
>
> NOHZ_FULL already relies on this and gets massively unhappy when a new
> user comes and starts to spray IPIs.

I am curious, why is that -- is it because the new user is stealing
the shared csd's in cfd_data (see below), or some other reason ?

>
> So no; mostly we send an IPI because we _HAVE_ to, not because giggles.
>
> That said; there's still some places left where we can avoid sending
> IPIs, but in all those cases correctness mandates we actually handle
> things and not randomly not do anything.

My case too requires that the request is eventually handled, but with
this non-blocking IPI the caller has a better option than blocking:
it can either retry the multicast IPI at a later time if conditions allow,
or it can post a dedicated CSD (with the advantage that, since my
requests are idempotent, if the CSD is locked there is no need to retry
because it means the handler has not started yet).

In fact, if we had the option to use dedicated CSDs for multicast IPI,
we wouldn't even need to retry because we'd know that the posted CSD
is for our callback and not someone else's.

cheers
luigi
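
To make the best-effort semantics discussed above concrete, here is a
rough userspace model in C. It is only a sketch, not the code from the
patch and not kernel code: no IPI is sent, and every name in it
(model_csd, shared_csd, model_call_many_best_effort, model_cpu_handle,
model_count_hit, and the small NR_CPUS value) is made up for
illustration. It assumes the nowait-style ordering in which the target
unlocks its csd before running the function, so a locked csd means the
handler has not started yet.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8 /* hypothetical, kept small for the example */

/* Userspace stand-in for one per-CPU shared csd slot. */
struct model_csd {
	atomic_bool locked;      /* true while a request is pending */
	void (*func)(void *);
	void *info;
};

/* Static storage: every slot starts unlocked (zero-initialised). */
static struct model_csd shared_csd[NR_CPUS];

/*
 * Post func(info) to every CPU set in *mask without ever spinning.
 * CPUs whose slot is already locked are skipped.  On return *mask holds
 * only the skipped (busy) CPUs.  Returns 0 if nothing was skipped,
 * -EBUSY otherwise.
 */
static int model_call_many_best_effort(uint32_t *mask,
				       void (*func)(void *), void *info)
{
	uint32_t busy = 0;

	assert(mask != NULL && func != NULL);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		uint32_t bit = UINT32_C(1) << cpu;
		struct model_csd *csd = &shared_csd[cpu];

		if (!(*mask & bit))
			continue;
		/* Try-lock: a previous value of true means "busy, skip". */
		if (atomic_exchange_explicit(&csd->locked, true,
					     memory_order_acquire)) {
			busy |= bit;
			continue;
		}
		csd->func = func;
		csd->info = info;
		/* A real implementation would queue the csd and send the IPI. */
	}

	*mask = busy;
	return busy ? -EBUSY : 0;
}

/* Stand-in for the target CPU servicing its pending request, if any. */
static void model_cpu_handle(int cpu)
{
	struct model_csd *csd = &shared_csd[cpu];

	if (!atomic_load_explicit(&csd->locked, memory_order_acquire))
		return; /* nothing pending */

	void (*func)(void *) = csd->func;
	void *info = csd->info;

	/* Unlock first, then run: the slot is reusable once func starts. */
	atomic_store_explicit(&csd->locked, false, memory_order_release);
	func(info);
}

/* Example idempotent-style handler: just counts invocations. */
static void model_count_hit(void *info)
{
	++*(int *)info;
}
```

A caller in this model would pass a mask of targets, and on -EBUSY keep
the returned mask as its "pending" set to reissue later, which is the
retry option described above. With a dedicated (private) slot per
caller instead of shared_csd, a busy result would already imply that
the caller's own request is still queued, so no retry would be needed.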