From: Linus Torvalds
Date: Thu, 23 Mar 2023 11:04:25 -0700
Subject: Re: [GIT PULL] fsverity fixes for v6.3-rc4
To: Tejun Heo
Cc: Eric Biggers, fsverity@lists.linux.dev, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, "Theodore Ts'o", Nathan Huckleberry,
    Victor Hsieh, Lai Jiangshan
References: <20230320210724.GB1434@sol.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 22, 2023 at 6:04 PM Tejun Heo wrote:
>
> Thanks for the pointers. They all seem plausible symptoms of work items
> getting bounced across slow cache boundaries. I'm off for a few weeks so
> can't really dig in right now but will get to it afterwards.

So just as a gut feeling, I suspect that one solution would be to
always *start* the work on the local CPU (where "local" might be the
same, or at least a sibling).

The only reason to migrate to another CPU would be if the work is
CPU-intensive, and I do suspect that is commonly not really the case.

And I strongly suspect that our WQ_CPU_INTENSIVE flag is pure garbage,
and should just be gotten rid of, because what could be considered
"CPU intensive" in one situation might not be CPU intensive in
another one, so trying to use some static knowledge about it is just
pure guess-work.

The different situations might be purely contextual things ("heavy
network traffic when NAPI polling kicks in"), but it might also be
purely hardware-related (ie "this is heavy if we don't have CPU hw
acceleration for crypto, but cheap if we do").

So I really don't think it should be some static decision, either
through WQ_CPU_INTENSIVE _or_ through "WQ_UNBOUND means schedule on
first available CPU".

Wouldn't it be much nicer if we just noticed it dynamically, and
WQ_UNBOUND would mean that the workqueue _can_ be scheduled on another
CPU if it ends up being advantageous?

And we actually kind of have that dynamic flag already, in the form of
the scheduler. It might even be explicit in the context of the
workqueue (with "need_resched()" being true and the workqueue code
itself might notice it and explicitly then try to spread it out), but
with preemption it's more implicit and maybe it needs a bit of
tweaking help.
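
To make the "start local, spread out only when the scheduler says so" idea concrete, here is a toy user-space model in Python. It is not kernel code and does not reflect how the workqueue code is actually written. The names `WorkItem`, `run_local_first` and `slice_budget` are made up for illustration, and "still running after `slice_budget` slices" is only a stand-in for a signal like need_resched() becoming true.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WorkItem:
    name: str
    slices: int  # toy measure of how much CPU time the work needs


def run_local_first(
    item: WorkItem,
    local_cpu: int,
    idle_cpus: List[int],
    slice_budget: int,
) -> List[Tuple[int, int]]:
    """Return (cpu, slices_run) segments showing where the item executed.

    The item always starts on local_cpu. Only if it is still running after
    slice_budget slices (a hypothetical stand-in for the scheduler wanting
    the CPU back) is the remainder moved to another idle CPU.
    """
    ran_locally = min(item.slices, slice_budget)
    remaining = item.slices - ran_locally

    # Cheap work never leaves the CPU that queued it.
    if remaining == 0:
        return [(local_cpu, ran_locally)]

    # The work turned out to be heavy: look for somewhere else to run it.
    others = [cpu for cpu in idle_cpus if cpu != local_cpu]
    if not others:
        # Nowhere better to go, so finish locally.
        return [(local_cpu, item.slices)]

    return [(local_cpu, ran_locally), (others[0], remaining)]


if __name__ == "__main__":
    print(run_local_first(WorkItem("short", 2), 3, [0, 1], slice_budget=4))
    print(run_local_first(WorkItem("long", 10), 3, [0, 1], slice_budget=4))
```

The point of the sketch is only that "CPU intensive" is observed at run time rather than declared up front by a flag: the same item stays local when it is cheap and gets spread out when it is not.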

So that's what I mean by "start the work as local CPU work" - use that
as the baseline decision (since it's going to be the case that has
cache locality), and actively try to avoid spreading things out unless
we have an explicit reason to, and that reason we could just get from
the scheduler.

The worker code already has that "wq_worker_sleeping()" callback from
the scheduler, but that only triggers when a worker is going to sleep.
I'm saying that the "scheduler decided to schedule out a worker" case
might be used as a "Oh, this is CPU intensive, let's try to spread it
out".

See what I'm trying to say?

And yes, the WQ_UNBOUND case does have a weak "prefer local CPU" in
how it basically tends to try to pick the current CPU unless there is
some active reason not to (ie the whole "wq_select_unbound_cpu()"
code), but I suspect that is then counter-acted by the fact that it
will always pick the workqueue pool by node - so the "current CPU"
ends up probably being affected by what random CPU that pool was
running on.

An alternative to any scheduler interaction thing might be to just
tweak "first_idle_worker()". I get the feeling that that choice is
just horrid, and that is another area that could really try to take
locality into account. insert_work() really seems to pick a random
worker from the pool - which is fine when the pool is per-cpu, but
when it's the unbound "random node" pool, I really suspect that it
might be much better to try to pick a worker that is on the right cpu.

But hey, I may be missing something. You know this code much better
than I do.

                Linus
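
As a rough illustration of the "pick a worker that is on the right cpu" suggestion, here is a toy Python model of a locality-aware idle-worker choice. It is not the kernel's first_idle_worker() and makes no claim about how that function works internally. The `Worker` class, its `last_cpu` field, the `siblings` set and `pick_idle_worker` are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Worker:
    wid: int
    last_cpu: int  # hypothetical bookkeeping: CPU this worker last ran on
    idle: bool


def pick_idle_worker(
    workers: List[Worker],
    submit_cpu: int,
    siblings: Set[int],
) -> Optional[Worker]:
    """Pick an idle worker, preferring ones close to the submitting CPU.

    Preference order: same CPU, then a sibling CPU, then any idle worker
    (which is the locality-blind fallback).
    """
    idle = [w for w in workers if w.idle]
    if not idle:
        return None

    # Best case: a worker that last ran on the CPU queueing the work.
    for w in idle:
        if w.last_cpu == submit_cpu:
            return w

    # Next best: a worker on a sibling CPU (assumed to share cache).
    for w in idle:
        if w.last_cpu in siblings:
            return w

    # Otherwise fall back to whatever idle worker comes first.
    return idle[0]


if __name__ == "__main__":
    pool = [Worker(0, 7, True), Worker(1, 2, True), Worker(2, 3, True)]
    print(pick_idle_worker(pool, submit_cpu=3, siblings={2}))
```

In a per-cpu pool every worker would already be on the right CPU, so the ordering only matters for the unbound, per-node case described above.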