From: Yongji Xie
Date: Tue, 28 Mar 2023 12:04:44 +0800
Subject: Re: [PATCH v4 03/11] virtio-vdpa: Support interrupt affinity spreading mechanism
To: Jason Wang
Cc: "Michael S. Tsirkin", Thomas Gleixner, Christoph Hellwig, virtualization, linux-kernel
References: <20230323053043.35-1-xieyongji@bytedance.com> <20230323053043.35-4-xieyongji@bytedance.com>
On Tue, Mar 28, 2023 at 11:44 AM Jason Wang wrote:
>
> On Tue, Mar 28, 2023 at 11:33 AM Yongji Xie wrote:
> >
> > On Tue, Mar 28, 2023 at 11:14 AM Jason Wang wrote:
> > >
> > > On Tue, Mar 28, 2023 at 11:03 AM Yongji Xie wrote:
> > > >
> > > > On Fri, Mar 24, 2023 at 2:28 PM Jason Wang wrote:
> > > > >
> > > > > On Thu, Mar 23, 2023 at 1:31 PM Xie Yongji wrote:
> > > > > >
> > > > > > To support the interrupt affinity spreading mechanism, this
> > > > > > makes use of group_cpus_evenly() to create an IRQ callback
> > > > > > affinity mask for each virtqueue of the vDPA device. Then we
> > > > > > will unify the set_vq_affinity callback to pass the affinity
> > > > > > to the vDPA device driver.
> > > > > >
> > > > > > Signed-off-by: Xie Yongji
> > > > >
> > > > > Thinking hard about all the logic, I think I've found something
> > > > > interesting.
> > > > >
> > > > > Commit ad71473d9c437 ("virtio_blk: use virtio IRQ affinity")
> > > > > tries to pass irq_affinity to the transport-specific
> > > > > find_vqs(). This seems to be a layer violation, since the
> > > > > driver has no knowledge of:
> > > > >
> > > > > 1) whether or not the callback is based on an IRQ
> > > > > 2) whether or not the device is a PCI device (the details are
> > > > > hidden by the transport driver)
> > > > > 3) how many vectors could be used by a device
> > > > >
> > > > > This means the driver can't actually pass a real affinity mask,
> > > > > so the commit in fact passes a zeroed irq_affinity structure as
> > > > > a hint, and the PCI layer builds a default affinity that groups
> > > > > CPUs evenly based on the number of MSI-X vectors (the core
> > > > > logic is group_cpus_evenly()). I think we should fix this by
> > > > > replacing the irq_affinity structure with
> > > > >
> > > > > 1) a boolean like auto_cb_spreading
> > > > >
> > > > > or
> > > > >
> > > > > 2) a queue-to-CPU mapping
> > > > >
> > > >
> > > > But only the driver knows which queues are used in the control
> > > > path, which don't need the automatic IRQ affinity assignment.
> > >
> > > Is this knowledge provided by the transport driver now?
> > >
> >
> > This knowledge is provided by the device driver rather than the
> > transport driver.
> >
> > E.g. virtio-scsi uses:
> >
> >     struct irq_affinity desc = { .pre_vectors = 2 }; // vq0 is control queue, vq1 is event queue
>
> OK, but it only works as a hint; it's not a real affinity. As I
> replied, we can pass an array of booleans in this case, so the
> transport driver knows it doesn't need to use automatic affinity for
> the first two queues.
>

But we don't know whether we would use other fields of struct
irq_affinity in the future. So passing the full structure should be
better?

> > > E.g. virtio-blk uses:
> > >
> > >     struct irq_affinity desc = { 0, };
> > >
> > > At least we can tell the transport driver which vqs require
> > > automatic IRQ affinity.
> > >
> >
> > I think that is what the current implementation does.
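
For reference, the hint mechanism we are discussing looks roughly like
this from the driver side. This is only a minimal sketch against the
generic virtio_find_vqs() helper; NUM_VQS and the surrounding variables
are assumed for illustration rather than taken from any real driver:

    struct virtqueue *vqs[NUM_VQS];
    vq_callback_t *callbacks[NUM_VQS];
    const char *names[NUM_VQS];
    int err;

    /* vq0 (control) and vq1 (event) are exempt from spreading; the
     * transport spreads callback affinity evenly over the remaining
     * request queues.
     */
    struct irq_affinity desc = { .pre_vectors = 2 };

    /* virtio-pci forwards this hint to
     * pci_alloc_irq_vectors_affinity(); transports without per-vq
     * IRQs are free to ignore it.
     */
    err = virtio_find_vqs(vdev, NUM_VQS, vqs, callbacks, names, &desc);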
> >
> > > > So I think the irq_affinity structure can only be created by
> > > > device drivers and passed to the virtio-pci/virtio-vdpa driver.
> > >
> > > This could be difficult, since the driver doesn't even know how
> > > many interrupts will be used by the transport driver, so it can't
> > > build the actual affinity structure.
> >
> > The actual affinity mask is built by the transport driver,
>
> For PCI, yes, it talks directly to the IRQ subsystem.
>
> > the device driver only passes a hint about which queues don't need
> > the automatic IRQ affinity assignment.
>
> But not for virtio-vDPA, since the IRQ needs to be dealt with by the
> parent driver. For our case, that is VDUSE, which doesn't need an IRQ
> at all; a queue-to-CPU mapping is sufficient.
>

The device driver doesn't know whether it is bound to virtio-pci or
virtio-vdpa, so it should pass the full set needed by the automatic IRQ
affinity assignment instead of a subset. Then virtio-vdpa can choose to
pass a queue-to-CPU mapping to VDUSE, which is what we do now (via
set_vq_affinity()).

Thanks,
Yongji
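
P.S. To make the virtio-vdpa side concrete, here is a rough, untested
sketch of "spread evenly, then hand the mapping to the parent" using
group_cpus_evenly() and set_vq_affinity(); the function name and error
handling below are made up for illustration and simplified relative to
what the patch actually does:

    #include <linux/group_cpus.h>
    #include <linux/slab.h>
    #include <linux/vdpa.h>

    /* Build one mask per virtqueue and pass each mask to the parent
     * driver (e.g. VDUSE) as a queue-to-CPU mapping.
     */
    static void virtio_vdpa_spread_affinity(struct vdpa_device *vdpa,
                                            unsigned int nvqs)
    {
            const struct vdpa_config_ops *ops = vdpa->config;
            struct cpumask *masks = group_cpus_evenly(nvqs);
            unsigned int i;

            if (!masks)
                    return; /* no hint; the parent keeps its default */

            for (i = 0; i < nvqs; i++)
                    if (ops->set_vq_affinity)
                            ops->set_vq_affinity(vdpa, i, &masks[i]);

            kfree(masks);
    }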