From: Jason Wang
Date: Fri, 25 Mar 2022 15:53:09 +0800
Subject: Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU
To: "Michael S. Tsirkin"
Cc: Eli Cohen, Hillf Danton, virtualization, linux-kernel
In-Reply-To: <20220325024324-mutt-send-email-mst@kernel.org>
References: <20220321123420.3207-1-hdanton@sina.com> <20220324005345.3623-1-hdanton@sina.com> <20220324060419.3682-1-hdanton@sina.com> <20220324021428-mutt-send-email-mst@kernel.org> <20220324120217.3746-1-hdanton@sina.com> <20220325024324-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 25, 2022 at 2:45 PM Michael S. Tsirkin wrote:
>
> On Fri, Mar 25, 2022 at 11:22:25AM +0800, Jason Wang wrote:
> > On Thu, Mar 24, 2022 at 8:24 PM Eli Cohen wrote:
> > >
> > > > -----Original Message-----
> > > > From: Hillf Danton
> > > > Sent: Thursday, March 24, 2022 2:02 PM
> > > > To: Jason Wang
> > > > Cc: Eli Cohen; Michael S. Tsirkin; virtualization; linux-kernel
> > > > Subject: Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU
> > > >
> > > > On Thu, 24 Mar 2022 16:20:34 +0800 Jason Wang wrote:
> > > > > On Thu, Mar 24, 2022 at 2:17 PM Michael S. Tsirkin wrote:
> > > > > > On Thu, Mar 24, 2022 at 02:04:19PM +0800, Hillf Danton wrote:
> > > > > > > On Thu, 24 Mar 2022 10:34:09 +0800 Jason Wang wrote:
> > > > > > > > On Thu, Mar 24, 2022 at 8:54 AM Hillf Danton wrote:
> > > > > > > > >
> > > > > > > > > On Tue, 22 Mar 2022 09:59:14 +0800 Jason Wang wrote:
> > > > > > > > > >
> > > > > > > > > > Yes, there will be no "infinite" loop, but since the loop is
> > > > > > > > > > triggered by userspace, it looks to me like it will delay the
> > > > > > > > > > flush/drain of the workqueue forever, which is still suboptimal.
> > > > > > > > >
> > > > > > > > > Usually it is barely possible to shoot two birds with one stone.
> > > > > > > > >
> > > > > > > > > Given the "forever", I am inclined not to run faster, hehe, though
> > > > > > > > > another stopgap is to add a line in the loop checking whether mvdev
> > > > > > > > > is unregistered, and, for example, make mvdev->cvq unready before
> > > > > > > > > destroying the workqueue:
> > > > > > > > >
> > > > > > > > > static void mlx5_vdpa_dev_del(struct vdpa_mgmt_dev *v_mdev, struct vdpa_device *dev)
> > > > > > > > > {
> > > > > > > > >         struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev);
> > > > > > > > >         struct mlx5_vdpa_dev *mvdev = to_mvdev(dev);
> > > > > > > > >         struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > > > > > > > >
> > > > > > > > >         mlx5_notifier_unregister(mvdev->mdev, &ndev->nb);
> > > > > > > > >         destroy_workqueue(mvdev->wq);
> > > > > > > > >         _vdpa_unregister_device(dev);
> > > > > > > > >         mgtdev->ndev = NULL;
> > > > > > > > > }
> > > > > > > >
> > > > > > > > Yes, so we had:
> > > > > > > >
> > > > > > > > 1) using a quota for re-requeue
> > > > > > > > 2) using something like
> > > > > > > >
> > > > > > > > while (READ_ONCE(cvq->ready)) {
> > > > > > > >         ...
> > > > > > > >         cond_resched();
> > > > > > > > }
> > > > > > > >
> > > > > > > > There should not be too much difference, except that we need to use
> > > > > > > > cancel_work_sync() instead of flush_work() for 1).
> > > > > > > >
> > > > > > > > I would keep the code as is, but if you insist I can change it.
> > > > > > >
> > > > > > > No Sir I would not - I am simply not a fan of work requeue.
> > > > > > >
> > > > > > > Hillf
> > > > > >
> > > > > > I think I agree - requeue adds latency spikes under heavy load -
> > > > > > unfortunately not measured by netperf, but still important
> > > > > > for latency-sensitive workloads. Checking a flag is cheaper.
> > > > >
> > > > > Just spotted another possible issue.
> > > > >
> > > > > The workqueue will be used by another work to update the carrier
> > > > > (event_handler()). Using cond_resched() may still have an unfairness
> > > > > issue which blocks the carrier update for an unbounded time.
> > > >
> > > > Then would you please specify the reason why mvdev->wq is single
> > > > threaded?
> >
> > I didn't see a reason why it needs to be single threaded (ordered).
> >
> > > > Given requeue, the serialization of the two works is not
> > > > strong. Otherwise an unbound WQ that can process works in parallel
> > > > is a cure to the unfairness above.
> >
> > Yes, and we probably don't want a per-device workqueue but a per-module
> > one. Or simply use the system_wq.
> >
> > > I think the proposed patch can still be used with a quota equal to one.
> > > That would guarantee fairness.
> > > This is not performance critical, and a single workqueue should be enough.
> >
> > Yes, but both Hillf and Michael don't like requeuing. So my plan is:
> >
> > 1) send patch 2 first, since it's a hard requirement for the next RHEL release
> > 2) a series to fix this hogging issue by:
> > 2.1) switching to a per-module workqueue
> > 2.2) READ_ONCE(cvq->ready) + cond_resched()
> >
> > Thanks
>
> Actually, if we don't care about speed here, then requeuing with a quota
> of 1 is fine, in that we don't have a quota at all; we just always
> requeue instead of looping.
>
> It's the mix of requeue and a loop that I consider confusing.

Ok, Hillf, does this make sense to you? We want the issue to be fixed
soon; it's close to our product release.

Thanks

> >
> > Thanks
>
> Hillf