Date: Fri, 25 Mar 2022 02:45:25 -0400
From: "Michael S. Tsirkin"
To: Jason Wang
Cc: Eli Cohen, Hillf Danton, virtualization, linux-kernel
Subject: Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU
Message-ID: <20220325024324-mutt-send-email-mst@kernel.org>
References: <20220321123420.3207-1-hdanton@sina.com>
 <20220324005345.3623-1-hdanton@sina.com>
 <20220324060419.3682-1-hdanton@sina.com>
 <20220324021428-mutt-send-email-mst@kernel.org>
 <20220324120217.3746-1-hdanton@sina.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 25, 2022 at 11:22:25AM +0800, Jason Wang wrote:
> On Thu, Mar 24, 2022 at 8:24 PM Eli Cohen wrote:
> >
> > > -----Original Message-----
> > > From: Hillf Danton
> > > Sent: Thursday, March 24, 2022 2:02 PM
> > > To: Jason Wang
> > > Cc: Eli Cohen; Michael S. Tsirkin; virtualization; linux-kernel
> > > Subject: Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU
> > >
> > > On Thu, 24 Mar 2022 16:20:34 +0800 Jason Wang wrote:
> > > > On Thu, Mar 24, 2022 at 2:17 PM Michael S. Tsirkin wrote:
> > > > > On Thu, Mar 24, 2022 at 02:04:19PM +0800, Hillf Danton wrote:
> > > > > > On Thu, 24 Mar 2022 10:34:09 +0800 Jason Wang wrote:
> > > > > > > On Thu, Mar 24, 2022 at 8:54 AM Hillf Danton wrote:
> > > > > > > >
> > > > > > > > On Tue, 22 Mar 2022 09:59:14 +0800 Jason Wang wrote:
> > > > > > > > >
> > > > > > > > > Yes, there will be no "infinite" loop, but since the loop is
> > > > > > > > > triggered by userspace.
> > > > > > > > > It looks to me it will delay the flush/drain of the
> > > > > > > > > workqueue forever, which is still suboptimal.
> > > > > > > >
> > > > > > > > Usually it is barely possible to shoot two birds with one stone.
> > > > > > > >
> > > > > > > > Given the "forever", I am inclined not to run faster, hehe, though
> > > > > > > > another cobble is to add another line in the loop checking if mvdev
> > > > > > > > is unregistered, and for example make mvdev->cvq unready before
> > > > > > > > destroying the workqueue.
> > > > > > > >
> > > > > > > > static void mlx5_vdpa_dev_del(struct vdpa_mgmt_dev *v_mdev, struct vdpa_device *dev)
> > > > > > > > {
> > > > > > > > 	struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev);
> > > > > > > > 	struct mlx5_vdpa_dev *mvdev = to_mvdev(dev);
> > > > > > > > 	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > > > > > > >
> > > > > > > > 	mlx5_notifier_unregister(mvdev->mdev, &ndev->nb);
> > > > > > > > 	destroy_workqueue(mvdev->wq);
> > > > > > > > 	_vdpa_unregister_device(dev);
> > > > > > > > 	mgtdev->ndev = NULL;
> > > > > > > > }
> > > > > > >
> > > > > > > Yes, so we had
> > > > > > >
> > > > > > > 1) using a quota for re-requeue
> > > > > > > 2) using something like
> > > > > > >
> > > > > > > while (READ_ONCE(cvq->ready)) {
> > > > > > > 	...
> > > > > > > 	cond_resched();
> > > > > > > }
> > > > > > >
> > > > > > > There should not be too much difference, except that we need to use
> > > > > > > cancel_work_sync() instead of flush_work() for 1).
> > > > > > >
> > > > > > > I would keep the code as is, but if you insist I can change it.
> > > > > >
> > > > > > No Sir I would not - I am simply not a fan of work requeue.
> > > > > >
> > > > > > Hillf
> > > > >
> > > > > I think I agree - requeue adds latency spikes under heavy load -
> > > > > unfortunately not measured by netperf, but still important
> > > > > for latency-sensitive workloads. Checking a flag is cheaper.
> > > >
> > > > Just spotted another possible issue.
> > > >
> > > > The workqueue will be used by another work to update the carrier
> > > > (event_handler()). Using cond_resched() may still have an unfairness
> > > > issue which blocks the carrier update for an unbounded time.
> > >
> > > Then would you please specify the reason why mvdev->wq is single
> > > threaded?
> 
> I didn't see a reason why it needs to be single threaded (ordered).
> 
> > > Given requeue, the serialization of the two works is not
> > > strong. Otherwise an unbound WQ that can process works in parallel
> > > is a cure for the unfairness above.
> 
> Yes, and we probably don't want a per-device workqueue but a per-module
> one. Or simply use the system_wq one.
> 
> > I think the proposed patch can still be used with quota equal to one.
> > That would guarantee fairness.
> > This is not performance critical and a single workqueue should be enough.
> 
> Yes, but both Hillf and Michael don't like requeuing. So my plan is:
> 
> 1) send patch 2 first since it's a hard requirement for the next RHEL release
> 2) a series to fix this hogging issue by
> 2.1) switching to a per-module workqueue
> 2.2) READ_ONCE(cvq->ready) + cond_resched()
> 
> Thanks

Actually, if we don't care about speed here, then requeuing with a quota
of 1 is fine, in that we don't have a quota at all: we just always requeue
instead of looping. It's the mix of requeue and a loop that I find
confusing.

> > 
> > > Thanks
> > > Hillf
> > 