Message-ID: <1682585517.595783-3-xuanzhuo@linux.alibaba.com>
Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs
Date: Thu, 27 Apr 2023 16:51:57 +0800
From: Xuan Zhuo
To: Wenliang Wang
Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
 edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
 "Michael S. Tsirkin"
References: <20230427043433.2594960-1-wangwenliang.1995@bytedance.com>
 <1682576442.2203932-1-xuanzhuo@linux.alibaba.com>
 <252ee222-f918-426e-68ef-b3710a60662e@bytedance.com>
 <1682579624.5395834-1-xuanzhuo@linux.alibaba.com>
 <20230427041206-mutt-send-email-mst@kernel.org>
 <1682583225.3180113-2-xuanzhuo@linux.alibaba.com>
 <20230427042259-mutt-send-email-mst@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 27 Apr 2023 16:49:58 +0800, Wenliang Wang wrote:
> On 4/27/23 4:23 PM, Michael S. Tsirkin wrote:
> > On Thu, Apr 27, 2023 at 04:13:45PM +0800, Xuan Zhuo wrote:
> >> On Thu, 27 Apr 2023 04:12:44 -0400, "Michael S. Tsirkin" wrote:
> >>> On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
> >>>> On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang wrote:
> >>>>>
> >>>>>
> >>>>> On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> >>>>>> On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang wrote:
> >>>>>>> For multi-queue and large rx-ring-size use case, the following error
> >>>>>>
> >>>>>> Could you give us a number as an example?
> >>>>>
> >>>>> 128 queues and 16K queue_size is typical.
> >>>>>
> >>>>>>
> >>>>>>> occurred when free_unused_bufs:
> >>>>>>> rcu: INFO: rcu_sched self-detected stall on CPU.
> >>>>>>>
> >>>>>>> Signed-off-by: Wenliang Wang
> >>>>>>> ---
> >>>>>>>  drivers/net/virtio_net.c | 1 +
> >>>>>>>  1 file changed, 1 insertion(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>>>>>> index ea1bd4bb326d..21d8382fd2c7 100644
> >>>>>>> --- a/drivers/net/virtio_net.c
> >>>>>>> +++ b/drivers/net/virtio_net.c
> >>>>>>> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> >>>>>>>  		struct virtqueue *vq = vi->rq[i].vq;
> >>>>>>>  		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> >>>>>>>  			virtnet_rq_free_unused_buf(vq, buf);
> >>>>>>> +		schedule();
> >>>>>>
> >>>>>> Just for rq?
> >>>>>>
> >>>>>> Do we need to do the same thing for sq?
> >>>>> Rq buffers are pre-allocated, and freeing the unused rq buffers
> >>>>> takes seconds.
> >>>>>
> >>>>> There are far fewer unused sq buffers, so doing the same for sq is
> >>>>> optional.
> >>>>
> >>>> I see.
> >>>>
> >>>> I think we should look for a way that also works well with fewer
> >>>> queues or smaller rings. Calling schedule() directly may not be a
> >>>> good way.
> >>>>
> >>>> Thanks.
> >>>
> >>> Why isn't it a good way?
> >>
> >> For a small ring, I don't think it is a good way: we may handle only
> >> one buf and then call schedule().
> >>
> >> We can call schedule() after processing a certain number of buffers,
> >> or check need_resched() first.
> >>
> >> Thanks.
> >
> >
> > Wenliang, does
> >     if (need_resched())
> >         schedule();
> > fix the issue for you?
> >
> Yeah, it works better.

I prefer to use it in combination with a fixed number (such as 256):
every time 256 buffers are processed, check need_resched(). This
accommodates both large rings and small rings.

It is also necessary to add similar logic to sq. Although the
likelihood is low, the same problem could occur there.

Thanks.

> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>>
> >>>>>>>  	}
> >>>>>>>  }
> >>>>>>>
> >>>>>>> --
> >>>>>>> 2.20.1
> >>>>>>>
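
To make the batching suggestion from this thread concrete, here is a minimal
userspace sketch in C. It is not driver code and not a proposed patch. Every
name in it (fake_vq, fake_detach_unused_buf, fake_need_resched, RESCHED_BATCH,
make_fake_vq, free_unused_bufs_batched) is a hypothetical stand-in, and
sched_yield() is used only as a rough userspace analogue of schedule(). The
value 256 is the example number mentioned in the thread.

```c
/* Userspace sketch only: all names here are hypothetical stand-ins,
 * not the virtio_net driver API. */
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

#define RESCHED_BATCH 256 /* the "fixed number" suggested in the thread */

struct fake_vq {
    void **bufs;      /* unused buffers still attached to the ring */
    size_t remaining; /* how many are left to detach */
};

/* Stand-in for virtqueue_detach_unused_buf(): pops one buffer or NULL. */
static void *fake_detach_unused_buf(struct fake_vq *vq)
{
    if (vq->remaining == 0)
        return NULL;
    return vq->bufs[--vq->remaining];
}

/* Stand-in for need_resched(); in this sketch it always asks to yield. */
static int fake_need_resched(void)
{
    return 1;
}

/* Frees every unused buffer and returns how many times it yielded.
 * The reschedule check runs only once per RESCHED_BATCH buffers, so a
 * small ring never pays for it and a large ring cannot hog the CPU. */
static size_t free_unused_bufs_batched(struct fake_vq *vq)
{
    size_t freed = 0, yields = 0;
    void *buf;

    assert(vq != NULL);
    while ((buf = fake_detach_unused_buf(vq)) != NULL) {
        free(buf);
        if (++freed % RESCHED_BATCH == 0 && fake_need_resched()) {
            sched_yield(); /* userspace analogue of schedule() */
            yields++;
        }
    }
    return yields;
}

/* Helper: build a ring holding n heap-allocated buffers. */
static struct fake_vq make_fake_vq(size_t n)
{
    struct fake_vq vq = { malloc(n * sizeof(void *)), n };

    assert(vq.bufs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        vq.bufs[i] = malloc(64);
    return vq;
}
```

With these stand-ins, a ring of 10 buffers is drained without ever reaching
the reschedule check, while a 16K ring reaches it once every 256 buffers. The
same loop shape would apply to sq if similar logic is added there.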