Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp1847740pxk; Sun, 13 Sep 2020 19:12:02 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyZ/Lb7V6ZKhzBunsLqPGSxhSeBRoMqmrgxNrk9ML9YK5XKNEMik5LU7UfFpXdS0TTfMYhH X-Received: by 2002:a17:906:8399:: with SMTP id p25mr12305921ejx.243.1600049522002; Sun, 13 Sep 2020 19:12:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1600049521; cv=none; d=google.com; s=arc-20160816; b=WJgLqJej7FdqjMhbZjcTYi3Xt7wXEbDhnWIVDuyc+fxHibuIPoJkpmfOE91WmhG4L+ Z8HN5OXp0Y1WdYyEd6p59XYQhXGlRBi7XJ+nV5DMeNy60tf4zEzy4Fo37s16iWP4hX7Y QT3nEeTDccVNXhUlzGoXjxq8flHdRz/r49r1szuNBeSIta4GndTswTVmfSJCP+Cp6LND 4t/u8CZRxLEh9+x1jRtEQjyi9xnUEilBlqtU/Mmqq6v3zuQjzKSihVW401Kk/Her8TBJ /cMQLc20OtinSsNHQDfkgrObLQcq1yVNb5fMlbSfBLGdmQQ4wtl3h4hNIAbcg8zQXhKA Najg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=UTcsOL9+4yGHdHTQWt3rngii54e38OTuWCCiN5rhw3A=; b=uB/etgCOA08syTPjlq+8s6Q7+ilQ80196hnoZg911U14EphDtzzF0uvxOW/lMaLwNu SX3YsWYoqrj8fKUeSTo2zlxmBmP2CKLLwn9deaNUj+YaFamGUEnJ9gED39kVhBClR28q LCjV2Oz09LZs8ALFs5mxVGTS70/W9iXMsxDVv2+6jgNCc2an/oa9TY7iWyziDCfXypUY ZinX1Vtw8iSXsdz9NXBwfVMbXW+BTbe8uTKmHssTTOIUd7OYl0YFxXIu0dwRqHup8RId gQc8D6RaHbKHt5CJGRqUdKju7FKqy3WzNqqc9hJcEDfsUTuUc7PfyyEP3gSVIetjm46+ Mqmg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id t19si6483990ejb.646.2020.09.13.19.11.39; Sun, 13 Sep 2020 19:12:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726009AbgINCKo (ORCPT + 99 others); Sun, 13 Sep 2020 22:10:44 -0400 Received: from szxga06-in.huawei.com ([45.249.212.32]:51680 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725965AbgINCKn (ORCPT ); Sun, 13 Sep 2020 22:10:43 -0400 Received: from DGGEMS404-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id 4FE7837DAEF3B9BAE1FC; Mon, 14 Sep 2020 10:10:39 +0800 (CST) Received: from [10.74.191.121] (10.74.191.121) by DGGEMS404-HUB.china.huawei.com (10.3.19.204) with Microsoft SMTP Server id 14.3.487.0; Mon, 14 Sep 2020 10:10:31 +0800 Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc To: Cong Wang , Kehuan Feng CC: Hillf Danton , Paolo Abeni , "Jike Song" , Josh Hunt , Jonas Bonn , Michael Zhivich , "David Miller" , John Fastabend , LKML , Netdev , "linuxarm@huawei.com" References: <465a540e-5296-32e7-f6a6-79942dfe2618@netrounds.com> <20200825032312.11776-1-hdanton@sina.com> <20200825162329.11292-1-hdanton@sina.com> <5f46032e.1c69fb81.9880c.7a6cSMTPIN_ADDED_MISSING@mx.google.com> <20200827125747.5816-1-hdanton@sina.com> <20200903101957.428-1-hdanton@sina.com> From: Yunsheng Lin Message-ID: Date: Mon, 14 Sep 2020 10:10:31 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.74.191.121] X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2020/9/11 4:19, Cong Wang wrote: > On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng wrote: >> I also tried Cong's patch (shown below on my tree) and it could avoid >> the issue (stressing for 30 minutus for three times and not jitter >> observed). > > Thanks for verifying it! > >> >> --- ./include/net/sch_generic.h.orig 2020-08-21 15:13:51.787952710 +0800 >> +++ ./include/net/sch_generic.h 2020-09-03 21:36:11.468383738 +0800 >> @@ -127,8 +127,7 @@ >> static inline bool qdisc_run_begin(struct Qdisc *qdisc) >> { >> if (qdisc->flags & TCQ_F_NOLOCK) { >> - if (!spin_trylock(&qdisc->seqlock)) >> - return false; >> + spin_lock(&qdisc->seqlock); >> } else if (qdisc_is_running(qdisc)) { >> return false; >> } >> >> I am not actually know what you are discussing above. It seems to me >> that Cong's patch is similar as disabling lockless feature. > >>From performance's perspective, yeah. Did you see any performance > downgrade with my patch applied? It would be great if you can compare > it with removing NOLOCK. And if the performance is as bad as no > NOLOCK, then we can remove the NOLOCK bit for pfifo_fast, at least > for now. It seems the lockless qdisc may have below concurrent problem: cpu0: cpu1: q->enqueue . qdisc_run_begin(q) . __qdisc_run(q) ->qdisc_restart() -> dequeue_skb() . -> sch_direct_xmit() . . q->enqueue qdisc_run_begin(q) qdisc_run_end(q) cpu1 enqueue a skb without calling __qdisc_run(), and cpu0 did not see the enqueued skb when calling __qdisc_run(q) because cpu1 may enqueue the skb after cpu0 called __qdisc_run(q) and before cpu0 called qdisc_run_end(q). Kehuan, do you care to try the below patch if it is the same problem? diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index d60e7c3..c97c1ed 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -36,6 +36,7 @@ struct qdisc_rate_table { enum qdisc_state_t { __QDISC_STATE_SCHED, __QDISC_STATE_DEACTIVATED, + __QDISC_STATE_ENQUEUED, }; struct qdisc_size_table { diff --git a/net/core/dev.c b/net/core/dev.c index 0362419..5985648 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3748,6 +3748,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q, qdisc_calculate_pkt_len(skb, q); if (q->flags & TCQ_F_NOLOCK) { + set_bit(__QDISC_STATE_ENQUEUED, &q->state); + smp_mb__after_atomic(); rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK; qdisc_run(q); diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 265a61d..c389641 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -381,6 +381,8 @@ void __qdisc_run(struct Qdisc *q) int quota = dev_tx_weight; int packets; + clear_bit(__QDISC_STATE_ENQUEUED, &q->state); + smp_mb__after_atomic(); while (qdisc_restart(q, &packets)) { quota -= packets; if (quota <= 0) { @@ -388,6 +390,9 @@ void __qdisc_run(struct Qdisc *q) break; } } + + if (test_bit(__QDISC_STATE_ENQUEUED, &q->state)) + __netif_schedule(q); } unsigned long dev_trans_start(struct net_device *dev) > > Thanks. >