Subject: Re: [RFC v2] net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc
From: Yunsheng Lin
To: Jakub Kicinski
Date: Tue, 16 Mar 2021 08:35:07 +0800
Message-ID: <3838b7c2-c32f-aeda-702a-5cb8f712ec0c@huawei.com>
In-Reply-To: <20210315115332.1647e92b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
References: <1615603667-22568-1-git-send-email-linyunsheng@huawei.com> <1615777818-13969-1-git-send-email-linyunsheng@huawei.com> <20210315115332.1647e92b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/3/16 2:53, Jakub Kicinski wrote:
> On Mon, 15 Mar 2021 11:10:18 +0800 Yunsheng Lin wrote:
>> @@ -606,6 +623,11 @@ static const u8 prio2band[TC_PRIO_MAX + 1] = {
>>   */
>>  struct pfifo_fast_priv {
>>  	struct skb_array q[PFIFO_FAST_BANDS];
>> +
>> +	/* protect against data race between enqueue/dequeue and
>> +	 * qdisc->empty setting
>> +	 */
>> +	spinlock_t lock;
>>  };
>>
>>  static inline struct skb_array *band2list(struct pfifo_fast_priv *priv,
>> @@ -623,7 +645,10 @@ static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
>>  	unsigned int pkt_len = qdisc_pkt_len(skb);
>>  	int err;
>>
>> -	err = skb_array_produce(q, skb);
>> +	spin_lock(&priv->lock);
>> +	err = __ptr_ring_produce(&q->ring, skb);
>> +	WRITE_ONCE(qdisc->empty, false);
>> +	spin_unlock(&priv->lock);
>>
>>  	if (unlikely(err)) {
>>  		if (qdisc_is_percpu_stats(qdisc))
>> @@ -642,6 +667,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
>>  	struct sk_buff *skb = NULL;
>>  	int band;
>>
>> +	spin_lock(&priv->lock);
>>  	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
>>  		struct skb_array *q = band2list(priv, band);
>>
>> @@ -655,6 +681,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
>>  	} else {
>>  		WRITE_ONCE(qdisc->empty, true);
>>  	}
>> +	spin_unlock(&priv->lock);
>>
>>  	return skb;
>>  }
>
> I thought pfifo was supposed to be "lockless" and this change
> re-introduces a lock between producer and consumer, no?

Yes, the lock breaks the "lockless" property of the lockless qdisc.
For now, I do not know how to solve the data race below locklessly:

        CPU1:                                   CPU2:
      dequeue skb                                 .
          .                                       .
          .                                  enqueue skb
          .                                       .
          .                      WRITE_ONCE(qdisc->empty, false);
          .                                       .
          .                                       .
WRITE_ONCE(qdisc->empty, true);

If the above happens, qdisc->empty is true even though the qdisc still
holds an skb, which may cause out-of-order delivery or a stuck packet.

It seems we may need to update the ptr_ring's status (empty or not)
atomically while enqueuing/dequeuing, inside the ptr_ring implementation.

Any better ideas?
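
To make the problem concrete, the interleaving in the diagram can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: struct model_qdisc and every function below are hypothetical stand-ins, not kernel code. The model assumes CPU1's dequeue finds the queue empty, which is the only path in the quoted diff where pfifo_fast_dequeue() writes qdisc->empty = true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not kernel code. */
struct model_qdisc {
	int len;	/* packets currently queued */
	bool empty;	/* stands in for qdisc->empty */
};

static void model_enqueue(struct model_qdisc *q)
{
	q->len++;
}

/* Returns true if a packet was removed, false if the queue was empty. */
static bool model_dequeue(struct model_qdisc *q)
{
	assert(q->len >= 0);
	if (q->len == 0)
		return false;
	q->len--;
	return true;
}

/* The harmful state: the flag claims "empty" while a packet is queued. */
static bool flag_is_stale(const struct model_qdisc *q)
{
	return q->empty && q->len > 0;
}

/* Unlocked case: the steps run in the order shown in the diagram. */
static bool replay_racy_interleaving(void)
{
	struct model_qdisc q = { .len = 0, .empty = false };
	bool got;

	got = model_dequeue(&q);	/* CPU1: dequeue finds nothing */
	model_enqueue(&q);		/* CPU2: enqueue skb */
	q.empty = false;		/* CPU2: WRITE_ONCE(qdisc->empty, false) */
	if (!got)
		q.empty = true;		/* CPU1: WRITE_ONCE(qdisc->empty, true) */

	return flag_is_stale(&q);
}

/* Locked case: each CPU's queue operation and flag write form one unit. */
static bool replay_serialized(bool cpu1_first)
{
	struct model_qdisc q = { .len = 0, .empty = false };

	for (int step = 0; step < 2; step++) {
		bool cpu1_turn = (step == 0) == cpu1_first;

		if (cpu1_turn) {
			if (!model_dequeue(&q))
				q.empty = true;
		} else {
			model_enqueue(&q);
			q.empty = false;
		}
	}
	return flag_is_stale(&q);
}
```

In this model replay_racy_interleaving() returns true (the flag says empty while one packet is queued), and replay_serialized() returns false for either ordering, which is the guarantee the spinlock in the RFC buys. A lockless fix would have to give the same guarantee without pairing the two writes under a lock.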