Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc
From: Yunsheng Lin
To: Cong Wang, Jiri Kosina
CC: Hillf Danton, John Fastabend, Paolo Abeni, Kehuan Feng, Jike Song,
    Jonas Bonn, Michael Zhivich, David Miller, LKML, Michal Kubecek,
    Netdev, Josh Hunt, Jason A. Donenfeld, Toke Høiland-Jørgensen
Date: Tue, 6 Apr 2021 10:46:29 +0800
Message-ID: <2b99fce1-c235-6083-bd39-cece1f4a0343@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/4/6 9:49, Cong Wang wrote:
> On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote:
>>
>> I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the
>> coming days. If it works, then we can consider proceeding with it,
>> otherwise I am all for reverting the whole NOLOCK stuff.
>>
>> [1] https://lore.kernel.org/linux-can/1616641991-14847-1-git-send-email-linyunsheng@huawei.com/T/#u
>
> I personally prefer to just revert that bit, as it brings more troubles
> than gains. Even with Yunsheng's patch, there are still some issues.
> Essentially, I think the core qdisc scheduling code is not ready for
> lockless, just look at those NOLOCK checks in sch_generic.c. :-/

I am aware of the NOLOCK checks too :), and I am willing to take care of
them if that is possible.

As the number of cores in a system keeps increasing, the trend is to go
lockless, right? Even when only one CPU is involved, taking and
releasing the spinlock costs about 30ns on our arm64 system when
CONFIG_PREEMPT_VOLUNTARY is enabled (IP forwarding testing).

Currently I have three ideas to optimize the lockless qdisc:

1. Implement the qdisc bypass for lockless qdisc too, see [1].

2. Implement lockless enqueuing for lockless qdisc using the idea from
   Jason and Toke. It gives a noticeable performance increase with 1-4
   threads running, using the below prototype based on ptr_ring.

static inline int __ptr_ring_multi_produce(struct ptr_ring *r, void *ptr)
{
	int producer, next_producer;

	/* Reserve a slot by advancing r->producer with cmpxchg. */
	do {
		producer = READ_ONCE(r->producer);
		if (unlikely(!r->size) || r->queue[producer])
			return -ENOSPC;
		next_producer = producer + 1;
		if (unlikely(next_producer >= r->size))
			next_producer = 0;
	} while (cmpxchg_relaxed(&r->producer, producer, next_producer) != producer);

	/* Make sure the pointer we are storing points to valid data. */
	/* Pairs with the dependency ordering in __ptr_ring_consume. */
	smp_wmb();

	WRITE_ONCE(r->queue[producer], ptr);
	return 0;
}

3. Maybe it is possible to remove the netif_tx_lock for lockless qdisc
   too, because dev_hard_start_xmit is also protected by
   qdisc_run_begin()/qdisc_run_end() (if there is only one qdisc using
   a netdev queue, which is true for pfifo_fast, I believe).

[1]. https://patchwork.kernel.org/project/netdevbpf/patch/1616404156-11772-1-git-send-email-linyunsheng@huawei.com/

>
> Thanks.
>
> .
>
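
For experimenting with idea 2 outside the kernel, the reserve-then-publish
pattern of the prototype can be approximated with C11 atomics. This is only
a rough user-space sketch, not the kernel's ptr_ring API: struct mp_ring,
RING_SIZE and the mp_ring_* functions are hypothetical names, a
compare-exchange stands in for cmpxchg_relaxed(), and a release store stands
in for smp_wmb() followed by WRITE_ONCE(). A single consumer is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 4	/* hypothetical fixed size, for illustration only */

/* Hypothetical user-space ring; NOT the kernel's struct ptr_ring. */
struct mp_ring {
	_Atomic int producer;			/* next slot to reserve */
	int consumer;				/* touched by one consumer only */
	_Atomic(void *) queue[RING_SIZE];	/* NULL means "slot is free" */
};

/* Multi-producer enqueue: reserve a slot with CAS, then publish the pointer. */
static int mp_ring_produce(struct mp_ring *r, void *ptr)
{
	int producer, next;

	do {
		producer = atomic_load_explicit(&r->producer,
						memory_order_relaxed);
		/* A non-NULL slot at the producer index means the ring is full. */
		if (atomic_load_explicit(&r->queue[producer],
					 memory_order_relaxed))
			return -ENOSPC;
		next = producer + 1;
		if (next >= RING_SIZE)
			next = 0;
		/* On failure another producer won the slot: reload and retry. */
	} while (!atomic_compare_exchange_weak_explicit(&r->producer,
							&producer, next,
							memory_order_relaxed,
							memory_order_relaxed));

	/* Release store: data behind ptr is visible before ptr itself. */
	atomic_store_explicit(&r->queue[producer], ptr, memory_order_release);
	return 0;
}

/* Single-consumer dequeue: returns NULL when the current slot is empty. */
static void *mp_ring_consume(struct mp_ring *r)
{
	void *ptr = atomic_load_explicit(&r->queue[r->consumer],
					 memory_order_acquire);

	if (!ptr)
		return NULL;
	/* Hand the slot back to the producers. */
	atomic_store_explicit(&r->queue[r->consumer], NULL,
			      memory_order_release);
	if (++r->consumer >= RING_SIZE)
		r->consumer = 0;
	return ptr;
}

/* Single-threaded demo: overfill the ring, then drain it in FIFO order. */
int mp_ring_demo(void)
{
	static struct mp_ring ring;	/* static storage: zeroed, so empty */
	static int vals[RING_SIZE + 1];
	int i, drained = 0;

	ring.consumer = 0;
	for (i = 0; i < RING_SIZE; i++)
		assert(mp_ring_produce(&ring, &vals[i]) == 0);
	/* One more than the ring can hold must be rejected. */
	assert(mp_ring_produce(&ring, &vals[RING_SIZE]) == -ENOSPC);

	while (mp_ring_consume(&ring))
		drained++;
	return drained;		/* number of entries taken back out */
}
```

Calling mp_ring_demo() from a main() exercises the full and empty paths on
one thread only; it says nothing about performance. One property of this
scheme worth keeping in mind: a slot that has been reserved by the
compare-exchange but not yet written still looks empty to the consumer, so
the consumer can stop early even though a later slot is already filled.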