From: Magnus Karlsson
Date: Mon, 23 Nov 2020 15:00:48 +0100
Subject: Re: [PATCH 0/3] xsk: fix for xsk_poll writeable
To: Xuan Zhuo
Cc: Björn Töpel, Magnus Karlsson, Jonathan Lemon, "David S. Miller",
    Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann,
    Jesper Dangaard Brouer, John Fastabend, Network Development, bpf,
    linux-kernel@vger.kernel.org
References: <3306b4d8-8689-b0e7-3f6d-c3ad873b7093@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 18, 2020 at 9:25 AM Xuan Zhuo wrote:
>
> I tried to combine cq available and tx writeable, but I found it very difficult.
> Sometimes we pay attention to the status of "available" for both, but sometimes,
> we may only pay attention to one, such as tx writeable, because we can use the
> item of fq to write to tx. And this kind of demand may be constantly changing,
> and it may be necessary to set it every time before entering xsk_poll, so
> setsockopt is not very convenient. I feel even more that using a new event may
> be a better solution, such as EPOLLPRI, I think it can be used here, after all,
> xsk should not have OOB data ^_^.
>
> However, two other problems were discovered during the test:
>
> * The mask returned by datagram_poll always contains EPOLLOUT
> * It is not particularly reasonable to return EPOLLOUT based on tx not full
>
> After fixing these two problems, I found that when the process is awakened by
> EPOLLOUT, the process can always get the item from cq.
>
> Because the number of packets that the network card can send at a time is
> actually limited, suppose this value is "nic_num". Once the number of
> consumed items in the tx queue is greater than nic_num, this means that there
> must also be new recycled items in the cq queue from nic.
>
> In this way, as long as the tx configured by the user is larger, we won't have
> the situation that tx is already in the writeable state but cannot get the item
> from cq.

I think the overall approach of tying this into poll() instead of
setsockopt() is the right way to go. But we need a more robust
solution. Your patch #3 also breaks backwards compatibility and that
is not allowed. Could you please post some simple code example of what
it is you would like to do in user space?

So you would like to wake up when there are entries in the cq that can
be retrieved, and the reason you would like to do this is that you then
know you can put some more entries into the Tx ring and they will get
sent, as there are now free slots in the cq. Correct me if I am wrong.

Would an event that wakes you up when there is both space in the Tx
ring and space in the cq work? Is there a case in which we would like
to be woken up when only the Tx ring is non-full? Maybe there is, as it
might be beneficial to fill the Tx ring and, while doing that, some
entries in the cq have been completed and away the packets go. But it
would be great if you could post some simple example code; it does not
need to compile or anything. Can be pseudo code.

It would also be good to know if your goal is max throughput, max
burst size, or something else.

Thanks: Magnus

> Xuan Zhuo (3):
>   xsk: replace datagram_poll by sock_poll_wait
>   xsk: change the tx writeable condition
>   xsk: set tx/rx the min entries
>
>  include/uapi/linux/if_xdp.h |  2 ++
>  net/xdp/xsk.c               | 26 ++++++++++++++++++++++----
>  net/xdp/xsk_queue.h         |  6 ++++++
>  3 files changed, 30 insertions(+), 4 deletions(-)
>
> --
> 1.8.3.1
>
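
To make the two wake-up conditions discussed above concrete, here is a
rough toy model in plain C. It is only a sketch and not the kernel's
xsk code: the toy_ring type and every toy_* function are hypothetical
names made up for illustration, and it assumes that "the cq is usable"
means there are completed entries in the cq that user space can
retrieve. The first policy reports POLLOUT whenever the Tx ring is not
full; the second reports it only when the Tx ring has room and the cq
holds completions.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>

/* Toy single-producer/single-consumer ring. Hypothetical type for
 * illustration only; this is NOT the kernel's struct xsk_queue. */
struct toy_ring {
	uint32_t producer; /* free-running index advanced by the producer */
	uint32_t consumer; /* free-running index advanced by the consumer */
	uint32_t size;     /* number of slots, must be a power of two */
};

/* Create an empty ring with the given number of slots. */
static struct toy_ring toy_ring_make(uint32_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	struct toy_ring r = { 0, 0, size };
	return r;
}

/* Entries that have been produced but not yet consumed. */
static uint32_t toy_ring_used(const struct toy_ring *r)
{
	return r->producer - r->consumer;
}

/* Slots the producer can still fill. */
static uint32_t toy_ring_free(const struct toy_ring *r)
{
	return r->size - toy_ring_used(r);
}

/* Policy A: writable whenever the Tx ring is not full. */
static short toy_poll_tx_only(const struct toy_ring *tx,
			      const struct toy_ring *cq)
{
	(void)cq; /* the cq is ignored under this policy */
	return toy_ring_free(tx) > 0 ? POLLOUT : 0;
}

/* Policy B: writable only when the Tx ring has room AND the cq holds
 * completed entries (assumed reading of the combined condition). */
static short toy_poll_tx_and_cq(const struct toy_ring *tx,
				const struct toy_ring *cq)
{
	return (toy_ring_free(tx) > 0 && toy_ring_used(cq) > 0) ? POLLOUT : 0;
}
```

With an empty Tx ring and an empty cq, policy A would already report
the socket as writable while policy B would not; once the cq gains a
completion, both agree. Which of the two (or a separate event such as
EPOLLPRI) is preferable depends on the answers to the questions in the
reply above.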