From: Xuan Zhuo
To: netdev@vger.kernel.org
Cc: Xuan Zhuo, Björn Töpel, Magnus Karlsson, Jonathan Lemon,
    "David S. Miller", Jakub Kicinski, Alexei Starovoitov,
    Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
    bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] xsk: add cq event
Date: Mon, 16 Nov 2020 16:10:55 +0800

When we have written all cq items to tx, we have to wait for a new poll
event to indicate that the socket is writable.
But writability is currently triggered based on whether tx is full or
not. In fact, even when tx is not full, the user may not be able to get
an item from cq, because the item may still be held by the network card.
In this case we need to know when cq becomes available, so this patch
adds a socket option, XDP_CQ_EVENT. When the user enables this option
with setsockopt, a readable event is generated for every xsk bound to
the umem whenever cq becomes available.

I can't find a better description for this event. I think it can also be
treated as 'readable', although it is indeed different from the
'readable' that signals new data. The overhead of an xsk checking
whether cq or rx is readable is small.

Signed-off-by: Xuan Zhuo
---
 include/net/xdp_sock.h      |  1 +
 include/uapi/linux/if_xdp.h |  1 +
 net/xdp/xsk.c               | 28 ++++++++++++++++++++++++++++
 3 files changed, 30 insertions(+)

diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 1a9559c..faf5b1a 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -49,6 +49,7 @@ struct xdp_sock {
 	struct xsk_buff_pool *pool;
 	u16 queue_id;
 	bool zc;
+	bool cq_event;
 	enum {
 		XSK_READY = 0,
 		XSK_BOUND,
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index a78a809..2dba3cb 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -63,6 +63,7 @@ struct xdp_mmap_offsets {
 #define XDP_UMEM_COMPLETION_RING	6
 #define XDP_STATISTICS			7
 #define XDP_OPTIONS			8
+#define XDP_CQ_EVENT			9
 
 struct xdp_umem_reg {
 	__u64 addr; /* Start of packet data area */
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index cfbec39..0c53403 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -285,7 +285,16 @@ void __xsk_map_flush(void)
 
 void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries)
 {
+	struct xdp_sock *xs;
+
 	xskq_prod_submit_n(pool->cq, nb_entries);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
+		if (xs->cq_event)
+			sock_def_readable(&xs->sk);
+	}
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL(xsk_tx_completed);
 
@@ -495,6 +504,9 @@ static __poll_t xsk_poll(struct file *file, struct socket *sock,
 			__xsk_sendmsg(sk);
 	}
 
+	if (xs->cq_event && pool->cq && !xskq_prod_is_empty(pool->cq))
+		mask |= EPOLLIN | EPOLLRDNORM;
+
 	if (xs->rx && !xskq_prod_is_empty(xs->rx))
 		mask |= EPOLLIN | EPOLLRDNORM;
 	if (xs->tx && !xskq_cons_is_full(xs->tx))
@@ -882,6 +894,22 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
 		mutex_unlock(&xs->mutex);
 		return err;
 	}
+	case XDP_CQ_EVENT:
+	{
+		int cq_event;
+
+		if (optlen < sizeof(cq_event))
+			return -EINVAL;
+		if (copy_from_sockptr(&cq_event, optval, sizeof(cq_event)))
+			return -EFAULT;
+
+		if (cq_event)
+			xs->cq_event = true;
+		else
+			xs->cq_event = false;
+
+		return 0;
+	}
 	default:
 		break;
 	}
-- 
1.8.3.1
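
For illustration only, here is a minimal user-space sketch of how an
application might use the option proposed above. It is not part of the
patch. XDP_CQ_EVENT (value 9) is taken from the if_xdp.h hunk and is
assumed to exist only on a kernel with this patch applied. The helper
names xsk_enable_cq_event() and xsk_wait_readable() are hypothetical,
and SOL_XDP is defined locally in case the libc headers do not provide
it.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

#ifndef SOL_XDP
#define SOL_XDP 283		/* value from the kernel's include/linux/socket.h */
#endif

/* Proposed by this patch only; assumed absent from stock uapi headers. */
#ifndef XDP_CQ_EVENT
#define XDP_CQ_EVENT 9
#endif

/* Hypothetical helper: request readable events when cq has entries.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int xsk_enable_cq_event(int xsk_fd)
{
	int on = 1;

	return setsockopt(xsk_fd, SOL_XDP, XDP_CQ_EVENT, &on, sizeof(on));
}

/* Hypothetical helper: wait until the socket reports POLLIN.
 * Returns 1 if POLLIN was reported, 0 on timeout or if only other
 * events were reported, -1 on error.
 */
static int xsk_wait_readable(int xsk_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = xsk_fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret <= 0)
		return ret;
	return (pfd.revents & POLLIN) ? 1 : 0;
}
```

With the option enabled, POLLIN can mean either new rx descriptors or
completed cq entries, so after xsk_wait_readable() returns 1 the
application would presumably check both rings. On a kernel without this
patch, the setsockopt() call should fail, most likely with ENOPROTOOPT.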