From: Dylan Yudaken
To: Jens Axboe, Pavel Begunkov
CC: Dylan Yudaken
Subject: [PATCH v2 for-next 00/12] io_uring: multishot recv
Date: Thu, 30 Jun 2022 02:12:19 -0700
Message-ID: <20220630091231.1456789-1-dylany@fb.com>
X-Mailer: git-send-email 2.30.2
X-Mailing-List: linux-kernel@vger.kernel.org

This series adds support for multishot recv/recvmsg to io_uring.

The idea is that generally socket applications will be continually
enqueuing a new recv() when the previous one completes. This can be
improved on by allowing the application to queue a multishot receive,
which will post completions as and when data is available. It uses the
provided buffers feature to receive new data into a pool provided by the
application.

This is more performant in a few ways:
* Subsequent receives are queued up straight away without requiring the
  application to finish a processing loop.
* If there is more data in the socket (say the provided buffer size is
  smaller than the socket buffer) then the data is immediately returned,
  improving batching.
* Poll is only armed once and reused, saving CPU cycles.

Running a small network benchmark [1] shows improved QPS of ~6-8% over a
range of loads.

[1]: https://github.com/DylanZA/netbench/tree/multishot_recv

While building this I noticed a small problem in multishot poll which is
a really big problem for receive. If CQEs overflow, then they will be
returned to the user out of order. This is annoying for the existing use
cases of poll and accept but doesn't totally break the functionality.
Both of these return results that aren't strictly ordered except for the
IORING_CQE_F_MORE flag. For receive, ordering is obviously a critical
requirement, as otherwise data will be received out of order by the
application.

To fix this, when a multishot CQE hits overflow we remove multishot. The
application should then clear CQEs until it sees that CQE and, noticing
that IORING_CQE_F_MORE is not set, can re-issue the multishot request.
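
To make the application side of this concrete, here is a minimal sketch
of the completion handling described above. It is not taken from the
series and does not use liburing: struct sketch_cqe, struct recv_state
and the three helper functions are hypothetical names invented for
illustration, and only the two flag bits and the buffer shift mirror the
definitions in include/uapi/linux/io_uring.h. The point is just that the
application consumes CQEs in order and treats a CQE without
IORING_CQE_F_MORE as the end of the multishot request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/io_uring.h. */
#define IORING_CQE_F_BUFFER     (1U << 0) /* upper 16 bits of flags hold a buffer ID */
#define IORING_CQE_F_MORE       (1U << 1) /* more CQEs will follow for this request */
#define IORING_CQE_BUFFER_SHIFT 16

/* Hypothetical stand-in for the struct io_uring_cqe fields used here. */
struct sketch_cqe {
	uint64_t user_data;
	int32_t  res;   /* bytes received, 0 on EOF, or -errno */
	uint32_t flags;
};

/* Hypothetical per-socket bookkeeping kept by the application. */
struct recv_state {
	size_t   bytes_received;
	unsigned completions;
	int      multishot_done; /* final CQE seen: request is no longer armed */
	int      eof;            /* peer closed the connection */
	int      last_error;     /* 0, or the -errno of a failed completion */
};

/* Return the provided-buffer ID carried by a CQE, or -1 if there is none. */
static int cqe_buffer_id(const struct sketch_cqe *cqe)
{
	if (!(cqe->flags & IORING_CQE_F_BUFFER))
		return -1;
	return (int)(cqe->flags >> IORING_CQE_BUFFER_SHIFT);
}

/* Account for one completion of a multishot receive. */
static void handle_recv_cqe(struct recv_state *st, const struct sketch_cqe *cqe)
{
	assert(st != NULL && cqe != NULL);

	st->completions++;
	if (cqe->res > 0)
		st->bytes_received += (size_t)cqe->res;
	else if (cqe->res == 0)
		st->eof = 1;
	else
		st->last_error = cqe->res;

	/*
	 * Without IORING_CQE_F_MORE this is the last CQE for the request,
	 * e.g. because an earlier CQE overflowed and multishot was removed.
	 * The caller decides whether to re-issue the request.
	 */
	if (!(cqe->flags & IORING_CQE_F_MORE))
		st->multishot_done = 1;
}

/* Process a batch of CQEs strictly in the order they were reaped. */
static void drain_cqes(struct recv_state *st, const struct sketch_cqe *cqes,
		       size_t n)
{
	for (size_t i = 0; i < n; i++)
		handle_recv_cqe(st, &cqes[i]);
}
```

After drain_cqes() returns, an application following this sketch would
re-issue the multishot receive only if multishot_done is set and neither
eof nor last_error indicates that the socket is finished.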

Patches:
1-3: relax restrictions around provided buffers to allow lengths of 0
4: recycles more buffers on kernel side in error conditions
5-6: clean up multishot poll API a bit, allowing it to end with
     successful error conditions
7-8: fix existing problems with multishot poll on overflow
9: is the multishot receive patch
10-11: are small fixes to tracing of CQEs

v2:
* Added patches 6,7,8 (fixing multishot poll bugs)
* Added patches 10,11 (trace cleanups)
* added io_recv_finish to reduce duplicate logic

Dylan Yudaken (12):
  io_uring: allow 0 length for buffer select
  io_uring: restore bgid in io_put_kbuf
  io_uring: allow iov_len = 0 for recvmsg and buffer select
  io_uring: recycle buffers on error
  io_uring: clean up io_poll_check_events return values
  io_uring: add IOU_STOP_MULTISHOT return code
  io_uring: add allow_overflow to io_post_aux_cqe
  io_uring: fix multishot poll on overflow
  io_uring: fix multishot accept ordering
  io_uring: multishot recv
  io_uring: fix io_uring_cqe_overflow trace format
  io_uring: only trace one of complete or overflow

 include/trace/events/io_uring.h |   2 +-
 include/uapi/linux/io_uring.h   |   5 ++
 io_uring/io_uring.c             |  17 ++--
 io_uring/io_uring.h             |  20 +++--
 io_uring/kbuf.c                 |   4 +-
 io_uring/kbuf.h                 |   9 ++-
 io_uring/msg_ring.c             |   4 +-
 io_uring/net.c                  | 139 ++++++++++++++++++++++++++------
 io_uring/poll.c                 |  44 ++++++----
 io_uring/rsrc.c                 |   4 +-
 10 files changed, 190 insertions(+), 58 deletions(-)


base-commit: 864a15ca4f196184e3f44d72efc1782a7017cbbd
-- 
2.30.2