Subject: Re: [PATCH v1] io_uring: Add support for napi_busy_poll
From: Olivier Langlois
To: Hao Xu, Jens Axboe
Cc: Pavel Begunkov, io-uring, linux-kernel
Date: Mon, 28 Feb 2022 16:01:27 -0500
Organization: Trillion01 Inc
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2022-03-01 at 02:26 +0800, Hao Xu wrote:
>
> On 2/25/22 13:32, Olivier Langlois wrote:
> > On Mon, 2022-02-21 at 13:23 +0800, Hao Xu wrote:
> > > > @@ -5776,6 +5887,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
> > > >  		__io_poll_execute(req, mask);
> > > >  		return 0;
> > > >  	}
> > > > +	io_add_napi(req->file, req->ctx);
> > > I think this may not be the right place to do it. the process will be:
> > > arm_poll sockfdA--> get invalid napi_id from sk->napi_id --> event
> > > triggered --> arm_poll for sockfdA again --> get valid napi_id
> > > then why not do io_add_napi() in event
> > > handler(apoll_task_func/poll_task_func).
> > You have a valid concern that the first time a socket is passed to
> > io_uring that napi_id might not be assigned yet.
> >
> > OTOH, getting it after data is available for reading does not help
> > neither since busy polling must be done before data is received.
> >
> > for both places, the extracted napi_id will only be leveraged at the
> > next polling.
>
> Hi Olivier,
>
> I think we have some gap here. AFAIK, it's not 'might not', it is
> 'definitely not', the sk->napi_id won't be valid until the poll
> callback.
>
> Some driver's code FYR:
> (drivers/net/ethernet/intel/e1000/e1000_main.c)
>
> e1000_receive_skb-->napi_gro_receive-->napi_skb_finish-->gro_normal_one
>
> and in gro_normal_one(), it does:
>
>         if (napi->rx_count >= gro_normal_batch)
>                 gro_normal_list(napi);
>
> The gro_normal_list() delivers the info up to the specifical network
> protocol like tcp.
>
> And then sk->napi_id is set, meanwhile the poll callback is triggered.
>
> So that's why I call the napi polling technology a 'speculation'. It's
> totally for the future data. Correct me if I'm wrong especially for the
> poll callback triggering part.
>
When I said 'might not', I meant that from the io_uring point of view,
there is no way to know how the socket was used previously. If it has
been used outside io_uring, the napi_id could be available on the first
call.

If it really is a virgin socket, neither my chosen call site nor your
proposed sites will make napi busy poll possible for the first poll.

I feel like there is not much to gain from arguing this point, since I
have pretty much admitted that your solution is most likely the only
call site that makes MULTIPOLL requests work correctly with napi busy
poll, as those requests could visit __io_arm_poll_handler only once
(correct me if my statement is wrong).

The only issue is that I wasn't sure how using your call sites would
make locking work.

I suppose that adding a dedicated spinlock for protecting napi_list
instead of relying on uring_lock could be a solution. Would that work?
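
To make the dedicated-lock idea concrete, here is a minimal userspace
sketch. It is not kernel code and not the patch under review. The names
ring_ctx, napi_entry, ring_add_napi and ring_napi_count are hypothetical,
a pthread mutex stands in for the kernel spinlock, and treating a
napi_id of 0 as "not assigned yet" is a simplifying assumption. The
point is only that the list carries its own lock, so an add can happen
from any call site without holding a broader context lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of one tracked NAPI instance. */
struct napi_entry {
	unsigned int napi_id;
	struct napi_entry *next;
};

/* Hypothetical stand-in for the ring context: the list has its own lock. */
struct ring_ctx {
	pthread_mutex_t napi_lock;	/* models a dedicated spinlock */
	struct napi_entry *napi_list;
};

int ring_ctx_init(struct ring_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->napi_list = NULL;
	return pthread_mutex_init(&ctx->napi_lock, NULL);
}

/* Returns true if napi_id was newly added, false if invalid or already known. */
bool ring_add_napi(struct ring_ctx *ctx, unsigned int napi_id)
{
	struct napi_entry *e, *spare;

	if (napi_id == 0)	/* assumption: 0 means "not assigned yet" */
		return false;

	/* Allocate before taking the lock so the critical section stays short. */
	spare = malloc(sizeof(*spare));
	if (!spare)
		return false;
	spare->napi_id = napi_id;

	pthread_mutex_lock(&ctx->napi_lock);
	for (e = ctx->napi_list; e; e = e->next) {
		if (e->napi_id == napi_id)
			break;
	}
	if (!e) {
		/* Not tracked yet: link the new entry at the head. */
		spare->next = ctx->napi_list;
		ctx->napi_list = spare;
		spare = NULL;
	}
	pthread_mutex_unlock(&ctx->napi_lock);

	free(spare);	/* no-op when the entry was linked in */
	return e == NULL;
}

/* Counts tracked entries under the same dedicated lock. */
size_t ring_napi_count(struct ring_ctx *ctx)
{
	size_t n = 0;

	pthread_mutex_lock(&ctx->napi_lock);
	for (struct napi_entry *e = ctx->napi_list; e; e = e->next)
		n++;
	pthread_mutex_unlock(&ctx->napi_lock);
	return n;
}

/* Frees all entries; the caller must ensure no other thread uses ctx. */
void ring_ctx_destroy(struct ring_ctx *ctx)
{
	while (ctx->napi_list) {
		struct napi_entry *e = ctx->napi_list;

		ctx->napi_list = e->next;
		free(e);
	}
	pthread_mutex_destroy(&ctx->napi_lock);
}
```

In this model, a caller on the event-handler path would call
ring_add_napi() with whatever id the socket reports at that moment. An
id of 0 is skipped and a repeated id is ignored, so calling it on every
event is cheap apart from the short critical section.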