From: Jens Axboe
To: Stefan Hajnoczi
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCHSET v3 0/5] Add support for epoll min_wait
Date: Tue, 8 Nov 2022 07:09:30 -0700
Message-ID: <93fa2da5-c81a-d7f8-115c-511ed14dcdbb@kernel.dk>
References: <4281b354-d67d-2883-d966-a7816ed4f811@kernel.dk>

On 11/8/22 7:00 AM, Stefan Hajnoczi wrote:
> On Mon, Nov 07, 2022 at 02:38:52PM -0700, Jens Axboe wrote:
>> On 11/7/22 1:56 PM, Stefan Hajnoczi wrote:
>>> Hi Jens,
>>> NICs and storage controllers have interrupt mitigation/coalescing
>>> mechanisms that are similar.
>>
>> Yep
>>
>>> NVMe has an Aggregation Time (timeout) and an Aggregation Threshold
>>> (counter) value. When a completion occurs, the device waits until the
>>> timeout or until the completion counter value is reached.
>>>
>>> If I've read the code correctly, min_wait is computed at the beginning
>>> of epoll_wait(2). NVMe's Aggregation Time is computed from the first
>>> completion.
>>>
>>> It makes me wonder which approach is more useful for applications. With
>>> the Aggregation Time approach applications can control how much extra
>>> latency is added. What do you think about that approach?
>>
>> We only tested the current approach, which is time noted from entry, not
>> from when the first event arrives. I suspect the nvme approach is better
>> suited to the hw side, the epoll timeout helps ensure that we batch
>> within xx usec rather than xx usec + whatever the delay until the first
>> one arrives. Which is why it's handled that way currently. That gives
>> you a fixed batch latency.
>
> min_wait is fine when the goal is just maximizing throughput without any
> latency targets.

That's not true at all, I think you're in different time scales than
this would be used for.

> The min_wait approach makes it hard to set a useful upper bound on
> latency because unlucky requests that complete early experience much
> more latency than requests that complete later.

As mentioned in the cover letter or the main patch, this is most useful
for the medium load kind of scenarios.
For high load, the min_wait time ends up not mattering because you will
hit maxevents first anyway. For the testing that we did, the target was
200-300 usec, and 200 usec was used for the actual test. Depending on
what kind of traffic the server is serving, that's usually not much of
a concern. From your reply, I'm guessing you're thinking of much higher
min_wait numbers. I don't think those would make sense. If your rate of
arrival is low enough that min_wait needs to be high to make a
difference, then the load is low enough anyway that it doesn't matter.
Hence I'd argue that it is indeed NOT hard to set a useful upper bound
on latency, because that is very much what min_wait is.

I'm happy to argue merits of one approach over another, but keep in mind
that this particular approach was not pulled out of thin air AND it has
actually been tested and verified successfully on a production workload.
This isn't a hypothetical benchmark kind of setup.

-- 
Jens Axboe
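
To make the two timing models in this thread concrete, here is a minimal
Python sketch. It is a toy model, not the kernel implementation: the
function names and the sample arrival times are hypothetical, the normal
epoll timeout is ignored, and times are in usec relative to entry into
the wait. It only contrasts a window anchored at entry (the min_wait
approach described above) with a window anchored at the first event (the
NVMe-style Aggregation Time raised in the quoted text).

```python
from typing import Optional, Sequence


def entry_anchored_return(arrivals: Sequence[int], min_wait: int,
                          maxevents: int) -> Optional[int]:
    """Toy model of a wait whose batching window starts at entry (t = 0).

    arrivals must be sorted. The overall timeout is ignored (assumption).
    """
    if not arrivals:
        return None
    # maxevents reached inside the window: return as soon as that happens.
    if len(arrivals) >= maxevents and arrivals[maxevents - 1] <= min_wait:
        return arrivals[maxevents - 1]
    # At least one event arrived inside the window: return when it closes.
    if arrivals[0] <= min_wait:
        return min_wait
    # Nothing arrived inside the window: return on the first event after it.
    return arrivals[0]


def first_event_anchored_return(arrivals: Sequence[int], agg_time: int,
                                threshold: int) -> Optional[int]:
    """Toy model of a window that starts when the first event arrives."""
    if not arrivals:
        return None
    deadline = arrivals[0] + agg_time
    # Threshold reached before the deadline: return at that event.
    if len(arrivals) >= threshold and arrivals[threshold - 1] <= deadline:
        return arrivals[threshold - 1]
    return deadline


# Hypothetical medium-load example: three events, 200 usec window.
arrivals = [150, 180, 260]
entry_t = entry_anchored_return(arrivals, min_wait=200, maxevents=64)
first_t = first_event_anchored_return(arrivals, agg_time=200, threshold=64)
print("entry-anchored return time:", entry_t)
print("first-event-anchored return time:", first_t)
```

In this model the entry-anchored wait never batches for longer than
min_wait after entry once an event has shown up, while the
first-event-anchored wait returns at the first arrival plus the window.
With a small maxevents (for example 2 with arrivals at 10 and 20 usec),
both variants return as soon as the second event arrives, which mirrors
the high-load case where the window stops mattering.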