Subject: Re: [PATCH v2 0/2] Optimise io_uring completion waiting
To: Pavel Begunkov, Ingo Molnar
Cc: Ingo Molnar, Peter Zijlstra, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
From: Jens Axboe
Date: Tue, 24 Sep 2019 12:11:43 +0200

On 9/24/19 3:33 AM, Pavel Begunkov wrote:
>
>
> On 24/09/2019 11:36, Jens Axboe wrote:
>> On 9/24/19 2:27 AM, Jens Axboe wrote:
>>> On 9/24/19 2:02 AM, Jens Axboe wrote:
>>>> On 9/24/19 1:06 AM, Pavel Begunkov wrote:
>>>>> On 24/09/2019 02:00, Jens Axboe wrote:
>>>>>>> I think we can do the same thing, just wrapping the waitqueue in a
>>>>>>> structure with a count in it, on the stack. Got some flight time
>>>>>>> coming up later today, let me try and cook up a patch.
>>>>>>
>>>>>> Totally untested, and sent out 5 min before departure... But something
>>>>>> like this.
>>>>> Hmm, reminds me my first version. Basically that's the same thing but
>>>>> with macroses inlined. I wanted to make it reusable and self-contained,
>>>>> though.
>>>>>
>>>>> If you don't think it could be useful in other places, sure, we could do
>>>>> something like that. Is that so?
>>>>
>>>> I totally agree it could be useful in other places. Maybe formalized and
>>>> used with wake_up_nr() instead of adding a new primitive? Haven't looked
>>>> into that, I may be talking nonsense.
>>>>
>>>> In any case, I did get a chance to test it and it works for me. Here's
>>>> the "finished" version, slightly cleaned up and with a comment added
>>>> for good measure.
>>>
>>> Notes:
>>>
>>> This version gets the ordering right, you need exclusive waits to get
>>> fifo ordering on the waitqueue.
>>>
>>> Both versions (yours and mine) suffer from the problem of potentially
>>> waking too many. I don't think this is a real issue, as generally we
>>> don't do threaded access to the io_urings. But if you had the following
>>> tasks wait on the cqring:
>>>
>>> [min_events = 32], [min_events = 8], [min_events = 8]
>>>
>>> and we reach the io_cqring_events() == threshold, we'll wake all three.
>>> I don't see a good solution to this, so I suspect we just live with
>>> until proven an issue. Both versions are much better than what we have
>>> now.
>>
>> Forgot an issue around signal handling, version below adds the
>> right check for that too.
>
> It seems to be a good reason to not keep reimplementing
> "prepare_to_wait*() + wait loop" every time, but keep it in sched :)

I think if we do the ->private cleanup that Peter mentioned, then
there's not much left in terms of consolidation. Not convinced the case
is interesting enough to warrant a special helper. If others show up,
it's easy enough to consolidate the use cases and unify them.

If you look at wake_up_nr(), I would have thought that would be more
widespread. But it really isn't.

>> Curious what your test case was for this?
> You mean a performance test case? It's briefly described in a comment
> for the second patch. That's just rewritten io_uring-bench, with
> 1. a thread generating 1 request per call in a loop
> 2. and the second thread waiting for ~128 events.
> Both are pinned to the same core.

Gotcha, thanks.

-- 
Jens Axboe
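
For reference, the over-wake scenario from the quoted notes can be modelled in a few lines of userspace Python. This is only a rough sketch, not kernel code and not the patch under discussion: the helper name woken_waiters is made up for illustration, and it assumes that every waiter whose own min_events is met by the current completion count gets woken.

```python
from typing import List


def woken_waiters(cq_events: int, min_events: List[int]) -> List[int]:
    """Return the FIFO positions of waiters that would be woken.

    Assumption (hypothetical model): a waiter is woken as soon as the
    number of available completions reaches its own min_events value,
    regardless of what the other waiters on the same ring asked for.
    """
    return [pos for pos, need in enumerate(min_events) if cq_events >= need]


if __name__ == "__main__":
    # Waiters queued in FIFO order, as in the example from the notes.
    waiters = [32, 8, 8]
    for events in (8, 32):
        print(events, "completions ->", woken_waiters(events, waiters))
```

With 32 completions available, the model reports all three waiters as woken, which is the situation described in the notes.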