Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp2452245pxa; Mon, 3 Aug 2020 16:58:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx6m+ktCxeEWozc0RN+H1yFADDKAlbC2YZxMX5f0ffGKWOgmw/5cDn3YL3Ltxb/gxVH2HLO X-Received: by 2002:a17:906:eca4:: with SMTP id qh4mr18738111ejb.255.1596499137212; Mon, 03 Aug 2020 16:58:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1596499137; cv=none; d=google.com; s=arc-20160816; b=rEgLN4R9Ivsrg5nrk0SM8fO6XP0mOVmiMYBSf0tDr5zSNDiO/8H4WJ+Tn9fE5ysnTC /JbYHoh/ftypqmVeHV22kN7vmhyy0qyzgEmrql0X3V2sLhQP4fdQLIrOpJozarruMoc6 xlKuB36O7/pLALYi7bADCuBoHVhqBDAAXSPsukqE7snHzfbiQw7c65f3SLNCBnDepMhf LiwldXZWSJcp9XeG9O+c2z5Wo0tBZymD5u3Q1n/OvWgeeOPvFHhGHY88Z3SUEd2CevIC R0hxUManjBQoMXzinPWU/C6G6l9LNXPUXk5iAOOwYDrjmwRcN7Y62Yt5XZk6AmpleJbn D2cQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=qCO0I9XODj8tcvSwCtrfb839SzPfAPZbTABsM0q9zhw=; b=tQL8T+a4ffaTMxwN6enwO4vDSIiYyeigePxhjN/R/E+FxL3pleVdw/yaOQmR96aCKr 5yUknYbSttuLzZY1g7P2hVtjRHEazu8PZiysMxnx942uUzU6rPCvicJQTRzmKzV0ABVS CFDydHU/IHja8OqN4elF0pmnIbsGB3DDsX4oEIvobxMshTpqtlEEhYXjjvPy7HeaC8xM 1qt/zNDKFlaoywG7diDG+1I965aLWcN01wiU3bRS00nz+wTsfHKwE0buOoV3jqCtYyma UBlbsgedeSq+Qyj6uduesvYsjFxORB+K2uiHt21Nq0guOwOsv6FLrEGWHFubAOknFHL7 6Pqw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=uxCHtMGD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h18si12609259eds.431.2020.08.03.16.58.35; Mon, 03 Aug 2020 16:58:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=uxCHtMGD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728890AbgHCX4E (ORCPT + 99 others); Mon, 3 Aug 2020 19:56:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35900 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726718AbgHCX4E (ORCPT ); Mon, 3 Aug 2020 19:56:04 -0400 Received: from mail-pg1-x543.google.com (mail-pg1-x543.google.com [IPv6:2607:f8b0:4864:20::543]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2D626C061756 for ; Mon, 3 Aug 2020 16:56:04 -0700 (PDT) Received: by mail-pg1-x543.google.com with SMTP id e8so21075105pgc.5 for ; Mon, 03 Aug 2020 16:56:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=qCO0I9XODj8tcvSwCtrfb839SzPfAPZbTABsM0q9zhw=; b=uxCHtMGDCPrDsexQiLlLDBwZTUIrXfXTceIkO6FI9lI3VQOB0vdPqeTq4h77KpjB73 VSHlpQnTxWhFZYXqdvCdXfhDufN4d0KlatMO/xUIM63TY/zrdOvPlhjznDrowD1ZKENt 5JfxdTrZzx3AwMtsPZDEflD7Pt6ofu2pAjL+ZxNTT5hM0RNnVeR6cf46Bzs2HSl/lioo x+awvymEwJfq4CVhq+WCS3plYffZ8TiCDg5zdpRfyo2AY1sGQ4/tc5FPAXuLWlao6Ejn RP/0FyzfQLGC5xQc6V8bTpCVegtIpTDj3VNoK1JHc/KJe5LBv3wjh86losl+zRlSOsGj P1YQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=qCO0I9XODj8tcvSwCtrfb839SzPfAPZbTABsM0q9zhw=; b=i22x6cr50VPuasF6gFTBidCNxLVKanDnzW9jy3vaXUugZaidlEEKzWgUN5XUBw8JPx alKOUJ3FuQuq41yCCuwSHYPAxVTTuelCrKiS/ZwnrAnUus+R3I8v5dgFoL1yYWKa2hCp Z3WJlGLP667m26Lskf3st+mXv+YBHILX0W75D+ml7zGhzk16eWp9uLeMugYhmRYIQnbS Vmjk+zc72fH/i8PBISjR4BdMLBep49Gtpg9BA17VugzFcu0ExPAoXz7PLuYYFDjbzQPq upi6RnRa/qPjVmb4v1TBuZQkQwIudvj+nlMCwzRpkVczOAgagy7JSkXFTLJTBUYrWYu2 un9Q== X-Gm-Message-State: AOAM532jc7Qyk7crjRlkAg/S2Q+hacFAWcn/KhNmg1n8OlWxFtxXH9eb gcwQC8s8ZqY2zAAx0xEoUwx/DYGGKeA= X-Received: by 2002:a65:6644:: with SMTP id z4mr17317591pgv.391.1596498963444; Mon, 03 Aug 2020 16:56:03 -0700 (PDT) Received: from [192.168.1.182] ([66.219.217.173]) by smtp.gmail.com with ESMTPSA id x9sm22720340pfq.216.2020.08.03.16.56.02 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 03 Aug 2020 16:56:02 -0700 (PDT) Subject: Re: [GIT PULL] io_uring changes for 5.9-rc1 To: Linus Torvalds Cc: io-uring , "linux-kernel@vger.kernel.org" References: <50466810-9148-e245-7c1e-e7435b753582@kernel.dk> <56cb11b1-7943-086e-fb31-6564f4d4d089@kernel.dk> <025dcd45-46df-b3fa-6b4a-a8c6a73787b0@kernel.dk> From: Jens Axboe Message-ID: Date: Mon, 3 Aug 2020 17:56:01 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 8/3/20 5:49 PM, Linus Torvalds wrote: > On Mon, Aug 3, 2020 at 4:31 PM Jens Axboe wrote: >> >> Updated to honor exclusive return value as well: > > See my previous email, You're just adding code that makes no sense, > because your wait entry fundamentally isn't an exclusive one. Right, I get that now, it's just dead code for my use case. It was sent out before your previous email. > So all that code is a no-op and only makes it more confusing to read. > > Your wakeup handler has _nothing_ to do with the generic > wake_page_function(). There is _zero_ overlap. Your wakeup handler > gets called only for the wait entries _you_ created. > > Trying to use the wakeup logic from wake_page_function() makes no > sense, because the rules for wake_page_function() are entirely > different. Yes, they are called for the same thing (somebody unlocked > a page and is waking up waiters), but it's using a completely > different sleeping logic. > > See? When wake_page_function() does that > > wait->flags |= WQ_FLAG_WOKEN; > > and does something different (and returns different values) depending > on whether WQ_FLAG_EXCLUSIVE was set, that is all because > wait_on_page_bit_common() entry set yo that wait entry (on its stack) > with those exact rules in mind. > > So the wakeup function is 1:1 tied to the code that registers the wait > entry. wait_on_page_bit_common() has one set of rules, that are then > honored by the wakeup function it uses. But those rules have _zero_ > impact on your use. You can have - and you *do* have - different sets > of rules. > > For example, none of your wakeups are ever exclusive. All you do is > make a work runnable - that doesn't mean that other people shouldn't > do other things when they get a "page was unlocked" wakeup > notification. > > Also, for you "list_del_init()" is fine, because you never do the > unlocked "list_empty_careful()" on that wait entry. All the waitqueue > operations run under the queue head lock. > > So what I think you _should_ do is just something like this: > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index 2a3af95be4ca..1e243f99643b 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -2965,10 +2965,10 @@ static int io_async_buf_func(struct > wait_queue_entry *wait, unsigned mode, > if (!wake_page_match(wpq, key)) > return 0; > > - /* Stop waking things up if the page is locked again */ > - if (test_bit(key->bit_nr, &key->page->flags)) > - return -1; > - > + /* > + * Somebody unlocked the page. Unqueue the wait entry > + * and run the task_work > + */ > list_del_init(&wait->entry); > > init_task_work(&req->task_work, io_req_task_submit); > > because that matches what you're actually doing. > > There's no reason to stop waking up others because the page is locked, > because you don't know what others want. > > And there's never any reason for the exclusive thing, b3ecause none of > what you do guarantees that you take exclusive ownership of the page > lock. Running the work *may* end up doing a "lock_page()", but you > don't actually guarantee that. What I ended up with after the last email was just removing the test bit: https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.9&id=cbd287c09351f1d3a4b3cb9167a2616a11390d32 and I clarified the comments on the io_async_buf_func() to add more hints on how everything is triggered instead of just a vague "handler" reference: https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.9&id=c1dd91d16246b168b80af9b64c5cc35a66410455 -- Jens Axboe