Subject: Re: [PATCH] blk-wbt: get back the missed wakeup from __wbt_done
To: "van der Linden, Frank", "jianchao.wang", Anchal Agarwal
Cc: "mlinux-block@vger.kernel.org", "linux-kernel@vger.kernel.org"
References: <1535029718-17259-1-git-send-email-jianchao.w.wang@oracle.com>
 <20180823210144.GB5624@kaos-source-ops-60001.pdx1.amazon.com>
 <3eaa20ce-0599-c405-d979-87d91ea331d2@kernel.dk>
 <969389e7-b1bc-0559-6cc9-9461b034a24f@kernel.dk>
 <8af76974-08b2-f4ef-91b9-7bd42291b8d9@oracle.com>
 <347a7a07dc5f4122a37afd703ef2a3d0@EX13D13UWB002.ant.amazon.com>
From: Jens Axboe
Date: Fri, 24 Aug 2018 10:44:22 -0600
In-Reply-To: <347a7a07dc5f4122a37afd703ef2a3d0@EX13D13UWB002.ant.amazon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 8/24/18 10:40 AM, van der Linden, Frank wrote:
> On 8/23/18 10:56 PM, jianchao.wang wrote:
>>
>> On 08/24/2018 07:14 AM, Jens Axboe wrote:
>>> On 8/23/18 5:03 PM, Jens Axboe wrote:
>>>>> Hi Jens, This patch looks much cleaner for sure as Frank pointed out
>>>>> too. Basically this looks similar to wake_up_nr only making sure that
>>>>> those woken up requests won't get reordered. This does solves the
>>>>> thundering herd issue. However, I tested the patch against my
>>>>> application and lock contention numbers rose to around 10 times from
>>>>> what I had from your last 3 patches. Again this did add to drop in
>>>>> of total files read by 0.12% and rate at which they were read by
>>>>> 0.02% but this is not a very significant drop. Is lock contention
>>>>> worth the tradeoff? I also added missing
>>>>> __set_current_state(TASK_RUNNING) to the patch for testing.
>>>> Can you try this variant? I don't think we need a
>>>> __set_current_state() after io_schedule(), should be fine as-is.
>>>>
>>>> I'm not surprised this will raise contention a bit, since we're now
>>>> waking N tasks potentially, if N can queue. With the initial change,
>>>> we'd always just wake one. That is arguably incorrect. You say it's
>>>> 10 times higher contention, how does that compare to before your
>>>> patch?
>>>>
>>>> Is it possible to run something that looks like your workload?
>>> Additionally, is the contention you are seeing the wait queue, or the
>>> atomic counter? When you say lock contention, I'm inclined to think it's
>>> the rqw->wait.lock.
>>>
>> I guess the increased lock contend is due to:
>> when the wake up is ongoing with wait head lock is held, there is still waiter
>> on wait queue, and __wbt_wait will go to wait and try to require the wait head lock.
>> This is necessary to keep the order on the rqw->wait queue.
>>
>> The attachment does following thing to try to avoid the scenario above.
>> "
>> Introduce wait queue rqw->delayed. Try to lock rqw->wait.lock firstly, if fails, add
>> the waiter on rqw->delayed. __wbt_done will pick the waiters on rqw->delayed up and
>> queue them on the tail of rqw->wait before it do wake up operation.
>> "
>>
> Hmm, I am not sure about this one. Sure, it will reduce lock contention
> for the waitq lock, but it also introduces more complexity.
>
> It's expected that there will be more contention if the waitq lock is
> held longer. That's the tradeoff for waking up more throttled tasks and
> making progress faster. Is this added complexity worth the gains? My
> first inclination would be to say no.
>
> If lock contention on a wait queue is an issue, then either the wait
> queue mechanism itself should be improved, or the code that uses the
> wait queue should be fixed. Also, the contention is still a lot lower
> than it used to be.

Hard to disagree with that. If you look at the blk-mq tagging code, it
spreads out the wait queues exactly because of this.

-- 
Jens Axboe
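
For illustration only, the idea of spreading waiters over several wait
queues, so that no single queue lock is hit by every waiter and every
waker, can be sketched in userspace C roughly as below. This is not the
blk-mq tagging code and not a proposed blk-wbt patch. Every name in it
(sharded_waitq, swq_add_waiter, swq_wake_up, NR_WAIT_QUEUES) is
hypothetical, waiters are modelled as a plain counter rather than
sleeping tasks, and a small spinlock stands in for a wait queue lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_WAIT_QUEUES 8 /* hypothetical shard count, chosen for illustration */

/* One shard: its own lock and its own waiters (modelled as a count). */
struct wait_shard {
	atomic_flag lock;     /* tiny spinlock standing in for a waitqueue lock */
	unsigned int waiters; /* number of tasks parked on this shard */
};

struct sharded_waitq {
	struct wait_shard shard[NR_WAIT_QUEUES];
	atomic_uint wait_index; /* next shard a new waiter will use */
	atomic_uint wake_index; /* shard a waker looks at first */
};

static void shard_lock(struct wait_shard *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void shard_unlock(struct wait_shard *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

void swq_init(struct sharded_waitq *q)
{
	assert(q != NULL);
	for (size_t i = 0; i < NR_WAIT_QUEUES; i++) {
		atomic_flag_clear(&q->shard[i].lock);
		q->shard[i].waiters = 0;
	}
	atomic_init(&q->wait_index, 0);
	atomic_init(&q->wake_index, 0);
}

/* Park one waiter on the next shard in round-robin order.
 * Only that shard's lock is taken. Returns the shard index used. */
unsigned int swq_add_waiter(struct sharded_waitq *q)
{
	unsigned int idx = atomic_fetch_add(&q->wait_index, 1) % NR_WAIT_QUEUES;
	struct wait_shard *s = &q->shard[idx];

	shard_lock(s);
	s->waiters++;
	shard_unlock(s);
	return idx;
}

/* Wake up to nr waiters, scanning shards from wake_index onwards and
 * holding one shard lock at a time. Returns how many were woken. */
unsigned int swq_wake_up(struct sharded_waitq *q, unsigned int nr)
{
	unsigned int start = atomic_load(&q->wake_index);
	unsigned int woken = 0;

	for (unsigned int i = 0; i < NR_WAIT_QUEUES && woken < nr; i++) {
		struct wait_shard *s = &q->shard[(start + i) % NR_WAIT_QUEUES];

		shard_lock(s);
		while (s->waiters > 0 && woken < nr) {
			s->waiters--; /* stands in for waking one task */
			woken++;
		}
		shard_unlock(s);
	}
	/* Start the next wakeup one shard further along. */
	atomic_store(&q->wake_index, (start + 1) % NR_WAIT_QUEUES);
	return woken;
}
```

Note that a scheme like this gives up strict FIFO ordering across
shards, which is exactly the property the single rqw->wait queue
discussed above is trying to keep, so it trades ordering for lower
contention on any one lock.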