Subject: Re: [PATCH 12/15] block: introduce blk-iolatency io controller
To: Josef Bacik, Jens Axboe
Cc: linux-block@vger.kernel.org, kernel-team@fb.com, akpm@linux-foundation.org,
 hannes@cmpxchg.org, linux-kernel@vger.kernel.org, tj@kernel.org,
 linux-fsdevel@vger.kernel.org, Josef Bacik
References: <20180625151243.2132-1-josef@toxicpanda.com>
 <20180625151243.2132-13-josef@toxicpanda.com>
 <05a581ed-8f21-9d89-a813-a03d802d3469@kernel.dk>
 <20180627192046.ieqncfl6ioy37mof@destiny>
 <784f0862-0441-5ed2-1beb-3effa82b3438@kernel.dk>
 <20180628132648.wytk67ascubysqun@destiny>
From: Jens Axboe
Message-ID: <69aaf06b-ab1a-9982-a547-fcab7daff55f@kernel.dk>
Date: Thu, 28 Jun 2018 09:35:33 -0600
In-Reply-To: <20180628132648.wytk67ascubysqun@destiny>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/28/18 7:26 AM, Josef Bacik wrote:
> On Wed, Jun 27,
> 2018 at 01:24:55PM -0600, Jens Axboe wrote:
>> On 6/27/18 1:20 PM, Josef Bacik wrote:
>>> On Wed, Jun 27, 2018 at 01:06:31PM -0600, Jens Axboe wrote:
>>>> On 6/25/18 9:12 AM, Josef Bacik wrote:
>>>>> +static void __blkcg_iolatency_throttle(struct rq_qos *rqos,
>>>>> +				       struct iolatency_grp *iolat,
>>>>> +				       spinlock_t *lock, bool issue_as_root,
>>>>> +				       bool use_memdelay)
>>>>> +	__releases(lock)
>>>>> +	__acquires(lock)
>>>>> +{
>>>>> +	struct rq_wait *rqw = &iolat->rq_wait;
>>>>> +	unsigned use_delay = atomic_read(&lat_to_blkg(iolat)->use_delay);
>>>>> +	DEFINE_WAIT(wait);
>>>>> +	bool first_block = true;
>>>>> +
>>>>> +	if (use_delay)
>>>>> +		blkcg_schedule_throttle(rqos->q, use_memdelay);
>>>>> +
>>>>> +	/*
>>>>> +	 * To avoid priority inversions we want to just take a slot if we are
>>>>> +	 * issuing as root. If we're being killed off there's no point in
>>>>> +	 * delaying things, we may have been killed by OOM so throttling may
>>>>> +	 * make recovery take even longer, so just let the IO's through so the
>>>>> +	 * task can go away.
>>>>> +	 */
>>>>> +	if (issue_as_root || fatal_signal_pending(current)) {
>>>>> +		atomic_inc(&rqw->inflight);
>>>>> +		return;
>>>>> +	}
>>>>> +
>>>>> +	if (iolatency_may_queue(iolat, &wait, first_block))
>>>>> +		return;
>>>>> +
>>>>> +	do {
>>>>> +		prepare_to_wait_exclusive(&rqw->wait, &wait,
>>>>> +					  TASK_UNINTERRUPTIBLE);
>>>>> +
>>>>> +		iolatency_may_queue(iolat, &wait, first_block);
>>>>> +		first_block = false;
>>>>> +
>>>>> +		if (lock) {
>>>>> +			spin_unlock_irq(lock);
>>>>> +			io_schedule();
>>>>> +			spin_lock_irq(lock);
>>>>> +		} else {
>>>>> +			io_schedule();
>>>>> +		}
>>>>> +	} while (1);
>>>>
>>>> So how does this wait loop ever exit?
>>>>
>>>
>>> Sigh, I cleaned this up from what we're using in production and did it poorly,
>>> I'll fix it up. Thanks,
>>
>> Also may want to consider NOT using exclusive add if first_block == false, as
>> you'll end up at the tail of the waitqueue after sleeping and being denied.
>> This is similar to the wbt change I posted last week.
>>
>
> This isn't how it works though. You aren't removed from the list until you do
> finish_wait(), so you don't lose your spot on the list. We only get added to
> the end of the list if
>
> 	if (list_empty(&wq_entry->entry))
>
> otherwise nothing changes.

I missed that you don't do finish_wait() in the loop, I had played with
that to see if it fixes things. But yeah, as it stands, you are right.

>> For may_queue(), your wq_has_sleeper() is also going to be always true
>> inside your loop, since you call it after doing the prepare_to_wait()
>> which adds you to the queue. That's why wbt does the list checks, but
>> it'd be nicer to have a wq_has_other_sleepers() for that. So your
>> first iolatency_may_queue() inside the loop will always be false.
>
> Ah yeah that's a good point, I'll go back to using what you had to catch that
> case. Thanks,

Basically we need to do the same thing in wbt and blk-iolatency for this,
so we should sync them up.

-- 
Jens Axboe
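
For anyone following the two waitqueue points above, here is a rough
userspace sketch of them. It is not kernel code: every model_* and node_*
name is made up for illustration, and it does not model wakeups, task
state or locking. It only shows (1) that calling the prepare step again
while already queued keeps your position, because the add is guarded by
an "entry is empty" check, and (2) that an "any sleeper?" test is
trivially true once you have queued yourself, which is why a check for
*other* sleepers is wanted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a circular doubly linked list node. */
struct node {
	struct node *next, *prev;
};

static void node_init(struct node *n) { n->next = n->prev = n; }
static bool node_empty(const struct node *n) { return n->next == n; }

static void node_add_tail(struct node *n, struct node *head)
{
	assert(node_empty(n));	/* caller must not queue the same node twice */
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Hypothetical model of a waitqueue head and one waiter's entry. */
struct model_wq { struct node head; };
struct model_waiter { struct node entry; };

/* Queue at the tail only if the entry is not already on a list. */
static void model_prepare_to_wait_exclusive(struct model_wq *wq,
					    struct model_waiter *w)
{
	if (node_empty(&w->entry))
		node_add_tail(&w->entry, &wq->head);
}

/* In this model, the only place a waiter leaves the list. */
static void model_finish_wait(struct model_waiter *w)
{
	if (!node_empty(&w->entry))
		node_del_init(&w->entry);
}

/* "Is anyone queued?" becomes true as soon as the caller queues itself. */
static bool model_has_sleeper(const struct model_wq *wq)
{
	return !node_empty(&wq->head);
}

/* Hypothetical helper: is anyone queued besides this waiter? */
static bool model_has_other_sleepers(const struct model_wq *wq,
				     const struct model_waiter *w)
{
	const struct node *n;

	for (n = wq->head.next; n != &wq->head; n = n->next)
		if (n != &w->entry)
			return true;
	return false;
}

/* Zero-based position in the queue, or -1 if not queued. */
static int model_position(const struct model_wq *wq,
			  const struct model_waiter *w)
{
	const struct node *n;
	int pos = 0;

	for (n = wq->head.next; n != &wq->head; n = n->next, pos++)
		if (n == &w->entry)
			return pos;
	return -1;
}
```

With waiters A then B queued, a second model_prepare_to_wait_exclusive()
for A leaves A at position 0. With only A queued, model_has_sleeper() is
true while model_has_other_sleepers() is false, which is the distinction
the may_queue() check needs.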