From: Thomas Gleixner
To: Jens Axboe, Christoph Hellwig, Ming Lei
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, John Garry, Bart Van Assche, Hannes Reinecke, io-uring@vger.kernel.org, Peter Zijlstra
Subject: Re: io_uring vs CPU hotplug, was Re: [PATCH 5/9] blk-mq: don't set data->ctx and data->hctx in blk_mq_alloc_request_hctx
In-Reply-To: <2a12a7aa-c339-1e51-de0d-9bc6ced14c64@kernel.dk>
Date: Thu, 21 May 2020 00:14:18 +0200
Message-ID: <87eereuudh.fsf@nanos.tec.linutronix.de>

Jens Axboe writes:
> On 5/20/20 1:41 PM, Thomas Gleixner wrote:
>> Jens Axboe writes:
>>> On 5/20/20 8:45 AM, Jens Axboe wrote:
>>>> It just uses kthread_create_on_cpu(), nothing home grown. Pretty sure
>>>> they just break affinity if that CPU goes offline.
>>>
>>> Just checked, and it works fine for me. If I create an SQPOLL ring with
>>> SQ_AFF set and bound to CPU 3, if CPU 3 goes offline, then the kthread
>>> just appears unbound but runs just fine. When CPU 3 comes online again,
>>> the mask appears correct.
>>
>> When exactly during the unplug operation is it unbound?
>
> When the CPU has been fully offlined. I check the affinity mask, it
> reports 0. But it's still being scheduled, and it's processing work.
> Here's an example, PID 420 is the thread in question:
>
> [root@archlinux cpu3]# taskset -p 420
> pid 420's current affinity mask: 8
> [root@archlinux cpu3]# echo 0 > online
> [root@archlinux cpu3]# taskset -p 420
> pid 420's current affinity mask: 0
> [root@archlinux cpu3]# echo 1 > online
> [root@archlinux cpu3]# taskset -p 420
> pid 420's current affinity mask: 8
>
> So as far as I can tell, it's working fine for me with the goals
> I have for that kthread.

"Works for me" is not really useful information and does not answer my
question:

>> When exactly during the unplug operation is it unbound?

The problem Ming and Christoph are trying to solve requires that the
thread is migrated _before_ the hardware queue is shut down and
drained. That's why I asked for the exact point where this happens.

The point where the CPU is finally offlined, i.e. where the CPU has
cleared its bit in the online mask, is definitely too late, simply
because the thread still runs on that outgoing CPU _after_ the hardware
queue is shut down and drained.

This needs more thought and changes to sched and kthread so that the
kthread breaks affinity once the CPU goes offline. Too tired to figure
that out right now.

Thanks,

        tglx
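
For readers decoding the taskset output quoted above: the affinity mask is a hexadecimal bitmask in which bit N stands for CPU N, so a mask of 8 means "CPU 3 only" and a mask of 0 means no CPU is set. A minimal sketch of that decoding follows; the helper name cpus_from_mask is hypothetical and is not part of taskset or the kernel.

```python
from typing import List


def cpus_from_mask(mask_hex: str) -> List[int]:
    """Return the CPU numbers whose bits are set in a hex affinity mask.

    Hypothetical helper for illustration only: bit N of the mask
    corresponds to CPU N, as in the mask printed by `taskset -p`.
    """
    mask = int(mask_hex, 16)
    cpus = []
    cpu = 0
    while mask:
        if mask & 1:  # lowest bit set means this CPU is in the mask
            cpus.append(cpu)
        mask >>= 1
        cpu += 1
    return cpus


if __name__ == "__main__":
    # The two mask values that appear in the quoted transcript.
    for mask in ("8", "0"):
        print(mask, "->", cpus_from_mask(mask))
```

Note that this only interprets the reported mask. It says nothing about when during the unplug operation the mask changes, which is the open question in this thread.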