Date: Tue, 3 Sep 2019 14:31:26 +0800
From: Ming Lei
To: Daniel Lezcano
Cc: Thomas Gleixner, LKML, Long Li, Ingo Molnar, Peter Zijlstra,
 Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg, John Garry,
 Hannes Reinecke, linux-nvme@lists.infradead.org,
 linux-scsi@vger.kernel.org
Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
Message-ID: <20190903063125.GA21022@ming.t460p>
References: <20190827085344.30799-2-ming.lei@redhat.com>
 <20190827225827.GA5263@ming.t460p>
 <20190828110633.GC15524@ming.t460p>
 <20190828135054.GA23861@ming.t460p>
 <20190903033001.GB23861@ming.t460p>
 <299fb6b5-d414-2e71-1dd2-9d6e34ee1c79@linaro.org>
In-Reply-To: <299fb6b5-d414-2e71-1dd2-9d6e34ee1c79@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Daniel,

On Tue, Sep 03, 2019 at 07:59:39AM +0200, Daniel Lezcano wrote:
>
> Hi Ming Lei,
>
> On 03/09/2019 05:30, Ming Lei wrote:
>
> [ ... ]
>
> >>> 2) irq/timing doesn't cover softirq
> >>
> >> That's solvable, right?
> >
> > Yeah, we can extend irq/timing, but ugly for irq/timing, since irq/timing
> > focuses on hardirq prediction, and softirq isn't involved in that
> > purpose.
> >
> >>
> >>> Daniel, could you take a look and see if irq flood detection can be
> >>> implemented easily by irq/timing.c?
> >>
> >> I assume you can take a look as well, right?
> >
> > Yeah, I have looked at the code for a while, but I think that irq/timing
> > could become complicated unnecessarily for covering irq flood detection,
> > meantime it is much less efficient for detecting IRQ flood.
>
> In the series, there is nothing describing rigorously the problem (I can
> only guess) and why the proposed solution solves it.
>
> What is your definition of an 'irq flood'? A high irq load? An irq
> arriving while we are processing the previous one in the bottom halves?

So far, it means that handling interrupts and softirqs takes all the
utilization of one CPU, so processes basically can't run on that CPU,
and usually some sort of CPU lockup warning is triggered.

>
> The patch 2/4 description says "however IO completion is only done on
> one of these submission CPU cores".
> That describes the bottleneck and
> then the patch says "Add IRQF_RESCUE_THREAD to create one interrupt
> thread handler", what is the rationale between the bottleneck (problem)
> and the irqf_rescue_thread (solution)?

The solution is to switch to handling this interrupt in the context of
the created rescue irq thread when an irq flood is detected; 'this
interrupt' means the interrupt requested with IRQF_RESCUE_THREAD.

>
> Is it really the solution to track the irq timings to detect a flood?

The solution tracks the time spent running do_IRQ() on each CPU.

Thanks,
Ming
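
For illustration only, here is a minimal user-space sketch of the idea
described above: account the time each CPU spends handling interrupts
and flag a flood when that time takes up nearly a whole sampling window.
This is not code from the patch series. The names irq_account() and
irq_flood_detected(), the 1 ms window and the 90% threshold are all
assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS             4          /* hypothetical CPU count */
#define WINDOW_NS           1000000ULL /* hypothetical 1 ms sampling window */
#define FLOOD_THRESHOLD_PCT 90         /* hypothetical flood threshold */

struct irq_load {
    uint64_t window_start_ns; /* when the current window began */
    uint64_t irq_ns;          /* interrupt handling time in this window */
    bool     flood;           /* verdict from the last completed window */
};

static struct irq_load cpu_irq_load[NR_CPUS];

/*
 * Account one interrupt handled on @cpu between @enter_ns and @exit_ns.
 * Once at least WINDOW_NS has elapsed, compare the accumulated handling
 * time against the threshold and start a new window.
 */
static void irq_account(unsigned int cpu, uint64_t enter_ns, uint64_t exit_ns)
{
    assert(cpu < NR_CPUS && enter_ns <= exit_ns);

    struct irq_load *l = &cpu_irq_load[cpu];
    uint64_t elapsed = exit_ns - l->window_start_ns;

    l->irq_ns += exit_ns - enter_ns;
    if (elapsed < WINDOW_NS)
        return;

    /* Flood: interrupt handling consumed >= FLOOD_THRESHOLD_PCT of the window. */
    l->flood = l->irq_ns * 100 >= elapsed * FLOOD_THRESHOLD_PCT;
    l->window_start_ns = exit_ns;
    l->irq_ns = 0;
}

/* A caller could use this verdict to decide to defer handling to a thread. */
static bool irq_flood_detected(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return cpu_irq_load[cpu].flood;
}
```

In this sketch a caller would invoke irq_account() with timestamps taken
on entry to and exit from interrupt handling, and consult
irq_flood_detected() when choosing between handling in interrupt context
and handing off to a thread. A fixed window keeps the per-interrupt cost
to a few additions and one comparison.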