Subject: Re: [PATCH] arm64: kdump: fix interrupt handling done during machine_crash_shutdown
To: Grzegorz Jaszczyk, Mark Rutland
Cc: catalin.marinas@arm.com, will.deacon@arm.com, james.morse@arm.com, "AKASHI, Takahiro", Hoeun Ryu, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Nadav Haklai, Marcin Wojtas
References: <1519837260-30662-1-git-send-email-jaz@semihalf.com> <20180228171647.6t4dabntujcb5kon@lakrids.cambridge.arm.com>
From: Marc Zyngier
Organization: ARM Ltd
Message-ID: <9a709b76-1b45-8250-9d8e-adbad59a43cc@arm.com>
Date: Wed, 7 Mar 2018 18:14:42 +0000

On 02/03/18 11:56, Grzegorz Jaszczyk wrote:
> Thank you for your feedback. I probably over-interpreted some of the
> documentation paragraph to justify (probably) buggy behavior that I am
> seeing. Regardless of the correctness of this patch, I would appreciate
> it if you could help me understand this issue.
>
> First the whole story: I was debugging why the crashdump kernel hangs
> at a very early stage when the kdump was triggered from the
> ARM_SBSA_WATCHDOG interrupt handler, while everything worked fine when
> it was triggered from process context. Finally it turned out that it
> is because the crashdump kernel doesn't get any timer interrupt. I
> also noticed that this problem doesn't occur when the GIC is configured
> to work in EOImode == 1. In such circumstances, the write to
> GIC_CPU_EOI in gic_handle_irq causes a priority drop to idle, and
> therefore when the crashdump kernel starts, the timer interrupt is
> able to preempt the still-active watchdog interrupt (I know that this
> interrupt shouldn't be active after irq_set_irqchip_state, but for some
> reason it seems not to do the job correctly).
>
> In my commit log I wrongly described the behaviour of
> irq_set_irqchip_state and irq_get_irqchip_state. In
> machine_kexec_mask_interrupts (when the watchdog interrupt is active),
> after adding some debug output, I see the following (focusing only on
> the watchdog interrupt):
> 1) before calling irq_set_irqchip_state, checking the status with
> irq_get_irqchip_state shows that the watchdog interrupt is active
> 2) the interrupt is deactivated via irq_set_irqchip_state
> 3) checking the status via irq_get_irqchip_state indicates that the
> status has changed to inactive, so everything seems to be fine, but
> in the crashdump kernel I still don't get any interrupts (when EOImode
> == 0).
>
> When I modify machine_kexec_mask_interrupts to call the eoi for the
> watchdog (only temporarily, to observe the effect):
> if (i == watchdog_irq)
>         chip->irq_eoi(&desc->irq_data);
>
> everything works. So it seems that deactivating the interrupt via a
> write to GIC_CPU_EOI (EOImode == 0) or GIC_CPU_EOI +
> GIC_CPU_DEACTIVATE (EOImode == 1) does the job, while deactivating it
> with GIC_DIST_ACTIVE_CLEAR doesn't.
>
> I am using the unmodified GICv2m ("arm,gic-400") and the watchdog
> interrupt is connected as one of the SPIs. Do you have any idea what
> can be wrong? Maybe I am missing something? GIC configuration? I also
> can't exclude that nobody who works with kdump uses (EOImode == 0),
> and therefore nobody has seen this behavior.

Not using EOImode==1 is definitely an oddity (at least on the host), but
that doesn't mean it shouldn't work.

The reason the thing is hanging is that although we correctly deactivate
the interrupt, nothing performs the priority drop. Your write to EOI
helps in the sense that it guarantees that both priority drop and
deactivate are done with the same operation, but that's not something
we'd want to expose.

My preferred approach would be to nuke the active priority registers at
boot time, as the CPUs come up. I'll try to write something this week.

	M.
-- 
Jazz is not dead. It just smells funny...
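
The split between priority drop and deactivation discussed above can be illustrated with a small user-space toy model. This is only a sketch under loud assumptions: it is not kernel code and does not touch real GIC registers. Every name in it (`struct toy_gic`, the `toy_*` helpers, the interrupt number 40, the single shared priority 0xa0) is hypothetical, and the CPU interface's active priority state is collapsed into one running-priority byte. It only aims to show why clearing the active bit on the distributor side leaves the timer blocked when EOImode == 0, and why wiping the active priority state at boot would unblock it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_IRQS   64
#define TOY_IDLE_PRIO 0xff  /* running priority when nothing is in service */
#define TOY_DEF_PRIO  0xa0  /* assumption: all interrupts share one priority */

/* Hypothetical, heavily simplified model of a GICv2 CPU interface. */
struct toy_gic {
	bool    eoimode1;             /* true models EOImode == 1 */
	uint8_t running_prio;         /* stands in for the active priority registers */
	bool    active[TOY_NR_IRQS];  /* stands in for the distributor active bits */
};

static void toy_init(struct toy_gic *g, bool eoimode1)
{
	g->eoimode1 = eoimode1;
	g->running_prio = TOY_IDLE_PRIO;
	for (unsigned int i = 0; i < TOY_NR_IRQS; i++)
		g->active[i] = false;
}

/* Acknowledge: the interrupt becomes active and the running priority rises. */
static void toy_ack(struct toy_gic *g, unsigned int irq)
{
	assert(irq < TOY_NR_IRQS);
	g->active[irq] = true;
	g->running_prio = TOY_DEF_PRIO;
}

/* EOI write: always drops priority; also deactivates when EOImode == 0. */
static void toy_eoi(struct toy_gic *g, unsigned int irq)
{
	assert(irq < TOY_NR_IRQS);
	g->running_prio = TOY_IDLE_PRIO;
	if (!g->eoimode1)
		g->active[irq] = false;
}

/* Distributor-side deactivate (the GIC_DIST_ACTIVE_CLEAR path): no priority drop. */
static void toy_dist_clear_active(struct toy_gic *g, unsigned int irq)
{
	assert(irq < TOY_NR_IRQS);
	g->active[irq] = false;
}

/* Models the proposed boot-time cleanup: wipe the active priority state. */
static void toy_reset_active_prio(struct toy_gic *g)
{
	g->running_prio = TOY_IDLE_PRIO;
}

/* A pending interrupt is signalled only if it outranks the running priority
 * (a lower value means a higher priority). */
static bool toy_can_signal(const struct toy_gic *g)
{
	return TOY_DEF_PRIO < g->running_prio;
}

/* Crash inside the watchdog handler, then ask: can the timer be delivered? */
static bool toy_timer_after_crash(bool eoimode1, bool reset_apr)
{
	struct toy_gic g;
	const unsigned int watchdog_irq = 40;  /* hypothetical SPI number */

	toy_init(&g, eoimode1);
	toy_ack(&g, watchdog_irq);
	if (eoimode1)
		toy_eoi(&g, watchdog_irq);  /* early priority drop, before the handler */
	/* Crash here: with EOImode == 0 the EOI write is never reached. */
	toy_dist_clear_active(&g, watchdog_irq);
	assert(!g.active[watchdog_irq]);    /* status reads back as inactive */
	if (reset_apr)
		toy_reset_active_prio(&g);
	return toy_can_signal(&g);
}
```

In this model, `toy_timer_after_crash(false, false)` reports the timer as blocked even though the watchdog interrupt reads back as inactive, which matches the hang described in the quoted report. Either the early EOI write of EOImode == 1 or the reset of the active priority state lets the timer through.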