Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp2712798ybv; Mon, 24 Feb 2020 10:08:02 -0800 (PST) X-Google-Smtp-Source: APXvYqzQCK77N6Gmbj6OSKME310qX27SuC6e++zu1cLGRZyNaLhpTVjbPAKbbj7Gh6AoSGtgMOEb X-Received: by 2002:a9d:62ca:: with SMTP id z10mr24112840otk.263.1582567682666; Mon, 24 Feb 2020 10:08:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582567682; cv=none; d=google.com; s=arc-20160816; b=FFMQ4bzoD22iCr81LoGsix7t/XoJGG+yTiyfjFr+HFL6O6VyiQf/lZ8xi+d0goyUfU o/dFxKbrPqYqgWocQwV+pHJOo1hS19girrf6absa3y3AAa63M6qh5YTczw7kl2iGoMSf AgkB1sxG41CUWjhX6JgkQaPeCevem6XObZmy1OgCFPXBuz6OgyJpQzk+UDpOhhxdIY/o hPEdol1GMxMHkL3o2etDpXxGje/pND7koR2pUDpQZuC2hKeCvAqcq+Vlx3OFZJvRwV7o sE840it0FhX9uog3Nz3KX4Tvyqts74kaLvft5geJ+RrvWEGjcezEdsY7+ECDL6pn/Vme k6hQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=x6XPeRrFkFT/Vq7xanjULDlMHcIpIRA7leHx3w6fUQA=; b=AzM134DzuWU/rPmO2pw6avWjXHo9tdJxkICcV+quxMoK7hybvR76sC2mIGLCgiCQtv SCg7LHmNhya9mOPVCBil69CHrX2Yk2G1y289fquJWW2zAGqB8cWCLInEGuZtYct5F1vb QIP0b3VADWsGyaB/uZY+jeciePgOG1C/8bMfbqFv6zIyTEBeBWvpC7cnrG4TBOS8Ucmd duI66kVmch1lg1TCiu2QatJ02Gb1Xa4K9kZiLYGyUZp3iP4bYXx2vwl2j2K5FyuN2sG0 QEAvZk1OGrJ4RT4n8V/JD31lIDe7kwgHVglbBnmeOkqZ2gIVxpgqFAs3foOQ+cF0cPLG gWtw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r12si6392380otq.156.2020.02.24.10.07.49; Mon, 24 Feb 2020 10:08:02 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727689AbgBXSHY (ORCPT + 99 others); Mon, 24 Feb 2020 13:07:24 -0500 Received: from foss.arm.com ([217.140.110.172]:41084 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727479AbgBXSHY (ORCPT ); Mon, 24 Feb 2020 13:07:24 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 285191FB; Mon, 24 Feb 2020 10:07:23 -0800 (PST) Received: from [10.1.196.105] (eglon.cambridge.arm.com [10.1.196.105]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5CEF83F703; Mon, 24 Feb 2020 10:07:22 -0800 (PST) Subject: Re: [PATCH v2] arm64:kgdb: Fix kernel single-stepping To: minyard@acm.org Cc: Will Deacon , Catalin Marinas , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Corey Minyard References: <20200219152403.3495-1-minyard@acm.org> <20200220142214.GC14459@willie-the-truck> <20200220163038.GJ3704@minyard.net> <20200220213040.GA2919@minyard.net> From: James Morse Message-ID: <9e2eac0b-ab60-6316-4976-686a8ab7ac8f@arm.com> Date: Mon, 24 Feb 2020 18:07:17 +0000 User-Agent: Mozilla/5.0 (X11; Linux aarch64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <20200220213040.GA2919@minyard.net> Content-Type: text/plain; charset=utf-8 Content-Language: en-GB Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Corey, On 20/02/2020 21:30, Corey Minyard wrote: > On Thu, Feb 20, 2020 at 10:30:38AM -0600, Corey Minyard wrote: >> On Thu, Feb 20, 2020 at 02:22:14PM +0000, Will Deacon wrote: >>> On Wed, Feb 19, 2020 at 09:24:03AM -0600, minyard@acm.org wrote: >>>> After fixing this and doing some more testing, I ran into another issue: >>>> >>>> * Kernel enables the pt_regs single step >>>> * Kernel returns from the exception with ERET. >>>> * An interrupt or page fault happens on the instruction, causing the >>>> instruction to not be run, but the exception handler runs. >>> >>> This sounds like you've broken debug; we should take the step exception >>> in the exception handler. That's the way this is supposed to work. >> >> Ok, here is the disconnect, I think. If that is the case, then what I'm >> seeing is working like it should. That doesn't work with gdb, though, >> gdb expects to be able to single-step and get to the next instruction. >> The scenario I mentioned at the top of this email. >> >> Let me look at this a bit more. I'll look at this on qemu and maybe a >> pi. > Ok, this is the disconnect. I was assuming that single step would stop > at the next instruction after returning from an exception. qemu works > the same way the hardware I have does. So I'm assuming arm64 doesn't > clear PTRACE.SS on an exception, even though that seems to be what the > manual says. PSTATE.SS isn't an enable bit for single step ... its part of a bigger state-machine. (my made-up terminology for it is 'PSTATE.Suppress-Step'...) The diagram in the Arm-Arm's D2.12.3 "The software step state machine" may help. MDSCR_EL1.SS enables single-step, if PSTATE.D is clear the CPU will now take step exceptions instead of pretty much anything else. (active pending state) To execute one instruction you need to ERET with SPSR_ELx.SS set. (active, not pending) The CPU will execute one instruction, then clear PSTATE.SS. (taking us back to active pending) Taking an exception clears PSTATE.SS so that you know you're in active-pending state, and will take a step exception once you re-enable debug with PSTATE.D. This lets you step the exception handlers. (if it was set, you wouldn't see the first instruction in the step handler, if it was inherited, you couldn't know if you would see the first instruction or not). If you take something other than a step exception, PSTATE.SS will be preserved in SPSR_EL1.SS. What I think you are seeing is the step exception once debug is re-enabled, after taking an exception you didn't want. This happens because MDSCR_EL1.SS is still set. > You can reproduce this by setting up kgdb on the kernel and hooking up > gdb, setting a breakpoint somewhere that has interrupts enabled, then > doing a "continue". It will hit the same breakpoint again and again > because the PC doesn't get advanced by the single step and the timer > interrupt is always going to be pending. I can do a more detailed set > of instructions with qemu, if you like. > I looked at kprobes a bit. I don't think kprobes will have a problem > with this particular issue, it disables interrupts while single > stepping and doesn't allow a probe on any instruction that would modify > the interrupt settings. I didn't look at page faults, but I assume that > it also won't allow a probe where there can be a page fault. Yes, arch_prepare_kprobe() checks search_exception_tables() for locations that we know may cause page faults. These are blacklisted. Thanks, James