From: Andy Lutomirski
Date: Fri, 29 Apr 2016 17:08:50 -0700
Subject: Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking
To: Josh Poimboeuf
Cc: Jiri Kosina, Ingo Molnar, X86 ML, Heiko Carstens, "linux-s390@vger.kernel.org", live-patching@vger.kernel.org, Michael Ellerman, Chris J Arges, linuxppc-dev@lists.ozlabs.org, Jessica Yu, Petr Mladek, Jiri Slaby, Vojtech Pavlik, "linux-kernel@vger.kernel.org", Miroslav Benes, Peter Zijlstra
In-Reply-To: <20160429224112.kl3jlk7ccvfceg2r@treble>
References: <20160429201139.pudoged2yathyo64@treble> <20160429202701.yijrohqdsurdxv2a@treble> <20160429212546.t26mvthtvh7543ff@treble> <20160429224112.kl3jlk7ccvfceg2r@treble>
X-Mailing-List: linux-kernel@vger.kernel.org

On Apr 29, 2016 3:41 PM, "Josh Poimboeuf" wrote:
>
> On Fri, Apr 29, 2016 at 02:37:41PM -0700, Andy Lutomirski wrote:
> > On Fri, Apr 29, 2016 at 2:25 PM, Josh Poimboeuf wrote:
> > > I think the easiest way to make it work would be to modify the idtentry
> > > macro to put all the idt entries in a dedicated section. Then the
> > > unwinder could easily detect any calls from that code.
> >
> > That would work. Would it make sense to do the same for the irq entries?
>
> Yes, I think so.
>
> > >> I suppose we could try to rejigger the code so that rbp points to
> > >> pt_regs or similar.
> > >
> > > I think we should avoid doing something like that because it would break
> > > gdb and all the other unwinders who don't know about it.
> >
> > How so?
> >
> > Currently, rbp in the entry code is meaningless. I'm suggesting that,
> > when we do, for example, 'call \do_sym' in idtentry, we point rbp to
> > the pt_regs. Currently it points to something stale (which the
> > dump_stack code might be relying on. Hmm.) But it's probably also
> > safe to assume that if you unwind to the 'call \do_sym', then pt_regs
> > is the next thing on the stack, so just doing the section thing would
> > work.
>
> Yes, rbp is meaningless on the entry from user space. But if an
> in-kernel interrupt occurs (e.g. page fault, preemption) and you have
> nested entry, rbp keeps its old value, right? So the unwinder can walk
> past the nested entry frame and keep going until it gets to the original
> entry.

Yes.

It would be nice if we could do better, though, and actually notice
the pt_regs and identify the entry. For example, I'd love to see
"page fault, RIP=xyz" printed in the middle of a stack dump on a
crash. Also, I think that just following rbp links will lose the
actual function that took the page fault (or whatever function
pt_regs->ip actually points to).

>
> > We should really re-add DWARF some day.
>
> Working on it :-)

Excellent.

Have you looked at my vdso unwinding test at all? If we could do
something similar for the kernel, IMO it would make testing much more
pleasant.

--Andy
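
For illustration only, here is a toy user-space model of the two
unwinding strategies discussed above. It is not kernel code. The
address range ENTRY_START..ENTRY_END, the PtRegs class, the memory
layout and every address in it are assumptions made up for this
sketch. It assumes, as suggested in the thread, that when a return
address falls inside the dedicated entry section, pt_regs is the next
thing on the stack.

```python
from dataclasses import dataclass

# Hypothetical address range standing in for a dedicated entry-code section.
ENTRY_START, ENTRY_END = 0x1000, 0x2000


@dataclass
class PtRegs:
    ip: int      # interrupted instruction pointer (models pt_regs->ip)
    bp: int      # interrupted frame pointer (models pt_regs->bp)
    label: str   # kind of entry, e.g. "page fault"


def unwind(memory, bp, entry_aware=True):
    """Walk a frame-pointer chain in a toy memory model.

    memory[bp] holds the saved rbp and memory[bp + 8] the return address.
    If the return address lies in the entry section, memory[bp + 16] is
    assumed to hold the saved registers (the "next thing on the stack").
    """
    trace = []
    while bp:
        saved_bp = memory[bp]
        ret = memory[bp + 8]
        trace.append(hex(ret))
        if entry_aware and ENTRY_START <= ret < ENTRY_END:
            regs = memory[bp + 16]
            # Report the entry, then the function that was interrupted.
            trace.append(f"{regs.label}, RIP={regs.ip:#x}")
            trace.append(hex(regs.ip))
            bp = regs.bp
        else:
            bp = saved_bp
    return trace


# A fault handler frame at 0x7f00 called from entry code (0x1234), on top
# of an interrupted function (ip 0x5010) whose caller returns to 0x6020.
memory = {
    0x7F00: 0x7F80,   # saved rbp: nested entry left rbp untouched
    0x7F08: 0x1234,   # return address inside the entry section
    0x7F10: PtRegs(ip=0x5010, bp=0x7F80, label="page fault"),
    0x7F80: 0,        # end of the chain
    0x7F88: 0x6020,   # return address into the interrupted function's caller
}

print(unwind(memory, 0x7F00, entry_aware=False))
print(unwind(memory, 0x7F00))
```

In this model the plain rbp walk reaches the original caller but never
reports 0x5010, the function that took the fault. The entry-aware walk
reports both the entry and that function.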