Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752881AbcD2UTq (ORCPT ); Fri, 29 Apr 2016 16:19:46 -0400
Received: from mail-oi0-f49.google.com ([209.85.218.49]:36737 "EHLO
        mail-oi0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752327AbcD2UTo (ORCPT );
        Fri, 29 Apr 2016 16:19:44 -0400
MIME-Version: 1.0
In-Reply-To: <20160429201139.pudoged2yathyo64@treble>
References: <20160429201139.pudoged2yathyo64@treble>
From: Andy Lutomirski
Date: Fri, 29 Apr 2016 13:19:23 -0700
Subject: Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking
To: Josh Poimboeuf
Cc: Jessica Yu, Jiri Kosina, Miroslav Benes, Ingo Molnar, Peter Zijlstra,
        Michael Ellerman, Heiko Carstens, live-patching@vger.kernel.org,
        "linux-kernel@vger.kernel.org", X86 ML, linuxppc-dev@lists.ozlabs.org,
        "linux-s390@vger.kernel.org", Vojtech Pavlik, Jiri Slaby, Petr Mladek,
        Chris J Arges, Andy Lutomirski
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1735
Lines: 37

On Fri, Apr 29, 2016 at 1:11 PM, Josh Poimboeuf wrote:
> On Fri, Apr 29, 2016 at 11:06:53AM -0700, Andy Lutomirski wrote:
>> On Thu, Apr 28, 2016 at 1:44 PM, Josh Poimboeuf wrote:
>> > A preempted function might not have had a chance to save the frame
>> > pointer to the stack yet, which can result in its caller getting skipped
>> > on a stack trace.
>> >
>> > Add a flag to indicate when the task has been preempted so that stack
>> > dump code can determine whether the stack trace is reliable.
>>
>> I think I like this, but how do you handle the rather similar case in
>> which a task goes to sleep because it's waiting on IO that happened in
>> response to get_user, put_user, copy_from_user, etc?
>
> Hm, good question.  I was thinking that page faults had a dedicated
> stack, but now looking at the entry and traps code, that doesn't seem to
> be the case.
>
> Anyway I think it shouldn't be a problem if we make sure that any kernel
> function which might trigger a valid page fault (e.g.,
> copy_user_generic_string) do the proper frame pointer setup first.  Then
> the stack should still be reliable.
>
> In fact I might be able to teach objtool to enforce that: any function
> which uses an exception table should create a stack frame.
>
> Or alternatively, maybe set some kind of flag for page faults, similar
> to what I did with this patch.
>

How about doing it the other way around: teach the unwinder to detect
when it hits a non-outermost entry (i.e. it lands in idtentry, etc)
and use some reasonable heuristic as to whether it's okay to keep
unwinding.  You should be able to handle preemption like that, too --
the unwind process will end up in an IRQ frame.

--Andy
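
To make the suggestion above a bit more concrete, here is a rough
userspace toy model of it, not kernel code.  Every name in it
(toy_frame, toy_unwind, interrupted_fp_saved, and so on) is made up for
illustration, and the "heuristic" is reduced to a single flag that a
real unwinder would have to infer somehow, e.g. from the saved
registers.  The point is only that an entry frame found in the middle of
the stack (IRQ, page fault) is where the unwinder decides whether to
keep going, while the outermost entry frame is the normal end of the
stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame kinds for the toy model; these are not kernel types. */
enum frame_kind {
	FRAME_NORMAL,	/* ordinary function that set up a frame pointer */
	FRAME_ENTRY,	/* exception/IRQ/syscall entry (stand-in for idtentry etc.) */
};

struct toy_frame {
	enum frame_kind kind;
	const char *name;
	/*
	 * Only meaningful for FRAME_ENTRY: assumed answer to "had the
	 * interrupted function already saved its frame pointer?".
	 */
	bool interrupted_fp_saved;
};

struct unwind_result {
	size_t frames_walked;	/* frames visited, including the one we stopped at */
	bool reliable;		/* false if the walk hit a suspect entry frame */
};

/* frames[0] is the innermost frame, frames[n - 1] the outermost one. */
static struct unwind_result toy_unwind(const struct toy_frame *frames, size_t n)
{
	struct unwind_result res = { .frames_walked = 0, .reliable = true };

	assert(frames != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct toy_frame *f = &frames[i];
		bool outermost = (i == n - 1);

		res.frames_walked++;

		/* The outermost entry frame is the expected end of the stack. */
		if (f->kind != FRAME_ENTRY || outermost)
			continue;

		/*
		 * Non-outermost entry: the kernel itself was interrupted.
		 * Placeholder heuristic: keep unwinding only if the
		 * interrupted function had its frame set up.
		 */
		if (!f->interrupted_fp_saved) {
			res.reliable = false;
			break;
		}
	}

	return res;
}
```

With this shape, preemption and page faults go through the same check:
both show up as a non-outermost entry frame, so no per-task flag is
needed in the model.  Whether a usable heuristic exists for the real
entry code is exactly the open question.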