Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751419AbdILLSk (ORCPT ); Tue, 12 Sep 2017 07:18:40 -0400
Received: from mx2.suse.de ([195.135.220.15]:47018 "EHLO mx1.suse.de"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1751289AbdILLSj (ORCPT ); Tue, 12 Sep 2017 07:18:39 -0400
Date: Tue, 12 Sep 2017 13:18:36 +0200
From: Petr Mladek
To: Helge Deller
Cc: Sergey Senozhatsky, "Luck, Tony",
	"linux-kernel@vger.kernel.org", Sergey Senozhatsky,
	Andrew Morton, "Yu, Fenghua", Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman
Subject: Re: [PATCH 00/14] Fix wrong %pF and %pS printk format specifier usages
Message-ID: <20170912111836.GD2908@pathway.suse.cz>
References: <8b93f9ca-95f6-4e40-1cc8-d1a65833abff@gmx.de>
	<20170907075653.GA533@jagdpanzerIV.localdomain>
	<20170907083207.GC533@jagdpanzerIV.localdomain>
	<667b8849-fb60-a312-2483-505252ff737e@gmx.de>
	<20170907093631.GD533@jagdpanzerIV.localdomain>
	<20170907095119.GE533@jagdpanzerIV.localdomain>
	<0604f27e-24ab-625b-9013-c6c0f4f6acc1@gmx.de>
	<3908561D78D1C84285E8C5FCA982C28F6136C2ED@ORSMSX114.amr.corp.intel.com>
	<20170908061830.GA496@jagdpanzerIV.localdomain>
	<6fdd62aa-e9e7-8954-da6b-6fa5e73983c5@gmx.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <6fdd62aa-e9e7-8954-da6b-6fa5e73983c5@gmx.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3428
Lines: 93

On Fri 2017-09-08 22:49:51, Helge Deller wrote:
> On 08.09.2017 08:18, Sergey Senozhatsky wrote:
> > On (09/07/17 16:05), Luck, Tony wrote:
> > [..]
> >>>> 	if (not_a_function_descriptor(ptr))
> >>>> 		return ptr;
> >>>
> >>> I'm not sure if it's possible on ia64/ppc64/parisc64
> >>> to reliably detect if it's a function descriptor or not.
> >>
> >> Agreed.
> >> I don't know how to write this test (without changing the compiler to
> >> put the pointers in a separate section ... and then changing the module loader
> >> to keep a list of all these sections).
> >
> > let me try one more time :)
> >
> > so below is a number of assumptions, let me know if anything is wrong
> > there.... and let's try to fix the "wrong bits" ;)
> >
> >
> > RFC
> >
> >
> > 1) function descriptor table is in .data, not in .text
> >    correct?
> >
> > 2) symbol resolution consists of 3 steps:
> >
> >    a) we check if this is a kernel symbol and resolve it if so
> >    b) we check if the addr belongs to any module and resolve the addr
> >       if so
> >    c) we check if the addr is bpf and resolve it if so. let's skip this part.
> >
> >
> > so, for (a) we probably can do something like below. can't we?
> > // not tested, as usual.
> >
> >
> > so there are probably some broken parts there. like...
> > I don't know. something.
> >
> > so - what is broken, and how can we fix/tweak it? help me out.
>
> Sergey, I'm sure there is a way how you can get it somehow to work the way
> you describe above, but even then nobody can guarantee you that it
> will work in 100% of the cases.

It seems that dereferencing an invalid function descriptor is
rather safe because probe_kernel_address() prevents crashes.

The question is whether the autodetection could give wrong results.
The following possibilities come to my mind:

First, the variable used to store the function descriptor might be
on the stack and not initialized. Then there is a non-trivial
chance that the garbage on the stack will be a real return address
to an existing function. The autodetection would then help
to hide this bug.

Second, I wonder if the address of the function descriptor
might be in kallsyms as well. Note that global variables
are in kallsyms too. Then we would always print the name
of this variable instead of the function.

I do not have a strong opinion here. On one hand, it is clear
that %pS and %pF are often misused.
But I am not sure if the above possible problems
are acceptable.


> It's somehow like "we have %lu and %c specifiers, and it's basically
> the same, so let's try to figure out at runtime which one should be
> used based on analysis of what was given as argument".
> It may work somehow, but not always.

I am not sure if I am missing something. But the different output
of %lu and %c should be easy to distinguish. Also the difference
is the same on all architectures and should be well known.
This is not true for the %pS vs. %pF specifiers.


> What about the idea of a %luS specifier (or something other) ?

I am not a big fan of this. IMHO, the relation between a pointer
and a symbol name makes more sense than a relation between
an unsigned long and a symbol name. IMHO, this would just
add even more confusion.

Best Regards,
Petr

PS: I wonder if improved documentation and fixing all
occurrences might be enough to reduce this mistake. I guess that
most of them were caused by copying the same pattern from
already broken code.
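
To make the two concerns about autodetection more concrete, here is
a minimal userspace sketch. It is not kernel code and it is not the
patch discussed in this thread. Everything in it is a hypothetical
stand-in: struct func_desc, struct sym, HANDLER_TEXT, and the
resolve_*() helpers are made up for illustration, the symbol table
only imitates kallsyms, and the safe dereference that
probe_kernel_address() would provide is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout; real layouts are per-architecture. */
struct func_desc {
	uintptr_t entry;	/* address of the function text */
	uintptr_t toc;		/* per-function data pointer, unused here */
};

/* Hypothetical stand-in for one kallsyms entry. */
struct sym {
	uintptr_t start;
	size_t size;
	const char *name;
};

#define HANDLER_TEXT	((uintptr_t)0x1000)	/* made-up text address */

/* A global descriptor; like any global variable it has its own symbol. */
static struct func_desc handler_desc = { HANDLER_TEXT, 0 };

/* Fill a two-entry table: the function text and the descriptor variable. */
static size_t build_table(struct sym *tab)
{
	tab[0] = (struct sym){ HANDLER_TEXT, 0x100, "handler" };
	tab[1] = (struct sym){ (uintptr_t)&handler_desc,
			       sizeof(handler_desc), "handler_desc" };
	return 2;
}

/* Return the name of the symbol covering addr, or NULL. */
static const char *lookup(const struct sym *tab, size_t n, uintptr_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= tab[i].start && addr - tab[i].start < tab[i].size)
			return tab[i].name;
	return NULL;
}

/* %pS-like behavior: resolve the pointer value as it is. */
static const char *resolve_direct(const struct sym *tab, size_t n,
				  const void *ptr)
{
	return lookup(tab, n, (uintptr_t)ptr);
}

/* %pF-like behavior: treat the pointer as a descriptor, dereference first. */
static const char *resolve_via_desc(const struct sym *tab, size_t n,
				    const void *ptr)
{
	const struct func_desc *desc = ptr;

	assert(desc != NULL);
	return lookup(tab, n, desc->entry);
}

/* Naive autodetection: if the value resolves directly, trust that result. */
static const char *resolve_auto(const struct sym *tab, size_t n,
				const void *ptr)
{
	const char *name = resolve_direct(tab, n, ptr);

	return name ? name : resolve_via_desc(tab, n, ptr);
}
```

With a table filled by build_table(), resolve_via_desc() on
&handler_desc finds "handler", but resolve_auto() on the same pointer
stops at "handler_desc" because the descriptor variable itself has
a symbol. That is the second concern above. For the first one, passing
a stray value such as HANDLER_TEXT + 0x10 where a descriptor was
expected makes resolve_auto() report "handler" without any complaint,
so the wrong argument stays hidden.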