Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964804AbbLRS7v (ORCPT ); Fri, 18 Dec 2015 13:59:51 -0500 Received: from terminus.zytor.com ([198.137.202.10]:57856 "EHLO mail.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932502AbbLRS7u (ORCPT ); Fri, 18 Dec 2015 13:59:50 -0500 User-Agent: K-9 Mail for Android In-Reply-To: <20151218185040.GN20409@wotan.suse.de> References: <1450217797-19295-1-git-send-email-mcgrof@do-not-panic.com> <56731D32.4040900@zytor.com> <20151217234625.GM20409@wotan.suse.de> <56738AAF.2080601@zytor.com> <20151218185040.GN20409@wotan.suse.de> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Subject: Re: [RFC v1 0/8] x86/init: Linux linker tables From: "H. Peter Anvin" Date: Fri, 18 Dec 2015 10:58:50 -0800 To: "Luis R. Rodriguez" CC: "Luis R. Rodriguez" , tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, konrad.wilk@oracle.com, rusty@rustcorp.com.au, luto@amacapital.net, boris.ostrovsky@oracle.com, mcb30@ipxe.org, jgross@suse.com, JBeulich@suse.com, joro@8bytes.org, ryabinin.a.a@gmail.com, andreyknvl@google.com, long.wanglong@huawei.com, qiuxishi@huawei.com, aryabinin@virtuozzo.com, mchehab@osg.samsung.com, valentinrothberg@gmail.com, peter.senna@gmail.com, x86@kernel.org, Michal Marek , xen-devel@lists.xensource.com, Michael Matz , linux-kernel@vger.kernel.org Message-ID: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7015 Lines: 199 On December 18, 2015 10:50:40 AM PST, "Luis R. Rodriguez" wrote: >On Thu, Dec 17, 2015 at 08:25:19PM -0800, H. Peter Anvin wrote: >> On 12/17/15 15:46, Luis R. Rodriguez wrote: >> > >> > I explain why I do that there but the gist of it is that on Linux >we may also >> > want stronger semantics for specific linker table solutions, and >solutions such >> > as those devised on the IOMMU init stuff do memmove() for sorting >depending on >> > semantics defined (in the simplest case here so far dependency >between init >> > sequences), this makes each set of sequences very subsystem >specific. An issue >> > with *one* subsystem could make things really bad for others. I >thought about >> > this quite a bit and figured its best left to the subsystem >maintainers to >> > decide. >> > >> >> A table that needs sorting or other runtime handling is just a >> read-write table for the purpose of the linker table construct. It >> presents to C as an array of initialized data. > >Sure but what I was getting at was that since some run time changes to >the >table *might* be desirable, depending on the subsystem, the subsystem >table needs >to be able to know the start and end address of its table, and that's a >linker script change. But come to think of it, that was me just being >pedantic >and careful, I'll try even a run time sort with a few tables of >different >structure size to ensure its both possible to just declare them in C >and also >sort them without a linker script change. > >> > Perhaps a new sections.h file (you tell me) which documents the >different >> > section components: >> > >> > /* document this *really* well */ >> > #define SECTION_RODATA ".rodata" >> > #define SECTION_INIT ".init" >> > #define SECTION_INIT_RODATA ".init_rodata" >> > #define SECTION_READ_MOSTLY ".read_mostly" >> > >> > Then on tables.h we add the section components support: >> >> Yes, something like that. How to macroize it cleanly is another >matter; >> we may want to use slightly different conventions that iPXE to match >our >> own codebase. > >Sure, the style below is from iPXE, the one in the patch set matches >the >Linux coding style. Other than that I'd welcome feedback on any issues >or recommendations with style on our proposed version of tables.h >Seems you provided some pointers below, so I'll go try to incorporate >those suggestions. > >> > #define __table(component, type, name) (component, type, name) >> > >> > #define __table_component(table) __table_extract_component table > >> > #define __table_extract_component(component, type, name) component >> > >> > #define __table_type(table) __table_extract_type table > >> > #define __table_extract_type(component, type, name) type >> > >> > #define __table_name(table) __table_extract_name table > >> > #define __table_extract_name(component, type, name) name >> > >> > #define __table_str(x) #x >> > >> > #define __table_section(table, idx) \ > >> > "." __table_component (table) ".tbl." __table_name (table) >"." __table_str (idx) >> > >> > #define __table_entry(table, idx) > \ >> > __attribute__ ((__section__(__table_section(table, idx)), > \ >> > __aligned__(__table_alignment(table)))) >> > >> > A user could then be something as follows: >> > >> > #define X86_INIT_FNS __table(SECTION_INIT, struct x86_init_fn, >"x86_init_fns") >> > #define __x86_init_fn(order_level) __table_entry(X86_INIT_FNS, >order_level) >> >> Yes, but in particular the common case of function initialization >tables >> should be generic. >> >> I'm kind of thinking a syntax like this: >> >> DECLARE_LINKTABLE_RO(struct foo, tablename); >> DEFINE_LINKTABLE_RO(struct foo, tablename); >> LINKTABLE_RO(tablename,level) = /* contents */; >> LINKTABLE_SIZE(tablename) >> >> ... which would turn into something like this once it goes through >all >> the preprocessing phases >> >> /* DECLARE_LINKTABLE_RO */ >> extern const struct foo tablename[], tablename__end[]; >> >> /* DEFINE_LINKTABLE_RO */ >> DECLARE_LINKTABLE_RO(struct foo, tablename); >> >> const struct >> foo__attribute__((used,section(".rodata.tbl.tablename.0"))) >tablename[0]; >> >> const struct >> foo__attribute__((used,section(".rodata.tbl.tablename.999"))) >> tablename__end[0]; >> >> /* LINKTABLE_RO */ >> static const __typeof__(tablename) >> __attribute__((used,section(".rodata.tbl.tablename.50"))) >> __tbl_tablename_12345 >> >> /* LINKTABLE_SIZE */ >> ((tablename__end) - (tablename)) >> >> ... and so on for all the possible sections where we may want tables. > >OK thanks so much! Will try working with that. > >> Note: I used 0 and 999 above since they sort before and after all >> possible 2-digit decimal numbers, but that's just cosmetic. > >Ah neat, so we could still use the two digits 99 and 00 for order >level if we wanted then. > >> > If that's what you mean? >> > >> > I'm a bit wary about having the linker sort any of the above >SECTION_*'s, but >> > if we're happy to do that perhaps a simple first step might be to >see if 0-day >> > but would be happy with just the sort without any consequences to >any >> > architecture. Thoughts? >> >> I don't see what is dangerous about it. The section names are such >that >> a lexographical sort will do the right thing, and we can simply use >> SORT(.rodata.tbl.*) in the linker script, for example. > >OK I'll leave it up to you, I'll go test sorting the sections broadly >first. > >> >> The other thing is to take a >> >> clue from the implementation in iPXE, which uses priority levels >00 and >> >> 99 (or we could use non-integers which sort appropriately instead >of >> >> using "real" levels) to contain the start and end symbols, which >> >> eliminates any need for linker script modifications to add new >tables. >> > >> > This solution uses that as well. The only need for adding custom >sections >> > is when they have a requirement for a custom run time sort, and >also to >> > ensure they don't cause regressions on other subsystems if they >have a buggy >> > sort. The run time sorting is all subsystem specific and up to >their own >> > semantics. >> >> Again, from a linker table POV this is nothing other than a >read-write >> table; there is a runtime function that then operates on that >read-write >> table. > >OK, I'm convinced, I'll go try these changes now. > > Luis No, start and end symbols are provided by having zero-length sections which sort at the very beginning and very end. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/