2010-02-20 18:31:22

by Enrico Weigelt

Subject: Dynamic linking in the kernel

Hi folks,

just some naive thoughts on dynamic linking:

Starting up a dynamically linked executable tends to need a lot
of syscalls. A kernel-based dynamic linker could cache a lot of
relocation data (e.g. when the same binary is called many times),
share pages even without mmap(), and the ldstub wouldn't be needed
anymore.

At that point we could perhaps also create a new binfmt that is
tailored to efficiency (much simpler than ELF).


What do you think about this idea?


cu
--
----------------------------------------------------------------------
Enrico Weigelt, metux IT service -- http://www.metux.de/

cellphone: +49 174 7066481 email: [email protected] skype: nekrad666
----------------------------------------------------------------------
Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
----------------------------------------------------------------------


2010-02-20 20:57:42

by David Miller

Subject: Re: Dynamic linking in the kernel

From: Enrico Weigelt <[email protected]>
Date: Sat, 20 Feb 2010 19:22:40 +0100

> Starting up a dynamically linked executable tends to need a lot
> of syscalls. A kernel-based dynamic linker could cache a lot of
> relocation data (e.g. when the same binary is called many times),
> share pages even without mmap(), and the ldstub wouldn't be needed
> anymore.

This is not practical.

In order to implement this, the kernel would also have to save a copy
of every piece of the process's environment and compare all of the
environment variable settings on every execution. This would be
needed to handle things like LD_PRELOAD, LD_LIBRARY_PATH, and LD_DEBUG,
to name just three examples.

What's more, any filesystem change affecting the shared libraries,
the executable, or the dynamic linker would have to be monitored as
well.

Really, this is not a good idea. The cost is only ~3 system calls
per shared library, and considering the amount of flexibility we get in
return, it's not that bad at all.

Actually, the more expensive part of shared libraries is the page
faults from filling in the relocations, and we already have a mechanism
to save that cost: it's called 'prelink'.

2010-02-22 05:44:15

by Enrico Weigelt

Subject: Re: Dynamic linking in the kernel

David Miller wrote:

> In order to implement this, the kernel would also have to save a copy
> of every piece of the process's environment and compare all of the
> environment variable settings on every execution. This would be
> needed to handle things like LD_PRELOAD, LD_LIBRARY_PATH, and LD_DEBUG,
> to name just three examples.

Fairly simple:

* parsed per-module data is cached by its inode id
* cached data that can be influenced by LD_PRELOAD/LD_LIBRARY_PATH
  (e.g. the mapping of library names to actual filenames or inode ids)
  is cached on a hash of these variables plus the inode id
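
For illustration, the two-level keying described above could be sketched
roughly as follows. This is user-space Python, not kernel code; LinkCache,
env_hash, file_id and the parse/resolve callbacks are hypothetical names
introduced here. The sketch also assumes the device number has to be part
of the key, since inode numbers are only unique within one filesystem.

```python
import hashlib
import os

# Variables assumed to influence library resolution (taken from this
# thread; a real dynamic linker honours more than these two).
LINKER_VARS = ("LD_PRELOAD", "LD_LIBRARY_PATH")


def env_hash(env):
    """Hash only the linker-relevant variables, in a fixed order.

    An unset variable is treated like an empty one (an assumption).
    """
    h = hashlib.sha256()
    for name in LINKER_VARS:
        value = env.get(name, "")
        h.update(name.encode() + b"=" + value.encode() + b"\0")
    return h.hexdigest()


def file_id(path):
    """Identify a file by (device, inode) rather than by its path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


class LinkCache:
    def __init__(self):
        self.per_module = {}  # file_id -> parsed per-module data
        self.resolution = {}  # (file_id, env_hash) -> name-to-file mapping

    def get_module(self, path, parse):
        """Return parsed data for a module, parsing it at most once."""
        key = file_id(path)
        if key not in self.per_module:
            self.per_module[key] = parse(path)
        return self.per_module[key]

    def get_resolution(self, path, env, resolve):
        """Return environment-dependent data, keyed on file id plus env hash."""
        key = (file_id(path), env_hash(env))
        if key not in self.resolution:
            self.resolution[key] = resolve(path, env)
        return self.resolution[key]
```

With this layout, two executions that differ only in unrelated variables
(say HOME) share one resolution entry, while a different LD_LIBRARY_PATH
produces a separate one.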

> What's more, any filesystem change affecting the shared libraries,
> the executable, or the dynamic linker would have to be monitored as
> well.

What could go wrong?

a) overwriting a currently mapped-in library: this applies to
   the traditional approach as well. Write-locking (w/o locking
   against removal, of course ;-)) might help.
b) filesystems could get remounted while modules are already cached:
   that (IMHO) changes the inode ids as well, so it would not affect
   the inode-id based cache lookups
c) permissions could get changed: either use the inode data we get from
   the file lookups (which we most likely won't get rid of) or use inotify.
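
The first option in (c), reusing the inode data from the lookup, might look
something like the sketch below. It is user-space Python for illustration
only; fingerprint and ValidatedCache are hypothetical names, and the choice
of stat fields (device, inode, mtime, size, mode) is an assumption rather
than something specified in this thread. An inotify-based variant would
instead drop entries when change events arrive.

```python
import os


def fingerprint(path):
    """Collect the stat fields used to decide whether an entry is stale."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_mtime_ns, st.st_size, st.st_mode)


class ValidatedCache:
    def __init__(self):
        self._entries = {}  # path -> (fingerprint, cached data)

    def lookup(self, path, build):
        """Return cached data, rebuilding it if the file's stat data changed."""
        fp = fingerprint(path)
        hit = self._entries.get(path)
        if hit is not None and hit[0] == fp:
            return hit[1]
        data = build(path)
        self._entries[path] = (fp, data)
        return data
```

The stat call is needed for the file lookup anyway, so the comparison adds
little on top of it; the open question is whether these fields catch every
relevant change.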

> Really, this is not a good idea. The cost is only ~3 system calls
> per shared library, and considering the amount of flexibility we get in
> return, it's not that bad at all.

It's worth much more than that:

a) the ability to cache a lot of data that now has to be parsed/computed
   on every single exec (e.g. dependencies, symbol tables, etc.)
b) sharing pages even when mmap() is not available


cu

2010-02-22 06:21:08

by David Miller

Subject: Re: Dynamic linking in the kernel

From: Enrico Weigelt <[email protected]>
Date: Mon, 22 Feb 2010 06:43:27 +0100

> David Miller wrote:
>
>> In order to implement this, the kernel would also have to save a copy
>> of every piece of the process's environment and compare all of the
>> environment variable settings on every execution. This would be
>> needed to handle things like LD_PRELOAD, LD_LIBRARY_PATH, and LD_DEBUG,
>> to name just three examples.
>
> Fairly simple:
>
> * parsed per-module data is cached by its inode id
> * cached data that can be influenced by LD_PRELOAD/LD_LIBRARY_PATH
>   (e.g. the mapping of library names to actual filenames or inode ids)
>   is cached on a hash of these variables plus the inode id

Feel free to implement this and show us the numbers.

I am not as confident as you :-)