2010-08-31 19:52:24

by Tim Bird

Subject: Re: [RFC] Kernel 'boot cache' to reduce boot time

On 08/31/2010 12:13 PM, Andrew Murray wrote:
> I have a suggestion for a kernel framework which aims to reduce boot
> time in embedded devices and would be interested in hearing your
> feedback.
>
> A large portion of kernel boot time is spent in driver probe functions,
> often waiting for hardware, for example calculating LPJ values or
> trying to determine what type of camera is connected (PAL/NTSC) etc.
> However for most embedded devices the hardware remains constant and
> these probes always determine the same information. Therefore boot
> time can be decreased by removing some of this probe code and
> replacing it with known values.
>
> To some extent some of these optimisations have already been done
> through a variety of methods - for example the LPJ calculation can be
> bypassed with the lpj= parameter and some drivers have their own
> methods. My solution aims to generalise these solutions...
>
> The solution is to provide a very simple framework which will allow
> drivers to identify and record such values (LPJ, camera type, decoder
> chip version) during boot. Once booted the user can obtain a
> collection of these values and pass them back to the kernel on
> subsequent boots. During subsequent boots - drivers upon realising
> these values have already been provided can bypass some of their probe
> code, thus reducing boot time. Taking advantage of this framework
> would be very trivial for drivers.
>
> I wanted to see your views on the overall solution prior to
> considering how it could be implemented.

For my part, I think this sounds like a great idea. I have
considered such a mechanism in the past, but never gotten around
to actually designing a solution.

Here are some random thoughts on this idea:

My experience is that we've made good progress on boot time
probing, for fixed hardware. The big problems in the kernel
boot time appear to be with busses that require discovery of
devices, with long timeouts specified in the bus standard and
arbitrary bus connection architecture (I'm thinking of USB,
but other busses have similar problems). For many embedded
devices, scanning these types of pluggable busses isn't
required for what I call "first product use", but they can be
scanned and populated later.

Note that the asynchronous function call stuff by Arjan
van de Ven addresses some of this boot time probing
delay problem.

Having said that, I don't think that probing of static
hardware is a solved problem, by any means.

For the boot cache data, you are going to need to figure
out how to make the data persistent. Doing something
in a regular fashion (rather than ad-hoc via command line
options) should help with this.

To some degree this might end up looking very similar
to the "resume" path in the driver, where a particular
device state is entered into from cold start.

Sony has been doing something related to this called
"Snapshot boot" for some time now, which is kind of an
optimized unhibernate operation, with some hardware bringup
done by firmware, and some bringup done using the normal driver
resume operation. This work was presented at OLS
several years ago, but we haven't pushed it much since
then. (But we're using it in product)
See http://elinux.org/upload/3/37/Snapshot-boot-final.pdf

Sorry for rambling. Anyway - I'm all for the boot cache idea.
But acceptability would, of course, be dependent on the
details of the implementation.

The best thing to get started, IMHO, would be to identify
a few drivers which have long probe times, and see
how they could reduce these with the proposed boot cache.
If you find that each new device adds some new wrinkle
in the cache requirements, that would be a bad sign. But
if different drivers, especially drivers in different functional
areas, are found to be able to use a consistent API, then
this could be a nice feature.

BTW - I could see this tying into the flattened device
tree work by Grant Likely.
-- Tim

P.S. Also, I would recommend cross-posting to LKML
to get wider visibility of your proposal. I'm doing
so in this response - I hope that's OK.

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Network Entertainment
=============================


2010-08-31 20:15:55

by Andrew Murray

Subject: Re: [RFC] Kernel 'boot cache' to reduce boot time

Hello,

On 31 August 2010 20:52, Tim Bird <[email protected]> wrote:
> Having said that, I don't think that probing of static
> hardware is a solved problem, by any means.
>
> For the boot cache data, you are going to need to figure
> out how to make the data persistent. Doing something
> in a regular fashion (rather than ad-hoc via command line
> options) should help with this.
Yes, I wanted to see if the community was interested in the general
idea before I considered the implementation in too much detail. Though
two obvious choices spring to mind: 1) Simply serialise and print
any suitable values to the console - these can be parsed with a
bootscript much like the way you parse initcall_debug. 2) Output these
values to a proc file which the user can extract.
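
A minimal user-space sketch of the serialising step, assuming values are
held as name/value strings. The struct and function names here are
hypothetical and are not an existing kernel interface; the same text
could be printed to the console or exposed through a proc file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: one value a driver wants to remember across boots. */
struct bootcache_entry {
    const char *name;   /* e.g. "lpj" */
    const char *value;  /* already rendered as text by the driver */
};

/*
 * Write "name=value,name=value" into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small to hold the whole string.
 */
int bootcache_serialise(const struct bootcache_entry *e, size_t n,
                        char *buf, size_t len)
{
    size_t used = 0;

    assert(e != NULL && buf != NULL);
    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        /* Prefix every pair except the first with a comma. */
        int w = snprintf(buf + used, len - used, "%s%s=%s",
                         i ? "," : "", e[i].name, e[i].value);
        if (w < 0 || (size_t)w >= len - used)
            return -1;   /* output would have been truncated */
        used += (size_t)w;
    }
    return (int)used;
}
```

Refusing to emit a truncated string is a deliberate choice in this
sketch: a half-written pair fed back on the next boot would be worse
than no cache at all.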

The difficult part as I see it is passing it back to the kernel - as
you suggested, there may be some common ground with the device tree
models. Any suggestions?

I anticipate seeing name/value string pairs (or a small number of
types, e.g. int) - and leave it up to the driver to determine the best
way to serialise the data into the supported types. For example,
you may see the following pairs in a 'bootcache' file:
lpj=12313,drivers.media.video.decoder.ver=14,drivers.media.video.camera.type=pal
etc. I think the information encoded in this way would be at a much
higher level than the typical register values used in suspend/resume
code.
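
To make the format concrete, here is a small user-space sketch of how a
value could be looked up in such a comma-separated string. The function
name bootcache_lookup is hypothetical, and the format is only the one
shown in the example line above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a "name=value,name=value" string and copy its value
 * into out. Returns 0 on success, -1 if the key is absent or out is
 * too small to hold the value plus its terminating NUL.
 */
int bootcache_lookup(const char *cache, const char *key,
                     char *out, size_t outlen)
{
    size_t klen;
    const char *p = cache;

    assert(cache != NULL && key != NULL && out != NULL);
    klen = strlen(key);

    while (*p) {
        const char *end = strchr(p, ',');
        size_t plen = end ? (size_t)(end - p) : strlen(p);

        /* A match needs the whole key followed directly by '='. */
        if (plen > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
            size_t vlen = plen - klen - 1;

            if (vlen >= outlen)
                return -1;
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }

        /* Step over this pair and the comma that follows it, if any. */
        p += plen;
        if (*p == ',')
            p++;
    }
    return -1;
}
```

Requiring '=' immediately after the key means a prefix such as
"drivers.media" does not accidentally match a longer dotted name.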

>
> To some degree this might end up looking very similar
> to the "resume" path in the driver, where a particular
> device state is entered into from cold start.
I anticipated perhaps a simpler approach where instead of completely
getting rid of the probe - the probe simply skips time consuming paths
via this framework - I thought this would make it easier for device
drivers to use the framework.
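
As a rough illustration of that idea, the sketch below models a probe
that keeps its normal flow but skips the slow detection step when a
cached value is present. It is plain user-space C; the struct, the
function names and the value 14 are all made up for the example and do
not correspond to any real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache slot for one integer value (e.g. a decoder version). */
struct bootcache_int {
    bool valid;   /* true if a value was supplied from a previous boot */
    int value;
};

/* Counts how often the slow path ran, so the saving is observable. */
static int slow_probe_calls;

/* Stand-in for a slow hardware query (bus traffic, delays, timeouts). */
static int slow_detect_decoder_version(void)
{
    slow_probe_calls++;
    return 14;
}

/*
 * Probe: use the cached value when present; otherwise detect it and
 * record it so that it can be exported for the next boot.
 */
int decoder_probe(struct bootcache_int *cache)
{
    assert(cache != NULL);

    if (!cache->valid) {
        cache->value = slow_detect_decoder_version();
        cache->valid = true;
    }
    return cache->value;
}
```

On a cold boot the slot starts out invalid and the slow path runs once;
on later boots the slot arrives pre-filled and the hardware query is
never made.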

> Sorry for rambling. Anyway - I'm all for the boot cache idea.
> But acceptability would, of course, be dependent on the
> details of the implementation.
>
> The best thing to get started, IMHO, would be to identify
> a few drivers which have long probe times, and see
> how they could reduce these with the proposed boot cache.
> If you find that each new device adds some new wrinkle
> in the cache requirements, that would be a bad sign. But
> if different drivers, especially drivers in different functional
> areas, are found to be able to use a consistent API, then
> this could be a nice feature.
I agree.

>
> BTW - I could see this tying into the flattened device
> tree work by Grant Likely.
> -- Tim
>
> P.S. Also, I would recommend cross-posting to LKML
> to get wider visibility of your proposal. I'm doing
> so in this response - I hope that's OK.

Thanks for the useful information - I'll read up on those slides.

Andrew Murray

2010-09-01 02:41:09

by Grant Likely

Subject: Re: [RFC] Kernel 'boot cache' to reduce boot time

On Tue, Aug 31, 2010 at 1:52 PM, Tim Bird <[email protected]> wrote:
> On 08/31/2010 12:13 PM, Andrew Murray wrote:
>> I have a suggestion for a kernel framework which aims to reduce boot
>> time in embedded devices and would be interested in hearing your
>> feedback.
>>
>> A large portion of kernel boot time is spent in driver probe functions,
>> often waiting for hardware, for example calculating LPJ values or
>> trying to determine what type of camera is connected (PAL/NTSC) etc.
>> However for most embedded devices the hardware remains constant and
>> these probes always determine the same information. Therefore boot
>> time can be decreased by removing some of this probe code and
>> replacing it with known values.
>>
>> To some extent some of these optimisations have already been done
>> through a variety of methods - for example the LPJ calculation can be
>> bypassed with the lpj= parameter and some drivers have their own
>> methods. My solution aims to generalise these solutions...
>>
>> The solution is to provide a very simple framework which will allow
>> drivers to identify and record such values (LPJ, camera type, decoder
>> chip version) during boot. Once booted the user can obtain a
>> collection of these values and pass them back to the kernel on
>> subsequent boots. During subsequent boots - drivers upon realising
>> these values have already been provided can bypass some of their probe
>> code, thus reducing boot time. Taking advantage of this framework
>> would be very trivial for drivers.

I think we've pretty much already got this functionality, even if it
isn't fully taken advantage of by all device drivers. Any device can
be registered with either a device-specific platform_data pointer or a
device_node pointer (when CONFIG_OF is enabled). Any driver can be
adapted to use additional data from either of those sources as an
alternative to HW probing. platform_data is simply a statically
defined C structure provided by the board support code. A device_node
pointer points to the device's node in a flattened device tree data
structure that is passed to the kernel at boot time.
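
The pattern can be sketched in user-space C as follows. This is only a
model of the idea, not kernel code: camera_platform_data, camera_probe
and camera_hw_detect are hypothetical names, and a real driver would
receive its platform_data through the device it is bound to.

```c
#include <assert.h>
#include <stddef.h>

enum video_std { VIDEO_STD_UNKNOWN, VIDEO_STD_PAL, VIDEO_STD_NTSC };

/* Hypothetical board-supplied data, in the spirit of platform_data. */
struct camera_platform_data {
    enum video_std std;   /* VIDEO_STD_UNKNOWN means "please probe" */
};

/* Counts hardware detections so the skipped probe is observable. */
static int hw_detect_calls;

/* Stand-in for the slow hardware detection of the video standard. */
static enum video_std camera_hw_detect(void)
{
    hw_detect_calls++;
    return VIDEO_STD_PAL;
}

/*
 * Prefer the statically supplied value; fall back to probing the
 * hardware only when the board code did not provide one.
 */
enum video_std camera_probe(const struct camera_platform_data *pdata)
{
    enum video_std std;

    if (pdata && pdata->std != VIDEO_STD_UNKNOWN)
        std = pdata->std;
    else
        std = camera_hw_detect();

    assert(std == VIDEO_STD_PAL || std == VIDEO_STD_NTSC);
    return std;
}
```

A board file that knows its camera is NTSC would pass a structure with
std set accordingly, and the detection routine would never run.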

(It's also worth noting that a large number of device drivers in
embedded devices already don't do any form of probing, and only rely
on the static data provided at device registration time).

USB and PCI busses are the obvious cases where the kernel currently HW
probes by default, but those cases could also be modified to make use
of device tree or platform_data configuration sources. From your
description, it sounds like camera connections are in the same boat.

>> I wanted to see your views on the overall solution prior to
>> considering how it could be implemented.
>
> For my part, I think this sounds like a great idea. I have
> considered such a mechanism in the past, but never gotten around
> to actually designing a solution.
>
> Here are some random thoughts on this idea:
>
> My experience is that we've made good progress on boot time
> probing, for fixed hardware. The big problems in the kernel
> boot time appear to be with busses that require discovery of
> devices, with long timeouts specified in the bus standard and
> arbitrary bus connection architecture (I'm thinking of USB,
> but other busses have similar problems). For many embedded
> devices, scanning these types of pluggable busses isn't
> required for what I call "first product use", but they can be
> scanned and populated later.

There actually already are bindings and some support code for
describing fixed PCI devices in a flattened device tree. USB is a bit
harder though, but fortunately USB probing can often be deferred until
later in the boot (as you rightly pointed out) after userspace has
started.

>
> Note that the asynchronous function call stuff by Arjan
> van de Ven addresses some of this boot time probing
> delay problem.
>
> Having said that, I don't think that probing of static
> hardware is a solved problem, by any means.
>
> For the boot cache data, you are going to need to figure
> out how to make the data persistent. Doing something
> in a regular fashion (rather than ad-hoc via command line
> options) should help with this.

I'm not convinced that doing a "live" capture is the best approach.
I'd rather see that kind of data obtained by the developer and
explicitly specified either in a device tree or the platform support
code.

If live capture is required, then doing an IPL-style boot, suspend &
dump (like you described below) is probably a more robust approach.
Otherwise you have to figure out how to pick and choose items of data
out of a running kernel and inject them in again at the next boot
(blech!).

> To some degree this might end up looking very similar
> to the "resume" path in the driver, where a particular
> device state is entered into from cold start.
>
> Sony has been doing something related to this called
> "Snapshot boot" for some time now, which is kind of an
> optimized unhibernate operation, with some hardware bringup
> done by firmware, and some bringup done using the normal driver
> resume operation. This work was presented at OLS
> several years ago, but we haven't pushed it much since
> then. (But we're using it in product)
> See http://elinux.org/upload/3/37/Snapshot-boot-final.pdf

I remember hearing very positive things about that presentation (and
being sorry that I missed it) that year at OLS. I'm pretty intrigued
by the approach and would like to see more research done in that area.

> Sorry for rambling. Anyway - I'm all for the boot cache idea.
> But acceptability would, of course, be dependent on the
> details of the implementation.
>
> The best thing to get started, IMHO, would be to identify
> a few drivers which have long probe times, and see
> how they could reduce these with the proposed boot cache.
> If you find that each new device adds some new wrinkle
> in the cache requirements, that would be a bad sign. But
> if different drivers, especially drivers in different functional
> areas, are found to be able to use a consistent API, then
> this could be a nice feature.
>
> BTW - I could see this tying into the flattened device
> tree work by Grant Likely.

http://www.devicetree.org/Device_Tree_Usage

g.