2017-03-06 19:46:24

by Tejun Heo

Subject: Re: [RFC PATCH 2/2] mm/sparse: add last_section_nr in sparse_init() to reduce some iteration cycle

Hello, Wei.

On Fri, Feb 17, 2017 at 10:12:31PM +0800, Wei Yang wrote:
> > And compare the ruling with the iteration for the loop to be (1UL <<
> > 5) and (1UL << 19).
> > The runtime is 0.00s and 0.04s respectively. The absolute value is not much.

systemd-analyze usually does a pretty good job of breaking down which
phase took how long. It might be worthwhile to test whether the
improvement is actually visible during the boot.

> >> * Do we really need to add full reverse iterator to just get the
> >> highest section number?
> >>
> >
> > You are right. After I sent out the mail, I realized just highest pfn
> > is necessary.

That said, getting efficient is always great as long as the added
complexity is justifiably small enough. If you can make the change
simple enough, it'd be a lot easier to merge.

Thanks.

--
tejun


2017-03-08 08:02:06

by Wei Yang

Subject: Re: [RFC PATCH 2/2] mm/sparse: add last_section_nr in sparse_init() to reduce some iteration cycle

On Mon, Mar 06, 2017 at 02:42:25PM -0500, Tejun Heo wrote:
>Hello, Wei.
>
>On Fri, Feb 17, 2017 at 10:12:31PM +0800, Wei Yang wrote:
>> > And compare the ruling with the iteration for the loop to be (1UL <<
>> > 5) and (1UL << 19).
>> > The runtime is 0.00s and 0.04s respectively. The absolute value is not much.
>
>systemd-analyze usually does a pretty good job of breaking down which
>phase took how long. It might be worthwhile to test whether the
>improvement is actually visible during the boot.
>

Hi, Tejun

Thanks for your suggestion. I tried systemd-analyze to measure the
effect, but the result does not look good.

Result without patch
-------------------------
Startup finished in 7.243s (kernel) + 25.034s (userspace) = 32.277s
Startup finished in 7.254s (kernel) + 19.816s (userspace) = 27.071s
Startup finished in 7.272s (kernel) + 4.363s (userspace) = 11.636s
Startup finished in 7.258s (kernel) + 24.319s (userspace) = 31.577s
Startup finished in 7.262s (kernel) + 9.481s (userspace) = 16.743s
Startup finished in 7.266s (kernel) + 14.766s (userspace) = 22.032s

Avg = 7.259s

Result with patch
-------------------------
Startup finished in 7.262s (kernel) + 14.294s (userspace) = 21.557s
Startup finished in 7.264s (kernel) + 19.519s (userspace) = 26.783s
Startup finished in 7.266s (kernel) + 4.730s (userspace) = 11.997s
Startup finished in 7.258s (kernel) + 9.514s (userspace) = 16.773s
Startup finished in 7.258s (kernel) + 14.371s (userspace) = 21.629s
Startup finished in 7.258s (kernel) + 14.627s (userspace) = 21.885s

Avg = 7.261s

It looks like the effect is not visible in the kernel boot time; the
difference between the averages is within run-to-run noise. Maybe the
improvement is not good enough :(
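
For anyone who wants to recompute the averages, here is a minimal sketch
that pulls the kernel phase out of the "Startup finished in ..." lines
above and averages it. The helper names (kernel_times, KERNEL_RE) are
made up for illustration; only the log lines come from the runs above.

```python
import re
from statistics import mean

# Matches the kernel phase, e.g. "7.243s (kernel)".
KERNEL_RE = re.compile(r"([\d.]+)s \(kernel\)")

def kernel_times(log_text):
    """Return the kernel-phase durations (seconds) found in log_text."""
    return [float(m.group(1)) for m in KERNEL_RE.finditer(log_text)]

WITHOUT_PATCH = """
Startup finished in 7.243s (kernel) + 25.034s (userspace) = 32.277s
Startup finished in 7.254s (kernel) + 19.816s (userspace) = 27.071s
Startup finished in 7.272s (kernel) + 4.363s (userspace) = 11.636s
Startup finished in 7.258s (kernel) + 24.319s (userspace) = 31.577s
Startup finished in 7.262s (kernel) + 9.481s (userspace) = 16.743s
Startup finished in 7.266s (kernel) + 14.766s (userspace) = 22.032s
"""

WITH_PATCH = """
Startup finished in 7.262s (kernel) + 14.294s (userspace) = 21.557s
Startup finished in 7.264s (kernel) + 19.519s (userspace) = 26.783s
Startup finished in 7.266s (kernel) + 4.730s (userspace) = 11.997s
Startup finished in 7.258s (kernel) + 9.514s (userspace) = 16.773s
Startup finished in 7.258s (kernel) + 14.371s (userspace) = 21.629s
Startup finished in 7.258s (kernel) + 14.627s (userspace) = 21.885s
"""

before = kernel_times(WITHOUT_PATCH)
after = kernel_times(WITH_PATCH)

avg_before = mean(before)
avg_after = mean(after)

# Spread of a single configuration, to compare against the delta.
spread_before = max(before) - min(before)

print(f"avg without patch: {avg_before:.3f}s")
print(f"avg with patch:    {avg_after:.3f}s")
print(f"delta:             {avg_after - avg_before:+.3f}s")
print(f"spread (without):  {spread_before:.3f}s")
```

The spread within the unpatched runs alone (7.243s to 7.272s) is much
larger than the difference between the two averages, so these six runs
cannot show an effect in either direction.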

>> >> * Do we really need to add full reverse iterator to just get the
>> >> highest section number?
>> >>
>> >
>> > You are right. After I sent out the mail, I realized just highest pfn
>> > is necessary.
>
>That said, getting efficient is always great as long as the added
>complexity is justifiably small enough. If you can make the change
>simple enough, it'd be a lot easier to merge.
>

Agreed.

I have replaced the reverse iteration with a function that simply returns
the last pfn. The test results above are based on the new version.
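
To illustrate the idea (not the patch itself), here is a rough sketch of
how the last pfn bounds the section loop. The constants are the usual
x86_64 values with 4-level paging and are assumptions here, as is the
4 GiB example; last_section_nr() is a hypothetical helper that only
mirrors the name in the subject, and memory holes are ignored.

```python
# Assumed x86_64 values (4-level paging); other arches differ.
PAGE_SHIFT = 12          # 4 KiB pages
SECTION_SIZE_BITS = 27   # 128 MiB per memory section
MAX_PHYSMEM_BITS = 46

PFN_SECTION_SHIFT = SECTION_SIZE_BITS - PAGE_SHIFT
NR_MEM_SECTIONS = 1 << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

def pfn_to_section_nr(pfn):
    """Section number containing the given page frame number."""
    return pfn >> PFN_SECTION_SHIFT

def last_section_nr(max_pfn):
    """Hypothetical helper: highest section number that can be present,
    given max_pfn (one past the last valid pfn)."""
    return pfn_to_section_nr(max_pfn - 1)

# Example: a machine whose memory ends at 4 GiB (an assumption).
max_pfn = (4 << 30) >> PAGE_SHIFT

full_iterations = NR_MEM_SECTIONS                  # loop over every section
bounded_iterations = last_section_nr(max_pfn) + 1  # stop at the last one

print(f"full loop:    {full_iterations} iterations")
print(f"bounded loop: {bounded_iterations} iterations")
```

Under these assumptions the loop shrinks from 1UL << 19 to 1UL << 5
iterations, which is the same pair of counts compared earlier in the
thread.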

>Thanks.
>
>--
>tejun

--
Wei Yang
Help you, Help me

