It's reported that the time of insmoding a klp.ko for one of our
out-tree modules is too long.
~ time sudo insmod klp.ko
real 0m23.799s
user 0m0.036s
sys 0m21.256s
Then we found the reason: klp_find_object_symbol() uses the interface
kallsyms_on_each_symbol() even for finding module symbols, so will waste
a lot of time. This patch changes it to use module_kallsyms_on_each_symbol()
for modules symbols.
After we apply this patch, the sys time reduced dramatically.
~ time sudo insmod klp.ko
real 0m1.007s
user 0m0.032s
sys 0m0.924s
Signed-off-by: Zhou Chengming <[email protected]>
---
kernel/livepatch/core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index af46438..b4b8bb0 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -182,7 +182,10 @@ static int klp_find_object_symbol(const char *objname, const char *name,
};
mutex_lock(&module_mutex);
- kallsyms_on_each_symbol(klp_find_callback, &args);
+ if (objname)
+ module_kallsyms_on_each_symbol(klp_find_callback, &args);
+ else
+ kallsyms_on_each_symbol(klp_find_callback, &args);
mutex_unlock(&module_mutex);
/*
--
1.8.3.1
Hi,
On Tue, 28 Mar 2017, Zhou Chengming wrote:
> It's reported that the time of insmoding a klp.ko for one of our
> out-tree modules is too long.
>
> ~ time sudo insmod klp.ko
> real 0m23.799s
> user 0m0.036s
> sys 0m21.256s
Is this stable through several (>=10) runs? 23 seconds are really
suspicious. Yes, there is a linear search through all the kallsyms in
kallsyms_on_each_symbol(), but there are something like 70k symbols on my
machine (that is, way less than 1M). 23 seconds are somewhat unexpected.
If it is a problem, can we fix kallsyms_on_each_symbol() and replace the
linear search with something better? All users would benefit...
Thanks,
Miroslav
On 2017/3/28 17:00, Miroslav Benes wrote:
>
> Hi,
>
> On Tue, 28 Mar 2017, Zhou Chengming wrote:
>
>> It's reported that the time of insmoding a klp.ko for one of our
>> out-tree modules is too long.
>>
>> ~ time sudo insmod klp.ko
>> real 0m23.799s
>> user 0m0.036s
>> sys 0m21.256s
>
> Is this stable through several (>=10) runs? 23 seconds are really
> suspicious. Yes, there is a linear search through all the kallsyms in
> kallsyms_on_each_symbol(), but there are something like 70k symbols on my
> machine (that is, way less than 1M). 23 seconds are somewhat unexpected.
>
Yes, it's stable through several runs.
I think the big reason is that our out-tree module used a lot of static local
variables. We can see '.rela.kpatch.dynrelas' contains many entries, so it will
waste a lot of time if we use kallsyms_on_each_symbol() to find these symbols of module.
Relocation section '.rela.kpatch.funcs' at offset 0x382e0 contains 3 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 0
000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings + 8
000000000028 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings + 0
Relocation section '.rela.kpatch.dynrelas' at offset 0x38328 contains 2562 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 14
000000000018 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings + 13
000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings + 0
000000000040 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 20
000000000058 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings + 13
000000000060 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings + 0
> If it is a problem, can we fix kallsyms_on_each_symbol() and replace the
> linear search with something better? All users would benefit...
>
Yes, it's better if we can improve the linear search, but I can't think of that...
Thanks.
> Thanks,
> Miroslav
>
> .
>
On Tue, 28 Mar 2017, zhouchengming wrote:
> On 2017/3/28 17:00, Miroslav Benes wrote:
> >
> > Hi,
> >
> > On Tue, 28 Mar 2017, Zhou Chengming wrote:
> >
> > > It's reported that the time of insmoding a klp.ko for one of our
> > > out-tree modules is too long.
> > >
> > > ~ time sudo insmod klp.ko
> > > real 0m23.799s
> > > user 0m0.036s
> > > sys 0m21.256s
> >
> > Is this stable through several (>=10) runs? 23 seconds are really
> > suspicious. Yes, there is a linear search through all the kallsyms in
> > kallsyms_on_each_symbol(), but there are something like 70k symbols on my
> > machine (that is, way less than 1M). 23 seconds are somewhat unexpected.
> >
>
> Yes, it's stable through several runs.
>
> I think the big reason is that our out-tree module used a lot of static local
> variables. We can see '.rela.kpatch.dynrelas' contains many entries, so it
> will
> waste a lot of time if we use kallsyms_on_each_symbol() to find these symbols
> of module.
Ok, it means that you have a lot of relocation records which reference
your out-of-tree module. Then for each such entry klp_resolve_symbol()
is called and then klp_find_object_symbol() to actually resolve it. So if
you have 20k entries, you walk through vmlinux kallsyms table 20k times.
It is unneeded and that is why your fix works.
But if there were 20k modules loaded, the problem would still be there.
I think it would be really nice to fix kallsyms :). Replace ordinary array
and the linear search with a hash table.
> Relocation section '.rela.kpatch.funcs' at offset 0x382e0 contains 3 entries:
> Offset Info Type Sym. Value Sym. Name +
> Addend
> 000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 0
> 000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
> + 8
> 000000000028 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
> + 0
Hm, we do not have aarch64 support in upstream (yet). There is even no
dynamic ftrace with regs yet (if I am not mistaken).
> Relocation section '.rela.kpatch.dynrelas' at offset 0x38328 contains 2562
> entries:
> Offset Info Type Sym. Value Sym. Name +
> Addend
> 000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 14
> 000000000018 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
> + 13
> 000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
> + 0
> 000000000040 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 20
> 000000000058 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
> + 13
> 000000000060 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
> + 0
>
> > If it is a problem, can we fix kallsyms_on_each_symbol() and replace the
> > linear search with something better? All users would benefit...
> >
>
> Yes, it's better if we can improve the linear search, but I can't think of
> that...
I don't understand. Fixing kallsyms is of course much more work but
everyone would benefit from that.
If there is an agreement, we could accept your solution as temporary. In
such case, please prefix the subject with 'livepatch: ' and use capital
letter in 'Reduce'. Please also improve the changelog and describe where
the problem really is.
Thanks,
Miroslav
On 2017/3/28 19:16, Miroslav Benes wrote:
> On Tue, 28 Mar 2017, zhouchengming wrote:
>
>> On 2017/3/28 17:00, Miroslav Benes wrote:
>>>
>>> Hi,
>>>
>>> On Tue, 28 Mar 2017, Zhou Chengming wrote:
>>>
>>>> It's reported that the time of insmoding a klp.ko for one of our
>>>> out-tree modules is too long.
>>>>
>>>> ~ time sudo insmod klp.ko
>>>> real 0m23.799s
>>>> user 0m0.036s
>>>> sys 0m21.256s
>>>
>>> Is this stable through several (>=10) runs? 23 seconds are really
>>> suspicious. Yes, there is a linear search through all the kallsyms in
>>> kallsyms_on_each_symbol(), but there are something like 70k symbols on my
>>> machine (that is, way less than 1M). 23 seconds are somewhat unexpected.
>>>
>>
>> Yes, it's stable through several runs.
>>
>> I think the big reason is that our out-tree module used a lot of static local
>> variables. We can see '.rela.kpatch.dynrelas' contains many entries, so it
>> will
>> waste a lot of time if we use kallsyms_on_each_symbol() to find these symbols
>> of module.
>
> Ok, it means that you have a lot of relocation records which reference
> your out-of-tree module. Then for each such entry klp_resolve_symbol()
> is called and then klp_find_object_symbol() to actually resolve it. So if
> you have 20k entries, you walk through vmlinux kallsyms table 20k times.
> It is unneeded and that is why your fix works.
>
> But if there were 20k modules loaded, the problem would still be there.
>
Yes, vmlinux kallsyms table is too big, but modules loaded are always few.
> I think it would be really nice to fix kallsyms :). Replace ordinary array
> and the linear search with a hash table.
>
>> Relocation section '.rela.kpatch.funcs' at offset 0x382e0 contains 3 entries:
>> Offset Info Type Sym. Value Sym. Name +
>> Addend
>> 000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 0
>> 000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 8
>> 000000000028 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 0
>
> Hm, we do not have aarch64 support in upstream (yet). There is even no
> dynamic ftrace with regs yet (if I am not mistaken).
>
>> Relocation section '.rela.kpatch.dynrelas' at offset 0x38328 contains 2562
>> entries:
>> Offset Info Type Sym. Value Sym. Name +
>> Addend
>> 000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 14
>> 000000000018 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 13
>> 000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 0
>> 000000000040 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 20
>> 000000000058 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 13
>> 000000000060 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 0
>>
>>> If it is a problem, can we fix kallsyms_on_each_symbol() and replace the
>>> linear search with something better? All users would benefit...
>>>
>>
>> Yes, it's better if we can improve the linear search, but I can't think of
>> that...
>
> I don't understand. Fixing kallsyms is of course much more work but
> everyone would benefit from that.
>
> If there is an agreement, we could accept your solution as temporary. In
> such case, please prefix the subject with 'livepatch: ' and use capital
> letter in 'Reduce'. Please also improve the changelog and describe where
> the problem really is.
>
Ok, if there are no oppositions, I will send a patch-v2 with improved changelog.
This is really a temporary solution, and others can go on to fix the kallsyms
if needed later.
Thanks.
> Thanks,
> Miroslav
>
> .
>
+++ Miroslav Benes [28/03/17 13:16 +0200]:
>On Tue, 28 Mar 2017, zhouchengming wrote:
>
>> On 2017/3/28 17:00, Miroslav Benes wrote:
>> >
>> > Hi,
>> >
>> > On Tue, 28 Mar 2017, Zhou Chengming wrote:
>> >
>> > > It's reported that the time of insmoding a klp.ko for one of our
>> > > out-tree modules is too long.
>> > >
>> > > ~ time sudo insmod klp.ko
>> > > real 0m23.799s
>> > > user 0m0.036s
>> > > sys 0m21.256s
>> >
>> > Is this stable through several (>=10) runs? 23 seconds are really
>> > suspicious. Yes, there is a linear search through all the kallsyms in
>> > kallsyms_on_each_symbol(), but there are something like 70k symbols on my
>> > machine (that is, way less than 1M). 23 seconds are somewhat unexpected.
>> >
>>
>> Yes, it's stable through several runs.
>>
>> I think the big reason is that our out-tree module used a lot of static local
>> variables. We can see '.rela.kpatch.dynrelas' contains many entries, so it
>> will
>> waste a lot of time if we use kallsyms_on_each_symbol() to find these symbols
>> of module.
>
>Ok, it means that you have a lot of relocation records which reference
>your out-of-tree module. Then for each such entry klp_resolve_symbol()
>is called and then klp_find_object_symbol() to actually resolve it. So if
>you have 20k entries, you walk through vmlinux kallsyms table 20k times.
>It is unneeded and that is why your fix works.
>
>But if there were 20k modules loaded, the problem would still be there.
>
>I think it would be really nice to fix kallsyms :). Replace ordinary array
>and the linear search with a hash table.
>
>> Relocation section '.rela.kpatch.funcs' at offset 0x382e0 contains 3 entries:
>> Offset Info Type Sym. Value Sym. Name +
>> Addend
>> 000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 0
>> 000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 8
>> 000000000028 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>> + 0
>
>Hm, we do not have aarch64 support in upstream (yet). There is even no
>dynamic ftrace with regs yet (if I am not mistaken).
I'm curious, how was this tested? Since there is no dynamic ftrace
with regs and no livepatch stubs (klp_arch_set_pc, etc) implemented
yet for aarch64. Also, livepatch has switched from klp_relocs/dynrelas
to .klp.rela. sections since 4.7, so I'm curious how your patch module
has a .kpatch.dynrelas section working with livepatch.
Unrelated to this patch, if there is a working aarch64 livepatch port (and
kpatch build tool, it seems) floating out there, it would be
wonderful to push that upstream :-)
Jessica
Hi,
on 2017/3/29 8:03, Jessica Yu wrote:
> +++ Miroslav Benes [28/03/17 13:16 +0200]:
>> On Tue, 28 Mar 2017, zhouchengming wrote:
>>
>>> On 2017/3/28 17:00, Miroslav Benes wrote:
>>> >
>>> > Hi,
>>> >
>>> > On Tue, 28 Mar 2017, Zhou Chengming wrote:
>>> >
>>> > > It's reported that the time of insmoding a klp.ko for one of our
>>> > > out-tree modules is too long.
>>> > >
>>> > > ~ time sudo insmod klp.ko
>>> > > real 0m23.799s
>>> > > user 0m0.036s
>>> > > sys 0m21.256s
>>> >
>>> > Is this stable through several (>=10) runs? 23 seconds are really
>>> > suspicious. Yes, there is a linear search through all the kallsyms in
>>> > kallsyms_on_each_symbol(), but there are something like 70k symbols on my
>>> > machine (that is, way less than 1M). 23 seconds are somewhat unexpected.
>>> >
>>>
>>> Yes, it's stable through several runs.
>>>
>>> I think the big reason is that our out-tree module used a lot of static local
>>> variables. We can see '.rela.kpatch.dynrelas' contains many entries, so it
>>> will
>>> waste a lot of time if we use kallsyms_on_each_symbol() to find these symbols
>>> of module.
>>
>> Ok, it means that you have a lot of relocation records which reference
>> your out-of-tree module. Then for each such entry klp_resolve_symbol()
>> is called and then klp_find_object_symbol() to actually resolve it. So if
>> you have 20k entries, you walk through vmlinux kallsyms table 20k times.
>> It is unneeded and that is why your fix works.
>>
>> But if there were 20k modules loaded, the problem would still be there.
>>
>> I think it would be really nice to fix kallsyms :). Replace ordinary array
>> and the linear search with a hash table.
>>
>>> Relocation section '.rela.kpatch.funcs' at offset 0x382e0 contains 3 entries:
>>> Offset Info Type Sym. Value Sym. Name +
>>> Addend
>>> 000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 0
>>> 000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>>> + 8
>>> 000000000028 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>>> + 0
>>
>> Hm, we do not have aarch64 support in upstream (yet). There is even no
>> dynamic ftrace with regs yet (if I am not mistaken).
>
> I'm curious, how was this tested? Since there is no dynamic ftrace
> with regs and no livepatch stubs (klp_arch_set_pc, etc) implemented
> yet for aarch64. Also, livepatch has switched from klp_relocs/dynrelas
> to .klp.rela. sections since 4.7, so I'm curious how your patch module
> has a .kpatch.dynrelas section working with livepatch.
>
> Unrelated to this patch, if there is a working aarch64 livepatch port (and
> kpatch build tool, it seems) floating out there, it would be
> wonderful to push that upstream :-)
Yeah, from 2014, we started to work on livepatch support on aarch64, and
in May 2015, we pushed the solution to the livepatch community[1] and gcc
community (mfentry feature on aarch64)[2]. And then, there were an another
gcc solution from linaro [3], which proposes to implement a new option
-fprolog-pad=N that generate a pad of N nops at the beginning of each
function, and AFAIK, Torsten Duwe from SUSE is still discussing this method
with gcc community.
At this stage, we are validating the livepatch support on aarch64 based on
aarch64 mfentry feature. When the community has a clear plan, we are happy
to make adaptation and contribute our related work to the community, including
the kpatch-build support :-)
[1] livepatch: add support on arm64
https://lkml.org/lkml/2015/5/28/54
[2] [AArch64] support -mfentry feature for arm64
https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00756.html
[3] Kernel livepatching support in GCC
https://gcc.gnu.org/ml/gcc/2015-05/msg00267.html
[4] arm64: ftrace with regs for livepatch support
http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401352.html
Thanks,
Li Bin
>
> Jessica
>
> .
>
+++ Li Bin [29/03/17 09:50 +0800]:
>Hi,
>
>on 2017/3/29 8:03, Jessica Yu wrote:
>> +++ Miroslav Benes [28/03/17 13:16 +0200]:
>>> On Tue, 28 Mar 2017, zhouchengming wrote:
>>>
>>>> On 2017/3/28 17:00, Miroslav Benes wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > On Tue, 28 Mar 2017, Zhou Chengming wrote:
>>>> >
>>>> > > It's reported that the time of insmoding a klp.ko for one of our
>>>> > > out-tree modules is too long.
>>>> > >
>>>> > > ~ time sudo insmod klp.ko
>>>> > > real 0m23.799s
>>>> > > user 0m0.036s
>>>> > > sys 0m21.256s
>>>> >
>>>> > Is this stable through several (>=10) runs? 23 seconds are really
>>>> > suspicious. Yes, there is a linear search through all the kallsyms in
>>>> > kallsyms_on_each_symbol(), but there are something like 70k symbols on my
>>>> > machine (that is, way less than 1M). 23 seconds are somewhat unexpected.
>>>> >
>>>>
>>>> Yes, it's stable through several runs.
>>>>
>>>> I think the big reason is that our out-tree module used a lot of static local
>>>> variables. We can see '.rela.kpatch.dynrelas' contains many entries, so it
>>>> will
>>>> waste a lot of time if we use kallsyms_on_each_symbol() to find these symbols
>>>> of module.
>>>
>>> Ok, it means that you have a lot of relocation records which reference
>>> your out-of-tree module. Then for each such entry klp_resolve_symbol()
>>> is called and then klp_find_object_symbol() to actually resolve it. So if
>>> you have 20k entries, you walk through vmlinux kallsyms table 20k times.
>>> It is unneeded and that is why your fix works.
>>>
>>> But if there were 20k modules loaded, the problem would still be there.
>>>
>>> I think it would be really nice to fix kallsyms :). Replace ordinary array
>>> and the linear search with a hash table.
>>>
>>>> Relocation section '.rela.kpatch.funcs' at offset 0x382e0 contains 3 entries:
>>>> Offset Info Type Sym. Value Sym. Name +
>>>> Addend
>>>> 000000000000 003300000101 R_AARCH64_ABS64 0000000000000000 value_show + 0
>>>> 000000000020 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>>>> + 8
>>>> 000000000028 000b00000101 R_AARCH64_ABS64 0000000000000000 .kpatch.strings
>>>> + 0
>>>
>>> Hm, we do not have aarch64 support in upstream (yet). There is even no
>>> dynamic ftrace with regs yet (if I am not mistaken).
>>
>> I'm curious, how was this tested? Since there is no dynamic ftrace
>> with regs and no livepatch stubs (klp_arch_set_pc, etc) implemented
>> yet for aarch64. Also, livepatch has switched from klp_relocs/dynrelas
>> to .klp.rela. sections since 4.7, so I'm curious how your patch module
>> has a .kpatch.dynrelas section working with livepatch.
>>
>> Unrelated to this patch, if there is a working aarch64 livepatch port (and
>> kpatch build tool, it seems) floating out there, it would be
>> wonderful to push that upstream :-)
>
>Yeah, from 2014, we started to work on livepatch support on aarch64, and
>in May 2015, we pushed the solution to the livepatch community[1] and gcc
>community (mfentry feature on aarch64)[2]. And then, there were an another
>gcc solution from linaro [3], which proposes to implement a new option
>-fprolog-pad=N that generate a pad of N nops at the beginning of each
>function, and AFAIK, Torsten Duwe from SUSE is still discussing this method
>with gcc community.
>
>At this stage, we are validating the livepatch support on aarch64 based on
>aarch64 mfentry feature. When the community has a clear plan, we are happy
>to make adaptation and contribute our related work to the community, including
>the kpatch-build support :-)
Thanks for the summary and update, it's very helpful. Looking forward
to those patches in the future :-)
>[1] livepatch: add support on arm64
>https://lkml.org/lkml/2015/5/28/54
>[2] [AArch64] support -mfentry feature for arm64
>https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00756.html
>[3] Kernel livepatching support in GCC
>https://gcc.gnu.org/ml/gcc/2015-05/msg00267.html
>[4] arm64: ftrace with regs for livepatch support
>http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401352.html
>
>Thanks,
>Li Bin
>