Hi list,
In 2.6.30-rc8, the size of /proc/kcore on x86_64 is unreasonably large:
281474974617600.
On an x86 box it is 931131392, which looks sane.
[root@test8 ~]# ll /proc/kcore
-r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
[root@ocfs2-test9 ~]$ ll /proc/kcore
-r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
I only noticed this because kexec fails with "Can't find kernel text map area
from kcore".
Is there something wrong?
Regards,
Tao
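
(Editorial aside: the two sizes reported above are easier to compare in hexadecimal. The following is a minimal Python sketch that only re-expresses the numbers from the `ll` output; the constant and function names are illustrative and not part of any kernel or kexec interface.)

```python
# Sizes copied from the `ll /proc/kcore` output in the message above.
KCORE_SIZE_X86_64 = 281474974617600
KCORE_SIZE_X86 = 931131392


def describe(size: int) -> str:
    """Return the size in hex together with its value in GiB."""
    return f"{size:#x} ({size / 2**30:.2f} GiB)"


print("x86_64:", describe(KCORE_SIZE_X86_64))
print("x86:   ", describe(KCORE_SIZE_X86))

# How far the x86_64 size is from the top of a 48-bit address space.
print("2**48 - size =", hex(2**48 - KCORE_SIZE_X86_64))
```

The x86_64 value sits just below 2**48, which suggests it is derived from masked virtual addresses rather than from the amount of installed RAM. That reading is an inference from the arithmetic, not something stated in the report.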
On Fri, 05 Jun 2009 12:03:52 +0800 Tao Ma <[email protected]> wrote:
> Hi list,
> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable large
> to be 281474974617600.
> While in a x86 box, it is 931131392 which looks sane.
>
> [root@test8 ~]# ll /proc/kcore
> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>
> [root@ocfs2-test9 ~]$ ll /proc/kcore
> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>
> I just noticed this when kexec fails in "Can't find kernel text map area
> from kcore".
>
> Is there something wrong?
>
fs/proc/kcore.c hasn't changed since October last year. Was 2.6.29 OK?
Earlier kernels?
Thanks.
On Fri, Jun 05, 2009 at 12:03:52PM +0800, Tao Ma wrote:
> Hi list,
> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable large
> to be 281474974617600.
> While in a x86 box, it is 931131392 which looks sane.
>
> [root@test8 ~]# ll /proc/kcore
> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>
> [root@ocfs2-test9 ~]$ ll /proc/kcore
> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
Hmm, what is your physical RAM size on test8?
/proc/kcore looks fine on my x86_64 box.
>
> I just noticed this when kexec fails in "Can't find kernel text map area
> from kcore".
>
> Is there something wrong?
It looks like that error message is from userspace?
Amerigo Wang wrote:
> On Fri, Jun 05, 2009 at 12:03:52PM +0800, Tao Ma wrote:
>> Hi list,
>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable large
>> to be 281474974617600.
>> While in a x86 box, it is 931131392 which looks sane.
>>
>> [root@test8 ~]# ll /proc/kcore
>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>
>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>
> Hmm, what is your physical RAM size on test8?
> /proc/kcore looks fine on my x86_64 box.
Only 4G.
>
>> I just noticed this when kexec fails in "Can't find kernel text map area
>> from kcore".
>>
>> Is there something wrong?
>
> It looks like that error message is from userspace?
I just started kdump and got the error message.
Regards,
Tao
On Fri, Jun 05, 2009 at 02:07:58PM +0800, Tao Ma wrote:
>
>
> Amerigo Wang wrote:
>> On Fri, Jun 05, 2009 at 12:03:52PM +0800, Tao Ma wrote:
>>> Hi list,
>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable
>>> large to be 281474974617600.
>>> While in a x86 box, it is 931131392 which looks sane.
>>>
>>> [root@test8 ~]# ll /proc/kcore
>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>>
>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>
>> Hmm, what is your physical RAM size on test8?
>> /proc/kcore looks fine on my x86_64 box.
> Only 4G.
Hmm, my x86_64 box has 8G of memory and the size of kcore looks much
saner than the huge number above, but it is still wrong
according to what the man page describes...
Please do what Andrew said, it will be helpful.
>>
>>> I just noticed this when kexec fails in "Can't find kernel text map
>>> area from kcore".
>>>
>>> Is there something wrong?
>>
>> It looks like that error message is from userspace?
> I just started kdump and get the error message.
IIRC, kdump should use /proc/vmcore rather than /proc/kcore,
so the two should be unrelated.
Thanks.
Amerigo Wang wrote:
> On Fri, Jun 05, 2009 at 02:07:58PM +0800, Tao Ma wrote:
>>
>> Amerigo Wang wrote:
>>> On Fri, Jun 05, 2009 at 12:03:52PM +0800, Tao Ma wrote:
>>>> Hi list,
>>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable
>>>> large to be 281474974617600.
>>>> While in a x86 box, it is 931131392 which looks sane.
>>>>
>>>> [root@test8 ~]# ll /proc/kcore
>>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>>>
>>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>> Hmm, what is your physical RAM size on test8?
>>> /proc/kcore looks fine on my x86_64 box.
>> Only 4G.
>
> Hmm, my x86_box has 8G mem, the size of kcore looks much
> saner than the above huge number, but it is still wrong
> according to what the man page describes...
>
> Please do what Andrew said, it will be helpful.
>
>>>> I just noticed this when kexec fails in "Can't find kernel text map
>>>> area from kcore".
>>>>
>>>> Is there something wrong?
>>> It looks like that error message is from userspace?
>> I just started kdump and get the error message.
>
> IIRC, kdump should use /proc/vmcore, instead of /proc/kcore...
> nothing related.
In el5, starting the kdump service runs something like:
/sbin/kexec --args-linux -p '--command-line=ro root=LABEL=/ rhgb quiet
irqpoll maxcpus=1' --initrd=/boot/initrd-2.6.18-53.el5kdump.img
/boot/vmlinuz-2.6.18-53.el5
And the error message is from there.
Regards,
Tao
Andrew Morton wrote:
> On Fri, 05 Jun 2009 12:03:52 +0800 Tao Ma <[email protected]> wrote:
>
>> Hi list,
>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable large
>> to be 281474974617600.
>> While in a x86 box, it is 931131392 which looks sane.
>>
>> [root@test8 ~]# ll /proc/kcore
>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>
>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>
>> I just noticed this when kexec fails in "Can't find kernel text map area
>> from kcore".
>>
>> Is there something wrong?
>>
>
> fs/proc/kcore.c hasn't changed since October last year. Was 2.6.29 OK?
> Earlier kernels?
With 2.6.29, ls shows the same output.
[root@test8 ~]# ll /proc/kcore
-r-------- 1 root root 281474974617600 Jun 5 14:35 /proc/kcore
But kexec works there.
I just checked .28; it behaves the same as .29.
Regards,
Tao
On Fri, Jun 05, 2009 at 02:59:46PM +0800, Tao Ma wrote:
>
>
> Andrew Morton wrote:
>> On Fri, 05 Jun 2009 12:03:52 +0800 Tao Ma <[email protected]> wrote:
>>
>>> Hi list,
>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable
>>> large to be 281474974617600.
>>> While in a x86 box, it is 931131392 which looks sane.
>>>
>>> [root@test8 ~]# ll /proc/kcore
>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>>
>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>>
>>> I just noticed this when kexec fails in "Can't find kernel text map
>>> area from kcore".
>>>
>>> Is there something wrong?
>>>
>>
>> fs/proc/kcore.c hasn't changed since October last year. Was 2.6.29 OK?
>> Earlier kernels?
> with 2.6.29, ls shows the same output.
> [root@test8 ~]# ll /proc/kcore
> -r-------- 1 root root 281474974617600 Jun 5 14:35 /proc/kcore
Thanks.
It looks like the value of 'high_memory' is insane..
Can you get its value on your machine? You can add a printk() or use
systemtap etc..
On Fri, Jun 05, 2009 at 02:56:36PM +0800, Tao Ma wrote:
>
>
> Amerigo Wang wrote:
>> On Fri, Jun 05, 2009 at 02:07:58PM +0800, Tao Ma wrote:
>>>
>>> Amerigo Wang wrote:
>>>> On Fri, Jun 05, 2009 at 12:03:52PM +0800, Tao Ma wrote:
>>>>> Hi list,
>>>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable
>>>>> large to be 281474974617600.
>>>>> While in a x86 box, it is 931131392 which looks sane.
>>>>>
>>>>> [root@test8 ~]# ll /proc/kcore
>>>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>>>>
>>>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>>>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>>> Hmm, what is your physical RAM size on test8?
>>>> /proc/kcore looks fine on my x86_64 box.
>>> Only 4G.
>>
>> Hmm, my x86_box has 8G mem, the size of kcore looks much
>> saner than the above huge number, but it is still wrong
>> according to what the man page describes...
>>
>> Please do what Andrew said, it will be helpful.
>>
>>>>> I just noticed this when kexec fails in "Can't find kernel text
>>>>> map area from kcore".
>>>>>
>>>>> Is there something wrong?
>>>> It looks like that error message is from userspace?
>>> I just started kdump and get the error message.
>>
>> IIRC, kdump should use /proc/vmcore, instead of /proc/kcore...
>> nothing related.
> in el5, when start kdump service it will do something like
> /sbin/kexec --args-linux -p '--command-line=ro root=LABEL=/ rhgb quiet
> irqpoll maxcpus=1' --initrd=/boot/initrd-2.6.18-53.el5kdump.img
> /boot/vmlinuz-2.6.18-53.el5
>
> And the error message is from there.
From /sbin/kexec? I just checked the source code of kexec-tools,
I haven't found that message...
Amerigo Wang wrote:
> On Fri, Jun 05, 2009 at 02:59:46PM +0800, Tao Ma wrote:
>>
>> Andrew Morton wrote:
>>> On Fri, 05 Jun 2009 12:03:52 +0800 Tao Ma <[email protected]> wrote:
>>>
>>>> Hi list,
>>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable
>>>> large to be 281474974617600.
>>>> While in a x86 box, it is 931131392 which looks sane.
>>>>
>>>> [root@test8 ~]# ll /proc/kcore
>>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>>>
>>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>>>
>>>> I just noticed this when kexec fails in "Can't find kernel text map
>>>> area from kcore".
>>>>
>>>> Is there something wrong?
>>>>
>>> fs/proc/kcore.c hasn't changed since October last year. Was 2.6.29 OK?
>>> Earlier kernels?
>> with 2.6.29, ls shows the same output.
>> [root@test8 ~]# ll /proc/kcore
>> -r-------- 1 root root 281474974617600 Jun 5 14:35 /proc/kcore
>
>
> Thanks.
>
> It looks like the value of 'high_memory' is insane..
>
> Can you get its value on your machine? You can add a printk() or use
> systemtap etc..
Just did that.
Also a strange number.
high memory 18446612137615818752.
Regards,
Tao
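
(Editorial aside: the decimal value above is hard to judge by eye. The sketch below converts it to hex and subtracts the base of the kernel direct mapping. The base 0xffff880000000000 is an assumption about x86_64 kernels of this era; it is consistent with lines in the boot log posted in this thread such as "found SMP MP-table at [ffff8800000fe710] fe710". The names are illustrative only.)

```python
# Value printed by the printk() in the message above.
HIGH_MEMORY = 18446612137615818752

# Assumption: start of the kernel direct mapping (PAGE_OFFSET) on this
# x86_64 kernel. Not stated in the message itself.
DIRECT_MAP_BASE = 0xFFFF880000000000

# Physical address that high_memory would correspond to under that assumption.
phys_end = HIGH_MEMORY - DIRECT_MAP_BASE

print("high_memory      =", hex(HIGH_MEMORY))
print("physical address =", hex(phys_end))
print("in MiB           =", phys_end // 2**20)
```

Under that assumption the offset is 0x13c000000, the same address at which the last usable BIOS-e820 range in the posted boot log ends. So high_memory may be less strange than it looks in decimal, and the oversized kcore may come from somewhere else.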
Amerigo Wang wrote:
> On Fri, Jun 05, 2009 at 02:56:36PM +0800, Tao Ma wrote:
>>
>> Amerigo Wang wrote:
>>> On Fri, Jun 05, 2009 at 02:07:58PM +0800, Tao Ma wrote:
>>>> Amerigo Wang wrote:
>>>>> On Fri, Jun 05, 2009 at 12:03:52PM +0800, Tao Ma wrote:
>>>>>> Hi list,
>>>>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable
>>>>>> large to be 281474974617600.
>>>>>> While in a x86 box, it is 931131392 which looks sane.
>>>>>>
>>>>>> [root@test8 ~]# ll /proc/kcore
>>>>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>>>>>
>>>>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>>>>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>>>> Hmm, what is your physical RAM size on test8?
>>>>> /proc/kcore looks fine on my x86_64 box.
>>>> Only 4G.
>>> Hmm, my x86_box has 8G mem, the size of kcore looks much
>>> saner than the above huge number, but it is still wrong
>>> according to what the man page describes...
>>>
>>> Please do what Andrew said, it will be helpful.
>>>
>>>>>> I just noticed this when kexec fails in "Can't find kernel text
>>>>>> map area from kcore".
>>>>>>
>>>>>> Is there something wrong?
>>>>> It looks like that error message is from userspace?
>>>> I just started kdump and get the error message.
>>> IIRC, kdump should use /proc/vmcore, instead of /proc/kcore...
>>> nothing related.
>> in el5, when start kdump service it will do something like
>> /sbin/kexec --args-linux -p '--command-line=ro root=LABEL=/ rhgb quiet
>> irqpoll maxcpus=1' --initrd=/boot/initrd-2.6.18-53.el5kdump.img
>> /boot/vmlinuz-2.6.18-53.el5
>>
>> And the error message is from there.
>
> From /sbin/kexec? I just checked the source code of kexec-tools,
> I haven't found that message...
No, it is there.
See kexec-tools-1.101-reloc-update.patch.
src rpm is kexec-tools-1.101-194.4.el5.src.rpm. So it is a patch from el5.
Regards,
Tao
On Fri, 5 Jun 2009 17:09:54 +0800 Américo Wang <[email protected]> wrote:
> On Fri, Jun 5, 2009 at 4:57 PM, Tao Ma<[email protected]> wrote:
> >
> >
> > Amerigo Wang wrote:
> >>
> >> On Fri, Jun 05, 2009 at 02:59:46PM +0800, Tao Ma wrote:
> >>>
> >>> Andrew Morton wrote:
> >>>>
> >>>> On Fri, 05 Jun 2009 12:03:52 +0800 Tao Ma <[email protected]> wrote:
> >>>>
> >>>>> Hi list,
> >>>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable large
> >>>>> to be 281474974617600.
> >>>>> While in a x86 box, it is 931131392 which looks sane.
> >>>>>
> >>>>> [root@test8 ~]# ll /proc/kcore
> >>>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
> >>>>>
> >>>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
> >>>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
> >>>>>
> >>>>> I just noticed this when kexec fails in "Can't find kernel text map
> >>>>> area from kcore".
> >>>>>
> >>>>> Is there something wrong?
> >>>>>
> >>>> fs/proc/kcore.c hasn't changed since October last year. Was 2.6.29 OK?
> >>>> Earlier kernels?
> >>>
> >>> with 2.6.29, ls shows the same output.
> >>> [root@test8 ~]# ll /proc/kcore
> >>> -r-------- 1 root root 281474974617600 Jun 5 14:35 /proc/kcore
> >>
> >>
> >> Thanks.
> >>
> >> It looks like the value of 'high_memory' is insane..
> >> Can you get its value on your machine? You can add a printk() or use
> >> systemtap etc..
> >
> > Just did that.
> > Also a strange number.
> > high memory 18446612137615818752.
> >
(top-posting repaired)
> Add some Cc: to x86 people. :)
>
> Yinghai?
>
Please send the boot logs: dmesg -s 1000000 > foo
Add some Cc: to x86 people. :)
Yinghai?
On Fri, Jun 5, 2009 at 4:57 PM, Tao Ma<[email protected]> wrote:
>
>
> Amerigo Wang wrote:
>>
>> On Fri, Jun 05, 2009 at 02:59:46PM +0800, Tao Ma wrote:
>>>
>>> Andrew Morton wrote:
>>>>
>>>> On Fri, 05 Jun 2009 12:03:52 +0800 Tao Ma <[email protected]> wrote:
>>>>
>>>>> Hi list,
>>>>> In 2.6.30-rc8, /proc/kcore in x86_64's size is unreasonable large
>>>>> to be 281474974617600.
>>>>> While in a x86 box, it is 931131392 which looks sane.
>>>>>
>>>>> [root@test8 ~]# ll /proc/kcore
>>>>> -r-------- 1 root root 281474974617600 Jun 5 11:15 /proc/kcore
>>>>>
>>>>> [root@ocfs2-test9 ~]$ ll /proc/kcore
>>>>> -r-------- 1 root root 931131392 Jun 5 11:58 /proc/kcore
>>>>>
>>>>> I just noticed this when kexec fails in "Can't find kernel text map
>>>>> area from kcore".
>>>>>
>>>>> Is there something wrong?
>>>>>
>>>> fs/proc/kcore.c hasn't changed since October last year. Was 2.6.29 OK?
>>>> Earlier kernels?
>>>
>>> with 2.6.29, ls shows the same output.
>>> [root@test8 ~]# ll /proc/kcore
>>> -r-------- 1 root root 281474974617600 Jun 5 14:35 /proc/kcore
>>
>>
>> Thanks.
>>
>> It looks like the value of 'high_memory' is insane..
>> Can you get its value on your machine? You can add a printk() or use
>> systemtap etc..
>
> Just did that.
> Also a strange number.
> high memory 18446612137615818752.
>
> Regards,
> Tao
>
On Fri, Jun 05, 2009 at 05:01:00PM +0800, Tao Ma wrote:
>>>>>>> I just noticed this when kexec fails in "Can't find kernel
>>>>>>> text map area from kcore".
>>>>>>>
>>>>>>> Is there something wrong?
>>>>>> It looks like that error message is from userspace?
>>>>> I just started kdump and get the error message.
>>>> IIRC, kdump should use /proc/vmcore, instead of /proc/kcore...
>>>> nothing related.
>>> in el5, when start kdump service it will do something like
>>> /sbin/kexec --args-linux -p '--command-line=ro root=LABEL=/ rhgb
>>> quiet irqpoll maxcpus=1'
>>> --initrd=/boot/initrd-2.6.18-53.el5kdump.img
>>> /boot/vmlinuz-2.6.18-53.el5
>>>
>>> And the error message is from there.
>>
>> From /sbin/kexec? I just checked the source code of kexec-tools,
>> I haven't found that message...
> No, it is there.
> See kexec-tools-1.101-reloc-update.patch.
> src rpm is kexec-tools-1.101-194.4.el5.src.rpm. So it is a patch from el5.
Oh, I used the original source code without any extra patches.
Thanks for your reply.
Linux version 2.6.30-rc4 (test2.cn.oracle.com) (gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)) #3 SMP Fri Jun 5 14:14:28 CST 2009
Command line: ro root=LABEL=/var rhgb quiet crashkernel=128M@16M
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bf5ffc00 (usable)
BIOS-e820: 00000000bf5ffc00 - 00000000bf601c00 (ACPI NVS)
BIOS-e820: 00000000bf603c00 - 00000000bf653c00 (reserved)
BIOS-e820: 00000000bf653c00 - 00000000bf655c00 (ACPI data)
BIOS-e820: 00000000bf655c00 - 00000000c0000000 (reserved)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 000000013c000000 (usable)
DMI 2.3 present.
last_pfn = 0x13c000 max_arch_pfn = 0x100000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
00000-9FFFF write-back
A0000-BFFFF uncachable
C0000-CFFFF write-protect
D0000-EFFFF uncachable
F0000-FFFFF write-protect
MTRR variable ranges enabled:
0 base 000000000 mask 000000000 write-back
1 base 0BF800000 mask FFF800000 uncachable
2 base 0BF700000 mask FFFF00000 uncachable
3 base 0C0000000 mask FC0000000 uncachable
4 disabled
5 disabled
6 disabled
7 disabled
e820 update range: 00000000bf700000 - 0000000100000000 (usable) ==> (reserved)
last_pfn = 0xbf5ff max_arch_pfn = 0x100000000
init_memory_mapping: 0000000000000000-00000000bf5ff000
0000000000 - 00bf400000 page 2M
00bf400000 - 00bf5ff000 page 4k
kernel direct mapping tables up to bf5ff000 @ 8000-d000
init_memory_mapping: 0000000100000000-000000013c000000
0100000000 - 013c000000 page 2M
kernel direct mapping tables up to 13c000000 @ b000-11000
RAMDISK: 37d94000 - 37fef752
ACPI: RSDP 00000000000febf0 00024 (v02 DELL )
ACPI: XSDT 00000000000fce99 0006C (v01 DELL B8K 00000014 ASL 00000061)
ACPI: FACP 00000000000fcfc1 000F4 (v03 DELL B8K 00000014 ASL 00000061)
ACPI: DSDT 00000000fff6a119 0473C (v01 DELL dt_ex 00001000 INTL 20050624)
ACPI: FACS 00000000bf5ffc00 00040
ACPI: SSDT 00000000fff6e974 000AA (v01 DELL st_ex 00001000 INTL 20050624)
ACPI: APIC 00000000000fd0b5 00092 (v01 DELL B8K 00000014 ASL 00000061)
ACPI: BOOT 00000000000fd147 00028 (v01 DELL B8K 00000014 ASL 00000061)
ACPI: ASF! 00000000000fd16f 00092 (v32 DELL B8K 00000014 ASL 00000061)
ACPI: MCFG 00000000000fd201 0003E (v01 DELL B8K 00000014 ASL 00000061)
ACPI: HPET 00000000000fd23f 00038 (v01 DELL B8K 00000014 ASL 00000061)
ACPI: TCPA 00000000000fd49b 00032 (v01 DELL B8K 00000014 ASL 00000061)
ACPI: SLIC 00000000000fd277 00176 (v01 DELL B8K 00000014 ASL 00000061)
ACPI: Local APIC address 0xfee00000
No NUMA configuration found
Faking a node at 0000000000000000-000000013c000000
Bootmem setup node 0 0000000000000000-000000013c000000
NODE_DATA [000000000000c000 - 0000000000011fff]
bootmap [0000000000012000 - 00000000000397ff] pages 28
(8 early reservations) ==> bootmem [0000000000 - 013c000000]
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0000200000 - 0000d065d0] TEXT DATA BSS ==> [0000200000 - 0000d065d0]
#3 [0037d94000 - 0037fef752] RAMDISK ==> [0037d94000 - 0037fef752]
#4 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000]
#5 [0000d07000 - 0000d07234] BRK ==> [0000d07000 - 0000d07234]
#6 [0000008000 - 000000b000] PGTABLE ==> [0000008000 - 000000b000]
#7 [000000b000 - 000000c000] PGTABLE ==> [000000b000 - 000000c000]
found SMP MP-table at [ffff8800000fe710] fe710
Reserving 128MB of memory at 16MB for crashkernel (System RAM: 5056MB)
[ffffe20000000000-ffffe200045fffff] PMD -> [ffff880028200000-ffff88002b9fffff] on node 0
Zone PFN ranges:
DMA 0x00000000 -> 0x00001000
DMA32 0x00001000 -> 0x00100000
Normal 0x00100000 -> 0x0013c000
Movable zone start PFN for each node
early_node_map[3] active PFN ranges
0: 0x00000000 -> 0x0000009f
0: 0x00000100 -> 0x000bf5ff
0: 0x00100000 -> 0x0013c000
On node 0 totalpages: 1029534
DMA zone: 56 pages used for memmap
DMA zone: 2925 pages reserved
DMA zone: 1018 pages, LIFO batch:0
DMA32 zone: 14280 pages used for memmap
DMA32 zone: 765495 pages, LIFO batch:31
Normal zone: 3360 pages used for memmap
Normal zone: 242400 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x05] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] disabled)
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x00] disabled)
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x01] disabled)
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x03] disabled)
ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 0, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x8086a201 base: 0xfed00000
SMP: Allowing 8 CPUs, 6 hotplug CPUs
nr_irqs_gsi: 24
Allocating PCI resources starting at c2000000 (gap: c0000000:20000000)
NR_CPUS:255 nr_cpumask_bits:255 nr_cpu_ids:8 nr_node_ids:1
PERCPU: Embedded 24 pages at ffff880028104000, static data 69536 bytes
Built 1 zonelists in Node order, mobility grouping on. Total pages: 1008913
Policy zone: Normal
Kernel command line: ro root=LABEL=/var rhgb quiet crashkernel=128M@16M
Initializing CPU#0
NR_IRQS:4352
PID hash table entries: 4096 (order: 12, 32768 bytes)
Fast TSC calibration using PIT
Detected 2128.021 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Checking aperture...
No AGP bridge found
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
software IO TLB at phys 0x20000000 - 0x24000000
Memory: 3848804k/5177344k available (2801k kernel code, 1059208k absent, 269332k reserved, 1965k data, 396k init)
hpet clockevent registered
HPET: 3 timers in total, 0 timers will be used for per-cpu timer
Calibrating delay loop (skipped), value calculated using timer frequency.. 4256.04 BogoMIPS (lpj=2128021)
Security Framework initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU 0/0x0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM2)
using mwait in idle threads.
ACPI: Core revision 20090320
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz stepping 02
Booting processor 1 APIC 0x1 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 4255.80 BogoMIPS (lpj=2127902)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU 1/0x1 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU1: Thermal monitoring enabled (TM2)
CPU1: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz stepping 02
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
Total of 2 processors activated (8511.84 BogoMIPS).
CPU0 attaching sched-domain:
domain 0: span 0-1 level MC
groups: 0 1
CPU1 attaching sched-domain:
domain 0: span 0-1 level MC
groups: 1 0
net_namespace: 1736 bytes
NET: Registered protocol family 16
ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
ACPI: bus type pci registered
PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
PCI: MCFG area at e0000000 reserved in E820
PCI: Using MMCONFIG at e0000000 - efffffff
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: BIOS _OSI(Linux) query ignored
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S3 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI Warning (tbutils-0246): Incorrect checksum in table [TCPA] - 00, should be 89 [20090320]
ACPI: ACPI Dock Station Driver: 1 docks/bays found
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
pci 0000:00:01.0: PME# disabled
pci 0000:00:02.0: reg 10 32bit mmio: [0xdfe00000-0xdfefffff]
pci 0000:00:02.0: reg 18 64bit mmio: [0xc0000000-0xcfffffff]
pci 0000:00:02.0: reg 20 io port: [0xecb8-0xecbf]
pci 0000:00:02.1: reg 10 32bit mmio: [0xdff00000-0xdfffffff]
pci 0000:00:1a.0: reg 20 io port: [0xff20-0xff3f]
pci 0000:00:1a.1: reg 20 io port: [0xff00-0xff1f]
pci 0000:00:1a.7: reg 10 32bit mmio: [0xdfdfbc00-0xdfdfbfff]
pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1a.7: PME# disabled
pci 0000:00:1b.0: reg 10 64bit mmio: [0xdfdfc000-0xdfdfffff]
pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1b.0: PME# disabled
pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.0: PME# disabled
pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.4: PME# disabled
pci 0000:00:1d.0: reg 20 io port: [0xff80-0xff9f]
pci 0000:00:1d.1: reg 20 io port: [0xff60-0xff7f]
pci 0000:00:1d.2: reg 20 io port: [0xff40-0xff5f]
pci 0000:00:1d.7: reg 10 32bit mmio: [0xff980800-0xff980bff]
pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1d.7: PME# disabled
pci 0000:00:1f.0: quirk: region 0800-087f claimed by ICH6 ACPI/GPIO/TCO
pci 0000:00:1f.0: quirk: region 0880-08bf claimed by ICH6 GPIO
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0c00 (mask 007f)
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 2 PIO at 00e0 (mask 0007)
pci 0000:00:1f.2: reg 10 io port: [0xfe00-0xfe07]
pci 0000:00:1f.2: reg 14 io port: [0xfe10-0xfe13]
pci 0000:00:1f.2: reg 18 io port: [0xfe20-0xfe27]
pci 0000:00:1f.2: reg 1c io port: [0xfe30-0xfe33]
pci 0000:00:1f.2: reg 20 io port: [0xfec0-0xfecf]
pci 0000:00:1f.2: reg 24 io port: [0xecc0-0xeccf]
pci 0000:00:1f.2: PME# supported from D3hot
pci 0000:00:1f.2: PME# disabled
pci 0000:00:1f.3: reg 10 32bit mmio: [0xdfdfbb00-0xdfdfbbff]
pci 0000:00:1f.3: reg 20 io port: [0xece0-0xecff]
pci 0000:00:1f.5: reg 10 io port: [0xfe40-0xfe47]
pci 0000:00:1f.5: reg 14 io port: [0xfe50-0xfe53]
pci 0000:00:1f.5: reg 18 io port: [0xfe60-0xfe67]
pci 0000:00:1f.5: reg 1c io port: [0xfe70-0xfe73]
pci 0000:00:1f.5: reg 20 io port: [0xfed0-0xfedf]
pci 0000:00:1f.5: reg 24 io port: [0xecd0-0xecdf]
pci 0000:00:1f.5: PME# supported from D3hot
pci 0000:00:1f.5: PME# disabled
pci 0000:00:01.0: bridge 32bit mmio: [0xdfc00000-0xdfcfffff]
pci 0000:00:1c.0: bridge 32bit mmio: [0xdfb00000-0xdfbfffff]
pci 0000:03:00.0: reg 10 64bit mmio: [0xdfaf0000-0xdfafffff]
pci 0000:03:00.0: PME# supported from D3hot D3cold
pci 0000:03:00.0: PME# disabled
pci 0000:00:1c.4: bridge 32bit mmio: [0xdfa00000-0xdfafffff]
pci 0000:04:00.0: reg 10 32bit mmio: [0xdf880000-0xdf89ffff]
pci 0000:04:00.0: reg 14 32bit mmio: [0xdf8a0000-0xdf8bffff]
pci 0000:04:00.0: reg 18 io port: [0xdc80-0xdcbf]
pci 0000:04:00.0: reg 30 32bit mmio: [0xdf900000-0xdf91ffff]
pci 0000:04:00.0: PME# supported from D0 D3hot D3cold
pci 0000:04:00.0: PME# disabled
pci 0000:04:02.0: reg 10 32bit mmio: [0xdf8c0000-0xdf8dffff]
pci 0000:04:02.0: reg 14 32bit mmio: [0xdf8e0000-0xdf8fffff]
pci 0000:04:02.0: reg 18 io port: [0xdcc0-0xdcff]
pci 0000:04:02.0: reg 30 32bit mmio: [0xdf900000-0xdf91ffff]
pci 0000:04:02.0: PME# supported from D0 D3hot D3cold
pci 0000:04:02.0: PME# disabled
pci 0000:00:1e.0: transparent bridge
pci 0000:00:1e.0: bridge io port: [0xd000-0xdfff]
pci 0000:00:1e.0: bridge 32bit mmio: [0xdf800000-0xdf9fffff]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI4._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI2._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI5._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs *3 4 5 6 7 9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 12 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 *9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 11 12 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 *9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 6 7 9 10 11 12 15)
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 comparators, 64-bit 14.318180 MHz counter
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp 00:01: io resource (0x800-0x85f) overlaps 0000:00:1f.0 BAR 7 (0x800-0x87f), disabling
pnp 00:01: io resource (0x860-0x8ff) overlaps 0000:00:1f.0 BAR 7 (0x800-0x87f), disabling
pnp: PnP ACPI: found 11 devices
ACPI: ACPI bus type pnp unregistered
system 00:01: ioport range 0xc00-0xc7f has been reserved
system 00:08: iomem range 0x0-0x9ffff could not be reserved
system 00:08: iomem range 0x100000-0xffffff could not be reserved
system 00:08: iomem range 0x1000000-0xbf5ffbff could not be reserved
system 00:08: iomem range 0xf0000-0xfffff could not be reserved
system 00:08: iomem range 0xc0000-0xcffff has been reserved
system 00:08: iomem range 0xfec00000-0xfecfffff has been reserved
system 00:08: iomem range 0xfee00000-0xfeefffff has been reserved
system 00:08: iomem range 0xffb00000-0xffbfffff has been reserved
system 00:08: iomem range 0xffc00000-0xffffffff has been reserved
system 00:09: ioport range 0x100-0x1fe has been reserved
system 00:09: ioport range 0x200-0x277 has been reserved
system 00:09: ioport range 0x280-0x2e7 has been reserved
system 00:09: ioport range 0x2f0-0x2f7 has been reserved
system 00:09: ioport range 0x300-0x377 has been reserved
system 00:09: ioport range 0x380-0x3bb has been reserved
system 00:09: ioport range 0x3c0-0x3e7 could not be reserved
system 00:09: ioport range 0x3f6-0x3f7 has been reserved
system 00:09: ioport range 0x400-0x4cf has been reserved
system 00:09: ioport range 0x4d2-0x57f has been reserved
system 00:09: ioport range 0x580-0x677 has been reserved
system 00:09: ioport range 0x680-0x777 has been reserved
system 00:09: ioport range 0x780-0x7bb has been reserved
system 00:09: ioport range 0x7c0-0x7ff has been reserved
system 00:09: ioport range 0x8e0-0x8ff has been reserved
system 00:09: ioport range 0x900-0x9fe has been reserved
system 00:09: ioport range 0xa00-0xafe has been reserved
system 00:09: ioport range 0xb00-0xbfe has been reserved
system 00:09: ioport range 0xc80-0xcaf has been reserved
system 00:09: ioport range 0xcc0-0xcf7 has been reserved
system 00:09: ioport range 0xd00-0xdfe has been reserved
system 00:09: ioport range 0xe00-0xefe has been reserved
system 00:09: ioport range 0xf00-0xffe has been reserved
system 00:09: ioport range 0x2000-0x20fe has been reserved
system 00:09: ioport range 0x2100-0x21fe has been reserved
system 00:09: ioport range 0x2200-0x22fe has been reserved
system 00:09: ioport range 0x2300-0x23fe has been reserved
system 00:09: ioport range 0x2400-0x24fe has been reserved
system 00:09: ioport range 0x2500-0x25fe has been reserved
system 00:09: ioport range 0x2600-0x26fe has been reserved
system 00:09: ioport range 0x2700-0x27fe has been reserved
system 00:09: ioport range 0x2800-0x28fe has been reserved
system 00:09: ioport range 0x2900-0x29fe has been reserved
system 00:09: ioport range 0x2a00-0x2afe has been reserved
system 00:09: ioport range 0x2b00-0x2bfe has been reserved
system 00:09: ioport range 0x2c00-0x2cfe has been reserved
system 00:09: ioport range 0x2d00-0x2dfe has been reserved
system 00:09: ioport range 0x2e00-0x2efe has been reserved
system 00:09: ioport range 0x2f00-0x2ffe has been reserved
system 00:09: ioport range 0x5000-0x50fe has been reserved
system 00:09: ioport range 0x5100-0x51fe has been reserved
system 00:09: ioport range 0x5200-0x52fe has been reserved
system 00:09: ioport range 0x5300-0x53fe has been reserved
system 00:09: ioport range 0x5400-0x54fe has been reserved
system 00:09: ioport range 0x5500-0x55fe has been reserved
system 00:09: ioport range 0x5600-0x56fe has been reserved
system 00:09: ioport range 0x5700-0x57fe has been reserved
system 00:09: ioport range 0x5800-0x58fe has been reserved
system 00:09: ioport range 0x5900-0x59fe has been reserved
system 00:09: ioport range 0x5a00-0x5afe has been reserved
system 00:09: ioport range 0x5b00-0x5bfe has been reserved
system 00:09: ioport range 0x5c00-0x5cfe has been reserved
system 00:09: ioport range 0x5d00-0x5dfe has been reserved
system 00:09: ioport range 0x5e00-0x5efe has been reserved
system 00:09: ioport range 0x5f00-0x5ffe has been reserved
system 00:09: ioport range 0x6000-0x60fe has been reserved
system 00:09: ioport range 0x6100-0x61fe has been reserved
system 00:09: ioport range 0x6200-0x62fe has been reserved
system 00:09: ioport range 0x6300-0x63fe has been reserved
system 00:09: ioport range 0x6400-0x64fe has been reserved
system 00:09: ioport range 0x6500-0x65fe has been reserved
system 00:09: ioport range 0x6600-0x66fe has been reserved
system 00:09: ioport range 0x6700-0x67fe has been reserved
system 00:09: ioport range 0x6800-0x68fe has been reserved
system 00:09: ioport range 0x6900-0x69fe has been reserved
system 00:09: ioport range 0x6a00-0x6afe has been reserved
system 00:09: ioport range 0x6b00-0x6bfe has been reserved
system 00:09: ioport range 0x6c00-0x6cfe has been reserved
system 00:09: ioport range 0x6d00-0x6dfe has been reserved
system 00:09: ioport range 0x6e00-0x6efe has been reserved
system 00:09: ioport range 0x6f00-0x6ffe has been reserved
system 00:09: ioport range 0xa000-0xa0fe has been reserved
system 00:09: ioport range 0xa100-0xa1fe has been reserved
system 00:09: ioport range 0xa200-0xa2fe has been reserved
system 00:09: ioport range 0xa300-0xa3fe has been reserved
system 00:09: ioport range 0xa400-0xa4fe has been reserved
system 00:09: ioport range 0xa500-0xa5fe has been reserved
system 00:09: ioport range 0xa600-0xa6fe has been reserved
system 00:09: ioport range 0xa700-0xa7fe has been reserved
system 00:09: ioport range 0xa800-0xa8fe has been reserved
system 00:09: ioport range 0xa900-0xa9fe has been reserved
system 00:09: ioport range 0xaa00-0xaafe has been reserved
system 00:09: ioport range 0xab00-0xabfe has been reserved
system 00:09: ioport range 0xac00-0xacfe has been reserved
system 00:09: ioport range 0xad00-0xadfe has been reserved
system 00:09: ioport range 0xae00-0xaefe has been reserved
system 00:09: ioport range 0xaf00-0xaffe has been reserved
system 00:09: iomem range 0xe0000000-0xefffffff has been reserved
system 00:09: iomem range 0xfeda0000-0xfedacfff has been reserved
pci 0000:00:01.0: PCI bridge, secondary bus 0000:01
pci 0000:00:01.0: IO window: disabled
pci 0000:00:01.0: MEM window: 0xdfc00000-0xdfcfffff
pci 0000:00:01.0: PREFETCH window: disabled
pci 0000:00:1c.0: PCI bridge, secondary bus 0000:02
pci 0000:00:1c.0: IO window: disabled
pci 0000:00:1c.0: MEM window: 0xdfb00000-0xdfbfffff
pci 0000:00:1c.0: PREFETCH window: disabled
pci 0000:00:1c.4: PCI bridge, secondary bus 0000:03
pci 0000:00:1c.4: IO window: disabled
pci 0000:00:1c.4: MEM window: 0xdfa00000-0xdfafffff
pci 0000:00:1c.4: PREFETCH window: disabled
pci 0000:00:1e.0: PCI bridge, secondary bus 0000:04
pci 0000:00:1e.0: IO window: 0xd000-0xdfff
pci 0000:00:1e.0: MEM window: 0xdf800000-0xdf9fffff
pci 0000:00:1e.0: PREFETCH window: 0x000000d0000000-0x000000d00fffff
pci 0000:00:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:01.0: setting latency timer to 64
pci 0000:00:1c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1c.0: setting latency timer to 64
pci 0000:00:1c.4: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1c.4: setting latency timer to 64
pci 0000:00:1e.0: setting latency timer to 64
pci_bus 0000:00: resource 0 io: [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:01: resource 1 mem: [0xdfc00000-0xdfcfffff]
pci_bus 0000:02: resource 1 mem: [0xdfb00000-0xdfbfffff]
pci_bus 0000:03: resource 1 mem: [0xdfa00000-0xdfafffff]
pci_bus 0000:04: resource 0 io: [0xd000-0xdfff]
pci_bus 0000:04: resource 1 mem: [0xdf800000-0xdf9fffff]
pci_bus 0000:04: resource 2 pref mem [0xd0000000-0xd00fffff]
pci_bus 0000:04: resource 3 io: [0x00-0xffff]
pci_bus 0000:04: resource 4 mem: [0x000000-0xffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
TCP reno registered
NET: Registered protocol family 1
checking if image is initramfs...
rootfs image is initramfs; unpacking...
Freeing initrd memory: 2413k freed
Simple Boot Flag at 0x7a set to 0x1
audit: initializing netlink socket (disabled)
type=2000 audit(1244191610.804:1): initialized
HugeTLB registered 2 MB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
#######high memory 18446612137615818752, size_t 18446612137615818752
#######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
msgmni has been set to 7521
SELinux: Registering netfilter hooks
alg: No test for stdrng (krng)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci 0000:00:02.0: Boot video device
pcieport-driver 0000:00:01.0: irq 24 for MSI/MSI-X
pcieport-driver 0000:00:01.0: setting latency timer to 64
pcieport-driver 0000:00:1c.0: irq 25 for MSI/MSI-X
pcieport-driver 0000:00:1c.0: setting latency timer to 64
pcieport-driver 0000:00:1c.4: irq 26 for MSI/MSI-X
pcieport-driver 0000:00:1c.4: setting latency timer to 64
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Marking TSC unstable due to TSC halts in idle
processor ACPI_CPU:00: registered as cooling_device0
processor ACPI_CPU:01: registered as cooling_device1
Non-volatile memory driver v1.3
Linux agpgart interface v0.103
agpgart-intel 0000:00:00.0: Intel 965Q Chipset
agpgart-intel 0000:00:00.0: detected 7676K stolen memory
agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xc0000000
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
brd: module loaded
Uniform Multi-Platform E-IDE driver
ide_generic: please use "probe_mask=0x3f" module parameter for probing all legacy ISA IDE ports
Probing IDE interface ide0...
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
ide1 at 0x170-0x177,0x376 on irq 15
ide-gd driver 1.18
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
cpuidle: using governor ladder
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
TCP bic registered
Initializing XFRM netlink socket
NET: Registered protocol family 17
registered taskstats version 1
Freeing unused kernel memory: 396k freed
Write protecting the kernel read-only data: 3968k
uhci_hcd: USB Universal Host Controller Interface driver
uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1a.0: setting latency timer to 64
uhci_hcd 0000:00:1a.0: UHCI Host Controller
uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:1a.0: irq 16, io base 0x0000ff20
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
uhci_hcd 0000:00:1a.1: setting latency timer to 64
uhci_hcd 0000:00:1a.1: UHCI Host Controller
uhci_hcd 0000:00:1a.1: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1a.1: irq 17, io base 0x0000ff00
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
uhci_hcd 0000:00:1d.0: setting latency timer to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.0: irq 23, io base 0x0000ff80
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
uhci_hcd 0000:00:1d.1: setting latency timer to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.1: irq 17, io base 0x0000ff60
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: setting latency timer to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000ff40
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
Warning! ehci_hcd should always be loaded before uhci_hcd and ohci_hcd, not after
ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 22 (level, low) -> IRQ 22
ehci_hcd 0000:00:1a.7: setting latency timer to 64
ehci_hcd 0000:00:1a.7: EHCI Host Controller
ehci_hcd 0000:00:1a.7: new USB bus registered, assigned bus number 6
ehci_hcd 0000:00:1a.7: debug port 1
ehci_hcd 0000:00:1a.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1a.7: irq 22, io mem 0xdfdfbc00
ehci_hcd 0000:00:1a.7: USB 2.0 started, EHCI 1.00
usb usb6: configuration #1 chosen from 1 choice
hub 6-0:1.0: USB hub found
hub 6-0:1.0: 4 ports detected
ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: setting latency timer to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 7
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1d.7: irq 23, io mem 0xff980800
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00
usb usb7: configuration #1 chosen from 1 choice
hub 7-0:1.0: USB hub found
hub 7-0:1.0: 6 ports detected
SCSI subsystem initialized
Driver 'sd' needs updating - please use bus_type methods
libata version 3.00 loaded.
ata_piix 0000:00:1f.2: version 2.12
ata_piix 0000:00:1f.2: PCI INT C -> GSI 20 (level, low) -> IRQ 20
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ata_piix 0000:00:1f.2: setting latency timer to 64
scsi0 : ata_piix
scsi1 : ata_piix
ata1: SATA max UDMA/133 cmd 0xfe00 ctl 0xfe10 bmdma 0xfec0 irq 20
ata2: SATA max UDMA/133 cmd 0xfe20 ctl 0xfe30 bmdma 0xfec8 irq 20
ata_piix 0000:00:1f.5: PCI INT C -> GSI 20 (level, low) -> IRQ 20
ata_piix 0000:00:1f.5: MAP [ P0 -- P1 -- ]
ata_piix 0000:00:1f.5: setting latency timer to 64
scsi2 : ata_piix
scsi3 : ata_piix
ata3: SATA max UDMA/133 cmd 0xfe40 ctl 0xfe50 bmdma 0xfed0 irq 20
ata4: SATA max UDMA/133 cmd 0xfe60 ctl 0xfe70 bmdma 0xfed8 irq 20
usb 2-2: new low speed USB device using uhci_hcd and address 2
usb 2-2: configuration #1 chosen from 1 choice
input: Dell Dell USB Keyboard as /class/input/input0
generic-usb 0003:413C:2003.0001: input: USB HID v1.10 Keyboard [Dell Dell USB Keyboard] on usb-0000:00:1a.1-2/input0
ata3: SATA link down (SStatus 0 SControl 300)
ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.01: SATA link down (SStatus 4 SControl 300)
ata1.00: ATA-7: ST3250820AS, 3.ADG, max UDMA/133
ata1.00: 488281250 sectors, multi 8: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST3250820AS 3.AD PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 488281250 512-byte hardware sectors: (250 GB/232 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 >
sd 0:0:0:0: [sda] Attached SCSI disk
ata2.00: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.01: SATA link down (SStatus 4 SControl 300)
ata2.00: ATAPI: TSSTcorp DVD+/-RW TS-H653A, D500, max UDMA/33
ata2.00: applying bridge limits
ata2.00: configured for UDMA/33
scsi 1:0:0:0: CD-ROM TSSTcorp DVD+-RW TS-H653A D500 PQ: 0 ANSI: 5
ata4: SATA link down (SStatus 0 SControl 300)
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with writeback data mode.
SELinux: Disabled at runtime.
SELinux: Unregistering netfilter hooks
type=1404 audit(1244191617.685:2): selinux=0 auid=4294967295 ses=4294967295
input: PC Speaker as /class/input/input1
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
i801_smbus 0000:00:1f.3: PCI INT C -> GSI 20 (level, low) -> IRQ 20
Intel(R) PRO/1000 Network Driver - version 7.3.21-k3-NAPI
Copyright (c) 1999-2006 Intel Corporation.
e1000 0000:04:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
e1000: 0000:04:00.0: e1000_probe: (PCI:33MHz:32-bit) 00:1b:21:06:27:78
input: Power Button as /class/input/input2
ACPI: Power Button [PWRF]
input: Power Button as /class/input/input3
ACPI: Power Button [VBTN]
parport_pc 00:06: reported by Plug and Play ACPI
parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE]
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000 0000:04:02.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
e1000: 0000:04:02.0: e1000_probe: (PCI:33MHz:32-bit) 00:0e:0c:dc:ed:53
rtc_cmos 00:05: RTC can wake from S4
rtc_cmos 00:05: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one day, 242 bytes nvram, hpet irqs
tg3.c:v3.98 (February 25, 2009)
tg3 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
tg3 0000:03:00.0: setting latency timer to 64
tg3 0000:03:00.0: PME# disabled
eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address 00:1a:a0:af:d1:f5
eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1])
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
eth1: dma_rwctrl[76180000] dma_mask[64-bit]
e1000: eth2: e1000_probe: Intel(R) PRO/1000 Network Connection
Driver 'sr' needs updating - please use bus_type methods
sr0: scsi3-mmc drive: 48x/48x writer cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 1:0:0:0: Attached scsi CD-ROM sr0
HDA Intel 0000:00:1b.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
HDA Intel 0000:00:1b.0: setting latency timer to 64
dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
Floppy drive(s): fd0 is 1.44M
sd 0:0:0:0: Attached scsi generic sg0 type 0
sr 1:0:0:0: Attached scsi generic sg1 type 5
floppy0: no floppy controllers found
Floppy drive(s): fd0 is 1.44M
floppy0: no floppy controllers found
lp0: using parport0 (interrupt-driven).
lp0: console ready
ramfs: bad mount option: maxsize=512
md: Autodetecting RAID arrays.
md: Scanned 0 and added 0 devices.
md: autorun ...
md: ... autorun DONE.
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.14.0-ioctl (2008-04-23) initialised: [email protected]
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on sda6, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda7, internal journal
EXT3-fs: mounted filesystem with writeback data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda8, internal journal
EXT3-fs: mounted filesystem with writeback data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda5, internal journal
EXT3-fs: mounted filesystem with writeback data mode.
Adding 8193140k swap on /dev/sda3. Priority:-1 extents:1 across:8193140k
platform microcode: firmware: requesting intel-ucode/06-0f-02
platform microcode: firmware: requesting intel-ucode/06-0f-02
Microcode Update Driver: v2.00 <[email protected]>, Peter Oruba
Microcode Update Driver: v2.00 removed.
warning: process `kudzu' used the deprecated sysctl system call with 1.23.
Loading iSCSI transport class v2.0-870.
iscsi: registered transport (tcp)
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
iscsi: registered transport (iser)
e1000: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX
ADDRCONF(NETDEV_UP): eth0: link is not ready
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
e1000: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX
ADDRCONF(NETDEV_UP): eth1: link is not ready
ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
tg3 0000:03:00.0: PME# disabled
tg3 0000:03:00.0: irq 27 for MSI/MSI-X
ADDRCONF(NETDEV_UP): eth2: link is not ready
tg3: eth2: Link is up at 100 Mbps, full duplex.
tg3: eth2: Flow control is off for TX and off for RX.
ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
eth0: no IPv6 routers present
warning: `dbus-daemon' uses 32-bit capabilities (legacy support in use)
OCFS2 Node Manager 1.5.0
OCFS2 DLM 1.5.0
ocfs2: Registered cluster interface o2cb
OCFS2 DLMFS 1.5.0
OCFS2 User DLM kernel interface loaded
Bluetooth: Core ver 2.15
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.13
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM ver 1.11
Bluetooth: HIDP (Human Interface Emulation) ver 1.2
eth1: no IPv6 routers present
Bridge firewalling registered
ip_tables: (C) 2000-2006 Netfilter Core Team
eth2: no IPv6 routers present
[drm] Initialized drm 1.1.0 20060810
pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:02.0: setting latency timer to 64
mtrr: type mismatch for c0000000,10000000 old: write-back new: write-combining
[drm] MTRR allocation failed. Graphics performance may suffer.
pci 0000:00:02.0: irq 28 for MSI/MSI-X
[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
mtrr: type mismatch for c0000000,10000000 old: write-back new: write-combining
On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>
>> Please send the boot logs: dmesg -s 1000000 > foo
> attached.
>#######high memory 18446612137615818752, size_t 18446612137615818752
>#######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
You must have added these two lines yourself...
What?!
How can PAGE_OFFSET be 0??
Can you show us the two printk() calls you just added?
Also, this kcore size is not the crazy number in the subject...
This one is much saner.
diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 59b43a0..50830c7 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -408,6 +408,13 @@ static int __init proc_kcore_init(void)
 	if (proc_root_kcore)
 		proc_root_kcore->size =
 				(size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
+	printk(KERN_INFO "#######high memory %lu, size_t %llu\n",
+	       (unsigned long long)high_memory, (size_t)high_memory);
+	if (proc_root_kcore)
+		printk(KERN_INFO "#######kcore size %lu, "
+		       "PAGE_OFFSET %lu, PAGE_SIZE %lu\n",
+		       proc_root_kcore->size,
+		       PAGE_OFFSET, PAGE_SIZE);
 	return 0;
 }
 module_init(proc_kcore_init);
Tao Ma wrote:
>
>
> Amerigo Wang wrote:
>> On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>>> Please send the boot logs: dmesg -s 1000000 > foo
>>> attached.
>>
>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>> #######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
>>
>>
>> These two lines must be added by yourself...
>>
>> What?!
>> How can PAGE_OFFSET be 0??
>> Can you show us these two printk() you just added?
>>
>> And, the size of kcore is not the crazy number in the subject...
>> This one is much saner..
> Sorry, I used the wrong printk. the correct one is:
> #######high memory 18446612137615818752, size_t 18446612137615818752
> #######kcore size 5301604352, PAGE_OFFSET 18446612132314218496,
> PAGE_SIZE 4096
>
%lx should be used.
Also, your compiler doesn't like
high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
in setup.c?
YH
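For reference, a minimal userspace sketch of the point about format specifiers: kernel addresses read far more easily in hex than in decimal, and the specifier should match the argument's type. The helper names (format_hex, format_dec) are hypothetical, and fixed-width uint64_t with the <inttypes.h> macros stands in for the kernel's unsigned long with %lx / %lu.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a 64-bit address in hex.
 * PRIx64 always matches uint64_t, so there is no specifier/type mismatch. */
static int format_hex(char *buf, size_t len, uint64_t addr)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%" PRIx64, addr);
}

/* Hypothetical helper: print the same value in decimal, which is how the
 * debug lines quoted in this thread show it. */
static int format_dec(char *buf, size_t len, uint64_t value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%" PRIu64, value);
}
```

Passing 0xffff88013c000000 to format_hex gives a 16-digit string that is recognisable as a kernel address at a glance; the decimal form of the same value is the 20-digit number from the boot log.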
Yinghai Lu wrote:
> Tao Ma wrote:
>>
>> Amerigo Wang wrote:
>>> On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>>>> Please send the boot logs: dmesg -s 1000000 > foo
>>>> attached.
>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>> #######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
>>>
>>> These two lines must be added by yourself...
>>>
>>> What?!
>>> How can PAGE_OFFSET be 0??
>>> Can you show us these two printk() you just added?
>>>
>>> And, the size of kcore is not the crazy number in the subject...
>>> This one is much saner..
>> Sorry, I used the wrong printk. the correct one is:
>> #######high memory 18446612137615818752, size_t 18446612137615818752
>> #######kcore size 5301604352, PAGE_OFFSET 18446612132314218496,
>> PAGE_SIZE 4096
>>
>
> %lx should be used.
>
> also you compiler doesn't like
>
> high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
>
> in setup.c?
Sorry for my poor English, but what do you mean?
I just added a printk in setup.c and the result is
@@@@high_momory ffff88013c000000
and my gcc version is:
gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)
Thanks.
Tao
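The numbers reported so far are consistent with each other, which can be checked with a small userspace sketch of the expression in proc_kcore_init(). The DEMO_* constants are assumptions taken from the values printed in this thread (PAGE_OFFSET 18446612132314218496, PAGE_SIZE 4096); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values as printed in this thread; not read from any kernel header. */
#define DEMO_PAGE_SIZE   UINT64_C(4096)
#define DEMO_PAGE_OFFSET UINT64_C(0xffff880000000000)

/* Mirrors: (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE */
static uint64_t kcore_size_from_high_memory(uint64_t high_memory)
{
    assert(high_memory >= DEMO_PAGE_OFFSET);
    return high_memory - DEMO_PAGE_OFFSET + DEMO_PAGE_SIZE;
}
```

With high_memory = 0xffff88013c000000 this yields 5301604352, the same value as the "kcore size" debug line, so the initial size computed at boot looks sane.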
Tao Ma wrote:
>
> Yinghai Lu wrote:
>> Tao Ma wrote:
>>>
>>> Amerigo Wang wrote:
>>>> On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>>>>> Please send the boot logs: dmesg -s 1000000 > foo
>>>>> attached.
>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>> #######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
>>>>
>>>> These two lines must be added by yourself...
>>>>
>>>> What?!
>>>> How can PAGE_OFFSET be 0??
>>>> Can you show us these two printk() you just added?
>>>>
>>>> And, the size of kcore is not the crazy number in the subject...
>>>> This one is much saner..
>>> Sorry, I used the wrong printk. the correct one is:
>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>> #######kcore size 5301604352, PAGE_OFFSET 18446612132314218496,
>>> PAGE_SIZE 4096
>>>
>>
>> %lx should be used.
>>
>> also you compiler doesn't like
>>
>> high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
>>
>> in setup.c?
> Sorry fo my poor English, bug what do you mean?
>
> I just printk in the setup.c and the result is
>
> @@@@high_momory ffff88013c000000
So the value printed out is right.
YH
On Sat, Jun 06, 2009 at 03:21:23PM -0700, Yinghai Lu wrote:
>Tao Ma wrote:
>>
>> Yinghai Lu wrote:
>>> Tao Ma wrote:
>>>>
>>>> Amerigo Wang wrote:
>>>>> On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>>>>>> Please send the boot logs: dmesg -s 1000000 > foo
>>>>>> attached.
>>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>>> #######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
>>>>>
>>>>> These two lines must be added by yourself...
>>>>>
>>>>> What?!
>>>>> How can PAGE_OFFSET be 0??
>>>>> Can you show us these two printk() you just added?
>>>>>
>>>>> And, the size of kcore is not the crazy number in the subject...
>>>>> This one is much saner..
>>>> Sorry, I used the wrong printk. the correct one is:
>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>> #######kcore size 5301604352, PAGE_OFFSET 18446612132314218496,
>>>> PAGE_SIZE 4096
>>>>
>>>
>>> %lx should be used.
>>>
>>> also you compiler doesn't like
>>>
>>> high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
>>>
>>> in setup.c?
>> Sorry fo my poor English, bug what do you mean?
>>
>> I just printk in the setup.c and the result is
>>
>> @@@@high_momory ffff88013c000000
>
>so that value print out is right.
Yeah.
Tao, can you reproduce the number mentioned in the subject??
Thanks.
Amerigo Wang wrote:
> On Sat, Jun 06, 2009 at 03:21:23PM -0700, Yinghai Lu wrote:
>> Tao Ma wrote:
>>> Yinghai Lu wrote:
>>>> Tao Ma wrote:
>>>>> Amerigo Wang wrote:
>>>>>> On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>>>>>>> Please send the boot logs: dmesg -s 1000000 > foo
>>>>>>> attached.
>>>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>>>> #######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
>>>>>> These two lines must be added by yourself...
>>>>>>
>>>>>> What?!
>>>>>> How can PAGE_OFFSET be 0??
>>>>>> Can you show us these two printk() you just added?
>>>>>>
>>>>>> And, the size of kcore is not the crazy number in the subject...
>>>>>> This one is much saner..
>>>>> Sorry, I used the wrong printk. the correct one is:
>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>> #######kcore size 5301604352, PAGE_OFFSET 18446612132314218496,
>>>>> PAGE_SIZE 4096
>>>>>
>>>> %lx should be used.
>>>>
>>>> also you compiler doesn't like
>>>>
>>>> high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
>>>>
>>>> in setup.c?
>>> Sorry fo my poor English, bug what do you mean?
>>>
>>> I just printk in the setup.c and the result is
>>>
>>> @@@@high_momory ffff88013c000000
>> so that value print out is right.
>
> Yeah.
>
> Tao, can you reproduce the number mentioned in the subject??
Sorry for the delay.
But the result is the same and I don't think it should be changed by my
printk.
Regards,
Tao
On Mon, Jun 8, 2009 at 2:02 PM, Tao Ma<[email protected]> wrote:
>
>
> Amerigo Wang wrote:
>>
>> On Sat, Jun 06, 2009 at 03:21:23PM -0700, Yinghai Lu wrote:
>>>
>>> Tao Ma wrote:
>>>>
>>>> Yinghai Lu wrote:
>>>>>
>>>>> Tao Ma wrote:
>>>>>>
>>>>>> Amerigo Wang wrote:
>>>>>>>
>>>>>>> On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>>>>>>>>
>>>>>>>>> Please send the boot logs: dmesg -s 1000000 > foo
>>>>>>>>
>>>>>>>> attached.
>>>>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>>>>> #######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
>>>>>>>
>>>>>>> These two lines must be added by yourself...
>>>>>>>
>>>>>>> What?!
>>>>>>> How can PAGE_OFFSET be 0??
>>>>>>> Can you show us these two printk() you just added?
>>>>>>>
>>>>>>> And, the size of kcore is not the crazy number in the subject...
>>>>>>> This one is much saner..
>>>>>>
>>>>>> Sorry, I used the wrong printk. the correct one is:
>>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>>> #######kcore size 5301604352, PAGE_OFFSET 18446612132314218496,
>>>>>> PAGE_SIZE 4096
>>>>>>
>>>>> %lx should be used.
>>>>>
>>>>> also you compiler doesn't like
>>>>>
>>>>> high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
>>>>>
>>>>> in setup.c?
>>>>
>>>> Sorry fo my poor English, bug what do you mean?
>>>>
>>>> I just printk in the setup.c and the result is
>>>>
>>>> @@@@high_momory ffff88013c000000
>>>
>>> so that value print out is right.
>>
>> Yeah.
>>
>> Tao, can you reproduce the number mentioned in the subject??
>
> Sorry for the delay.
>
> But the result is the same
Yes?
Your printk() shows kcore size is: 5301604352, and in your subject it is
281474974617600...
Or did they happen at the same time?
Américo Wang wrote:
> On Mon, Jun 8, 2009 at 2:02 PM, Tao Ma<[email protected]> wrote:
>>
>> Amerigo Wang wrote:
>>> On Sat, Jun 06, 2009 at 03:21:23PM -0700, Yinghai Lu wrote:
>>>> Tao Ma wrote:
>>>>> Yinghai Lu wrote:
>>>>>> Tao Ma wrote:
>>>>>>> Amerigo Wang wrote:
>>>>>>>> On Fri, Jun 05, 2009 at 05:30:49PM +0800, Tao Ma wrote:
>>>>>>>>>> Please send the boot logs: dmesg -s 1000000 > foo
>>>>>>>>> attached.
>>>>>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>>>>>> #######kcore size 5301604352, PAGE_OFFSET 0, PAGE_SIZE 4096
>>>>>>>> These two lines must be added by yourself...
>>>>>>>>
>>>>>>>> What?!
>>>>>>>> How can PAGE_OFFSET be 0??
>>>>>>>> Can you show us these two printk() you just added?
>>>>>>>>
>>>>>>>> And, the size of kcore is not the crazy number in the subject...
>>>>>>>> This one is much saner..
>>>>>>> Sorry, I used the wrong printk. the correct one is:
>>>>>>> #######high memory 18446612137615818752, size_t 18446612137615818752
>>>>>>> #######kcore size 5301604352, PAGE_OFFSET 18446612132314218496,
>>>>>>> PAGE_SIZE 4096
>>>>>>>
>>>>>> %lx should be used.
>>>>>>
>>>>>> also you compiler doesn't like
>>>>>>
>>>>>> high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
>>>>>>
>>>>>> in setup.c?
>>>>> Sorry fo my poor English, bug what do you mean?
>>>>>
>>>>> I just printk in the setup.c and the result is
>>>>>
>>>>> @@@@high_momory ffff88013c000000
>>>> so that value print out is right.
>>> Yeah.
>>>
>>> Tao, can you reproduce the number mentioned in the subject??
>> Sorry for the delay.
>>
>> But the result is the same
>
> Yes?
> Your printk() shows kcore size is: 5301604352, and in your subject it is
> 281474974617600...
>
> Or they happened in the same time?
Yes, the same box and the same Linux version.
A bit strange.
[taoma@ocfs2-test2 ~]$ dmesg|grep "high memory"
high memory ffff88013c000000, size 5301604352
[taoma@ocfs2-test2 ~]$ ll /proc/kcore
-r-------- 1 root root 281474974617600 Jun 8 15:20 /proc/kcore
Regards,
Tao
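It may help to look at the reported file size in hex rather than decimal. The sketch below only does arithmetic on the number shown by ll; the helper name is hypothetical, and the choice of 2^48 as the reference point is an assumption that happens to fit the value.

```c
#include <assert.h>
#include <stdint.h>

/* Distance of a size below 2^48 (the span of a 48-bit address). */
static uint64_t gap_below_2_pow_48(uint64_t size)
{
    const uint64_t limit = UINT64_C(1) << 48;
    assert(size <= limit);
    return limit - size;
}
```

For 281474974617600 (0xffffffe01000) the gap is 0x1ff000, i.e. just under 2 MiB short of 2^48. That does not look like a random corrupted value; it looks like something derived from an address near the top of a 48-bit range.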
On Mon, Jun 8, 2009 at 4:00 PM, Tao Ma<[email protected]> wrote:
>>>
>>> But the result is the same
>>
>> Yes?
>> Your printk() shows kcore size is: 5301604352, and in your subject it is
>> 281474974617600...
>>
>> Or they happened in the same time?
>
> yes. the same box and the same linux version.
> A bit strange.
>
> [taoma@ocfs2-test2 ~]$ dmesg|grep "high memory"
> high memory ffff88013c000000, size 5301604352
> [taoma@ocfs2-test2 ~]$ ll /proc/kcore
> -r-------- 1 root root 281474974617600 Jun 8 15:20 /proc/kcore
Really weird...
They should be the same. This means we have some problem in our procfs.
And we have no problem on i386; I can't even reproduce this on my
x86_64 box...
Drop Cc to x86 people, add some Cc to proc people. :)
Eric, Alexey, any ideas?
Tao, would you like to send us your .config? Thanks.
Américo Wang <[email protected]> writes:
> On Mon, Jun 8, 2009 at 4:00 PM, Tao Ma<[email protected]> wrote:
>>>>
>>>> But the result is the same
>>>
>>> Yes?
>>> Your printk() shows kcore size is: 5301604352, and in your subject it is
>>> 281474974617600...
>>>
>>> Or they happened in the same time?
>>
>> yes. the same box and the same linux version.
>> A bit strange.
>>
>> [taoma@ocfs2-test2 ~]$ dmesg|grep "high memory"
>> high memory ffff88013c000000, size 5301604352
>> [taoma@ocfs2-test2 ~]$ ll /proc/kcore
>> -r-------- 1 root root 281474974617600 Jun 8 15:20 /proc/kcore
>
> Really weird...
> They should be the same. This means we have some problem in our procfs.
>
> And, we have no problem on i386, I, myself, even can't reproduce this on my
> x86_64 box...
>
> Drop Cc to x86 people, add some Cc to proc people. :)
>
> Eric, Alexey, any ideas?
>
> Tao, would you like to send us your .config? Thanks.
Short of some strange patch being applied, I would guess that a nonsense /proc/kcore
size is related to a kernel memory stomp, stepping on the high_memory variable.
Eric
On Mon, Jun 08, 2009 at 09:10:10PM -0700, Eric W. Biederman wrote:
>Américo Wang <[email protected]> writes:
>
>> On Mon, Jun 8, 2009 at 4:00 PM, Tao Ma<[email protected]> wrote:
>>>>>
>>>>> But the result is the same
>>>>
>>>> Yes?
>>>> Your printk() shows kcore size is: 5301604352, and in your subject it is
>>>> 281474974617600...
>>>>
>>>> Or they happened in the same time?
>>>
>>> yes. the same box and the same linux version.
>>> A bit strange.
>>>
>>> [taoma@ocfs2-test2 ~]$ dmesg|grep "high memory"
>>> high memory ffff88013c000000, size 5301604352
>>> [taoma@ocfs2-test2 ~]$ ll /proc/kcore
>>> -r-------- 1 root root 281474974617600 Jun 8 15:20 /proc/kcore
>>
>> Really weird...
>> They should be the same. This means we have some problem in our procfs.
>>
>> And, we have no problem on i386, I, myself, even can't reproduce this on my
>> x86_64 box...
>>
>> Drop Cc to x86 people, add some Cc to proc people. :)
>>
>> Eric, Alexey, any ideas?
>>
>> Tao, would you like to send us your .config? Thanks.
>
>Short of some strange patch applied I would guess that a non-sense /proc/kcore
>size is related to a kernel memory stomp, stepping on the high_memory variable.
Hello, Eric.
I see the problem now. I think the documentation of /proc/kcore
is wrong: the size of kcore can be more than the size of physical
memory, because it also covers the kernel modules area, which
sits above the mapping of physical memory; see arch/x86/mm/init_64.c.
What do you think?
Thanks!
Amerigo Wang <[email protected]> writes:
> On Mon, Jun 08, 2009 at 09:10:10PM -0700, Eric W. Biederman wrote:
>>Américo Wang <[email protected]> writes:
>>
>>> On Mon, Jun 8, 2009 at 4:00 PM, Tao Ma<[email protected]> wrote:
>>>>>>
>>>>>> But the result is the same
>>>>>
>>>>> Yes?
>>>>> Your printk() shows kcore size is: 5301604352, and in your subject it is
>>>>> 281474974617600...
>>>>>
>>>>> Or they happened in the same time?
>>>>
>>>> yes. the same box and the same linux version.
>>>> A bit strange.
>>>>
>>>> [taoma@ocfs2-test2 ~]$ dmesg|grep "high memory"
>>>> high memory ffff88013c000000, size 5301604352
>>>> [taoma@ocfs2-test2 ~]$ ll /proc/kcore
>>>> -r-------- 1 root root 281474974617600 Jun 8 15:20 /proc/kcore
>>>
>>> Really weird...
>>> They should be the same. This means we have some problem in our procfs.
>>>
>>> And, we have no problem on i386, I, myself, even can't reproduce this on my
>>> x86_64 box...
>>>
>>> Drop Cc to x86 people, add some Cc to proc people. :)
>>>
>>> Eric, Alexey, any ideas?
>>>
>>> Tao, would you like to send us your .config? Thanks.
>>
>>Short of some strange patch applied I would guess that a non-sense /proc/kcore
>>size is related to a kernel memory stomp, stepping on the high_memory variable.
>
> Hello, Eric.
>
> I see the problem now, I think the documentation of /proc/kcore
> is wrong, the size of kcore can be more than the size of physical
> memory, because it also contains the info of kernel modules which
> stay above the mapping of phy memory, see arch/x86/mm/init_64.c.
>
> What do you think?
I think that doesn't make any sense.
I was reading the code.
I smell a nasty problem somewhere.
Eric
Hi all,
Sorry for the delay. I have been occupied by other stuff these days.
I just tried again, and the strange thing is that two identical boxes (Dell
OptiPlex 745) running the 2.6.29 kernel give different output. One is normal
and one is wrong. So I am totally puzzled now.
So Eric may be right (there is a memory stomp), but it only shows up sometimes.
Regards,
Tao
Eric W. Biederman wrote:
> Amerigo Wang <[email protected]> writes:
>
>> On Mon, Jun 08, 2009 at 09:10:10PM -0700, Eric W. Biederman wrote:
>>> Américo Wang <[email protected]> writes:
>>>
>>>> On Mon, Jun 8, 2009 at 4:00 PM, Tao Ma<[email protected]> wrote:
>>>>>>> But the result is the same
>>>>>> Yes?
>>>>>> Your printk() shows kcore size is: 5301604352, and in your subject it is
>>>>>> 281474974617600...
>>>>>>
>>>>>> Or they happened in the same time?
>>>>> yes. the same box and the same linux version.
>>>>> A bit strange.
>>>>>
>>>>> [taoma@ocfs2-test2 ~]$ dmesg|grep "high memory"
>>>>> high memory ffff88013c000000, size 5301604352
>>>>> [taoma@ocfs2-test2 ~]$ ll /proc/kcore
>>>>> -r-------- 1 root root 281474974617600 Jun 8 15:20 /proc/kcore
>>>> Really weird...
>>>> They should be the same. This means we have some problem in our procfs.
>>>>
>>>> And, we have no problem on i386, I, myself, even can't reproduce this on my
>>>> x86_64 box...
>>>>
>>>> Drop Cc to x86 people, add some Cc to proc people. :)
>>>>
>>>> Eric, Alexey, any ideas?
>>>>
>>>> Tao, would you like to send us your .config? Thanks.
>>> Short of some strange patch applied I would guess that a non-sense /proc/kcore
>>> size is related to a kernel memory stomp, stepping on the high_memory variable.
>> Hello, Eric.
>>
>> I see the problem now, I think the documentation of /proc/kcore
>> is wrong, the size of kcore can be more than the size of physical
>> memory, because it also contains the info of kernel modules which
>> stay above the mapping of phy memory, see arch/x86/mm/init_64.c.
>>
>> What do you think?
>
> I think that doesn't make any sense.
>
> I was reading the code.
>
> I smell a nasty problem somewhere.
>
> Eric
Fix wrong /proc/kcore size on x86_64.
x86_64 uses the __va() macro to calculate the virtual address passed to
kclist_add(), but decodes it with its own macro kc_vaddr_to_offset(). This
is wrong.
Also, according to Documentation/x86/x86_64/mm.txt, kc_vaddr_to_offset()
is wrong too.
So just remove them, use the generic macro.
BTW, the man page for /proc/kcore is wrong: its size can be more than
the physical memory size, because it also contains the memory areas of
vmalloc(), vsyscall, etc.
Reported-by: Tao Ma <[email protected]>
Signed-off-by: WANG Cong <[email protected]>
Cc: Eric W. Biederman <[email protected]>
---
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index abde308..cdbfd1d 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -163,12 +163,6 @@ extern void cleanup_highmap(void);
 #define PAGE_AGP PAGE_KERNEL_NOCACHE
 #define HAVE_PAGE_AGP 1
 
-/* fs/proc/kcore.c */
-#define kc_vaddr_to_offset(v) ((v) & __VIRTUAL_MASK)
-#define kc_offset_to_vaddr(o) \
- (((o) & (1UL << (__VIRTUAL_MASK_SHIFT - 1))) \
- ? ((o) | ~__VIRTUAL_MASK) \
- : (o))
 
 #define __HAVE_ARCH_PTE_SAME
 #endif /* !__ASSEMBLY__ */
Amerigo Wang <[email protected]> writes:
> Fix wrong /proc/kcore size on x86_64.
How does that change anything?
> x86_64 uses __va() macro to caculate the virtual address passed to kclist_add()
> but decodes it with its own macro kc_vadd_to_offset(). This is wrong.
>
> Also, according to Documentation/x86/x86_64/mm.txt, kc_vaddr_to_offset()
> is wrong too.
>
> So just remove them, use the generic macro.
>
> BTW, the man page for /proc/kcore is wrong, its size can be more than
> the physical memory size, because it also contains memory area of
> vmalloc(), vsyscall etc...
The set of offsets that are usable, sure.
However the size from stat is:
proc_root_kcore->size = (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
Which cannot be different from the physical memory size.
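As a quick sanity check of that formula, here is a minimal sketch in C. The helper name kcore_stat_size is hypothetical. PAGE_OFFSET is assumed to be 0xffff880000000000 (inferred from the dmesg and readelf output elsewhere in this thread) and PAGE_SIZE is assumed to be 4 KiB. Feeding it the high_memory value from Tao's dmesg line (ffff88013c000000) gives 5301604352, the size his printk reported.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the kernel headers:
 * PAGE_OFFSET is inferred from the addresses quoted in this thread,
 * PAGE_SIZE is the usual 4 KiB page on x86_64. */
#define PAGE_OFFSET 0xffff880000000000ULL
#define PAGE_SIZE   0x1000ULL

/* Hypothetical helper mirroring the expression
 *   (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE
 * from proc_kcore_init(). */
uint64_t kcore_stat_size(uint64_t high_memory)
{
    /* high_memory lives in the direct map, which starts at PAGE_OFFSET */
    assert(high_memory >= PAGE_OFFSET);
    return high_memory - PAGE_OFFSET + PAGE_SIZE;
}
```

So this value is just the direct-mapped RAM plus one page; none of the regions above the direct map enter into it.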
> Reported-by: Tao Ma <[email protected]>
> Signed-off-by: WANG Cong <[email protected]>
> Cc: Eric W. Biederman <[email protected]>
>
> ---
> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
> index abde308..cdbfd1d 100644
> --- a/arch/x86/include/asm/pgtable_64.h
> +++ b/arch/x86/include/asm/pgtable_64.h
> @@ -163,12 +163,6 @@ extern void cleanup_highmap(void);
> #define PAGE_AGP PAGE_KERNEL_NOCACHE
> #define HAVE_PAGE_AGP 1
>
> -/* fs/proc/kcore.c */
> -#define kc_vaddr_to_offset(v) ((v) & __VIRTUAL_MASK)
> -#define kc_offset_to_vaddr(o) \
> - (((o) & (1UL << (__VIRTUAL_MASK_SHIFT - 1))) \
> - ? ((o) | ~__VIRTUAL_MASK) \
> - : (o))
>
> #define __HAVE_ARCH_PTE_SAME
> #endif /* !__ASSEMBLY__ */
On Fri, Jun 12, 2009 at 09:20:50PM -0700, Eric W. Biederman wrote:
>Amerigo Wang <[email protected]> writes:
>
>> Fix wrong /proc/kcore size on x86_64.
>
>How does that change anything?
Please check the description below.
>
>> x86_64 uses __va() macro to caculate the virtual address passed to kclist_add()
>> but decodes it with its own macro kc_vadd_to_offset(). This is wrong.
>>
>> Also, according to Documentation/x86/x86_64/mm.txt, kc_vaddr_to_offset()
>> is wrong too.
>>
>> So just remove them, use the generic macro.
>>
>> BTW, the man page for /proc/kcore is wrong, its size can be more than
>> the physical memory size, because it also contains memory area of
>> vmalloc(), vsyscall etc...
>
>The set of offsets that are usable sure.
We have generic kc_vaddr_to_offset() etc. in fs/proc/kcore.c.
>
>However the size from stat is:
> proc_root_kcore->size = (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
>
>Which can not be different than the physical memory size.
I never said they are different; of course they are the same. But what Tao
reported is the wrong size after a read operation. Please try the following:
#ls -l /proc/kcore
#readelf -l /proc/kcore
#ls -l /proc/kcore
You will find that the *second* 'ls -l /proc/kcore' reports a size much
larger than the physical memory size.
And you will notice the difference once this patch is applied.
Hi Amerigo,
I just patched my kernel and tested.
The bad news is that although the number has changed, it isn't right
either.
Here is the output.
[root@test3 ~]# ls -l /proc/kcore
-r-------- 1 root root 131941393240064 Jun 15 13:39 /proc/kcore
But your patch does change something. I just tried your commands on
another box, which shows the right value after a reboot. The result is:
[root@test8 ~]# ls -l /proc/kcore
-r-------- 1 root root 5301604352 Jun 15 13:35 /proc/kcore
[root@test8 ~]# readelf -l /proc/kcore
Elf file type is CORE (Core file)
Entry point 0x0
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000190 0x0000000000000000 0x0000000000000000
                 0x00000000000008bc 0x0000000000000000         0
  LOAD           0x000077ffff601000 0xffffffffff600000 0x0000000000000000
                 0x0000000000800000 0x0000000000800000  RWE    1000
  LOAD           0x000077ffa0001000 0xffffffffa0000000 0x0000000000000000
                 0x000000005f000000 0x000000005f000000  RWE    1000
  LOAD           0x000077ff8200a000 0xffffffff82009000 0x0000000000000000
                 0x00000000006ceb50 0x00000000006ceb50  RWE    1000
  LOAD           0x00003a0000001000 0xffffc20000000000 0x0000000000000000
                 0x00001fffffffffff 0x00001fffffffffff  RWE    1000
  LOAD           0x0000000000001000 0xffff880000000000 0x0000000000000000
                 0x000000013c000000 0x000000013c000000  RWE    1000
[root@test8 ~]# ls -l /proc/kcore
-r-------- 1 root root 131941393240064 Jun 15 13:35 /proc/kcore
So you see, the second "ls -l" will show the wrong value.
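The number from the second "ls -l" can be reproduced from the readelf output above. What follows is only a rough sketch in C, under several assumptions: that the size reported after a read is the highest segment end converted to a file offset plus a 0x1000-byte header area (a simplified model of get_kcore_size() in fs/proc/kcore.c, not the kernel code itself), that the generic conversion is vaddr - PAGE_OFFSET, and that PAGE_OFFSET is 0xffff880000000000. The struct, the function names and the region labels in the comments are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_OFFSET 0xffff880000000000ULL /* assumed, see lead-in */
#define ELF_BUFLEN  0x1000ULL             /* assumed page-aligned header area */

struct seg {
    uint64_t vaddr;
    uint64_t size;
};

/* LOAD segments copied from the readelf output; the region labels are
 * a guess based on the addresses. */
const struct seg segs[] = {
    { 0xffffffffff600000ULL, 0x0000000000800000ULL }, /* vsyscall */
    { 0xffffffffa0000000ULL, 0x000000005f000000ULL }, /* modules */
    { 0xffffffff82009000ULL, 0x00000000006ceb50ULL }, /* kernel image */
    { 0xffffc20000000000ULL, 0x00001fffffffffffULL }, /* vmalloc */
    { 0xffff880000000000ULL, 0x000000013c000000ULL }, /* direct map */
};
#define NSEGS (sizeof(segs) / sizeof(segs[0]))

/* Generic conversion: distance of the address above PAGE_OFFSET. */
uint64_t generic_vaddr_to_offset(uint64_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET);
    return vaddr - PAGE_OFFSET;
}

/* Largest end offset over all segments, plus the header area. */
uint64_t model_kcore_size(const struct seg *s, size_t n)
{
    uint64_t size = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t end = generic_vaddr_to_offset(s[i].vaddr + s[i].size);
        if (end > size)
            size = end;
    }
    return size + ELF_BUFLEN;
}
```

With these inputs the vsyscall segment ends highest, and the model yields 131941393240064, the value shown by the second "ls -l".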
Regards,
Tao
Amerigo Wang wrote:
> On Fri, Jun 12, 2009 at 09:20:50PM -0700, Eric W. Biederman wrote:
>> Amerigo Wang <[email protected]> writes:
>>
>>> Fix wrong /proc/kcore size on x86_64.
>> How does that change anything?
>
> Please check the description below.
>
>>> x86_64 uses __va() macro to caculate the virtual address passed to kclist_add()
>>> but decodes it with its own macro kc_vadd_to_offset(). This is wrong.
>>>
>>> Also, according to Documentation/x86/x86_64/mm.txt, kc_vaddr_to_offset()
>>> is wrong too.
>>>
>>> So just remove them, use the generic macro.
>>>
>>> BTW, the man page for /proc/kcore is wrong, its size can be more than
>>> the physical memory size, because it also contains memory area of
>>> vmalloc(), vsyscall etc...
>> The set of offsets that are usable sure.
>
> We have generic kc_vaddr_to_offset() etc. in fs/proc/kcore.c.
>
>
>> However the size from stat is:
>> proc_root_kcore->size = (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
>>
>> Which can not be different than the physical memory size.
>
> I never say this is not different, of course they are same, but what Tao
> reported is the wrong size after a read operation, please try the following:
>
> #ls -l /proc/kcore
> #readelf -l /proc/kcore
> #ls -l /proc/kcore
>
> You will find the *second* 'ls -l /proc/kcore' reports a size much more
> than the physical mem size.
>
> And you will notice the difference of it after this patch applied.
>
>
On Mon, Jun 15, 2009 at 01:59:08PM +0800, Tao Ma wrote:
> Hi Amerigo,
> Just patched my kernel and tested.
> The bad news is that although the number is changed, but it isn't right
> either.
Thanks for testing.
What do you mean by saying it isn't right? Do you think it is wrong only
because it is more than the physical memory size?
Again, the documentation of /proc/kcore is wrong; it _can_ be more than the
physical memory size.
Regards.
Hi Amerigo,
The wrong number I mean is 131941393240064.
So do you think
[root@test3 ~]# ls -l /proc/kcore
-r-------- 1 root root 131941393240064 Jun 15 13:39 /proc/kcore
is better than
[taoma@test2 ~]$ ll /proc/kcore
-r-------- 1 root root 281474974617600 Jun 15 15:20 /proc/kcore
?
I don't think so.
Actually the right result should look like
[root@test8 ~]# ls -l /proc/kcore
-r-------- 1 root root 5301604352 Jun 15 13:35 /proc/kcore
And with your patch I can't get this number.
Regards,
Tao
Amerigo Wang wrote:
> On Mon, Jun 15, 2009 at 01:59:08PM +0800, Tao Ma wrote:
>> Hi Amerigo,
>> Just patched my kernel and tested.
>> The bad news is that although the number is changed, but it isn't right
>> either.
>
> Thanks for testing.
>
> What do you mean by saying it isn't right? You think it is wrong only because
> it is more than phy mem size?
>
> Again, the document of /proc/kcore is wrong, it _can_ be more than phy mem size.
>
> Regards.
>
Please don't top-post.
On Mon, Jun 15, 2009 at 04:34:27PM +0800, Tao Ma wrote:
> Hi Amerigo,
>
> The wrong number I mean is 131941393240064.
>
> So do you think
> [root@test3 ~]# ls -l /proc/kcore
> -r-------- 1 root root 131941393240064 Jun 15 13:39 /proc/kcore
>
> is better than
>
> [taoma@test2 ~]$ ll /proc/kcore
> -r-------- 1 root root 281474974617600 Jun 15 15:20 /proc/kcore
> ?
Yes, the former *is* what I would expect.
>
> I don't think so.
>
> Actually the right result should look like
>
> [root@test8 ~]# ls -l /proc/kcore
> -r-------- 1 root root 5301604352 Jun 15 13:35 /proc/kcore
>
> And with your patch I can't get this number.
Of course not.
Again, kernel modules and vsyscall are also included in kcore. Unless
doing this is wrong, you will never get the number you mentioned above,
because they sit above the physical memory map on x86_64.
Please read the code; I don't want to explain this again and again.
Amerigo Wang <[email protected]> writes:
> Fix wrong /proc/kcore size on x86_64.
>
> x86_64 uses __va() macro to caculate the virtual address passed to kclist_add()
> but decodes it with its own macro kc_vadd_to_offset(). This is wrong.
Ok. I finally understand what is going on here, and no, kc_vaddr_to_offset
is not wrong when applied to a virtual address. In fact I expect the current
definition makes things a bit more predictable.
And yes, kclist_add must be given a virtual address.
> Also, according to Documentation/x86/x86_64/mm.txt, kc_vaddr_to_offset()
> is wrong too.
How so? The file offset is a number space that is different from both
physical and virtual addresses.
> So just remove them, use the generic macro.
I think a case can be made either way. In practice neither answer
gives us a dense offset space on x86_64 so I think I prefer the
current definition which sets or clears the high bits as opposed
to something that mangles the address more.
> BTW, the man page for /proc/kcore is wrong, its size can be more than
> the physical memory size, because it also contains memory area of
> vmalloc(), vsyscall etc...
Yes, the man page is wrong. The kcore code is also misleading as it
uses two entirely different definitions of size (aka the maximum
offset accepted).
It uses get_kcore_size and (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
The second definition being bogus as it has nothing to do with which
offsets are accepted.
Eric
Tao Ma <[email protected]> writes:
> Hi Amerigo,
>
> The wrong number I mean is 131941393240064.
>
> So do you think
> [root@test3 ~]# ls -l /proc/kcore
> -r-------- 1 root root 131941393240064 Jun 15 13:39 /proc/kcore
>
> is better than
>
> [taoma@test2 ~]$ ll /proc/kcore
> -r-------- 1 root root 281474974617600 Jun 15 15:20 /proc/kcore
> ?
>
> I don't think so.
>
> Actually the right result should look like
>
> [root@test8 ~]# ls -l /proc/kcore
> -r-------- 1 root root 5301604352 Jun 15 13:35 /proc/kcore
>
> And with your patch I can't get this number.
Actually that value is the bug. It has absolutely nothing
to do with the offsets that are valid within /proc/kcore.
Why do you prefer the smaller number?
Eric
[email protected] wrote:
> Tao Ma <[email protected]> writes:
>
>
>> Hi Amerigo,
>>
>> The wrong number I mean is 131941393240064.
>>
>> So do you think
>> [root@test3 ~]# ls -l /proc/kcore
>> -r-------- 1 root root 131941393240064 Jun 15 13:39 /proc/kcore
>>
>> is better than
>>
>> [taoma@test2 ~]$ ll /proc/kcore
>> -r-------- 1 root root 281474974617600 Jun 15 15:20 /proc/kcore
>> ?
>>
>> I don't think so.
>>
>> Actually the right result should look like
>>
>> [root@test8 ~]# ls -l /proc/kcore
>> -r-------- 1 root root 5301604352 Jun 15 13:35 /proc/kcore
>>
>> And with your patch I can't get this number.
>>
>
> Actually that value is the bug. It has absolutely nothing
> to do with the offsets that are valid within /proc/kcore.
>
> Why do you prefer the smaller number?
>
Amerigo said in the previous e-mail that "the man page for /proc/kcore
is wrong, its size can be more than the physical memory size, because it
also contains memory area of vmalloc(), vsyscall etc..."
I have 4G of memory, and 5301604352 is just a bit larger than 4G and looks
sane. So I mistakenly assumed that this number was right.
But if it is also a bug, I am willing to test any new patch. ;)
Regards,
Tao
TaoMa <[email protected]> writes:
> [email protected] wrote:
>> Tao Ma <[email protected]> writes:
>>
>>
>>> Hi Amerigo,
>>>
>>> The wrong number I mean is 131941393240064.
>>>
>>> So do you think
>>> [root@test3 ~]# ls -l /proc/kcore
>>> -r-------- 1 root root 131941393240064 Jun 15 13:39 /proc/kcore
>>>
>>> is better than
>>>
>>> [taoma@test2 ~]$ ll /proc/kcore
>>> -r-------- 1 root root 281474974617600 Jun 15 15:20 /proc/kcore
>>> ?
>>>
>>> I don't think so.
>>>
>>> Actually the right result should look like
>>>
>>> [root@test8 ~]# ls -l /proc/kcore
>>> -r-------- 1 root root 5301604352 Jun 15 13:35 /proc/kcore
>>>
>>> And with your patch I can't get this number.
>>>
>>
>> Actually that value is the bug. It has absolutely nothing
>> to do with the offsets that are valid within /proc/kcore.
>>
>> Why do you prefer the smaller number?
>>
> Amerigo said in the previous e-mail that " the man page for/proc/kcore is wrong,
> its size can be more than the physical memory size, because it also contains
> memory area of vmalloc(), vsyscall etc..."
>
> I have 4G memory, and 5301604352 is just a bit larger than 4G and looks sane. So
> I misunderstand that this number is right.
It should also include the 32 tebibyte range we have for vmalloc. So
a completely dense encoding would be a bit larger than 35184372088832
bytes. You can see that range in your readelf -l output.
Since the encoding is not dense, the size actually comes to about 256 TiB,
or roughly 281474976710656 bytes.
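To make that arithmetic concrete, here is a small sketch in C. It assumes the x86_64 conversion masks the address with a 48-bit __VIRTUAL_MASK (an assumption, chosen because it reproduces the number reported at the start of this thread), a 0x1000-byte header area in front of the data, and that the highest mapped region is the vsyscall mapping seen in the readelf output (0xffffffffff600000, length 0x800000). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTUAL_MASK_SHIFT 48 /* assumed, see lead-in */
#define VIRTUAL_MASK       ((1ULL << VIRTUAL_MASK_SHIFT) - 1)
#define ELF_BUFLEN         0x1000ULL /* assumed page-aligned header area */

/* Mask-based conversion: keep only the low 48 bits of the address. */
uint64_t masked_vaddr_to_offset(uint64_t vaddr)
{
    return vaddr & VIRTUAL_MASK;
}

/* File size implied by one region: offset of its end, plus headers. */
uint64_t masked_kcore_size(uint64_t vaddr, uint64_t len)
{
    assert(len <= UINT64_MAX - vaddr); /* end address must not wrap */
    return masked_vaddr_to_offset(vaddr + len) + ELF_BUFLEN;
}
```

For the vsyscall region this gives 281474974617600, which is 2^48 (256 TiB) minus about 2 MiB. The 32 TiB vmalloc range on its own is 2^45 = 35184372088832 bytes.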
> But if it is also a bug, I am willing to test any of the new patch. ;)
Not in the sense that anything could go wrong. Merely in the sense that
we have a contradictory definition. Which causes loads of confusion.
I am wondering if this difference in definition has caused any
applications to fail, or if this just started out as an
observation of an anomaly?
Eric
[email protected] wrote:
> TaoMa <[email protected]> writes:
>
>> [email protected] wrote:
>>> Tao Ma <[email protected]> writes:
>>>
>>>
>>>> Hi Amerigo,
>>>>
>>>> The wrong number I mean is 131941393240064.
>>>>
>>>> So do you think
>>>> [root@test3 ~]# ls -l /proc/kcore
>>>> -r-------- 1 root root 131941393240064 Jun 15 13:39 /proc/kcore
>>>>
>>>> is better than
>>>>
>>>> [taoma@test2 ~]$ ll /proc/kcore
>>>> -r-------- 1 root root 281474974617600 Jun 15 15:20 /proc/kcore
>>>> ?
>>>>
>>>> I don't think so.
>>>>
>>>> Actually the right result should look like
>>>>
>>>> [root@test8 ~]# ls -l /proc/kcore
>>>> -r-------- 1 root root 5301604352 Jun 15 13:35 /proc/kcore
>>>>
>>>> And with your patch I can't get this number.
>>>>
>>> Actually that value is the bug. It has absolutely nothing
>>> to do with the offsets that are valid within /proc/kcore.
>>>
>>> Why do you prefer the smaller number?
>>>
>> Amerigo said in the previous e-mail that " the man page for/proc/kcore is wrong,
>> its size can be more than the physical memory size, because it also contains
>> memory area of vmalloc(), vsyscall etc..."
>>
>> I have 4G memory, and 5301604352 is just a bit larger than 4G and looks sane. So
>> I misunderstand that this number is right.
>
> It should also include the 32 Tebibyte range we have for vmalloc. So
> a completely dense encoding would be a bit larger than 35184372088832
> bytes. You can see that range in your readelf -l output.
>
> Since the encoding is not dense the size actually comes to. 256TiB.
> Or roughly 281474976710656 bytes.
>
>> But if it is also a bug, I am willing to test any of the new patch. ;)
>
> Not in the sense that anything could go wrong. Merely in the sense that
> we have a contradictory definition. Which causes loads of confusion.
>
> I am wondering if this difference in definition has caused any
> problems applications to fail or if this just started out as an
> observation of an anomaly?
I first noticed it when my el5 box refused to start kdump service and
kexec said something like "Can't find kernel text map area from kcore".
And then I found this number which looked a bit strange.
I also have another x86 box, where "ls -l /proc/kcore" shows:
-r-------- 1 root root 939528192 Jun 16 10:01 /proc/kcore
So I thought this may be a bug and started this thread.
Anyway, I later found that kexec's problem isn't related to this issue.
So maybe we can leave it as-is.
regards,
Tao
On Mon, Jun 15, 2009 at 6:08 PM, Eric W. Biederman<[email protected]> wrote:
> Amerigo Wang <[email protected]> writes:
>
>> Fix wrong /proc/kcore size on x86_64.
>>
>> x86_64 uses __va() macro to caculate the virtual address passed to kclist_add()
>> but decodes it with its own macro kc_vadd_to_offset(). This is wrong.
>
> Ok. I finally understand what is going on here, and no kc_vaddr_to_offset
> is not wrong when applied to a virtual address. In fact I expect the current
> definition makes things a bit more predictable.
>
> And yes kclist_add is must be given a virtual address
>
>> Also, according to Documentation/x86/x86_64/mm.txt, kc_vaddr_to_offset()
>> is wrong too.
>
> How so? The file offset is a number space that is different from both
> physical and virtual addresses.
Why? They _do_ have some calculated relations.
>
>> So just remove them, use the generic macro.
>
> I think a case can be made either way. In practice neither answer
> gives us a dense offset space on x86_64 so I think I prefer the
> current definition which sets or clears the high bits as opposed
> to something that mangles the address more.
>
I am trying to dig more... There must be something wrong there.
>
> It uses get_kcore_size and (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
> The second definition being bogus as it has nothing to do with which
> offsets are accepted.
Agreed. Maybe we can just remove the second one and update the doc?
Américo Wang <[email protected]> writes:
> On Mon, Jun 15, 2009 at 6:08 PM, Eric W. Biederman<[email protected]> wrote:
>> Amerigo Wang <[email protected]> writes:
>>
>>> Fix wrong /proc/kcore size on x86_64.
>>>
>>> x86_64 uses __va() macro to caculate the virtual address passed to kclist_add()
>>> but decodes it with its own macro kc_vadd_to_offset(). This is wrong.
>>
>> Ok. I finally understand what is going on here, and no kc_vaddr_to_offset
>> is not wrong when applied to a virtual address. In fact I expect the current
>> definition makes things a bit more predictable.
>>
>> And yes kclist_add is must be given a virtual address
>>
>>> Also, according to Documentation/x86/x86_64/mm.txt, kc_vaddr_to_offset()
>>> is wrong too.
>>
>> How so? The file offset is a number space that is different from both
>> physical and virtual addresses.
>
> Why? They _do_ have some calculated relations.
Sure. The offset is what you give to read/write. The virtual
addresses are what the kernel uses. In general in a core file they
are only tied together with the elf header. We do something a little
more pragmatic in the kernel.
>>> So just remove them, use the generic macro.
>>
>> I think a case can be made either way. In practice neither answer
>> gives us a dense offset space on x86_64 so I think I prefer the
>> current definition which sets or clears the high bits as opposed
>> to something that mangles the address more.
>>
>
> I am trying to dig more... There must be something wrong there.
How so?
>> It uses get_kcore_size and (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
>> The second definition being bogus as it has nothing to do with which
>> offsets are accepted.
>
> Agreed. Maybe we can just remove the second one and update the doc?
Yes. It isn't critical but reducing confusion is good.
Do you want to cook up the patch for that?
Eric
On Tue, Jun 16, 2009 at 12:27:36PM -0700, Eric W. Biederman wrote:
>Américo Wang <[email protected]> writes:
>>> I think a case can be made either way. In practice neither answer
>>> gives us a dense offset space on x86_64 so I think I prefer the
>>> current definition which sets or clears the high bits as opposed
>>> to something that mangles the address more.
>>>
>>
>> I am trying to dig more... There must be something wrong there.
>
>How so?
See what you will get for kc_vaddr_to_offset(__va(0))?
It is supposed to be 0.
>
>>> It uses get_kcore_size and (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
>>> The second definition being bogus as it has nothing to do with which
>>> offsets are accepted.
>>
>> Agreed. Maybe we can just remove the second one and update the doc?
>
>Yes. It isn't critical but reducing confusion is good.
>Do you want to cook up the patch for that?
Yes, I am cooking a patch set... will send them when ready.
Amerigo Wang <[email protected]> writes:
> On Tue, Jun 16, 2009 at 12:27:36PM -0700, Eric W. Biederman wrote:
>>Américo Wang <[email protected]> writes:
>>>> I think a case can be made either way. In practice neither answer
>>>> gives us a dense offset space on x86_64 so I think I prefer the
>>>> current definition which sets or clears the high bits as opposed
>>>> to something that mangles the address more.
>>>>
>>>
>>> I am trying to dig more... There must be something wrong there.
>>
>>How so?
>
> See what you will get for kc_vaddr_to_offset(__va(0))?
> It is supposed to be 0.
I see: 0x0000880000001000. That extra 0x1000 looks suspicious.
It MUST NOT be 0. That is where the ELF header lives in the file.
> Yes, I am cooking a patch set... will send them when ready.
Then I will leave it to you.
Eric
On Wed, Jun 17, 2009 at 08:37:40PM -0700, Eric W. Biederman wrote:
>Amerigo Wang <[email protected]> writes:
>
>> On Tue, Jun 16, 2009 at 12:27:36PM -0700, Eric W. Biederman wrote:
>>>Américo Wang <[email protected]> writes:
>>>>> I think a case can be made either way. In practice neither answer
>>>>> gives us a dense offset space on x86_64 so I think I prefer the
>>>>> current definition which sets or clears the high bits as opposed
>>>>> to something that mangles the address more.
>>>>>
>>>>
>>>> I am trying to dig more... There must be something wrong there.
>>>
>>>How so?
>>
>> See what you will get for kc_vaddr_to_offset(__va(0))?
>> It is supposed to be 0.
>
>I see: 0x0000880000001000 That extra 0x1000 looks suspicous.
Huh? Shouldn't that be 0x0000880000000000?
>
>It MUST NOT be 0. That is where the ELF header lives in the file.
Of course I knew this.
Just read the code:
phdr->p_offset = kc_vaddr_to_offset(m->addr) + dataoff;
So it should be 0, 'dataoff' is there...
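A small sketch in C of that line under both conversions may help here. Assumptions: PAGE_OFFSET is 0xffff880000000000, the x86_64 macro masks with a 48-bit __VIRTUAL_MASK, the generic macro is vaddr - PAGE_OFFSET, and dataoff is 0x1000. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET  0xffff880000000000ULL /* assumed */
#define VIRTUAL_MASK ((1ULL << 48) - 1)    /* assumed 48-bit mask */
#define DATAOFF      0x1000ULL             /* assumed header area size */

/* p_offset with the x86_64-specific (mask) conversion. */
uint64_t p_offset_masked(uint64_t vaddr)
{
    return (vaddr & VIRTUAL_MASK) + DATAOFF;
}

/* p_offset with the generic (subtract PAGE_OFFSET) conversion. */
uint64_t p_offset_generic(uint64_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET);
    return (vaddr - PAGE_OFFSET) + DATAOFF;
}
```

For __va(0), which is PAGE_OFFSET itself, the masked form gives 0x0000880000001000 and the generic form gives 0x1000. In both cases the extra 0x1000 is dataoff, so the data never overlaps the ELF header at file offset 0.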
Amerigo Wang <[email protected]> writes:
> On Wed, Jun 17, 2009 at 08:37:40PM -0700, Eric W. Biederman wrote:
>>Amerigo Wang <[email protected]> writes:
>>
>>> On Tue, Jun 16, 2009 at 12:27:36PM -0700, Eric W. Biederman wrote:
>>>>Américo Wang <[email protected]> writes:
>>>>>> I think a case can be made either way. In practice neither answer
>>>>>> gives us a dense offset space on x86_64 so I think I prefer the
>>>>>> current definition which sets or clears the high bits as opposed
>>>>>> to something that mangles the address more.
>>>>>>
>>>>>
>>>>> I am trying to dig more... There must be something wrong there.
>>>>
>>>>How so?
>>>
>>> See what you will get for kc_vaddr_to_offset(__va(0))?
>>> It is supposed to be 0.
>>
>>I see: 0x0000880000001000 That extra 0x1000 looks suspicous.
>
>
> huh? 0x0000880000000000 not?
>
>>
>>It MUST NOT be 0. That is where the ELF header lives in the file.
>
> Of course I knew this.
>
> Just read the code:
>
> phdr->p_offset = kc_vaddr_to_offset(m->addr) + dataoff;
>
> So it should be 0, 'dataoff' is there...
Sorry. The naming then is horrible. It is really
kc_vaddr_to_something_like_the_offset.
I still don't see the need for a flat offset space.
I can see a real point in having only a single kc_vaddr_to_offset
function, instead of the 3 in existence.
No point in cluttering the whole world with the oddities of the kcore
code. Especially when it should get cleaned up.
My real point earlier is that kc_vaddr_to_offset and
kc_offset_to_vaddr actually on x86_64 aren't broken. They are just
peculiar. There is some small point to their oddities, in that if
something is in the upper half of the address space (like xen) but
below PAGE_OFFSET you have a chance of accessing it with /proc/kcore.
But that is a very minor benefit.
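A rough sketch in C of that round trip, written as functions rather than macros. It assumes a 48-bit __VIRTUAL_MASK; the function names and the example address below PAGE_OFFSET are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTUAL_MASK_SHIFT 48 /* assumed */
#define VIRTUAL_MASK       ((1ULL << VIRTUAL_MASK_SHIFT) - 1)

/* Drop the high (sign-extension) bits of a canonical address. */
uint64_t x86_64_vaddr_to_offset(uint64_t vaddr)
{
    return vaddr & VIRTUAL_MASK;
}

/* Restore the high bits: if bit 47 is set the address belongs to the
 * upper half, so fill bits 48..63 with ones; otherwise leave it alone. */
uint64_t x86_64_offset_to_vaddr(uint64_t off)
{
    assert((off & ~VIRTUAL_MASK) == 0); /* offset must fit in 48 bits */
    if (off & (1ULL << (VIRTUAL_MASK_SHIFT - 1)))
        return off | ~VIRTUAL_MASK;
    return off;
}
```

An upper-half address below PAGE_OFFSET, for example 0xffff800000000000, maps to offset 0x0000800000000000 and back again, whereas a conversion based on subtracting PAGE_OFFSET has no non-negative offset for it.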
Eric
On Wed, Jun 17, 2009 at 10:41:32PM -0700, Eric W. Biederman wrote:
>Amerigo Wang <[email protected]> writes:
>>
>> Of course I knew this.
>>
>> Just read the code:
>>
>> phdr->p_offset = kc_vaddr_to_offset(m->addr) + dataoff;
>>
>> So it should be 0, 'dataoff' is there...
>
>Sorry. The naming then is horrible. It is really
>kc_vaddr_to_something_like_the_offset.
>
>I still don't see the need for a flat offset space.
>
>I can see a real point of only having a single kc_vaddr_to_offset
>function. Instead of the 3 in existence.
>
>No point in cluttering the whole world with the oddities of the kcore
>code. Especially when it should get cleaned up.
>
>My real point earlier is that kc_vaddr_to_offset and
>kc_offset_to_vaddr actually on x86_64 aren't broken. They are just
>peculiar. There is some small point to their oddities, in that if
>something is in the upper half of the address space (like xen) but
>below PAGE_OFFSET you have a chance of accessing it with /proc/kcore.
>But that is a very minor benefit.
It looks like Linus fixed this in commit 9063c61fd5cbd.
So I will only fix the rest.
Signed-off-by: WANG Cong <[email protected]>
Cc: [email protected]
---
diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 59b43a0..eca5201 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -405,9 +405,6 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
static int __init proc_kcore_init(void)
{
proc_root_kcore = proc_create("kcore", S_IRUSR, NULL, &proc_kcore_operations);
- if (proc_root_kcore)
- proc_root_kcore->size =
- (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
return 0;
}
module_init(proc_kcore_init);
---
diff --git a/man5/proc.5 b/man5/proc.5
index ed47f70..e31aae4 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1246,8 +1246,6 @@ kernel
binary, GDB can be used to
examine the current state of any kernel data structures.
-The total length of the file is the size of physical memory (RAM) plus
-4KB.
.TP
.I /proc/kmsg
This file can be used instead of the
Linus fixed the wrong /proc/kcore size problem in commit 9063c61fd5cbd.
But its size still looks insane, since it never equals the size
of physical memory.
Signed-off-by: WANG Cong <[email protected]>
Cc: [email protected]
(Andrew, could you please just cut off the kernel part from below? :)
---
diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 59b43a0..eca5201 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -405,9 +405,6 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
static int __init proc_kcore_init(void)
{
proc_root_kcore = proc_create("kcore", S_IRUSR, NULL, &proc_kcore_operations);
- if (proc_root_kcore)
- proc_root_kcore->size =
- (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
return 0;
}
module_init(proc_kcore_init);
---
diff --git a/man5/proc.5 b/man5/proc.5
index ed47f70..e31aae4 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1246,8 +1246,6 @@ kernel
binary, GDB can be used to
examine the current state of any kernel data structures.
-The total length of the file is the size of physical memory (RAM) plus
-4KB.
.TP
.I /proc/kmsg
This file can be used instead of the
On Tue, 30 Jun 2009 18:08:50 +0800
Amerigo Wang <[email protected]> wrote:
>
> Linus fixes wrong size of /proc/kcore problem in commit 9063c61fd5cbd.
>
> But its size still looks insane, since it never equals to the size
> of physical memory.
Better changelogs, please!
I think that what you're saying is that the stat.st_size field of the
/proc/kcore inode does not equal the amount of physical memory, and
that you think it should do so?
If that is correct then it would be appropriate to explain what value
the stat.st_size field has before the patch and afterwards. Just
calling it "insane" isn't optimal.
> Signed-off-by: WANG Cong <[email protected]>
> Cc: [email protected]
>
> (Andrew, could you please just cut off the kernel part from below? :)
>
> ---
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 59b43a0..eca5201 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -405,9 +405,6 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> static int __init proc_kcore_init(void)
> {
> proc_root_kcore = proc_create("kcore", S_IRUSR, NULL, &proc_kcore_operations);
> - if (proc_root_kcore)
> - proc_root_kcore->size =
> - (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
> return 0;
> }
> module_init(proc_kcore_init);
AFAICT this means that proc_root_kcore->size will remain uninitialised
until a process opens and reads from /proc/kcore. So on initial boot
the `ls' output will presumably show a size of zero, and this will
change once /proc/kcore has been read?
If so, should we run get_kcore_size() in proc_kcore_init(), perhaps?
In fact, do we need to run get_kcore_size() more than once per boot?
AFAICT we only run kclist_add() during bootup, so if proc_kcore_init()
is called at the appropriate time, we can permanently cache its result?
In which case get_kcore_size() and kclist_add() can be marked __init.
Maybe that's all wrong - I didn't look terribly closely.
Andrew Morton <[email protected]> writes:
> On Tue, 30 Jun 2009 18:08:50 +0800
> Amerigo Wang <[email protected]> wrote:
>
>>
>> Linus fixes wrong size of /proc/kcore problem in commit 9063c61fd5cbd.
>>
>> But its size still looks insane, since it never equals to the size
>> of physical memory.
>
> Better changelogs, please!
>
> I think that what you're saying is that the stat.st_size field of the
> /proc/kcore inode does not equal the amount of physical memory, and
> that you think it should do so?
>
> If that is correct then it would be appropriate to explain what value
> the stat.st_size field has before the patch and afterwards. Just
> calling it "insane" isn't optimal.
>
>> Signed-off-by: WANG Cong <[email protected]>
>> Cc: [email protected]
>>
>> (Andrew, could you please just cut off the kernel part from below? :)
>>
>> ---
>> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
>> index 59b43a0..eca5201 100644
>> --- a/fs/proc/kcore.c
>> +++ b/fs/proc/kcore.c
>> @@ -405,9 +405,6 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
>> static int __init proc_kcore_init(void)
>> {
>> proc_root_kcore = proc_create("kcore", S_IRUSR, NULL, &proc_kcore_operations);
>> - if (proc_root_kcore)
>> - proc_root_kcore->size =
>> - (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
>> return 0;
>> }
>> module_init(proc_kcore_init);
>
> AFAICT this means that proc_root_kcore->size will remain uninitialised
> until a process opens and reads from /proc/kcore. So on initial boot
> the `ls' output will presumably show a size of zero, and this will
> change once /proc/kcore has been read?
Which is better than showing a random number of dubious relationship
to the size we normally show. That code is just a maintenance problem.
> If so, should we run get_kcore_size() in proc_kcore_init(), perhaps?
>
> In fact, do we need to run get_kcore_size() more than once per boot?
>
> AFAICT we only run kclist_add() during bootup, so if proc_kcore_init()
> is called at the appropriate time, we can permanently cache its result?
>
> In which case get_kcore_size() and kclist_add() can be marked __init.
>
> Maybe that's all wrong - I didn't look terribly closely.
Memory hot add I expect is the excuse. There is more that could be
done. But this patch is an obvious bit of chipping away nonsense
code.
Eric
On Wed, 01 Jul 2009 16:25:05 -0700 [email protected] (Eric W. Biederman) wrote:
> Andrew Morton <[email protected]> writes:
>
> >> index 59b43a0..eca5201 100644
> >> --- a/fs/proc/kcore.c
> >> +++ b/fs/proc/kcore.c
> >> @@ -405,9 +405,6 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> >> static int __init proc_kcore_init(void)
> >> {
> >> proc_root_kcore = proc_create("kcore", S_IRUSR, NULL, &proc_kcore_operations);
> >> - if (proc_root_kcore)
> >> - proc_root_kcore->size =
> >> - (size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
> >> return 0;
> >> }
> >> module_init(proc_kcore_init);
> >
> > AFAICT this means that proc_root_kcore->size will remain uninitialised
> > until a process opens and reads from /proc/kcore. So on initial boot
> > the `ls' output will presumably show a size of zero, and this will
> > change once /proc/kcore has been read?
>
> Which is better than showing a random number of dubious relationship
> to the size we normally show. That code is just a maintenance problem.
Well it's not just that st_size is wrong before the first read. It's
also wrong after memory hot-add, up until the next read.
> > If so, should we run get_kcore_size() in proc_kcore_init(), perhaps?
> >
> > In fact, do we need to run get_kcore_size() more than once per boot?
> >
> > AFAICT we only run kclist_add() during bootup, so if proc_kcore_init()
> > is called at the appropriate time, we can permanently cache its result?
> >
> > In which case get_kcore_size() and kclist_add() can be marked __init.
> >
> > Maybe that's all wrong - I didn't look terribly closely.
>
> Memory hot add I expect is the excuse. There is more that could be
> done. But this patch is an obvious bit of chipping away nonsense
> code.
We have the infrastructure to get this right, I think:
- run
proc_root_kcore->size = get_kcore_size(...)
within proc_kcore_init()
- register a memory-hotplug notifier and each time memory goes online
or offline, rerun
proc_root_kcore->size = get_kcore_size(...)
- stop running get_kcore_size() within read_kcore().
I suspect that read_kcore() will not behave well if a memory hotplug
operation happens concurrently. But that's a separate problem.
(hopefully cc's some memory-hotplug people)
Or we just leave /proc/kcore's st_size at zero. It's a pretty hopeless
exercise trying to get this "right", as nobody can safely _use_ that
size - it can be wrong as soon as the caller has read from it.
On Wed, 1 Jul 2009 17:12:49 -0700
Andrew Morton <[email protected]> wrote:
> On Wed, 01 Jul 2009 16:25:05 -0700 [email protected] (Eric W. Biederman) wrote:
> > Which is better than showing a random number of dubious relationship
> > to the size we normally show. That code is just a maintenance problem.
>
> Well it's not just that st_size is wrong before the first read. It's
> also wrong after memory hot-add, up until the next read.
>
And I found kclist_add() is not called at memory hotplug...
> > > If so, should we run get_kcore_size() in proc_kcore_init(), perhaps?
> > >
> > > In fact, do we need to run get_kcore_size() more than once per boot?
> > >
> > > AFAICT we only run kclist_add() during bootup, so if proc_kcore_init()
> > > is called at the appropriate time, we can permanently cache its result?
> > >
> > > In which case get_kcore_size() and kclist_add() can be marked __init.
> > >
> > > Maybe that's all wrong - I didn't look terribly closely.
> >
> > Memory hot add I expect is the excuse. There is more that could be
> > done. But this patch is an obvious bit of chipping away nonsense
> > code.
>
> We have the infrastructure to get this right, I think:
>
> - run
>
> proc_root_kcore->size = get_kcore_size(...)
>
> within proc_kcore_init()
>
yes, seems sane.
> - register a memory-hotplug notifier and each time memory goes online
> or offline, rerun
>
> proc_root_kcore->size = get_kcore_size(...)
>
yes. and we need kclist_add() under memory hotplug.
> - stop running get_kcore_size() within read_kcore().
>
> I suspect that read_kcore() will not behave well if a memory hotplug
> operation happens concurrently. But that's a separate problem.
>
> (hopefully cc's some memory-hotplug people)
>
Maybe that's not a problem. I don't think people do memory hotplug while reading
/proc/kcore. (It sounds like modifying a coredump while investigating it.)
Thanks,
-Kame
>
> Or we just leave /proc/kcore's st_size at zero. It's a pretty hopeless
> exercise trying to get this "right", as nobody can safely _use_ that
> size - it can be wrong as soon as the caller has read from it.
>
>
On Wed, Jul 01, 2009 at 02:47:42PM -0700, Andrew Morton wrote:
>On Tue, 30 Jun 2009 18:08:50 +0800
>Amerigo Wang <[email protected]> wrote:
>
>>
>> Linus fixes wrong size of /proc/kcore problem in commit 9063c61fd5cbd.
>>
>> But its size still looks insane, since it never equals to the size
>> of physical memory.
>
>Better changelogs, please!
>
>I think that what you're saying is that the stat.st_size field of the
>/proc/kcore inode does not equal the amount of physical memory, and
>that you think it should do so?
No, it is expected to be more than the amount of physical memory.
>
>If that is correct then it would be appropriate to explain what value
>the stat.st_size field has before the patch and afterwards. Just
>calling it "insane" isn't optimal.
Yup!
My bad, I mentioned this in an earlier email in this thread,
but I forgot to put it here. Sorry about that!
>
>AFAICT this means that proc_root_kcore->size will remain uninitialised
>until a process opens and reads from /proc/kcore. So on initial boot
>the `ls' output will presumably show a size of zero, and this will
>change once /proc/kcore has been read?
Yes, exactly...
>
>If so, should we run get_kcore_size() in proc_kcore_init(), perhaps?
Yes, we can, but I think it is better to let this behave like the rest
of the /proc files.
>
>In fact, do we need to run get_kcore_size() more than once per boot?
>AFAICT we only run kclist_add() during bootup, so if proc_kcore_init()
>is called at the appropriate time, we can permanently cache its result?
>
>In which case get_kcore_size() and kclist_add() can be marked __init.
A quick grep shows kclist_add() can be marked as __init, but I don't
know if anyone will use it in other parts in the future.
I prefer leaving it as it is.
On Thu, 2 Jul 2009 09:41:38 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:
> On Wed, 1 Jul 2009 17:12:49 -0700
> Andrew Morton <[email protected]> wrote:
>
> > On Wed, 01 Jul 2009 16:25:05 -0700 [email protected] (Eric W. Biederman) wrote:
> > > Which is better than showing a random number of dubious relationship
> > > to the size we normally show. That code is just a maintenance problem.
> >
> > Well it's not just that st_size is wrong before the first read. It's
> > also wrong after memory hot-add, up until the next read.
> >
> And I found kclist_add() is not called at memory hotplug...
>
>
> > > > If so, should we run get_kcore_size() in proc_kcore_init(), perhaps?
> > > >
> > > > In fact, do we need to run get_kcore_size() more than once per boot?
> > > >
> > > > AFAICT we only run kclist_add() during bootup, so if proc_kcore_init()
> > > > is called at the appropriate time, we can permanently cache its result?
> > > >
> > > > In which case get_kcore_size() and kclist_add() can be marked __init.
> > > >
> > > > Maybe that's all wrong - I didn't look terribly closely.
> > >
> > > Memory hot add I expect is the excuse. There is more that could be
> > > done. But this patch is an obvious bit of chipping away nonsense
> > > code.
> >
> > We have the infrastructure to get this right, I think:
> >
> > - run
> >
> > proc_root_kcore->size = get_kcore_size(...)
> >
> > within proc_kcore_init()
> >
> yes, seems sane.
>
>
> > - register a memory-hotplug notifier and each time memory goes online
> > or offline, rerun
> >
> > proc_root_kcore->size = get_kcore_size(...)
> >
> yes. and we need kclist_add() under memory hotplug.
>
>
> > - stop running get_kcore_size() within read_kcore().
> >
> > I suspect that read_kcore() will not behave well if a memory hotplug
> > operation happens concurrently. But that's a separate problem.
> >
> > (hopefully cc's some memory-hotplug people)
> >
> Maybe no problem. I don't think people does memory hotplug while he reads
> /proc/kcore. (It sounds like modify coredump while investigating it.)
>
I think I'm about to forget about the above issues. If everyone else
does the same, they won't get addressed. Oh well.
And I still need to decide whether
kcore-fix-proc-kcores-statst_size.patch fixes things up sufficiently
well to justify merging it.
On Fri, 17 Jul 2009 15:29:55 -0700
Andrew Morton <[email protected]> wrote:
> On Thu, 2 Jul 2009 09:41:38 +0900
> KAMEZAWA Hiroyuki <[email protected]> wrote:
> I think I'm about to forget about the above issues. If everyone else
> does the same, they won't get addressed. Oh well.
>
> And I still need to decide whether
> kcore-fix-proc-kcores-statst_size.patch fixes things up sufficiently
> well to justify merging it.
>
Hmm, I have read fs/proc/kcore.c and now feel the following.
- kclist doesn't handle memory holes, so it will never give a "correct" size.
For example, arch/x86/mm/init.c calls kclist_add() as follows:
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
VMALLOC_END-VMALLOC_START);
Wow, extremely big anyway.
- So, yes, the size of /proc/kcore is pointless. Anyway, what's important is
not the "size" but the ELF program headers (phdr) of kcore.
To this patch,
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
BTW, I'd like to look into handling the physical memory ranges for /proc/kcore.
IMHO, a kclist for physical memory is not necessary; that is handled by /proc/iomem.
"kdump" uses this information and it is properly maintained by memory hotplug.
I'd like to try some patches to make kclist_add() for physical memory cleaner,
later.
Thanks,
-Kame
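As an illustration of the point above that the ELF program headers matter more
than st_size, here is a minimal userspace sketch (not taken from this thread).
It walks the program headers of an in-memory ELF64 image and reports how many
PT_LOAD segments there are and the highest file offset they cover. The helper
name scan_load_segments() is hypothetical, and the sketch assumes a
native-endian ELF64 image and the <elf.h> header shipped with glibc.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the PT_LOAD segments of an in-memory ELF64 image and record the
 * largest file offset covered by any of them in *max_end.
 * Returns the number of PT_LOAD segments, or -1 if the buffer is not a
 * usable ELF64 image. Hypothetical helper, not part of fs/proc/kcore.c.
 */
static int scan_load_segments(const unsigned char *buf, size_t len,
                              unsigned long long *max_end)
{
    Elf64_Ehdr eh;
    int nload = 0;

    if (len < sizeof(eh))
        return -1;
    memcpy(&eh, buf, sizeof(eh));
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    *max_end = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = eh.e_phoff + (size_t)i * sizeof(ph);

        /* Refuse program headers that lie outside the buffer. */
        if (off > len || len - off < sizeof(ph))
            return -1;
        memcpy(&ph, buf + off, sizeof(ph));
        if (ph.p_type != PT_LOAD)
            continue;
        nload++;
        if (ph.p_offset + ph.p_filesz > *max_end)
            *max_end = ph.p_offset + ph.p_filesz;
    }
    return nload;
}
```

A reader that works this way only needs the headers to be readable from the
start of the file, so it does not depend on what stat() reports as the size.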
On Tue, 21 Jul 2009 11:09:24 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:
> On Fri, 17 Jul 2009 15:29:55 -0700
> Andrew Morton <[email protected]> wrote:
>
> > On Thu, 2 Jul 2009 09:41:38 +0900
> > KAMEZAWA Hiroyuki <[email protected]> wrote:
> > I think I'm about to forget about the above issues. If everyone else
> > does the same, they won't get addressed. Oh well.
> >
> > And I still need to decide whether
> > kcore-fix-proc-kcores-statst_size.patch fixes things up sufficiently
> > well to justify merging it.
> >
>
> Hmm, I read fs/proc/kcore.c and feel followng, now.
>
> - kclist doesn't handle memory hole, then, it will never be "correct" size.
> For example, arch/x86/mm/init.c calls kclist_add() as following
>
> 715 kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
> 716 VMALLOC_END-VMALLOC_START);
>
> Wow, extremely big anyway.
>
> - Then, yes. Size of /proc/kcode is pointless. Anyway, what's important is
> not "size", but ELF phder of kcore.
>
> To this patch,
> Acked-by: KAMEZAWA Hiroyuki <[email protected]>
>
Ah...BTW, if the size is set to 0,
%objdump -x /proc/kcore
returns immediately because objdump sees a size of 0, but readelf seems to
work well.
Thanks,
-Kame
Now, /proc/kcore is built on kclist information which is constructed at boot.
This kclist includes physical memory range information, but it is not updated at
memory hotplug. Also, this information tends to include big memory holes.
On the other hand, /proc/iomem includes all physical memory information as
"System RAM"; this is updated properly and kdump uses it, IIUC.
(I hope all architectures store the necessary information...)
This patch set tries to build the kclist for physical memory (direct map) from
/proc/iomem info. It's refreshed at open("/proc/kcore",) if necessary.
This is just an RFC. Any comments are welcome.
[1/3] ... clean up kclist handling.
[2/3] ... clean up kclist_add()
[3/3] ... use /proc/iomem information for /proc/kcore.
I can only test x86-64.
Thanks,
-Kame
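For readers unfamiliar with the "System RAM" entries mentioned above, the
following is a small userspace sketch of how one line in the
"start-end : name" layout of /proc/iomem could be recognised. It is only an
illustration: the helper name parse_system_ram() is hypothetical, and the
patches in this series work on the kernel's resource tree rather than on the
text file.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line in the "start-end : name" layout used by /proc/iomem.
 * Returns 1 and fills *start and *end (both inclusive) when the line names
 * "System RAM", and 0 for any other line. Hypothetical userspace helper.
 */
static int parse_system_ram(const char *line,
                            unsigned long long *start,
                            unsigned long long *end)
{
    unsigned long long s, e;
    int consumed = 0;

    /* %n records where the resource name begins; it stays 0 if " : " is missing. */
    if (sscanf(line, " %llx-%llx : %n", &s, &e, &consumed) < 2 || consumed == 0)
        return 0;
    if (strncmp(line + consumed, "System RAM", 10) != 0)
        return 0;
    *start = s;
    *end = e;
    return 1;
}
```

Since both bounds are inclusive, the length of a range in bytes is
end - start + 1.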
From: KAMEZAWA Hiroyuki <[email protected]>
/proc/kcore uses its own list handling code, but it's better to use
the generic list code.
Also, read_kcore() uses "m" to refer to both
- a kcore entry
- a vmalloc entry
which have different types.
This patch renames "m" to "vms" for the vmalloc entry, avoiding confusion.
No changes in logic, just cleanup.
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
fs/proc/kcore.c | 41 ++++++++++++++++++++++-------------------
include/linux/proc_fs.h | 2 +-
2 files changed, 23 insertions(+), 20 deletions(-)
Index: mmotm-2.6.31-Jul16/fs/proc/kcore.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/fs/proc/kcore.c
+++ mmotm-2.6.31-Jul16/fs/proc/kcore.c
@@ -20,6 +20,7 @@
#include <linux/init.h>
#include <asm/uaccess.h>
#include <asm/io.h>
+#include <linux/list.h>
#define CORE_STR "CORE"
@@ -57,7 +58,7 @@ struct memelfnote
void *data;
};
-static struct kcore_list *kclist;
+static LIST_HEAD(kclist_head);
static DEFINE_RWLOCK(kclist_lock);
void
@@ -67,8 +68,7 @@ kclist_add(struct kcore_list *new, void
new->size = size;
write_lock(&kclist_lock);
- new->next = kclist;
- kclist = new;
+ list_add_tail(&new->list, &kclist_head);
write_unlock(&kclist_lock);
}
@@ -80,7 +80,7 @@ static size_t get_kcore_size(int *nphdr,
*nphdr = 1; /* PT_NOTE */
size = 0;
- for (m=kclist; m; m=m->next) {
+ list_for_each_entry(m, &kclist_head, list) {
try = kc_vaddr_to_offset((size_t)m->addr + m->size);
if (try > size)
size = try;
@@ -192,7 +192,7 @@ static void elf_kcore_store_hdr(char *bu
nhdr->p_align = 0;
/* setup ELF PT_LOAD program header for every area */
- for (m=kclist; m; m=m->next) {
+ list_for_each_entry(m, &kclist_head, list) {
phdr = (struct elf_phdr *) bufp;
bufp += sizeof(struct elf_phdr);
offset += sizeof(struct elf_phdr);
@@ -317,7 +317,7 @@ read_kcore(struct file *file, char __use
struct kcore_list *m;
read_lock(&kclist_lock);
- for (m=kclist; m; m=m->next) {
+ list_for_each_entry(m, &kclist_head, list) {
if (start >= m->addr && start < (m->addr+m->size))
break;
}
@@ -328,7 +328,7 @@ read_kcore(struct file *file, char __use
return -EFAULT;
} else if (is_vmalloc_addr((void *)start)) {
char * elf_buf;
- struct vm_struct *m;
+ struct vm_struct *vms;
unsigned long curstart = start;
unsigned long cursize = tsz;
@@ -337,29 +337,32 @@ read_kcore(struct file *file, char __use
return -ENOMEM;
read_lock(&vmlist_lock);
- for (m=vmlist; m && cursize; m=m->next) {
+ for (vms = vmlist; vms && cursize; vms = vms->next) {
unsigned long vmstart;
unsigned long vmsize;
- unsigned long msize = m->size - PAGE_SIZE;
+ unsigned long msize = vms->size - PAGE_SIZE;
+ unsigned long curend, vmend;
- if (((unsigned long)m->addr + msize) <
+ if (((unsigned long)vms->addr + msize) <
curstart)
continue;
- if ((unsigned long)m->addr > (curstart +
+ if ((unsigned long)vms->addr > (curstart +
cursize))
break;
- vmstart = (curstart < (unsigned long)m->addr ?
- (unsigned long)m->addr : curstart);
- if (((unsigned long)m->addr + msize) >
- (curstart + cursize))
- vmsize = curstart + cursize - vmstart;
+ if (curstart < (unsigned long)vms->addr)
+ vmstart = (unsigned long)vms->addr;
else
- vmsize = (unsigned long)m->addr +
- msize - vmstart;
+ vmstart = curstart;
+ curend = curstart + cursize;
+ vmend = (unsigned long)vms->addr + msize;
+ if (vmend > curend)
+ vmsize = curend - vmstart;
+ else
+ vmsize = vmend - vmstart;
curstart = vmstart + vmsize;
cursize -= vmsize;
/* don't dump ioremap'd stuff! (TA) */
- if (m->flags & VM_IOREMAP)
+ if (vms->flags & VM_IOREMAP)
continue;
memcpy(elf_buf + (vmstart - start),
(char *)vmstart, vmsize);
Index: mmotm-2.6.31-Jul16/include/linux/proc_fs.h
===================================================================
--- mmotm-2.6.31-Jul16.orig/include/linux/proc_fs.h
+++ mmotm-2.6.31-Jul16/include/linux/proc_fs.h
@@ -79,7 +79,7 @@ struct proc_dir_entry {
};
struct kcore_list {
- struct kcore_list *next;
+ struct list_head list;
unsigned long addr;
size_t size;
};
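The conversion in the patch above can be tried outside the kernel with a
simplified stand-in for <linux/list.h>. The sketch below is an assumption-laden
userspace re-creation, not kernel code: list_add_tail() and container_of() are
reimplemented by hand, the function max_end() is a hypothetical name, and it
mirrors the loop in get_kcore_size() without the kc_vaddr_to_offset()
translation.

```c
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Insert entry just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Same fields as struct kcore_list after the patch. */
struct kcore_list {
    struct list_head list;
    unsigned long addr;
    size_t size;
};

/* Highest end address over all registered entries, 0 for an empty list. */
static unsigned long max_end(struct list_head *head)
{
    unsigned long best = 0;
    struct list_head *p;

    for (p = head->next; p != head; p = p->next) {
        struct kcore_list *m = container_of(p, struct kcore_list, list);
        unsigned long end = m->addr + m->size;

        if (end > best)
            best = end;
    }
    return best;
}
```

Because the list node is embedded in struct kcore_list, the same helpers work
for any structure that embeds a struct list_head, which is the reason for
preferring the generic list code over a private "next" pointer.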
From: KAMEZAWA Hiroyuki <[email protected]>
Now, kclist_add() only takes a start address and a size as its arguments.
To make the kclist dynamically reconfigurable, it's necessary
to know which kclist entries are for System RAM and which are not.
This patch adds kclist types:
KCORE_RAM
KCORE_VMALLOC
KCORE_TEXT
KCORE_OTHER
The region for KCORE_RAM will be dynamically updated at memory hotplug.
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
arch/ia64/mm/init.c | 7 ++++---
arch/mips/mm/init.c | 7 ++++---
arch/powerpc/mm/init_32.c | 4 ++--
arch/powerpc/mm/init_64.c | 5 +++--
arch/sh/mm/init.c | 4 ++--
arch/x86/mm/init_32.c | 4 ++--
arch/x86/mm/init_64.c | 11 ++++++-----
fs/proc/kcore.c | 3 ++-
include/linux/proc_fs.h | 13 +++++++++++--
9 files changed, 36 insertions(+), 22 deletions(-)
Index: mmotm-2.6.31-Jul16/include/linux/proc_fs.h
===================================================================
--- mmotm-2.6.31-Jul16.orig/include/linux/proc_fs.h
+++ mmotm-2.6.31-Jul16/include/linux/proc_fs.h
@@ -78,10 +78,18 @@ struct proc_dir_entry {
struct list_head pde_openers; /* who did ->open, but not ->release */
};
+enum kcore_type {
+ KCORE_TEXT,
+ KCORE_VMALLOC,
+ KCORE_RAM,
+ KCORE_OTHER,
+};
+
struct kcore_list {
struct list_head list;
unsigned long addr;
size_t size;
+ int type;
};
struct vmcore {
@@ -233,11 +241,12 @@ static inline void dup_mm_exe_file(struc
#endif /* CONFIG_PROC_FS */
#if !defined(CONFIG_PROC_KCORE)
-static inline void kclist_add(struct kcore_list *new, void *addr, size_t size)
+static inline void
+kclist_add(struct kcore_list *new, void *addr, size_t size, int type)
{
}
#else
-extern void kclist_add(struct kcore_list *, void *, size_t);
+extern void kclist_add(struct kcore_list *, void *, size_t, int type);
#endif
union proc_op {
Index: mmotm-2.6.31-Jul16/arch/ia64/mm/init.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/ia64/mm/init.c
+++ mmotm-2.6.31-Jul16/arch/ia64/mm/init.c
@@ -639,9 +639,10 @@ mem_init (void)
high_memory = __va(max_low_pfn * PAGE_SIZE);
- kclist_add(&kcore_mem, __va(0), max_low_pfn * PAGE_SIZE);
- kclist_add(&kcore_vmem, (void *)VMALLOC_START, VMALLOC_END-VMALLOC_START);
- kclist_add(&kcore_kernel, _stext, _end - _stext);
+ kclist_add(&kcore_mem, __va(0), max_low_pfn * PAGE_SIZE, KCORE_RAM);
+ kclist_add(&kcore_vmem, (void *)VMALLOC_START,
+ VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
+ kclist_add(&kcore_kernel, _stext, _end - _stext, KCORE_TEXT);
for_each_online_pgdat(pgdat)
if (pgdat->bdata->node_bootmem_map)
Index: mmotm-2.6.31-Jul16/arch/mips/mm/init.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/mips/mm/init.c
+++ mmotm-2.6.31-Jul16/arch/mips/mm/init.c
@@ -409,11 +409,12 @@ void __init mem_init(void)
if ((unsigned long) &_text > (unsigned long) CKSEG0)
/* The -4 is a hack so that user tools don't have to handle
the overflow. */
- kclist_add(&kcore_kseg0, (void *) CKSEG0, 0x80000000 - 4);
+ kclist_add(&kcore_kseg0, (void *) CKSEG0,
+ 0x80000000 - 4, KCORE_TEXT);
#endif
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT);
+ kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
- VMALLOC_END-VMALLOC_START);
+ VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
printk(KERN_INFO "Memory: %luk/%luk available (%ldk kernel code, "
"%ldk reserved, %ldk data, %ldk init, %ldk highmem)\n",
Index: mmotm-2.6.31-Jul16/arch/powerpc/mm/init_32.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/powerpc/mm/init_32.c
+++ mmotm-2.6.31-Jul16/arch/powerpc/mm/init_32.c
@@ -270,11 +270,11 @@ static int __init setup_kcore(void)
size);
}
- kclist_add(kcore_mem, __va(base), size);
+ kclist_add(kcore_mem, __va(base), size, KCORE_RAM);
}
kclist_add(&kcore_vmem, (void *)VMALLOC_START,
- VMALLOC_END-VMALLOC_START);
+ VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
return 0;
}
Index: mmotm-2.6.31-Jul16/arch/powerpc/mm/init_64.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/powerpc/mm/init_64.c
+++ mmotm-2.6.31-Jul16/arch/powerpc/mm/init_64.c
@@ -128,10 +128,11 @@ static int __init setup_kcore(void)
if (!kcore_mem)
panic("%s: kmalloc failed\n", __func__);
- kclist_add(kcore_mem, __va(base), size);
+ kclist_add(kcore_mem, __va(base), size, KCORE_RAM);
}
- kclist_add(&kcore_vmem, (void *)VMALLOC_START, VMALLOC_END-VMALLOC_START);
+ kclist_add(&kcore_vmem, (void *)VMALLOC_START,
+ VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
return 0;
}
Index: mmotm-2.6.31-Jul16/arch/sh/mm/init.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/sh/mm/init.c
+++ mmotm-2.6.31-Jul16/arch/sh/mm/init.c
@@ -218,9 +218,9 @@ void __init mem_init(void)
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT);
+ kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
- VMALLOC_END - VMALLOC_START);
+ VMALLOC_END - VMALLOC_START, KCORE_VMALLOC);
printk(KERN_INFO "Memory: %luk/%luk available (%dk kernel code, "
"%dk data, %dk init)\n",
Index: mmotm-2.6.31-Jul16/arch/x86/mm/init_32.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/x86/mm/init_32.c
+++ mmotm-2.6.31-Jul16/arch/x86/mm/init_32.c
@@ -886,9 +886,9 @@ void __init mem_init(void)
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT);
+ kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
- VMALLOC_END-VMALLOC_START);
+ VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
printk(KERN_INFO "Memory: %luk/%luk available (%dk kernel code, "
"%dk reserved, %dk data, %dk init, %ldk highmem)\n",
Index: mmotm-2.6.31-Jul16/arch/x86/mm/init_64.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/x86/mm/init_64.c
+++ mmotm-2.6.31-Jul16/arch/x86/mm/init_64.c
@@ -677,13 +677,14 @@ void __init mem_init(void)
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
/* Register memory areas for /proc/kcore */
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT);
+ kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
- VMALLOC_END-VMALLOC_START);
- kclist_add(&kcore_kernel, &_stext, _end - _stext);
- kclist_add(&kcore_modules, (void *)MODULES_VADDR, MODULES_LEN);
+ VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
+ kclist_add(&kcore_kernel, &_stext, _end - _stext, KCORE_TEXT);
+ kclist_add(&kcore_modules, (void *)MODULES_VADDR, MODULES_LEN,
+ KCORE_OTHER);
kclist_add(&kcore_vsyscall, (void *)VSYSCALL_START,
- VSYSCALL_END - VSYSCALL_START);
+ VSYSCALL_END - VSYSCALL_START, KCORE_OTHER);
printk(KERN_INFO "Memory: %luk/%luk available (%ldk kernel code, "
"%ldk absent, %ldk reserved, %ldk data, %ldk init)\n",
Index: mmotm-2.6.31-Jul16/fs/proc/kcore.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/fs/proc/kcore.c
+++ mmotm-2.6.31-Jul16/fs/proc/kcore.c
@@ -62,10 +62,11 @@ static LIST_HEAD(kclist_head);
static DEFINE_RWLOCK(kclist_lock);
void
-kclist_add(struct kcore_list *new, void *addr, size_t size)
+kclist_add(struct kcore_list *new, void *addr, size_t size, int type)
{
new->addr = (unsigned long)addr;
new->size = size;
+ new->type = type;
write_lock(&kclist_lock);
list_add_tail(&new->list, &kclist_head);
From: KAMEZAWA Hiroyuki <[email protected]>
For /proc/kcore, each arch registers its memory ranges with kclist_add().
Usually,
- range of physical memory
- range of vmalloc area
- text, etc...
are registered, but "range of physical memory" has some problems.
It isn't updated at memory hotplug and it tends to include
unnecessary memory holes. Now, /proc/iomem (kernel/resource.c)
includes the required physical memory range information and it's
properly updated at memory hotplug. So it's good to avoid
using our own code (duplicating information) and to rebuild the
kclist for physical memory based on /proc/iomem.
With this, the per-arch kclist_add() for KCORE_RAM can be dropped.
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
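The update scheme described in this changelog (mark the list stale on a
hotplug event, then replace only the KCORE_RAM entries at the next open) can
be sketched in userspace as follows. This is a simplified illustration under
stated assumptions: it uses a fixed-size array instead of a locked list_head
list, struct region_table and update_ram() are hypothetical names, and new
entries are appended at the end whereas the patch splices them at the head.

```c
#include <stddef.h>

enum kcore_type { KCORE_TEXT, KCORE_VMALLOC, KCORE_RAM, KCORE_OTHER };

struct region {
    unsigned long addr;
    size_t size;
    enum kcore_type type;
};

#define MAX_REGIONS 16

struct region_table {
    struct region ents[MAX_REGIONS];
    size_t n;
    int need_update; /* set when memory goes online or offline */
};

/*
 * If an update is pending, drop every KCORE_RAM entry and append the fresh
 * ones, keeping all other entries in order. Returns 0 on success and -1 if
 * the result would not fit, in which case the table is left unchanged.
 */
static int update_ram(struct region_table *t,
                      const struct region *fresh, size_t nfresh)
{
    size_t i, keep = 0;

    if (!t->need_update)
        return 0;
    for (i = 0; i < t->n; i++)
        if (t->ents[i].type != KCORE_RAM)
            keep++;
    if (keep + nfresh > MAX_REGIONS)
        return -1;

    keep = 0;
    for (i = 0; i < t->n; i++)
        if (t->ents[i].type != KCORE_RAM)
            t->ents[keep++] = t->ents[i];
    for (i = 0; i < nfresh; i++)
        t->ents[keep++] = fresh[i];
    t->n = keep;
    t->need_update = 0;
    return 0;
}
```

Deferring the rebuild to the next open keeps the hotplug callback trivial: it
only has to set the flag.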
Index: mmotm-2.6.31-Jul16/fs/proc/kcore.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/fs/proc/kcore.c 2009-07-20 20:44:57.000000000 +0900
+++ mmotm-2.6.31-Jul16/fs/proc/kcore.c 2009-07-20 22:01:52.000000000 +0900
@@ -21,6 +21,9 @@
#include <asm/uaccess.h>
#include <asm/io.h>
#include <linux/list.h>
+#include <linux/ioport.h>
+#include <linux/memory_hotplug.h>
+#include <linux/memory.h>
#define CORE_STR "CORE"
@@ -30,17 +33,6 @@
static struct proc_dir_entry *proc_root_kcore;
-static int open_kcore(struct inode * inode, struct file * filp)
-{
- return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
-}
-
-static ssize_t read_kcore(struct file *, char __user *, size_t, loff_t *);
-
-static const struct file_operations proc_kcore_operations = {
- .read = read_kcore,
- .open = open_kcore,
-};
#ifndef kc_vaddr_to_offset
#define kc_vaddr_to_offset(v) ((v) - PAGE_OFFSET)
@@ -60,6 +52,7 @@
static LIST_HEAD(kclist_head);
static DEFINE_RWLOCK(kclist_lock);
+static int kcore_need_update;
void
kclist_add(struct kcore_list *new, void *addr, size_t size, int type)
@@ -98,6 +91,104 @@
return size + *elf_buflen;
}
+static void free_kclist_ents(struct list_head *head)
+{
+ struct kcore_list *tmp, *pos;
+
+ list_for_each_entry_safe(pos, tmp, head, list) {
+ list_del(&pos->list);
+ kfree(pos);
+ }
+}
+/*
+ * Replace all KCORE_RAM information with passed list.
+ */
+static void __kcore_update_ram(struct list_head *list)
+{
+ struct kcore_list *tmp, *pos;
+ LIST_HEAD(garbage);
+
+ write_lock(&kclist_lock);
+ if (kcore_need_update) {
+ list_for_each_entry_safe(pos, tmp, &kclist_head, list) {
+ if (pos->type == KCORE_RAM)
+ list_move(&pos->list, &garbage);
+ }
+ list_splice(list, &kclist_head);
+ } else
+ list_splice(list, &garbage);
+ kcore_need_update = 0;
+ write_unlock(&kclist_lock);
+
+ free_kclist_ents(&garbage);
+}
+
+
+#ifdef CONFIG_HIGHMEM
+/*
+ * If no highmem, we can assume [0...max_low_pfn) continuous range of memory
+ * because memory hole is not as big as !HIGHMEM case.
+ * (HIGHMEM is special because part of memory is _invisible_ from the kernel.)
+ */
+static int kcore_update_ram(void)
+{
+ LIST_HEAD(head);
+ struct kcore_list *ent;
+ int ret = 0;
+
+ ent = kmalloc(sizeof(*ent), GFP_KERNEL);
+ if (!ent) {
+ ret = -ENOMEM;
+ return ret;
+ }
+ ent->addr = (unsigned long)__va(0);
+ ent->size = max_low_pfn << PAGE_SHIFT;
+ ent->type = KCORE_RAM;
+ list_add(&ent->list, &head);
+ __kcore_update_ram(&head);
+ return ret;
+}
+
+#else /* !CONFIG_HIGHMEM */
+
+static int
+kclist_add_private(unsigned long pfn, unsigned long nr_pages, void *arg)
+{
+ struct list_head *head = (struct list_head *)arg;
+ struct kcore_list *ent;
+
+ ent = kmalloc(sizeof(*ent), GFP_KERNEL);
+ if (!ent)
+ return -ENOMEM;
+ ent->addr = (unsigned long)__va((pfn << PAGE_SHIFT));
+ ent->size = nr_pages << PAGE_SHIFT;
+ ent->type = KCORE_RAM;
+ list_add(&ent->list, head);
+ return 0;
+}
+
+static int kcore_update_ram(void)
+{
+ int nid, ret;
+ unsigned long end_pfn;
+ LIST_HEAD(head);
+
+ /* Not initialized....update now */
+ /* find out "max pfn" */
+ end_pfn = 0;
+ for_each_node_state(nid, N_HIGH_MEMORY)
+ if (end_pfn < node_end_pfn(nid))
+ end_pfn = node_end_pfn(nid);
+ /* scan 0 to max_pfn */
+ ret = walk_memory_resource(0, end_pfn, &head, kclist_add_private);
+ if (ret) {
+ free_kclist_ents(&head);
+ return -ENOMEM;
+ }
+ __kcore_update_ram(&head);
+ return ret;
+}
+#endif /* CONFIG_HIGHMEM */
/*****************************************************************************/
/*
@@ -271,6 +362,11 @@
read_unlock(&kclist_lock);
return 0;
}
+ /* memory hotplug ?? */
+ if (kcore_need_update) {
+ read_unlock(&kclist_lock);
+ return -EBUSY;
+ }
/* trim buflen to not go beyond EOF */
if (buflen > size - *fpos)
@@ -406,9 +502,42 @@
return acc;
}
+static int open_kcore(struct inode * inode, struct file *filp)
+{
+ if (!capable(CAP_SYS_RAWIO))
+ return -EPERM;
+ if (kcore_need_update)
+ kcore_update_ram();
+ return 0;
+}
+
+
+static const struct file_operations proc_kcore_operations = {
+ .read = read_kcore,
+ .open = open_kcore,
+};
+
+/* just remember that we have to update kcore */
+static int __meminit kcore_callback(struct notifier_block *self,
+ unsigned long action, void *arg)
+{
+ switch (action) {
+ case MEM_ONLINE:
+ case MEM_OFFLINE:
+ write_lock(&kclist_lock);
+ kcore_need_update = 1;
+ write_unlock(&kclist_lock);
+ }
+ return NOTIFY_OK;
+}
+
+
static int __init proc_kcore_init(void)
{
proc_root_kcore = proc_create("kcore", S_IRUSR, NULL, &proc_kcore_operations);
+ kcore_update_ram();
+ hotplug_memory_notifier(kcore_callback, 0);
return 0;
}
module_init(proc_kcore_init);
+
Index: mmotm-2.6.31-Jul16/include/linux/ioport.h
===================================================================
--- mmotm-2.6.31-Jul16.orig/include/linux/ioport.h 2009-07-20 20:44:57.000000000 +0900
+++ mmotm-2.6.31-Jul16/include/linux/ioport.h 2009-07-20 20:45:10.000000000 +0900
@@ -186,5 +186,13 @@
extern int iomem_map_sanity_check(resource_size_t addr, unsigned long size);
extern int iomem_is_exclusive(u64 addr);
+/*
+ * Walk through all SYSTEM_RAM which is registered as resource.
+ * arg is (start_pfn, nr_pages, private_arg_pointer)
+ */
+extern int walk_memory_resource(unsigned long start_pfn,
+ unsigned long nr_pages, void *arg,
+ int (*func)(unsigned long, unsigned long, void *));
+
#endif /* __ASSEMBLY__ */
#endif /* _LINUX_IOPORT_H */
Index: mmotm-2.6.31-Jul16/include/linux/memory_hotplug.h
===================================================================
--- mmotm-2.6.31-Jul16.orig/include/linux/memory_hotplug.h 2009-07-20 20:44:57.000000000 +0900
+++ mmotm-2.6.31-Jul16/include/linux/memory_hotplug.h 2009-07-20 20:45:10.000000000 +0900
@@ -191,13 +191,6 @@
#endif /* ! CONFIG_MEMORY_HOTPLUG */
-/*
- * Walk through all memory which is registered as resource.
- * arg is (start_pfn, nr_pages, private_arg_pointer)
- */
-extern int walk_memory_resource(unsigned long start_pfn,
- unsigned long nr_pages, void *arg,
- int (*func)(unsigned long, unsigned long, void *));
#ifdef CONFIG_MEMORY_HOTREMOVE
Index: mmotm-2.6.31-Jul16/kernel/resource.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/kernel/resource.c 2009-07-20 20:44:57.000000000 +0900
+++ mmotm-2.6.31-Jul16/kernel/resource.c 2009-07-20 20:45:10.000000000 +0900
@@ -234,7 +234,7 @@
EXPORT_SYMBOL(release_resource);
-#if defined(CONFIG_MEMORY_HOTPLUG) && !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
+#if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
/*
* Finds the lowest memory reosurce exists within [res->start.res->end)
* the caller must specify res->start, res->end, res->flags.
Index: mmotm-2.6.31-Jul16/arch/ia64/mm/init.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/ia64/mm/init.c 2009-07-20 19:29:53.000000000 +0900
+++ mmotm-2.6.31-Jul16/arch/ia64/mm/init.c 2009-07-20 21:20:24.000000000 +0900
@@ -639,7 +639,6 @@
high_memory = __va(max_low_pfn * PAGE_SIZE);
- kclist_add(&kcore_mem, __va(0), max_low_pfn * PAGE_SIZE, KCORE_RAM);
kclist_add(&kcore_vmem, (void *)VMALLOC_START,
VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
kclist_add(&kcore_kernel, _stext, _end - _stext, KCORE_TEXT);
Index: mmotm-2.6.31-Jul16/arch/mips/mm/init.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/mips/mm/init.c 2009-07-20 19:39:16.000000000 +0900
+++ mmotm-2.6.31-Jul16/arch/mips/mm/init.c 2009-07-20 21:20:55.000000000 +0900
@@ -412,7 +412,6 @@
kclist_add(&kcore_kseg0, (void *) CKSEG0,
0x80000000 - 4, KCORE_TEXT);
#endif
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
Index: mmotm-2.6.31-Jul16/arch/powerpc/mm/init_32.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/powerpc/mm/init_32.c 2009-07-20 19:41:13.000000000 +0900
+++ mmotm-2.6.31-Jul16/arch/powerpc/mm/init_32.c 2009-07-20 21:21:54.000000000 +0900
@@ -249,30 +249,6 @@
static int __init setup_kcore(void)
{
- int i;
-
- for (i = 0; i < lmb.memory.cnt; i++) {
- unsigned long base;
- unsigned long size;
- struct kcore_list *kcore_mem;
-
- base = lmb.memory.region[i].base;
- size = lmb.memory.region[i].size;
-
- kcore_mem = kmalloc(sizeof(struct kcore_list), GFP_ATOMIC);
- if (!kcore_mem)
- panic("%s: kmalloc failed\n", __func__);
-
- /* must stay under 32 bits */
- if ( 0xfffffffful - (unsigned long)__va(base) < size) {
- size = 0xfffffffful - (unsigned long)(__va(base));
- printk(KERN_DEBUG "setup_kcore: restrict size=%lx\n",
- size);
- }
-
- kclist_add(kcore_mem, __va(base), size, KCORE_RAM);
- }
-
kclist_add(&kcore_vmem, (void *)VMALLOC_START,
VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
Index: mmotm-2.6.31-Jul16/arch/powerpc/mm/init_64.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/powerpc/mm/init_64.c 2009-07-20 19:42:06.000000000 +0900
+++ mmotm-2.6.31-Jul16/arch/powerpc/mm/init_64.c 2009-07-20 21:22:20.000000000 +0900
@@ -114,23 +114,6 @@
static int __init setup_kcore(void)
{
- int i;
-
- for (i=0; i < lmb.memory.cnt; i++) {
- unsigned long base, size;
- struct kcore_list *kcore_mem;
-
- base = lmb.memory.region[i].base;
- size = lmb.memory.region[i].size;
-
- /* GFP_ATOMIC to avoid might_sleep warnings during boot */
- kcore_mem = kmalloc(sizeof(struct kcore_list), GFP_ATOMIC);
- if (!kcore_mem)
- panic("%s: kmalloc failed\n", __func__);
-
- kclist_add(kcore_mem, __va(base), size, KCORE_RAM);
- }
-
kclist_add(&kcore_vmem, (void *)VMALLOC_START,
VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
Index: mmotm-2.6.31-Jul16/arch/sh/mm/init.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/sh/mm/init.c 2009-07-20 19:43:19.000000000 +0900
+++ mmotm-2.6.31-Jul16/arch/sh/mm/init.c 2009-07-20 21:22:52.000000000 +0900
@@ -218,7 +218,6 @@
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
VMALLOC_END - VMALLOC_START, KCORE_VMALLOC);
Index: mmotm-2.6.31-Jul16/arch/x86/mm/init_32.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/x86/mm/init_32.c 2009-07-20 19:44:21.000000000 +0900
+++ mmotm-2.6.31-Jul16/arch/x86/mm/init_32.c 2009-07-20 21:23:36.000000000 +0900
@@ -886,7 +886,6 @@
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
Index: mmotm-2.6.31-Jul16/arch/x86/mm/init_64.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/arch/x86/mm/init_64.c 2009-07-20 19:45:45.000000000 +0900
+++ mmotm-2.6.31-Jul16/arch/x86/mm/init_64.c 2009-07-20 21:24:28.000000000 +0900
@@ -677,7 +677,6 @@
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
/* Register memory areas for /proc/kcore */
- kclist_add(&kcore_mem, __va(0), max_low_pfn << PAGE_SHIFT, KCORE_RAM);
kclist_add(&kcore_vmalloc, (void *)VMALLOC_START,
VMALLOC_END-VMALLOC_START, KCORE_VMALLOC);
kclist_add(&kcore_kernel, &_stext, _end - _stext, KCORE_TEXT);
KAMEZAWA Hiroyuki <[email protected]> writes:
> Now, /proc/kcore is built on kclist information which is constructed at boot.
> This kclist includes physical memory range information but is not updated at
> memory hotplug. Also, this information tends to include big memory holes.
>
> On the other hand, /proc/iomem includes all physical memory information as
> "System RAM"; it is updated properly and kdump uses it, IIUC.
> (I hope all architectures store the necessary information...)
>
> This patch tries to build the kclist for physical memory (direct map) from
> /proc/iomem info. It is refreshed at open("/proc/kcore",) if necessary.
>
> This is just an RFC. Any comments are welcome.
>
> [1/3] ... clean up kclist handling.
> [2/3] ... clean up kclist_add()
> [3/3] ... use /proc/iomem information for /proc/kcore.
Great cleanup! Thanks.
The only missing part that we still need is to also include
the kallsyms information, then the core would be even more useful.
-Andi
--
[email protected] -- Speaking for myself only.
On Tue, 21 Jul 2009 13:29:57 +0200
Andi Kleen <[email protected]> wrote:
> KAMEZAWA Hiroyuki <[email protected]> writes:
>
> > Now, /proc/kcore is built on kclist information which is constructed at boot.
> > This kclist includes physical memory range information but is not updated at
> > memory hotplug. Also, this information tends to include big memory holes.
> >
> > On the other hand, /proc/iomem includes all physical memory information as
> > "System RAM"; it is updated properly and kdump uses it, IIUC.
> > (I hope all architectures store the necessary information...)
> >
> > This patch tries to build the kclist for physical memory (direct map) from
> > /proc/iomem info. It is refreshed at open("/proc/kcore",) if necessary.
> >
> > This is just an RFC. Any comments are welcome.
> >
> > [1/3] ... clean up kclist handling.
> > [2/3] ... clean up kclist_add()
> > [3/3] ... use /proc/iomem information for /proc/kcore.
>
> Great cleanup! Thanks.
>
Thank you. I'll review this set again and post v2.
> The only missing part that we still need is to also include
> the kallsyms information, then the core would be even more useful.
>
yes.
Thanks,
-Kame