On Mon, 2009-03-02 at 09:39 +0800, Brian Maly wrote:
> Huang Ying wrote:
> > Hi, Brian,
> >
> > On Mon, 2009-03-02 at 04:13 +0800, Brian Maly wrote:
> >
> > > I was able verify the kernel that does not boot on the MacBook (vanilla
> > > 2.6.29-rc4) does call efi_ioremap() which bails out early returning
> > > NULL. So no remapping happens in this case. I have no idea if
> > > efi_ioremap ever does succeed in mapping any ranges though being I have
> > > no video or console this early in the boot and have to rely on triple
> > > faulting as a means of debugging.
> > >
> >
> > Please attach your dmesg of successful boot, so we can take a look at
> > the EFI memory map.
> >
> > Best Regards,
> > Huang Ying
> >
> This dmesg is from a 2.6.25 kernel which works fine. I can gather
> other debugging info from the booting kernels if needed. But its a
> challenge to debug the bad kernel being efifb is initialized very late
> (so you never even get to the video initialization and cant see any
> logged messages) and since its a MacBook I dont have a real serial
> port for serial console. The efi map is for MacBook has a different
> layout from other EFI systems I have to test on. 2.6.29 kernel works
> on every EFI system I have except MacBook.
It seems that your EFI system has a runtime area that is too big.
EFI: mem44: type=0, attr=0x8000000000000000, range=[0x000000007ff00000-0x0000000080000000) (1MB)
efi_ioremap() can currently map only memory ranges < 400k.
It seems that efi_ioremap() is the bottleneck now. Can we just use
init_memory_mapping() instead of efi_ioremap() for the EFI runtime area?
Yinghai, what is your opinion?
Best Regards,
Huang Ying
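
For reference, the size mismatch above can be checked with simple page
arithmetic. The sketch below is plain userspace C, not kernel code. The
100-page budget is an assumption chosen to match the "< 400k" figure
mentioned above, and all sketch_* names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                      /* 4k pages on x86 */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_MAX_EFI_IO_PAGES 100               /* assumed budget: 100 * 4k = 400k */

/* Number of 4k pages touched by the physical range [start, end). */
static uint64_t sketch_pages_in_range(uint64_t start, uint64_t end)
{
	uint64_t first = start >> SKETCH_PAGE_SHIFT;                        /* round down */
	uint64_t last  = (end + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT; /* round up */

	return last - first;
}

/* Returns 1 if the range fits under the assumed page budget, 0 otherwise. */
static int sketch_fits_efi_ioremap(uint64_t start, uint64_t end)
{
	return sketch_pages_in_range(start, end) <= SKETCH_MAX_EFI_IO_PAGES;
}
```

With the mem44 range [0x7ff00000, 0x80000000) the page count is 256, so
under this assumed budget the range cannot fit, which would be consistent
with efi_ioremap() returning NULL.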
Huang Ying wrote:
> On Mon, 2009-03-02 at 09:39 +0800, Brian Maly wrote:
>> Huang Ying wrote:
>>> Hi, Brian,
>>>
>>> On Mon, 2009-03-02 at 04:13 +0800, Brian Maly wrote:
>>>
>>>> I was able verify the kernel that does not boot on the MacBook (vanilla
>>>> 2.6.29-rc4) does call efi_ioremap() which bails out early returning
>>>> NULL. So no remapping happens in this case. I have no idea if
>>>> efi_ioremap ever does succeed in mapping any ranges though being I have
>>>> no video or console this early in the boot and have to rely on triple
>>>> faulting as a means of debugging.
>>>>
>>> Please attach your dmesg of successful boot, so we can take a look at
>>> the EFI memory map.
>>>
>>> Best Regards,
>>> Huang Ying
>>>
>> This dmesg is from a 2.6.25 kernel which works fine. I can gather
>> other debugging info from the booting kernels if needed. But its a
>> challenge to debug the bad kernel being efifb is initialized very late
>> (so you never even get to the video initialization and cant see any
>> logged messages) and since its a MacBook I dont have a real serial
>> port for serial console. The efi map is for MacBook has a different
>> layout from other EFI systems I have to test on. 2.6.29 kernel works
>> on every EFI system I have except MacBook.
>
> It seems that you have an EFI system which has too big runtime area.
>
> EFI: mem44: type=0, attr=0x8000000000000000, range=[0x000000007ff00000-0x0000000080000000) (1MB)
>
> efi_ioremap() can map only memory range < 400k now.
>
> It seems that efi_ioremap is the bottle net now. Can we just use
> init_memory_mapping() instead of efi_ioremap() for EFI runtime area?
>
> Yinghai, how about your opinion?
Could you call init_memory_mapping() at the place where efi_ioremap() is called?
The problem is: what about 32bit?
YH
On Mon, 2009-03-02 at 10:16 +0800, Yinghai Lu wrote:
> Huang Ying wrote:
> > On Mon, 2009-03-02 at 09:39 +0800, Brian Maly wrote:
> >> Huang Ying wrote:
> >>> Hi, Brian,
> >>>
> >>> On Mon, 2009-03-02 at 04:13 +0800, Brian Maly wrote:
> >>>
> >>>> I was able verify the kernel that does not boot on the MacBook (vanilla
> >>>> 2.6.29-rc4) does call efi_ioremap() which bails out early returning
> >>>> NULL. So no remapping happens in this case. I have no idea if
> >>>> efi_ioremap ever does succeed in mapping any ranges though being I have
> >>>> no video or console this early in the boot and have to rely on triple
> >>>> faulting as a means of debugging.
> >>>>
> >>> Please attach your dmesg of successful boot, so we can take a look at
> >>> the EFI memory map.
> >>>
> >>> Best Regards,
> >>> Huang Ying
> >>>
> >> This dmesg is from a 2.6.25 kernel which works fine. I can gather
> >> other debugging info from the booting kernels if needed. But its a
> >> challenge to debug the bad kernel being efifb is initialized very late
> >> (so you never even get to the video initialization and cant see any
> >> logged messages) and since its a MacBook I dont have a real serial
> >> port for serial console. The efi map is for MacBook has a different
> >> layout from other EFI systems I have to test on. 2.6.29 kernel works
> >> on every EFI system I have except MacBook.
> >
> > It seems that you have an EFI system which has too big runtime area.
> >
> > EFI: mem44: type=0, attr=0x8000000000000000, range=[0x000000007ff00000-0x0000000080000000) (1MB)
> >
> > efi_ioremap() can map only memory range < 400k now.
> >
> > It seems that efi_ioremap is the bottle net now. Can we just use
> > init_memory_mapping() instead of efi_ioremap() for EFI runtime area?
> >
> > Yinghai, how about your opinion?
>
> you could call init_memory_maping() in that efi_ioremap position?
>
> problems is how about 32bit?
efi_ioremap() is defined as ioremap_cache() on 32bit systems; see
arch/x86/include/asm/efi.h.
On 64bit systems, efi_ioremap() can be a wrapper for
init_memory_mapping(). Do you think that is appropriate?
Best Regards,
Huang Ying
Huang Ying wrote:
> On Mon, 2009-03-02 at 10:16 +0800, Yinghai Lu wrote:
>> Huang Ying wrote:
>>> On Mon, 2009-03-02 at 09:39 +0800, Brian Maly wrote:
>>>> Huang Ying wrote:
>>>>> Hi, Brian,
>>>>>
>>>>> On Mon, 2009-03-02 at 04:13 +0800, Brian Maly wrote:
>>>>>
>>>>>> I was able verify the kernel that does not boot on the MacBook (vanilla
>>>>>> 2.6.29-rc4) does call efi_ioremap() which bails out early returning
>>>>>> NULL. So no remapping happens in this case. I have no idea if
>>>>>> efi_ioremap ever does succeed in mapping any ranges though being I have
>>>>>> no video or console this early in the boot and have to rely on triple
>>>>>> faulting as a means of debugging.
>>>>>>
>>>>> Please attach your dmesg of successful boot, so we can take a look at
>>>>> the EFI memory map.
>>>>>
>>>>> Best Regards,
>>>>> Huang Ying
>>>>>
>>>> This dmesg is from a 2.6.25 kernel which works fine. I can gather
>>>> other debugging info from the booting kernels if needed. But its a
>>>> challenge to debug the bad kernel being efifb is initialized very late
>>>> (so you never even get to the video initialization and cant see any
>>>> logged messages) and since its a MacBook I dont have a real serial
>>>> port for serial console. The efi map is for MacBook has a different
>>>> layout from other EFI systems I have to test on. 2.6.29 kernel works
>>>> on every EFI system I have except MacBook.
>>> It seems that you have an EFI system which has too big runtime area.
>>>
>>> EFI: mem44: type=0, attr=0x8000000000000000, range=[0x000000007ff00000-0x0000000080000000) (1MB)
>>>
>>> efi_ioremap() can map only memory range < 400k now.
>>>
>>> It seems that efi_ioremap is the bottle net now. Can we just use
>>> init_memory_mapping() instead of efi_ioremap() for EFI runtime area?
>>>
>>> Yinghai, how about your opinion?
>> you could call init_memory_maping() in that efi_ioremap position?
>>
>> problems is how about 32bit?
>
> efi_ioremap() is defined as ioremap_cache() on 32bit system. As that in
> arch/x86/include/asm/efi.h.
>
> On 64bit system, efi_ioremap() can be a wrapper for
> init_memory_mapping(). Do you think it is appropriate?
So 64bit could use ioremap_cache() too?
That would keep 32bit and 64bit a bit more consistent.
YH
On Mon, 2009-03-02 at 10:32 +0800, Yinghai Lu wrote:
> Huang Ying wrote:
> > On Mon, 2009-03-02 at 10:16 +0800, Yinghai Lu wrote:
> >> Huang Ying wrote:
> >>> On Mon, 2009-03-02 at 09:39 +0800, Brian Maly wrote:
> >>>> Huang Ying wrote:
> >>>>> Hi, Brian,
> >>>>>
> >>>>> On Mon, 2009-03-02 at 04:13 +0800, Brian Maly wrote:
> >>>>>
> >>>>>> I was able verify the kernel that does not boot on the MacBook (vanilla
> >>>>>> 2.6.29-rc4) does call efi_ioremap() which bails out early returning
> >>>>>> NULL. So no remapping happens in this case. I have no idea if
> >>>>>> efi_ioremap ever does succeed in mapping any ranges though being I have
> >>>>>> no video or console this early in the boot and have to rely on triple
> >>>>>> faulting as a means of debugging.
> >>>>>>
> >>>>> Please attach your dmesg of successful boot, so we can take a look at
> >>>>> the EFI memory map.
> >>>>>
> >>>>> Best Regards,
> >>>>> Huang Ying
> >>>>>
> >>>> This dmesg is from a 2.6.25 kernel which works fine. I can gather
> >>>> other debugging info from the booting kernels if needed. But its a
> >>>> challenge to debug the bad kernel being efifb is initialized very late
> >>>> (so you never even get to the video initialization and cant see any
> >>>> logged messages) and since its a MacBook I dont have a real serial
> >>>> port for serial console. The efi map is for MacBook has a different
> >>>> layout from other EFI systems I have to test on. 2.6.29 kernel works
> >>>> on every EFI system I have except MacBook.
> >>> It seems that you have an EFI system which has too big runtime area.
> >>>
> >>> EFI: mem44: type=0, attr=0x8000000000000000, range=[0x000000007ff00000-0x0000000080000000) (1MB)
> >>>
> >>> efi_ioremap() can map only memory range < 400k now.
> >>>
> >>> It seems that efi_ioremap is the bottle net now. Can we just use
> >>> init_memory_mapping() instead of efi_ioremap() for EFI runtime area?
> >>>
> >>> Yinghai, how about your opinion?
> >> you could call init_memory_maping() in that efi_ioremap position?
> >>
> >> problems is how about 32bit?
> >
> > efi_ioremap() is defined as ioremap_cache() on 32bit system. As that in
> > arch/x86/include/asm/efi.h.
> >
> > On 64bit system, efi_ioremap() can be a wrapper for
> > init_memory_mapping(). Do you think it is appropriate?
>
> so 64bit could use ioremap_cache() too?
> we may keep 32bit and 64bit a bit consistent.
If we use ioremap_cache(), the EFI runtime services will not work after
kexec, which needs the EFI runtime memory area to be mapped at exactly the
same location across kexec. I think we should support kexec if possible.
Best Regards,
Huang Ying
On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
>> so 64bit could use ioremap_cache() too?
>> we may keep 32bit and 64bit a bit consistent.
>
> If we use ioremap_cache(), kexec runtime service will not work in kexec
> situation, which needs EFI runtime memory area to be mapped at exact
> same location across kexec. I think we should support kexec if possible.
Sure.
Please don't touch max_low_pfn_mapped, because some ranges below the
EFI run-time code may not be directly mapped.
YH
Looking at an older (working) kernel a bit more, it looks like
efi_ioremap() is not called in efi_enter_virtual_mode() before 2.6.27 on
this hardware, so we never followed the efi_ioremap() code path until
recently. This explains the regression.
Brian
On Mon, 2009-03-02 at 10:57 +0800, Brian Maly wrote:
> In looking at an older (working) kernel a bit more and it looks like
> efi_ioremap() is not called in efi_enter_virtual_mode() pre 2.6.27 on
> this hardware, so we never followed the efi_ioremap codepath up until
> recently. This explains the regression.
Yes. That is why the regression is triggered after 2.6.27.
Best Regards,
Huang Ying
On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
> >> so 64bit could use ioremap_cache() too?
> >> we may keep 32bit and 64bit a bit consistent.
> >
> > If we use ioremap_cache(), kexec runtime service will not work in kexec
> > situation, which needs EFI runtime memory area to be mapped at exact
> > same location across kexec. I think we should support kexec if possible.
>
>
> sure.
>
> please don't touch max_low_pfn_mapped, because some range may not
> directly mapped under those efi run-time code
I found an issue with using init_memory_mapping() here.
If the memory range to be mapped is smaller than 2M, the last mapped
address may be the next 2M-aligned position, which may lead to
overlapping mappings between memory ranges. For example:
0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
The problem is that the memory range 0x3f400000 - 0x3f590000 is left
unmapped!
Two ways to fix the issue:
a) Make init_memory_mapping() stop exactly at the specified end page, even
for memory ranges smaller than 2M.
b) Make init_memory_mapping() map all underlying pages when it splits a
large mapping.
What do you think?
Best Regards,
Huang Ying
Huang Ying wrote:
> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
>>>> so 64bit could use ioremap_cache() too?
>>>> we may keep 32bit and 64bit a bit consistent.
>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
>>> situation, which needs EFI runtime memory area to be mapped at exact
>>> same location across kexec. I think we should support kexec if possible.
>>
>> sure.
>>
>> please don't touch max_low_pfn_mapped, because some range may not
>> directly mapped under those efi run-time code
>
> Find an issue to use init_memory_mapping() here.
>
> If the memory range to be mapped is less than 2M, the last mapped
> address may be next 2M aligned position, this may lead mapping
> overlapping between memory range. Such as:
>
> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
>
> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
> mapped!
what is max_low_pfn_mapped before that?
YH
On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
> Huang Ying wrote:
> > On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
> >> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
> >>>> so 64bit could use ioremap_cache() too?
> >>>> we may keep 32bit and 64bit a bit consistent.
> >>> If we use ioremap_cache(), kexec runtime service will not work in kexec
> >>> situation, which needs EFI runtime memory area to be mapped at exact
> >>> same location across kexec. I think we should support kexec if possible.
> >>
> >> sure.
> >>
> >> please don't touch max_low_pfn_mapped, because some range may not
> >> directly mapped under those efi run-time code
> >
> > Find an issue to use init_memory_mapping() here.
> >
> > If the memory range to be mapped is less than 2M, the last mapped
> > address may be next 2M aligned position, this may lead mapping
> > overlapping between memory range. Such as:
> >
> > 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
> > 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
> >
> > The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
> > mapped!
>
> what is max_low_pfn_mapped before that?
I don't know exactly what you mean. Can you elaborate a little?
0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
init_memory_mapping() properly.
The issue in the example above is that 0x3f400000 ~ 0x3f488000 is a
sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
unmapped.
Best Regards,
Huang Ying
Huang Ying wrote:
> On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
>> Huang Ying wrote:
>>> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
>>>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
>>>>>> so 64bit could use ioremap_cache() too?
>>>>>> we may keep 32bit and 64bit a bit consistent.
>>>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
>>>>> situation, which needs EFI runtime memory area to be mapped at exact
>>>>> same location across kexec. I think we should support kexec if possible.
>>>> sure.
>>>>
>>>> please don't touch max_low_pfn_mapped, because some range may not
>>>> directly mapped under those efi run-time code
>>> Find an issue to use init_memory_mapping() here.
>>>
>>> If the memory range to be mapped is less than 2M, the last mapped
>>> address may be next 2M aligned position, this may lead mapping
>>> overlapping between memory range. Such as:
>>>
>>> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
>>> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
>>>
>>> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
>>> mapped!
>> what is max_low_pfn_mapped before that?
>
> I don't know exactly what you mean. Can you elaborate a little?
>
> 0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
> init_memory_mapping() properly.
>
> The issue of above example is that 0x3f400000 ~ 0x3f488000 is a
> sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
> not mapped.
what is max_low_pfn_mapped?
what does init_memory_mapping() print out?
YH
On Tue, 2009-03-03 at 09:28 +0800, Yinghai Lu wrote:
> Huang Ying wrote:
> > On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
> >> Huang Ying wrote:
> >>> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
> >>>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
> >>>>>> so 64bit could use ioremap_cache() too?
> >>>>>> we may keep 32bit and 64bit a bit consistent.
> >>>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
> >>>>> situation, which needs EFI runtime memory area to be mapped at exact
> >>>>> same location across kexec. I think we should support kexec if possible.
> >>>> sure.
> >>>>
> >>>> please don't touch max_low_pfn_mapped, because some range may not
> >>>> directly mapped under those efi run-time code
> >>> Find an issue to use init_memory_mapping() here.
> >>>
> >>> If the memory range to be mapped is less than 2M, the last mapped
> >>> address may be next 2M aligned position, this may lead mapping
> >>> overlapping between memory range. Such as:
> >>>
> >>> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
> >>> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
> >>>
> >>> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
> >>> mapped!
> >> what is max_low_pfn_mapped before that?
> >
> > I don't know exactly what you mean. Can you elaborate a little?
> >
> > 0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
> > init_memory_mapping() properly.
> >
> > The issue of above example is that 0x3f400000 ~ 0x3f488000 is a
> > sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
> > not mapped.
> what is max_low_pfn_mapped?
>
> what is init_memory_mapping() printout?
This does not come from a real test case. To test the changes I made, I
forced efi_ioremap() to be used even if the corresponding memory range is
below max_low_pfn_mapped. The dmesg of the test is attached to this mail.
The printout of init_memory_mapping shows:
init_memory_mapping: 000000003f488000-000000003f4bb000
last_map_addr: 3f600000 end: 3f4bb000
init_memory_mapping: 000000003f590000-000000003f5bb000
last_map_addr: 3f600000 end: 3f5bb000
init_memory_mapping: 00000000fffb0000-00000000fffba000
last_map_addr: 100000000 end: fffba000
So I think it is possible to have the issue I mentioned above.
Best Regards,
Huang Ying
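
As a cross-check of the printout above, the three reported last_map_addr
values are consistent with the requested end being rounded up to the next
2M boundary. The sketch below only reproduces that arithmetic in userspace
C. It is a description of the observed numbers, not of the kernel
implementation, and the sketch_* names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PMD_SIZE (1ULL << 21)   /* 2M large-page size on x86-64 */

/* Round addr up to the next 2M boundary (addr itself if already aligned). */
static uint64_t sketch_round_up_2m(uint64_t addr)
{
	return (addr + SKETCH_PMD_SIZE - 1) & ~(SKETCH_PMD_SIZE - 1);
}

/* Bytes covered past the requested end if mapping runs to the 2M boundary. */
static uint64_t sketch_overshoot(uint64_t end)
{
	return sketch_round_up_2m(end) - end;
}
```

For end = 0x3f4bb000 and end = 0x3f5bb000 this gives 0x3f600000, and for
end = 0xfffba000 it gives 0x100000000, matching the three last_map_addr
lines above.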
Huang Ying wrote:
> On Tue, 2009-03-03 at 09:28 +0800, Yinghai Lu wrote:
>> Huang Ying wrote:
>>> On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
>>>> Huang Ying wrote:
>>>>> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
>>>>>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
>>>>>>>> so 64bit could use ioremap_cache() too?
>>>>>>>> we may keep 32bit and 64bit a bit consistent.
>>>>>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
>>>>>>> situation, which needs EFI runtime memory area to be mapped at exact
>>>>>>> same location across kexec. I think we should support kexec if possible.
>>>>>> sure.
>>>>>>
>>>>>> please don't touch max_low_pfn_mapped, because some range may not
>>>>>> directly mapped under those efi run-time code
>>>>> Find an issue to use init_memory_mapping() here.
>>>>>
>>>>> If the memory range to be mapped is less than 2M, the last mapped
>>>>> address may be next 2M aligned position, this may lead mapping
>>>>> overlapping between memory range. Such as:
>>>>>
>>>>> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
>>>>> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
>>>>>
>>>>> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
>>>>> mapped!
>>>> what is max_low_pfn_mapped before that?
>>> I don't know exactly what you mean. Can you elaborate a little?
>>>
>>> 0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
>>> init_memory_mapping() properly.
>>>
>>> The issue of above example is that 0x3f400000 ~ 0x3f488000 is a
>>> sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
>>> not mapped.
>> what is max_low_pfn_mapped?
>>
>> what is init_memory_mapping() printout?
>
> This does not comes from a real test case. To test the changes I made, I
> make efi_ioremap() being used even if the corresponding memory range is
> below max_low_pfn_mapped. The dmesg of test is attached with the mail.
>
> The printout of init_memory_mapping shows:
>
> init_memory_mapping: 000000003f488000-000000003f4bb000
> last_map_addr: 3f600000 end: 3f4bb000
> init_memory_mapping: 000000003f590000-000000003f5bb000
> last_map_addr: 3f600000 end: 3f5bb000
init_memory_mapping: 0000000000000000-000000003f700000
last_map_addr: 3f700000 end: 3f700000
(6 early reservations) ==> bootmem [0000000000 - 003f700000]
so max_low_pfn_mapped is (3f700000>>12),
and you are calling init_memory_mapping() again on ranges below it
> init_memory_mapping: 00000000fffb0000-00000000fffba000
> last_map_addr: 100000000 end: fffba000
this one is interesting... it got over-mapped...
>
> So I think it is possible to have the issue I mentioned above.
looks like.
YH
On Tue, 2009-03-03 at 10:53 +0800, Yinghai Lu wrote:
> Huang Ying wrote:
> > On Tue, 2009-03-03 at 09:28 +0800, Yinghai Lu wrote:
> >> Huang Ying wrote:
> >>> On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
> >>>> Huang Ying wrote:
> >>>>> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
> >>>>>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
> >>>>>>>> so 64bit could use ioremap_cache() too?
> >>>>>>>> we may keep 32bit and 64bit a bit consistent.
> >>>>>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
> >>>>>>> situation, which needs EFI runtime memory area to be mapped at exact
> >>>>>>> same location across kexec. I think we should support kexec if possible.
> >>>>>> sure.
> >>>>>>
> >>>>>> please don't touch max_low_pfn_mapped, because some range may not
> >>>>>> directly mapped under those efi run-time code
> >>>>> Find an issue to use init_memory_mapping() here.
> >>>>>
> >>>>> If the memory range to be mapped is less than 2M, the last mapped
> >>>>> address may be next 2M aligned position, this may lead mapping
> >>>>> overlapping between memory range. Such as:
> >>>>>
> >>>>> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
> >>>>> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
> >>>>>
> >>>>> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
> >>>>> mapped!
> >>>> what is max_low_pfn_mapped before that?
> >>> I don't know exactly what you mean. Can you elaborate a little?
> >>>
> >>> 0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
> >>> init_memory_mapping() properly.
> >>>
> >>> The issue of above example is that 0x3f400000 ~ 0x3f488000 is a
> >>> sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
> >>> not mapped.
> >> what is max_low_pfn_mapped?
> >>
> >> what is init_memory_mapping() printout?
> >
> > This does not comes from a real test case. To test the changes I made, I
> > make efi_ioremap() being used even if the corresponding memory range is
> > below max_low_pfn_mapped. The dmesg of test is attached with the mail.
> >
> > The printout of init_memory_mapping shows:
> >
> > init_memory_mapping: 000000003f488000-000000003f4bb000
> > last_map_addr: 3f600000 end: 3f4bb000
> > init_memory_mapping: 000000003f590000-000000003f5bb000
> > last_map_addr: 3f600000 end: 3f5bb000
> init_memory_mapping: 0000000000000000-000000003f700000
>
> last_map_addr: 3f700000 end: 3f700000
>
> (6 early reservations) ==> bootmem [0000000000 - 003f700000]
>
> so max_low_pfn_mapped is (3f700000>>12)
> and you try to init_memory_mapping again before it
Yes. Just for testing, I want to use efi_ioremap() on more memory
ranges.
> > init_memory_mapping: 00000000fffb0000-00000000fffba000
> > last_map_addr: 100000000 end: fffba000
> this one is interesting... got over mapped...
> >
> > So I think it is possible to have the issue I mentioned above.
>
> looks like.
So, if you have no time, I can try to fix that. Do you think
init_memory_mapping() should stop at the specified end page?
Best Regards,
Huang Ying
Huang Ying wrote:
> On Tue, 2009-03-03 at 10:53 +0800, Yinghai Lu wrote:
>> Huang Ying wrote:
>>> On Tue, 2009-03-03 at 09:28 +0800, Yinghai Lu wrote:
>>>> Huang Ying wrote:
>>>>> On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
>>>>>> Huang Ying wrote:
>>>>>>> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
>>>>>>>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
>>>>>>>>>> so 64bit could use ioremap_cache() too?
>>>>>>>>>> we may keep 32bit and 64bit a bit consistent.
>>>>>>>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
>>>>>>>>> situation, which needs EFI runtime memory area to be mapped at exact
>>>>>>>>> same location across kexec. I think we should support kexec if possible.
>>>>>>>> sure.
>>>>>>>>
>>>>>>>> please don't touch max_low_pfn_mapped, because some range may not
>>>>>>>> directly mapped under those efi run-time code
>>>>>>> Find an issue to use init_memory_mapping() here.
>>>>>>>
>>>>>>> If the memory range to be mapped is less than 2M, the last mapped
>>>>>>> address may be next 2M aligned position, this may lead mapping
>>>>>>> overlapping between memory range. Such as:
>>>>>>>
>>>>>>> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
>>>>>>> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
>>>>>>>
>>>>>>> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
>>>>>>> mapped!
>>>>>> what is max_low_pfn_mapped before that?
>>>>> I don't know exactly what you mean. Can you elaborate a little?
>>>>>
>>>>> 0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
>>>>> init_memory_mapping() properly.
>>>>>
>>>>> The issue of above example is that 0x3f400000 ~ 0x3f488000 is a
>>>>> sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
>>>>> not mapped.
>>>> what is max_low_pfn_mapped?
>>>>
>>>> what is init_memory_mapping() printout?
>>> This does not comes from a real test case. To test the changes I made, I
>>> make efi_ioremap() being used even if the corresponding memory range is
>>> below max_low_pfn_mapped. The dmesg of test is attached with the mail.
>>>
>>> The printout of init_memory_mapping shows:
>>>
>>> init_memory_mapping: 000000003f488000-000000003f4bb000
>>> last_map_addr: 3f600000 end: 3f4bb000
>>> init_memory_mapping: 000000003f590000-000000003f5bb000
>>> last_map_addr: 3f600000 end: 3f5bb000
>> init_memory_mapping: 0000000000000000-000000003f700000
>>
>> last_map_addr: 3f700000 end: 3f700000
>>
>> (6 early reservations) ==> bootmem [0000000000 - 003f700000]
>>
>> so max_low_pfn_mapped is (3f700000>>12)
>> and you try to init_memory_mapping again before it
>
> Yes. Just for testing, I want to use efi_ioremap() on more memory range
> to test.
>
>>> init_memory_mapping: 00000000fffb0000-00000000fffba000
>>> last_map_addr: 100000000 end: fffba000
>> this one is interesting... got over mapped...
>>> So I think it is possible to have the issue I mentioned above.
>> looks like.
>
> So, If you have no time, I can try to fix that. Do you think
> init_memory_mapping should stop at specified end page?
You may boot with debug, so you can figure out how init_memory_mapping() wants to map the range.
It should stop at the specified end page, at least on 64bit.
YH
On Tue, 2009-03-03 at 11:57 +0800, Yinghai Lu wrote:
> Huang Ying wrote:
> > On Tue, 2009-03-03 at 10:53 +0800, Yinghai Lu wrote:
> >> Huang Ying wrote:
> >>> On Tue, 2009-03-03 at 09:28 +0800, Yinghai Lu wrote:
> >>>> Huang Ying wrote:
> >>>>> On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
> >>>>>> Huang Ying wrote:
> >>>>>>> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
> >>>>>>>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
> >>>>>>>>>> so 64bit could use ioremap_cache() too?
> >>>>>>>>>> we may keep 32bit and 64bit a bit consistent.
> >>>>>>>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
> >>>>>>>>> situation, which needs EFI runtime memory area to be mapped at exact
> >>>>>>>>> same location across kexec. I think we should support kexec if possible.
> >>>>>>>> sure.
> >>>>>>>>
> >>>>>>>> please don't touch max_low_pfn_mapped, because some range may not
> >>>>>>>> directly mapped under those efi run-time code
> >>>>>>> Find an issue to use init_memory_mapping() here.
> >>>>>>>
> >>>>>>> If the memory range to be mapped is less than 2M, the last mapped
> >>>>>>> address may be next 2M aligned position, this may lead mapping
> >>>>>>> overlapping between memory range. Such as:
> >>>>>>>
> >>>>>>> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
> >>>>>>> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
> >>>>>>>
> >>>>>>> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
> >>>>>>> mapped!
> >>>>>> what is max_low_pfn_mapped before that?
> >>>>> I don't know exactly what you mean. Can you elaborate a little?
> >>>>>
> >>>>> 0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
> >>>>> init_memory_mapping() properly.
> >>>>>
> >>>>> The issue of above example is that 0x3f400000 ~ 0x3f488000 is a
> >>>>> sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
> >>>>> not mapped.
> >>>> what is max_low_pfn_mapped?
> >>>>
> >>>> what is init_memory_mapping() printout?
> >>> This does not comes from a real test case. To test the changes I made, I
> >>> make efi_ioremap() being used even if the corresponding memory range is
> >>> below max_low_pfn_mapped. The dmesg of test is attached with the mail.
> >>>
> >>> The printout of init_memory_mapping shows:
> >>>
> >>> init_memory_mapping: 000000003f488000-000000003f4bb000
> >>> last_map_addr: 3f600000 end: 3f4bb000
> >>> init_memory_mapping: 000000003f590000-000000003f5bb000
> >>> last_map_addr: 3f600000 end: 3f5bb000
> >> init_memory_mapping: 0000000000000000-000000003f700000
> >>
> >> last_map_addr: 3f700000 end: 3f700000
> >>
> >> (6 early reservations) ==> bootmem [0000000000 - 003f700000]
> >>
> >> so max_low_pfn_mapped is (3f700000>>12)
> >> and you try to init_memory_mapping again before it
> >
> > Yes. Just for testing, I want to use efi_ioremap() on more memory range
> > to test.
> >
> >>> init_memory_mapping: 00000000fffb0000-00000000fffba000
> >>> last_map_addr: 100000000 end: fffba000
> >> this one is interesting... got over mapped...
> >>> So I think it is possible to have the issue I mentioned above.
> >> looks like.
> >
> > So, If you have no time, I can try to fix that. Do you think
> > init_memory_mapping should stop at specified end page?
>
> you may boot with debug, so could figure out how init_memory_mapping want to map the range.
>
> it should stop at specified end page at least with 64bit.
The dmesg with ignore_loglevel in the kernel parameters is attached to
this mail.
init_memory_mapping: 0000000000000000-000000003f700000
0000000000 - 003f600000 page 2M
003f600000 - 003f700000 page 4k
kernel direct mapping tables up to 3f700000 @ 8000-b000
last_map_addr: 3f700000 end: 3f700000
init_memory_mapping: 00000000fffb0000-00000000fffba000
00fffb0000 - 0100000000 page 4k
last_map_addr: 100000000 end: fffba000
Best Regards,
Huang Ying
Huang Ying wrote:
> On Tue, 2009-03-03 at 11:57 +0800, Yinghai Lu wrote:
>> Huang Ying wrote:
>>> On Tue, 2009-03-03 at 10:53 +0800, Yinghai Lu wrote:
>>>> Huang Ying wrote:
>>>>> On Tue, 2009-03-03 at 09:28 +0800, Yinghai Lu wrote:
>>>>>> Huang Ying wrote:
>>>>>>> On Tue, 2009-03-03 at 05:38 +0800, Yinghai Lu wrote:
>>>>>>>> Huang Ying wrote:
>>>>>>>>> On Mon, 2009-03-02 at 10:51 +0800, Yinghai Lu wrote:
>>>>>>>>>> On Sun, Mar 1, 2009 at 6:37 PM, Huang Ying <[email protected]> wrote:
>>>>>>>>>>>> so 64bit could use ioremap_cache() too?
>>>>>>>>>>>> we may keep 32bit and 64bit a bit consistent.
>>>>>>>>>>> If we use ioremap_cache(), kexec runtime service will not work in kexec
>>>>>>>>>>> situation, which needs EFI runtime memory area to be mapped at exact
>>>>>>>>>>> same location across kexec. I think we should support kexec if possible.
>>>>>>>>>> sure.
>>>>>>>>>>
>>>>>>>>>> please don't touch max_low_pfn_mapped, because some range may not
>>>>>>>>>> directly mapped under those efi run-time code
>>>>>>>>> Find an issue to use init_memory_mapping() here.
>>>>>>>>>
>>>>>>>>> If the memory range to be mapped is less than 2M, the last mapped
>>>>>>>>> address may be next 2M aligned position, this may lead mapping
>>>>>>>>> overlapping between memory range. Such as:
>>>>>>>>>
>>>>>>>>> 0x3f388000 - 0x3f488000: real mapped 0x3f388000 - 0x3f600000
>>>>>>>>> 0x3f590000 - 0x3f5bb000: real mapped 0x3f590000 - 0x3f600000
>>>>>>>>>
>>>>>>>>> The problem is that the memory range 0x3f400000 - 0x3f590000 is left not
>>>>>>>>> mapped!
>>>>>>>> what is max_low_pfn_mapped before that?
>>>>>>> I don't know exactly what you mean. Can you elaborate a little?
>>>>>>>
>>>>>>> 0 ~ max_low_pfn_mapped ~ max_pfn_mapped can be mapped with
>>>>>>> init_memory_mapping() properly.
>>>>>>>
>>>>>>> The issue of above example is that 0x3f400000 ~ 0x3f488000 is a
>>>>>>> sub-range of 0x3f388000 ~ 0x3f488000, which should be mapped but is left
>>>>>>> not mapped.
>>>>>> what is max_low_pfn_mapped?
>>>>>>
>>>>>> what is init_memory_mapping() printout?
>>>>> This does not comes from a real test case. To test the changes I made, I
>>>>> make efi_ioremap() being used even if the corresponding memory range is
>>>>> below max_low_pfn_mapped. The dmesg of test is attached with the mail.
>>>>>
>>>>> The printout of init_memory_mapping shows:
>>>>>
>>>>> init_memory_mapping: 000000003f488000-000000003f4bb000
>>>>> last_map_addr: 3f600000 end: 3f4bb000
>>>>> init_memory_mapping: 000000003f590000-000000003f5bb000
>>>>> last_map_addr: 3f600000 end: 3f5bb000
>>>> init_memory_mapping: 0000000000000000-000000003f700000
>>>>
>>>> last_map_addr: 3f700000 end: 3f700000
>>>>
>>>> (6 early reservations) ==> bootmem [0000000000 - 003f700000]
>>>>
>>>> so max_low_pfn_mapped is (3f700000>>12)
>>>> and you try to init_memory_mapping again before it
>>> Yes. Just for testing, I want to use efi_ioremap() on more memory range
>>> to test.
>>>
>>>>> init_memory_mapping: 00000000fffb0000-00000000fffba000
>>>>> last_map_addr: 100000000 end: fffba000
>>>> this one is interesting... got over mapped...
>>>>> So I think it is possible to have the issue I mentioned above.
>>>> looks like.
>>> So, If you have no time, I can try to fix that. Do you think
>>> init_memory_mapping should stop at specified end page?
>> you may boot with debug, so could figure out how init_memory_mapping want to map the range.
>>
>> it should stop at specified end page at least with 64bit.
>
> The dmesg with ignore_loglevel in kernel parameters is attached with the
> mail.
>
> init_memory_mapping: 0000000000000000-000000003f700000
> 0000000000 - 003f600000 page 2M
> 003f600000 - 003f700000 page 4k
> kernel direct mapping tables up to 3f700000 @ 8000-b000
> last_map_addr: 3f700000 end: 3f700000
>
> init_memory_mapping: 00000000fffb0000-00000000fffba000
> 00fffb0000 - 0100000000 page 4k
> last_map_addr: 100000000 end: fffba000
That is funny, the range calculation has a problem when the range size is < 2M...
YH
On Tue, 2009-03-03 at 13:37 +0800, Yinghai Lu wrote:
[...]
> > The dmesg with ignore_loglevel in kernel parameters is attached with the
> > mail.
> >
> > init_memory_mapping: 0000000000000000-000000003f700000
> > 0000000000 - 003f600000 page 2M
> > 003f600000 - 003f700000 page 4k
> > kernel direct mapping tables up to 3f700000 @ 8000-b000
> > last_map_addr: 3f700000 end: 3f700000
> >
> > init_memory_mapping: 00000000fffb0000-00000000fffba000
> > 00fffb0000 - 0100000000 page 4k
> > last_map_addr: 100000000 end: fffba000
> that is funny, the range calculating has some problem...when the range size < 2M...
Yes. Can you fix that? If you have no time, I can do that.
Best Regards,
Huang Ying
Huang Ying wrote:
> On Tue, 2009-03-03 at 13:37 +0800, Yinghai Lu wrote:
> [...]
>>> The dmesg with ignore_loglevel in kernel parameters is attached with the
>>> mail.
>>>
>>> init_memory_mapping: 0000000000000000-000000003f700000
>>> 0000000000 - 003f600000 page 2M
>>> 003f600000 - 003f700000 page 4k
>>> kernel direct mapping tables up to 3f700000 @ 8000-b000
>>> last_map_addr: 3f700000 end: 3f700000
>>>
>>> init_memory_mapping: 00000000fffb0000-00000000fffba000
>>> 00fffb0000 - 0100000000 page 4k
>>> last_map_addr: 100000000 end: fffba000
>> that is funny, the range calculating has some problem...when the range size < 2M...
>
> Yes. Can you fix that? If you have no time, I can do that.
>
please try
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index c9d4466..25a7be8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -748,6 +748,8 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
pos = start_pfn << PAGE_SHIFT;
end_pfn = ((pos + (PMD_SIZE - 1)) >> PMD_SHIFT)
<< (PMD_SHIFT - PAGE_SHIFT);
+ if (end_pfn > (end>>PAGE_SHIFT))
+ end_pfn = end>>PAGE_SHIFT;
if (start_pfn < end_pfn) {
nr_range = save_mr(mr, nr_range, start_pfn, end_pfn, 0);
pos = end_pfn << PAGE_SHIFT;
On Tue, 2009-03-03 at 13:51 +0800, Yinghai Lu wrote:
> Huang Ying wrote:
> > On Tue, 2009-03-03 at 13:37 +0800, Yinghai Lu wrote:
> > [...]
> >>> The dmesg with ignore_loglevel in kernel parameters is attached with the
> >>> mail.
> >>>
> >>> init_memory_mapping: 0000000000000000-000000003f700000
> >>> 0000000000 - 003f600000 page 2M
> >>> 003f600000 - 003f700000 page 4k
> >>> kernel direct mapping tables up to 3f700000 @ 8000-b000
> >>> last_map_addr: 3f700000 end: 3f700000
> >>>
> >>> init_memory_mapping: 00000000fffb0000-00000000fffba000
> >>> 00fffb0000 - 0100000000 page 4k
> >>> last_map_addr: 100000000 end: fffba000
> >> that is funny, the range calculating has some problem...when the range size < 2M...
> >
> > Yes. Can you fix that? If you have no time, I can do that.
> >
>
> please try
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index c9d4466..25a7be8 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -748,6 +748,8 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
> pos = start_pfn << PAGE_SHIFT;
> end_pfn = ((pos + (PMD_SIZE - 1)) >> PMD_SHIFT)
> << (PMD_SHIFT - PAGE_SHIFT);
> + if (end_pfn > (end>>PAGE_SHIFT))
> + end_pfn = end>>PAGE_SHIFT;
> if (start_pfn < end_pfn) {
> nr_range = save_mr(mr, nr_range, start_pfn, end_pfn, 0);
> pos = end_pfn << PAGE_SHIFT;
Thanks, it works well. The dmesg is attached to this mail.
init_memory_mapping: 0000000000000000-000000003f700000
0000000000 - 003f600000 page 2M
003f600000 - 003f700000 page 4k
kernel direct mapping tables up to 3f700000 @ 8000-b000
last_map_addr: 3f700000 end: 3f700000
init_memory_mapping: 00000000fffb0000-00000000fffba000
00fffb0000 - 00fffba000 page 4k
last_map_addr: fffba000 end: fffba000
Best Regards,
Huang Ying
Impact: fix small range ...
Ying Huang found that init_memory_mapping() has a problem with small ranges of
less than 2M when he tried to direct map the EFI runtime code outside
max_low_pfn_mapped.
It turns out we never considered that it would be used for small ranges, and
didn't check the range...
Signed-off-by: Yinghai Lu <[email protected]>
Reported-by: Ying Huang <[email protected]>
---
arch/x86/mm/init_64.c | 2 ++
1 file changed, 2 insertions(+)
Index: linux-2.6/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_64.c
+++ linux-2.6/arch/x86/mm/init_64.c
@@ -748,6 +748,8 @@ unsigned long __init_refok init_memory_m
pos = start_pfn << PAGE_SHIFT;
end_pfn = ((pos + (PMD_SIZE - 1)) >> PMD_SHIFT)
<< (PMD_SHIFT - PAGE_SHIFT);
+ if (end_pfn > (end>>PAGE_SHIFT))
+ end_pfn = end>>PAGE_SHIFT;
if (start_pfn < end_pfn) {
nr_range = save_mr(mr, nr_range, start_pfn, end_pfn, 0);
pos = end_pfn << PAGE_SHIFT;
Commit-ID: 0fc59d3a01820765e5f3a723733728758b0cf577
Gitweb: http://git.kernel.org/tip/0fc59d3a01820765e5f3a723733728758b0cf577
Author: "Yinghai Lu" <[email protected]>
AuthorDate: Mon, 2 Mar 2009 23:36:13 -0800
Commit: Ingo Molnar <[email protected]>
CommitDate: Tue, 3 Mar 2009 08:50:22 +0100
x86: fix init_memory_mapping() to handle small ranges
Impact: fix failed EFI bootup in certain circumstances
Ying Huang found init_memory_mapping() has problem with small ranges
less than 2M when he tried to direct map the EFI runtime code out of
max_low_pfn_mapped.
It turns out we never considered that case and didn't check the range...
Reported-by: Ying Huang <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>
Cc: Brian Maly <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/init_64.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index e6d36b4..b135225 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -714,6 +714,8 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
pos = start_pfn << PAGE_SHIFT;
end_pfn = ((pos + (PMD_SIZE - 1)) >> PMD_SHIFT)
<< (PMD_SHIFT - PAGE_SHIFT);
+ if (end_pfn > (end >> PAGE_SHIFT))
+ end_pfn = end >> PAGE_SHIFT;
if (start_pfn < end_pfn) {
nr_range = save_mr(mr, nr_range, start_pfn, end_pfn, 0);
pos = end_pfn << PAGE_SHIFT;