When KASLR is enabled (CONFIG_RANDOMIZE_BASE=y), the top 4K of the
virtual address space may end up mapped to physical memory, but that
range is expected to stay unmapped to leave room for ERR_PTR.
It might also cause wraparound issues wherever code uses the last page
of memory without an overflow check, such as when the last page is
compressed by LZO:
[ 2738.034508] Unable to handle kernel NULL pointer dereference at virtual address 00000009
[ 2738.034515] Mem abort info:
[ 2738.034518] Exception class = DABT (current EL), IL = 32 bits
[ 2738.034520] SET = 0, FnV = 0
[ 2738.034523] EA = 0, S1PTW = 0
[ 2738.034524] FSC = 5
[ 2738.034526] Data abort info:
[ 2738.034528] ISV = 0, ISS = 0x00000005
[ 2738.034530] CM = 0, WnR = 0
[ 2738.034533] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffff94cee000
[ 2738.034535] [0000000000000009] *pgd=0000000000000000, *pud=0000000000000000
...
[ 2738.034592] pc : lzo1x_1_do_compress+0x198/0x610
[ 2738.034595] lr : lzo1x_1_compress+0x98/0x3d8
[ 2738.034597] sp : ffffff801caa3470 pstate : 00c00145
[ 2738.034598] x29: ffffff801caa3500 x28: 0000000000001000
[ 2738.034601] x27: 0000000000001000 x26: fffffffffffff000
[ 2738.034604] x25: ffffffff4ebc0000 x24: 0000000000000000
[ 2738.034607] x23: 000000000000004c x22: fffffffffffff7b8
[ 2738.034610] x21: ffffffff2e2ee0b3 x20: ffffffff2e2ee0bb
[ 2738.034612] x19: 0000000000000fcc x18: fffffffffffff84a
[ 2738.034615] x17: 00000000801b03d6 x16: 0000000000000782
[ 2738.034618] x15: ffffffff2e2ee0bf x14: fffffffffffffff0
[ 2738.034620] x13: 000000000000000f x12: 0000000000000020
[ 2738.034623] x11: 000000001824429d x10: ffffffffffffffec
[ 2738.034626] x9 : 0000000000000009 x8 : 0000000000000000
[ 2738.034628] x7 : 0000000000000868 x6 : 0000000000000434
[ 2738.034631] x5 : ffffffff4ebc0000 x4 : 0000000000000000
[ 2738.034633] x3 : ffffff801caa3510 x2 : ffffffff2e2ee000
[ 2738.034636] x1 : 0000000000000000 x0 : fffffffffffff000
...
[ 2738.034717] Process kworker/u16:1 (pid: 8705, stack limit = 0xffffff801caa0000)
[ 2738.034720] Call trace:
[ 2738.034722] lzo1x_1_do_compress+0x198/0x610
[ 2738.034725] lzo_compress+0x48/0x88
[ 2738.034729] crypto_compress+0x14/0x20
[ 2738.034733] zcomp_compress+0x2c/0x38
[ 2738.034736] zram_bvec_rw+0x3d0/0x860
[ 2738.034738] zram_rw_page+0x88/0xe0
[ 2738.034742] bdev_write_page+0x70/0xc0
[ 2738.034745] __swap_writepage+0x58/0x3f8
[ 2738.034747] swap_writepage+0x40/0x50
[ 2738.034750] shrink_page_list+0x4fc/0xe58
[ 2738.034753] reclaim_pages_from_list+0xa0/0x150
[ 2738.034756] reclaim_pte_range+0x18c/0x1f8
[ 2738.034759] __walk_page_range+0xf8/0x1e0
[ 2738.034762] walk_page_range+0xf8/0x130
[ 2738.034765] reclaim_task_anon+0xcc/0x168
[ 2738.034767] swap_fn+0x438/0x668
[ 2738.034771] process_one_work+0x1fc/0x460
[ 2738.034773] worker_thread+0x2d0/0x478
[ 2738.034775] kthread+0x110/0x120
[ 2738.034778] ret_from_fork+0x10/0x18
[ 2738.034781] Code: 3800167f 54ffffa8 d100066f 14000031 (b9400131)
[ 2738.034784] ---[ end trace 9b5cca106f0e54d1 ]---
[ 2738.035473] Kernel panic - not syncing: Fatal exception
in = 0xfffffffffffff000
in_len = 4096
ip = x9 = 0x0000000000000009, i.e. ip has overflowed past the top of
the address space (in + in_len wraps around to 0).
Always reserve the last ARM64_MEMSTART_ALIGN-sized region of the
linear region.
Signed-off-by: liyueyi <[email protected]>
---
arch/arm64/mm/init.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 0340e45..20fe11e 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -439,7 +439,8 @@ void __init arm64_memblock_init(void)
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
extern u16 memstart_offset_seed;
u64 range = linear_region_size -
- (memblock_end_of_DRAM() - memblock_start_of_DRAM());
+ (memblock_end_of_DRAM() - memblock_start_of_DRAM()) -
+ ARM64_MEMSTART_ALIGN;
/*
* If the size of the linear region exceeds, by a sufficient
--
2.7.4
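To make the two failure modes described in the commit message concrete, here is a minimal userspace sketch. It is not kernel code: looks_like_err_ptr() and buf_end() are hypothetical helper names, and the check only mirrors the comparison that IS_ERR_VALUE() performs against MAX_ERRNO (4095 in include/linux/err.h). The addresses are the ones from the oops above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095 /* same value as in the kernel's include/linux/err.h */

/* Every encoded error value lands in the top 4K page of the address space. */
static_assert(MAX_ERRNO < 4096, "error values must fit in the top page");

/* Hypothetical stand-in for the comparison done by IS_ERR_VALUE(). */
static bool looks_like_err_ptr(uint64_t addr)
{
	return addr >= (uint64_t)-MAX_ERRNO;
}

/* End of a buffer computed as "in + in_len", modulo 2^64. */
static uint64_t buf_end(uint64_t in, uint64_t in_len)
{
	return in + in_len; /* unsigned arithmetic wraps silently */
}
```

With in = 0xfffffffffffff000 and in_len = 4096, buf_end() returns 0, so any "ip < end" style bound stops working, and nearly every valid address inside that page would also be classified as an error pointer.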
On Mon, 24 Dec 2018 at 08:40, Yueyi Li <[email protected]> wrote:
>
> When KASLR is enabled (CONFIG_RANDOMIZE_BASE=y), the top 4K of the
> virtual address space may end up mapped to physical memory, but that
> range is expected to stay unmapped to leave room for ERR_PTR.
Does the following change fix your issue as well?
index 9b432d9fcada..9dcf0ff75a11 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -447,7 +447,7 @@ void __init arm64_memblock_init(void)
* memory spans, randomize the linear region as well.
*/
if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN) {
- range = range / ARM64_MEMSTART_ALIGN + 1;
+ range /= ARM64_MEMSTART_ALIGN;
memstart_addr -= ARM64_MEMSTART_ALIGN *
((range * memstart_offset_seed) >> 16);
}
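As a rough illustration of why dropping the "+ 1" keeps the top slot free, the slot selection in the hunk above can be modelled in userspace. This is only a sketch: pick_slot() and max_slot() are hypothetical names, and the slot counts used with them are made-up values, not ones taken from a real board.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Slot index chosen for a given slot count and 16-bit seed, mirroring
 * ((range * memstart_offset_seed) >> 16) from the hunk above.
 */
static uint64_t pick_slot(uint64_t range, uint16_t seed)
{
	/* Keep the product within 64 bits; real slot counts are far smaller. */
	assert(range <= (UINT64_MAX >> 16));
	return (range * seed) >> 16;
}

/* Largest slot index reachable, i.e. the result for the maximum seed. */
static uint64_t max_slot(uint64_t range)
{
	return pick_slot(range, UINT16_MAX);
}
```

If n stands for range / ARM64_MEMSTART_ALIGN, then max_slot(n + 1) is n, so the old code can shift memstart_addr by the full n * ARM64_MEMSTART_ALIGN of slack, while max_slot(n) is n - 1, which always leaves at least one ARM64_MEMSTART_ALIGN-sized slot unused at the top.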
Hi Ard,
On 2018/12/24 17:45, Ard Biesheuvel wrote:
> Does the following change fix your issue as well?
Yes, that fixes it as well. I just think modifying the first *range*
calculation would be easier to grasp. What do you think?
Thanks,
Yueyi
On Tue, 25 Dec 2018 at 03:30, Yueyi Li <[email protected]> wrote:
> Yes, that fixes it as well. I just think modifying the first *range*
> calculation would be easier to grasp. What do you think?
>
I don't think there is a difference, to be honest, but I will leave it
up to the maintainers to decide which approach they prefer.
OK, thanks. But it seems this mail has been ignored. Do I need to resend the patch?
On 2018/12/26 21:49, Ard Biesheuvel wrote:
> I don't think there is a difference, to be honest, but I will leave it
> up to the maintainers to decide which approach they prefer.
On Wed, 16 Jan 2019 at 04:37, Yueyi Li <[email protected]> wrote:
>
> OK, thanks. But it seems this mail has been ignored. Do I need to resend the patch?
No, it has been merged already. It is in v5.0-rc2, I think.
On 2019/1/16 15:51, Ard Biesheuvel wrote:
> No, it has been merged already. It is in v5.0-rc2, I think.
OK, thanks. :-)
Dear stable maintainers,
I encountered a similar issue on a 4.19.33 kernel (Chromium OS). On my
board, the system would not even be able to boot if KASLR decides to
map the linear region to the top of the virtual address space. This
happens every 253 boots on average (there are 0xfd possible random
offsets, and only the top one fails).
I tried to debug the issue, and it appears physical memory allocated
for vmemmap and mem_section array would end up at the same location,
corrupting each other early on boot. I could not figure out exactly
why this is happening, but in any case, this patch fixes my issue (no
failure in 744 reboots with 240 unique offsets, and counting...), and
IMHO the ERR_PTR justification in the commit message is enough to
warrant inclusion in -stable branches.
The patch below was committed to mainline as:
commit c8a43c18a97845e7f94ed7d181c11f41964976a2
arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region
and should be included in stable branches after this commit:
Fixes: c031a4213c11a5db ("arm64: kaslr: randomize the linear region")
i.e. anything after kernel 4.5 (git describe says v4.5-rc4-62-gc031a4213c11a5d).
Thanks,
Nicolas
On Wed, Jan 16, 2019 at 4:38 PM Yueyi Li <[email protected]> wrote:
> >>>> On 2018/12/24 17:45, Ard Biesheuvel wrote:
> >>>>> Does the following change fix your issue as well?
> >>>>>
> >>>>> index 9b432d9fcada..9dcf0ff75a11 100644
> >>>>> --- a/arch/arm64/mm/init.c
> >>>>> +++ b/arch/arm64/mm/init.c
> >>>>> @@ -447,7 +447,7 @@ void __init arm64_memblock_init(void)
> >>>>> * memory spans, randomize the linear region as well.
> >>>>> */
> >>>>> if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN) {
> >>>>> - range = range / ARM64_MEMSTART_ALIGN + 1;
> >>>>> + range /= ARM64_MEMSTART_ALIGN;
> >>>>> memstart_addr -= ARM64_MEMSTART_ALIGN *
> >>>>> ((range * memstart_offset_seed) >> 16);
> >>>>> }
On Sat, Apr 13, 2019 at 08:41:33PM +0800, Nicolas Boichat wrote:
>The patch below was committed to mainline as:
>commit c8a43c18a97845e7f94ed7d181c11f41964976a2
> arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region
>
>and should be included in stable branches after this commit:
>Fixes: c031a4213c11a5db ("arm64: kaslr: randomize the linear region")
>i.e. anything after kernel 4.5 (git describe says v4.5-rc4-62-gc031a4213c11a5d).
I've queued it for 4.9-4.19, thanks for the report.
--
Thanks,
Sasha