Summary: Avoid null ptr defererence in __alloc_pages
This exists in 2.4. and 2.5
Configuration: a NUMA (ia32) system which only has highmem
on one or more nodes.
Action to create: Try to allocate ZONE_NORMAL memory
from a node which only has highmem. What we should do
is fall back to another node, looking for ZONE_NORMAL
memory.
In looking at the specified zonelist, we panic because that zonelist
is NULL. The simple patch below avoids the null deference, and
returns failure. alloc_pages will continue looking through the nodes
until it finds one with some ZONE_NORMAL memory. We actually
panic at the moment a few lines later when we do,
classzone->need_balance = 1; thus dereferencing the pointer.
--- linux-2.4.18-memalloc/mm/page_alloc.c.old Fri Mar 8 18:21:41 2002
+++ linux-2.4.18-memalloc/mm/page_alloc.c Fri Mar 8 18:23:27 2002
@@ -317,6 +317,8 @@
zone = zonelist->zones;
classzone = *zone;
+ if (classzone == NULL)
+ return NULL;
min = 1UL << order;
for (;;) {
zone_t *z = *(zone++);
Hi Martin,
On Fri, 8 Mar 2002, Martin J. Bligh wrote:
> Summary: Avoid null ptr defererence in __alloc_pages
> This exists in 2.4. and 2.5
>
> Configuration: a NUMA (ia32) system which only has highmem
> on one or more nodes.
>
> Action to create: Try to allocate ZONE_NORMAL memory
> from a node which only has highmem. What we should do
> is fall back to another node, looking for ZONE_NORMAL
> memory.
If you applied an SGI patch that makes the zonelist contain all the zones
of your machine, then the zonelist should not be NULL.
If you allocate memory with gfp_mask & GFP_ZONEMASK == GFP_NORMAL from a
HIGHMEM only node, then the first entry on the corresponding zonelist
should be the first NORMAL zone on some other node.
Am I missing something here ?
> In looking at the specified zonelist, we panic because that zonelist
> is NULL. The simple patch below avoids the null deference, and
> returns failure. alloc_pages will continue looking through the nodes
> until it finds one with some ZONE_NORMAL memory. We actually
> panic at the moment a few lines later when we do,
> classzone->need_balance = 1; thus dereferencing the pointer.
>
> --- linux-2.4.18-memalloc/mm/page_alloc.c.old Fri Mar 8 18:21:41 2002
> +++ linux-2.4.18-memalloc/mm/page_alloc.c Fri Mar 8 18:23:27 2002
> @@ -317,6 +317,8 @@
>
> zone = zonelist->zones;
> classzone = *zone;
> + if (classzone == NULL)
> + return NULL;
> min = 1UL << order;
> for (;;) {
> zone_t *z = *(zone++);
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
Thanks,
Samuel.
> If you applied an SGI patch that makes the zonelist contain all the zones
> of your machine, then the zonelist should not be NULL.
> If you allocate memory with gfp_mask & GFP_ZONEMASK == GFP_NORMAL from a
> HIGHMEM only node, then the first entry on the corresponding zonelist
> should be the first NORMAL zone on some other node.
> Am I missing something here ?
You're missing the fact that I'm missing the SGI patch ;-)
M.
>> If you applied an SGI patch that makes the zonelist contain all the zones
>> of your machine, then the zonelist should not be NULL.
>> If you allocate memory with gfp_mask & GFP_ZONEMASK == GFP_NORMAL from a
>> HIGHMEM only node, then the first entry on the corresponding zonelist
>> should be the first NORMAL zone on some other node.
>> Am I missing something here ?
>
> You're missing the fact that I'm missing the SGI patch ;-)
I should have also mentioned that:
1) I shouldn't need the SGI patch, though it might help performance.
2) The kernel panics without my fix, and runs fine with it.
M.
On Fri, 8 Mar 2002, Martin J. Bligh wrote:
> >> If you applied an SGI patch that makes the zonelist contain all the zones
> >> of your machine, then the zonelist should not be NULL.
> >> If you allocate memory with gfp_mask & GFP_ZONEMASK == GFP_NORMAL from a
> >> HIGHMEM only node, then the first entry on the corresponding zonelist
> >> should be the first NORMAL zone on some other node.
> >> Am I missing something here ?
> >
> > You're missing the fact that I'm missing the SGI patch ;-)
Oh, I see. I was missing something then...;-)
>
> I should have also mentioned that:
>
> 1) I shouldn't need the SGI patch, though it might help performance.
Why shouldn't you need it ? It is NUMA generic, and totally arch
independent.
And it actually helps performance. I also allows the kernel to have a
single memory allocation path. I think it is cleaner than calling _alloc_pages()
from numa.c
> 2) The kernel panics without my fix, and runs fine with it.
I hope so :-)
But your fix is at the same time useless and harmless for UMA machines.
OTOH, the SGI patch doesn't modify __alloc_pages(). I think I'm a little
too picky here...
Cheers,
Samuel.
>> I should have also mentioned that:
>>
>> 1) I shouldn't need the SGI patch, though it might help performance.
>
> Why shouldn't you need it ? It is NUMA generic, and totally arch
> independent.
I just mean the kernel shouldn't panic if I don't have it.
> And it actually helps performance. I also allows the kernel to have a
> single memory allocation path. I think it is cleaner than calling _alloc_pages()
> from numa.c
>
>> 2) The kernel panics without my fix, and runs fine with it.
> I hope so :-)
> But your fix is at the same time useless and harmless for UMA machines.
Yup, though I suppose we could shave off a couple of nanoseconds by
surrounding my little check with #ifdef CONFIG_NUMA.
> OTOH, the SGI patch doesn't modify __alloc_pages(). I think I'm a little
> too picky here...
With the #ifdef, I won't really do this either (at least for the code generated).
The SGI patch is probably a good thing, and I'll pick it up and try it
sometime soon. The only real problem is that it's not in the mainline
already. Until such time as it gets there, the fix I posted is trivial,
and easily seen to be correct (well, I'm biased ;-) ), and should get
shoved into the mainline much easier.
M.
On Fri, 8 Mar 2002, Martin J. Bligh wrote:
> >> I should have also mentioned that:
> >>
> >> 1) I shouldn't need the SGI patch, though it might help performance.
> >
> > Why shouldn't you need it ? It is NUMA generic, and totally arch
> > independent.
>
> I just mean the kernel shouldn't panic if I don't have it.
That is true...
> > And it actually helps performance. I also allows the kernel to have a
> > single memory allocation path. I think it is cleaner than calling _alloc_pages()
> > from numa.c
> >
> >> 2) The kernel panics without my fix, and runs fine with it.
> > I hope so :-)
> > But your fix is at the same time useless and harmless for UMA machines.
>
> Yup, though I suppose we could shave off a couple of nanoseconds by
> surrounding my little check with #ifdef CONFIG_NUMA.
>
> > OTOH, the SGI patch doesn't modify __alloc_pages(). I think I'm a little
> > too picky here...
>
> With the #ifdef, I won't really do this either (at least for the code generated).
>
> The SGI patch is probably a good thing, and I'll pick it up and try it
> sometime soon. The only real problem is that it's not in the mainline
> already. Until such time as it gets there, the fix I posted is trivial,
> and easily seen to be correct (well, I'm biased ;-) ), and should get
> shoved into the mainline much easier.
You're right. It's on -aa kernels, so it'll get in mainline soon.
Cheers,
Samuel.