LinuxLists.cc - why do we still need bootmem allocator?

2018-06-25 14:11:49

Subject: why do we still need bootmem allocator?

Hi,
I am wondering why do we still keep mm/bootmem.c when most architectures
already moved to nobootmem. Is there any fundamental reason why others
cannot or this is just a matter of work? Btw. what really needs to be
done? Btw. is there any documentation telling us what needs to be done
in that regards?
--
Michal Hocko
SUSE Labs

2018-06-25 16:11:12

by Rob Herring (Arm)

Subject: Re: why do we still need bootmem allocator?

On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
>
> Hi,
> I am wondering why do we still keep mm/bootmem.c when most architectures
> already moved to nobootmem. Is there any fundamental reason why others
> cannot or this is just a matter of work?

Just because no one has done the work. I did a couple of arches
recently (sh, microblaze, and h8300) mainly because I broke them with
some DT changes.

> Btw. what really needs to be
> done? Btw. is there any documentation telling us what needs to be done
> in that regards?

No. The commits converting the arches are the only documentation. It's
a bit more complicated for platforms that have NUMA support.

Rob

2018-06-25 18:06:16

by Michal Hocko

Subject: Re: why do we still need bootmem allocator?

On Mon 25-06-18 10:09:41, Rob Herring wrote:
> On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> >
> > Hi,
> > I am wondering why do we still keep mm/bootmem.c when most architectures
> > already moved to nobootmem. Is there any fundamental reason why others
> > cannot or this is just a matter of work?
>
> Just because no one has done the work. I did a couple of arches
> recently (sh, microblaze, and h8300) mainly because I broke them with
> some DT changes.

I see

> > Btw. what really needs to be
> > done? Btw. is there any documentation telling us what needs to be done
> > in that regards?
>
> No. The commits converting the arches are the only documentation. It's
> a bit more complicated for platforms that have NUMA support.

I do not see why NUMA should be a problem, but I will have a look at your
commits to see what you have done.

Thanks!
--
Michal Hocko
SUSE Labs

2018-06-27 10:25:15

by Mike Rapoport

Subject: Re: why do we still need bootmem allocator?

On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote:
> On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> >
> > Hi,
> > I am wondering why do we still keep mm/bootmem.c when most architectures
> > already moved to nobootmem. Is there any fundamental reason why others
> > cannot or this is just a matter of work?
>
> Just because no one has done the work. I did a couple of arches
> recently (sh, microblaze, and h8300) mainly because I broke them with
> some DT changes.

I have a patch for alpha nearly ready.
That leaves m68k and ia64

> > Btw. what really needs to be
> > done? Btw. is there any documentation telling us what needs to be done
> > in that regards?
>
> No. The commits converting the arches are the only documentation. It's
> a bit more complicated for platforms that have NUMA support.
>
> Rob
>

--
Sincerely yours,
Mike.

2018-06-27 10:41:58

by Michal Hocko

Subject: Re: why do we still need bootmem allocator?

On Wed 27-06-18 13:11:44, Mike Rapoport wrote:
> On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote:
> > On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> > >
> > > Hi,
> > > I am wondering why do we still keep mm/bootmem.c when most architectures
> > > already moved to nobootmem. Is there any fundamental reason why others
> > > cannot or this is just a matter of work?
> >
> > Just because no one has done the work. I did a couple of arches
> > recently (sh, microblaze, and h8300) mainly because I broke them with
> > some DT changes.
>
> I have a patch for alpha nearly ready.

Cool!

> That leaves m68k and ia64

I will not get to those for a week or two, but they are close to the
top of my todo list.
--
Michal Hocko
SUSE Labs

2018-06-27 11:28:18

by Mike Rapoport

Subject: Re: why do we still need bootmem allocator?

Hi,

On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote:
> On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> >
> > Hi,
> > I am wondering why do we still keep mm/bootmem.c when most architectures
> > already moved to nobootmem. Is there any fundamental reason why others
> > cannot or this is just a matter of work?
>
> Just because no one has done the work. I did a couple of arches
> recently (sh, microblaze, and h8300) mainly because I broke them with
> some DT changes.

I've tried running the current upstream on h8300 gdb simulator and it
failed:

[ 0.000000] BUG: Bad page state in process swapper pfn:00004
[ 0.000000] page:007ed080 count:0 mapcount:-128 mapping:00000000 index:0x0
[ 0.000000] flags: 0x0()
[ 0.000000] raw: 00000000 0040bdac 0040bdac 00000000 00000000 00000002 ffffff7f 00000000
[ 0.000000] page dumped because: nonzero mapcount
---Type <return> to continue, or q <return> to quit---
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc2+ #50
[ 0.000000] Stack from 00401f2c:
[ 0.000000] 00401f2c 001116cb 007ed080 00401f40 000e20e6 00401f54 0004df14 00000000
[ 0.000000] 007ed080 007ed000 00401f5c 0004df8c 00401f90 0004e982 00000044 00401fd1
[ 0.000000] 007ed000 007ed000 00000000 00000004 00000008 00000000 00000003 00000011
[ 0.000000]
[ 0.000000] Call Trace:
[ 0.000000] [<000e20e6>] [<0004df14>] [<0004df8c>] [<0004e982>]
[ 0.000000] [<00051a28>] [<00001000>] [<00000100>]
[ 0.000000] Disabling lock debugging due to kernel taint

With v4.13 I was able to get to "no valid init found".

I had a quick look at h8300 memory initialization and it seems it has
starting pfn set to 0 while fdt defines memory start at 4M.

> > Btw. what really needs to be
> > done? Btw. is there any documentation telling us what needs to be done
> > in that regards?
>
> No. The commits converting the arches are the only documentation. It's
> a bit more complicated for platforms that have NUMA support.
>
> Rob
>

--
Sincerely yours,
Mike.
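
The mismatch described in the message above (starting pfn 0 versus memory
starting at 4M) can be made concrete with a little arithmetic. The sketch
below is only an illustration: it assumes 4 KiB pages, which the thread does
not state, and the names SKETCH_PAGE_SHIFT, sketch_phys_to_pfn() and
sketch_start_pfn() are hypothetical, not kernel symbols.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. The thread does not state the h8300 page size. */
#define SKETCH_PAGE_SHIFT 12

/* A page frame number is the physical address with the in-page offset dropped. */
static inline uint64_t sketch_phys_to_pfn(uint64_t phys_addr)
{
    return phys_addr >> SKETCH_PAGE_SHIFT;
}

/* First usable pfn of a memory bank: round the base up to a page boundary. */
static inline uint64_t sketch_start_pfn(uint64_t bank_base)
{
    uint64_t page_size = (uint64_t)1 << SKETCH_PAGE_SHIFT;

    return (bank_base + page_size - 1) >> SKETCH_PAGE_SHIFT;
}
```

Under that assumption, a bank starting at 4M (0x400000) begins at pfn 0x400,
so a starting pfn of 0 would describe pages below the memory that the fdt
reports.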

2018-06-27 14:16:05

by Rob Herring (Arm)

Subject: Re: why do we still need bootmem allocator?

On Wed, Jun 27, 2018 at 5:27 AM Mike Rapoport <[email protected]> wrote:
>
> Hi,
>
> On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote:
> > On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> > >
> > > Hi,
> > > I am wondering why do we still keep mm/bootmem.c when most architectures
> > > already moved to nobootmem. Is there any fundamental reason why others
> > > cannot or this is just a matter of work?
> >
> > Just because no one has done the work. I did a couple of arches
> > recently (sh, microblaze, and h8300) mainly because I broke them with
> > some DT changes.
>
> I've tried running the current upstream on h8300 gdb simulator and it
> failed:

It seems my patch[1] is still not applied. The maintainer said he applied it.

> [ 0.000000] BUG: Bad page state in process swapper pfn:00004
> [ 0.000000] page:007ed080 count:0 mapcount:-128 mapping:00000000
> index:0x0
> [ 0.000000] flags: 0x0()
> [ 0.000000] raw: 00000000 0040bdac 0040bdac 00000000 00000000 00000002
> ffffff7f 00000000
> [ 0.000000] page dumped because: nonzero mapcount
> ---Type <return> to continue, or q <return> to quit---
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc2+ #50
> [ 0.000000] Stack from 00401f2c:
> [ 0.000000] 00401f2c 001116cb 007ed080 00401f40 000e20e6 00401f54
> 0004df14 00000000
> [ 0.000000] 007ed080 007ed000 00401f5c 0004df8c 00401f90 0004e982
> 00000044 00401fd1
> [ 0.000000] 007ed000 007ed000 00000000 00000004 00000008 00000000
> 00000003 00000011
> [ 0.000000]
> [ 0.000000] Call Trace:
> [ 0.000000] [<000e20e6>] [<0004df14>] [<0004df8c>] [<0004e982>]
> [ 0.000000] [<00051a28>] [<00001000>] [<00000100>]
> [ 0.000000] Disabling lock debugging due to kernel taint
>
> With v4.13 I was able to get to "no valid init found".
>
> I had a quick look at h8300 memory initialization and it seems it has
> starting pfn set to 0 while fdt defines memory start at 4M.

Perhaps there's another issue.

Rob

[1] https://patchwork.kernel.org/patch/10290317/

2018-06-27 14:18:51

by Rob Herring (Arm)

Subject: Re: why do we still need bootmem allocator?

On Wed, Jun 27, 2018 at 4:11 AM Mike Rapoport <[email protected]> wrote:
>
> On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote:
> > On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> > >
> > > Hi,
> > > I am wondering why do we still keep mm/bootmem.c when most architectures
> > > already moved to nobootmem. Is there any fundamental reason why others
> > > cannot or this is just a matter of work?
> >
> > Just because no one has done the work. I did a couple of arches
> > recently (sh, microblaze, and h8300) mainly because I broke them with
> > some DT changes.
>
> I have a patch for alpha nearly ready.
> That leaves m68k and ia64

And c6x, hexagon, mips, nios2, unicore32. Those are all the platforms
which don't select NO_BOOTMEM.

Rob

2018-06-27 16:00:26

by Mike Rapoport

Subject: Re: why do we still need bootmem allocator?

On Wed, Jun 27, 2018 at 07:58:19AM -0600, Rob Herring wrote:
> On Wed, Jun 27, 2018 at 4:11 AM Mike Rapoport <[email protected]> wrote:
> >
> > On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote:
> > > On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> > > >
> > > > Hi,
> > > > I am wondering why do we still keep mm/bootmem.c when most architectures
> > > > already moved to nobootmem. Is there any fundamental reason why others
> > > > cannot or this is just a matter of work?
> > >
> > > Just because no one has done the work. I did a couple of arches
> > > recently (sh, microblaze, and h8300) mainly because I broke them with
> > > some DT changes.
> >
> > I have a patch for alpha nearly ready.
> > That leaves m68k and ia64
>
> And c6x, hexagon, mips, nios2, unicore32. Those are all the platforms
> which don't select NO_BOOTMEM.

Yeah, you are right. I've somehow excluded those that select HAVE_MEMBLOCK...

> Rob
>

--
Sincerely yours,
Mike.

2018-06-27 16:06:17

by Mike Rapoport

Subject: Re: why do we still need bootmem allocator?

On Wed, Jun 27, 2018 at 07:33:55AM -0600, Rob Herring wrote:
> On Wed, Jun 27, 2018 at 5:27 AM Mike Rapoport <[email protected]> wrote:
> >
> > Hi,
> >
> > On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote:
> > > On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko <[email protected]> wrote:
> > > >
> > > > Hi,
> > > > I am wondering why do we still keep mm/bootmem.c when most architectures
> > > > already moved to nobootmem. Is there any fundamental reason why others
> > > > cannot or this is just a matter of work?
> > >
> > > Just because no one has done the work. I did a couple of arches
> > > recently (sh, microblaze, and h8300) mainly because I broke them with
> > > some DT changes.
> >
> > I've tried running the current upstream on h8300 gdb simulator and it
> > failed:
>
> It seems my patch[1] is still not applied. The maintainer said he applied it.

I've applied it manually. Without it unflatten_and_copy_device_tree() fails
to allocate memory. It indeed can be fixed with moving bootmem_init()
before, as you've noted in the commit message.

I'll try to dig deeper into it.

> > [ 0.000000] BUG: Bad page state in process swapper pfn:00004
> > [ 0.000000] page:007ed080 count:0 mapcount:-128 mapping:00000000
> > index:0x0
> > [ 0.000000] flags: 0x0()
> > [ 0.000000] raw: 00000000 0040bdac 0040bdac 00000000 00000000 00000002
> > ffffff7f 00000000
> > [ 0.000000] page dumped because: nonzero mapcount
> > ---Type <return> to continue, or q <return> to quit---
> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc2+ #50
> > [ 0.000000] Stack from 00401f2c:
> > [ 0.000000] 00401f2c 001116cb 007ed080 00401f40 000e20e6 00401f54
> > 0004df14 00000000
> > [ 0.000000] 007ed080 007ed000 00401f5c 0004df8c 00401f90 0004e982
> > 00000044 00401fd1
> > [ 0.000000] 007ed000 007ed000 00000000 00000004 00000008 00000000
> > 00000003 00000011
> > [ 0.000000]
> > [ 0.000000] Call Trace:
> > [ 0.000000] [<000e20e6>] [<0004df14>] [<0004df8c>] [<0004e982>]
> > [ 0.000000] [<00051a28>] [<00001000>] [<00000100>]
> > [ 0.000000] Disabling lock debugging due to kernel taint
> >
> > With v4.13 I was able to get to "no valid init found".
> >
> > I had a quick look at h8300 memory initialization and it seems it has
> > starting pfn set to 0 while fdt defines memory start at 4M.
>
> Perhaps there's another issue.
>
> Rob
>
> [1] https://patchwork.kernel.org/patch/10290317/
>

--
Sincerely yours,
Mike.

2018-07-01 12:25:33

by Mike Rapoport

Subject: h8300: BUG: Bad page state in process swapper (was: Re: why do we still need bootmem allocator?)

(added Yoshinori Sato, here's the beginning of the discussion:
https://lore.kernel.org/lkml/[email protected]/)

On Wed, Jun 27, 2018 at 07:02:06PM +0300, Mike Rapoport wrote:
> On Wed, Jun 27, 2018 at 07:33:55AM -0600, Rob Herring wrote:
> > On Wed, Jun 27, 2018 at 5:27 AM Mike Rapoport <[email protected]> wrote:
> > >
> > > I've tried running the current upstream on h8300 gdb simulator and it
> > > failed:
> >
> > It seems my patch[1] is still not applied. The maintainer said he applied it.
>
> I've applied it manually. Without it unflatten_and_copy_device_tree() fails
> to allocate memory. It indeed can be fixed with moving bootmem_init()
> before, as you've noted in the commit message.
>
> I'll try to dig deeper into it.
>
> > > [ 0.000000] BUG: Bad page state in process swapper pfn:00004
> > > [ 0.000000] page:007ed080 count:0 mapcount:-128 mapping:00000000
> > > index:0x0
> > > [ 0.000000] flags: 0x0()
> > > [ 0.000000] raw: 00000000 0040bdac 0040bdac 00000000 00000000 00000002
> > > ffffff7f 00000000
> > > [ 0.000000] page dumped because: nonzero mapcount
> > > ---Type <return> to continue, or q <return> to quit---
> > > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc2+ #50
> > > [ 0.000000] Stack from 00401f2c:
> > > [ 0.000000] 00401f2c 001116cb 007ed080 00401f40 000e20e6 00401f54
> > > 0004df14 00000000
> > > [ 0.000000] 007ed080 007ed000 00401f5c 0004df8c 00401f90 0004e982
> > > 00000044 00401fd1
> > > [ 0.000000] 007ed000 007ed000 00000000 00000004 00000008 00000000
> > > 00000003 00000011
> > > [ 0.000000]
> > > [ 0.000000] Call Trace:
> > > [ 0.000000] [<000e20e6>] [<0004df14>] [<0004df8c>] [<0004e982>]
> > > [ 0.000000] [<00051a28>] [<00001000>] [<00000100>]
> > > [ 0.000000] Disabling lock debugging due to kernel taint
> > >
> > > With v4.13 I was able to get to "no valid init found".
> > >
> > > I had a quick look at h8300 memory initialization and it seems it has
> > > starting pfn set to 0 while fdt defines memory start at 4M.
> >
> > Perhaps there's another issue.

In my setup this is caused by __ffs() clobbering start pfn in
nobootmem.c::__free_pages_memory().

If I change the __ffs() implementation from the inline assembly to generic
bitops everything is fine.

I'm using gcc 8.1.0 from [1] and gdb 8.1.0.20180625-git

[1] http://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/

--
Sincerely yours,
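
For readers unfamiliar with what a "generic" __ffs() looks like, here is a
minimal portable C sketch that finds the index of the least significant set
bit by binary search. It is written for illustration in the spirit of generic
bitops and is not copied from the kernel source; the name sketch_ffs() is
hypothetical. Like __ffs(), the result is meaningless for a zero argument, so
the sketch asserts against it.

```c
#include <assert.h>
#include <limits.h>

/* Return the index (0-based) of the least significant set bit of word. */
static inline unsigned long sketch_ffs(unsigned long word)
{
    unsigned long num = 0;

    assert(word != 0); /* undefined for 0, as with __ffs() */

#if ULONG_MAX > 0xffffffffUL
    /* Only compiled when unsigned long is wider than 32 bits. */
    if ((word & 0xffffffffUL) == 0) {
        num += 32;
        word >>= 32;
    }
#endif
    /* Halve the search window each step: 16, 8, 4, 2, then 1 bit. */
    if ((word & 0xffffUL) == 0) {
        num += 16;
        word >>= 16;
    }
    if ((word & 0xffUL) == 0) {
        num += 8;
        word >>= 8;
    }
    if ((word & 0xfUL) == 0) {
        num += 4;
        word >>= 4;
    }
    if ((word & 0x3UL) == 0) {
        num += 2;
        word >>= 2;
    }
    if ((word & 0x1UL) == 0)
        num += 1;

    return num;
}
```

Because this version is plain C, the compiler knows exactly which values it
reads and writes, so it cannot silently overwrite a caller's variable the way
an incorrectly described inline-assembly statement can.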

2018-07-02 06:22:04

by Yoshinori Sato

Subject: Re: h8300: BUG: Bad page state in process swapper (was: Re: why do we still need bootmem allocator?)

On Sun, 01 Jul 2018 21:22:46 +0900,
Mike Rapoport wrote:
>
> (added Yoshinori Sato, here's the beginning of the discussion:
> https://lore.kernel.org/lkml/[email protected]/)
>
> On Wed, Jun 27, 2018 at 07:02:06PM +0300, Mike Rapoport wrote:
> > On Wed, Jun 27, 2018 at 07:33:55AM -0600, Rob Herring wrote:
> > > On Wed, Jun 27, 2018 at 5:27 AM Mike Rapoport <[email protected]> wrote:
> > > >
> > > > I've tried running the current upstream on h8300 gdb simulator and it
> > > > failed:
> > >
> > > It seems my patch[1] is still not applied. The maintainer said he applied it.
> >
> > I've applied it manually. Without it unflatten_and_copy_device_tree() fails
> > to allocate memory. It indeed can be fixed with moving bootmem_init()
> > before, as you've noted in the commit message.
> >
> > I'll try to dig deeper into it.
> >
> > > > [ 0.000000] BUG: Bad page state in process swapper pfn:00004
> > > > [ 0.000000] page:007ed080 count:0 mapcount:-128 mapping:00000000
> > > > index:0x0
> > > > [ 0.000000] flags: 0x0()
> > > > [ 0.000000] raw: 00000000 0040bdac 0040bdac 00000000 00000000 00000002
> > > > ffffff7f 00000000
> > > > [ 0.000000] page dumped because: nonzero mapcount
> > > > ---Type <return> to continue, or q <return> to quit---
> > > > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc2+ #50
> > > > [ 0.000000] Stack from 00401f2c:
> > > > [ 0.000000] 00401f2c 001116cb 007ed080 00401f40 000e20e6 00401f54
> > > > 0004df14 00000000
> > > > [ 0.000000] 007ed080 007ed000 00401f5c 0004df8c 00401f90 0004e982
> > > > 00000044 00401fd1
> > > > [ 0.000000] 007ed000 007ed000 00000000 00000004 00000008 00000000
> > > > 00000003 00000011
> > > > [ 0.000000]
> > > > [ 0.000000] Call Trace:
> > > > [ 0.000000] [<000e20e6>] [<0004df14>] [<0004df8c>] [<0004e982>]
> > > > [ 0.000000] [<00051a28>] [<00001000>] [<00000100>]
> > > > [ 0.000000] Disabling lock debugging due to kernel taint
> > > >
> > > > With v4.13 I was able to get to "no valid init found".
> > > >
> > > > I had a quick look at h8300 memory initialization and it seems it has
> > > > starting pfn set to 0 while fdt defines memory start at 4M.
> > >
> > > Perhaps there's another issue.
>
> In my setup this is caused by __ffs() clobbering start pfn in
> nobootmem.c::__free_pages_memory().
>
> If I change the __ffs() implementation from the inline assembly to generic
> bitops everything is fine.

OK.
The current bitops.h implementation depends on some of gcc's behavior.
I think it needs to be changed in a more general way so that it also
works with the new gcc.

Please wait until it gets fixed.

> I'm using gcc 8.1.0 from [1] and gdb 8.1.0.20180625-git
>
> [1] http://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/
>
>
> --
> Sincerely yours,
>

--
Yosinori Sato

2018-07-12 14:41:26

by Yoshinori Sato

Subject: Re: h8300: BUG: Bad page state in process swapper (was: Re: why do we still need bootmem allocator?)

On Sun, 01 Jul 2018 21:22:46 +0900,
Mike Rapoport wrote:
>
> (added Yoshinori Sato, here's the beginning of the discussion:
> https://lore.kernel.org/lkml/[email protected]/)
>
> On Wed, Jun 27, 2018 at 07:02:06PM +0300, Mike Rapoport wrote:
> > On Wed, Jun 27, 2018 at 07:33:55AM -0600, Rob Herring wrote:
> > > On Wed, Jun 27, 2018 at 5:27 AM Mike Rapoport <[email protected]> wrote:
> > > >
> > > > I've tried running the current upstream on h8300 gdb simulator and it
> > > > failed:
> > >
> > > It seems my patch[1] is still not applied. The maintainer said he applied it.
> >
> > I've applied it manually. Without it unflatten_and_copy_device_tree() fails
> > to allocate memory. It indeed can be fixed with moving bootmem_init()
> > before, as you've noted in the commit message.
> >
> > I'll try to dig deeper into it.
> >
> > > > [ 0.000000] BUG: Bad page state in process swapper pfn:00004
> > > > [ 0.000000] page:007ed080 count:0 mapcount:-128 mapping:00000000
> > > > index:0x0
> > > > [ 0.000000] flags: 0x0()
> > > > [ 0.000000] raw: 00000000 0040bdac 0040bdac 00000000 00000000 00000002
> > > > ffffff7f 00000000
> > > > [ 0.000000] page dumped because: nonzero mapcount
> > > > ---Type <return> to continue, or q <return> to quit---
> > > > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc2+ #50
> > > > [ 0.000000] Stack from 00401f2c:
> > > > [ 0.000000] 00401f2c 001116cb 007ed080 00401f40 000e20e6 00401f54
> > > > 0004df14 00000000
> > > > [ 0.000000] 007ed080 007ed000 00401f5c 0004df8c 00401f90 0004e982
> > > > 00000044 00401fd1
> > > > [ 0.000000] 007ed000 007ed000 00000000 00000004 00000008 00000000
> > > > 00000003 00000011
> > > > [ 0.000000]
> > > > [ 0.000000] Call Trace:
> > > > [ 0.000000] [<000e20e6>] [<0004df14>] [<0004df8c>] [<0004e982>]
> > > > [ 0.000000] [<00051a28>] [<00001000>] [<00000100>]
> > > > [ 0.000000] Disabling lock debugging due to kernel taint
> > > >
> > > > With v4.13 I was able to get to "no valid init found".
> > > >
> > > > I had a quick look at h8300 memory initialization and it seems it has
> > > > starting pfn set to 0 while fdt defines memory start at 4M.
> > >
> > > Perhaps there's another issue.
>
> In my setup this is caused by __ffs() clobbering start pfn in
> nobootmem.c::__free_pages_memory().
>
> If I change the __ffs() implementation from the inline assembly to generic
> bitops everything is fine.
>
> I'm using gcc 8.1.0 from [1] and gdb 8.1.0.20180625-git

OK, fixed.
The declaration of the clobbered register was incomplete.
It works fine with NO_BOOTMEM.

> [1] http://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/
>
>
> --
> Sincerely yours,
>

--
Yosinori Sato
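
The class of bug described above, an inline-assembly statement that modifies
a register without declaring it, can be illustrated outside h8300. The actual
h8300 fix is not shown in this thread, so the sketch below is an assumption
about the general mechanism only: it uses x86-64 GCC/Clang syntax, falls back
to a plain loop elsewhere, and the name sketch_ffs_asm() is hypothetical. The
important part is the third colon-separated section, the clobber list.

```c
#include <assert.h>

/* Index of the least significant set bit; word must be non-zero. */
static inline unsigned long sketch_ffs_asm(unsigned long word)
{
    unsigned long result;

    assert(word != 0);

#if defined(__GNUC__) && defined(__x86_64__)
    /*
     * The asm body uses rcx as a scratch register. Listing "rcx" (and
     * "cc" for the flags) as clobbered tells the compiler not to keep
     * any live value there across the statement. Leaving "rcx" out would
     * let the compiler hold a caller's variable in it, which the asm
     * would then silently overwrite.
     */
    __asm__("movq %1, %%rcx\n\t"
            "bsfq %%rcx, %0"
            : "=r"(result)
            : "r"(word)
            : "rcx", "cc");
#else
    /* Portable fallback for other compilers and architectures. */
    result = 0;
    while ((word & 1UL) == 0) {
        word >>= 1;
        result++;
    }
#endif
    return result;
}
```

Whether a missing clobber actually corrupts anything depends on the
compiler's register allocation, which would be consistent with the failure
appearing only with a newer gcc.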