2001-10-01 05:58:28

by Ingo Molnar

Subject: Re: __get_free_pages(): is the MEM really mine?


On Sun, 30 Sep 2001, Bernd Harries wrote:

> Is there a guarantee that the n - 1 pages above the 1st one are not
> donated to other programs while my driver uses them?

yes. The 2MB block of 512 x 4k pages (we should perhaps call it an 'order 9
page') is yours.
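
To make the arithmetic behind 'order 9' explicit, here is a small sketch of the order/size relationship. It assumes 4k pages (as on i386); the names EXAMPLE_PAGE_SHIFT, order_to_pages(), order_to_bytes() and size_to_order() are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define EXAMPLE_PAGE_SHIFT 12

/* An order-n block consists of 2^n physically adjacent pages. */
static unsigned long order_to_pages(unsigned int order)
{
    assert(order < 20);            /* keep the shifts below within range */
    return 1UL << order;
}

/* Size in bytes of an order-n block. */
static unsigned long order_to_bytes(unsigned int order)
{
    return order_to_pages(order) << EXAMPLE_PAGE_SHIFT;
}

/* Smallest order whose block is at least 'bytes' long. */
static unsigned int size_to_order(unsigned long bytes)
{
    unsigned int order = 0;

    while (order_to_bytes(order) < bytes)
        order++;
    return order;
}
```

With these helpers, order_to_pages(9) gives 512 pages and order_to_bytes(9) gives 2MB, while a request one byte larger than 2MB already needs order 10.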

> > is it a fundamental property of the hardware that it needs a contiguous
> > physical memory buffer?
>
> Yes. The FW on the card demands it.

ok. then i'd suggest doing all this allocation at boot time and never
deallocating it. This is the safest method, unless there is a reason to
have the driver as a module (for other than development purposes).

> I'll move the code to init_module later once it is stable.

even init_module() can be executed much later: e.g. kmod removes the module
because it's unused, and it's reinserted later. So generally it's not
robust to expect a 9th order allocation to succeed at module_init()
time.

the fundamental issue is not the laziness of Linux VM developers. 99.9% of
all allocations are order 0. 99.9% of the remaining allocations are order
1 or 2. It takes a fair amount of overhead and complexity to handle
high-order allocations 'well' - it takes even more effort (and a perverse
limitation on the use of pointers) to guarantee the success of such
allocations all the time.
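
A toy model may help show why high orders are fragile. The sketch below is not how the buddy allocator tracks memory; it assumes a plain per-page 'used' array and a made-up helper, count_free_blocks(), that counts naturally aligned, completely free blocks of a given order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count naturally aligned, completely free blocks of 2^order pages.
 * used[i] != 0 means page i is allocated. Simplified model only.
 */
static size_t count_free_blocks(const unsigned char *used, size_t npages,
                                unsigned int order)
{
    size_t block, base, i;
    size_t count = 0;

    assert(order < 8 * sizeof(size_t));
    block = (size_t)1 << order;

    for (base = 0; base + block <= npages; base += block) {
        /* A single used page anywhere in the block disqualifies it. */
        for (i = 0; i < block; i++)
            if (used[base + i])
                break;
        if (i == block)
            count++;
    }
    return count;
}
```

In a 1024-page region, marking just pages 0 and 512 as used leaves 1022 pages free, yet the number of free order-9 blocks drops from 2 to 0. Two stray order-0 allocations are enough to make the 2MB request fail.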

there is a longer-term and robust solution that could be used though. We
could support a generic 'physical memory pool', that gets allocated on
bootup (via eg. a physmem=10m kernel boot option), and never gets used for
other than such critical allocations. Your driver could call eg.
alloc_physmem(size) and free_physmem(). It would work similarly to
bootmem.c. This 'physical memory pool' would never be used by generic
subsystems - only drivers which support hardware with such limitations are
allowed to use it. The advantage of this approach is that there would be
one generic way to put physically contiguous RAM aside for such drivers -
so the driver would not have to worry about the VM situation. The other
advantage is that we could decrease MAX_ORDER significantly (to around 7)
- support for higher orders increases the runtime overhead of the buddy
allocator, even for low-order allocations.

(later on we could even add support to grow and shrink the size of the
physical memory pool (within certain boundaries), so it could be sized
boot-time.)

would anything like this be useful? Since it's a completely separate pool
(in fact it won't even show up in the normal memory statistics), it does
not disturb the existing VM in any way.
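
A minimal user-space sketch of what such a pool might look like follows. Everything here is an assumption made for illustration: the signatures of alloc_physmem() and free_physmem() (the size argument to free_physmem() is borrowed from the bootmem style), the 4k page size, the 10 MB pool matching the physmem=10m example, and the static array that stands in for RAM reserved at boot. It uses a first-fit scan over a per-page flag array and does no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_PAGE_SIZE 4096UL   /* assumed page size */
#define POOL_PAGES     2560UL   /* 10 MB pool, as in physmem=10m */

/* Stand-in for the physical range that would be set aside at boot. */
static unsigned char pool[POOL_PAGES * POOL_PAGE_SIZE];
static unsigned char page_used[POOL_PAGES];   /* one flag per page */

/* Hypothetical: return 'size' bytes of contiguous pool memory, or NULL. */
void *alloc_physmem(size_t size)
{
    size_t need, start, i;
    size_t run = 0;

    if (size == 0 || size > sizeof(pool))
        return NULL;
    need = (size + POOL_PAGE_SIZE - 1) / POOL_PAGE_SIZE;

    /* First fit: look for 'need' consecutive free pages. */
    for (i = 0; i < POOL_PAGES; i++) {
        run = page_used[i] ? 0 : run + 1;
        if (run == need) {
            start = i + 1 - need;
            memset(&page_used[start], 1, need);
            return pool + start * POOL_PAGE_SIZE;
        }
    }
    return NULL;
}

/* Hypothetical: give back a range obtained from alloc_physmem(). */
void free_physmem(void *addr, size_t size)
{
    size_t off = (size_t)((unsigned char *)addr - pool);
    size_t start = off / POOL_PAGE_SIZE;
    size_t need = (size + POOL_PAGE_SIZE - 1) / POOL_PAGE_SIZE;

    assert(off % POOL_PAGE_SIZE == 0);
    assert(start + need <= POOL_PAGES);
    memset(&page_used[start], 0, need);
}
```

A driver in this model would call alloc_physmem(2 * 1024 * 1024) once at load time and free_physmem() with the same size on unload. Because nothing else allocates from the pool, the request succeeds regardless of how fragmented the rest of memory is, as long as the pool itself has room.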

Ingo


2001-10-05 08:49:55

by Bernd Harries

Subject: Re: __get_free_pages(): is the MEM really mine?

Hi Ingo,

The problem with mmapping a kernel buffer to userspace is still there.

It appears that __get_free_pages(GFP_KERNEL, max_order) alone is not enough to request a reliable buffer. On Monday I already sent a message to
the list which you may have overlooked.

In my driver I now have the normal method on minor 26 and Roman Zippel's method on minor 27. I have used minor 27 quite heavily already and it
appears stable. Using minor 26 makes the system unstable almost immediately.

I would like you to try my driver, either on my system via remote login, or I could try to reproduce the effect without DMA accesses to the buffer
and modify the driver so that you can try it without the hardware in your computer.

Is either of these two ways possible for you?

Thanks,
--
Bernd Harries

[email protected] http://bharries.freeyellow.com
[email protected] Tel. +49 421 809 7343 priv. | MSB First!
[email protected] +49 421 457 3966 offi. | Linux-m68k
[email protected] +49 172 139 6054 handy | Medusa T40