2003-01-24 14:15:06

by Thomas Schlichter

Subject: [PATCH] to support hookable flush_tlb* functions

Hello,

with this mail I am sending a patch that allows kernel modules to hook into the
different flush_tlb* functions defined in <asm/tlbflush.h> or <asm/pgtable.h> in
order to synchronize device TLBs.

This is necessary for devices that provide their own TLB and cannot participate
in the CPU bus shootdown protocol. With this patch it is possible to ensure
TLB consistency.
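
For readers who have not opened the attachments, the idea can be pictured with a small userspace sketch. It is not the patch itself: `register_tlb_hook`, `hooked_flush_tlb`, the counters and the stand-in `local_flush_tlb` are hypothetical names chosen for illustration. Only `struct tlb_hook`, `flush_tlb_hook` and the `flush_tlb` callback echo identifiers quoted in the review further down this thread. Locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* One registered hook; a device driver would fill in the callback. */
struct tlb_hook {
	void (*flush_tlb)(void);   /* called whenever the CPU TLB is flushed */
	struct tlb_hook *next;     /* next hook in the chain */
};

static struct tlb_hook *tlb_hooks;   /* head of the hook chain */

/* Hypothetical helper: add a hook at the head of the chain (no locking). */
static void register_tlb_hook(struct tlb_hook *hook)
{
	assert(hook != NULL);
	hook->next = tlb_hooks;
	tlb_hooks = hook;
}

/* Walk the chain and let every device flush its own TLB. */
static void flush_tlb_hook(void)
{
	struct tlb_hook *hook = tlb_hooks;

	while (hook) {
		if (hook->flush_tlb)
			hook->flush_tlb();
		hook = hook->next;
	}
}

/* Stand-in for the architecture's CPU-local flush; it only counts calls. */
static int cpu_flushes;
static void local_flush_tlb(void) { cpu_flushes++; }

/* Hypothetical hooked flush: CPU TLB first, then the device TLBs. */
static void hooked_flush_tlb(void)
{
	local_flush_tlb();
	flush_tlb_hook();
}

/* Example device callback that just counts invalidations. */
static int device_flushes;
static void example_device_flush(void) { device_flushes++; }
```

A driver would fill in a `struct tlb_hook` with its own callback and register it once; every later call to the hooked flush then reaches the device as well as the CPU.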

Currently this extension could be used by high performance interconnects
like QsNet from Quadrics (http://www.quadrics.com), and in the near
future by even higher performance, low latency NICs that will implement
direct user space DMA transfers to not pinned user pages. TLBs are a
mandatory requirement there.

Currently I am writing my diploma thesis on the development of such a device,
for which I need this patch. As it looks good to me, I want to provide it to the
community so that it can be reviewed and tested further. (The i386 parts are
tested and work fine for me.)

The patch consists of two parts: a generic part and, for each supported
architecture, another part that depends on the generic one.

Attached to this mail are only the generic part and the architecture-dependent
part for i386-compatible machines, so as not to waste everyone's bandwidth... But
if requested I can send you the patches for the other architectures, too.

The i386 patch also includes some cleanups by renaming __flush_tlb_* to
local_flush_tlb_*.

I hope these patches will make it into the kernel sources some time. (Perhaps
even into 2.6.x?)

Sincerely yours

Thomas Schlichter

P.S.: To test this patch I've also written a module that counts the different
flush_tlb* calls and shows them in /proc/tlbstat. If requested I could send you
this, too.


Attachments:
tlbhook_generic.patch (4.99 kB)
tlbhook_i386.patch (9.26 kB)

2003-01-24 20:53:25

by Andrew Morton

Subject: Re: [PATCH] to support hookable flush_tlb* functions

Thomas Schlichter wrote:
>
> Hello,
>
> with this mail I send a patch that allows kernel modules to hook into the
> different flush_tlb* functions defined in <asm/tlbflush.h> or <asm/pgtable.h> in
> order to synchronize devices TLBs.

Looks sensible enough.

A few coding-style nits:

+typedef struct tlb_hook_struct {
+...
+} tlb_hook_t;

typedefs are unpopular. Please just use

struct tlb_hook {
...
};

+static inline void flush_tlb_hook( void )

extraneous whitespace - Linus style is flush_tlb_hook(void)

+ while( hook )

while (hook)

+ {
+ if( hook->flush_tlb )

if (hook->flush_tlb)

+ hook->flush_tlb( );

hook->flush_tlb();

etc...



The unregister_hook implementation is racy - the hook could be in use on
another CPU. That's OK - we have the RCU infrastructure which will allow
hooks to be torn down safely. And nice people who can help others who are
using that code.
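
To illustrate the race and one way to close it, here is a minimal userspace sketch using C11 atomics. It is not RCU and not a kernel API: it replaces the grace period with a crude counter of active walkers, and all function names are hypothetical. It assumes that register and unregister calls are serialized by the caller and that a callback never unregisters a hook itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct tlb_hook {
	void (*flush_tlb)(void);
	struct tlb_hook *_Atomic next;
};

static struct tlb_hook *_Atomic tlb_hooks;  /* head of the chain */
static atomic_int active_walkers;           /* walkers currently in the chain */

/* Publish a fully initialized hook at the head of the chain. */
static void register_tlb_hook(struct tlb_hook *hook)
{
	assert(hook != NULL);
	atomic_store(&hook->next, atomic_load(&tlb_hooks));
	atomic_store(&tlb_hooks, hook);
}

/* Read side: announce the walk, call every hook, then leave. */
static void flush_tlb_hook(void)
{
	struct tlb_hook *hook;

	atomic_fetch_add(&active_walkers, 1);
	for (hook = atomic_load(&tlb_hooks); hook; hook = atomic_load(&hook->next))
		if (hook->flush_tlb)
			hook->flush_tlb();
	atomic_fetch_sub(&active_walkers, 1);
}

/* Unlink the hook, then wait until no walker can still be using it. */
static int unregister_tlb_hook(struct tlb_hook *hook)
{
	struct tlb_hook *_Atomic *link = &tlb_hooks;
	struct tlb_hook *cur;

	while ((cur = atomic_load(link)) != NULL) {
		if (cur == hook) {
			/* hook->next stays intact so a walker on it can move on */
			atomic_store(link, atomic_load(&hook->next));
			while (atomic_load(&active_walkers) != 0)
				;       /* crude stand-in for a grace period */
			return 0;       /* the caller may now free the hook */
		}
		link = &cur->next;
	}
	return -1;                      /* hook was not registered */
}

/* Example device callback that just counts invalidations. */
static int device_flushes;
static void example_device_flush(void) { device_flushes++; }
```

The wait loop is the weak point of this sketch: under a steady stream of flushes it may spin for a long time, which is the kind of problem a real grace-period mechanism is meant to avoid.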

> The i386 patch also includes some cleanups by renaming __flush_tlb_* to
> local_flush_tlb_*.

That makes plenty of sense.

> I hope some time this patches will make it into the kernel sources. (perhaps
> even into 2.6.x ?)

Well the big questions are: where are the drivers for these devices? When
can we expect to see significant demand for these devices? Will there be
significant demand across the lifetime of the 2.6 kernel?

BTW, when you say "low latency NICs that will implement direct user space DMA
transfers to not pinned user pages", what do you mean by "not pinned"?

2003-01-24 21:51:25

by David Miller

Subject: Re: [PATCH] to support hookable flush_tlb* functions

On Fri, 2003-01-24 at 13:21, Andrew Morton wrote:
> Well the big questions are: where are the drivers for these devices?

Andrew, have a look at:

http://www.dolphinics.com/support/os/source.html

it is possibly the most popular example.

2003-01-26 23:21:16

by Thomas Schlichter

Subject: Re: [PATCH] to support hookable flush_tlb* functions

Hi,

On 24 Jan 2003 at 22:21, Andrew Morton wrote:
> A few coding-style nits:
>
> +typedef struct tlb_hook_struct {
> +...
> +} tlb_hook_t;
>
> typedefs are unpopular. Please just use
>
> struct tlb_hook {
> ...
> };
>
> +static inline void flush_tlb_hook( void )
>
> extraneous whitespace - Linus style is flush_tlb_hook(void)
>
> + while( hook )
>
> while (hook)
>
> + {
> + if( hook->flush_tlb )
>
> if (hook->flush_tlb)
>
> + hook->flush_tlb( );
>
> hook->flush_tlb();
>
> etc...

Thanks for these nits; they should be fixed in the new version of the patches
attached to this mail...

> The unregister_hook implementation is racy - the hook could be in use on
> another CPU. That's OK - we have the RCU infrastructure which will allow
> hooks to be torn down safely. And nice people who can help others who are
> using that code.

I do not know what you meant by the RCU infrastructure, but the
unregister race should be fixed in the new patches, I hope...

> > The i386 patch also includes some cleanups by renaming __flush_tlb_* to
> > local_flush_tlb_*.
>
> That makes plenty of sense.

Fine... ;-)

> > I hope some time this patches will make it into the kernel sources.
> > (perhaps even into 2.6.x ?)
>
> Well the big questions are: where are the drivers for these devices? When
> can we expect to see significant demand for these devices? Will there be
> significant demand across the lifetime of the 2.6 kernel?

Well, I cannot say this for sure as we are just doing university research, but I
really think this may be interesting even for companies that already maintain
their own kernel patches for existing products...

> BTW, when you say "low latency NICs that will implement direct user space
> DMA transfers to not pinned user pages", what do you mean by "not pinned"?

Well, by "not pinned" I mean that the user's memory pages need not be pinned
before the transfer begins...

If the pages are not present, the kernel has to swap them in and pin them
for the transfer; otherwise the NIC may pin the page itself and start the
transfer.

The address translation may be performed directly by the network device or
via an interrupt in the kernel. The translation is then stored in the device's
TLB, which may be flushed either when the page is unpinned (then this patch is
not necessary) or only when the kernel invalidates the translation (then
this patch is necessary).

The second approach has the advantage that the TLB hit ratio can be improved
without keeping many user pages pinned, and thus removed from the memory pool,
for a long time...
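
To make the second approach a bit more concrete, here is a small userspace model of a device TLB that keeps translations until the kernel invalidates them. It is only a sketch under assumptions: the direct-mapped layout, the size of 8 entries and all `dev_tlb_*` names are hypothetical and not taken from any real device or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEV_TLB_ENTRIES 8      /* hypothetical size of the device TLB */
#define PAGE_SHIFT 12          /* 4 KiB pages, as on i386 */

struct dev_tlb_entry {
	bool valid;
	uintptr_t vpn;   /* virtual page number */
	uintptr_t pfn;   /* physical frame number */
};

static struct dev_tlb_entry dev_tlb[DEV_TLB_ENTRIES];

/* Cache a translation (direct-mapped: the slot is chosen by the VPN). */
static void dev_tlb_fill(uintptr_t vaddr, uintptr_t paddr)
{
	uintptr_t vpn = vaddr >> PAGE_SHIFT;
	struct dev_tlb_entry *e = &dev_tlb[vpn % DEV_TLB_ENTRIES];

	e->valid = true;
	e->vpn = vpn;
	e->pfn = paddr >> PAGE_SHIFT;
}

/* Translate vaddr; false means a miss, so the kernel has to be asked. */
static bool dev_tlb_lookup(uintptr_t vaddr, uintptr_t *paddr)
{
	uintptr_t vpn = vaddr >> PAGE_SHIFT;
	uintptr_t offset = vaddr & (((uintptr_t)1 << PAGE_SHIFT) - 1);
	const struct dev_tlb_entry *e = &dev_tlb[vpn % DEV_TLB_ENTRIES];

	if (!e->valid || e->vpn != vpn)
		return false;
	*paddr = (e->pfn << PAGE_SHIFT) | offset;
	return true;
}

/* Would be called from a flush hook when the kernel invalidates one page. */
static void dev_tlb_flush_page(uintptr_t vaddr)
{
	uintptr_t vpn = vaddr >> PAGE_SHIFT;
	struct dev_tlb_entry *e = &dev_tlb[vpn % DEV_TLB_ENTRIES];

	if (e->valid && e->vpn == vpn)
		e->valid = false;
}
```

In this model the entry survives across many transfers and the page does not have to stay pinned in between; the translation only disappears when the flush callback runs.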

Regards
Thomas Schlichter


Attachments:
tlbhook_generic.patch (5.31 kB)
tlbhook_i386.patch (9.27 kB)