2010-06-05 09:06:06

by Tejun Heo

Subject: Re: OSS Proxy Jack slave

Hello,

(cc'ing Miklos and mailing lists)

On 06/05/2010 10:50 AM, Mikael Bouillot wrote:
> Just a quick question: have you reached some form of understanding
> with Miklos Szeredi over the direct mmap in FUSE issue?
>
> I've read the thread from February 2010, and though most of the
> kernel material flies way over my head, I've got the general idea.
>
> FUSD has had mmap support since around the time I started using
> oss2jack, so I don't know exactly how many apps I use require it
> (besides Quake3). I'll put a printf in my oss2jack's mmap() to
> see what I'll be missing.
>
> I could use a patched FUSE, but no longer having to maintain a
> piece of out-of-tree kernel code was The. Whole. Point. of
> switching from FUSD to a CUSE-based system T_T

Miklos, any update on the mmap interface?

Thanks.

--
tejun


2010-06-09 10:08:22

by Miklos Szeredi

Subject: Re: [fuse-devel] OSS Proxy Jack slave

On Sat, 05 Jun 2010, Tejun Heo wrote:
> Hello,
>
> (cc'ing Miklos and mailing lists)
>
> On 06/05/2010 10:50 AM, Mikael Bouillot wrote:
> > Just a quick question: have you reached some form of understanding
> > with Miklos Szeredi over the direct mmap in FUSE issue?
> >
> > I've read the thread from February 2010, and though most of the
> > kernel material flies way over my head, I've got the general idea.
> >
> > FUSD has had mmap support since around the time I started using
> > oss2jack, so I don't know exactly how many apps I use require it
> > (besides Quake3). I'll put a printf in my oss2jack's mmap() to
> > see what I'll be missing.
> >
> > I could use a patched FUSE, but no longer having to maintain a
> > piece of out-of-tree kernel code was The. Whole. Point. of
> > switching from FUSD to a CUSE-based system T_T
>
> Miklos, any update on the mmap interface?

Sorry, got distracted by splice support on the fuse device.

I thought a bit about mmap in the last couple of days, and here's what
I came up with. This week I'll take a stab at implementing some of
this (as a hack week project, let's say :).

First, I think server side mmap might be nice to have but not strictly
necessary. I looked at osspd and it just memcopies in and out of the
mmaped ring buffer. Replacing those memcopies with explicit syscalls
to get and put the data should work fine. I doubt that the latency or
CPU overhead introduced by the syscalls would actually matter in
practice.

So we have the problem of how to do server initiated data transfer
to/from kernel buffers. We could introduce the following
"notifications", which are initiated by the filesystem:

store request:
u64 nodeid
u64 offset
u32 size
u32 padding
data...

retrieve request:
u64 request_id
u64 nodeid
u64 offset
u32 size
u32 padding

retrieve reply:
u64 request_id
data...

Notice the asymmetry: store doesn't need a reply but retrieve does,
which is unfortunate as it makes it harder to implement on both the
kernel side and the server side.
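
To make the proposed layouts concrete, here is a minimal sketch that
packs and unpacks them with Python's struct module. The field order
follows the lists above; the byte order, the absence of implicit
alignment, and all names in the code are assumptions for illustration,
not part of any existing FUSE API.

```python
import struct

# Hypothetical wire layouts taken from the field lists above.
# "=" means native byte order with standard sizes and no implicit
# alignment padding; that choice is an assumption.
STORE_HDR = struct.Struct("=QQII")         # nodeid, offset, size, padding
RETRIEVE_REQ = struct.Struct("=QQQII")     # request_id, nodeid, offset, size, padding
RETRIEVE_REPLY_HDR = struct.Struct("=Q")   # request_id


def pack_store(nodeid, offset, data):
    """Build a store request: fixed header followed by the payload."""
    return STORE_HDR.pack(nodeid, offset, len(data), 0) + data


def pack_retrieve_request(request_id, nodeid, offset, size):
    """Build a retrieve request; it carries no payload."""
    return RETRIEVE_REQ.pack(request_id, nodeid, offset, size, 0)


def pack_retrieve_reply(request_id, data):
    """Build a retrieve reply: request_id followed by the payload."""
    return RETRIEVE_REPLY_HDR.pack(request_id) + data


def unpack_retrieve_reply(message):
    """Split a retrieve reply into (request_id, payload)."""
    (request_id,) = RETRIEVE_REPLY_HDR.unpack_from(message)
    return request_id, message[RETRIEVE_REPLY_HDR.size:]
```

A store is a single one-way message, while a retrieve has to be paired
with its reply through request_id, which is the asymmetry noted above.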

Next thing is how to deal with multiple buffers for each char device.
For the above to continue to work we need to make sure there's a
separate nodeid associated with each buffer. The most general thing
would be if MMAP reply contained a nodeid which identified the buffer.
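
As a rough illustration of the bookkeeping this implies, the toy model
below keeps one buffer per nodeid and applies store and retrieve to the
buffer that the nodeid selects. It is an in-memory sketch only; the
class and method names are hypothetical and it is not kernel or FUSE
library code.

```python
class BufferTable:
    """Toy model: one buffer per nodeid, as handed out in an MMAP reply."""

    def __init__(self):
        self._buffers = {}
        self._next_nodeid = 1

    def reply_to_mmap(self, length):
        """Allocate a buffer and return the nodeid that identifies it."""
        nodeid = self._next_nodeid
        self._next_nodeid += 1
        self._buffers[nodeid] = bytearray(length)
        return nodeid

    def store(self, nodeid, offset, data):
        """Copy data into the buffer selected by nodeid."""
        buf = self._buffers[nodeid]
        if offset < 0 or offset + len(data) > len(buf):
            raise ValueError("store outside of buffer")
        buf[offset:offset + len(data)] = data

    def retrieve(self, nodeid, offset, size):
        """Copy size bytes out of the buffer selected by nodeid."""
        buf = self._buffers[nodeid]
        if offset < 0 or offset + size > len(buf):
            raise ValueError("retrieve outside of buffer")
        return bytes(buf[offset:offset + size])
```

With a distinct nodeid per buffer, two mappings of the same char device
never touch each other's data, and the store/retrieve messages need no
extra field to tell the buffers apart.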

Do you see any issues with the above?

Thanks,
Miklos

2010-06-10 11:24:07

by Tejun Heo

Subject: Re: [fuse-devel] OSS Proxy Jack slave

Hello, Miklos.

On 06/09/2010 12:08 PM, Miklos Szeredi wrote:
> I thought a bit about mmap in the last couple of days, and here's what
> I came up with. This week I'll take a stab at implementing some of
> this (as a hack week project, let's say :).

Cool. :-)

> First, I think server side mmap might be nice to have but not strictly
> necessary. I looked at osspd and it just memcopies in and out of the
> mmaped ring buffer. Replacing those memcopies with explicit syscalls
> to get and put the data should work fine. I doubt that the latency or
> CPU overhead introduced by the syscalls would actually matter in
> practice.

The latency and CPU overhead per se aren't problematic and for osspd,
copying in and out should be just fine as all update events are
clearly denoted.

> So we have the problem of how to do server initiated data transfer
> to/from kernel buffers. We could introduce the following
> "notifications", which are initiated by the filesystem:
>
> store request:
> u64 nodeid
> u64 offset
> u32 size
> u32 padding
> data...
>
> retrieve request:
> u64 request_id
> u64 nodeid
> u64 offset
> u32 size
> u32 padding
>
> retrieve reply:
> u64 request_id
> data...
>
> Notice the asymmetry: store doesn't need a reply but retrieve does,
> which is unfortunate as it makes it harder to implement on both the
> kernel side and the server side.

How does the kernel know when to issue store or retrieve?

> Next thing is how to deal with multiple buffers for each char device.
> For the above to continue to work we need to make sure there's a
> separate nodeid associated with each buffer. The most general thing
> would be if MMAP reply contained a nodeid which identified the buffer.
>
> Do you see any issues with the above?

This relates to the previous question. mmap can also be used without
every update being signaled by some kind of event; the server is then
expected to watch the mmapped area and react. That's okay if the
server can share the mapped page, but if it has to poll by copying
data out of the kernel buffer each time, it can get prohibitively
expensive unless it can ask the kernel "what changed since when?",
which would be pretty nasty to implement.
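
The cost of polling can be sketched roughly as follows. The function
is hypothetical: retrieve stands for any callable that copies bytes
out of the kernel buffer, and the point is only that every poll copies
and compares the whole region even when nothing has changed.

```python
def poll_once(retrieve, size, snapshot):
    """Copy the whole buffer out and report which byte offsets changed.

    retrieve(offset, size) is an assumed callable that returns a copy of
    the requested bytes. The work done here is proportional to size on
    every call, no matter how few bytes actually changed.
    """
    current = retrieve(0, size)
    changed = [i for i, (old, new) in enumerate(zip(snapshot, current))
               if old != new]
    return current, changed
```

A server that shares the mapped page would skip the copy, which is why
a copy-only interface is limiting for clients that never signal their
updates.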

Thanks.

--
tejun

2010-06-10 11:52:40

by Miklos Szeredi

Subject: Re: [fuse-devel] OSS Proxy Jack slave

On Thu, 10 Jun 2010, Tejun Heo wrote:
> > First, I think server side mmap might be nice to have but not strictly
> > necessary. I looked at osspd and it just memcopies in and out of the
> > mmaped ring buffer. Replacing those memcopies with explicit syscalls
> > to get and put the data should work fine. I doubt that the latency or
> > CPU overhead introduced by the syscalls would actually matter in
> > practice.
>
> The latency and CPU overhead per se aren't problematic and for osspd,
> copying in and out should be just fine as all update events are
> clearly denoted.

Good.

> How does the kernel know when to issue store or retrieve?

No, it's userspace that initiates store and retrieve.

> > Next thing is how to deal with multiple buffers for each char device.
> > For the above to continue to work we need to make sure there's a
> > separate nodeid associated with each buffer. The most general thing
> > would be if MMAP reply contained a nodeid which identified the buffer.
> >
> > Do you see any issues with the above?
>
> This relates to the previous question. mmap can also be used without
> every update being signaled by some kind of event; the server is then
> expected to watch the mmapped area and react. That's okay if the
> server can share the mapped page, but if it has to poll by copying
> data out of the kernel buffer each time, it can get prohibitively
> expensive unless it can ask the kernel "what changed since when?",
> which would be pretty nasty to implement.

If necessary we could export page protection and page fault interfaces
to userspace, which would allow it to watch for changes.

But that's not needed for OSSP, right?

Thanks,
Miklos

2010-06-10 13:14:19

by Tejun Heo

Subject: Re: [fuse-devel] OSS Proxy Jack slave

Hello,

Nope, it's not required for osspd but it's a rather strong limitation
on the interface which would be pretty difficult to work around later.
Page protection tricks would be very cumbersome and unscalable. Hmmm,
it does remove the necessity for allocating kernel pages, right?
Still, it seems like a big limitation. Are you sure this is a good
idea?

Thanks.

--
tejun

2010-06-10 13:28:25

by Miklos Szeredi

Subject: Re: [fuse-devel] OSS Proxy Jack slave

On Thu, 10 Jun 2010, Tejun Heo wrote:
> Hello,
>
> Nope, it's not required for osspd but it's a rather strong
> limitation on the interface which would be pretty difficult to work
> around later. Page protection tricks would be very cumbersome and
> unscalable. Hmmm, it does remove the necessity for allocating
> kernel pages, right? Still, it seems like a big limitation. Are
> you sure this is a good idea?

I'm sure it's a good idea to provide an interface that doesn't _require_
server side mmaps. This doesn't preclude an mmap interface later on,
but I'd rather start with a "pure" read/write device interface, which
is in a lot of ways simpler to solve than mmap.

As for pinning kernel pages, I still think it's a good idea for the
char device interface, but you're right, it's not necessary.

In fact I'm mostly ready with an implementation of store/retrieve that
just pokes the regular page cache.

Thanks,
Miklos

2010-06-10 15:27:22

by Tejun Heo

Subject: Re: [fuse-devel] OSS Proxy Jack slave

Hello,

On 06/10/2010 03:28 PM, Miklos Szeredi wrote:
> On Thu, 10 Jun 2010, Tejun Heo wrote:
> I'm sure it's a good idea to provide an interface that doesn't _require_
> server side mmaps. This doesn't preclude an mmap interface later on,
> but I'd rather start with a "pure" read/write device interface, which
> is in a lot of ways simpler to solve than mmap.

It doesn't preclude that but then again having two interfaces can be a
bit silly.

> As for pinning kernel pages, I still think it's a good idea for the
> char device interface, but you're right, it's not necessary.
>
> In fact I'm mostly ready with an implementation of store/retrieve that
> just pokes the regular page cache.

But I think if we can avoid using pinned kernel pages, pure r/w based
implementation could be different and useful enough, so yeah, great.

Thanks.

--
tejun