2015-07-30 13:09:49

by cee1

Subject: Revisit AF_BUS: is it a better way to implement KDBUS?

Hi all,

I'm interested in the idea of AF_BUS.

There have already been various discussions about it:
* Missing the AF_BUS - https://lwn.net/Articles/504970/
* Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel -
http://lwn.net/Articles/537021/
* presentation-kdbus -
https://github.com/gregkh/presentation-kdbus/blob/master/kdbus.txt
* Re: [GIT PULL] kdbus for 4.1-rc1 - https://lwn.net/Articles/641278/
* The kdbuswreck - https://lwn.net/Articles/641275/

I'm wondering whether it is a better way, that is, a general mechanism
to implement various __Bus__-oriented IPCs, such as Binder[2],
DBus[1], etc.

The original design of AF_BUS is at
https://github.com/Airtau/genivi/blob/master/af_bus-linux/0002-net-bus-Add-AF_BUS-documentation.patch.
What follows is my version of AF_BUS.

Some characteristics of a bus-oriented IPC:
1. A process creates a bus; that process is then called the 'bus master'.
2. Processes connect to a bus and are assigned bus address(es).
3. Sending/receiving multicast messages, in addition to P2P communication.
4. The implementation may be based on a shared-memory model to avoid unnecessary copies.

## How to map point 1: """A process creates a bus; that process is then called the 'bus master'"""
The [bus master] acts:

struct sockaddr_bus {
    sa_family_t sbus_family;               /* AF_BUS */
    unsigned short sbus_addr_ncomp;        /* number of components of sbus_addr */
    char sbus_path[BUS_PATH_MAX];          /* pathname of this bus */
    uint64_t sbus_addr[BUS_ADDR_COMP_MAX]; /* address within the bus */
};
#define BUS_ADDR_MAX (BUS_ADDR_COMP_MAX * sizeof(uint64_t))

char bus_path[] = "/tmp/test"; /* non-abstract path */
char bus_addr[] = "org.example.bus";
struct sockaddr_bus addr = { .sbus_family = AF_BUS };

strncpy(addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
memcpy(addr.sbus_addr, bus_addr, MIN(sizeof(bus_addr), BUS_ADDR_MAX));
addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(bus_addr), 8) / 8, BUS_ADDR_COMP_MAX);

bus_fd = socket(AF_BUS, SOCK_DGRAM, 0);
/* creates a Bus, becomes the master of the bus */
bind(bus_fd, (struct sockaddr *)&addr, sizeof(struct sockaddr_bus));
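
The address setup above can be factored into a small helper. The following is only a sketch for illustration: AF_BUS is not in mainline, so the value of AF_BUS, BUS_PATH_MAX and BUS_ADDR_COMP_MAX below are placeholders I made up, and bus_addr_init() is a hypothetical name. It mirrors the steps in the snippet (copy the path, copy the name including its NUL, round the name length up to whole 64-bit components) but rejects oversized input instead of truncating it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>   /* sa_family_t */

/* Placeholder values: nothing below exists in any system header. */
#ifndef AF_BUS
#define AF_BUS            0xB5
#endif
#define BUS_PATH_MAX      108
#define BUS_ADDR_COMP_MAX 8
#define BUS_ADDR_MAX      (BUS_ADDR_COMP_MAX * sizeof(uint64_t))

struct sockaddr_bus {
    sa_family_t    sbus_family;                    /* AF_BUS */
    unsigned short sbus_addr_ncomp;                /* components used in sbus_addr */
    char           sbus_path[BUS_PATH_MAX];        /* pathname of this bus */
    uint64_t       sbus_addr[BUS_ADDR_COMP_MAX];   /* address within the bus */
};

/*
 * Fill *sa from a bus path and a textual address.
 * Returns 0 on success, -1 if either string does not fit.
 */
static int bus_addr_init(struct sockaddr_bus *sa, const char *path,
                         const char *name)
{
    size_t path_len, name_len;

    assert(sa != NULL && path != NULL && name != NULL);

    path_len = strlen(path);
    name_len = strlen(name) + 1;   /* keep the NUL, as sizeof() does above */

    /* Same BUS_PATH_MAX - 2 limit as the strncpy() call in the snippet. */
    if (path_len > BUS_PATH_MAX - 2 || name_len > BUS_ADDR_MAX)
        return -1;

    memset(sa, 0, sizeof(*sa));
    sa->sbus_family = AF_BUS;
    memcpy(sa->sbus_path, path, path_len);
    memcpy(sa->sbus_addr, name, name_len);

    /* Round the byte length up to whole 64-bit components. */
    sa->sbus_addr_ncomp = (unsigned short)((name_len + 7) / 8);
    return 0;
}
```

With this, "org.example.bus" (15 characters plus the NUL) occupies two components.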


## How to map point 2: """Processes connect to a bus and are assigned bus address(es)"""
### The [bus endpoint] acts:
fd = socket(AF_BUS, SOCK_DGRAM, 0);

/* AUTH message setup */
struct msghdr msghdr = {
    .msg_name = &addr, /* bus master's addr */
    .msg_namelen = sizeof(struct sockaddr_bus),
    .msg_iov = &auth_iovec,
    .msg_iovlen = 1,
};

msghdr.msg_controllen = CMSG_SPACE(sizeof(struct ucred));
msghdr.msg_control = alloca(msghdr.msg_controllen);
cmsg = CMSG_FIRSTHDR(&msghdr);
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_CREDENTIALS;
cmsg->cmsg_len = CMSG_LEN(sizeof(struct ucred));
ucred = (struct ucred *) CMSG_DATA(cmsg);
ucred->pid = getpid();
ucred->uid = getuid();
ucred->gid = getgid();

sendmsg(fd, &msghdr, MSG_NOSIGNAL);

### The [bus master] acts:
int optval = 1;
setsockopt(bus_fd, SOL_SOCKET, SO_PASSCRED, &optval, sizeof(optval));
recvmsg(bus_fd, &msghdr, 0); /* MSG_NOSIGNAL only applies to sends */

/* do AUTH ... */

msghdr.msg_iov = &reply_iovec;
msghdr.msg_iovlen = 1;
msghdr.msg_controllen = 0;
msghdr.msg_control = NULL;

if (auth_ok) {
    /* bus master allocates a bus addr */
    char bus_path[] = "/tmp/test";
    char ret_bus_addr[] = "1.1";
    struct sockaddr_bus ret_addr = { .sbus_family = AF_BUS };

    strncpy(ret_addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
    memcpy(ret_addr.sbus_addr, ret_bus_addr,
           MIN(sizeof(ret_bus_addr), BUS_ADDR_MAX));
    ret_addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(ret_bus_addr), 8) / 8,
                                   BUS_ADDR_COMP_MAX);

    /*
     * 1. bus master returns the bus addr
     * 2. kernel will apply it against the bus endpoint
     * 3. the bus endpoint is then able to talk with endpoints on the bus.
     */
    msghdr.msg_controllen = CMSG_SPACE(sizeof(struct sockaddr_bus));
    msghdr.msg_control = alloca(msghdr.msg_controllen);
    cmsg = CMSG_FIRSTHDR(&msghdr);
    cmsg->cmsg_level = BUS_SOCKET;
    cmsg->cmsg_type = SCM_OWNED_ADDR;
    cmsg->cmsg_len = CMSG_LEN(sizeof(struct sockaddr_bus));
    memcpy(CMSG_DATA(cmsg), &ret_addr, sizeof(struct sockaddr_bus));
}
sendmsg(bus_fd, &msghdr, MSG_NOSIGNAL);


## How to map point 3: """Sending/receiving multicast messages, in addition to P2P communication"""
### P2P communication
Sometimes a bus endpoint may be assigned multiple addresses. It may
want to send a message from a specific address.

struct msghdr msghdr = {
    .msg_name = &dst_addr,
    .msg_namelen = sizeof(struct sockaddr_bus),
    .msg_iov = &msg_iovec,
    .msg_iovlen = 1,
};

char bus_path[] = "/tmp/test";
char bus_addr[] = "com.example.service1";
struct sockaddr_bus src_addr = { .sbus_family = AF_BUS };

strncpy(src_addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
memcpy(src_addr.sbus_addr, bus_addr, MIN(sizeof(bus_addr), BUS_ADDR_MAX));
src_addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(bus_addr), 8) / 8,
                               BUS_ADDR_COMP_MAX);

msghdr.msg_controllen = CMSG_SPACE(sizeof(struct sockaddr_bus));
msghdr.msg_control = alloca(msghdr.msg_controllen);
cmsg = CMSG_FIRSTHDR(&msghdr);
cmsg->cmsg_level = BUS_SOCKET;
cmsg->cmsg_type = SCM_SRC_ADDR;
cmsg->cmsg_len = CMSG_LEN(sizeof(struct sockaddr_bus));
memcpy(CMSG_DATA(cmsg), &src_addr, sizeof(struct sockaddr_bus));

sendmsg(my_sock_fd, &msghdr, MSG_NOSIGNAL);

### Multicast
The multicast address may look like:
{
    .sbus_family = AF_BUS,

    /* In a multicast addr, its bus_path is '*'-terminated */
    .sbus_path = "/tmp/test\0\0\0\0\0...*",

    .sbus_addr_ncomp = 8,
    .sbus_addr = /* 8 * 64bits bitarray for example */
}

The receiver asks the [bus master] for permission to receive
messages from a set of multicast addresses, and the bus master grants
it by replying with a control message:
{
    .cmsg_level = BUS_SOCKET,
    .cmsg_type = SCM_MULTICAST_MATCH,
    .cmsg_data = /* the requested struct sockaddr_bus */
}

How does matching happen?
Let's assume someone sends a message to multicast address maddr1, and
the receiver has been granted a match of maddr2:

The [kernel]:
is_matched = (maddr1 & maddr2) == maddr2;

In this way, userspace can deploy Bloom filters, and it may then
further apply eBPF to filter out "false positive" cases.
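
To make the matching rule concrete, here is a userspace sketch of the test the kernel would perform, applied component by component to the 8 * 64-bit array. The names bus_maddr_match() and bloom_add() are hypothetical, and the hashing scheme (FNV-1a, two bit positions per key) is only my assumption; the proposal does not prescribe how userspace fills the bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_ADDR_COMP_MAX 8   /* placeholder, as in the proposal's example */

/*
 * True when every bit set in 'granted' is also set in 'dest',
 * i.e. (dest & granted) == granted for each 64-bit component.
 */
static bool bus_maddr_match(const uint64_t *dest, const uint64_t *granted,
                            size_t ncomp)
{
    assert(ncomp <= BUS_ADDR_COMP_MAX);
    for (size_t i = 0; i < ncomp; i++)
        if ((dest[i] & granted[i]) != granted[i])
            return false;
    return true;
}

/*
 * Userspace side: set the Bloom bits for one key.
 * Assumption: FNV-1a hash, two bit positions derived from one hash value.
 */
static void bloom_add(uint64_t *bits, size_t ncomp, const char *key)
{
    uint64_t h = 0xcbf29ce484222325ULL;        /* FNV-1a 64-bit offset basis */
    size_t nbits = ncomp * 64;

    for (; *key != '\0'; key++) {
        h ^= (unsigned char)*key;
        h *= 0x100000001b3ULL;                 /* FNV-1a 64-bit prime */
    }

    size_t b1 = (size_t)(h % nbits);
    size_t b2 = (size_t)((h >> 32) % nbits);
    bits[b1 / 64] |= 1ULL << (b1 % 64);
    bits[b2 / 64] |= 1ULL << (b2 % 64);
}
```

A sender would add every property of the message (interface, object path, ...) to maddr1, while a receiver adds only the properties it cares about to maddr2. A match may still be a false positive, which is where the eBPF step comes in.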

## How to avoid unnecessary copy?
A sockopt similar to PACKET_RX_RING[3] may be introduced, which brings
a mmap/shared memory style API.


## Other thoughts
1. The bus master may want to receive notifications from the kernel,
such as "a bus endpoint died". A special sockaddr_bus "{
.sbus_addr_ncomp = 0, .sbus_addr = NULL }" indicates a message from
the kernel.
2. A bus endpoint may pass a memfd to another bus endpoint, after which
they communicate using the mmap/shared-memory model, if ultimate
performance is needed.
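
For point 2, passing a file descriptor already works today over AF_UNIX with SCM_RIGHTS, and I would expect AF_BUS to reuse the same control message. The sketch below shows that existing mechanism only; send_fd() and recv_fd() are hypothetical helper names, and the descriptor being passed could be a memfd (from memfd_create(2)) that both sides then mmap().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send fd_to_pass over the connected AF_UNIX socket 'sock'. 0 on success. */
static int send_fd(int sock, int fd_to_pass)
{
    char byte = 0;   /* one byte of ordinary payload carries the cmsg */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;   /* ensures buf is suitably aligned */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(sock >= 0 && fd_to_pass >= 0);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

    return sendmsg(sock, &msg, MSG_NOSIGNAL) == 1 ? 0 : -1;
}

/* Receive one descriptor from 'sock'. Returns the new fd, or -1 on error. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open file as the sender's, so a region mapped from it with MAP_SHARED is visible to both endpoints without further copies.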



---
1. http://www.freedesktop.org/wiki/Software/dbus/
2. http://elinux.org/Android_Binder
3. http://man7.org/linux/man-pages/man7/packet.7.html



Regards,

- cee1


2015-07-30 18:13:08

by Andy Lutomirski

Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?

On Thu, Jul 30, 2015 at 6:09 AM, cee1 <[email protected]> wrote:
> Hi all,
>
> I'm interested in the idea of AF_BUS.
>
> There have already been varies discussions about it:
> * Missing the AF_BUS - https://lwn.net/Articles/504970/
> * Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel -
> http://lwn.net/Articles/537021/
> * presentation-kdbus -
> https://github.com/gregkh/presentation-kdbus/blob/master/kdbus.txt
> * Re: [GIT PULL] kdbus for 4.1-rc1 - https://lwn.net/Articles/641278/
> * The kdbuswreck - https://lwn.net/Articles/641275/
>
> I'm wondering whether it is a better way, that is, a general mechanism
> to implement varies __Bus__ orientated IPCs, such as Binder[1],
> DBus[2], etc.

I find myself wondering whether an in-kernel *bus* is a good idea at
all. Creating a bus that unprivileged programs are allowed to
broadcast on (which is kind of the point) opens up big cans of worms.
Namely: what happens when producers produce data faster than the
consumers consume it? Keep in mind that, with a bus, this scales
pretty badly. Each producer's sends are multiplied by the number of
participants.

At some point soon, I'm planning on playing with Fedora Rawhide with
kdbus. Anything's possible (maybe), but I'd be rather surprised if it
holds up under abuse of the bus.

ISTM kdbus is trying to solve a few problems that really can't be
solved together: it wants (mostly) reliable delivery, it wants
globally ordered messages, and it wants broadcasts. That means that,
if message N gets broadcast, then, until *every* recipient has
received message N, message N and all of its successors need to be
buffered somewhere. I see how this works (by massive use of tmpfs),
but I don't see how it's going to work *well*.

Certainly approximate solutions are possible, but is the kernel really
a good place to arbitrate which messages survive under pressure?


People can throw code at this all they want, but ISTM the problem that
the dbus community wants to solve doesn't actually admit a scalable
solution.

--Andy

2015-07-31 04:01:18

by Greg Kroah-Hartman

Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?

On Thu, Jul 30, 2015 at 11:12:44AM -0700, Andy Lutomirski wrote:
> On Thu, Jul 30, 2015 at 6:09 AM, cee1 <[email protected]> wrote:
> > Hi all,
> >
> > I'm interested in the idea of AF_BUS.
> >
> > There have already been varies discussions about it:
> > * Missing the AF_BUS - https://lwn.net/Articles/504970/
> > * Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel -
> > http://lwn.net/Articles/537021/
> > * presentation-kdbus -
> > https://github.com/gregkh/presentation-kdbus/blob/master/kdbus.txt
> > * Re: [GIT PULL] kdbus for 4.1-rc1 - https://lwn.net/Articles/641278/
> > * The kdbuswreck - https://lwn.net/Articles/641275/
> >
> > I'm wondering whether it is a better way, that is, a general mechanism
> > to implement varies __Bus__ orientated IPCs, such as Binder[1],
> > DBus[2], etc.
>
> I find myself wondering whether an in-kernel *bus* is a good idea at
> all. Creating a bus that unprivileged programs are allowed to
> broadcast on (which is kind of the point) opens up big cans of worms.
> Namely: what happens when producers produce data faster than the
> consumers consume it? Keep in mind that, with a bus, this scales
> pretty badly. Each producer's sends are multiplied by the number of
> participants.
>
> At some point soon, I'm planning on playing with Fedora Rawhide with
> kdbus. Anything's possible (maybe), but I'd be rather surprised if it
> holds up under abuse of the bus.

Just boot Fedora Rawhide with "kdbus=1" on the kernel command line and
you should be set. If not, please let the kdbus developers know.

thanks,

greg k-h

2015-07-31 09:53:00

by cee1

Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?

2015-07-31 2:12 GMT+08:00 Andy Lutomirski <[email protected]>:
>
> ISTM kdbus is trying to solve a few problems that really can't be
> solved together: it wants (mostly) reliable delivery, it wants
> globally ordered messages, and it wants broadcasts. That means that,
> if message N gets broadcast, then, until *every* recipient has
> received message N, message N and all of its successors need to be
> buffered somewhere. I see how this works (by massive use of tmpfs),
> but I don't see how it's going to work *well*.

For broadcast, how will the kernel behave if:
1. Lots of processes open a netlink socket (to receive uevents) but do
not consume them, while someone keeps triggering uevents.
2. Lots of processes open inotify to monitor a directory but do not
consume the events, while someone keeps operating on files under the
directory.
...

I guess it may have to drop some data if the producer produces too
fast (or the consumers consume too slowly). What it needs may be a way
for recipients to learn that some broadcast data was lost.
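
One way to give recipients that chance, sketched here in plain userspace C: a bounded per-receiver queue that drops new messages when full and reports the number of drops on the next receive. This is in the spirit of netlink returning ENOBUFS and inotify queueing IN_Q_OVERFLOW, but struct rxq, RXQ_CAP and both functions are hypothetical names, not part of any existing API or of the AF_BUS patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RXQ_CAP 4   /* arbitrary small capacity for illustration */

struct rxq {
    uint64_t slots[RXQ_CAP];
    size_t   head;      /* index of the oldest queued message */
    size_t   len;       /* number of queued messages */
    uint64_t dropped;   /* drops since the previous successful pop */
};

/* Queue a broadcast for this receiver; count it as dropped if full. */
static void rxq_push(struct rxq *q, uint64_t msg)
{
    assert(q != NULL);
    if (q->len == RXQ_CAP) {
        q->dropped++;
        return;
    }
    q->slots[(q->head + q->len) % RXQ_CAP] = msg;
    q->len++;
}

/*
 * Dequeue the oldest message. Returns 1 and fills *msg and *lost
 * (drops since the previous successful pop), or 0 if the queue is empty.
 */
static int rxq_pop(struct rxq *q, uint64_t *msg, uint64_t *lost)
{
    assert(q != NULL && msg != NULL && lost != NULL);
    if (q->len == 0)
        return 0;
    *msg = q->slots[q->head];
    q->head = (q->head + 1) % RXQ_CAP;
    q->len--;
    *lost = q->dropped;
    q->dropped = 0;
    return 1;
}
```

A slow receiver then only costs RXQ_CAP buffered messages, and it learns that it has to resynchronize with the sender instead of silently missing events.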



--
Regards,

- cee1

2015-07-31 10:10:06

by cee1

Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?

In a nutshell, this AF_BUS works as follows:

1. For privileged operations, bus endpoints send requests to the bus
master, and the bus master replies with a cmsg (control message, e.g. one
that tells the kernel to assign the specified sockaddr_bus).

2. The bus master allocates sockaddr_bus addresses.

3. There are three kinds of sockaddr_bus:
* The normal ones
* Multicast addresses (last char of sbus_path is '*')
* Kernel notification addr (sbus_addr == NULL)

4. It is Bloom-filter friendly (i.e. the multicast logic).



--
Regards,

- cee1

2015-07-31 16:26:12

by cee1

Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?

2015-07-31 2:12 GMT+08:00 Andy Lutomirski <[email protected]>:
>
> I find myself wondering whether an in-kernel *bus* is a good idea at
> all. Creating a bus that unprivileged programs are allowed to
> broadcast on (which is kind of the point) opens up big cans of worms.

This can be solved in this AF_BUS as follows:
* Becoming a bus master requires a proper CAP.
* Require a bus endpoint to join multicast address "maddr1" first, if
it wants to send to multicast address "maddr2".

The bus endpoint sends a request to join maddr1, and the bus
master grants it by replying with a cmsg (control message) and setting up
a proper eBPF program.

The next time the bus endpoint sends to maddr2, the kernel will allow this if:
1) (maddr1 & maddr2) == maddr1
And 2) the eBPF program allows it.
(i.e. the same multicast match logic in this AF_BUS)
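
As a rough userspace model of that send-side check: the bitmask test runs first, then a per-endpoint filter gets the final say. A plain C callback stands in for the eBPF program here, and bus_may_send(), bus_filter_fn and small_payload_only() are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the eBPF program the bus master would attach. */
typedef bool (*bus_filter_fn)(const void *payload, size_t len);

/* Example policy: only allow multicast payloads of up to 64 bytes. */
static bool small_payload_only(const void *payload, size_t len)
{
    (void)payload;
    return len <= 64;
}

/*
 * May an endpoint that joined 'joined' (maddr1) send to 'dest' (maddr2)?
 * 1) (joined & dest) == joined for every component, and
 * 2) the filter, if any, accepts the payload.
 */
static bool bus_may_send(const uint64_t *joined, const uint64_t *dest,
                         size_t ncomp, bus_filter_fn filter,
                         const void *payload, size_t len)
{
    assert(joined != NULL && dest != NULL);
    for (size_t i = 0; i < ncomp; i++)
        if ((joined[i] & dest[i]) != joined[i])
            return false;
    return filter == NULL || filter(payload, len);
}
```

So the bus master controls who may multicast at all (by granting the join) and can additionally bound what they send (through the filter).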



--
Regards,

- cee1

2015-07-31 21:16:11

by Andy Lutomirski

Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?

On Fri, Jul 31, 2015 at 9:25 AM, cee1 <[email protected]> wrote:
> 2015-07-31 2:12 GMT+08:00 Andy Lutomirski <[email protected]>:
>>
>> I find myself wondering whether an in-kernel *bus* is a good idea at
>> all. Creating a bus that unprivileged programs are allowed to
>> broadcast on (which is kind of the point) opens up big cans of worms.
>
> This can be solved in this AF_BUS like this:
> * Becoming a bus master needs a proper CAP.
> * Impose a bus endpoint to join multicast address "maddr1" first, if
> it wants to send to multicast address "maddr2".
>
> The bus endpoint sends the request of joining maddr1, and the bus
> master grants it with replying a cmsg(control message) and setting up
> a proper eBPF.
>
> Next time, the bus endpoint sends to maddr2, the kernel will allow this if:
> 1) maddr1 & maddr2 == maddr1
> And 2) the eBPF allows it.
> (i.e. the same multicast match logic in this AF_BUS)
>

I don't understand.

If the endpoint is unprivileged (i.e. random untrusted things can send
multicast), then you have the scaling problem. If the endpoint is
privileged, then it's much less clear to me that this thing is useful.

--Andy

2015-08-01 02:00:33

by cee1

Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?

2015-08-01 5:15 GMT+08:00 Andy Lutomirski <[email protected]>:
> On Fri, Jul 31, 2015 at 9:25 AM, cee1 <[email protected]> wrote:
>> 2015-07-31 2:12 GMT+08:00 Andy Lutomirski <[email protected]>:
>>>
>>> I find myself wondering whether an in-kernel *bus* is a good idea at
>>> all. Creating a bus that unprivileged programs are allowed to
>>> broadcast on (which is kind of the point) opens up big cans of worms.
>>
>> This can be solved in this AF_BUS like this:
>> * Becoming a bus master needs a proper CAP.
>> * Impose a bus endpoint to join multicast address "maddr1" first, if
>> it wants to send to multicast address "maddr2".
>>
>> The bus endpoint sends the request of joining maddr1, and the bus
>> master grants it with replying a cmsg(control message) and setting up
>> a proper eBPF.
>>
>> Next time, the bus endpoint sends to maddr2, the kernel will allow this if:
>> 1) maddr1 & maddr2 == maddr1
>> And 2) the eBPF allows it.
>> (i.e. the same multicast match logic in this AF_BUS)
>>
>
> I don't understand.
>
> If the endpoint is unprivileged (i.e. random untrusted things can send
> multicast), then you have the scaling problem. If the endpoint is
> privileged, then it's much less clear to me that this thing is useful.

That means an endpoint has to request the ability to send to a
specific multicast address (aka join a multicast group), and it's up to
the bus master whether to grant it or not.



--
Regards,

- cee1