Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756030AbcJ0JRC (ORCPT ); Thu, 27 Oct 2016 05:17:02 -0400 Received: from mout.kundenserver.de ([212.227.126.130]:55796 "EHLO mout.kundenserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755992AbcJ0JQ5 (ORCPT ); Thu, 27 Oct 2016 05:16:57 -0400 From: Arnd Bergmann To: Tom Gundersen Cc: Andy Lutomirski , David Herrmann , Hannes Reinecke , Jiri Kosina , "linux-kernel@vger.kernel.org" , Andrew Morton , Josh Triplett , Greg KH , Steven Rostedt , Linus Torvalds Subject: Re: [RFC v1 02/14] bus1: provide stub cdev /dev/bus1 Date: Thu, 27 Oct 2016 11:11:09 +0200 Message-ID: <2469565.N6oiQLH2B8@wuerfel> User-Agent: KMail/5.1.3 (Linux/4.4.0-34-generic; KDE/5.18.0; x86_64; ; ) In-Reply-To: References: <20161026191810.12275-1-dh.herrmann@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Provags-ID: V03:K0:WRh5QrcSVJ8H0KvtF8gZSElgZfAsDCK7Yo2FFPlwUVTxiM3vdSb 7Yrz/A+7jjbItfuCXvPy8xRqThWvbJUXf6HMgGUrDuPUtLOrrSDEJd4QzbuM0sJqgk7xq9o FinW8mPUl3sT66X19IDoJdhynk5L8oOH8ZFjoSYYckXaDIv098tc4zCK3FdNs/LhBAIgDL5 A6bXMM0d/QreP9ptPC4BA== X-UI-Out-Filterresults: notjunk:1;V01:K0:hbOuB6bDgZM=:WEsD6GAbjRKUiNYscMfJii HI5aE0eUN0GpMMbMNIvYOYjE7oQ8yR/Pj0XPpBnQuVgavQtLajboHThehH5sgyLLb6gv3U31Z GGLvlTkcue4oi5ERyu7Ls1r1fSoBYU6AUTFSpWV1GyHpDhlQbg//LzamkNMktlIYV961oogwF d947Omkz8Tuqu0ucqQ+YN9uQNZOJVRbNbfZKiTe2tvGAjPv109Zb+xCQBbaBtdwzhq3XeGyBL xBHUz7zlgboVOvtIgDjZsz/QZwZ3E/dmfhmJJpj2BtvU9Yg8jxeGPbKRpUoCcCaZE9gR3cGQy h+QzcAOB4UY1sVLhOs8aDOJfMAic4c63xH7zQwomuj6hz+9Hw66UmtMkw5pu3tpg6T036iHFL sD9YJASjkF6LMyZ5oS4Zm26evWo/1aBvcOQ55CZ5syRl4mo/dDASgoDvMuBZpkgOv0Bf2lp/R qi3v6xwue9pD1oKvfQ4cfEXQed+/Cfb7VLQ5kYdPCBZd8LdVqHrdAjZIZH4BiR72iCnvdTwGL MCPwzqyRy6HRj/OkFHY8ievdU9h690CqaD/XENGoRiUsLl7PV7qJfUIHOzPFokmG1em9fXP3b suG5oclqL0rs3Rb7HNbDZCVOqvEAciG4GUo1WHVUgUlAUEyO4R65/onVDEkh96cHZq1FfRQi9 fhBywGfPH8ADwAw4ygMhB+tGif0p0VWCkyY99sYI+PFvFmFDdRQr9CkcWkiWGK1gj2y960p/c m9AeNUKbh4nCK3Si Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6487 Lines: 163 On Thursday, October 27, 2016 1:54:05 AM CEST Tom Gundersen wrote: > On Thu, Oct 27, 2016 at 1:19 AM, Andy Lutomirski wrote: > > This may have been covered elsewhere, but could this use syscalls instead? > > Yes, syscalls would work essentially the same. For now, we are using a > cdev as it makes it a lot more convenient to develop and test as an > out-of-tree module, but that could be changed easily before the final > submission, if that's what we want. Generally speaking, I think syscalls would be appropriate here, and put bus1 into a similar category as the other ipc interfaces (shm, msg, sem, mqueue, ...). However, syscall API design is nontrivial, and will require a bit of work to come to a set of syscalls that is fairly compact but also extensible enough. I think it makes sense to go through the exercise of working out what the syscall interface would end up looking like, and then make a decision. There is currently a set of file operations: @@ -48,7 +90,11 @@ const struct file_operations bus1_fops = { .owner = THIS_MODULE, .open = bus1_fop_open, .release = bus1_fop_release, + .poll = bus1_fop_poll, .llseek = noop_llseek, + .mmap = bus1_fop_mmap, + .unlocked_ioctl = bus1_peer_ioctl, + .compat_ioctl = bus1_peer_ioctl, .show_fdinfo = bus1_fop_show_fdinfo, }; and then another set of ioctls: +enum { + BUS1_CMD_PEER_DISCONNECT = _IOWR(BUS1_IOCTL_MAGIC, 0x00, + __u64), + BUS1_CMD_PEER_QUERY = _IOWR(BUS1_IOCTL_MAGIC, 0x01, + struct bus1_cmd_peer_reset), + BUS1_CMD_PEER_RESET = _IOWR(BUS1_IOCTL_MAGIC, 0x02, + struct bus1_cmd_peer_reset), + BUS1_CMD_HANDLE_RELEASE = _IOWR(BUS1_IOCTL_MAGIC, 0x10, + __u64), + BUS1_CMD_HANDLE_TRANSFER = _IOWR(BUS1_IOCTL_MAGIC, 0x11, + struct bus1_cmd_handle_transfer), + BUS1_CMD_NODES_DESTROY = _IOWR(BUS1_IOCTL_MAGIC, 0x20, + struct bus1_cmd_nodes_destroy), + BUS1_CMD_SLICE_RELEASE = _IOWR(BUS1_IOCTL_MAGIC, 0x30, + __u64), + BUS1_CMD_SEND = _IOWR(BUS1_IOCTL_MAGIC, 0x40, + struct bus1_cmd_send), + BUS1_CMD_RECV = _IOWR(BUS1_IOCTL_MAGIC, 0x50, + struct bus1_cmd_recv), +}; I think there is no alternative to having some sort of file descriptor with the basic operations you have above, but there is a question of how to get that file descriptor if the ioctls get changed to a syscall, the basic options being: - Keep using a chardev. This works, but feels a little odd to me, and I can't think of any other interfaces combining syscalls with a chardev. - Have one syscall that returns an open file descriptor, replacing the fops->open() function. One advantage is that you can pass additional arguments in that you can't have with open. An example for this would be mqueue_open(). - Have a mountable file system, and use open() on that to create connections. Advantages are that it's fairly easy to have one instance per fs-namespace, and you can have user-defined naming of objects in the file system. For the other operations, the obvious translation would be to turn each ioctl command into one syscall, but that may not always be the best representation. One limitation is that you cannot generally have more than six 'long' arguments on a lot of architectures, and passing 'u64' arguments to syscalls is awkward. For some of the commands, the transformation would be straightforward if we assume that the 'u64' arguments can actually be 'long', I guess like this: +struct bus1_cmd_handle_transfer { + __u64 flags; + __u64 src_handle; + __u64 dst_fd; + __u64 dst_handle; +} __attribute__((__aligned__(8))); long bus1_handle_transfer(int fd, unsigned long handle, int dst_fd, unsigned long *dst_handle, unsigned int flags); +struct bus1_cmd_nodes_destroy { + __u64 flags; + __u64 ptr_nodes; + __u64 n_nodes; +} __attribute__((__aligned__(8))); long bus1_nodes_destroy(int fd, u64 *ptr_nodes, long n_nodes, unsigned int flags); However, the peer_reset would exceed the 6-argument limit when you count the initial file descriptor even if you assume that 'flags' can be made 32-bit: +struct bus1_cmd_peer_reset { + __u64 flags; + __u64 peer_flags; + __u32 max_slices; + __u32 max_handles; + __u32 max_inflight_bytes; + __u32 max_inflight_fds; +} __attribute__((__aligned__(8))); maybe something slightly ugly like long bus1_peer_reset(int fd, const struct bus1_peer_limits *param, unsigned int flags); a library might provide a wrapper that passes all the limits as separate arguments. The receive function would be fairly straightforward again, as we just pass a pointer to the returned message, and all inputs can be arguments, but the send command with this structure +struct bus1_cmd_send { + __u64 flags; + __u64 ptr_destinations; + __u64 ptr_errors; + __u64 n_destinations; + __u64 ptr_vecs; + __u64 n_vecs; + __u64 ptr_handles; + __u64 n_handles; + __u64 ptr_fds; + __u64 n_fds; +} __attribute__((__aligned__(8))); is really tricky, as it's such a central interface but it's also really complex, with its five indirect pointers to variable-length arrays, making a total of 11 arguments (including the first fd). Turning this into a syscall would probably make a more efficient interface, so maybe some of the arrays can be turned into a single argument and require the user to call it multiple times instead of the kernel looping around it. The minimal version would be something like long bus1_send(int fd, long dst, struct iovec *vecs, int n_vecs, long handle, int dst_fd); so you already get to six arguments with one destination, one handle and one fd but no flags. Replacing vecs/n_vecs with pointer and length doesn't help either, so I guess whatever we do here we have to use some indirect structure. Arnd