2019-05-10 00:01:55

by Chenbo Feng

[permalink] [raw]
Subject: [dma-buf v3 0/3] Improve the dma-buf tracking

Currently, all dma-bufs share the same anonymous inode. While we can count
how many dma-buf fds or mappings a process has, we can't get the size of
the backing buffers or tell if two entries point to the same dma-buf. And
in debugfs, we can get a per-buffer breakdown of size and reference count,
but can't tell which processes are actually holding the references to each
buffer.

To resolve the issue above and provide better method for userspace to track
the dma-buf usage across different processes, the following changes are
proposed in dma-buf kernel side. First of all, replace the singleton inode
inside the dma-buf subsystem with a mini-filesystem, and assign each
dma-buf a unique inode out of this filesystem. With this change, calling
stat(2) on each entry gives the caller a unique ID (st_ino), the buffer's
size (st_size), and even the number of pages assigned to each dma-buffer.
Secoundly, add the inode information to /sys/kernel/debug/dma_buf/bufinfo
so in the case where a buffer is mmap()ed into a process’s address space
but all remaining fds have been closed, we can still get the dma-buf
information and try to accociate it with the process by searching the
proc/pid/maps and looking for the corresponding inode number exposed in
dma-buf debug fs. Thirdly, created an ioctl to assign names to dma-bufs
which lets userspace assign short names (e.g., "CAMERA") to buffers. This
information can be extremely helpful for tracking and accounting shared
buffers based on their usage and original purpose. Last but not least, add
dma-buf information to /proc/pid/fdinfo by adding a show_fdinfo() handler
to dma_buf_file_operations. The handler will print the file_count and name
of each buffer.

Change in v2:
* Add a check to prevent changing dma-buf name when it is attached to
devices.
* Fixed some compile warnings

Change in v3:
* Removed the GET_DMA_BUF_NAME ioctls, add the dma_buf pointer to dentry
d_fsdata so the name can be displayed in proc/pid/maps and
proc/pid/map_files.

Greg Hackmann (3):
dma-buf: give each buffer a full-fledged inode
dma-buf: add DMA_BUF_{GET,SET}_NAME ioctls
dma-buf: add show_fdinfo handler

drivers/dma-buf/dma-buf.c | 122 +++++++++++++++++++++++++++++++++--
include/linux/dma-buf.h | 5 +-
include/uapi/linux/dma-buf.h | 3 +
include/uapi/linux/magic.h | 1 +
4 files changed, 124 insertions(+), 7 deletions(-)

--
2.21.0.1020.gf2820cf01a-goog


2019-05-10 00:01:57

by Chenbo Feng

[permalink] [raw]
Subject: [dma-buf v3 3/3] dma-buf: add show_fdinfo handler

From: Greg Hackmann <[email protected]>

The show_fdinfo handler exports the same information available through
debugfs on a per-buffer basis.

Signed-off-by: Greg Hackmann <[email protected]>
Signed-off-by: Chenbo Feng <[email protected]>
---
drivers/dma-buf/dma-buf.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index c1da5f9ce44d..c4efc272fc34 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -381,6 +381,20 @@ static long dma_buf_ioctl(struct file *file,
}
}

+static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
+{
+ struct dma_buf *dmabuf = file->private_data;
+
+ seq_printf(m, "size:\t%zu\n", dmabuf->size);
+ /* Don't count the temporary reference taken inside procfs seq_show */
+ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
+ seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
+ mutex_lock(&dmabuf->lock);
+ if (dmabuf->name)
+ seq_printf(m, "name:\t%s\n", dmabuf->name);
+ mutex_unlock(&dmabuf->lock);
+}
+
static const struct file_operations dma_buf_fops = {
.release = dma_buf_release,
.mmap = dma_buf_mmap_internal,
@@ -390,6 +404,7 @@ static const struct file_operations dma_buf_fops = {
#ifdef CONFIG_COMPAT
.compat_ioctl = dma_buf_ioctl,
#endif
+ .show_fdinfo = dma_buf_show_fdinfo,
};

/*
--
2.21.0.1020.gf2820cf01a-goog

2019-05-10 00:02:00

by Chenbo Feng

[permalink] [raw]
Subject: [dma-buf v3 1/3] dma-buf: give each buffer a full-fledged inode

From: Greg Hackmann <[email protected]>

By traversing /proc/*/fd and /proc/*/map_files, processes with CAP_ADMIN
can get a lot of fine-grained data about how shmem buffers are shared
among processes. stat(2) on each entry gives the caller a unique
ID (st_ino), the buffer's size (st_size), and even the number of pages
currently charged to the buffer (st_blocks / 512).

In contrast, all dma-bufs share the same anonymous inode. So while we
can count how many dma-buf fds or mappings a process has, we can't get
the size of the backing buffers or tell if two entries point to the same
dma-buf. On systems with debugfs, we can get a per-buffer breakdown of
size and reference count, but can't tell which processes are actually
holding the references to each buffer.

Replace the singleton inode with full-fledged inodes allocated by
alloc_anon_inode(). This involves creating and mounting a
mini-pseudo-filesystem for dma-buf, following the example in fs/aio.c.

Signed-off-by: Greg Hackmann <[email protected]>
Signed-off-by: Chenbo Feng <[email protected]>
---
drivers/dma-buf/dma-buf.c | 63 ++++++++++++++++++++++++++++++++++----
include/uapi/linux/magic.h | 1 +
2 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 7c858020d14b..ffd5a2ad7d6f 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -34,8 +34,10 @@
#include <linux/poll.h>
#include <linux/reservation.h>
#include <linux/mm.h>
+#include <linux/mount.h>

#include <uapi/linux/dma-buf.h>
+#include <uapi/linux/magic.h>

static inline int is_dma_buf_file(struct file *);

@@ -46,6 +48,25 @@ struct dma_buf_list {

static struct dma_buf_list db_list;

+static const struct dentry_operations dma_buf_dentry_ops = {
+ .d_dname = simple_dname,
+};
+
+static struct vfsmount *dma_buf_mnt;
+
+static struct dentry *dma_buf_fs_mount(struct file_system_type *fs_type,
+ int flags, const char *name, void *data)
+{
+ return mount_pseudo(fs_type, "dmabuf:", NULL, &dma_buf_dentry_ops,
+ DMA_BUF_MAGIC);
+}
+
+static struct file_system_type dma_buf_fs_type = {
+ .name = "dmabuf",
+ .mount = dma_buf_fs_mount,
+ .kill_sb = kill_anon_super,
+};
+
static int dma_buf_release(struct inode *inode, struct file *file)
{
struct dma_buf *dmabuf;
@@ -338,6 +359,31 @@ static inline int is_dma_buf_file(struct file *file)
return file->f_op == &dma_buf_fops;
}

+static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags)
+{
+ struct file *file;
+ struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb);
+
+ if (IS_ERR(inode))
+ return ERR_CAST(inode);
+
+ inode->i_size = dmabuf->size;
+ inode_set_bytes(inode, dmabuf->size);
+
+ file = alloc_file_pseudo(inode, dma_buf_mnt, "dmabuf",
+ flags, &dma_buf_fops);
+ if (IS_ERR(file))
+ goto err_alloc_file;
+ file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
+ file->private_data = dmabuf;
+
+ return file;
+
+err_alloc_file:
+ iput(inode);
+ return file;
+}
+
/**
* DOC: dma buf device access
*
@@ -433,8 +479,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
}
dmabuf->resv = resv;

- file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf,
- exp_info->flags);
+ file = dma_buf_getfile(dmabuf, exp_info->flags);
if (IS_ERR(file)) {
ret = PTR_ERR(file);
goto err_dmabuf;
@@ -1025,8 +1070,8 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
return ret;

seq_puts(s, "\nDma-buf Objects:\n");
- seq_printf(s, "%-8s\t%-8s\t%-8s\t%-8s\texp_name\n",
- "size", "flags", "mode", "count");
+ seq_printf(s, "%-8s\t%-8s\t%-8s\t%-8s\texp_name\t%-8s\n",
+ "size", "flags", "mode", "count", "ino");

list_for_each_entry(buf_obj, &db_list.head, list_node) {
ret = mutex_lock_interruptible(&buf_obj->lock);
@@ -1037,11 +1082,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
continue;
}

- seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\n",
+ seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\t%08lu\n",
buf_obj->size,
buf_obj->file->f_flags, buf_obj->file->f_mode,
file_count(buf_obj->file),
- buf_obj->exp_name);
+ buf_obj->exp_name,
+ file_inode(buf_obj->file)->i_ino);

robj = buf_obj->resv;
while (true) {
@@ -1136,6 +1182,10 @@ static inline void dma_buf_uninit_debugfs(void)

static int __init dma_buf_init(void)
{
+ dma_buf_mnt = kern_mount(&dma_buf_fs_type);
+ if (IS_ERR(dma_buf_mnt))
+ return PTR_ERR(dma_buf_mnt);
+
mutex_init(&db_list.lock);
INIT_LIST_HEAD(&db_list.head);
dma_buf_init_debugfs();
@@ -1146,5 +1196,6 @@ subsys_initcall(dma_buf_init);
static void __exit dma_buf_deinit(void)
{
dma_buf_uninit_debugfs();
+ kern_unmount(dma_buf_mnt);
}
__exitcall(dma_buf_deinit);
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index f8c00045d537..665e18627f78 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -91,5 +91,6 @@
#define UDF_SUPER_MAGIC 0x15013346
#define BALLOON_KVM_MAGIC 0x13661366
#define ZSMALLOC_MAGIC 0x58295829
+#define DMA_BUF_MAGIC 0x444d4142 /* "DMAB" */

#endif /* __LINUX_MAGIC_H__ */
--
2.21.0.1020.gf2820cf01a-goog

2019-05-10 00:02:06

by Chenbo Feng

[permalink] [raw]
Subject: [dma-buf v3 2/3] dma-buf: add DMA_BUF_{GET,SET}_NAME ioctls

From: Greg Hackmann <[email protected]>

This patch adds complimentary DMA_BUF_SET_NAME and DMA_BUF_GET_NAME
ioctls, which lets userspace processes attach a free-form name to each
buffer.

This information can be extremely helpful for tracking and accounting
shared buffers. For example, on Android, we know what each buffer will
be used for at allocation time: GL, multimedia, camera, etc. The
userspace allocator can use DMA_BUF_SET_NAME to associate that
information with the buffer, so we can later give developers a
breakdown of how much memory they're allocating for graphics, camera,
etc.

Signed-off-by: Greg Hackmann <[email protected]>
Signed-off-by: Chenbo Feng <[email protected]>
---
drivers/dma-buf/dma-buf.c | 49 +++++++++++++++++++++++++++++++++---
include/linux/dma-buf.h | 5 +++-
include/uapi/linux/dma-buf.h | 3 +++
3 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index ffd5a2ad7d6f..c1da5f9ce44d 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -48,8 +48,24 @@ struct dma_buf_list {

static struct dma_buf_list db_list;

+static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen)
+{
+ struct dma_buf *dmabuf;
+ char name[DMA_BUF_NAME_LEN];
+ size_t ret = 0;
+
+ dmabuf = dentry->d_fsdata;
+ mutex_lock(&dmabuf->lock);
+ if (dmabuf->name)
+ ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN);
+ mutex_unlock(&dmabuf->lock);
+
+ return dynamic_dname(dentry, buffer, buflen, "/%s:%s",
+ dentry->d_name.name, ret > 0 ? name : "");
+}
+
static const struct dentry_operations dma_buf_dentry_ops = {
- .d_dname = simple_dname,
+ .d_dname = dmabuffs_dname,
};

static struct vfsmount *dma_buf_mnt;
@@ -297,6 +313,27 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
return events;
}

+static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
+{
+ char *name = strndup_user(buf, DMA_BUF_NAME_LEN);
+ long ret = 0;
+
+ if (IS_ERR(name))
+ return PTR_ERR(name);
+
+ mutex_lock(&dmabuf->lock);
+ if (!list_empty(&dmabuf->attachments)) {
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ kfree(dmabuf->name);
+ dmabuf->name = name;
+
+out_unlock:
+ mutex_unlock(&dmabuf->lock);
+ return ret;
+}
+
static long dma_buf_ioctl(struct file *file,
unsigned int cmd, unsigned long arg)
{
@@ -335,6 +372,10 @@ static long dma_buf_ioctl(struct file *file,
ret = dma_buf_begin_cpu_access(dmabuf, direction);

return ret;
+
+ case DMA_BUF_SET_NAME:
+ return dma_buf_set_name(dmabuf, (const char __user *)arg);
+
default:
return -ENOTTY;
}
@@ -376,6 +417,7 @@ static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags)
goto err_alloc_file;
file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
file->private_data = dmabuf;
+ file->f_path.dentry->d_fsdata = dmabuf;

return file;

@@ -1082,12 +1124,13 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
continue;
}

- seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\t%08lu\n",
+ seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\t%08lu\t%s\n",
buf_obj->size,
buf_obj->file->f_flags, buf_obj->file->f_mode,
file_count(buf_obj->file),
buf_obj->exp_name,
- file_inode(buf_obj->file)->i_ino);
+ file_inode(buf_obj->file)->i_ino,
+ buf_obj->name ?: "");

robj = buf_obj->resv;
while (true) {
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 58725f890b5b..582998e19df6 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -255,10 +255,12 @@ struct dma_buf_ops {
* @file: file pointer used for sharing buffers across, and for refcounting.
* @attachments: list of dma_buf_attachment that denotes all devices attached.
* @ops: dma_buf_ops associated with this buffer object.
- * @lock: used internally to serialize list manipulation, attach/detach and vmap/unmap
+ * @lock: used internally to serialize list manipulation, attach/detach and
+ * vmap/unmap, and accesses to name
* @vmapping_counter: used internally to refcnt the vmaps
* @vmap_ptr: the current vmap ptr if vmapping_counter > 0
* @exp_name: name of the exporter; useful for debugging.
+ * @name: userspace-provided name; useful for accounting and debugging.
* @owner: pointer to exporter module; used for refcounting when exporter is a
* kernel module.
* @list_node: node for dma_buf accounting and debugging.
@@ -286,6 +288,7 @@ struct dma_buf {
unsigned vmapping_counter;
void *vmap_ptr;
const char *exp_name;
+ const char *name;
struct module *owner;
struct list_head list_node;
void *priv;
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index d75df5210a4a..dbc7092e04b5 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -35,7 +35,10 @@ struct dma_buf_sync {
#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
(DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)

+#define DMA_BUF_NAME_LEN 32
+
#define DMA_BUF_BASE 'b'
#define DMA_BUF_IOCTL_SYNC _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
+#define DMA_BUF_SET_NAME _IOW(DMA_BUF_BASE, 1, const char *)

#endif
--
2.21.0.1020.gf2820cf01a-goog