2010-11-04 19:26:44

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 0/8] Dynamic clock devices

Okay, here is a work in progress, not well tested, but I would like to
get some feedback whether the direction is good or not.

The first patch introduces clock devices which can appear and
disappear like usb devices (and, I suppose, hot plug PCI but I am not
too sure that what is offered here would really work in that case).
The subsequent patches convert the clock_ and timer_ system calls, one
by one.

The clock_ syscalls are moved into a new file and they call the older
posix functions when needed. The timer_ syscalls stay where they are,
in posix-timers.c, since I did not want to change the fairly involved
timer management code. Eventually, we could remove the posix clock_*
functions for the CLOCK_* ids from posix-timers.c and rework them
using the new dynamic clock api. That would leave just the timer code
in posix-timers.c, as the file name suggests.

I dropped the idea of having user space open a sysfs file in order to
get a reference to a clock, since there are no open/release hooks
within sysfs for drivers (coincidentally, there has been some talk
about this on the lkml recently, but previously Greg KH object to
abusing sysfs as a "clockfs").

Instead, since many clocks (hpet, rtc, ptp, ...) will want to offer a
custom chardev for special advanced functionality, the dynamic clock
layer registers a cdev for the driver, placing its own hooks into the
open/release methods. The driver thus needs to access its private data
via a standard access method (not just by using fp->private_data). If
a driver doesn't want any chardev functions, that is okay, too.

Well, please take a look and let me know what you think.

Thanks,

Richard


Richard Cochran (8):
Introduce dynamic clock devices
clock device: convert clock_gettime
clock device: convert clock_getres
clock device: convert clock_settime
clock device: convert timer_create
clock device: convert timer_delete
clock device: convert timer_gettime
clock device: convert timer_settime

include/linux/clockdevice.h | 117 +++++++++++++++
include/linux/posix-timers.h | 23 +++-
include/linux/time.h | 2 +
kernel/posix-timers.c | 51 +++++--
kernel/time/Makefile | 3 +-
kernel/time/clockdevice.c | 336 ++++++++++++++++++++++++++++++++++++++++++
6 files changed, 518 insertions(+), 14 deletions(-)
create mode 100644 include/linux/clockdevice.h
create mode 100644 kernel/time/clockdevice.c


2010-11-04 19:28:11

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 1/8] Introduce dynamic clock devices

This patch adds support for adding and removing clock devices. The clock
lifetime cycle is patterned after usb devices. Each clock is represented
by a standard character device. In addition, the driver may optionally
implemented custom character device operations.

The clock devices do not yet do anything useful. This patch merely
provides some needed infrastructure.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/clockdevice.h | 97 +++++++++++++++++++++++++++
kernel/time/Makefile | 3 +-
kernel/time/clockdevice.c | 155 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 254 insertions(+), 1 deletions(-)
create mode 100644 include/linux/clockdevice.h
create mode 100644 kernel/time/clockdevice.c

diff --git a/include/linux/clockdevice.h b/include/linux/clockdevice.h
new file mode 100644
index 0000000..a8f9359
--- /dev/null
+++ b/include/linux/clockdevice.h
@@ -0,0 +1,97 @@
+/*
+ * clockdevice.h - support for dynmanic clock devices
+ *
+ * Copyright (C) 2010 OMICRON electronics GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+#ifndef _LINUX_CLOCKDEVICE_H_
+#define _LINUX_CLOCKDEVICE_H_
+
+#include <linux/fs.h>
+#include <linux/posix-timers.h>
+
+/**
+ * struct clock_device_operations - functional interface to the clock
+ * @owner: The clock driver should set to THIS_MODULE.
+ * @clock_gettime: Read the current time
+ * @clock_getres: Get the clock resolution
+ * @clock_settime: Set the current time value
+ * @timer_create: Create a new timer
+ * @timer_delete: Remove a previously created timer
+ * @timer_gettime: Get remaining time and interval of a timer
+ * @timer_setttime: Set a timer's initial expiration and interval
+ */
+struct clock_device_operations {
+ struct module *owner;
+ int (*clock_gettime)(void *priv, struct timespec *ts);
+ int (*clock_getres) (void *priv, struct timespec *ts);
+ int (*clock_settime)(void *priv, struct timespec *ts);
+ int (*timer_create) (void *priv, struct k_itimer *kit);
+ int (*timer_delete) (void *priv, struct k_itimer *kit);
+ void (*timer_gettime)(void *priv, struct k_itimer *kit,
+ struct itimerspec *tsp);
+ int (*timer_settime)(void *priv, struct k_itimer *kit, int flags,
+ struct itimerspec *tsp, struct itimerspec *old);
+};
+
+/**
+ * struct clock_device - an opaque type
+ */
+struct clock_device;
+
+/**
+ * create_clock_device() - register a new clock
+ * @cops: Pointer to the clock's interface
+ * @fops: The clock driver's custom character device functions.
+ * Drivers without custom methods may pass NULL. The file
+ * operation callbacks should access private data using
+ * clock_device_private(), see below.
+ * @devid: Allocated device id
+ * @priv: Private data passed back to the driver via the interface functions
+ *
+ * A clock driver calls this function to register itself with the
+ * clock device subsystem. The pointers 'cops' and 'fops' must point
+ * to persistent data, so the caller should pass a static global.
+ *
+ * Returns a pointer to a new clock device, or PTR_ERR on error.
+ */
+struct clock_device *create_clock_device(struct clock_device_operations *cops,
+ struct file_operations *fops,
+ dev_t devid, void *priv);
+
+/**
+ * destroy_clock_device() - unregister a clock
+ * @clk: Pointer obtained via create_clock_device()
+ *
+ * A clock driver calls this function to remove itself from the clock
+ * device subsystem. The clock_device itself will remain (in an
+ * inactive state) until its reference count drops to zero, at which
+ * point it will be deallocated.
+ */
+void destroy_clock_device(struct clock_device *clk);
+
+/**
+ * clock_device_private() - obtain clock driver's private data
+ *
+ * Character device file operations are first handled by the clock
+ * device layer, then passed on to the driver by calling its file
+ * operations functions. The clock device layer already uses
+ * fp->private_data, but drivers may use the following function to
+ * access their private data.
+ */
+void *clock_device_private(struct file *fp);
+
+#endif
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index ee26662..a573dd3 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -1,4 +1,5 @@
-obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o timecompare.o timeconv.o
+obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o \
+timecompare.o timeconv.o clockdevice.o

obj-$(CONFIG_GENERIC_CLOCKEVENTS_BUILD) += clockevents.o
obj-$(CONFIG_GENERIC_CLOCKEVENTS) += tick-common.o
diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
new file mode 100644
index 0000000..323b57b
--- /dev/null
+++ b/kernel/time/clockdevice.c
@@ -0,0 +1,155 @@
+/*
+ * clockdevice.c - support for dynmanic clock devices
+ *
+ * Copyright (C) 2010 OMICRON electronics GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+#include <linux/cdev.h>
+#include <linux/clockdevice.h>
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+
+#define MAX_CLKDEV BITS_PER_LONG
+static DECLARE_BITMAP(clocks_map, MAX_CLKDEV);
+static DEFINE_MUTEX(clocks_mux); /* protects 'clocks_map' */
+
+struct clock_device {
+ struct file_operations fops;
+ struct file_operations *driver_fops;
+ struct clock_device_operations *ops;
+ struct cdev cdev;
+ struct kref kref;
+ struct mutex mux;
+ void *priv;
+ int index;
+ bool zombie;
+};
+
+static void delete_clock(struct kref *kref);
+
+static int clock_device_open(struct inode *inode, struct file *fp)
+{
+ struct clock_device *clk =
+ container_of(inode->i_cdev, struct clock_device, cdev);
+
+ kref_get(&clk->kref);
+ fp->private_data = clk;
+
+ if (clk->driver_fops && clk->driver_fops->open)
+ return clk->driver_fops->open(inode, fp);
+ else
+ return 0;
+}
+
+static int clock_device_release(struct inode *inode, struct file *fp)
+{
+ struct clock_device *clk = fp->private_data;
+ int err = 0;
+
+ if (clk->driver_fops && clk->driver_fops->release)
+ err = clk->driver_fops->release(inode, fp);
+
+ kref_put(&clk->kref, delete_clock);
+
+ return err;
+}
+
+struct clock_device *create_clock_device(struct clock_device_operations *cops,
+ struct file_operations *fops,
+ dev_t devid,
+ void *priv)
+{
+ struct clock_device *clk;
+ int err;
+
+ mutex_lock(&clocks_mux);
+
+ err = -ENOMEM;
+ clk = kzalloc(sizeof(*clk), GFP_KERNEL);
+ if (!clk)
+ goto no_memory;
+
+ err = -EBUSY;
+ clk->index = find_first_zero_bit(clocks_map, MAX_CLKDEV);
+ if (clk->index < MAX_CLKDEV)
+ set_bit(clk->index, clocks_map);
+ else
+ goto no_index;
+
+ clk->ops = cops;
+ clk->priv = priv;
+ kref_init(&clk->kref);
+ mutex_init(&clk->mux);
+
+ if (fops) {
+ clk->driver_fops = fops;
+ clk->fops = *fops;
+ }
+ clk->fops.open = clock_device_open;
+ clk->fops.release = clock_device_release;
+
+ cdev_init(&clk->cdev, &clk->fops);
+ clk->cdev.owner = clk->ops->owner;
+ err = cdev_add(&clk->cdev, devid, 1);
+ if (err)
+ goto no_cdev;
+
+ mutex_unlock(&clocks_mux);
+ return clk;
+
+no_cdev:
+ mutex_destroy(&clk->mux);
+ clear_bit(clk->index, clocks_map);
+no_index:
+ kfree(clk);
+no_memory:
+ mutex_unlock(&clocks_mux);
+ return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(create_clock_device);
+
+static void delete_clock(struct kref *kref)
+{
+ struct clock_device *clk =
+ container_of(kref, struct clock_device, kref);
+
+ mutex_lock(&clocks_mux);
+ clear_bit(clk->index, clocks_map);
+ mutex_unlock(&clocks_mux);
+
+ mutex_destroy(&clk->mux);
+ kfree(clk);
+}
+
+void destroy_clock_device(struct clock_device *clk)
+{
+ cdev_del(&clk->cdev);
+
+ mutex_lock(&clk->mux);
+ clk->zombie = true;
+ mutex_unlock(&clk->mux);
+
+ kref_put(&clk->kref, delete_clock);
+}
+EXPORT_SYMBOL_GPL(destroy_clock_device);
+
+void *clock_device_private(struct file *fp)
+{
+ struct clock_device *clk = fp->private_data;
+ return clk->priv;
+}
+EXPORT_SYMBOL_GPL(clock_device_private);
--
1.7.0.4

2010-11-04 19:28:39

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 2/8] clock device: convert clock_gettime

This patch lets the clock_gettime system call use dynamic clock devices.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/clockdevice.h | 9 ++++++
include/linux/posix-timers.h | 21 ++++++++++++++-
include/linux/time.h | 2 +
kernel/posix-timers.c | 4 +-
kernel/time/clockdevice.c | 58 ++++++++++++++++++++++++++++++++++++++++++
5 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/include/linux/clockdevice.h b/include/linux/clockdevice.h
index a8f9359..ae258ac 100644
--- a/include/linux/clockdevice.h
+++ b/include/linux/clockdevice.h
@@ -94,4 +94,13 @@ void destroy_clock_device(struct clock_device *clk);
*/
void *clock_device_private(struct file *fp);

+/**
+ * clockid_to_clock_device() - obtain clock device pointer from a clock id
+ * @id: user space clock id
+ *
+ * Returns a pointer to the clock device, or NULL if the id is not a
+ * dynamic clock id.
+ */
+struct clock_device *clockid_to_clock_device(clockid_t id);
+
#endif
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3e23844..70f40e6 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -17,10 +17,22 @@ struct cpu_timer_list {
int firing;
};

+/* Bit fields within a clockid:
+ *
+ * The most significant 29 bits hold either a pid or a file descriptor.
+ *
+ * Bit 2 indicates whether a cpu clock refers to a thread or a process.
+ *
+ * Bits 1 and 0 give the type: PROF=0, VIRT=1, SCHED=2, or FD=3.
+ *
+ * A clockid is invalid if bits 2, 1, and 0 all set (see also CLOCK_INVALID
+ * in include/linux/time.h)
+ */
+
#define CPUCLOCK_PID(clock) ((pid_t) ~((clock) >> 3))
#define CPUCLOCK_PERTHREAD(clock) \
(((clock) & (clockid_t) CPUCLOCK_PERTHREAD_MASK) != 0)
-#define CPUCLOCK_PID_MASK 7
+
#define CPUCLOCK_PERTHREAD_MASK 4
#define CPUCLOCK_WHICH(clock) ((clock) & (clockid_t) CPUCLOCK_CLOCK_MASK)
#define CPUCLOCK_CLOCK_MASK 3
@@ -28,12 +40,17 @@ struct cpu_timer_list {
#define CPUCLOCK_VIRT 1
#define CPUCLOCK_SCHED 2
#define CPUCLOCK_MAX 3
+#define CLOCKFD CPUCLOCK_MAX
+#define CLOCKFD_MASK (CPUCLOCK_PERTHREAD_MASK|CPUCLOCK_CLOCK_MASK)

#define MAKE_PROCESS_CPUCLOCK(pid, clock) \
((~(clockid_t) (pid) << 3) | (clockid_t) (clock))
#define MAKE_THREAD_CPUCLOCK(tid, clock) \
MAKE_PROCESS_CPUCLOCK((tid), (clock) | CPUCLOCK_PERTHREAD_MASK)

+#define FD_TO_CLOCKID(fd) ((clockid_t) (fd << 3) | CLOCKFD)
+#define CLOCKID_TO_FD(clk) (((unsigned int) clk) >> 3)
+
/* POSIX.1b interval timer structure. */
struct k_itimer {
struct list_head list; /* free/ allocate list */
@@ -119,4 +136,6 @@ long clock_nanosleep_restart(struct restart_block *restart_block);

void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);

+int posix_clock_gettime(const clockid_t clock, struct timespec __user *tp);
+
#endif
diff --git a/include/linux/time.h b/include/linux/time.h
index 9f15ac7..914c48d 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -299,6 +299,8 @@ struct itimerval {
#define CLOCKS_MASK (CLOCK_REALTIME | CLOCK_MONOTONIC)
#define CLOCKS_MONO CLOCK_MONOTONIC

+#define CLOCK_INVALID -1
+
/*
* The various flags for setting POSIX.1b interval timers:
*/
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 9ca4973..4aecbfa 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -952,8 +952,8 @@ SYSCALL_DEFINE2(clock_settime, const clockid_t, which_clock,
return CLOCK_DISPATCH(which_clock, clock_set, (which_clock, &new_tp));
}

-SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
- struct timespec __user *,tp)
+int posix_clock_gettime(const clockid_t which_clock,
+ struct timespec __user *tp)
{
struct timespec kernel_tp;
int error;
diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
index 323b57b..e80117b 100644
--- a/kernel/time/clockdevice.c
+++ b/kernel/time/clockdevice.c
@@ -20,8 +20,11 @@
#include <linux/cdev.h>
#include <linux/clockdevice.h>
#include <linux/device.h>
+#include <linux/file.h>
#include <linux/mutex.h>
#include <linux/slab.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>

#define MAX_CLKDEV BITS_PER_LONG
static DECLARE_BITMAP(clocks_map, MAX_CLKDEV);
@@ -153,3 +156,58 @@ void *clock_device_private(struct file *fp)
return clk->priv;
}
EXPORT_SYMBOL_GPL(clock_device_private);
+
+static inline bool clock_is_static(clockid_t id)
+{
+ if (0 == (id & ~CLOCKFD_MASK))
+ return true;
+ if (CLOCKFD == (id & CLOCKFD_MASK))
+ return false;
+ return true;
+}
+
+struct clock_device *clockid_to_clock_device(clockid_t id)
+{
+ struct clock_device *clk = NULL;
+ struct file *fp;
+
+ if (clock_is_static(id))
+ return NULL;
+
+ fp = fget(CLOCKID_TO_FD(id));
+ if (!fp)
+ return NULL;
+
+ if (fp->f_op->open == clock_device_open)
+ clk = fp->private_data;
+
+ fput(fp);
+ return clk;
+}
+
+SYSCALL_DEFINE2(clock_gettime,
+ const clockid_t, id, struct timespec __user *, user_ts)
+{
+ struct timespec ts;
+ struct clock_device *clk;
+ int err;
+
+ clk = clockid_to_clock_device(id);
+ if (!clk)
+ return posix_clock_gettime(id, user_ts);
+
+ mutex_lock(&clk->mux);
+
+ if (clk->zombie)
+ err = -ENODEV;
+ else if (!clk->ops->clock_gettime)
+ err = -EOPNOTSUPP;
+ else
+ err = clk->ops->clock_gettime(clk->priv, &ts);
+
+ if (!err && copy_to_user(user_ts, &ts, sizeof(ts)))
+ err = -EFAULT;
+
+ mutex_unlock(&clk->mux);
+ return err;
+}
--
1.7.0.4

2010-11-04 19:28:58

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 3/8] clock device: convert clock_getres

This patch lets the clock_getres system call use dynamic clock devices.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/posix-timers.h | 1 +
kernel/posix-timers.c | 3 +--
kernel/time/clockdevice.c | 27 +++++++++++++++++++++++++++
3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 70f40e6..7d6a4f0 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -137,5 +137,6 @@ long clock_nanosleep_restart(struct restart_block *restart_block);
void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);

int posix_clock_gettime(const clockid_t clock, struct timespec __user *tp);
+int posix_clock_getres(const clockid_t clock, struct timespec __user *tp);

#endif
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 4aecbfa..baa6a2e 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -969,8 +969,7 @@ int posix_clock_gettime(const clockid_t which_clock,

}

-SYSCALL_DEFINE2(clock_getres, const clockid_t, which_clock,
- struct timespec __user *, tp)
+int posix_clock_getres(const clockid_t which_clock, struct timespec __user *tp)
{
struct timespec rtn_tp;
int error;
diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
index e80117b..6629ae7 100644
--- a/kernel/time/clockdevice.c
+++ b/kernel/time/clockdevice.c
@@ -211,3 +211,30 @@ SYSCALL_DEFINE2(clock_gettime,
mutex_unlock(&clk->mux);
return err;
}
+
+SYSCALL_DEFINE2(clock_getres,
+ const clockid_t, id, struct timespec __user *, user_ts)
+{
+ struct timespec ts;
+ struct clock_device *clk;
+ int err;
+
+ clk = clockid_to_clock_device(id);
+ if (!clk)
+ return posix_clock_getres(id, user_ts);
+
+ mutex_lock(&clk->mux);
+
+ if (clk->zombie)
+ err = -ENODEV;
+ else if (!clk->ops->clock_getres)
+ err = -EOPNOTSUPP;
+ else
+ err = clk->ops->clock_getres(clk->priv, &ts);
+
+ if (!err && user_ts && copy_to_user(user_ts, &ts, sizeof(ts)))
+ err = -EFAULT;
+
+ mutex_unlock(&clk->mux);
+ return err;
+}
--
1.7.0.4

2010-11-04 19:29:19

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 4/8] clock device: convert clock_settime

This patch lets the clock_settime system call use dynamic clock devices.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/posix-timers.h | 1 +
kernel/posix-timers.c | 4 ++--
kernel/time/clockdevice.c | 26 ++++++++++++++++++++++++++
3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 7d6a4f0..864ed7b 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -138,5 +138,6 @@ void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);

int posix_clock_gettime(const clockid_t clock, struct timespec __user *tp);
int posix_clock_getres(const clockid_t clock, struct timespec __user *tp);
+int posix_clock_settime(const clockid_t clock, const struct timespec __user *t);

#endif
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index baa6a2e..ef4e222 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -939,8 +939,8 @@ int do_posix_clock_nonanosleep(const clockid_t clock, int flags,
}
EXPORT_SYMBOL_GPL(do_posix_clock_nonanosleep);

-SYSCALL_DEFINE2(clock_settime, const clockid_t, which_clock,
- const struct timespec __user *, tp)
+int posix_clock_settime(const clockid_t which_clock,
+ const struct timespec __user *tp)
{
struct timespec new_tp;

diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
index 6629ae7..3f2870d 100644
--- a/kernel/time/clockdevice.c
+++ b/kernel/time/clockdevice.c
@@ -238,3 +238,29 @@ SYSCALL_DEFINE2(clock_getres,
mutex_unlock(&clk->mux);
return err;
}
+
+SYSCALL_DEFINE2(clock_settime,
+ const clockid_t, id, const struct timespec __user *, user_ts)
+{
+ struct timespec ts;
+ struct clock_device *clk;
+ int err;
+
+ clk = clockid_to_clock_device(id);
+ if (!clk)
+ return posix_clock_settime(id, user_ts);
+
+ mutex_lock(&clk->mux);
+
+ if (clk->zombie)
+ err = -ENODEV;
+ else if (!clk->ops->clock_settime)
+ err = -EOPNOTSUPP;
+ else if (copy_from_user(&ts, user_ts, sizeof(ts)))
+ err = -EFAULT;
+ else
+ err = clk->ops->clock_settime(clk->priv, &ts);
+
+ mutex_unlock(&clk->mux);
+ return err;
+}
--
1.7.0.4

2010-11-04 19:29:41

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 5/8] clock device: convert timer_create

This patch lets the timer_create system call use dynamic clock devices.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/clockdevice.h | 5 +++++
kernel/posix-timers.c | 12 ++++++++++--
kernel/time/clockdevice.c | 17 +++++++++++++++++
3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/include/linux/clockdevice.h b/include/linux/clockdevice.h
index ae258ac..3201a28 100644
--- a/include/linux/clockdevice.h
+++ b/include/linux/clockdevice.h
@@ -103,4 +103,9 @@ void *clock_device_private(struct file *fp);
*/
struct clock_device *clockid_to_clock_device(clockid_t id);

+/*
+ * The following functions are only to be called from posix-timers.c
+ */
+int clock_device_timer_create(struct clock_device *clk, struct k_itimer *kit);
+
#endif
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index ef4e222..42efbe9 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -39,6 +39,7 @@
#include <asm/uaccess.h>
#include <linux/list.h>
#include <linux/init.h>
+#include <linux/clockdevice.h>
#include <linux/compiler.h>
#include <linux/idr.h>
#include <linux/posix-timers.h>
@@ -523,12 +524,15 @@ SYSCALL_DEFINE3(timer_create, const clockid_t, which_clock,
struct sigevent __user *, timer_event_spec,
timer_t __user *, created_timer_id)
{
+ struct clock_device *clk_dev;
struct k_itimer *new_timer;
int error, new_timer_id;
sigevent_t event;
int it_id_set = IT_ID_NOT_SET;

- if (invalid_clockid(which_clock))
+ clk_dev = clockid_to_clock_device(which_clock);
+
+ if (!clk_dev && invalid_clockid(which_clock))
return -EINVAL;

new_timer = alloc_posix_timer();
@@ -591,7 +595,11 @@ SYSCALL_DEFINE3(timer_create, const clockid_t, which_clock,
goto out;
}

- error = CLOCK_DISPATCH(which_clock, timer_create, (new_timer));
+ if (clk_dev)
+ error = clock_device_timer_create(clk_dev, new_timer);
+ else {
+ error = CLOCK_DISPATCH(which_clock, timer_create, (new_timer));
+ }
if (error)
goto out;

diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
index 3f2870d..166cc30 100644
--- a/kernel/time/clockdevice.c
+++ b/kernel/time/clockdevice.c
@@ -264,3 +264,20 @@ SYSCALL_DEFINE2(clock_settime,
mutex_unlock(&clk->mux);
return err;
}
+
+int clock_device_timer_create(struct clock_device *clk, struct k_itimer *kit)
+{
+ int err;
+
+ mutex_lock(&clk->mux);
+
+ if (clk->zombie)
+ err = -ENODEV;
+ else if (!clk->ops->timer_create)
+ err = -EOPNOTSUPP;
+ else
+ err = clk->ops->timer_create(clk->priv, kit);
+
+ mutex_unlock(&clk->mux);
+ return err;
+}
--
1.7.0.4

2010-11-04 19:30:09

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 6/8] clock device: convert timer_delete

This patch lets the timer_delete system call use dynamic clock devices.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/clockdevice.h | 1 +
kernel/posix-timers.c | 7 +++++++
kernel/time/clockdevice.c | 18 ++++++++++++++++++
3 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/include/linux/clockdevice.h b/include/linux/clockdevice.h
index 3201a28..592bd5e 100644
--- a/include/linux/clockdevice.h
+++ b/include/linux/clockdevice.h
@@ -107,5 +107,6 @@ struct clock_device *clockid_to_clock_device(clockid_t id);
* The following functions are only to be called from posix-timers.c
*/
int clock_device_timer_create(struct clock_device *clk, struct k_itimer *kit);
+int clock_device_timer_delete(struct clock_device *clk, struct k_itimer *kit);

#endif
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 42efbe9..5feb565 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -857,6 +857,13 @@ static inline int common_timer_del(struct k_itimer *timer)

static inline int timer_delete_hook(struct k_itimer *timer)
{
+ struct clock_device *clk_dev;
+
+ clk_dev = clockid_to_clock_device(timer->it_clock);
+
+ if (clk_dev)
+ return clock_device_timer_delete(clk_dev, timer);
+
return CLOCK_DISPATCH(timer->it_clock, timer_del, (timer));
}

diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
index 166cc30..27d13a4 100644
--- a/kernel/time/clockdevice.c
+++ b/kernel/time/clockdevice.c
@@ -281,3 +281,21 @@ int clock_device_timer_create(struct clock_device *clk, struct k_itimer *kit)
mutex_unlock(&clk->mux);
return err;
}
+
+int clock_device_timer_delete(struct clock_device *clk, struct k_itimer *kit)
+{
+ int err;
+
+ mutex_lock(&clk->mux);
+
+ if (clk->zombie)
+ err = -ENODEV;
+ else if (!clk->ops->timer_delete)
+ err = -EOPNOTSUPP;
+ else
+ err = clk->ops->timer_delete(clk->priv, kit);
+
+ mutex_unlock(&clk->mux);
+
+ return err;
+}
--
1.7.0.4

2010-11-04 19:30:34

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 7/8] clock device: convert timer_gettime

This patch lets the timer_gettime system call use dynamic clock devices.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/clockdevice.h | 2 ++
kernel/posix-timers.c | 9 ++++++++-
kernel/time/clockdevice.c | 15 +++++++++++++++
3 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/include/linux/clockdevice.h b/include/linux/clockdevice.h
index 592bd5e..b56debb 100644
--- a/include/linux/clockdevice.h
+++ b/include/linux/clockdevice.h
@@ -108,5 +108,7 @@ struct clock_device *clockid_to_clock_device(clockid_t id);
*/
int clock_device_timer_create(struct clock_device *clk, struct k_itimer *kit);
int clock_device_timer_delete(struct clock_device *clk, struct k_itimer *kit);
+void clock_device_timer_gettime(struct clock_device *clk, struct k_itimer *kit,
+ struct itimerspec *tsp);

#endif
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 5feb565..d00ff73 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -711,6 +711,7 @@ common_timer_get(struct k_itimer *timr, struct itimerspec *cur_setting)
SYSCALL_DEFINE2(timer_gettime, timer_t, timer_id,
struct itimerspec __user *, setting)
{
+ struct clock_device *clk_dev;
struct k_itimer *timr;
struct itimerspec cur_setting;
unsigned long flags;
@@ -719,7 +720,13 @@ SYSCALL_DEFINE2(timer_gettime, timer_t, timer_id,
if (!timr)
return -EINVAL;

- CLOCK_DISPATCH(timr->it_clock, timer_get, (timr, &cur_setting));
+ clk_dev = clockid_to_clock_device(timr->it_clock);
+
+ if (clk_dev)
+ clock_device_timer_gettime(clk_dev, timr, &cur_setting);
+ else {
+ CLOCK_DISPATCH(timr->it_clock, timer_get, (timr, &cur_setting));
+ }

unlock_timer(timr, flags);

diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
index 27d13a4..0f6f913 100644
--- a/kernel/time/clockdevice.c
+++ b/kernel/time/clockdevice.c
@@ -299,3 +299,18 @@ int clock_device_timer_delete(struct clock_device *clk, struct k_itimer *kit)

return err;
}
+
+void clock_device_timer_gettime(struct clock_device *clk, struct k_itimer *kit,
+ struct itimerspec *tsp)
+{
+ mutex_lock(&clk->mux);
+
+ if (clk->zombie)
+ goto out;
+ else if (!clk->ops->timer_gettime)
+ goto out;
+ else
+ clk->ops->timer_gettime(clk->priv, kit, tsp);
+out:
+ mutex_unlock(&clk->mux);
+}
--
1.7.0.4

2010-11-04 19:30:53

by Richard Cochran

[permalink] [raw]
Subject: [PATCH RFC 8/8] clock device: convert timer_settime

This patch lets the timer_settime system call use dynamic clock devices.

Signed-off-by: Richard Cochran <[email protected]>
---
include/linux/clockdevice.h | 3 +++
kernel/posix-timers.c | 12 +++++++++---
kernel/time/clockdevice.c | 20 ++++++++++++++++++++
3 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/include/linux/clockdevice.h b/include/linux/clockdevice.h
index b56debb..d3589a0 100644
--- a/include/linux/clockdevice.h
+++ b/include/linux/clockdevice.h
@@ -110,5 +110,8 @@ int clock_device_timer_create(struct clock_device *clk, struct k_itimer *kit);
int clock_device_timer_delete(struct clock_device *clk, struct k_itimer *kit);
void clock_device_timer_gettime(struct clock_device *clk, struct k_itimer *kit,
struct itimerspec *tsp);
+int clock_device_timer_settime(struct clock_device *clk,
+ struct k_itimer *kit, int flags,
+ struct itimerspec *tsp, struct itimerspec *old);

#endif
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index d00ff73..b09e37e 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -817,6 +817,7 @@ SYSCALL_DEFINE4(timer_settime, timer_t, timer_id, int, flags,
const struct itimerspec __user *, new_setting,
struct itimerspec __user *, old_setting)
{
+ struct clock_device *clk_dev;
struct k_itimer *timr;
struct itimerspec new_spec, old_spec;
int error = 0;
@@ -837,9 +838,14 @@ retry:
if (!timr)
return -EINVAL;

- error = CLOCK_DISPATCH(timr->it_clock, timer_set,
- (timr, flags, &new_spec, rtn));
-
+ clk_dev = clockid_to_clock_device(timr->it_clock);
+ if (clk_dev)
+ error = clock_device_timer_settime(clk_dev, timr,
+ flags, &new_spec, rtn);
+ else {
+ error = CLOCK_DISPATCH(timr->it_clock, timer_set,
+ (timr, flags, &new_spec, rtn));
+ }
unlock_timer(timr, flag);
if (error == TIMER_RETRY) {
rtn = NULL; // We already got the old time...
diff --git a/kernel/time/clockdevice.c b/kernel/time/clockdevice.c
index 0f6f913..3345c3a 100644
--- a/kernel/time/clockdevice.c
+++ b/kernel/time/clockdevice.c
@@ -314,3 +314,23 @@ void clock_device_timer_gettime(struct clock_device *clk, struct k_itimer *kit,
out:
mutex_unlock(&clk->mux);
}
+
+int clock_device_timer_settime(struct clock_device *clk,
+ struct k_itimer *kit, int flags,
+ struct itimerspec *tsp, struct itimerspec *old)
+{
+ int err;
+
+ mutex_lock(&clk->mux);
+
+ if (clk->zombie)
+ err = -ENODEV;
+ else if (!clk->ops->timer_settime)
+ err = -EOPNOTSUPP;
+ else
+ err = clk->ops->timer_settime(clk->priv, kit, flags, tsp, old);
+
+ mutex_unlock(&clk->mux);
+
+ return err;
+}
--
1.7.0.4

2010-11-08 23:01:13

by john stultz

[permalink] [raw]
Subject: Re: [PATCH RFC 0/8] Dynamic clock devices

On Thu, 2010-11-04 at 20:26 +0100, Richard Cochran wrote:
> The clock_ syscalls are moved into a new file and they call the older
> posix functions when needed. The timer_ syscalls stay where they are,
> in posix-timers.c, since I did not want to change the fairly involved
> timer management code. Eventually, we could remove the posix clock_*
> functions for the CLOCK_* ids from posix-timers.c and rework them
> using the new dynamic clock api. That would leave just the timer code
> in posix-timers.c, as the file name suggests.
>
> I dropped the idea of having user space open a sysfs file in order to
> get a reference to a clock, since there are no open/release hooks
> within sysfs for drivers (coincidentally, there has been some talk
> about this on the lkml recently, but previously Greg KH object to
> abusing sysfs as a "clockfs").
>
> Instead, since many clocks (hpet, rtc, ptp, ...) will want to offer a
> custom chardev for special advanced functionality, the dynamic clock
> layer registers a cdev for the driver, placing its own hooks into the
> open/release methods. The driver thus needs to access its private data
> via a standard access method (not just by using fp->private_data). If
> a driver doesn't want any chardev functions, that is okay, too.
>
> Well, please take a look and let me know what you think.

Hey Richard!
Thanks for sending this patchset out for review! This is definitely a
larger redesign from your earlier patch, and I see now how the
per-thread cputime clockids throws a wrench in my argument just using
the incremental clockids that hash into a smaller array to avoid id
reuse.

That said, given how different this is from the last implementation, I'm
not fully clear I see how to integrate this into my patch set. It might
be useful to see a trivial example of how you see a clockdevice being
registered and used.

Overall it looks interesting, but there may be a few catches that we may
have to watch out for.

I have a few other comments I'll make in context of the patches to
follow shortly.

thanks again!
-john

2010-11-08 23:37:21

by john stultz

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Thu, 2010-11-04 at 20:28 +0100, Richard Cochran wrote:
> This patch lets the clock_gettime system call use dynamic clock devices.
>
> Signed-off-by: Richard Cochran <[email protected]>
> ---
> include/linux/clockdevice.h | 9 ++++++
> include/linux/posix-timers.h | 21 ++++++++++++++-
> include/linux/time.h | 2 +
> kernel/posix-timers.c | 4 +-
> kernel/time/clockdevice.c | 58 ++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 91 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/clockdevice.h b/include/linux/clockdevice.h
> index a8f9359..ae258ac 100644
> --- a/include/linux/clockdevice.h
> +++ b/include/linux/clockdevice.h
> @@ -94,4 +94,13 @@ void destroy_clock_device(struct clock_device *clk);
> */
> void *clock_device_private(struct file *fp);
>
> +/**
> + * clockid_to_clock_device() - obtain clock device pointer from a clock id
> + * @id: user space clock id
> + *
> + * Returns a pointer to the clock device, or NULL if the id is not a
> + * dynamic clock id.
> + */
> +struct clock_device *clockid_to_clock_device(clockid_t id);
> +
> #endif
> diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
> index 3e23844..70f40e6 100644
> --- a/include/linux/posix-timers.h
> +++ b/include/linux/posix-timers.h
> @@ -17,10 +17,22 @@ struct cpu_timer_list {
> int firing;
> };
>
> +/* Bit fields within a clockid:
> + *
> + * The most significant 29 bits hold either a pid or a file descriptor.
> + *
> + * Bit 2 indicates whether a cpu clock refers to a thread or a process.
> + *
> + * Bits 1 and 0 give the type: PROF=0, VIRT=1, SCHED=2, or FD=3.
> + *
> + * A clockid is invalid if bits 2, 1, and 0 all set (see also CLOCK_INVALID
> + * in include/linux/time.h)
> + */

> #define CPUCLOCK_PID(clock) ((pid_t) ~((clock) >> 3))

So I think we may want to add some additional comments here.
Specifically around the limits both new and existing that are created
around the interactions between clockid_t, pid_t and now the clock_fd.

Specifically, pid_t is already limited by futex to 29 bits (I believe,
please correct me if I'm wrong). MAKE_PROCESS_CPUCLOCK macro below seems
to also utilize this 29 bit pid limit assumption as well (via pid<<3),
however it may actually require pid to be below 28 (since CLOCK_DISPATCH
assumes cpu clockids are negative).

Anyway, this seems like it should be a bit more explicit.


> #define CPUCLOCK_PERTHREAD(clock) \
> (((clock) & (clockid_t) CPUCLOCK_PERTHREAD_MASK) != 0)
> -#define CPUCLOCK_PID_MASK 7
> +
> #define CPUCLOCK_PERTHREAD_MASK 4
> #define CPUCLOCK_WHICH(clock) ((clock) & (clockid_t) CPUCLOCK_CLOCK_MASK)
> #define CPUCLOCK_CLOCK_MASK 3
> @@ -28,12 +40,17 @@ struct cpu_timer_list {
> #define CPUCLOCK_VIRT 1
> #define CPUCLOCK_SCHED 2
> #define CPUCLOCK_MAX 3
> +#define CLOCKFD CPUCLOCK_MAX
> +#define CLOCKFD_MASK (CPUCLOCK_PERTHREAD_MASK|CPUCLOCK_CLOCK_MASK)
>
> #define MAKE_PROCESS_CPUCLOCK(pid, clock) \
> ((~(clockid_t) (pid) << 3) | (clockid_t) (clock))
> #define MAKE_THREAD_CPUCLOCK(tid, clock) \
> MAKE_PROCESS_CPUCLOCK((tid), (clock) | CPUCLOCK_PERTHREAD_MASK)
>
> +#define FD_TO_CLOCKID(fd) ((clockid_t) (fd << 3) | CLOCKFD)
> +#define CLOCKID_TO_FD(clk) (((unsigned int) clk) >> 3)

So similarly here, we need to be explicit in making sure that the max fd
value is small enough to fit without dropping bits if we shift it up by
three (trying to quickly review open I don't see any explicit limit,
other then NR_OPEN_DEFAULT, but that's BIT_PER_LONG, which won't fit in
the int returned by open on 64bit systems).


> +SYSCALL_DEFINE2(clock_gettime,
> + const clockid_t, id, struct timespec __user *, user_ts)
> +{
> + struct timespec ts;
> + struct clock_device *clk;
> + int err;
> +
> + clk = clockid_to_clock_device(id);
> + if (!clk)
> + return posix_clock_gettime(id, user_ts);
> +
> + mutex_lock(&clk->mux);
> +
> + if (clk->zombie)
> + err = -ENODEV;
> + else if (!clk->ops->clock_gettime)
> + err = -EOPNOTSUPP;
> + else
> + err = clk->ops->clock_gettime(clk->priv, &ts);
> +
> + if (!err && copy_to_user(user_ts, &ts, sizeof(ts)))
> + err = -EFAULT;
> +
> + mutex_unlock(&clk->mux);
> + return err;
> +}


So sort of minor nit here, but is there a reason the clockfd
implementation is primary here and the standard posix implementation
gets pushed off into its own function rather then doing something like:

clk = clockid_to_clock_device(id)
if(clk)
return clockdev_clock_gettime(clk, user_ts);
[existing sys_clock_gettime()]

As you implemented it, it seems to expect the clockdevice method to be
the most frequent use case, where as its likely to be the inverse. So
folks looking into the more common CLOCK_REALTIME calls have to traverse
more code then the likely more rare clockfd cases.

Also, in my mind, it would seem more in-line with the existing code to
integrate the conditional into CLOCK_DISPATCH. Not that CLOCK_DISPATCH
is pretty, but it avoids making your changes look tacked on in front of
everything.

thanks
-john


2010-11-09 08:23:49

by Richard Cochran

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Mon, Nov 08, 2010 at 03:37:03PM -0800, john stultz wrote:
> On Thu, 2010-11-04 at 20:28 +0100, Richard Cochran wrote:
> > #define CPUCLOCK_PID(clock) ((pid_t) ~((clock) >> 3))
>
> So I think we may want to add some additional comments here.
> Specifically around the limits both new and existing that are created
> around the interactions between clockid_t, pid_t and now the clock_fd.
>
> Specifically, pid_t is already limited by futex to 29 bits (I believe,
> please correct me if I'm wrong). MAKE_PROCESS_CPUCLOCK macro below seems
> to also utilize this 29 bit pid limit assumption as well (via pid<<3),
> however it may actually require pid to be below 28 (since CLOCK_DISPATCH
> assumes cpu clockids are negative).
>
> Anyway, this seems like it should be a bit more explicit.

Yes, you are right, of course, but I would like to point out that the
existing "overloading" of the clockid_t isn't really explained at all.
It was not clear to me whether or not 29 pid bits is enough for the
worst case, or not.

This code is older than 2005 (git/linux) and so I don't know who wrote
it and what they were thinking. I took the first step and tried to
explain as much I understand.

> > +#define FD_TO_CLOCKID(fd) ((clockid_t) (fd << 3) | CLOCKFD)
> > +#define CLOCKID_TO_FD(clk) (((unsigned int) clk) >> 3)
>
> So similarly here, we need to be explicit in making sure that the max fd
> value is small enough to fit without dropping bits if we shift it up by
> three (trying to quickly review open I don't see any explicit limit,
> other then NR_OPEN_DEFAULT, but that's BIT_PER_LONG, which won't fit in
> the int returned by open on 64bit systems).

I also didn't see any limit to the number of open files, except that
it must be a non-negative (signed) integer.

For userspace, there will have to be a glibc function/macro like
FD_TO_CLOCKID() that tests the three most significant bits and returns
CLOCK_INVALID if any are set.

This will result in the limitation that if a userspace program already
has 2^29 (536 million) open files, then it will not be able to obtain
a dynamic clock id. I think we can live with that.

> So sort of minor nit here, but is there a reason the clockfd
> implementation is primary here and the standard posix implementation
> gets pushed off into its own function rather then doing something like:
>
> clk = clockid_to_clock_device(id)
> if(clk)
> return clockdev_clock_gettime(clk, user_ts);
> [existing sys_clock_gettime()]
>
> As you implemented it, it seems to expect the clockdevice method to be
> the most frequent use case, where as its likely to be the inverse. So
> folks looking into the more common CLOCK_REALTIME calls have to traverse
> more code then the likely more rare clockfd cases.

Actually, what I would like to do is refactor the exisiting posix
clock code to use the new framework. The idea is to have a set of
static global clock_device*, one per fixed clock. The function
clockid_to_clock_device() will include a lookup table, like this:

static clock_device *realtime_clock, *monotinic_clock;

switch (id) {
case CLOCK_REALTIME:
return realtime_clock;
case CLOCK_MONOTONIC:
return monotinic_clock;
/* and so on ... */
}

This could be done on top of the existing patch in an incremental way.
However, I did not want to change everything all at once.

> Also, in my mind, it would seem more in-line with the existing code to
> integrate the conditional into CLOCK_DISPATCH. Not that CLOCK_DISPATCH
> is pretty, but it avoids making your changes look tacked on in front of
> everything.

Sorry to disagree, but I really don't want to be the one to extend
that macro in any way. IMHO it really should be replaced with
something more legible.

Thanks for the review,

Richard

2010-11-09 10:16:10

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Thursday 04 November 2010, Richard Cochran wrote:
> diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
> index 3e23844..70f40e6 100644
> --- a/include/linux/posix-timers.h
> +++ b/include/linux/posix-timers.h
> @@ -17,10 +17,22 @@ struct cpu_timer_list {
> int firing;
> };
>
> +/* Bit fields within a clockid:
> + *
> + * The most significant 29 bits hold either a pid or a file descriptor.
> + *
> + * Bit 2 indicates whether a cpu clock refers to a thread or a process.
> + *
> + * Bits 1 and 0 give the type: PROF=0, VIRT=1, SCHED=2, or FD=3.
> + *
> + * A clockid is invalid if bits 2, 1, and 0 all set (see also CLOCK_INVALID
> + * in include/linux/time.h)
> + */
> +
> #define (clock) ((pid_t) ~((clock) >> 3))
> #define CPUCLOCK_PERTHREAD(clock) \
> (((clock) & (clockid_t) CPUCLOCK_PERTHREAD_MASK) != 0)
> -#define CPUCLOCK_PID_MASK 7
> +
> #define CPUCLOCK_PERTHREAD_MASK 4
> #define CPUCLOCK_WHICH(clock) ((clock) & (clockid_t) CPUCLOCK_CLOCK_MASK)
> #define CPUCLOCK_CLOCK_MASK 3
> @@ -28,12 +40,17 @@ struct cpu_timer_list {
> #define CPUCLOCK_VIRT 1
> #define CPUCLOCK_SCHED 2
> #define CPUCLOCK_MAX 3
> +#define CLOCKFD CPUCLOCK_MAX
> +#define CLOCKFD_MASK (CPUCLOCK_PERTHREAD_MASK|CPUCLOCK_CLOCK_MASK)

It looks like you are turning a kernel internal interface into a user ABI,
which I think is highly questionable. Using the bits like this internally is
ok, but making it part of the syscall ABI means that we can never change this
in the future.

Maybe I was misunderstanding your patch though.

Arnd

2010-11-09 10:16:29

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RFC 1/8] Introduce dynamic clock devices

On Thursday 04 November 2010, Richard Cochran wrote:
> This patch adds support for adding and removing clock devices. The clock
> lifetime cycle is patterned after usb devices. Each clock is represented
> by a standard character device. In addition, the driver may optionally
> implemented custom character device operations.
>
> The clock devices do not yet do anything useful. This patch merely
> provides some needed infrastructure.
> +/**
> + * create_clock_device() - register a new clock

I'd call this clock_device_create() etc, putting all clock_device functions
in a common namespace with the clock_device_ prefix

> +#define MAX_CLKDEV BITS_PER_LONG
> +static DECLARE_BITMAP(clocks_map, MAX_CLKDEV);
> +static DEFINE_MUTEX(clocks_mux); /* protects 'clocks_map' */

Using "_mux" as a postfix for the name is highly unusual, this normally
means multiplex, not mutual exlusion.

You could probably avoid the need for this mutex altogether if you use
the atomic bit operations (e.g. test_and_set_bit).

> +struct clock_device {
> + struct file_operations fops;
> + struct file_operations *driver_fops;
> + struct clock_device_operations *ops;
> + struct cdev cdev;
> + struct kref kref;
> + struct mutex mux;
> + void *priv;
> + int index;
> + bool zombie;
> +};

You should really not need the file_operations here, neither the
struct nor the pointer. Just define a static file_operations
structure containing clock_device_open and clock_device_release,
and whatever else you might need, then add the driver's operations
to clock_device_operations, and pass the clock_device pointer
directly to them, instead of passing the file/inode pointers.

Arnd

2010-11-09 10:26:40

by Richard Cochran

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Mon, Nov 08, 2010 at 07:56:50AM +0100, Arnd Bergmann wrote:
> On Thursday 04 November 2010, Richard Cochran wrote:
> > #define CPUCLOCK_PERTHREAD_MASK 4
> > #define CPUCLOCK_WHICH(clock) ((clock) & (clockid_t) CPUCLOCK_CLOCK_MASK)
> > #define CPUCLOCK_CLOCK_MASK 3
> > @@ -28,12 +40,17 @@ struct cpu_timer_list {
> > #define CPUCLOCK_VIRT 1
> > #define CPUCLOCK_SCHED 2
> > #define CPUCLOCK_MAX 3
> > +#define CLOCKFD CPUCLOCK_MAX
> > +#define CLOCKFD_MASK (CPUCLOCK_PERTHREAD_MASK|CPUCLOCK_CLOCK_MASK)
>
> It looks like you are turning a kernel internal interface into a user ABI,
> which I think is highly questionable. Using the bits like this internally is
> ok, but making it part of the syscall ABI means that we can never change this
> in the future.

This set of defines is already part of the ABI and has been copied
into glibc. I am extending the (mis)use of the clock id by adding one
more case.

Richard

2010-11-09 12:47:26

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Tuesday 09 November 2010, Richard Cochran wrote:
> On Mon, Nov 08, 2010 at 07:56:50AM +0100, Arnd Bergmann wrote:
> > On Thursday 04 November 2010, Richard Cochran wrote:
> > > #define CPUCLOCK_PERTHREAD_MASK 4
> > > #define CPUCLOCK_WHICH(clock) ((clock) & (clockid_t) CPUCLOCK_CLOCK_MASK)
> > > #define CPUCLOCK_CLOCK_MASK 3
> > > @@ -28,12 +40,17 @@ struct cpu_timer_list {
> > > #define CPUCLOCK_VIRT 1
> > > #define CPUCLOCK_SCHED 2
> > > #define CPUCLOCK_MAX 3
> > > +#define CLOCKFD CPUCLOCK_MAX
> > > +#define CLOCKFD_MASK (CPUCLOCK_PERTHREAD_MASK|CPUCLOCK_CLOCK_MASK)
> >
> > It looks like you are turning a kernel internal interface into a user ABI,
> > which I think is highly questionable. Using the bits like this internally is
> > ok, but making it part of the syscall ABI means that we can never change this
> > in the future.
>
> This set of defines is already part of the ABI and has been copied
> into glibc. I am extending the (mis)use of the clock id by adding one
> more case.

Ok, that is very different then. I wonder why posix-timers.h is not an exported
header file, it might be good to change that to make it explicit which parts of
it define the ABI and which parts are kernel internal.

Arnd

2010-11-09 21:11:34

by john stultz

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Tue, 2010-11-09 at 09:23 +0100, Richard Cochran wrote:
> On Mon, Nov 08, 2010 at 03:37:03PM -0800, john stultz wrote:
> > On Thu, 2010-11-04 at 20:28 +0100, Richard Cochran wrote:
> > > #define CPUCLOCK_PID(clock) ((pid_t) ~((clock) >> 3))
> >
> > So I think we may want to add some additional comments here.
> > Specifically around the limits both new and existing that are created
> > around the interactions between clockid_t, pid_t and now the clock_fd.
> >
> > Specifically, pid_t is already limited by futex to 29 bits (I believe,
> > please correct me if I'm wrong). MAKE_PROCESS_CPUCLOCK macro below seems
> > to also utilize this 29 bit pid limit assumption as well (via pid<<3),
> > however it may actually require pid to be below 28 (since CLOCK_DISPATCH
> > assumes cpu clockids are negative).
> >
> > Anyway, this seems like it should be a bit more explicit.
>
> Yes, you are right, of course, but I would like to point out that the
> existing "overloading" of the clockid_t isn't really explained at all.
> It was not clear to me whether or not 29 pid bits is enough for the
> worst case, or not.
>
> This code is older than 2005 (git/linux) and so I don't know who wrote
> it and what they were thinking. I took the first step and tried to
> explain as much I understand.

Looks like the cpu timers bit landed in 2.6.11 from Roland.

Roland: Might be nice to get your thoughts on the proposed changes here.


> > > +#define FD_TO_CLOCKID(fd) ((clockid_t) (fd << 3) | CLOCKFD)
> > > +#define CLOCKID_TO_FD(clk) (((unsigned int) clk) >> 3)
> >
> > So similarly here, we need to be explicit in making sure that the max fd
> > value is small enough to fit without dropping bits if we shift it up by
> > three (trying to quickly review open I don't see any explicit limit,
> > other then NR_OPEN_DEFAULT, but that's BIT_PER_LONG, which won't fit in
> > the int returned by open on 64bit systems).
>
> I also didn't see any limit to the number of open files, except that
> it must be a non-negative (signed) integer.
>
> For userspace, there will have to be a glibc function/macro like
> FD_TO_CLOCKID() that tests the three most significant bits and returns
> CLOCK_INVALID if any are set.
>
> This will result in the limitation that if a userspace program already
> has 2^29 (536 million) open files, then it will not be able to obtain
> a dynamic clock id. I think we can live with that.

This does seem reasonable. Any one have an objection with this?


> > So sort of minor nit here, but is there a reason the clockfd
> > implementation is primary here and the standard posix implementation
> > gets pushed off into its own function rather then doing something like:
> >
> > clk = clockid_to_clock_device(id)
> > if(clk)
> > return clockdev_clock_gettime(clk, user_ts);
> > [existing sys_clock_gettime()]
> >
> > As you implemented it, it seems to expect the clockdevice method to be
> > the most frequent use case, where as its likely to be the inverse. So
> > folks looking into the more common CLOCK_REALTIME calls have to traverse
> > more code then the likely more rare clockfd cases.
>
> Actually, what I would like to do is refactor the exisiting posix
> clock code to use the new framework. The idea is to have a set of
> static global clock_device*, one per fixed clock. The function
> clockid_to_clock_device() will include a lookup table, like this:
>
> static clock_device *realtime_clock, *monotinic_clock;
>
> switch (id) {
> case CLOCK_REALTIME:
> return realtime_clock;
> case CLOCK_MONOTONIC:
> return monotinic_clock;
> /* and so on ... */
> }

This seems a little over-reaching. I'm not sure I see what benefit would
come from having clock_devices for the static clock_ids? The extra mutex
locking and status/null checking for the clock_device would would just
add unnecessary overhead to the performance critical clock_gettime call.

And would each cpuclock need a clock_device too?


> This could be done on top of the existing patch in an incremental way.
> However, I did not want to change everything all at once.
>
> > Also, in my mind, it would seem more in-line with the existing code to
> > integrate the conditional into CLOCK_DISPATCH. Not that CLOCK_DISPATCH
> > is pretty, but it avoids making your changes look tacked on in front of
> > everything.
>
> Sorry to disagree, but I really don't want to be the one to extend
> that macro in any way. IMHO it really should be replaced with
> something more legible.

Yes, I agree it could use some cleanup. And again, this was only a nit
with the early version of the patch, so not a huge issue right now. But
before these go upstream, we'll need to address this in some way (be it
your lookup table above or something else).

thanks
-john

2010-11-09 21:37:22

by john stultz

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Mon, 2010-11-08 at 07:56 +0100, Arnd Bergmann wrote:
> On Thursday 04 November 2010, Richard Cochran wrote:
> > diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
> > index 3e23844..70f40e6 100644
> > --- a/include/linux/posix-timers.h
> > +++ b/include/linux/posix-timers.h
> > @@ -17,10 +17,22 @@ struct cpu_timer_list {
> > int firing;
> > };
> >
> > +/* Bit fields within a clockid:
> > + *
> > + * The most significant 29 bits hold either a pid or a file descriptor.
> > + *
> > + * Bit 2 indicates whether a cpu clock refers to a thread or a process.
> > + *
> > + * Bits 1 and 0 give the type: PROF=0, VIRT=1, SCHED=2, or FD=3.
> > + *
> > + * A clockid is invalid if bits 2, 1, and 0 all set (see also CLOCK_INVALID
> > + * in include/linux/time.h)
> > + */
> > +
> > #define (clock) ((pid_t) ~((clock) >> 3))
> > #define CPUCLOCK_PERTHREAD(clock) \
> > (((clock) & (clockid_t) CPUCLOCK_PERTHREAD_MASK) != 0)
> > -#define CPUCLOCK_PID_MASK 7
> > +
> > #define CPUCLOCK_PERTHREAD_MASK 4
> > #define CPUCLOCK_WHICH(clock) ((clock) & (clockid_t) CPUCLOCK_CLOCK_MASK)
> > #define CPUCLOCK_CLOCK_MASK 3
> > @@ -28,12 +40,17 @@ struct cpu_timer_list {
> > #define CPUCLOCK_VIRT 1
> > #define CPUCLOCK_SCHED 2
> > #define CPUCLOCK_MAX 3
> > +#define CLOCKFD CPUCLOCK_MAX
> > +#define CLOCKFD_MASK (CPUCLOCK_PERTHREAD_MASK|CPUCLOCK_CLOCK_MASK)
>
> It looks like you are turning a kernel internal interface into a user ABI,
> which I think is highly questionable. Using the bits like this internally is
> ok, but making it part of the syscall ABI means that we can never change this
> in the future.
>
> Maybe I was misunderstanding your patch though.


As Richard already mentioned, the cpuclocks has already exported this as
ABI and its used in pthread_getcpuclockid().

I wonder thought if it would be worth having a syscall to convert from
fd to clock_id so it could be more flexible in the future. But it may
not be worth it, as we're probably already limited by the cpuclock
implementation.

thanks
-john


2010-11-10 10:15:22

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Tuesday 09 November 2010, john stultz wrote:
> I wonder thought if it would be worth having a syscall to convert from
> fd to clock_id so it could be more flexible in the future. But it may
> not be worth it, as we're probably already limited by the cpuclock
> implementation.

If the file descriptor comes from a character device, we don't even need
a syscall, it could simply be an ioctl on the device.

Arnd

2010-11-15 09:34:51

by Richard Cochran

[permalink] [raw]
Subject: Re: [PATCH RFC 0/8] Dynamic clock devices

On Mon, Nov 08, 2010 at 03:01:04PM -0800, john stultz wrote:
> That said, given how different this is from the last implementation, I'm
> not fully clear I see how to integrate this into my patch set. It might
> be useful to see a trivial example of how you see a clockdevice being
> registered and used.

Okay, I will post again, with the PTP code as user of the new layer.

Thanks,
Richard

2010-11-15 09:42:03

by Richard Cochran

[permalink] [raw]
Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime

On Tue, Nov 09, 2010 at 01:10:58PM -0800, john stultz wrote:
> On Tue, 2010-11-09 at 09:23 +0100, Richard Cochran wrote:
> > On Mon, Nov 08, 2010 at 03:37:03PM -0800, john stultz wrote:
> > > So sort of minor nit here, but is there a reason the clockfd
> > > implementation is primary here and the standard posix implementation
> > > gets pushed off into its own function rather then doing something like:
> > >
> > > clk = clockid_to_clock_device(id)
> > > if(clk)
> > > return clockdev_clock_gettime(clk, user_ts);
> > > [existing sys_clock_gettime()]
> > >
> > > As you implemented it, it seems to expect the clockdevice method to be
> > > the most frequent use case, where as its likely to be the inverse. So
> > > folks looking into the more common CLOCK_REALTIME calls have to traverse
> > > more code then the likely more rare clockfd cases.
> >
> > Actually, what I would like to do is refactor the exisiting posix
> > clock code to use the new framework. The idea is to have a set of
> > static global clock_device*, one per fixed clock. The function
> > clockid_to_clock_device() will include a lookup table, like this:
> >
> > static clock_device *realtime_clock, *monotinic_clock;
> >
> > switch (id) {
> > case CLOCK_REALTIME:
> > return realtime_clock;
> > case CLOCK_MONOTONIC:
> > return monotinic_clock;
> > /* and so on ... */
> > }
>
> This seems a little over-reaching. I'm not sure I see what benefit would
> come from having clock_devices for the static clock_ids? The extra mutex
> locking and status/null checking for the clock_device would would just
> add unnecessary overhead to the performance critical clock_gettime call.

Okay, next time I'll leave the syscall in posix-timers.c and call out
to the clock_device in the unlikely dynamic case.

Thanks,
Richard

2010-11-15 10:02:24

by Paul Mundt

[permalink] [raw]
Subject: Re: [PATCH RFC 0/8] Dynamic clock devices

On Thu, Nov 04, 2010 at 08:26:34PM +0100, Richard Cochran wrote:
> Okay, here is a work in progress, not well tested, but I would like to
> get some feedback whether the direction is good or not.
>
> The first patch introduces clock devices which can appear and
> disappear like usb devices (and, I suppose, hot plug PCI but I am not
> too sure that what is offered here would really work in that case).
> The subsequent patches convert the clock_ and timer_ system calls, one
> by one.
>
Sorry to jump in late, but this only just caught my attention.

While I have no strong feelings on this series one way or the other, the
naming is a bit unfortunate. The clock device / clkdev naming is already
in use as an extension to the clock framework and is used by a wide
variety of embedded platforms already, with a pending patch to move it in
to the generic namespace (grepping for clkdev will give you an idea). The
idea behind that interface is similar in that it deals with the dynamic
creation and teardown of clocks, but is decoupled from timekeeping.

It's also reasonable to assume that devices with dynamic clocks tracked
through clkdev will wish to also use this interface in the timekeeping
case, so it would be good to settle on something less ambiguous in
advance.

2010-11-18 16:33:17

by Richard Cochran

[permalink] [raw]
Subject: Re: [PATCH RFC 0/8] Dynamic clock devices

On Mon, Nov 15, 2010 at 07:01:50PM +0900, Paul Mundt wrote:
> While I have no strong feelings on this series one way or the other, the
> naming is a bit unfortunate. The clock device / clkdev naming is already
> in use as an extension to the clock framework and is used by a wide
> variety of embedded platforms already, with a pending patch to move it in
> to the generic namespace (grepping for clkdev will give you an idea). The
> idea behind that interface is similar in that it deals with the dynamic
> creation and teardown of clocks, but is decoupled from timekeeping.
>
> It's also reasonable to assume that devices with dynamic clocks tracked
> through clkdev will wish to also use this interface in the timekeeping
> case, so it would be good to settle on something less ambiguous in
> advance.

Okay, I will take a look at that and try to come up with a better name
for the dynamic clock code.

Thanks,
Richard

2010-12-04 14:55:31

by Richard Cochran

[permalink] [raw]
Subject: Re: [PATCH RFC 1/8] Introduce dynamic clock devices

On Mon, Nov 08, 2010 at 07:38:41AM +0100, Arnd Bergmann wrote:
> On Thursday 04 November 2010, Richard Cochran wrote:
> > +struct clock_device {
> > + struct file_operations fops;
> > + struct file_operations *driver_fops;
> > + struct clock_device_operations *ops;
> > + struct cdev cdev;
> > + struct kref kref;
> > + struct mutex mux;
> > + void *priv;
> > + int index;
> > + bool zombie;
> > +};
>
> You should really not need the file_operations here, neither the
> struct nor the pointer. Just define a static file_operations
> structure containing clock_device_open and clock_device_release,
> and whatever else you might need, then add the driver's operations
> to clock_device_operations, and pass the clock_device pointer
> directly to them, instead of passing the file/inode pointers.

Arnd, I'm working a revision of this series, and I am not sure I
understand your comment.

The intent here was to allow clock drivers to register a character
device through the clock_device API, since some clocks (hpet, rtc)
already offer a chardev interface.

The same FD from the open character device will also be usable as a
clockid for the generic posix clock_get/settime calls. Thus, the
clock_device layer needs to hook into the file open/release functions.

Are you suggesting that I simply offer all of the functions from a
'struct file_operations' (sans file/inode) in the 'struct
clock_device_operations' too?

I wanted to avoid duplicating the file_operations functions, so that
future changes in those functions would automatically trickle down to
the clock drivers.

Thanks,
Richard


2010-12-06 14:15:27

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RFC 1/8] Introduce dynamic clock devices

On Saturday 04 December 2010, Richard Cochran wrote:
> On Mon, Nov 08, 2010 at 07:38:41AM +0100, Arnd Bergmann wrote:
> > On Thursday 04 November 2010, Richard Cochran wrote:
> > > +struct clock_device {
> > > + struct file_operations fops;
> > > + struct file_operations *driver_fops;
> > > + struct clock_device_operations *ops;
> > > + struct cdev cdev;
> > > + struct kref kref;
> > > + struct mutex mux;
> > > + void *priv;
> > > + int index;
> > > + bool zombie;
> > > +};
> >
> > You should really not need the file_operations here, neither the
> > struct nor the pointer. Just define a static file_operations
> > structure containing clock_device_open and clock_device_release,
> > and whatever else you might need, then add the driver's operations
> > to clock_device_operations, and pass the clock_device pointer
> > directly to them, instead of passing the file/inode pointers.
>
> Arnd, I'm working a revision of this series, and I am not sure I
> understand your comment.
>
> The intent here was to allow clock drivers to register a character
> device through the clock_device API, since some clocks (hpet, rtc)
> already offer a chardev interface.
>
> The same FD from the open character device will also be usable as a
> clockid for the generic posix clock_get/settime calls. Thus, the
> clock_device layer needs to hook into the file open/release functions.

Ok, it wasn't clear that you use this to hook the existing file
operations, I now understand why you did it this way, but I think
my point is still valid.

> Are you suggesting that I simply offer all of the functions from a
> 'struct file_operations' (sans file/inode) in the 'struct
> clock_device_operations' too?

Yes, exactly.

> I wanted to avoid duplicating the file_operations functions, so that
> future changes in those functions would automatically trickle down to
> the clock drivers.

No need, these rarely change. More importantly, if you want to offer
a consistent interface across all these, I would make the interface
as restrictive as possible rather than offering all of the file
operations. Have a look which operations are actually used by the
character devices you want to support, and then pass through the
superset of those, but not more.

Arnd