Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754613Ab0KIVLe (ORCPT ); Tue, 9 Nov 2010 16:11:34 -0500 Received: from e37.co.us.ibm.com ([32.97.110.158]:59579 "EHLO e37.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752720Ab0KIVLb (ORCPT ); Tue, 9 Nov 2010 16:11:31 -0500 Subject: Re: [PATCH RFC 2/8] clock device: convert clock_gettime From: john stultz To: Richard Cochran Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, Alan Cox , Arnd Bergmann , Christoph Lameter , Peter Zijlstra , Thomas Gleixner , Roland McGrath In-Reply-To: <20101109082353.GA2690@riccoc20.at.omicron.at> References: <81ccd2674ebf26332898761ba6b7b54f395a15bd.1288897199.git.richard.cochran@omicron.at> <1289259423.21487.7.camel@localhost.localdomain> <20101109082353.GA2690@riccoc20.at.omicron.at> Content-Type: text/plain; charset="UTF-8" Date: Tue, 09 Nov 2010 13:10:58 -0800 Message-ID: <1289337058.9434.39.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.30.3 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5005 Lines: 116 On Tue, 2010-11-09 at 09:23 +0100, Richard Cochran wrote: > On Mon, Nov 08, 2010 at 03:37:03PM -0800, john stultz wrote: > > On Thu, 2010-11-04 at 20:28 +0100, Richard Cochran wrote: > > > #define CPUCLOCK_PID(clock) ((pid_t) ~((clock) >> 3)) > > > > So I think we may want to add some additional comments here. > > Specifically around the limits both new and existing that are created > > around the interactions between clockid_t, pid_t and now the clock_fd. > > > > Specifically, pid_t is already limited by futex to 29 bits (I believe, > > please correct me if I'm wrong). MAKE_PROCESS_CPUCLOCK macro below seems > > to also utilize this 29 bit pid limit assumption as well (via pid<<3), > > however it may actually require pid to be below 28 (since CLOCK_DISPATCH > > assumes cpu clockids are negative). > > > > Anyway, this seems like it should be a bit more explicit. > > Yes, you are right, of course, but I would like to point out that the > existing "overloading" of the clockid_t isn't really explained at all. > It was not clear to me whether or not 29 pid bits is enough for the > worst case, or not. > > This code is older than 2005 (git/linux) and so I don't know who wrote > it and what they were thinking. I took the first step and tried to > explain as much I understand. Looks like the cpu timers bit landed in 2.6.11 from Roland. Roland: Might be nice to get your thoughts on the proposed changes here. > > > +#define FD_TO_CLOCKID(fd) ((clockid_t) (fd << 3) | CLOCKFD) > > > +#define CLOCKID_TO_FD(clk) (((unsigned int) clk) >> 3) > > > > So similarly here, we need to be explicit in making sure that the max fd > > value is small enough to fit without dropping bits if we shift it up by > > three (trying to quickly review open I don't see any explicit limit, > > other then NR_OPEN_DEFAULT, but that's BIT_PER_LONG, which won't fit in > > the int returned by open on 64bit systems). > > I also didn't see any limit to the number of open files, except that > it must be a non-negative (signed) integer. > > For userspace, there will have to be a glibc function/macro like > FD_TO_CLOCKID() that tests the three most significant bits and returns > CLOCK_INVALID if any are set. > > This will result in the limitation that if a userspace program already > has 2^29 (536 million) open files, then it will not be able to obtain > a dynamic clock id. I think we can live with that. This does seem reasonable. Any one have an objection with this? > > So sort of minor nit here, but is there a reason the clockfd > > implementation is primary here and the standard posix implementation > > gets pushed off into its own function rather then doing something like: > > > > clk = clockid_to_clock_device(id) > > if(clk) > > return clockdev_clock_gettime(clk, user_ts); > > [existing sys_clock_gettime()] > > > > As you implemented it, it seems to expect the clockdevice method to be > > the most frequent use case, where as its likely to be the inverse. So > > folks looking into the more common CLOCK_REALTIME calls have to traverse > > more code then the likely more rare clockfd cases. > > Actually, what I would like to do is refactor the exisiting posix > clock code to use the new framework. The idea is to have a set of > static global clock_device*, one per fixed clock. The function > clockid_to_clock_device() will include a lookup table, like this: > > static clock_device *realtime_clock, *monotinic_clock; > > switch (id) { > case CLOCK_REALTIME: > return realtime_clock; > case CLOCK_MONOTONIC: > return monotinic_clock; > /* and so on ... */ > } This seems a little over-reaching. I'm not sure I see what benefit would come from having clock_devices for the static clock_ids? The extra mutex locking and status/null checking for the clock_device would would just add unnecessary overhead to the performance critical clock_gettime call. And would each cpuclock need a clock_device too? > This could be done on top of the existing patch in an incremental way. > However, I did not want to change everything all at once. > > > Also, in my mind, it would seem more in-line with the existing code to > > integrate the conditional into CLOCK_DISPATCH. Not that CLOCK_DISPATCH > > is pretty, but it avoids making your changes look tacked on in front of > > everything. > > Sorry to disagree, but I really don't want to be the one to extend > that macro in any way. IMHO it really should be replaced with > something more legible. Yes, I agree it could use some cleanup. And again, this was only a nit with the early version of the patch, so not a huge issue right now. But before these go upstream, we'll need to address this in some way (be it your lookup table above or something else). thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/