Date: Mon, 29 Oct 2018 21:33:14 +0100 (CET)
From: Thomas Gleixner
To: Andrei Vagin
Cc: "Eric W. Biederman", "linux-kselftest@vger.kernel.org",
    Dmitry Safonov, "linux-api@vger.kernel.org", Jeff Dike,
    "x86@kernel.org", Dmitry Safonov <0x7f454c46@gmail.com>,
    "linux-kernel@vger.kernel.org", Oleg Nesterov, "criu@openvz.org",
    Ingo Molnar, Alexey Dobriyan, Andy Lutomirski, "H. Peter Anvin",
    Cyrill Gorcunov, Christian Brauner, Pavel Emelianov, Shuah Khan,
    "containers@lists.linux-foundation.org", Adrian Reber,
    Peter Zijlstra
Subject: Re: [RFC 00/20] ns: Introduce Time Namespace
In-Reply-To: <20181021014121.GA23474@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Andrei,

On Sat, 20 Oct 2018, Andrei Vagin wrote:
> When a container is migrated to another host, we have to restore its
> monotonic and boottime clocks, but we still expect that the container
> will continue using the host real-time clock.
>
> Before stating this series, I was thinking about this, I decided that
> these cases can be solved independently. Probably, the full isolation of
> the time sub-system will have much higher overhead than just offsets for
> a few clocks. And the idea that isolation of the real-time clock should
> be optional gives us another hint that offsets for monotonic and
> boot-time clocks can be implemented independently.
>
> Eric and Tomas, what do you think about this? If you agree that these
> two cases can be implemented separately, what should we do with this
> series to make it ready to be merged?
>
> I know that we need to:
>
> * look at device drivers that report timestamps in CLOCK_MONOTONIC base.

and CLOCK_BOOTTIME and that's quite a few.

> * forbid changing offsets after creating timers

There are more things to think about. What about interfaces which expose
boot time or monotonic time in /proc?

Aside from that (I finally got around to looking at the series in more
detail), I'm really unhappy about the unconditional overhead once the
Time namespace config switch is enabled. This applies especially to the
VDSO. We spent quite some time recently to squeeze a few cycles out of
those functions and it would be a pity to pointlessly waste cycles for
the !namespace case.

I can see the urge for this, but please let us think it through properly
before rushing in anything which we are going to regret once we want to
do more sophisticated time domain management, e.g. support for isolated
clock real time. I'm worried that without a clear plan about the overall
picture, we end up with duct tape which is hard to disentangle after the
fact.

There have been a few other things brought up versus time management in
general, like the TSN folks utilizing grand clock masters which expose
random time instead of proper TAI. Plus some requirements for exposing
some sort of 'monotonic' clocks which are derived from external
synchronization mechanisms, but should not affect the regular time
keeping clocks.

While different issues, these all fall into the category of separate
time domains, so taking a step back to the drawing board is probably the
best thing we can do now.

There are certainly a few things which can be looked at independently,
e.g. the VDSO mechanics or general mechanisms to avoid plastering the
whole kernel with these name space functions applying offsets left and
right. I'd rather have dedicated core functionality which
replaces/amends existing timer functions to become time namespace aware.

I'll try to find some time in the next weeks to look deeper into that,
but I can't promise anything before returning from LPC. Btw, LPC would
be a great opportunity to discuss that.
Are you and the other name space wizards there by any chance?

Thanks,

	tglx
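
For readers following the thread, the per-clock offset idea from the quoted message and the concern about cycles spent in the !namespace case can be pictured with a small user-space sketch. This is not code from the RFC series or from the kernel. Every name in it (sketch_clock, sketch_time_ns, sketch_clock_view) is hypothetical, and using a NULL pointer to model a task outside any time namespace is an assumption made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock IDs for this sketch; not the kernel's clockid_t values. */
enum sketch_clock { SKETCH_REALTIME, SKETCH_MONOTONIC, SKETCH_BOOTTIME };

/* Hypothetical per-namespace state: one offset per shifted clock, in ns. */
struct sketch_time_ns {
    int64_t monotonic_offset_ns;
    int64_t boottime_offset_ns;
};

/*
 * Translate a host clock reading into the caller's view.
 * ns == NULL models "not in a time namespace" (an assumption of this
 * sketch) and returns the host value after a single pointer test.
 */
static int64_t sketch_clock_view(const struct sketch_time_ns *ns,
                                 enum sketch_clock id, int64_t host_ns)
{
    if (ns == NULL)
        return host_ns;

    switch (id) {
    case SKETCH_MONOTONIC:
        return host_ns + ns->monotonic_offset_ns;
    case SKETCH_BOOTTIME:
        return host_ns + ns->boottime_offset_ns;
    default:
        /* The real-time clock stays on host time in this sketch. */
        return host_ns;
    }
}
```

With a monotonic offset of 4900 ns, a host reading of 100 ns is seen as 5000 ns inside the namespace, while a real-time reading passes through unchanged. Even the early return is an extra branch on the read path, which is the sort of cost the message asks to weigh for the VDSO.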