Date: Thu, 27 Sep 2018 23:30:09 +0200 (CEST)
From: Thomas Gleixner
To: "Eric W. Biederman"
Cc: Andrey Vagin, Dmitry Safonov, "linux-kernel@vger.kernel.org",
    Dmitry Safonov <0x7f454c46@gmail.com>, Adrian Reber, Andy Lutomirski,
    Christian Brauner, Cyrill Gorcunov, "H. Peter Anvin", Ingo Molnar,
    Jeff Dike, Oleg Nesterov, Pavel Emelianov, Shuah Khan,
    "containers@lists.linux-foundation.org", "criu@openvz.org",
    "linux-api@vger.kernel.org", "x86@kernel.org", Alexey Dobriyan,
    "linux-kselftest@vger.kernel.org"
Subject: Re: [RFC 00/20] ns: Introduce Time Namespace

On Wed, 26 Sep 2018, Eric W. Biederman wrote:

> Reading the code the calling sequence there is:
> tick_sched_do_timer
> tick_do_update_jiffies64
> update_wall_time
> timekeeping_advance
> timekeeping_update
>
> If I read that properly under the right nohz circumstances that update
> can be delayed indefinitely.
>
> So I think we could prototype a time namespace that was per
> timekeeping_update and just had update_wall_time iterate through
> all of the time namespaces.

Please don't go there. timekeeping_update() is already heavy and walking
through a gazillion of namespaces will just make it horrible.

> I don't think the naive version would scale to very many time
> namespaces.

:)

> At the same time using the techniques from the nohz work and a little
> smarts I expect we could get the code to scale.

You'd need to invoke the update when the namespace is switched in and
hasn't been updated since the last tick happened.
That might be doable, but you also need to take the wraparound constraints
of the underlying clocksources into account, which again can cause walking
all namespaces when they are all idle long enough.

From there it becomes hairy, because it's not only timekeeping, i.e.
reading time, this is also affecting all timers which are armed from a
namespace.

That gets really ugly because when you do settimeofday() or adjtimex() for
a particular namespace, then you have to search for all armed timers of
that namespace and adjust them.

The original posix timer code had the same issue because it mapped the
clock realtime timers to the timer wheel, so any setting of the clock
caused a full walk of all armed timers, disarming, adjusting and
requeueing them. That's horrible not only performance-wise, it's also a
locking nightmare of all sorts.

Add time skew via NTP/PTP into the picture and you might have to adjust
timers as well, because you need to guarantee that they are not expiring
early.

I haven't looked through Dmitry's patches yet, but I don't see how this
can work at all without introducing subtle issues all over the place.

Thanks,

	tglx
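
To make the timer adjustment problem above concrete, here is a minimal
userspace sketch in C. It is not kernel code and not taken from the patch
series: struct time_ns, struct ns_timer and ns_settime() are hypothetical
names, and the sketch assumes that each armed timer caches its expiry
already translated into the host time base, which is roughly the situation
described for the original posix timer code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: a namespace clock is the host clock plus an offset. */
struct time_ns {
    int id;
    int64_t offset;       /* ns_time = host_time + offset */
};

/* Hypothetical armed timer with its expiry cached in host time. */
struct ns_timer {
    int ns_id;            /* owning time namespace */
    int64_t abs_expiry;   /* expiry as requested on the namespace clock */
    int64_t host_expiry;  /* same expiry translated to the host clock */
};

/*
 * Set the namespace clock to new_ns_now at host time host_now.
 * Because the offset changes, every armed timer of this namespace has to
 * be found and re-translated. The cost is a walk over all armed timers,
 * and a real implementation would also have to dequeue and requeue each
 * one under the appropriate locks.
 * Returns the number of timers that had to be adjusted.
 */
size_t ns_settime(struct time_ns *ns, int64_t host_now, int64_t new_ns_now,
                  struct ns_timer *timers, size_t n)
{
    size_t adjusted = 0;

    ns->offset = new_ns_now - host_now;

    for (size_t i = 0; i < n; i++) {
        if (timers[i].ns_id != ns->id)
            continue;     /* timer belongs to another namespace */
        timers[i].host_expiry = timers[i].abs_expiry - ns->offset;
        adjusted++;
    }
    return adjusted;
}
```

Even in this toy model a single clock set touches every timer of the
namespace. The sketch leaves out locking, the requeueing itself, and any
NTP/PTP skew, all of which make the real problem harder.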