Received: by 2002:a25:31c3:0:0:0:0:0 with SMTP id x186csp911212ybx; Thu, 7 Nov 2019 04:33:44 -0800 (PST) X-Google-Smtp-Source: APXvYqxxQgdXZBIcIAI825fpcOi7WVZQv+3MdskblDod5mbWE/vSMVK6XZjJrUfkz9LIsHKxXOY7 X-Received: by 2002:aa7:cb53:: with SMTP id w19mr3271520edt.50.1573130024600; Thu, 07 Nov 2019 04:33:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573130024; cv=none; d=google.com; s=arc-20160816; b=sflWGo2X/eOFAzp8f0VqwLt2xwtJydbWSVi3bKMAyECZ5i/8MREkRmDePzSTcHurk2 8TdZWb6RYqLiGP/lmE2iNUnZwaduleozo2AzeDbSm5ePCGqsfIFrAAofmYtuCnbnOtUA IuEjKOdjA4N5b7xKb/mM6Fbtx9u2QDFuN1RecYtqnbNIUvSd4INKcocEgrgDd77tdMO9 Xl+uzsysZciC/W/rvlLasJjQTC0d18F2ypnDCzAmACo2splT2Ng2NnbdgbkEDpuPkt8r RCql36LDHqIwvK7eMihIGVoUkdDyr8G0gXQioYLRfhPx4XTlDqWOVLO6ITVqBxpjDWbZ Ejtg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=5b6dL4GU7b0E4m+87rHIiqniNskuz8XJPlTDMY3zCsI=; b=dysbYXZ4pY3TFDsjzRo3kPRZvVD23JfmGwL2VI9gwwg68ccfgHJw39cR7p2FLb8mGg FMKZ9K/v6PS+ai9VcOQMHOCOEXmIehzqFRlHoDnYG5kD4OzaYN04X49gqi/0GvEY134s N4cVNxWOSOV/Y/90gwfk8RZud+aXueWot54Z3adIOG88Xr1gC18bT0va8JxuKIeIJooJ CQgsyXJGn6KqouzQK2undwM/vdMTKCS36PfeDjOdAI27GE6yi/cAnIHlUqiTPIivbC02 WhHKD2hg+O3GkcidO4nXW/JyldlwsC3lQrp66Rk/Gf9oBmE4eN82ix9bLmBoyjg4qyxF ndZw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=QORicF9P; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a52si1240912eda.450.2019.11.07.04.33.20; Thu, 07 Nov 2019 04:33:44 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=QORicF9P; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388427AbfKGMcd (ORCPT + 99 others); Thu, 7 Nov 2019 07:32:33 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43716 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727619AbfKGMcd (ORCPT ); Thu, 7 Nov 2019 07:32:33 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=5b6dL4GU7b0E4m+87rHIiqniNskuz8XJPlTDMY3zCsI=; b=QORicF9PleMxv8poQOXsiir5J 1sBpNIs3/HKgGcSlGkkjPCArd7lTJ3NQI+txxI6GwMpcVaMyjWtACK2KlyPqXfuP/39LD8+ffKN9p fB/x73hq9tYzxCBYGYNufCDbWOLmqxtHWxvWursQSwMSKVgLjqldH7LM94UhsTH4Vl2kZQ+eXOrEO XQAIp1FFk/ZgutTuRWHIqXgHmF35kWbPvJdayt7DueI6Zh3uDG6rgnN5s0pwSlIaBnauPSftVYFCn dcObLkppz20g10ag6fBJUCr6a30vuKGSYDXedxFEXyKsQxu/w0wluU0c19jq7v5JCCttgU65+aDPv O5Npvlt8A==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1iSgxf-00080d-5M; Thu, 07 Nov 2019 12:32:27 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id DFAC0300692; Thu, 7 Nov 2019 13:31:19 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id 57BED2025EDB3; Thu, 7 Nov 2019 13:32:24 +0100 (CET) Date: Thu, 7 Nov 2019 13:32:24 +0100 From: Peter Zijlstra To: Jan Stancek Cc: linux-kernel@vger.kernel.org, ltp@lists.linux.it, viro@zeniv.linux.org.uk, kstewart@linuxfoundation.org, gregkh@linuxfoundation.org, tglx@linutronix.de, rfontana@redhat.com Subject: Re: [PATCH] kernel: use ktime_get_real_ts64() to calculate acct.ac_btime Message-ID: <20191107123224.GA6315@hirez.programming.kicks-ass.net> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Nov 02, 2019 at 12:39:24AM +0100, Jan Stancek wrote: > fill_ac() calculates process creation time from current time as: > ac->ac_btime = get_seconds() - elapsed > > get_seconds() doesn't accumulate nanoseconds as regular time getters. > This creates race for user-space (e.g. LTP acct02), which typically > uses gettimeofday(), because process creation time sometimes appear > to be dated 'in past': > > acct("myfile"); > time_t start_time = time(NULL); > if (fork()==0) { > sleep(1); > exit(0); > } > waitpid(NULL); > acct(NULL); > > // acct.ac_btime == 1572616777 > // start_time == 1572616778 > Lets start by saying this accounting stuff is terrible crap and it deserves to fail and burn. The only spec I found for what it actually wants in those fields is Documentation/accounting/taskstats-struct.rst, that states: /* The time when a task begins, in [secs] since 1970. */ __u32 ac_btime; /* Begin time [sec since 1970] */ /* The elapsed time of a task, in [usec]. */ __u64 ac_etime; /* Elapsed time [usec] */ But that is not really well defined. As implemented etime does not include suspend time (maybe on purpose, maybe not). And what does btime want? As implemented it jumps around if you ask the question twice with an adjtime() call or suspend in between. Of course, if we take an actual CLOCK_REALTIME timestamp at fork() the value doesn't change, but then it can be in the future (DST,adjtime()), which is exactly the reason why CLOCK_REALTIME is absolute shit for timestamps (logging, accounting, etc.). And your 'fix' is pretty terible too. Arguably ktime_get_seconds() wants fixing for not having the ns accumulation and actually differing from tv_sec, but now you accrue one source of ns while still disregarding another (also, I friggin hate timespec, it's a terrible interface for time). All in all, I'm tempted to just declare this stuff broken and -EWONTFIX, but if we have to do something, something like the below is at least internally consistent. --- diff --git a/kernel/tsacct.c b/kernel/tsacct.c index 7be3e7530841..76d6325c2724 100644 --- a/kernel/tsacct.c +++ b/kernel/tsacct.c @@ -23,18 +23,31 @@ void bacct_add_tsk(struct user_namespace *user_ns, { const struct cred *tcred; u64 utime, stime, utimescaled, stimescaled; - u64 delta; + u64 mono, real, btime; BUILD_BUG_ON(TS_COMM_LEN < TASK_COMM_LEN); + mono = ktime_get_ns(); + real = ktime_get_real_ns(); + /* calculate task elapsed time in nsec */ - delta = ktime_get_ns() - tsk->start_time; + delta = mono - tsk->start_time; /* Convert to micro seconds */ do_div(delta, NSEC_PER_USEC); stats->ac_etime = delta; - /* Convert to seconds for btime */ - do_div(delta, USEC_PER_SEC); - stats->ac_btime = get_seconds() - delta; + + /* + * Compute btime by subtracting the elapsed time from the current + * CLOCK_REALTIME. + * + * XXX totally buggered, because it changes results across + * adjtime() calls and suspend/resume. + */ + delta = mono - tsk->start_time; // elapsed in ns + btime = real - delta; // real ns - elapsed ns + do_div(btime, NSEC_PER_SEC); // truncated to seconds + stats->ac_btime = btime; + if (thread_group_leader(tsk)) { stats->ac_exitcode = tsk->exit_code; if (tsk->flags & PF_FORKNOEXEC)