From: Dmitry Safonov
To: linux-kernel@vger.kernel.org
Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Andrei Vagin, Dmitry Safonov, Adrian Reber, Andrei Vagin, Andy Lutomirski, Arnd Bergmann, Christian Brauner, Cyrill Gorcunov, "Eric W. Biederman", "H.
 Peter Anvin", Ingo Molnar, Jann Horn, Jeff Dike, Oleg Nesterov, Pavel Emelyanov, Shuah Khan, Thomas Gleixner, Vincenzo Frascino, containers@lists.linux-foundation.org, criu@openvz.org, linux-api@vger.kernel.org, x86@kernel.org
Subject: [PATCHv5 30/37] fs/proc: Introduce /proc/pid/timens_offsets
Date: Mon, 29 Jul 2019 22:57:12 +0100
Message-Id: <20190729215758.28405-31-dima@arista.com>
X-Mailer: git-send-email 2.22.0
In-Reply-To: <20190729215758.28405-1-dima@arista.com>
References: <20190729215758.28405-1-dima@arista.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andrei Vagin

API to set time namespace offsets for child processes, i.e.:
echo "clockid off_sec off_nsec" > /proc/self/timens_offsets

Signed-off-by: Andrei Vagin
Co-developed-by: Dmitry Safonov
Signed-off-by: Dmitry Safonov
---
 fs/proc/base.c                 |  95 ++++++++++++++++++++++++++++++
 include/linux/time_namespace.h |  10 ++++
 kernel/time_namespace.c        | 104 +++++++++++++++++++++++++++++++++
 3 files changed, 209 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ebea9501afb8..1d2007365e87 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -94,6 +94,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "internal.h"
 #include "fd.h"
@@ -1533,6 +1534,97 @@ static const struct file_operations proc_pid_sched_autogroup_operations = {
 
 #endif /* CONFIG_SCHED_AUTOGROUP */
 
+#ifdef CONFIG_TIME_NS
+static int timens_offsets_show(struct seq_file *m, void *v)
+{
+	struct task_struct *p;
+
+	p = get_proc_task(file_inode(m->file));
+	if (!p)
+		return -ESRCH;
+	proc_timens_show_offsets(p, m);
+
+	put_task_struct(p);
+
+	return 0;
+}
+
+static ssize_t
+timens_offsets_write(struct file *file, const char __user *buf,
+	    size_t count, loff_t *ppos)
+{
+	struct inode *inode = file_inode(file);
+	struct proc_timens_offset offsets[2];
+	char *kbuf = NULL, *pos, *next_line;
+	struct task_struct *p;
+	int ret, noffsets;
+
+	/* Only allow < page size writes at the beginning of the file */
+	if ((*ppos != 0) || (count >= PAGE_SIZE))
+		return -EINVAL;
+
+	/* Slurp in the user data */
+	kbuf = memdup_user_nul(buf, count);
+	if (IS_ERR(kbuf))
+		return PTR_ERR(kbuf);
+
+	/* Parse the user data */
+	ret = -EINVAL;
+	noffsets = 0;
+	for (pos = kbuf; pos; pos = next_line) {
+		struct proc_timens_offset *off = &offsets[noffsets];
+		int err;
+
+		/* Find the end of line and ensure we don't look past it */
+		next_line = strchr(pos, '\n');
+		if (next_line) {
+			*next_line = '\0';
+			next_line++;
+			if (*next_line == '\0')
+				next_line = NULL;
+		}
+
+		err = sscanf(pos, "%u %lld %lu", &off->clockid,
+				&off->val.tv_sec, &off->val.tv_nsec);
+		if (err != 3 || off->val.tv_nsec >= NSEC_PER_SEC)
+			goto out;
+		noffsets++;
+		if (noffsets == ARRAY_SIZE(offsets)) {
+			if (next_line)
+				count = next_line - kbuf;
+			break;
+		}
+	}
+
+	ret = -ESRCH;
+	p = get_proc_task(inode);
+	if (!p)
+		goto out;
+	ret = proc_timens_set_offset(file, p, offsets, noffsets);
+	put_task_struct(p);
+	if (ret)
+		goto out;
+
+	ret = count;
+out:
+	kfree(kbuf);
+	return ret;
+}
+
+static int timens_offsets_open(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, timens_offsets_show, inode);
+}
+
+static const struct file_operations proc_timens_offsets_operations = {
+	.open = timens_offsets_open,
+	.read = seq_read,
+	.write = timens_offsets_write,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+#endif /* CONFIG_TIME_NS */
+
 static ssize_t comm_write(struct file *file, const char __user *buf,
 				size_t count, loff_t *offset)
 {
@@ -3015,6 +3107,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 #endif
 #ifdef CONFIG_SCHED_AUTOGROUP
 	REG("autogroup", S_IRUGO|S_IWUSR, proc_pid_sched_autogroup_operations),
+#endif
+#ifdef CONFIG_TIME_NS
+	REG("timens_offsets", S_IRUGO|S_IWUSR, proc_timens_offsets_operations),
 #endif
 	REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
 #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
diff --git a/include/linux/time_namespace.h b/include/linux/time_namespace.h
index 9ba9664ff0ab..3f4d457ff0dc 100644
--- a/include/linux/time_namespace.h
+++ b/include/linux/time_namespace.h
@@ -40,6 +40,16 @@ static inline void put_time_ns(struct time_namespace *ns)
 	kref_put(&ns->kref, free_time_ns);
 }
 
+extern void proc_timens_show_offsets(struct task_struct *p, struct seq_file *m);
+
+struct proc_timens_offset {
+	int clockid;
+	struct timespec64 val;
+};
+
+extern int proc_timens_set_offset(struct file *file, struct task_struct *p,
+				struct proc_timens_offset *offsets, int n);
+
 static inline void timens_add_monotonic(struct timespec64 *ts)
 {
 	struct timens_offsets *ns_offsets = current->nsproxy->time_ns->offsets;
diff --git a/kernel/time_namespace.c b/kernel/time_namespace.c
index 4b2eb92ad595..e17b8569bead 100644
--- a/kernel/time_namespace.c
+++ b/kernel/time_namespace.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -249,6 +250,109 @@ static struct user_namespace *timens_owner(struct ns_common *ns)
 	return to_time_ns(ns)->user_ns;
 }
 
+static void show_offset(struct seq_file *m, int clockid, struct timespec64 *ts)
+{
+	seq_printf(m, "%d %lld %ld\n", clockid, ts->tv_sec, ts->tv_nsec);
+}
+
+void proc_timens_show_offsets(struct task_struct *p, struct seq_file *m)
+{
+	struct ns_common *ns;
+	struct time_namespace *time_ns;
+	struct timens_offsets *ns_offsets;
+
+	ns = timens_for_children_get(p);
+	if (!ns)
+		return;
+	time_ns = to_time_ns(ns);
+
+	if (!time_ns->offsets) {
+		put_time_ns(time_ns);
+		return;
+	}
+	ns_offsets = time_ns->offsets;
+
+	show_offset(m, CLOCK_MONOTONIC, &ns_offsets->monotonic);
+	show_offset(m, CLOCK_BOOTTIME, &ns_offsets->boottime);
+	put_time_ns(time_ns);
+}
+
+int proc_timens_set_offset(struct file *file, struct task_struct *p,
+			   struct proc_timens_offset *offsets, int noffsets)
+{
+	struct ns_common *ns;
+	struct time_namespace *time_ns;
+	struct timens_offsets *ns_offsets;
+	struct timespec64 *offset;
+	struct timespec64 tp;
+	int i, err;
+
+	ns = timens_for_children_get(p);
+	if (!ns)
+		return -ESRCH;
+	time_ns = to_time_ns(ns);
+
+	if (!time_ns->offsets || time_ns->initialized ||
+	    !file_ns_capable(file, time_ns->user_ns, CAP_SYS_TIME)) {
+		put_time_ns(time_ns);
+		return -EPERM;
+	}
+	ns_offsets = time_ns->offsets;
+
+	for (i = 0; i < noffsets; i++) {
+		struct proc_timens_offset *off = &offsets[i];
+
+		switch (off->clockid) {
+		case CLOCK_MONOTONIC:
+			ktime_get_ts64(&tp);
+			break;
+		case CLOCK_BOOTTIME:
+			ktime_get_boottime_ts64(&tp);
+			break;
+		default:
+			err = -EINVAL;
+			goto out;
+		}
+
+		err = -ERANGE;
+
+		if (off->val.tv_sec > KTIME_SEC_MAX || off->val.tv_sec < -KTIME_SEC_MAX)
+			goto out;
+
+		tp = timespec64_add(tp, off->val);
+		/*
+		 * KTIME_SEC_MAX is divided by 2 to be sure that KTIME_MAX is
+		 * still unreachable.
+		 */
+		if (tp.tv_sec < 0 || tp.tv_sec > KTIME_SEC_MAX / 2)
+			goto out;
+	}
+
+	err = 0;
+	/* don't report errors after this line */
+	for (i = 0; i < noffsets; i++) {
+		struct proc_timens_offset *off = &offsets[i];
+
+		switch (off->clockid) {
+		case CLOCK_MONOTONIC:
+			offset = &ns_offsets->monotonic;
+			break;
+		case CLOCK_BOOTTIME:
+			offset = &ns_offsets->boottime;
+			break;
+		default:
+			goto out;
+		}
+
+		*offset = off->val;
+	}
+
+out:
+	put_time_ns(time_ns);
+
+	return err;
+}
+
 const struct proc_ns_operations timens_operations = {
 	.name = "time",
 	.type = CLONE_NEWTIME,
--
2.22.0
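
For illustration only, here is a minimal userspace sketch of the "clockid off_sec off_nsec" write format described in the commit message. It is not part of the patch. The helper names format_timens_offset() and write_timens_offsets() are hypothetical. The sketch assumes numeric clock IDs (on Linux, CLOCK_MONOTONIC is 1 and CLOCK_BOOTTIME is 7) and sends all lines in one write(), because timens_offsets_write() returns -EINVAL for a write at a non-zero file position. Rejecting a negative off_nsec is a choice made by this sketch, not something taken from the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: format one "clockid off_sec off_nsec" line.
 * Returns the line length, or -1 if nsec is out of range or the
 * buffer is too small to hold the whole line.
 */
int format_timens_offset(char *buf, size_t size, int clockid,
			 long long sec, long nsec)
{
	int n;

	assert(buf != NULL);
	if (nsec < 0 || nsec >= NSEC_PER_SEC)
		return -1;
	n = snprintf(buf, size, "%d %lld %ld\n", clockid, sec, nsec);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}

/*
 * Hypothetical helper: send all lines in a single write() so the
 * kernel handler sees them at file position 0.
 * Returns 0 on success, -1 on failure.
 */
int write_timens_offsets(const char *path, const char *lines)
{
	size_t len;
	ssize_t written;
	int fd;

	assert(lines != NULL);
	len = strlen(lines);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	written = write(fd, lines, len);
	close(fd);
	return (written == (ssize_t)len) ? 0 : -1;
}
```

A caller might format up to two lines (one per clock) into one buffer and pass it to write_timens_offsets("/proc/self/timens_offsets", buf). The handler parses at most two entries and, per proc_timens_set_offset(), returns -EPERM once the namespace is marked initialized or when the opener lacks CAP_SYS_TIME in the owning user namespace.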