From: Dmitry Safonov
To: linux-kernel@vger.kernel.org
Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov, Adrian Reber,
	Andrei Vagin, Andy Lutomirski, Arnd Bergmann, Christian Brauner,
	Cyrill Gorcunov, "Eric W. Biederman", "H. Peter Anvin", Ingo Molnar,
	Jann Horn, Jeff Dike, Oleg Nesterov, Pavel Emelyanov, Shuah Khan,
	Thomas Gleixner, Vincenzo Frascino,
	containers@lists.linux-foundation.org, criu@openvz.org,
	linux-api@vger.kernel.org, x86@kernel.org, Andrei Vagin
Subject: [PATCHv7 25/33] x86/vdso: Zap vvar pages when switching to a time namespace
Date: Fri, 11 Oct 2019 02:23:33 +0100
Message-Id: <20191011012341.846266-26-dima@arista.com>
In-Reply-To: <20191011012341.846266-1-dima@arista.com>
References: <20191011012341.846266-1-dima@arista.com>

The VVAR page layout depends on whether a task belongs to the root or
non-root time namespace.
Whenever a task changes its namespace, the VVAR page tables are cleared
and will be re-faulted with the corresponding layout.

Co-developed-by: Andrei Vagin
Signed-off-by: Andrei Vagin
Signed-off-by: Dmitry Safonov
---
 arch/x86/entry/vdso/vma.c      | 27 +++++++++++++++++++++++++++
 include/linux/time_namespace.h |  3 +++
 kernel/time/namespace.c        | 10 ++++++++++
 3 files changed, 40 insertions(+)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index d6cb8a16f368..57ada3e95f8d 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -50,6 +50,7 @@ void __init init_vdso_image(const struct vdso_image *image)
 						image->alt_len));
 }
 
+static const struct vm_special_mapping vvar_mapping;
 struct linux_binprm;
 
 static vm_fault_t vdso_fault(const struct vm_special_mapping *sm,
@@ -127,6 +128,32 @@ static struct page *find_timens_vvar_page(struct vm_area_struct *vma)
 
 	return NULL;
 }
+
+/*
+ * The vvar page layout depends on whether a task belongs to the root or
+ * non-root time namespace. Whenever a task changes its namespace, the VVAR
+ * page tables are cleared and then they will be re-faulted with a
+ * corresponding layout.
+ * See also the comment near timens_setup_vdso_data() for details.
+ */
+int vdso_join_timens(struct task_struct *task, struct time_namespace *ns)
+{
+	struct mm_struct *mm = task->mm;
+	struct vm_area_struct *vma;
+
+	if (down_write_killable(&mm->mmap_sem))
+		return -EINTR;
+
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		unsigned long size = vma->vm_end - vma->vm_start;
+
+		if (vma_is_special_mapping(vma, &vvar_mapping))
+			zap_page_range(vma, vma->vm_start, size);
+	}
+
+	up_write(&mm->mmap_sem);
+	return 0;
+}
 #else
 static inline struct page *find_timens_vvar_page(struct vm_area_struct *vma)
 {
diff --git a/include/linux/time_namespace.h b/include/linux/time_namespace.h
index c479cfda2c3e..dcf3dbf2836b 100644
--- a/include/linux/time_namespace.h
+++ b/include/linux/time_namespace.h
@@ -30,6 +30,9 @@ struct time_namespace {
 extern struct time_namespace init_time_ns;
 
 #ifdef CONFIG_TIME_NS
+extern int vdso_join_timens(struct task_struct *task,
+			    struct time_namespace *ns);
+
 static inline struct time_namespace *get_time_ns(struct time_namespace *ns)
 {
 	kref_get(&ns->kref);
diff --git a/kernel/time/namespace.c b/kernel/time/namespace.c
index e14cd1ca387d..0dc0742ed1ee 100644
--- a/kernel/time/namespace.c
+++ b/kernel/time/namespace.c
@@ -280,6 +280,7 @@ static void timens_put(struct ns_common *ns)
 static int timens_install(struct nsproxy *nsproxy, struct ns_common *new)
 {
 	struct time_namespace *ns = to_time_ns(new);
+	int err;
 
 	if (!current_is_single_threaded())
 		return -EUSERS;
@@ -290,6 +291,10 @@ static int timens_install(struct nsproxy *nsproxy, struct ns_common *new)
 
 	timens_set_vvar_page(current, ns);
 
+	err = vdso_join_timens(current, ns);
+	if (err)
+		return err;
+
 	get_time_ns(ns);
 	put_time_ns(nsproxy->time_ns);
 	nsproxy->time_ns = ns;
@@ -304,6 +309,7 @@ int timens_on_fork(struct nsproxy *nsproxy, struct task_struct *tsk)
 {
 	struct ns_common *nsc = &nsproxy->time_ns_for_children->ns;
 	struct time_namespace *ns = to_time_ns(nsc);
+	int err;
 
 	/* create_new_namespaces() already incremented the ref counter */
 	if (nsproxy->time_ns == nsproxy->time_ns_for_children)
@@ -311,6 +317,10 @@ int timens_on_fork(struct nsproxy *nsproxy, struct task_struct *tsk)
 
 	timens_set_vvar_page(tsk, ns);
 
+	err = vdso_join_timens(tsk, ns);
+	if (err)
+		return err;
+
 	get_time_ns(ns);
 	put_time_ns(nsproxy->time_ns);
 	nsproxy->time_ns = ns;
-- 
2.23.0
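
For readers who want the idea without the mm details, the zap-and-re-fault
behaviour described in the changelog can be modelled in plain user-space C.
This is only a toy sketch under loud assumptions: toy_task, toy_fault(),
toy_read_vvar() and toy_join_timens() are hypothetical names invented for
illustration, not kernel interfaces, and a single boolean stands in for the
VVAR page table entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: which layout the "VVAR mapping" currently exposes. */
enum vvar_layout { LAYOUT_ROOT, LAYOUT_TIMENS };

struct toy_task {
	bool in_timens;           /* task lives in a non-root time namespace */
	bool vvar_present;        /* stand-in for populated page table entries */
	enum vvar_layout mapped;  /* layout installed by the most recent fault */
	unsigned int faults;      /* how many faults have been taken */
};

/* Fault path: install the layout that matches the current namespace. */
void toy_fault(struct toy_task *t)
{
	assert(t != NULL);
	t->mapped = t->in_timens ? LAYOUT_TIMENS : LAYOUT_ROOT;
	t->vvar_present = true;
	t->faults++;
}

/* Access path: fault lazily, but only when nothing is mapped yet. */
enum vvar_layout toy_read_vvar(struct toy_task *t)
{
	assert(t != NULL);
	if (!t->vvar_present)
		toy_fault(t);
	return t->mapped;
}

/*
 * Namespace switch. With zap == true the mapping is dropped, so the next
 * access re-faults with the new layout. With zap == false the old entries
 * stay in place and the task keeps seeing the previous layout.
 */
void toy_join_timens(struct toy_task *t, bool zap)
{
	assert(t != NULL);
	t->in_timens = true;
	if (zap)
		t->vvar_present = false;
}
```

In this model, reading before and after toy_join_timens(t, true) takes two
faults and ends on LAYOUT_TIMENS. With zap == false the second read still
returns the stale LAYOUT_ROOT, which is the situation the zap is meant to
avoid.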