Received: by 10.223.164.221 with SMTP id h29csp1977565wrb; Thu, 2 Nov 2017 04:04:52 -0700 (PDT) X-Google-Smtp-Source: ABhQp+QG9XCOlMhs+0EWsDwQn5Zc804mOaIcTIYXh6JJFWrFNZr/rqd3vL07RhOgQtl0Mrdsj/8m X-Received: by 10.99.124.75 with SMTP id l11mr3205405pgn.453.1509620692852; Thu, 02 Nov 2017 04:04:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1509620692; cv=none; d=google.com; s=arc-20160816; b=FV+rClcQElVYJD4wkLvWZrvyzEkQtNeQUWzateh4o9g41cy4p6r1+dMMwD5TS25WY0 Fj7vAqBHcuWnQA0vieEow0Vu2dHoQ9lkNDoYywFMBNb9d/JFNEO2+nDvf+N4MHiWoOl7 kQJL3lviLDCZbMwKdS0LaSoRgLaIpkweoplwb2J0OxOrJMJC/cRhALHlN9KHfTuxm5WW vEUvYtnsXzSGwjXZ6bc7c3YPahhaAf+m/8W2y3cczeI25QDOK2bAjx1hxsHvg6LRo6hY sbddQDZF9KUMk1yIUxddgtQHT4cdwMabn5XbJ/NRJsv/hZFS8q5ElLNE/ZuhauOvJAJd BqQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:arc-authentication-results; bh=9BP1k0yfHOciv5b3rztPEPv7nvIj8eOPRYYG/AvoS/E=; b=TChBxkFWYGQzWbJ1fHdUi+s2ixKvPLxfpEkIcWSBs80AP/cB0AHv0c2etH4+5JqYfd hQM5GcOgtPUqSsTBPVuH1rneVmOLHpZqxZMw3V+GaOBxWhOa8BREDdxvoCyrQoOp0ARZ JJlNalbLLVXxRh1q9Zbv7r0aXxT2azIOFILI+pTQVFnF8dJqBNcG2fXFGUqfUOs7Mfsq +aFOFgWhWY5oeePvbNpO5djfL7QZm8FFTURF0UeQVfvZ5PHn+I3fCRmsZAlYKyUKNt75 jeJe5Fb4z9Z2f7dy1dNpsShWo4DmmIP1avctYmCiLNOngnk2tlw8SzwG00rV7Oarx8Ab If2w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j17si3407718pgv.452.2017.11.02.04.04.37; Thu, 02 Nov 2017 04:04:52 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755490AbdKBLEB (ORCPT + 99 others); Thu, 2 Nov 2017 07:04:01 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:58958 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752780AbdKBLD6 (ORCPT ); Thu, 2 Nov 2017 07:03:58 -0400 Received: from mail-wr0-f197.google.com ([209.85.128.197]) by youngberry.canonical.com with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1eADHx-0001B9-U5 for linux-kernel@vger.kernel.org; Thu, 02 Nov 2017 11:03:57 +0000 Received: by mail-wr0-f197.google.com with SMTP id u98so2823443wrb.4 for ; Thu, 02 Nov 2017 04:03:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9BP1k0yfHOciv5b3rztPEPv7nvIj8eOPRYYG/AvoS/E=; b=SupKDPYMLAaUPxv1b+yUtRzuGmfg5XB5AbcLiKJ5JvS7eEoxg+G1WT8caJJ7DHdsOP /RYBbWdLuh5PtUVAlKboFIedgL1hbuGTmd3Wc3ddDT2XUKEXaTQ/2D0Xm2frqxgeokl4 grOnR/BJo0s1sCO2mQVXIQwZZ/eJENKm3E1oSkB64n2xwOHGOdbJCTAfJAQsvlo1ZRs4 Pkywx1Lek5j+mksfM31c6Fi6Zpo9heII7QLceyi0EMXwPwi8j2isYsLcodIaPXxTT/X7 skG0D527Q1hETPDtjbjmuHSeBvMXuXwHDn7RH8N3ZEyhQibukd+ToeNbzCdPMMhljcZ1 t88A== X-Gm-Message-State: AMCzsaXWtiS22URHTPalMJc0ipy2OPIYGKZSzSx4G0r+3Mg+74YM68JA BE1vavLKgFTUqA4xXvKjH0Ku9KJDLMuvOvxEGxgjcDBPo6bn0G6fU/NgYdHj0SWGghpeGt4ZfKf 0Ps6VORqTtftvyzG81lALeOlDBHucoNUc54DLn6EjeA== X-Received: by 10.80.158.207 with SMTP id a73mr3954251edf.90.1509620637580; Thu, 02 Nov 2017 04:03:57 -0700 (PDT) X-Received: by 10.80.158.207 with SMTP id a73mr3954233edf.90.1509620637331; Thu, 02 Nov 2017 04:03:57 -0700 (PDT) Received: from localhost.localdomain (x59cc8a21.dyn.telefonica.de. [89.204.138.33]) by smtp.gmail.com with ESMTPSA id p45sm3235138edc.30.2017.11.02.04.03.56 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 02 Nov 2017 04:03:56 -0700 (PDT) From: Christian Brauner X-Google-Original-From: Christian Brauner To: ebiederm@xmission.com, linux-kernel@vger.kernel.org, peterz@infradead.org Cc: containers@lists.linux-foundation.org, nborisov@suse.com, serge@hallyn.com, tycho@tycho.ws, Christian Brauner Subject: [PATCH 1/1] userns: Fix/clarify memory ordering Date: Thu, 2 Nov 2017 12:03:44 +0100 Message-Id: <20171102110344.31647-2-christian.brauner@ubuntu.com> X-Mailer: git-send-email 2.14.1 In-Reply-To: <20171102110344.31647-1-christian.brauner@ubuntu.com> References: <20171102110344.31647-1-christian.brauner@ubuntu.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Nikolay noticed a number of undocumented memory barriers in this code; the ordering is fairly simple but not explicitly described. Cure that. Switch over to smp_store_release() / smp_load_acquire() as that is the natural fit for the pattern and includes the missing but required WRITE_ONCE()/READ_ONCE()s. CC: Eric Biederman Cc: Linux Containers Reported-by: Nikolay Borisov Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Christian Brauner --- kernel/user_namespace.c | 74 +++++++++++++++++++++++++++++++------------------ 1 file changed, 47 insertions(+), 27 deletions(-) diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 899c31060ff3..2129762a930e 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -27,8 +27,47 @@ #include static struct kmem_cache *user_ns_cachep __read_mostly; + +/** + * The userns_state_mutex serializes all writes to any given map. + * + * Any map is only ever written once. + * + * An id map fits within 1 cache line on most architectures. + */ static DEFINE_MUTEX(userns_state_mutex); +/** + * There is a data dependency between reading the count of the extents and the + * values of the extents. The desired behavior is to see the values of the + * extents that were written before the count of the extents. + * + * To achieve this smp_store_release() is used to guarantee the write order and + * smp_load_acquire() is guaranteed that we observe the written data. + */ +static inline void map_store_extents(struct uid_gid_map *map, + unsigned int extents) +{ + /* + * Ensure the map->extent[] stores happen-before we grow map->nr_extents + * to cover it. + * + * Matches the load_acquire in map_load_extents(). + */ + smp_store_release(&map->nr_extents, extents); +} + +static inline unsigned int map_load_extents(struct uid_gid_map *map) +{ + /* + * Ensure the map->nr_extents load happens-before we try and access + * map->extent[], such that we guarantee the data is in fact there. + * + * Matches the store-release in map_store_extents(). + */ + return smp_load_acquire(&map->nr_extents); +} + static bool new_idmap_permitted(const struct file *file, struct user_namespace *ns, int cap_setid, struct uid_gid_map *map); @@ -296,9 +335,9 @@ map_id_range_down_base(unsigned extents, struct uid_gid_map *map, u32 id, u32 co static u32 map_id_range_down(struct uid_gid_map *map, u32 id, u32 count) { struct uid_gid_extent *extent; - unsigned extents = map->nr_extents; - smp_rmb(); + unsigned extents; + extents = map_load_extents(map); if (extents <= UID_GID_MAP_MAX_BASE_EXTENTS) extent = map_id_range_down_base(extents, map, id, count); else @@ -359,9 +398,9 @@ map_id_up_max(unsigned extents, struct uid_gid_map *map, u32 id) static u32 map_id_up(struct uid_gid_map *map, u32 id) { struct uid_gid_extent *extent; - unsigned extents = map->nr_extents; - smp_rmb(); + unsigned extents; + extents = map_load_extents(map); if (extents <= UID_GID_MAP_MAX_BASE_EXTENTS) extent = map_id_up_base(extents, map, id); else @@ -647,9 +686,9 @@ static void *m_start(struct seq_file *seq, loff_t *ppos, struct uid_gid_map *map) { loff_t pos = *ppos; - unsigned extents = map->nr_extents; - smp_rmb(); + unsigned extents; + extents = map_load_extents(map); if (pos >= extents) return NULL; @@ -860,25 +899,6 @@ static ssize_t map_write(struct file *file, const char __user *buf, char *kbuf = NULL, *pos, *next_line; ssize_t ret = -EINVAL; - /* - * The userns_state_mutex serializes all writes to any given map. - * - * Any map is only ever written once. - * - * An id map fits within 1 cache line on most architectures. - * - * On read nothing needs to be done unless you are on an - * architecture with a crazy cache coherency model like alpha. - * - * There is a one time data dependency between reading the - * count of the extents and the values of the extents. The - * desired behavior is to see the values of the extents that - * were written before the count of the extents. - * - * To achieve this smp_wmb() is used on guarantee the write - * order and smp_rmb() is guaranteed that we don't have crazy - * architectures returning stale data. - */ mutex_lock(&userns_state_mutex); memset(&new_map, 0, sizeof(struct uid_gid_map)); @@ -1015,8 +1035,8 @@ static ssize_t map_write(struct file *file, const char __user *buf, map->forward = new_map.forward; map->reverse = new_map.reverse; } - smp_wmb(); - map->nr_extents = new_map.nr_extents; + + map_store_extents(map, new_map.nr_extents); *ppos = count; ret = count; -- 2.14.1 From 1582973924464363570@xxx Thu Nov 02 16:52:55 +0000 2017 X-GM-THRID: 1581796008649859051 X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread