From: Jann Horn
Date: Tue, 9 Oct 2018 14:43:56 +0200
Subject: Re: [RFC v5 1/1] ns: add binfmt_misc to the user namespace
To: Laurent Vivier
Cc: kernel list, "Eric W. Biederman", dima@arista.com, Linux API, James Bottomley, Al Viro, linux-fsdevel@vger.kernel.org, avagin@gmail.com, containers@lists.linux-foundation.org
References: <20181009103752.21482-1-laurent@vivier.eu> <20181009103752.21482-2-laurent@vivier.eu>
In-Reply-To: <20181009103752.21482-2-laurent@vivier.eu>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 9, 2018 at 12:38 PM Laurent Vivier wrote:
> This patch allows to have a different binfmt_misc configuration
> for each new user namespace. By default, the binfmt_misc configuration
> is the one of the previous level, but if the binfmt_misc filesystem is
> mounted in the new namespace a new empty binfmt instance is created and
> used in this namespace.
>
> For instance, using "unshare" we can start a chroot of an another
> architecture and configure the binfmt_misc interpreter without being root
> to run the binaries in this chroot.
[...]
> @@ -823,12 +847,34 @@ static const struct super_operations s_ops = {
>  static int bm_fill_super(struct super_block *sb, void *data, int silent)
>  {
>  	int err;
> +	struct user_namespace *ns = sb->s_user_ns;
>  	static const struct tree_descr bm_files[] = {
>  		[2] = {"status", &bm_status_operations, S_IWUSR|S_IRUGO},
>  		[3] = {"register", &bm_register_operations, S_IWUSR},
>  		/* last one */ {""}
>  	};
>
> +	/* create a new binfmt namespace
> +	 * if we are not in the first user namespace
> +	 * but the binfmt namespace is the first one
> +	 */
> +	if (READ_ONCE(ns->binfmt_ns) == NULL) {
> +		struct binfmt_namespace *new_ns;
> +
> +		new_ns = kmalloc(sizeof(struct binfmt_namespace),
> +				 GFP_KERNEL);
> +		if (new_ns == NULL)
> +			return -ENOMEM;
> +		INIT_LIST_HEAD(&new_ns->entries);
> +		new_ns->enabled = 1;
> +		rwlock_init(&new_ns->entries_lock);
> +		new_ns->bm_mnt = NULL;
> +		new_ns->entry_count = 0;
> +		/* ensure new_ns is completely initialized before sharing it */
> +		smp_wmb();
> +		WRITE_ONCE(ns->binfmt_ns, new_ns);
> +	}

You're still not preventing a race between two concurrent mount() calls,
right? What prevents two instances of this code block from running
concurrently in the same namespace? Both would see ns->binfmt_ns == NULL,
both would allocate, and the second WRITE_ONCE() would overwrite (and leak)
the first instance. I think you want to take some sort of global lock
around this.
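
As a rough illustration of the global-lock idea, here is a userspace sketch
using pthreads. It is not kernel code and not the patch author's
implementation: the struct definitions are simplified stand-ins, and the
names binfmt_ns_lock, binfmt_ns_allocs and binfmt_ns_get_or_create() are
hypothetical. The point is only that the NULL check and the publishing
store happen under the same lock, so at most one instance is ever created
per user namespace.

```c
#include <pthread.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-ins for the kernel structures in the patch. */
struct binfmt_namespace {
	int enabled;
	int entry_count;
};

struct user_namespace {
	struct binfmt_namespace *binfmt_ns;
};

/*
 * One global lock serializing the check-and-publish step. This is an
 * assumption for illustration; kernel code would use its own mutex or
 * spinlock primitive instead of pthreads.
 */
static pthread_mutex_t binfmt_ns_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts allocations so a caller can confirm only one instance was created. */
static int binfmt_ns_allocs;

/* Returns 0 on success, -1 if the allocation fails. */
static int binfmt_ns_get_or_create(struct user_namespace *ns)
{
	int ret = 0;

	pthread_mutex_lock(&binfmt_ns_lock);
	if (ns->binfmt_ns == NULL) {
		struct binfmt_namespace *new_ns = malloc(sizeof(*new_ns));

		if (new_ns == NULL) {
			ret = -1;
		} else {
			new_ns->enabled = 1;
			new_ns->entry_count = 0;
			binfmt_ns_allocs++;
			/*
			 * Published while still holding the lock, so a second
			 * caller cannot observe NULL and allocate again.
			 */
			ns->binfmt_ns = new_ns;
		}
	}
	pthread_mutex_unlock(&binfmt_ns_lock);
	return ret;
}
```

Calling binfmt_ns_get_or_create() twice on the same struct user_namespace
leaves the first instance in place and allocates only once. Lockless
readers elsewhere would still need the barrier pairing from the patch; the
lock only covers the creation path.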