From: "Michael Kerrisk (man-pages)"
To: "Eric W. Biederman"
Cc: mtk.manpages@gmail.com, Christian Brauner, linux-man, Containers, lkml, Andy Lutomirski, Jordan Ogas, werner@almesberger.net, Al Viro
Subject: Re: pivot_root(".", ".") and the fchdir() dance
Date: Mon, 23 Sep 2019 13:10:56 +0200
In-Reply-To: <87ef0hbezt.fsf@x220.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Eric,

On 9/15/19 8:17 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" writes:
>
>> Hello Eric,
>>
>> On 9/11/19 1:06 AM, Eric W. Biederman wrote:
>>> "Michael Kerrisk (man-pages)" writes:
>>>
>>>> Hello Christian,
>>>>
>>>>>> All: I plan to add the following text to the manual page:
>>>>>>
>>>>>>     new_root and put_old may be the same directory. In particular,
>>>>>>     the following sequence allows a pivot-root operation without
>>>>>>     needing to create and remove a temporary directory:
>>>>>>
>>>>>>         chdir(new_root);
>>>>>>         pivot_root(".", ".");
>>>>>>         umount2(".", MNT_DETACH);
>>>>>
>>>>> Hm, should we mention that MS_PRIVATE or MS_SLAVE is usually needed
>>>>> before the umount2()? Especially for the container case... I think we
>>>>> discussed this briefly yesterday in person.
>>>> Thanks for noticing. That detail (more precisely: not MS_SHARED) is
>>>> already covered in the numerous other changes that I have pending
>>>> for this page:
>>>>
>>>>     The following restrictions apply:
>>>>     ...
>>>>     - The propagation type of new_root and its parent mount must not
>>>>       be MS_SHARED; similarly, if put_old is an existing mount point,
>>>>       its propagation type must not be MS_SHARED.
>>>
>>> Ugh. That is close but not quite correct.
>>>
>>> A better explanation:
>>>
>>>     The pivot_root system call will never propagate any changes it makes.
>>>     The pivot_root system call ensures this is safe by verifying that
>>>     none of put_old, the parent of new_root, and parent of the root directory
>>>     have a propagation type of MS_SHARED.
>>
>> Thanks for that. However, another question. Your text has two changes.
>> First, I understand why you reword the discussion to indicate the
>> _purpose_ of the rules. However, you also, AFAICS, list a different set
>> of directories that can't be MS_SHARED:
>>
>> I said: new_root, the parent of new_root, and put_old
>> You said: the parent of new_root, and put_old, and parent of the
>> root directory.
>
>> Was I wrong on this detail also?
>
> That is how I read the code. The code says:
>
>         if (IS_MNT_SHARED(old_mnt) ||
>             IS_MNT_SHARED(new_mnt->mnt_parent) ||
>             IS_MNT_SHARED(root_mnt->mnt_parent))
>                 goto out4;
>
> We both agree on put_old and the parent of new_mnt.
>
> When I look at the code root_mnt comes from the root directory, not new_mnt.

Hmm -- I had checked the code when I wrote my text, but somehow I
misread things. Going back to recheck the code, you are obviously
correct. Thanks for catching that.

> Furthermore those checks fundamentally make sense, as it is the root
> directory and new_root that are moving. The directory put_old simply has
> something moving onto it.
>
>>> The concern from our conversation at the container mini-summit was that
>>> there is a pathology if in your initial mount namespace all of the
>>> mounts are marked MS_SHARED like systemd does (and is almost necessary
>>> if you are going to use mount propagation), that if new_root itself
>>> is MS_SHARED then unmounting the old_root could propagate.
>>>
>>> So I believe the desired sequence is:
>>>
>>>>>>         chdir(new_root);
>>> +++       mount("", ".", MS_SLAVE | MS_REC, NULL);
>>>>>>         pivot_root(".", ".");
>>>>>>         umount2(".", MNT_DETACH);
>>>
>>> The change to new_root could be either MS_SLAVE or MS_PRIVATE. So
>>> long as it is not MS_SHARED the mount won't propagate back to the
>>> parent mount namespace.
>>
>> Thanks. I made that change.
>
> For what it is worth. The sequence above without the change in mount
> attributes will fail if it is necessary to change the mount attributes
> as "." is both put_old as well as new_root.
>
> When I initially suggested the change I saw "." was new_root and forgot
> "." was also put_old. So I thought there was a silent danger without
> that sequence.

So, now I am a little confused by the comments you added here. Do you
now mean that the

    mount("", ".", MS_SLAVE | MS_REC, NULL);

call is not actually necessary?

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
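
For readers following the thread, here is a minimal sketch of the four-step sequence discussed above, written out as compilable C. It is an illustration under stated assumptions, not text from the manual page or from any participant. The helper name pivot_into() is hypothetical. The sketch uses the five-argument form of mount(2) with a NULL filesystem type (the thread writes the call in shortened form), and it calls pivot_root through syscall(2) because glibc provides no wrapper for it. The mount() step is included as proposed in the thread; whether it is strictly required is the open question in this message, and the sketch does not settle it. The caller is assumed to have CAP_SYS_ADMIN in its own mount namespace, and new_root is assumed to be a mount point.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (the name is not from the thread).
 * Returns 0 on success, or -1 with errno set by the step that failed.
 */
static int pivot_into(const char *new_root)
{
    assert(new_root != NULL);

    /* Step 1: make new_root the current working directory. */
    if (chdir(new_root) == -1)
        return -1;

    /*
     * Step 2 (as proposed in the thread): change the propagation type
     * of "." so that it is not MS_SHARED. MS_PRIVATE would also do.
     * For a propagation change the source and filesystem type are ignored.
     */
    if (mount("", ".", NULL, MS_SLAVE | MS_REC, NULL) == -1)
        return -1;

    /* Step 3: "." serves as both new_root and put_old. */
    if (syscall(SYS_pivot_root, ".", ".") == -1)
        return -1;

    /* Step 4: lazily detach the old root, now stacked at ".". */
    if (umount2(".", MNT_DETACH) == -1)
        return -1;

    return 0;
}
```

A caller would typically unshare its mount namespace first and then invoke pivot_into() on a bind-mounted directory. Each step returns early, so errno identifies which call failed.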