From: Eric Dumazet
Date: Thu, 2 Feb 2023 21:10:23 +0100
Subject: Re: [RFC] net: add new socket option SO_SETNETNS
To: Alok Tiagi
Cc: Hillf Danton, ebiederm@xmission.com, netdev@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Thu, Feb 2, 2023 at 8:55 PM Alok Tiagi wrote:
>
> On Thu, Feb 02, 2023 at 09:48:10AM +0800, Hillf Danton wrote:
> > On Wed, 1 Feb 2023 19:22:57 +0000 aloktiagi
> > > @@ -1535,6 +1535,52 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
> > >               WRITE_ONCE(sk->sk_txrehash, (u8)val);
> > >               break;
> > >
> > > +     case SO_SETNETNS:
> > > +     {
> > > +             struct net *other_ns, *my_ns;
> > > +
> > > +             if (sk->sk_family != AF_INET && sk->sk_family != AF_INET6) {
> > > +                     ret = -EOPNOTSUPP;
> > > +                     break;
> > > +             }
> > > +
> > > +             if (sk->sk_type != SOCK_STREAM && sk->sk_type != SOCK_DGRAM) {
> > > +                     ret = -EOPNOTSUPP;
> > > +                     break;
> > > +             }
> > > +
> > > +             other_ns = get_net_ns_by_fd(val);
> > > +             if (IS_ERR(other_ns)) {
> > > +                     ret = PTR_ERR(other_ns);
> > > +                     break;
> > > +             }
> > > +
> > > +             if (!ns_capable(other_ns->user_ns, CAP_NET_ADMIN)) {
> > > +                     ret = -EPERM;
> > > +                     goto out_err;
> > > +             }
> > > +
> > > +             /* check that the socket has never been connected or recently disconnected */
> > > +             if (sk->sk_state != TCP_CLOSE || sk->sk_shutdown & SHUTDOWN_MASK) {
> > > +                     ret = -EOPNOTSUPP;
> > > +                     goto out_err;
> > > +             }
> > > +
> > > +             /* check that the socket is not bound to an interface*/
> > > +             if (sk->sk_bound_dev_if != 0) {
> > > +                     ret = -EOPNOTSUPP;
> > > +                     goto out_err;
> > > +             }
> > > +
> > > +             my_ns = sock_net(sk);
> > > +             sock_net_set(sk, other_ns);
> > > +             put_net(my_ns);
> > > +             break;
> >
> >       cpu 0                           cpu 2
> >       ---                             ---
> >       ns = sock_net(sk);
> >                                       my_ns = sock_net(sk);
> >                                       sock_net_set(sk, other_ns);
> >                                       put_net(my_ns);
> >       ns is invalid ?
>
> That is the reason we want the socket to be in an un-connected state. That
> should help us avoid this situation.

This is not enough.

Another thread might look at sock_net(sk), for example from inet_diag
or from TCP timers (which can fire even in the un-connected state).

Even UDP sockets can receive packets while un-connected, and they need
to dereference the net pointer.

Currently there is no protection against sock_net(sk) being changed on
the fly, so the struct net could disappear and be freed while another
thread is still using it.

There are ~1500 uses of sock_net(sk) in the kernel; I do not think
you/we want to audit all of them to check what could go wrong...
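
For readers following the thread, the interleaving in Hillf's diagram can be replayed in a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct model_net`, `struct model_sock`, `model_get_net()`, `model_put_net()` and both `replay_*` functions are hypothetical names made up for illustration. A `freed` flag stands in for the real free so the model never touches released memory, and the two "CPUs" are replayed sequentially in one thread, in the order shown in the diagram.

```c
/* Userspace model of the race discussed above; NOT kernel code.
 * All names here are hypothetical stand-ins for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_net {
	int refcnt;	/* stands in for the namespace reference count */
	bool freed;	/* set instead of freeing, so the model stays well-defined */
};

struct model_sock {
	struct model_net *net;	/* stands in for the pointer behind sock_net(sk) */
};

static struct model_net *model_get_net(struct model_net *net)
{
	net->refcnt++;
	return net;
}

static void model_put_net(struct model_net *net)
{
	assert(net->refcnt > 0);
	if (--net->refcnt == 0)
		net->freed = true;	/* last reference dropped */
}

/* Replays the diagram: the reader loads the pointer without taking a
 * reference, then the writer switches namespaces and drops the old one.
 * Returns true if the reader is left holding a freed namespace. */
bool replay_unprotected_switch(void)
{
	struct model_net old_ns = { .refcnt = 1, .freed = false };	/* held by the socket */
	struct model_net new_ns = { .refcnt = 1, .freed = false };	/* held by the caller */
	struct model_sock sk = { .net = &old_ns };

	struct model_net *reader_ns = sk.net;	/* cpu 0: ns = sock_net(sk) */

	struct model_net *my_ns = sk.net;	/* cpu 2: my_ns = sock_net(sk) */
	sk.net = &new_ns;			/* cpu 2: sock_net_set(sk, other_ns) */
	model_put_net(my_ns);			/* cpu 2: put_net(my_ns) */

	return reader_ns->freed;		/* cpu 0: "ns is invalid ?" */
}

/* Contrast case: the reader pins the namespace before the writer runs. */
bool replay_pinned_reader(void)
{
	struct model_net old_ns = { .refcnt = 1, .freed = false };
	struct model_net new_ns = { .refcnt = 1, .freed = false };
	struct model_sock sk = { .net = &old_ns };

	struct model_net *reader_ns = model_get_net(sk.net);	/* reader holds a reference */

	struct model_net *my_ns = sk.net;
	sk.net = &new_ns;
	model_put_net(my_ns);		/* refcnt drops to 1, not 0 */

	bool stale = reader_ns->freed;	/* still valid while pinned */
	model_put_net(reader_ns);	/* reader is done; namespace goes away now */
	return stale;
}
```

In this model `replay_unprotected_switch()` reports a stale pointer and `replay_pinned_reader()` does not. The contrast case is deliberately optimistic: in a real kernel the pointer load and the reference grab would themselves have to be made safe against a concurrent writer, and existing sock_net(sk) users take no such reference, which is the auditing problem raised in the reply.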