Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp25924770rwd; Mon, 3 Jul 2023 03:01:44 -0700 (PDT) X-Google-Smtp-Source: APBJJlEJeOUwsEg7j1ActMYRO92g0MlKqLz4sub7dSgQe1GvS0u2Ceyh2QG7GkzBvzLgwZqv9VfR X-Received: by 2002:a17:90b:268b:b0:263:eef:1465 with SMTP id pl11-20020a17090b268b00b002630eef1465mr8102161pjb.44.1688378503757; Mon, 03 Jul 2023 03:01:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688378503; cv=none; d=google.com; s=arc-20160816; b=niqwp0XT7F5QXQLjoZNbqQ6tGplF2FkPvIwiKn9+TYHzX/JQ2oS9HAmFXDdopWjq3a 16q7Qn44Q0cgue9dpsHPrbast4pNCZk5ba8/4PV0DgGmO8v09BpzZKP16EBzvVUXgYUD Sr3KEZao6INZGxbO8q4vqZ//ujxWz+u7B6d9a6CXhfKYc757bJdvDUuV8Zl6IKps1Mrd GmbMeDZWvdxN7fTBg/JWT/uEUZkylAokpDI4JmTf6DSKNDdn7xxUD62krJL4TdY+9j// zoOwIWluezMFomtyYhOFieNl0U7M9FY/rwvJJcXqbXN3WdsrTU2+t64JLv9pMnKaK9xg 0z9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=Or8dFWcZaZQ+0fBVUWIhSbo2VVRVHtdq1po7KsbeTx0=; fh=jwlygcFILYvyxvhYBfDh8RF1KE70m5C8pe92KZFXpgM=; b=qt0K8Lqrq9W+PmfNqeQy77xBdebhO1LBqZFPIimfF1CxHXYgoStj+v+Xy2KqObHOwa Di9mYX2DiijNKJoaCas1y3sYMVUHPe99BgkCwKwGIfWQiFVI8na37WyBMiDT6EpniyIV AivRzoMWkZIgPUT77/OzfUKT7KCxJyS1kmtxrQdTmdGDCYkeT3v9rI/l8nRqu/7VBo13 XANe/T0VO1PIIt/EXdBpBM2vOjDZghIV9hlLNqd7wmPQL7n6wmlpox1ceftwhD3d112C /5NvJieDPNBGspYfInlqjbeuMCiaBSJZonSE8qSZstPPu82CcaLXUB09h8wYnEGdglFe 2eDA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=j89bmR0p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id kb15-20020a17090ae7cf00b00262fa96d9e6si16235728pjb.142.2023.07.03.03.01.28; Mon, 03 Jul 2023 03:01:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=j89bmR0p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230179AbjGCJtA (ORCPT + 99 others); Mon, 3 Jul 2023 05:49:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49016 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231200AbjGCJsw (ORCPT ); Mon, 3 Jul 2023 05:48:52 -0400 Received: from mail-qv1-xf29.google.com (mail-qv1-xf29.google.com [IPv6:2607:f8b0:4864:20::f29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2CD5912C; Mon, 3 Jul 2023 02:48:50 -0700 (PDT) Received: by mail-qv1-xf29.google.com with SMTP id 6a1803df08f44-636220ca533so5573086d6.0; Mon, 03 Jul 2023 02:48:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688377729; x=1690969729; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Or8dFWcZaZQ+0fBVUWIhSbo2VVRVHtdq1po7KsbeTx0=; b=j89bmR0pEZdAh1wW0mtwaoyM9ZTL4bdljqEP1mBSHDe2uHocqrRTilD8XW5ziOwKWz hrPZP28u0sBWfIS9859Pvk9/JKqxoQe1taeAF27ItcbIK8njFTg4dL1b8+tj4yVnnVFL RMDgA0dgWjUu9gBAdFjAsdQoWGsnrAyzyUyLkwysrnp4AUMqNLSgzCb6LxHFb9Gd6huN F9FT8Wi3HUJcnguNRJdPA0LHGCAYhm3mQ+Y4C30Q4nyG8e4+nXGlxkGVKaLGb1tWkhLV shvBlXlVd4B1Se5f2fCmzjIIjVG00COmjnaQGfFwibQxKXQlg4rY5clf0bIdlHLA9mLO n7lg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688377729; x=1690969729; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Or8dFWcZaZQ+0fBVUWIhSbo2VVRVHtdq1po7KsbeTx0=; b=Q6qMT0qcZUztVqEutltOmvCAbrf8Rsz1yK3Mum9qktoBiJW9jd5KRWQfk9xZL1RIAV 8yDmvm+qyb9FRR/J3sSdkN9Ab8EBtCEtN9Btc5N2hc/weMRn7zFiUskVNhHj13vF3m8f JMv2iOsfVTeqx4Sdwakd+bkLzRpUcU8FPXdLgNrs5B+g5Chb3FUTrAXp0G1DIuUWHZwA Dij+2iCEkIJjJ/z19bBZlcEkdCDFkwknyzo61bKaaXGmYwc1Nr2jv26HdaqsqkSMWdB0 fnoAMTGzdwW4pti5jt1Em2tCppqEPPV4kcfp8xKMuq2KRiukxMlfoZ0IsuNdxrUt4gXh UEGw== X-Gm-Message-State: ABy/qLZRW9TrfKC2CrZnPZdIunbdKugAGnOdq6xvkrXzY2oQy7QrxwOU JiqUvJr8qVI2slDQoW7pUwnVMzkE6Pt0l2u0Is9mnrbkMLc= X-Received: by 2002:a05:6214:2685:b0:635:d9d0:cccf with SMTP id gm5-20020a056214268500b00635d9d0cccfmr10789245qvb.4.1688377728977; Mon, 03 Jul 2023 02:48:48 -0700 (PDT) MIME-Version: 1.0 References: <20230630145831.2988845-1-i.maximets@ovn.org> In-Reply-To: <20230630145831.2988845-1-i.maximets@ovn.org> From: Magnus Karlsson Date: Mon, 3 Jul 2023 11:48:37 +0200 Message-ID: Subject: Re: [RFC bpf-next] xsk: honor SO_BINDTODEVICE on bind To: Ilya Maximets Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, =?UTF-8?B?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Jason Wang , Stefan Hajnoczi Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 30 Jun 2023 at 16:58, Ilya Maximets wrote: > > Initial creation of an AF_XDP socket requires CAP_NET_RAW capability. > A privileged process might create the socket and pass it to a > non-privileged process for later use. However, that process will be > able to bind the socket to any network interface. Even though it will > not be able to receive any traffic without modification of the BPF map, > the situation is not ideal. > > Sockets already have a mechanism that can be used to restrict what > interface they can be attached to. That is SO_BINDTODEVICE. > > To change the binding the process will need CAP_NET_RAW. > > Make xsk_bind() honor the SO_BINDTODEVICE in order to allow safer > workflow when non-privileged process is using AF_XDP. Rebinding an AF_XDP socket is not allowed today. Any such attempt will return an error from bind. So if I understand the purpose of SO_BINDTODEVICE correctly, you could say that this option is always set for an AF_XDP socket and it is not possible to toggle it. The only way to "rebind" an AF_XDP socket is to close it and open a new one. This was a conscious design decision from day one as it would be very hard to support this, especially in zero-copy mode. > Signed-off-by: Ilya Maximets > --- > > Posting as an RFC for now to probably get some feedback. > Will re-post once the tree is open. > > Documentation/networking/af_xdp.rst | 9 +++++++++ > net/xdp/xsk.c | 6 ++++++ > 2 files changed, 15 insertions(+) > > diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst > index 247c6c4127e9..1cc35de336a4 100644 > --- a/Documentation/networking/af_xdp.rst > +++ b/Documentation/networking/af_xdp.rst > @@ -433,6 +433,15 @@ start N bytes into the buffer leaving the first N bytes for the > application to use. The final option is the flags field, but it will > be dealt with in separate sections for each UMEM flag. > > +SO_BINDTODEVICE setsockopt > +-------------------------- > + > +This is a generic SOL_SOCKET option that can be used to tie AF_XDP > +socket to a particular network interface. It is useful when a socket > +is created by a privileged process and passed to a non-privileged one. > +Once the option is set, kernel will refuse attempts to bind that socket > +to a different interface. Updating the value requires CAP_NET_RAW. > + > XDP_STATISTICS getsockopt > ------------------------- > > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c > index 5a8c0dd250af..386ff641db0f 100644 > --- a/net/xdp/xsk.c > +++ b/net/xdp/xsk.c > @@ -886,6 +886,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) > struct sock *sk = sock->sk; > struct xdp_sock *xs = xdp_sk(sk); > struct net_device *dev; > + int bound_dev_if; > u32 flags, qid; > int err = 0; > > @@ -899,6 +900,11 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) > XDP_USE_NEED_WAKEUP)) > return -EINVAL; > > + bound_dev_if = READ_ONCE(sk->sk_bound_dev_if); > + > + if (bound_dev_if && bound_dev_if != sxdp->sxdp_ifindex) > + return -EINVAL; > + > rtnl_lock(); > mutex_lock(&xs->mutex); > if (xs->state != XSK_READY) { > -- > 2.40.1 > >