From: Alexander Mikhalitsyn
To: horms@verge.net.au
Cc: netdev@vger.kernel.org, lvs-devel@vger.kernel.org,
    netfilter-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
    Alexander Mikhalitsyn, Stéphane Graber, Christian Brauner,
    Julian Anastasov, Pablo Neira Ayuso, Jozsef Kadlecsik,
    Florian Westphal
Subject: [PATCH net-next v2 2/2] ipvs: allow some sysctls in non-init user namespaces
Date: Thu, 18 Apr 2024 13:01:53 +0200
Message-Id: <20240418110153.102781-2-aleksandr.mikhalitsyn@canonical.com>
In-Reply-To: <20240418110153.102781-1-aleksandr.mikhalitsyn@canonical.com>
References: <20240418110153.102781-1-aleksandr.mikhalitsyn@canonical.com>

Let's make the IPVS sysctls writable even when the network namespace is
owned by a non-initial user namespace, with a few exceptions.

Let's keep the following sysctls read-only for non-privileged users:
- sync_qlen_max
- sync_sock_size
- run_estimation
- est_cpulist
- est_nice

I'm trying to be conservative here to avoid introducing any security
issues. Maybe we can allow more sysctls to be writable, but let's do
that on demand, when we see a real use case.

This patch is motivated by a user request in the LXC project [1].
Having this can help with running some Kubernetes [2] or Docker Swarm [3]
workloads inside system containers.

[1] https://github.com/lxc/lxc/issues/4278
[2] https://github.com/kubernetes/kubernetes/blob/b722d017a34b300a2284b890448e5a605f21d01e/pkg/proxy/ipvs/proxier.go#L103
[3] https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/osl/namespace_linux.go#L682

Cc: Stéphane Graber
Cc: Christian Brauner
Cc: Julian Anastasov
Cc: Simon Horman
Cc: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik
Cc: Florian Westphal
Signed-off-by: Alexander Mikhalitsyn
---
 net/netfilter/ipvs/ip_vs_ctl.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index daa62b8b2dd1..f84f091626ef 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -4272,6 +4272,7 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 	struct ctl_table *tbl;
 	int idx, ret;
 	size_t ctl_table_size = ARRAY_SIZE(vs_vars);
+	bool unpriv = net->user_ns != &init_user_ns;
 
 	atomic_set(&ipvs->dropentry, 0);
 	spin_lock_init(&ipvs->dropentry_lock);
@@ -4286,12 +4287,6 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 		tbl = kmemdup(vs_vars, sizeof(vs_vars), GFP_KERNEL);
 		if (tbl == NULL)
 			return -ENOMEM;
-
-		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns) {
-			tbl[0].procname = NULL;
-			ctl_table_size = 0;
-		}
 	} else
 		tbl = vs_vars;
 	/* Initialize sysctl defaults */
@@ -4317,10 +4312,17 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 	ipvs->sysctl_sync_ports = 1;
 	tbl[idx++].data = &ipvs->sysctl_sync_ports;
 	tbl[idx++].data = &ipvs->sysctl_sync_persist_mode;
+
 	ipvs->sysctl_sync_qlen_max = nr_free_buffer_pages() / 32;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx++].data = &ipvs->sysctl_sync_qlen_max;
+
 	ipvs->sysctl_sync_sock_size = 0;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx++].data = &ipvs->sysctl_sync_sock_size;
+
 	tbl[idx++].data = &ipvs->sysctl_cache_bypass;
 	tbl[idx++].data = &ipvs->sysctl_expire_nodest_conn;
 	tbl[idx++].data = &ipvs->sysctl_sloppy_tcp;
@@ -4343,15 +4345,22 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 	tbl[idx++].data = &ipvs->sysctl_conn_reuse_mode;
 	tbl[idx++].data = &ipvs->sysctl_schedule_icmp;
 	tbl[idx++].data = &ipvs->sysctl_ignore_tunneled;
+
 	ipvs->sysctl_run_estimation = 1;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx].extra2 = ipvs;
 	tbl[idx++].data = &ipvs->sysctl_run_estimation;
 
 	ipvs->est_cpulist_valid = 0;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx].extra2 = ipvs;
 	tbl[idx++].data = &ipvs->sysctl_est_cpulist;
 
 	ipvs->sysctl_est_nice = IPVS_EST_NICE;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx].extra2 = ipvs;
 	tbl[idx++].data = &ipvs->sysctl_est_nice;
 
-- 
2.34.1
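
For experimenting outside the kernel, the permission override in this patch can be approximated by a small userspace model. This is only a sketch under stated assumptions: struct ctl_entry and restrict_modes() are hypothetical names that do not exist in the kernel, and the model matches entries by name, whereas the patch sets tbl[idx].mode positionally while filling the table in ip_vs_control_net_init_sysctl().

```c
/*
 * Userspace model only. struct ctl_entry is a hypothetical stand-in
 * for the kernel's struct ctl_table, not the real definition.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct ctl_entry {
	const char *procname;	/* sysctl file name */
	unsigned int mode;	/* permission bits, e.g. 0644 */
};

/* Entries the patch description lists as read-only for unprivileged owners. */
static const char *const ro_for_unpriv[] = {
	"sync_qlen_max", "sync_sock_size", "run_estimation",
	"est_cpulist", "est_nice",
};

/*
 * Set the listed entries to 0444 when the namespace owner is unprivileged.
 * Entries not in the list, and all entries when unpriv is false, keep
 * their original mode.
 */
static void restrict_modes(struct ctl_entry *tbl, size_t n, bool unpriv)
{
	const size_t n_ro = sizeof(ro_for_unpriv) / sizeof(ro_for_unpriv[0]);

	assert(tbl != NULL || n == 0);
	if (!unpriv)
		return;

	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n_ro; j++)
			if (strcmp(tbl[i].procname, ro_for_unpriv[j]) == 0)
				tbl[i].mode = 0444;
}
```

In this model, calling restrict_modes(tbl, n, true) on a table whose entries start at 0644 leaves an entry such as sync_ports untouched and changes est_nice to 0444; with unpriv set to false, no entry is modified.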