Subject: Re: [PATCH] sock: allow reading and changing sk_userlocks with setsockopt
From: Paolo Abeni
To: Pavel Tikhomirov, netdev@vger.kernel.org
Cc: "David S. Miller", Jakub Kicinski, Arnd Bergmann, Eric Dumazet,
    Florian Westphal, linux-kernel@vger.kernel.org,
    linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org,
    linux-parisc@vger.kernel.org, sparclinux@vger.kernel.org,
    linux-arch@vger.kernel.org, Andrei Vagin
Date: Fri, 30 Jul 2021 15:13:31 +0200
In-Reply-To: <20210730105406.318726-1-ptikhomirov@virtuozzo.com>
References: <20210730105406.318726-1-ptikhomirov@virtuozzo.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2021-07-30 at 13:54 +0300, Pavel Tikhomirov wrote:
> The SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable the automatic
> socket buffer adjustment done by the kernel (see tcp_fixup_rcvbuf() and
> tcp_sndbuf_expand()). On a newly created socket this adjustment is
> enabled, but once the socket buffer size is changed with
> setsockopt(SO_{SND,RCV}BUF*) it becomes disabled.
>
> CRIU needs to call setsockopt(SO_{SND,RCV}BUF*) on each socket on
> restore: first to increase the buffer sizes while restoring the packet
> queues, and second to restore the original buffer sizes. So after a
> CRIU restore all sockets become non-auto-adjustable, which can
> significantly decrease the network performance of restored applications.

I'm wondering if you could just tune tcp_rmem instead?

> CRIU needs to be able to restore sockets with the adjustment enabled or
> disabled, in the same state as before the dump, so let's add a special
> setsockopt for it.
>
> Signed-off-by: Pavel Tikhomirov
> ---
> Here are the corresponding CRIU commits using this new feature to fix
> the slow download speed problem after migration:
> https://github.com/checkpoint-restore/criu/pull/1568
>
> Origin of the problem:
>
> We have a customer in Virtuozzo who mentioned that an nginx server becomes
> slower after container migration.
> It is especially easy to notice when
> you wget some big file via localhost from the same container which was
> just migrated.
>
> By strace-ing all nginx processes I see that before c/r the nginx worker
> process sends data to the local wget in big chunks of ~1.5 MB, but after
> c/r it only manages to send small chunks of ~64 KB.
>
> Before:
> sendfile(12, 13, [7984974] => [9425600], 11479629) = 1440626 <0.000180>
>
> After:
> sendfile(8, 13, [1507275] => [1568768], 17957328) = 61493 <0.000675>
>
> A smaller buffer can explain the decrease in download speed. So as a POC
> I just commented out all buffer setting manipulations and that helped.
>
> ---
>  arch/alpha/include/uapi/asm/socket.h  |  2 ++
>  arch/mips/include/uapi/asm/socket.h   |  2 ++
>  arch/parisc/include/uapi/asm/socket.h |  2 ++
>  arch/sparc/include/uapi/asm/socket.h  |  2 ++
>  include/uapi/asm-generic/socket.h     |  2 ++
>  net/core/sock.c                       | 12 ++++++++++++
>  6 files changed, 22 insertions(+)
>
> diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> index 6b3daba60987..1dd9baf4a6c2 100644
> --- a/arch/alpha/include/uapi/asm/socket.h
> +++ b/arch/alpha/include/uapi/asm/socket.h
> @@ -129,6 +129,8 @@
>
>  #define SO_NETNS_COOKIE 71
>
> +#define SO_BUF_LOCK 72
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64
> diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> index cdf404a831b2..1eaf6a1ca561 100644
> --- a/arch/mips/include/uapi/asm/socket.h
> +++ b/arch/mips/include/uapi/asm/socket.h
> @@ -140,6 +140,8 @@
>
>  #define SO_NETNS_COOKIE 71
>
> +#define SO_BUF_LOCK 72
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64
> diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> index 5b5351cdcb33..8baaad52d799 100644
> --- a/arch/parisc/include/uapi/asm/socket.h
> +++ b/arch/parisc/include/uapi/asm/socket.h
> @@ -121,6 +121,8 @@
>
>  #define SO_NETNS_COOKIE 0x4045
>
> +#define SO_BUF_LOCK 0x4046
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64
> diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> index 92675dc380fa..e80ee8641ac3 100644
> --- a/arch/sparc/include/uapi/asm/socket.h
> +++ b/arch/sparc/include/uapi/asm/socket.h
> @@ -122,6 +122,8 @@
>
>  #define SO_NETNS_COOKIE 0x0050
>
> +#define SO_BUF_LOCK 0x0051
> +
>  #if !defined(__KERNEL__)
>
>
> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> index d588c244ec2f..1f0a2b4864e4 100644
> --- a/include/uapi/asm-generic/socket.h
> +++ b/include/uapi/asm-generic/socket.h
> @@ -124,6 +124,8 @@
>
>  #define SO_NETNS_COOKIE 71
>
> +#define SO_BUF_LOCK 72
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
> diff --git a/net/core/sock.c b/net/core/sock.c
> index a3eea6e0b30a..843094f069f3 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1357,6 +1357,14 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
>  		ret = sock_bindtoindex_locked(sk, val);
>  		break;
>
> +	case SO_BUF_LOCK:
> +	{
> +		int mask = SOCK_SNDBUF_LOCK | SOCK_RCVBUF_LOCK;

What about defining a macro with the above mask, to avoid the local
variable declaration and the braces?

Thanks!

Paolo
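
To make the macro suggestion concrete, here is a rough userspace sketch, not kernel code. The names SOCK_BUF_LOCK_MASK and buf_lock_apply() are hypothetical, the flag values 1 and 2 are assumed rather than taken from the patch, and since the quoted hunk is trimmed the real SO_BUF_LOCK handling may well differ from this model.

```c
#include <errno.h>

/* Assumed flag values for this sketch; they are not shown in the patch. */
#define SOCK_SNDBUF_LOCK 1
#define SOCK_RCVBUF_LOCK 2

/* Hypothetical macro name: one definition replaces the local 'mask'. */
#define SOCK_BUF_LOCK_MASK (SOCK_SNDBUF_LOCK | SOCK_RCVBUF_LOCK)

/*
 * Hypothetical helper modelling a SO_BUF_LOCK-style setter: reject any
 * bit outside the buffer-lock mask, otherwise replace only the two
 * buffer-lock bits and leave all other bits in *userlocks untouched.
 */
int buf_lock_apply(unsigned int *userlocks, int val)
{
	if (val & ~SOCK_BUF_LOCK_MASK)
		return -EINVAL;

	*userlocks = (*userlocks & ~SOCK_BUF_LOCK_MASK) | (unsigned int)val;
	return 0;
}
```

With the mask in a macro, the case body needs no declaration of its own, so the extra block scope around it goes away as well.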