Message-ID: <1510682459.2849.174.camel@edumazet-glaptop3.roam.corp.google.com>
Subject: Re: [PATCH] net: Convert net_mutex into rw_semaphore and down read it on net->init/->exit
From: Eric Dumazet
To: Andrei Vagin
Cc: Kirill Tkhai, davem@davemloft.net, vyasevic@redhat.com, kstewart@linuxfoundation.org, pombredanne@nexb.com, vyasevich@gmail.com, mark.rutland@arm.com, gregkh@linuxfoundation.org, adobriyan@gmail.com, fw@strlen.de, nicolas.dichtel@6wind.com, xiyou.wangcong@gmail.com, roman.kapl@sysgo.com, paul@paul-moore.com, dsahern@gmail.com, daniel@iogearbox.net, lucien.xin@gmail.com, mschiffer@universe-factory.net, rshearma@brocade.com, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, ebiederm@xmission.com, gorcunov@virtuozzo.com
Date: Tue, 14 Nov 2017 10:00:59 -0800
In-Reply-To: <20171114174454.GA11452@outlook.office365.com>
References: <151066759055.14465.9783879083192000862.stgit@localhost.localdomain> <20171114174454.GA11452@outlook.office365.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2017-11-14 at 09:44 -0800, Andrei Vagin wrote:
> On Tue, Nov 14, 2017 at 04:53:33PM +0300, Kirill Tkhai wrote:
> > Currently a mutex is used to protect the pernet operations list. It makes
> > cleanup_net() execute the ->exit methods of the same operations set
> > that was used at the time of ->init, even after the net namespace is
> > unlinked from net_namespace_list.
> >
> > But the problem is that synchronize_rcu() is needed after net is removed
> > from net_namespace_list:
> >
> > Destroy net_ns:
> > cleanup_net()
> >   mutex_lock(&net_mutex)
> >   list_del_rcu(&net->list)
> >   synchronize_rcu()                   <--- Sleep there for ages
> >   list_for_each_entry_reverse(ops, &pernet_list, list)
> >     ops_exit_list(ops, &net_exit_list)
> >   list_for_each_entry_reverse(ops, &pernet_list, list)
> >     ops_free_list(ops, &net_exit_list)
> >   mutex_unlock(&net_mutex)
> >
> > This primitive is not fast, especially on systems with many processors
> > and/or when preemptible RCU is enabled in the config. So, for the whole time
> > cleanup_net() is waiting for the RCU grace period, creation of new net
> > namespaces is not possible; the tasks that attempt it sleep on the same mutex:
> >
> > Create net_ns:
> > copy_net_ns()
> >   mutex_lock_killable(&net_mutex)     <--- Sleep there for ages
> >
> > The solution is to convert net_mutex to an rw_semaphore. Then the
> > pernet_operations::init/::exit methods, which modify the net-related data,
> > will require only down_read() locking, while down_write() will be used
> > for changing pernet_list.
> >
> > This gives a significant performance increase, as you can see below.
> > Sequential net namespace creation was measured in a loop, in a single
> > thread, without other tasks (single user mode):
> >
> > 1)
> > #define _GNU_SOURCE   /* for unshare() and CLONE_NEWNET */
> > #include <sched.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > int main(int argc, char *argv[])
> > {
> >         unsigned nr;
> >
> >         if (argc < 2) {
> >                 fprintf(stderr, "Provide nr iterations arg\n");
> >                 return 1;
> >         }
> >         nr = atoi(argv[1]);
> >         /* Each iteration moves the task into a fresh net namespace. */
> >         while (nr-- > 0) {
> >                 if (unshare(CLONE_NEWNET)) {
> >                         perror("Can't unshare");
> >                         return 1;
> >                 }
> >         }
> >         return 0;
> > }
> >
> > Original, 100000 unshare():
> > 0.03user 23.14system 1:39.85elapsed 23%CPU
> >
> > Patched, 100000 unshare():
> > 0.03user 67.49system 1:08.34elapsed 98%CPU
> >
> > 2)for i in {1..10000}; do unshare -n bash -c exit; done
>
> Hi Kirill,
>
> This mutex has another role. You know that net namespaces are destroyed
> asynchronously, and the net mutex guarantees that the backlog will not be
> big. If we have something in the backlog, we know that it will be handled
> before creating a new net ns.
>
> As far as I remember, net namespaces are created much faster than
> they are destroyed, so with this change we can create a really big
> backlog, can't we?

Please take a look at the recent patches I did:

8ca712c373a462cfa1b62272870b6c2c74aa83f9 Merge branch 'net-speedup-netns-create-delete-time'
64bc17811b72758753e2b64cd8f2a63812c61fe1 ipv4: speedup ipv6 tunnels dismantle
bb401caefe9d2c65e0c0fa23b21deecfbfa473fe ipv6: speedup ipv6 tunnels dismantle
789e6ddb0b2fb5d5024b760b178a47876e4de7a6 tcp: batch tcp_net_metrics_exit
a90c9347e90ed1e9323d71402ed18023bc910cd8 ipv6: addrlabel: per netns list
d464e84eed02993d40ad55fdc19f4523e4deee5b kobject: factorize skb setup in kobject_uevent_net_broadcast()
4a336a23d619e96aef37d4d054cfadcdd1b581ba kobject: copy env blob in one go
16dff336b33d87c15d9cbe933cfd275aae2a8251 kobject: add kobject_uevent_net_broadcast()
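
For readers following the locking argument in the quoted patch description, here is a rough single-threaded sketch of the exclusion rule it relies on: paths that only run ->init/->exit take the read side and can overlap, while a change to pernet_list takes the write side and excludes everyone. This is a toy model, not kernel code. The names toy_rwsem, toy_try_read(), toy_try_write(), toy_up_read(), toy_up_write() and toy_scenario() are made up for illustration, and nothing here blocks or sleeps the way a real rw_semaphore does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of reader/writer exclusion. Hypothetical names; this is
 * NOT the kernel's rw_semaphore API and it never blocks. */
struct toy_rwsem {
	int readers;   /* number of current read-side holders */
	bool writer;   /* true while a writer holds the lock */
};

/* Read side: succeeds unless a writer currently holds the lock. */
static bool toy_try_read(struct toy_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Write side: succeeds only when there are no readers and no writer. */
static bool toy_try_write(struct toy_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *s)
{
	assert(s->writer);
	s->writer = false;
}

/* Walks through the situation from the patch description.
 * Returns 1 if every step behaves as described, 0 otherwise. */
int toy_scenario(void)
{
	struct toy_rwsem sem = { 0, false };

	/* A cleanup_net()-like path holds the read side during ->exit. */
	if (!toy_try_read(&sem))
		return 0;
	/* A copy_net_ns()-like path can take the read side at the same time. */
	if (!toy_try_read(&sem))
		return 0;
	/* Changing pernet_list needs the write side: refused while readers exist. */
	if (toy_try_write(&sem))
		return 0;

	toy_up_read(&sem);
	toy_up_read(&sem);

	/* With no readers left, the writer gets in and excludes new readers. */
	if (!toy_try_write(&sem))
		return 0;
	if (toy_try_read(&sem))
		return 0;
	toy_up_write(&sem);
	return 1;
}
```

Calling toy_scenario() from a small main() is enough to check the sequence. Note that the sketch also shows the concern raised in the reply: once creation no longer waits for a cleanup in progress, nothing in the lock itself bounds how far creation can run ahead of destruction.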