From: ebiederm@xmission.com (Eric W. Biederman)
To: David Ahern
Cc: Cong Wang, David Miller, Linux Kernel Network Developers,
    nikita.leshchenko@oracle.com, Roopa Prabhu, Stephen Hemminger,
    Ido Schimmel, Jiri Pirko, Saeed Mahameed, Alexander Aring,
    linux-wpan@vger.kernel.org, NetFilter, LKML
Subject: Re: [PATCH RFC/RFT net-next 00/17] net: Convert neighbor tables to per-namespace
Date: Wed, 25 Jul 2018 14:17:21 -0500
Message-ID: <87o9evt9a6.fsf@xmission.com>
In-Reply-To: (David Ahern's message of "Wed, 25 Jul 2018 12:13:30 -0600")
References: <1a3f59a9-0ba5-c83f-16a6-f9550a84f693@gmail.com>
    <1a27e301-3275-b349-a2f8-afdfdc02f04f@gmail.com>
    <20180718.125938.2271502580775162784.davem@davemloft.net>
    <28c30574-391c-b4bd-c337-51d3040d901a@gmail.com>
    <5021d874-8e99-6eba-f24b-4257c62d4457@gmail.com>
    <87muufze8w.fsf@xmission.com>
    <4b03b5f6-87ce-9ff2-7c14-598beebd8fb8@gmail.com>
    <87zhyfw70m.fsf@xmission.com>
X-Mailing-List: linux-kernel@vger.kernel.org

David Ahern writes:

> On 7/25/18 11:38 AM, Eric W. Biederman wrote:
>>
>> Absolutely NOT. Global thresholds are exactly correct given the fact
>> you are running on a single kernel.
>>
>> Memory is not free (even though we are swimming in enough of it that
>> memory rarely matters). One of the few remaining challenges for
>> containers is finding ways to limit resources in such a way that one
>> application does not mess things up for another container during
>> ordinary usage.
>>
>> It looks like the neighbour tables absolutely are that kind of problem,
>> because the artificial limits are too strict. Completely giving up on
>> limits does not seem the right approach either.
>> We need to fix the limits we have (perhaps making them go away
>> entirely), not just apply a band-aid. Let's get to the bottom of this
>> and make the system better.
>
> Eric: yes, they all share the global resource of memory and there should
> be limits on how many entries a remote entity can create.
>
> Network namespaces can provide a separation such that one namespace does
> not disrupt networking in another. It is absolutely appropriate to do
> so. Your rigid stance is inconsistent given the basic meaning of a
> network namespace and the parallels to this same problem -- bridges,
> vxlans, and ip fragments. Only neighbor tables are not per-device or per
> namespace; your insistence on global limits is missing the mark and wrong.

That is not what I said. Let me rephrase and see if you understand.

The problem appears to be one of lots of devices. Fundamentally, if you
use lots of network devices today, unless you adjust gc_thresh3 you will
run out of neighbour table entries. The problem has a bigger scope than
what you are looking at. If you fix the core problem you won't see the
problem in the context of network namespaces either.

Default limits should be something that will never be hit unless
something goes crazy. We are hitting them. Therefore by definition
there is a bug in these limits.

And yes, there is absolutely a place for global limits on things like
inodes, file descriptors, etc., that do not care about which part of
the kernel you are in. However, hitting those limits in normal
operation is a bug. We have ourselves a bug.

Eric

p.s. I wrote the definition of network namespaces and it absolutely
does have room for global limits. One of the things Linus has
periodically yelled at me about is that there are not enough of them.