To: Eric Dumazet
CC: David Miller, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
    Song Liu, Yonghong Song, netdev, LKML, bpf, Benjamin Herrenschmidt
Subject: Re: Re: Re: Latency spikes occurs from frequent socket connections
Date: Fri, 31 Jan 2020 08:41:16 +0100
Message-ID: <20200131074116.8684-1-sjpark@amazon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 30 Jan 2020 09:02:08 -0800 Eric Dumazet wrote:

> On Thu, Jan 30, 2020 at 4:41 AM wrote:
> >
> > On Wed, 29 Jan 2020 09:52:43 -0800 Eric Dumazet wrote:
> >
> > > On Wed, Jan 29, 2020 at 9:14 AM wrote:
> > > >
> > > > Hello,
> > > >
> > > >
> > > > We found races in the kernel code that incur latency spikes. We thus would
> > > > like to share our investigations and hear your opinions.
> > > > [...]
> > >
> > > I would rather try to fix the issue more generically, without adding
> > > extra lookups as you did, since they might appear
> > > to reduce the race, but not completely fix it.
> > >
> > > For example, the fact that the client side ignores the RST and
> > > retransmits a SYN after one second might be something that should be
> > > fixed.
> >
> > I also agree with this direction.
> > It seems detecting this situation and
> > adjusting the return value of tcp_timeout_init() to a value much lower than
> > one second would be a straightforward solution. For a test, I modified the
> > function to return 1 (4ms for CONFIG_HZ=250) and confirmed that the
> > reproducer becomes silent. My next question is how we can detect this
> > situation in the kernel. I'm unsure how we can distinguish this specific
> > case from other cases, as everything is working as normal according to the
> > TCP protocol.
> >
> > Also, it seems the value is made to be adjustable from user space using the
> > bpf callback, BPF_SOCK_OPS_TIMEOUT_INIT:
> >
> >     BPF_SOCK_OPS_TIMEOUT_INIT,  /* Should return SYN-RTO value to use or
> >                                  * -1 if default value should be used
> >                                  */
> >
> > Thus, it sounds like you are suggesting to do the detection and adjustment
> > from user space. Am I understanding your point? If not, please let me know.
> >
>
> No, I was suggesting to implement a mitigation in the kernel:
>
> When in SYN_SENT state, receiving a suspicious ACK should not
> simply trigger a RST.
>
> There may be multiple ways to address the issue.
>
> 1) Abort the SYN_SENT state and let user space receive an error to its
> connect() immediately.
>
> 2) Instead of a RST, allow the first SYN retransmit to happen immediately
> (This is kind of a challenge SYN. Kernel already implements challenge acks)
>
> 3) After RST is sent (to hopefully clear the state of the remote),
> schedule a SYN rtx in a few ms,
> instead of ~ one second.

Thank you for this kind comment, Eric! I would prefer the second and third
ideas over the first one. Anyway, I will send a patch soon. I will add a
kselftest for this case, too.


Thanks,
SeongJae Park

[...]
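
For reference, the "return 1 (4ms for CONFIG_HZ=250)" arithmetic quoted above
can be checked with a small user-space sketch. This is not kernel code: the
helper names timeout_jiffies_to_ms() and speedup_vs_default() and the
DEFAULT_SYN_RTO_MS constant are hypothetical, introduced only for
illustration, and the 1000 ms default is taken from the "one second" figure
mentioned in the thread.

```c
#include <assert.h>

/* Assumption: the ~1 s initial SYN retransmit timeout discussed in the thread. */
#define DEFAULT_SYN_RTO_MS 1000u

/*
 * Hypothetical helper: convert a timeout given in jiffies (timer ticks)
 * into milliseconds for a given CONFIG_HZ value (ticks per second).
 */
static unsigned int timeout_jiffies_to_ms(unsigned int jiffies, unsigned int hz)
{
	assert(hz > 0);
	return (unsigned int)(((unsigned long long)jiffies * 1000u) / hz);
}

/*
 * Hypothetical helper: how many times shorter than the default SYN
 * timeout a given jiffies-based timeout is.
 */
static unsigned int speedup_vs_default(unsigned int jiffies, unsigned int hz)
{
	unsigned int ms = timeout_jiffies_to_ms(jiffies, hz);

	assert(ms > 0);
	return DEFAULT_SYN_RTO_MS / ms;
}
```

With these helpers, timeout_jiffies_to_ms(1, 250) evaluates to 4, matching
the 4ms figure in the test above. The same return value of 1 would mean 10 ms
with CONFIG_HZ=100 and 1 ms with CONFIG_HZ=1000, so a fix expressed in
jiffies would behave differently across kernel configurations.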