Subject: Re: [PATCH net-next v2] netlink: introduce netlink poll to resolve fast return issue
From: Paolo Abeni
To: Jong eon Park, "David S. Miller", Eric Dumazet, Jakub Kicinski
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, dongha7.kang@samsung.com
Date: Wed, 15 Nov 2023 15:49:16 +0100
In-Reply-To: <20231114090748.694646-1-jongeon.park@samsung.com>
References: <20231114090748.694646-1-jongeon.park@samsung.com>

Hi,

I'm sorry for the delayed feedback.

On Tue, 2023-11-14 at 18:07 +0900, Jong eon Park wrote:
> In very rare cases, there was an issue where a user's 'poll' function
> waiting for a uevent would continuously return very quickly, causing
> excessive CPU usage due to the following scenario.
>
> When sk_rmem_alloc exceeds sk_rcvbuf, netlink_broadcast_deliver returns an
> error and netlink_overrun is called.
> However, if netlink_overrun was
> called in a context just before a another context returns from the 'poll'
> and 'recv' is invoked, emptying the rcv queue, sk->sk_err = ENOBUF is
> written to the netlink socket belatedly and it enters the
> NETLINK_S_CONGESTED state. If the user does not check for POLLERR, they
> cannot consume and clean sk_err and repeatedly enter the situation where
> they call 'poll' again but return immediately. Moreover, in this
> situation, rcv queue is already empty and NETLINK_S_CONGESTED flag
> prevents any more incoming packets. This makes it impossible for the user
> to call 'recv'.
>
> This "congested" situation is a bit ambiguous. The queue is empty, yet
> 'congested' remains. This means kernel can no longer deliver uevents
> despite the empty queue, and it lead to the persistent 'congested' status.
>
> ------------CPU1 (kernel)----------  --------------CPU2 (app)--------------
> ...
> a driver delivers uevent.            poll was waiting for schedule.
> a driver delivers uevent.
> a driver delivers uevent.
> ...
> 1) netlink_broadcast_deliver fails.
> (sk_rmem_alloc > sk_rcvbuf)
>                                      getting schedule and poll returns,
>                                      and the app calls recv.
>                                      (rcv queue is empied)
>                                      2)
>
> netlink_overrun is called.
> (NETLINK_S_CONGESTED flag is set,
> ENOBUF is written in sk_err and,
> wake up poll.)
>                                      finishing its job and call poll.
>                                      poll returns POLLERR.
>
>                                      (the app doesn't have POLLERR handler)
>                                      it calls poll, but getting POLLERR.
>                                      it calls poll, but getting POLLERR.
>                                      it calls poll, but getting POLLERR.
>                                      ...
>
> To address this issue, I would like to introduce the following netlink
> poll.

IMHO the above is an application bug, and should not be addressed in the
kernel.

If you want to limit the amount of CPU time your application could use,
you have to resort to process scheduler settings and/or container
limits: nothing could prevent a [buggy?]
application from doing:

# in shell script
while true; do :; done

The condition you describe is IMHO not very different from the loop above:
the application is requesting POLLERR events and not processing them.
To be more accurate, it is like looping on poll(), getting a read event,
and never reading any data. Nothing we should address in the kernel.

Cheers,

Paolo
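
For illustration only, here is a minimal userspace sketch of the kind of POLLERR handling the application side would need. It is not taken from the patch or from any existing application: the helper names (consume_socket_error, wait_and_handle) and the poll_result enum are hypothetical, and it assumes that reading SO_ERROR is an acceptable way to consume the pending socket error. A real uevent listener would probably also want to resynchronize its state after an ENOBUFS overrun.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for one poll/handle round. */
enum poll_result { PR_TIMEOUT, PR_DATA, PR_SOCK_ERROR, PR_FAILED };

/*
 * Hypothetical helper: fetch the socket's pending error via SO_ERROR,
 * which also clears it. Returns the errno value (0 if none pending),
 * or -1 if getsockopt() itself fails.
 */
static int consume_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/*
 * Hypothetical helper: wait for one event on fd and act on it.
 * POLLERR is checked first so a pending error is consumed instead of
 * being reported again by the next poll() call.
 */
static enum poll_result wait_and_handle(int fd, char *buf, size_t buflen,
                                        int timeout_ms, ssize_t *nread)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n;

    assert(buf != NULL && nread != NULL);
    *nread = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return PR_FAILED;
    if (n == 0)
        return PR_TIMEOUT;

    if (pfd.revents & POLLERR) {
        int err = consume_socket_error(fd);

        if (err > 0)
            fprintf(stderr, "socket error: %s\n", strerror(err));
        return PR_SOCK_ERROR;
    }

    if (pfd.revents & POLLIN) {
        *nread = recv(fd, buf, buflen, 0);
        return *nread >= 0 ? PR_DATA : PR_FAILED;
    }

    /* POLLHUP, POLLNVAL or anything else unexpected. */
    return PR_FAILED;
}
```

A caller would loop on wait_and_handle() and treat PR_SOCK_ERROR as "events may have been lost", rather than calling poll() again without touching the socket.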