From: Valentin Schneider
To: linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Sebastian Andrzej Siewior, Thomas Gleixner, Juri Lelli, Clark Williams, "Luis Claudio R. Goncalves"
Subject: [RT BUG] Stall caused by eventpoll, rwlocks and CFS bandwidth controller
Date: Thu, 12 Oct 2023 17:07:02 +0200

Hi folks,

We've had reports of stalls happening on our v6.0-ish frankenkernels, and while
we haven't been able to come up with a reproducer (yet), I don't see anything
upstream that would prevent them from happening.

The setup involves eventpoll, the CFS bandwidth controller and timer expiry,
and the sequence looks as follows (time-ordered):

p_read (on CPUn, CFS with bandwidth controller active)
======

ep_poll_callback()
  read_lock_irqsave()
  ...
  try_to_wake_up() <- enqueue causes an update_curr() + sets need_resched
                      due to having no more runtime
    preempt_enable()
      preempt_schedule() <- switch out due to p_read being now throttled

p_write
=======

ep_poll()
  write_lock_irq() <- blocks due to having active readers (p_read)

ktimers/n
=========

timerfd_tmrproc()
`\
  ep_poll_callback()
  `\
    read_lock_irqsave() <- blocks due to having active writer (p_write)

From this point we have a circular dependency:

  p_read -> ktimers/n (to replenish runtime of p_read)
  ktimers/n -> p_write (to let ktimers/n acquire the readlock)
  p_write -> p_read (to let p_write acquire the writelock)

IIUC reverting
  286deb7ec03d ("locking/rwbase: Mitigate indefinite writer starvation")
should unblock this as the ktimers/n thread wouldn't block, but then we're back
to having the indefinite starvation so I wouldn't necessarily call this a win.

Two options I'm seeing:
- Prevent p_read from being preempted when it's doing the wakeups under the
  readlock (icky)
- Prevent ktimers / ksoftirqd (*) from running the wakeups that have
  ep_poll_callback() as a wait_queue_entry callback. Punting that to e.g. a
  kworker /should/ do.

(*) It's not just timerfd, I've also seen it via net::sock_def_readable - it
should be anything that's pollable.

I'm still scratching my head on this, so any suggestions/comments welcome!

Cheers,
Valentin
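
For illustration only, the circular dependency above can be modelled as a toy
wait-for graph in Python. This is a sketch, not kernel code: the WAITS_FOR
table and the find_cycle() helper are names made up for this example, and each
task is assumed to wait on exactly one other task. Dropping the
ktimers/n -> p_write edge stands in for the reader no longer blocking behind a
pending writer, which is what the revert mentioned above would amount to.

```python
# Toy model of the wait-for relationships described in this report.
# Hypothetical names; nothing here corresponds to a real kernel API.
WAITS_FOR = {
    "p_read": "ktimers/n",   # needs the timer softirq to replenish its runtime
    "ktimers/n": "p_write",  # reader blocked behind the pending writer
    "p_write": "p_read",     # writer blocked behind the active reader
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    position = {}  # task -> index in path, to cut the cycle out of the walk
    node = start
    while node is not None and node not in position:
        position[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ended on a task that is not waiting
    return path[position[node]:] + [node]


cycle = find_cycle(WAITS_FOR, "p_read")
print(" -> ".join(cycle) if cycle else "no cycle")

# Same graph, but with the reader no longer blocking behind the writer.
no_reader_block = {k: v for k, v in WAITS_FOR.items() if k != "ktimers/n"}
print(find_cycle(no_reader_block, "p_read"))
```

With all three edges present the walk from p_read comes back to p_read; with
the ktimers/n edge removed the chain ends at ktimers/n and no cycle is found.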