From: Eric Dumazet
Date: Tue, 28 Feb 2023 12:23:46 +0100
Subject: Re: [syzbot] [net?] INFO: task hung in tls_sw_sendpage (3)
To: Jakub Kicinski
Cc: syzbot, borisp@nvidia.com, bpf@vger.kernel.org, davem@davemloft.net,
    john.fastabend@gmail.com, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, pabeni@redhat.com,
    syzkaller-bugs@googlegroups.com
References: <000000000000e412e905f5b46201@google.com> <20230227155352.3399bb10@kernel.org>
In-Reply-To: <20230227155352.3399bb10@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 28, 2023 at 12:53 AM Jakub Kicinski wrote:
>
> On Mon, 27 Feb 2023 21:35:41 +0100 Eric Dumazet wrote:
> > This looks suspicious to me
> >
> > commit 79ffe6087e9145d2377385cac48d0d6a6b4225a5
> > Author: Jakub Kicinski
> > Date:   Tue Nov 5 14:24:35 2019 -0800
> >
> >     net/tls: add a TX lock
> >
> >
> > If tls_sw_sendpage() has to call sk_stream_wait_memory(),
> > sk_stream_wait_memory() is properly releasing the socket lock,
> > but knows nothing about mutex_{un}lock(&tls_ctx->tx_lock);
>
> That's supposed to be the point of the lock, prevent new writers from
> messing with the partially pushed records when the original writer
> is waiting for write space.
>
> Obvious hack but the async crypto support makes TLS a bit of a mess :|
>
> sendpage_lock not taking tx_lock may lead to obvious problems, I'm not
> seeing where the deadlock is, tho..
>

This report mentions sendpage, but sendmsg() would have the same issue.

A thread might be blocked in sk_stream_wait_memory() with the mutex
held, for an arbitrary amount of time, say if the remote peer stays in
RWIN 0 for hours.

This prevents tx_work from making progress, and
tls_sw_cancel_work_tx() would be stuck forever.

The consensus is that the kernel shouts a warning if a thread has been
waiting on a mutex more than 120 seconds
(check_hung_uninterruptible_tasks())
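
To make the scenario above concrete, here is a small userspace sketch of the same locking pattern. It is only an analogy, not kernel code: `simulate`, `tx_lock`, `sock_lock`, `write_space` and `hung_timeout` are hypothetical names, Python threads stand in for the sender and for tx_work, and a short timeout stands in for the 120 second hung-task threshold. It assumes, as the message implies, that the worker needs the same mutex the blocked sender is holding.

```python
import threading


def simulate(hung_timeout=0.2):
    """Userspace analogy of a sender sleeping with tx_lock held.

    Returns a tuple of three booleans:
      (worker got tx_lock while the sender was blocked,
       socket lock was free while the sender was blocked,
       worker got tx_lock after write space became available)
    """
    tx_lock = threading.Lock()          # stands in for tls_ctx->tx_lock
    sock_lock = threading.Lock()        # stands in for the socket lock
    write_space = threading.Event()     # set when the peer reopens its window
    sender_blocked = threading.Event()  # set once the sender is sleeping

    def sender():
        # Analogy for the send path: take tx_lock, then the socket lock.
        with tx_lock:
            sock_lock.acquire()
            # Analogy for sk_stream_wait_memory(): the socket lock is
            # dropped while sleeping, but tx_lock stays held.
            sock_lock.release()
            sender_blocked.set()
            write_space.wait()
            sock_lock.acquire()
            sock_lock.release()

    thread = threading.Thread(target=sender)
    thread.start()
    sender_blocked.wait()

    # Analogy for tx_work: it needs tx_lock and gives up after the timeout,
    # which is where a hung-task style warning would be reported.
    got_tx_while_blocked = tx_lock.acquire(timeout=hung_timeout)
    if got_tx_while_blocked:
        tx_lock.release()

    # The socket lock itself is not the problem; it was released.
    got_sock_while_blocked = sock_lock.acquire(timeout=hung_timeout)
    if got_sock_while_blocked:
        sock_lock.release()

    # The peer finally opens its window; the sender finishes and unlocks.
    write_space.set()
    thread.join()

    got_tx_after_wakeup = tx_lock.acquire(timeout=hung_timeout)
    if got_tx_after_wakeup:
        tx_lock.release()

    return got_tx_while_blocked, got_sock_while_blocked, got_tx_after_wakeup


if __name__ == "__main__":
    print(simulate())
```

The point of the sketch is that the worker cannot make progress for as long as the peer withholds write space, even though the socket lock is free during that time. How the real code should avoid this is not something the sketch tries to answer.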