From: Linus Torvalds
Date: Sun, 13 Aug 2023 09:41:33 -0700
Subject: Re: [PATCH] tracing: Fix race when concurrently splice_read trace_pipe
To: Steven Rostedt
Cc: Zheng Yejian, mhiramat@kernel.org, laijs@cn.fujitsu.com,
    linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
    "Jason A. Donenfeld", Jens Axboe, Al Viro

On Sat, 12 Aug 2023 at 18:13, Steven Rostedt wrote:
>
> So the test case cannot be run because the "sendfile" on the
> trace_pipe now fails?

So sendfile() has always required a seekable source, because it -
intentionally - doesn't do the "splice to pipe and then splice from
pipe to destination".

It just does a "splice from source, splice result to destination".

That sounds like "obviously the right thing to do", but it means that
there is now no buffer with any extended lifetime between the two
operations. And that's a big deal.

Without that buffer (ie pipe) in the middle, if the splice to
destination fails - or is partial - and the system call is interrupted
by a signal, then the source splice data is now gone, gone, gone.

So it only works if the source can then re-create the data - ie if the
source is seekable.

In that case, if the destination cannot accept all the data, the
sendfile can just go back and read again from the last successfully
written position.

And if you are a data stream that can't seek, then that "from last
successfully written position" doesn't work, and any partial writes
will just have dropped the data.

This seekability test *used* to be to test that the source was either
a regular file or a block device. Now it literally checks "can I
seek".

It looks like the trace_pipe file is *claiming* to be a regular file -
so sendfile() used to be ok with it - but cannot actually seek - so
sendfile would silently drop data.
Now sendfile says "I'm sorry, Dave, I'm afraid I can't do that" rather
than silently causing data loss.

Now, this is not a user-visible regression in this case, because "cat"
will always fall back on just regular read/write when sendfile fails.

So all that changed for 'trace_pipe' is that a buggy splice now no
longer triggers the bug that it used to (which the patch in question
was also trying to fix).

End result: things in many ways work better this way.

So it really looks like it never really worked before either. There
was *both* the dropped data bug because "trace_pipe" lied about being
a regular file, *and* apparently this trace_pipe race bug that Zheng
Yejian tried to fix.

Of course, some destinations always accept a full write, so maybe we
could relax the "source must be seekable" to be "source must be
seekable, _OR_ destination must be something that never returns
partial writes".

So sendfile to a POSIX-compliant regular file could always work
(ignoring filesystem full situations, and at that point I think we can
say "yeah, we're done, no recovering from that in sendfile()").

So if somebody really *really* wants sendfile to work for this case,
then you

 (a) do need to fix the race in tracing_splice_read_pipe (which you
     should do anyway, since you can obviously always use splice()
     itself, not sendfile()).

AND

 (b) figure out when 'splice_write()' will always succeed fully and
     convince people that we can do that "extended version" of
     sendfile().

Hmm?

              Linus
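
For illustration only, the userspace fallback mentioned above ("cat"
falling back on regular read/write when sendfile fails) might look
roughly like the sketch below. This is not the actual coreutils code;
the helper names copy_fd() and copy_read_write() and the buffer sizes
are made up for the example. The assumption is that sendfile() reports
EINVAL (or ENOSYS) when it refuses the descriptor pair, which is what
the sendfile(2) man page suggests callers handle. In the fallback path
the userspace buffer plays the role of the missing pipe: the data stays
around until write() has taken all of it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Plain read/write loop. The userspace buffer keeps the data alive until
 * write() has accepted all of it, so a partial write loses nothing.
 * Returns the number of bytes copied, or -1 with errno set.
 */
static ssize_t copy_read_write(int in_fd, int out_fd)
{
	char buf[65536];
	ssize_t total = 0;

	for (;;) {
		ssize_t n = read(in_fd, buf, sizeof(buf));

		if (n == 0)
			return total;		/* end of input */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* nothing was read, retry */
			return -1;
		}
		/* Keep writing until the whole chunk has been accepted. */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(out_fd, buf + off, (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		total += n;
	}
}

/*
 * Hypothetical helper: try sendfile() first, and switch to the
 * read/write loop if the kernel refuses this source/destination pair.
 * Returns the number of bytes copied, or -1 with errno set.
 */
static ssize_t copy_fd(int in_fd, int out_fd)
{
	ssize_t total = 0;

	assert(in_fd >= 0 && out_fd >= 0);

	for (;;) {
		/* NULL offset: use and advance in_fd's own file position. */
		ssize_t n = sendfile(out_fd, in_fd, NULL, 1 << 20);

		if (n == 0)
			return total;		/* end of input */
		if (n < 0) {
			if (errno == EINTR)
				continue;
			if (errno == EINVAL || errno == ENOSYS) {
				/* Refused: copy the rest the slow way. */
				ssize_t rest = copy_read_write(in_fd, out_fd);

				return rest < 0 ? -1 : total + rest;
			}
			return -1;
		}
		total += n;
	}
}
```

With a non-seekable source such as a pipe, the sendfile() attempt is
expected to be refused on kernels that do the "can I seek" check, and
the copy then proceeds through the read/write loop without dropping
data.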