Message-ID: <42ef8260-7f92-4312-9291-19301aea3c30@gmail.com>
Date: Fri, 1 Dec 2023 18:52:43 +0000
Subject: Re: io_uring: incorrect assumption about mutex behavior on unlock?
To: Jann Horn, Jens Axboe, io-uring
Cc: kernel list, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
From: Pavel Begunkov
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/1/23 16:41, Jann Horn wrote:
> mutex_unlock() has a different API contract compared to spin_unlock().
> spin_unlock() can be used to release ownership of an object, so that
> as soon as the spinlock is unlocked, another task is allowed to free
> the object containing the spinlock.
> mutex_unlock() does not support this kind of usage: The caller of
> mutex_unlock() must ensure that the mutex stays alive until
> mutex_unlock() has returned.
> (See the thread
>
> which discusses adding documentation about this.)
> (POSIX userspace mutexes are different from kernel mutexes, in
> userspace this pattern is allowed.)
>
> io_ring_exit_work() has a comment that seems to assume that the
> uring_lock (which is a mutex) can be used as if the spinlock-style API
> contract applied:
>
>     /*
>      * Some may use context even when all refs and requests have been put,
>      * and they are free to do so while still holding uring_lock or
>      * completion_lock, see io_req_task_submit(). Apart from other work,
>      * this lock/unlock section also waits them to finish.
>      */
>     mutex_lock(&ctx->uring_lock);
>

Oh crap.
I'll check if there are more suspects and patch it up, thanks.

> I couldn't find any way in which io_req_task_submit() actually still
> relies on this. I think io_fallback_req_func() now relies on it,
> though I'm not sure whether that's intentional. ctx->fallback_work is
> flushed in io_ring_ctx_wait_and_kill(), but I think it can probably be
> restarted later on via:

Yes, io_fallback_req_func() relies on it, and it can be spun up
asynchronously from different places, e.g. in-IRQ block request
completion.

> io_ring_exit_work -> io_move_task_work_from_local ->
> io_req_normal_work_add -> io_fallback_tw(sync=false) ->
> schedule_delayed_work
>
> I think it is probably guaranteed that ctx->refs is non-zero when we
> enter io_fallback_req_func, since I think we can't enter
> io_fallback_req_func with an empty ctx->fallback_llist, and the
> requests queued up on ctx->fallback_llist have to hold refcounted
> references to the ctx. But by the time we reach the mutex_unlock(), I
> think we're not guaranteed to hold any references on the ctx anymore,
> and so the ctx could theoretically be freed in the middle of the
> mutex_unlock() call?

Right, it comes with refs but loses them in between lock()/unlock().

> I think that to make this code properly correct, it might be necessary
> to either add another flush_delayed_work() call after ctx->refs has
> dropped to zero and we know that the fallback work can't be restarted
> anymore, or create an extra ctx->refs reference that is dropped in
> io_fallback_req_func() after the mutex_unlock(). (Though I guess it's
> probably unlikely that this goes wrong in practice.)

-- 
Pavel Begunkov
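
For illustration only, the second option quoted above (an extra reference that is dropped after the mutex_unlock()) can be sketched as a small userspace model. This is not the kernel code and not the eventual patch: struct demo_ctx, demo_ctx_get(), demo_ctx_put() and demo_fallback_work() are hypothetical names, a pthread mutex stands in for ctx->uring_lock, and a plain atomic counter stands in for ctx->refs. As the quoted text notes, POSIX mutexes already permit freeing right after unlock, so the sketch only shows the ordering of the reference drops, not the kernel failure itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ring context; not the kernel structure. */
struct demo_ctx {
    atomic_int refs;        /* models ctx->refs */
    pthread_mutex_t lock;   /* models ctx->uring_lock */
    bool *freed_flag;       /* lets the caller observe when the ctx is freed */
};

/* Allocate a context that starts with one reference (the queued request's). */
static struct demo_ctx *demo_ctx_alloc(bool *freed_flag)
{
    struct demo_ctx *ctx = malloc(sizeof(*ctx));
    if (!ctx)
        return NULL;
    atomic_init(&ctx->refs, 1);
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->freed_flag = freed_flag;
    *freed_flag = false;
    return ctx;
}

static void demo_ctx_get(struct demo_ctx *ctx)
{
    atomic_fetch_add(&ctx->refs, 1);
}

/* Drop one reference; the last put destroys the mutex and frees the ctx. */
static void demo_ctx_put(struct demo_ctx *ctx)
{
    if (atomic_fetch_sub(&ctx->refs, 1) == 1) {
        bool *flag = ctx->freed_flag;
        pthread_mutex_destroy(&ctx->lock);
        free(ctx);
        *flag = true;
    }
}

/*
 * Models the fallback work: the request's reference is consumed inside
 * the locked section, but an extra reference pins the ctx until the
 * unlock call has fully returned.
 */
static void demo_fallback_work(struct demo_ctx *ctx)
{
    demo_ctx_get(ctx);                 /* extra reference taken up front */
    pthread_mutex_lock(&ctx->lock);
    demo_ctx_put(ctx);                 /* request's reference dropped while locked */
    pthread_mutex_unlock(&ctx->lock);  /* ctx cannot be freed during this call */
    demo_ctx_put(ctx);                 /* extra reference dropped after unlock */
}
```

With this ordering the final put, and therefore the free, can only happen once the unlock has returned, which is the property the kernel mutex_unlock() contract requires.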