Subject: Re: [PATCH] io_uring: add support for barrier fsync
To: Chris Mason
Cc: Christoph Hellwig, linux-fsdevel,
    "linux-block@vger.kernel.org",
    "linux-api@vger.kernel.org",
    "linux-kernel@vger.kernel.org"
From: Jens Axboe
Date: Tue, 9 Apr 2019 12:46:15 -0600
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/9/19 12:42 PM, Chris Mason wrote:
> On 9 Apr 2019, at 14:23, Jens Axboe wrote:
>
>> On 4/9/19 12:17 PM, Christoph Hellwig wrote:
>>> On Tue, Apr 09, 2019 at 10:27:43AM -0600, Jens Axboe wrote:
>>>> It's a quite common use case to issue a bunch of writes,
>>>> then an fsync or fdatasync when they complete. Since io_uring
>>>> doesn't guarantee any type of ordering, the application must
>>>> track issued writes and wait with the fsync issue until they
>>>> have completed.
>>>>
>>>> Add an IORING_FSYNC_BARRIER flag that helps with this so the
>>>> application doesn't have to do this manually. If this flag is
>>>> set for the fsync request, we won't issue it until pending IO
>>>> has already completed.
>>>
>>> I think we need a much more detailed explanation of the semantics,
>>> preferably in man page format.
>>>
>>> Barrier at least in Linux traditionally means all previously
>>> submitted requests have finished and no new ones are started until
>>> the barrier request finishes, which is very heavy handed. Is that
>>> what this is supposed to do? If not what are the exact guarantees
>>> vs ordering and or barrier semantics?
>>
>> The patch description isn't that great, and maybe the naming isn't
>> that intuitive either. The way it's implemented, the fsync will NOT
>> be issued until previously issued IOs have completed. That means
>> both reads and writes, since there's no way to wait for just one.
>> In terms of semantics, any previously submitted writes will have
>> completed before this fsync is issued. The barrier fsync has no
>> ordering wrt future writes, no ordering is implied there. Hence:
>>
>> W1, W2, W3, FSYNC_W_BARRIER, W4, W5
>>
>> W1..3 will have been completed by the hardware side before we start
>> FSYNC_W_BARRIER. We don't wait with issuing W4..5 until after the
>> fsync completes, no ordering is provided there.
>
> Looking at the patch, why is fsync special? Seems like you could add
> this ordering bit to any write?

It's really not, the exact same technique could be used on any type of
command to imply ordering. My initial idea was to have an explicit
barrier/ordering command, but I didn't think that separating it from an
actual command would be needed/useful.
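
To make the W1..W5 ordering concrete, here is a toy model in Python. It
is not kernel code and does not use the io_uring API; the names ToyRing,
submit, complete and the barrier argument are made up for illustration.
It only models the rule described in this thread: a request carrying the
ordering bit is held back until everything submitted before it has
completed, and later requests are issued without waiting for it.

```python
class ToyRing:
    """Toy model of the barrier rule; NOT the io_uring implementation."""

    def __init__(self):
        self.inflight = set()  # issued, not yet completed
        self.deferred = []     # (name, set of earlier requests it waits on)
        self.issued = []       # order in which requests were actually issued

    def _issue(self, name):
        self.inflight.add(name)
        self.issued.append(name)

    def submit(self, name, barrier=False):
        # "Previously submitted" covers in-flight requests and any
        # request that is itself still held back.
        earlier = set(self.inflight) | {req for req, _ in self.deferred}
        if barrier and earlier:
            self.deferred.append((name, earlier))
        else:
            self._issue(name)

    def complete(self, name):
        self.inflight.remove(name)
        still_waiting = []
        for req, waiting_on in self.deferred:
            waiting_on.discard(name)
            if waiting_on:
                still_waiting.append((req, waiting_on))
            else:
                self._issue(req)  # everything before it has completed
        self.deferred = still_waiting


ring = ToyRing()
for req in ("W1", "W2", "W3"):
    ring.submit(req)
ring.submit("FSYNC", barrier=True)  # held: W1..W3 are still in flight
ring.submit("W4")                   # not ordered against the fsync
ring.submit("W5")
issued_before = list(ring.issued)   # FSYNC is not in this list yet

for req in ("W2", "W1", "W3"):      # completions can arrive in any order
    ring.complete(req)
print(issued_before)
print(ring.issued)
```

Nothing in the model is specific to fsync: the barrier argument can be
passed with any request name, which matches the point that the same
ordering bit could apply to any command type.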
> While you're here, do you want to add a way to FUA/cache flush?
> Basically the rest of what user land would need to make their own
> write-back-cache-safe implementation.

FUA would be a WRITEV/WRITE_FIXED flag, that should be trivially doable.
In terms of cache flush, that's very heavy handed (and storage
oriented). What applications would want/need to do an explicit whole
device flush?

-- 
Jens Axboe
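
On the per-write flag idea: this thread does not define any io_uring
write flag for FUA, so the sketch below uses a different, existing
interface as a stand-in. It relies on pwritev2() with RWF_DSYNC, which
Python exposes as os.pwritev(fd, buffers, offset, flags) on Linux. That
flag makes one write behave as if the file had been opened with O_DSYNC;
whether the kernel services it with a FUA write or with a write plus a
cache flush depends on the I/O path and the device, and is an assumption
here rather than something the thread states. The helper name
durable_write and the temporary file are illustrative only.

```python
import os
import tempfile


def durable_write(fd, data, offset):
    """Write data at offset and return once it should be durable.

    Prefers a per-write RWF_DSYNC flag where available; otherwise falls
    back to a plain write followed by fdatasync (or fsync).
    Short writes are not retried in this sketch.
    """
    dsync = getattr(os, "RWF_DSYNC", None)
    if dsync is not None and hasattr(os, "pwritev"):
        try:
            return os.pwritev(fd, [data], offset, dsync)
        except OSError:
            pass  # kernel or filesystem rejected the flag; fall back
    written = os.pwrite(fd, data, offset)
    getattr(os, "fdatasync", os.fsync)(fd)
    return written


record = b"journal record\n"
with tempfile.TemporaryFile() as f:
    n = durable_write(f.fileno(), record, 0)
    f.seek(0)
    data = f.read()
print(n, data)
```

The design point is the same one made above: durability is requested per
write, for that write only, instead of through a flush of the whole
device.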