Subject: Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem
From: Alexander Larsson
To: Jingbo Xu, Gao Xiang, Amir Goldstein, gscrivan@redhat.com, brauner@kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, david@fromorbit.com, viro@zeniv.linux.org.uk, Vivek Goyal, Miklos Szeredi
Date: Wed, 01 Feb 2023 10:46:54 +0100

On Wed, 2023-02-01 at 12:28 +0800, Jingbo Xu wrote:
> Hi all,
>
> There are some updated performance statistics with different
> combinations on my test environment if you are interested.
>
>
> On 1/27/23 6:24 PM, Gao Xiang wrote:
> > ...
> >
> > I've made a version and did some test, it can be fetched from:
> > git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b experimental
> >
>
> Setup
> ======
> CPU: x86_64 Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
> Disk: 6800 IOPS upper limit
> OS: Linux v6.2 (with composefs v3 patchset)

For the record, what was the filesystem backing the basedir files?

> I build erofs/squashfs images following the scripts attached on [1],
> with each file in the rootfs tagged with "metacopy" and "redirect"
> xattr.
>
> The source rootfs is from the docker image of tensorflow [2].
>
> The erofs images are built with mkfs.erofs with support for sparse
> file added [3].
>
> [1] https://lore.kernel.org/linux-fsdevel/5fb32a1297821040edd8c19ce796fc0540101653.camel@redhat.com/
> [2] https://hub.docker.com/layers/tensorflow/tensorflow/2.10.0/images/sha256-7f9f23ce2473eb52d17fe1b465c79c3a3604047343e23acc036296f512071bc9?context=explore
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/commit/?h=experimental&id=7c49e8b195ad90f6ca9dfccce9f6e3e39a8676f6
>
>
>
> Image size
> ===========
> 6.4M large.composefs
> 5.7M large.composefs.w/o.digest (w/o --compute-digest)
> 6.2M large.erofs
> 5.2M large.erofs.T0 (with -T0, i.e. w/o nanosecond timestamp)
> 1.7M large.squashfs
> 5.8M large.squashfs.uncompressed (with -noI -noD -noF -noX)
>
> (large.erofs.T0 is built without nanosecond timestamp, so that we get
> smaller disk inode size (same with squashfs).)
>
>
> Runtime Perf
> =============
>
> The "uncached" column is tested with:
> hyperfine -p "echo 3 > /proc/sys/vm/drop_caches" "ls -lR $MNTPOINT"
>
>
> While the "cached" column is tested with:
> hyperfine -w 1 "ls -lR $MNTPOINT"
>
>
> erofs and squashfs are mounted with loopback device.
>
>
>                                   | uncached(ms)| cached(ms)
> ----------------------------------|-------------|-----------
> composefs (with digest)           | 326         | 135
> erofs (w/o -T0)                   | 264         | 172
> erofs (w/o -T0) + overlayfs       | 651         | 238
> squashfs (compressed)             | 538         | 211
> squashfs (compressed) + overlayfs | 968         | 302

Clearly erofs with sparse files is the best fs now for the ro-fs +
overlay case. But still, we can see that the additional cost of the
overlayfs layer is not negligible.

According to Amir this could be helped by a special composefs-like mode
in overlayfs, but it's unclear what performance that would reach, and
we're then talking net new development that further complicates the
overlayfs codebase. It's not clear to me which alternative is easier to
develop/maintain.

Also, the difference between cached and uncached here is less than in
my tests. Probably because my test image was larger. With the test
image I use, the results are:

                                  | uncached(ms)| cached(ms)
----------------------------------|-------------|-----------
composefs (with digest)           | 681         | 390
erofs (w/o -T0) + overlayfs       | 1788        | 532
squashfs (compressed) + overlayfs | 2547        | 443

I gotta say it is weird though that squashfs performed better than
erofs in the cached case. May be worth looking into.

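To make the "not negligible" part concrete, here is a rough sketch that
turns the two tables above into slowdown ratios. The timings are copied
from the tables; the names JINGBO, ALEX and slowdown() are made up for
this illustration and are not part of either test setup.

```python
# Timings in ms for "ls -lR", copied from the two tables in this mail.
# Each entry is (uncached, cached).
JINGBO = {
    "composefs": (326, 135),
    "erofs": (264, 172),
    "erofs+overlayfs": (651, 238),
    "squashfs": (538, 211),
    "squashfs+overlayfs": (968, 302),
}
ALEX = {
    "composefs": (681, 390),
    "erofs+overlayfs": (1788, 532),
    "squashfs+overlayfs": (2547, 443),
}


def slowdown(table, name, baseline):
    """Return (uncached, cached) time ratios of `name` over `baseline`."""
    uncached, cached = table[name]
    base_uncached, base_cached = table[baseline]
    return round(uncached / base_uncached, 2), round(cached / base_cached, 2)


if __name__ == "__main__":
    # Cost of stacking overlayfs on top of the same read-only fs.
    for fs in ("erofs", "squashfs"):
        print(fs, "+ overlayfs vs", fs, slowdown(JINGBO, fs + "+overlayfs", fs))
    # Cost of each stacked setup relative to composefs, for both images.
    for label, table in (("Jingbo", JINGBO), ("Alex", ALEX)):
        for fs in ("erofs+overlayfs", "squashfs+overlayfs"):
            print(label, fs, "vs composefs", slowdown(table, fs, "composefs"))
```

A ratio above 1.0 means the first configuration is slower than the
baseline. This only compares the mean values quoted above, so it says
nothing about run-to-run variance.
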
The test data I'm using is available here:
https://my.owndrive.com/index.php/s/irHJXRpZHtT3a5i

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
=-=-=
 Alexander Larsson                                            Red Hat, Inc
       alexl@redhat.com            alexander.larsson@gmail.com
He's a lonely flyboy grifter living undercover at Ringling Bros. Circus.
She's a virginal thirtysomething former first lady looking for love in
all the wrong places. They fight crime!