Received: by 10.213.65.68 with SMTP id h4csp3541046imn; Tue, 10 Apr 2018 00:06:44 -0700 (PDT) X-Google-Smtp-Source: AIpwx4+bCOnS4O+OVzpcY578ClGkR86iAQUbtEYhdh2Jr+M+Zm8w8MxnsrvvXllf3WuWrr20CqYp X-Received: by 2002:a17:902:d03:: with SMTP id 3-v6mr42333858plu.245.1523344004097; Tue, 10 Apr 2018 00:06:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1523344004; cv=none; d=google.com; s=arc-20160816; b=Q4z299SXLZybNqWUq3Wl19zYGIFBjj/yij2VAeE5OjQwn35t3TXjsQxfuBEKGbieGX Bpga1aBak63AoojysGrBFQcw6NuGRq9Yxn7BZ3DZnXOCABXRKoQeCN95iiDIt6q0vxC8 UDOiF7HQ5cnpwt7tk4wG14PD+2DczaBucmT9zJYablqIot9nR7xG4Z0s4FfYRltxLWNP XLVYAQxew0vDsQldVF6PlU9lccbcElG/LoK7aYIHGDOjFTeOrWDLzWw2UIgdbzJu/P02 wtm1w5xf0+tpJtgAkh7hHtWLldatxRhyl5LxF2e+ChsBSwSc/mq6fgucuABP4lIPZsYb YpHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=pgnq38VZZ6XrtCAn50LLsoVsqopMFm/vBuxZnjibbMQ=; b=e9RhsPj9Wok6lg2iQXNXZ2Vok9x0s6wrpFV8hApITaF6p/e/JqpwRBVoykiZ7OQZLZ Laml0bVF/DQe5BuTKUFkbKR5ofhY2DA3FPT4IbLd/qjfDlsmMNqpLeZoE/dw4Zisbg4n Hp+dEROQm0CPc69CexFDhkGDkVjiJWuPlr3Sluwg40sjk1FSSe2cF5dfmYerqKxMPgX9 zBWeWxRP3RCU7qtuW2QptRKGqXOL37SXDS7W0prvg7KYjqPlYEwXisPK4kzw/BhhglmB YiCAFi23YoRJMUF6CvBGVbsS24xaufvQxH7zR8rdo1MV1oL4qUW/9qLvCVqB9x8a4GUm 5/KQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p3-v6si2008877pld.512.2018.04.10.00.06.06; Tue, 10 Apr 2018 00:06:44 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752304AbeDJHCY (ORCPT + 99 others); Tue, 10 Apr 2018 03:02:24 -0400 Received: from metis.ext.pengutronix.de ([85.220.165.71]:36487 "EHLO metis.ext.pengutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751845AbeDJHCU (ORCPT ); Tue, 10 Apr 2018 03:02:20 -0400 Received: from ptx.hi.pengutronix.de ([2001:67c:670:100:1d::c0]) by metis.ext.pengutronix.de with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1f5nIE-0007EE-K0; Tue, 10 Apr 2018 09:02:14 +0200 Received: from sha by ptx.hi.pengutronix.de with local (Exim 4.89) (envelope-from ) id 1f5nIA-0002TY-Ed; Tue, 10 Apr 2018 09:02:10 +0200 Date: Tue, 10 Apr 2018 09:02:10 +0200 From: Sascha Hauer To: David Gstir Cc: linux-fscrypt@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mtd@lists.infradead.org, Richard Weinberger , linux-kernel@vger.kernel.org, kernel@pengutronix.de Subject: Re: [RFC] UBIFS authentication Message-ID: <20180410070210.4chxbjlijoohpo5c@pengutronix.de> References: <20180409095912.f7d35y7yjdnj2ffz@pengutronix.de> <9EEA8AD6-E46F-45B7-8A85-98E4CC82B1D3@sigma-star.at> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <9EEA8AD6-E46F-45B7-8A85-98E4CC82B1D3@sigma-star.at> X-Sent-From: Pengutronix Hildesheim X-URL: http://www.pengutronix.de/ X-IRC: #ptxdist @freenode X-Accept-Language: de,en X-Accept-Content-Type: text/plain X-Uptime: 08:16:54 up 109 days, 14:35, 72 users, load average: 0.25, 0.24, 0.19 User-Agent: NeoMutt/20170113 (1.7.2) X-SA-Exim-Connect-IP: 2001:67c:670:100:1d::c0 X-SA-Exim-Mail-From: sha@pengutronix.de X-SA-Exim-Scanned: No (on metis.ext.pengutronix.de); SAEximRunCond expanded to false X-PTX-Original-Recipient: linux-kernel@vger.kernel.org Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Apr 09, 2018 at 05:23:05PM +0200, David Gstir wrote: > Hi Sascha, > > > On 09.04.2018, at 11:59, Sascha Hauer wrote: > > > > Hi David, > > > > On Wed, Jan 17, 2018 at 04:19:14PM +0100, David Gstir wrote: > >> Hi everybody! > >> > >> ### Index Authentication > >> > >> Through UBIFS' concept of a wandering tree, it already takes care of only > >> updating and persisting changed parts from leaf node up to the root node > >> of the full B+ tree. This enables us to augment the index nodes of the tree > >> with a hash over each node's child nodes. As a result, the index basically also > >> a Merkle tree. Since the leaf nodes of the index contain the actual filesystem > >> data, the hashes of their parent index nodes thus cover all the file contents > >> and file metadata. When a file changes, the UBIFS index is updated accordingly > >> from the leaf nodes up to the root node including the master node. This process > >> can be hooked to recompute the hash only for each changed node at the same time. > >> Whenever a file is read, UBIFS can verify the hashes from each leaf node up to > >> the root node to ensure the node's integrity. > >> > >> To ensure the authenticity of the whole index, the UBIFS master node stores a > >> keyed hash (HMAC) over its own contents (which includes the location of the root > >> node) and the root node of the index tree. As mentioned above, the master node > >> is always written to the flash whenever the index is persisted (ie. on index > >> commit). > >> > >> Using this approach only UBIFS index nodes and the master node are changed to > >> include a hash. All other types of nodes will remain unchanged. This reduces > >> the storage overhead which is precious for users of UBIFS (ie. embedded > >> devices). > >> > >> > >> HMAC Master Node > >> | > >> ' ' ' ' ' ' ' ' ' ' ' ' ' '|' ' > >> ' +---------------+ o ' > >> ' | Master Node | ' > >> ' +---------------+ ' Hash Index Node #1 > >> ' | ' | > >> . . . . . . . .'. . . . . . v . . . . . . . . . . . . . . . .|. . . . . > >> . ' +---------------+ ' o . > >> . ' | Index Node #1 | ' . > >> . ' +---------------+ ' . > >> . ' | ... | (fanout: 8) . > >> . ' ' ' ' | ' ' ' ' | ' ' ' ' ' . > >> . +-------+ +------+ . > >> . | | . > >> . ' ' ' ' ' v ' ' ' ' ' ' ' ' ' ' ' v ' ' ' ' ' ' ' ' ' ' ' . > >> . ' +---------------+ ' ' +---------------+ ' . > >> . ' | Index Node #2 | ' ' | Index Node #3 | ' . > >> . ' +---------------+ ' ' +---------------+ ' . > >> . ' | ... ' ' | ... | ' . > >> . . . ' . v . . . . . . . '. ' . . . v . . . . v . . . . . . . .' . . . > >> ' +-----------+ ' '+----------+ +-----------+ ' > >> ' | Data Node | ' '| INO Node | | DENT Node | ' > >> ' +-----------+ ' '+----------+ +-----------+ ' > >> ' o ' ' o ' > >> ' '|' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '|' ' > >> | | > >> Hash Index Node #2 Hash Index Node #3 > > > > When a hash covers an index node and also its children then of course > > this is really space efficient, but this also means that in order to > > read a node we always have to read all of its children. Also adding a > > new leaf node means rehashing all siblings. Is it really worth paying > > this price to save a few bytes for more hashes? > > I would rather suggest to add a hash per child to each index node. > > What you propose is a simple tradeoff between the required on-flash storage > space and the amount of read operations needed to verify the index node integrity. > > To get some numbers here: In case of SHA256 and a fanout of 8, we'd store an > additional 224 bytes per index node and safe 7 index node reads per tree level. To get some more numbers I analyzed some typical rootfs here. It has: 16137 data nodes with total size 41802586 uncompressed data size: 62380950 2116 inode nodes with total size 347042 2115 dent nodes with total size 148790 2911 idx nodes with total size 547068 Total number of branches: 23278 With a SHA256 per branch this would add another 744896 bytes or 1.7% overhead to the image (1.1% if using the uncompressed data size as base). > > I hope it is clear that having a single hash does _not_ mean we have to read > the whole tree! :) > > Considering that UBIFS is mostly used in embedded where storage space is rather > precious, we opted for storing a single hash. But sure, storing a hash per > branch/child is a possibility and we have to discuss if that is acceptable > in most use cases. > > As for adding new leaf nodes: If we add a leaf node, we'll always have to recompute > the hash of every index node on the path to the root node. Even with a hash per > branch/child. In case of TNC split/merge operations likely even more. > But I don't get why you think that would mean we have to recompute the hash > on all siblings of that nodes? Or do you simply mean that we have to re-read all > those siblings to compute the hash on the parent? I mean that before we can use the "DENT Node" in the picture above we also have to verify the "INO Node". To get there, we also have to verify the unrelated "Index Node #2". so in reality we have to verify seven unrelated leaf nodes and another seven unrelated nodes per tree level before we can use a node. When changing nodes we have to make sure all the unrelated nodes stay in memory or we have to read them back from the device. Of course, some of the unrelated nodes will be used anyway in the next few moments, nevertheless I can imagine we pay a significant performance penalty for this. read/write throughput is another very precious resource on embedded systems ;) Sascha -- Pengutronix e.K. | | Industrial Linux Solutions | http://www.pengutronix.de/ | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |