Date: Thu, 7 Mar 2019 11:23:49 +0100
From: Pali Rohár
To: Steve Magnani
Cc: Jan Kara, reinoud@netbsd.org, Colin King, Vojtěch Vladyka, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC] udftools: steps towards fsck
Message-ID: <20190307102349.q5not23mve2soope@pali>
In-Reply-To: <17e5fea5-8d76-c96d-8902-9050acba4288@digidescorp.com>

On Wednesday 06 March 2019 20:44:54 Steve Magnani wrote:
> (Please remove at least LKML when responding. Mailing lists are a
> scattershot attempt to reach others who might be interested in this topic
> since I'm not aware of any linux-udf mailing list. )

IIRC there is no linux-udf mailing list, but I do not see any reason not
to use linux-fsdevel or linux-kernel.

> A few months ago I stumbled across an interesting bit of abandonware in the
> Sourceforge CVS repo that hosted UDF development through about 2004. Code
> that originated here eventually became the modern-day udftools:
>
>     https://sourceforge.net/p/linux-udf/code/

The udftools project was moved to GitHub: https://github.com/pali/udftools/

Ben (the original project developer) also updated the SourceForge page, and
you can see there a big blue box, "This project can now be found here.",
which points to GitHub.

> The 'udf' module in that repo contains a program from 1999 named 'chkudf',
> which appears to have been written by Rob Simms. Being from the Y2K era, the
> program has no awareness of anything beyond UDF2.01; in particular, its
> comprehension of VAT reflects UDF1.50 and not the revamped design introduced
> in UDF2.00. But it does have an ability to analyze the major UDF data
> structures and to walk the filesystem.

When the project page was moved to GitHub, I also converted the whole source
code history. You can find that old chkudf code there too:

https://github.com/pali/udftools/tree/87acf1a2306b7b60ed9d61b53c2a487ea5f3396c/src/chkudf

But I would like to let you know that Vojtěch (CCed) started working on a
udffsck implementation as part of his master's thesis, and the current WIP
code is available on GitHub in a pull request:

https://github.com/pali/udftools/pull/7

So it would be great if you could look at the new code and perhaps help
Vojtěch finish the new effort, rather than trying to port and fix 20-year-old
code which was already removed from the udftools project...

> I've spent quite a bit of time enhancing and fixing bugs in this code, with
> a short term goal of being able to report damage to UDF2.01 filesystems on
> "hard disk" (magnetic and SSD) media. It's not quite to the point of being
> release-ready, but I think the code is on the cusp of becoming useful to
> others so I wanted to get some feedback on the approach.
>
> I posted a GIT port (via SVN) of the CVS repo here, including all the
> changes I've made so far:
>
>     https://github.com/smagnani/chkudf.git
>
> If you're interested in building the code you should be able to just run
> 'make' within the chkudf folder. On Debian-derived systems you'll need
> libblkid-dev installed in order to build.
>
> Some questions for consideration:
>
> * Would a udffsck limited to checking of UDF2.01 and earlier on "hard disk"
>   media be a sufficiently useful starting point to justify inclusion in
>   udftools? Obviously a tool with such limitations would have to be
>   particularly vigilant about ensuring that media-under-test doesn't exceed
>   its capabilities.
>
> * If so, do you think the chkudf implementation could qualify? It's not
>   ready yet, but with an investment of some time and energy it could be made
>   more functionally complete and (maybe more importantly) more user-friendly.
>
>   In part this is a question of whether the chkudf design can support
>   enhancements to get (eventually) to UDF2.60 and optical media support,
>   balanced against the many years without an open-source udffsck and not
>   "letting the perfect become the enemy of the good."
>
> * For any standards-based parser it's important to have examples of as many
>   variations as possible (both normal and pathological) in order to ensure
>   that corner cases and less common features are tested properly. Can anyone
>   point me to any good sources of UDF data for testing? There are always
>   commercial DVDs and Blu-Ray discs, of course, and I've cobbled together a
>   few special cases by hand (i.e., a filesystem with directory cycles), but I
>   have no examples with extended attributes or stream data. If I could find a
>   DVD of Mac software in a resale shop would that help? [Side note, I've
>   thought of enhancing chkudf to support a tool that would store all the UDF
>   structures of a filesystem in a tarball that could be used to reconstitute
>   that filesystem within a sparse file. Since none of the file contents would
>   be stored the tarballs would be relatively small even if they represent
>   terabyte-scale filesystems.
>
> * Are there versions (or features) of UDF that are less important to support
>   than others (1.50? Strategy 4096? Named streams? etc.) I know 1.02, 2.01,
>   and 2.50 are in wide use.

Currently udftools supports UDF revisions 1.01, 1.02, 1.50, 2.00, 2.01 and,
for BD-R (without metadata partition), also 2.50 and 2.60.

> * What kinds of repairs are most important to implement? I was thinking that
>   regeneration of the Logical Volume Integrity Descriptor and the unallocated
>   space bitmap are both important and hopefully relatively straightforward.
>   Beyond that...recovering ICBs to "lost+found"?
>
>
> My 2 cents:
> I didn't write this program. There are things I would have done differently,
> but to this point I have tried to work within the existing design and code
> style. After becoming more aware of differences between the various UDF
> standards (in particular, the increase in complexity since 2.01) and the
> many errata involved, I have a gut feeling that an implementation in a
> language that supports inheritance might be a lot more manageable over the
> long term - but it's not something I've spent a lot of time thinking about.
> I've only recently become aware of UDFclient, and haven't had time to look
> over its design yet. And, I can see the potential for followon utilities
> such as a filesystem resizer - which might argue for making more of the code
> library-based and not so heavy on printed output.
>
> Bottom line...udffsck has to start somewhere, could it start with chkudf?
>
> Thanks for reading.
> ------------------------------------------------------------------------
>  Steven J. Magnani               "I claim this network for MARS!
>  www.digidescorp.com              Earthling, return my space modulator!"
>
>  #include

-- 
Pali Rohár
pali.rohar@gmail.com