2005-04-21 04:23:13

by David Woodhouse

Subject: Git-commits mailing list feed.

As of some time in the fairly near future, the bk-commits-head@vger mailing
list will be carrying real commits from Linus' live git repository, instead
of just testing patches. Have fun.

--
dwmw2


2005-04-21 06:27:39

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

David Woodhouse wrote:
> As of some time in the fairly near future, the bk-commits-head@vger mailing
> list will be carrying real commits from Linus' live git repository, instead
> of just testing patches. Have fun.
>

What about the daily snapshots? Is there any eta when they'll be back?

--
Jan

2005-04-21 06:35:27

by David Woodhouse

Subject: Re: Git-commits mailing list feed.

On Thu, 2005-04-21 at 08:24 +0200, Jan Dittmer wrote:
> What about the daily snapshots? Is there any eta when they'll be back?

Those were done by Jeff, not me. I'm planning to fix up the web page
which lists individual commits some time next week, and if Jeff wants me
to I could start generating daily snapshots too. It's unlikely to happen
until after I get back from linux.conf.au though.

--
dwmw2

2005-04-21 10:29:13

by Arjan van de Ven

Subject: Re: Git-commits mailing list feed.

On Thu, 2005-04-21 at 14:22 +1000, David Woodhouse wrote:
> As of some time in the fairly near future, the bk-commits-head@vger mailing
> list will be carrying real commits from Linus' live git repository, instead
> of just testing patches. Have fun.
>

with BK this was not possible, but could we please have -p added to the
diff parameters with git? It makes diffs a LOT more readable!


2005-04-21 12:24:41

by David Woodhouse

Subject: Re: Git-commits mailing list feed.

On Thu, 2005-04-21 at 12:29 +0200, Arjan van de Ven wrote:
> with BK this was not possible, but could we please have -p added to the
> diff parameters with git? It makes diffs a LOT more readable!

With BK this was not possible, but could you please provide your
criticism in 'diff -up' form?

I've done 'perl -pi -e s/-u/-up/ gitdiff-do' as a quick hack to provide
what you want, but a saner fix to make gitdiff-do obey the same
GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as show-diff.c
would be a more useful answer.

--
dwmw2
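
The "saner fix" described above could look roughly like the following. This is a minimal sketch in Python rather than the shell of gitdiff-do, and it only assumes that GIT_DIFF_CMD names the diff program and GIT_DIFF_OPTS holds its options. The function name `diff_command` and the fallback values ("diff" and "-up") are assumptions made for illustration, not taken from show-diff.c.

```python
import os
import shlex

def diff_command(old_path, new_path, env=None):
    """Build a diff argv that honours GIT_DIFF_CMD and GIT_DIFF_OPTS."""
    env = os.environ if env is None else env
    cmd = env.get("GIT_DIFF_CMD", "diff")    # assumed default program
    opts = env.get("GIT_DIFF_OPTS", "-up")   # assumed default options
    # shlex.split lets either variable carry several words.
    return shlex.split(cmd) + shlex.split(opts) + [old_path, new_path]

# With an empty environment the defaults apply.
print(diff_command("a/file.c", "b/file.c", env={}))
# An explicit GIT_DIFF_OPTS replaces the default options.
print(diff_command("a/file.c", "b/file.c", env={"GIT_DIFF_OPTS": "-u -w"}))
```

Reading the options from the environment means nobody has to edit the script again when the next person asks for a different diff flavour.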

2005-04-22 00:53:28

by Greg KH

Subject: Re: Git-commits mailing list feed.

On Thu, Apr 21, 2005 at 08:24:36AM +0200, Jan Dittmer wrote:
> David Woodhouse wrote:
> > As of some time in the fairly near future, the bk-commits-head@vger mailing
> > list will be carrying real commits from Linus' live git repository, instead
> > of just testing patches. Have fun.
> >
>
> What about the daily snapshots? Is there any eta when they'll be back?

The script that generated this was posted previously on lkml. If anyone
wants to hack that up to generate the snapshots, it would be greatly
appreciated.

thanks,

greg k-h

2005-04-22 08:04:22

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

Greg KH wrote:
> On Thu, Apr 21, 2005 at 08:24:36AM +0200, Jan Dittmer wrote:
>
>>David Woodhouse wrote:
>>
>>>As of some time in the fairly near future, the bk-commits-head@vger mailing
>>>list will be carrying real commits from Linus' live git repository, instead
>>>of just testing patches. Have fun.
>>>
>>
>>What about the daily snapshots? Is there any eta when they'll be back?
>
>
> The script that generated this was posted previously on lkml. If anyone
> wants to hack that up to generate the snapshots, it would be greatly
> appreciated.

Care to point out the post? I can't seem to find it. The only thing I
found is Jeff Garzik announcing in 8/04 that the snapshots work again,
but with no script attached.

Thanks,

Jan

2005-04-23 12:59:55

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

#!/usr/bin/python
#
# ketchup v0.9-pre "self-contained corner case"
# http://selenic.com/ketchup/
#
# Copyright 2004 Matt Mackall <[email protected]>
#
# This software may be used and distributed according to the terms
# of the GNU General Public License, incorporated herein by reference.
#
# Usage:
#
# in an existing kernel directory, run:
#
# ketchup <version>
#
# where version is a complete kernel version, or a branch name to grab
# the latest version

import re, sys, urllib, os, getopt, glob

def error(*args):
    sys.stderr.write("ketchup: ")
    for a in args:
        sys.stderr.write(str(a))
    sys.stderr.write("\n")

def fancyopts(args, options, state, syntax=''):
    long=[]
    short=''
    map={}
    dt={}

    def help(state, opt, arg, options=options, syntax=syntax):
        print "Usage: ", syntax

        for s, l, d, c in options:
            opt=' '
            if s: opt = opt + '-' + s + ' '
            if l: opt = opt + '--' + l + ' '
            if d: opt = opt + '(' + str(d) + ')'
            print opt
            if c: print ' %s' % c
        sys.exit(0)

    if len(args) == 0:
        help(state, None, args)

    options=[('h', 'help', help, 'Show usage info')] + options

    for s, l, d, c in options:
        map['-'+s] = map['--'+l]=l
        state[l] = d
        dt[l] = type(d)
        if not d is None and not type(d) is type(help): s, l=s+':', l+'='
        if s: short = short + s
        if l: long.append(l)

    if os.environ.has_key("KETCHUP_OPTS"):
        args = os.environ["KETCHUP_OPTS"].split() + args

    try:
        opts, args = getopt.getopt(args, short, long)
    except getopt.GetoptError:
        help(state, None, args)
        sys.exit(-1)

    for opt, arg in opts:
        if dt[map[opt]] is type(help): state[map[opt]](state,map[opt],arg)
        elif dt[map[opt]] is type(1): state[map[opt]] = int(arg)
        elif dt[map[opt]] is type(''): state[map[opt]] = arg
        elif dt[map[opt]] is type([]): state[map[opt]].append(arg)
        elif dt[map[opt]] is type(None): state[map[opt]] = 1

    return args

try: kernel_url = os.environ["KETCHUP_URL"]
except: kernel_url = 'http://www.kernel.org/pub/linux/kernel'

try: archive = os.environ["KETCHUP_ARCH"]
except: archive = os.environ["HOME"] + "/.ketchup"

wget = "/usr/bin/wget"
if not os.path.exists(wget): wget = ""

gpg = "/usr/bin/gpg"
if not os.path.exists(gpg): gpg = ""

options = {}
opts = [
    ('a', 'archive', archive, 'cache directory'),
    ('d', 'directory', '.', 'directory to update'),
    ('f', 'full-tarball', None, 'if unpacking a tarball, download the latest'),
    ('g', 'gpg-path', gpg, 'path for GnuPG'),
    ('G', 'no-gpg', None, 'disable GPG signature verification'),
    ('k', 'kernel-url', kernel_url, 'base url for kernel.org mirror'),
    ('l', 'list-trees', None, 'list supported trees'),
    ('m', 'show-makefile', None, 'output version in makefile <arg>'),
    ('n', 'dry-run', None, 'don\'t download or apply patches'),
    ('p', 'show-previous', None, 'output version previous to <arg>'),
    ('q', 'quiet', None, 'reduce output'),
    ('r', 'rename-directory', None, 'rename updated directory to linux-<v>'),
    ('s', 'show-latest', None, 'output the latest version of <arg>'),
    ('u', 'show-url', None, 'output URL for <arg>'),
    ('w', 'wget', wget, 'command to use for wget'),
    ]

args = fancyopts(sys.argv[1:], opts, options,
                 'ketchup [options] <ver>')

archive = options["archive"]
kernel_url = options["kernel-url"]
if options["no-gpg"]: options["gpg-path"] = ''

def qprint(*args):
    if not options["quiet"]:
        sys.stdout.write(" ".join(map(str, args)))
        sys.stdout.write("\n")

# Functions to parse version strings

def tree(ver):
    return float(re.match(r'(\d+\.\d+)', ver).group(1))

def rev(ver):
    p = pre(ver)
    r = int(re.match(r'\d+\.\d+\.(\d+)', ver).group(1))
    if p: r = r - 1
    return r

def pre(ver):
    try: return re.match(r'\d+\.\d+\.\d+(\.\d+)?-((rc|pre)\d+)', ver).group(2)
    except: return None

def post(ver):
    try: return re.match(r'\d+\.\d+\.\d+\.(\d+)', ver).group(1)
    except: return None

def pretype(ver):
    try: return re.match(r'\d+\.\d+\.\d+(\.\d+)?-((rc|pre)\d+)', ver).group(3)
    except: return None

def prenum(ver):
    # The optional (\.\d+)? group makes the prerelease number group 4,
    # matching pre() and pretype(); without it group(4) never exists.
    try: return int(re.match(r'\d+\.\d+\.\d+(\.\d+)?-((rc|pre)(\d+))', ver).group(4))
    except: return None

def prebase(ver):
    return re.match(r'(\d+\.\d+\.\d+((-(rc|pre)|\.)\d+)?)', ver).group(1)

def revbase(ver):
    return "%s.%s" % (tree(ver), rev(ver))

def base(ver):
    v = revbase(ver)
    if post(ver): v += "." + post(ver)
    return v

def forkname(ver):
    try: return re.match(r'\d+.\d+.\d+(\.\d+)?(-(rc|pre)\d+)?(-(\w+?)\d+)?',
                         ver).group(5)
    except: return None

def forknum(ver):
    try: return int(
        re.match(r'\d+.\d+.\d+(\.\d+)?(-(rc|pre)\d+)?(-(\w+?)(\d+))?',
                 ver).group(6))
    except: return None

def fork(ver):
    try: return re.match(r'\d+.\d+.\d+(\.\d+)?(-(rc|pre)\d+)?(-(\w+))?', ver).group(4)
    except: return None

def get_ver(makefile):
    """ Read the version information from the specified makefile """
    part = {}
    parts = "VERSION PATCHLEVEL SUBLEVEL EXTRAVERSION".split(' ')
    m = open(makefile)
    for l in m.readlines():
        for p in parts:
            try: part[p] = re.match(r'%s\s*=\s*(\S+)' % p, l).group(1)
            except: pass

    version = "%s.%s.%s" % tuple([part[p] for p in parts[:3]])
    x = part.get("EXTRAVERSION", "")

    if x != "" and x[0] != '-' and x[0] != '.':
        version += '-'; """hack for ac tree"""
    version += x
    return version

def compare_ver(a, b):
    """
    Compare kernel versions a and b

    Note that -pre and -rc versions sort before the version they modify,
    -pre sorts before -rc, and -bk, -mm, etc. sort alphabetically.
    """
    if a == b: return 0

    c = cmp(float(tree(a)), float(tree(b)))
    if c: return c
    c = cmp(rev(a), rev(b))
    if c: return c
    c = cmp(post(a), post(b))
    if c: return c
    c = cmp(pretype(a), pretype(b)) # pre sorts before rc
    if c: return c
    c = cmp(prenum(a), prenum(b))
    if c: return c
    c = cmp(forkname(a), forkname(b))
    if c: return c
    return cmp(forknum(a), forknum(b))

def last(url):
    for l in urllib.urlopen(url).readlines():
        m=re.search('(?i)<a href="(.*/)">', l)
        if m: n = m.group(1)
    return n

def latest_mm(url, pat):
    url = kernel_url + '/people/akpm/patches/2.6/'
    url += last(url)
    part = last(url)
    return part[:-1]

def latest_ac(url, pat):
    url = kernel_url + '/people/alan/linux-2.6/'
    url += last(url)
    for l in urllib.urlopen(url).readlines():
        m=re.search('(?i)<a href="patch-(.*)\.bz2">', l)
        if m: n = m.group(1)
    return n

def latest_26(url, pat):
    for l in urllib.urlopen(url).readlines():
        m = re.search('"LATEST-IS-(.*)"', l)
        if m: p = m.group(1)
    return p

def latest_dir(url, pat):
    """Find the latest link matching pat at url after sorting"""
    p = []
    for l in urllib.urlopen(url).readlines():
        m = re.search('"%s"' % pat, l)
        if m: p.append(m.group(1))

    if not p: return None

    p.sort(compare_ver)
    return p[-1]

# mbligh is lazy and has a bunch of empty directories
def latest_mjb(url, pat):
    url = kernel_url + '/people/mbligh/'

    # find the last Linus release and search backwards
    l = [ find_ver('2.6'), find_ver("2.6-pre") ]
    l.sort(compare_ver)
    linus = l[-1]

    p = []
    for l in urllib.urlopen(url).readlines():
        m = re.search('"(2\.6\..*/)"', l)
        if m:
            v = m.group(1)
            if compare_ver(v, linus) <= 0:
                p.append(v)

    p.sort(compare_ver)
    p.reverse()

    for ver in p:
        mjb = latest_dir(url + ver, pat)
        if mjb: return mjb

    return None

def latest_26_tip(url, pat):
    l = [ find_ver('2.6'), find_ver('2.6-bk'), find_ver('2.6-pre') ]
    l.sort(compare_ver)
    return l[-1]

# latest lookup function, canonical url, pattern for lookup function,
# signature flag, description
version_info = {
    '2.4': (latest_dir,
            kernel_url + "/v2.4" + "/patch-%(base)s.bz2",
            r'patch-(.*?).bz2',
            1, "old stable kernel series"),
    '2.4-pre': (latest_dir,
                kernel_url + "/v2.4" + "/testing/patch-%(prebase)s.bz2",
                r'patch-(.*?).bz2',
                1, "old stable kernel series prereleases"),
    '2.6': (latest_26,
            kernel_url + "/v2.6" + "/patch-%(prebase)s.bz2", "",
            1, "current stable kernel series"),
    '2.6-rel': (latest_26,
                kernel_url + "/v2.6" + "/patch-%(prebase)s.bz2", "",
                1, "current stable kernel series RELease"),
    '2.6-rc': (latest_dir,
               kernel_url + "/v2.6" + "/testing/patch-%(prebase)s.bz2",
               r'patch-(.*?).bz2',
               1, "current stable kernel series prereleases"),
    '2.6-pre': (latest_dir,
                kernel_url + "/v2.6" + "/testing/patch-%(prebase)s.bz2",
                r'patch-(.*?).bz2',
                1, "current stable kernel series prereleases"),
    '2.6-bk': (latest_dir,
               kernel_url + "/v2.6" +
               "/snapshots/patch-%(full)s.bz2", r'patch-(.*?).bz2',
               1, "current stable kernel series snapshots"),
    '2.6-git': (latest_dir,
                "http://l4x.org/kernelgit/" +
                "patch-%(full)s.bz2", r'patch-(.*?).bz2',
                1, "test git snapshots"),
    '2.6-tip': (latest_26_tip, "", "", 1,
                "current stable kernel series tip"),
    '2.6-mm': (latest_mm,
               kernel_url + "/people/akpm/patches/" +
               "%(tree)s/%(prebase)s/%(full)s/%(full)s.bz2", "",
               1, "Andrew Morton's -mm development tree"),
    '2.6-ac': (latest_ac,
               kernel_url + "/people/alan/linux-%(tree)s/" +
               "%(prebase)s/patch-%(full)s.bz2", "",
               1, "Alan Cox's -ac development tree"),
    '2.6-tiny': (latest_dir,
                 "http://www.selenic.com/tiny/%(full)s.patch.bz2",
                 r'(2.6.*?).patch.bz2',
                 1, "Matt Mackall's -tiny tree for small systems"),
    '2.6-mjb': (latest_mjb,
                kernel_url + "/people/mbligh/%(prebase)s/patch-%(full)s.bz2",
                r'patch-(2.6.*?).bz2',
                1, "Martin Bligh's random collection 'o crap")
    }

def version_url(ver, sign = 0):
    """ Return the URL for the patch associated with the specified version """
    b = "%.1f" % tree(ver)
    f = forkname(ver)
    p = pre(ver)

    s = b
    if f: s = "%s-%s" % (b, f)
    elif p: s = "%s-pre" % b

    if sign and options["no-gpg"]: return None
    if sign and not version_info[s][3]: return None

    v = {
        'full': ver,
        'tree': tree(ver),
        'base': base(ver),
        'prebase': prebase(ver)
        }

    u = version_info[s][1] % v

    if sign: u += ".sign"
    return u

def patch_path(ver):
    return os.path.join(archive, os.path.basename(version_url(ver)))

def download(url, file):
    qprint("Downloading %s" % os.path.basename(url))
    if options["dry-run"]: return 1

    if not options["wget"]:
        p = urllib.urlopen(url).read()
        if p.find("<title>404") != -1: return None
        open(file, 'w').write(p)
    else:
        e = os.system("%s -c -O %s %s" % (options["wget"],
                                          file+".partial", url))
        if e: return None
        os.rename(file+".partial", file)

    return 1

def trydownload(url, file):
    if download(url, file): return file

    # the jgarzik memorial hack
    url2 = re.sub("/snapshots/", "/snapshots/old/", url)
    if url2 != url:
        if download(url2, file): return file
        if url2[-4:] == ".bz2":
            f2 = file[:-4] + ".gz"
            url2 = url2[:-4] + ".gz"
            if download(url2, f2): return f2

    if url[-4:] == ".bz2":
        f2 = file[:-4] + ".gz"
        url2 = url[:-4] + ".gz"
        if download(url2, f2): return f2

    return None

def verify(signurl, file):
    if options["gpg-path"] and signurl and not options["dry-run"]:
        sf = file + ".sign"
        sf = trydownload(signurl, sf)
        if not sf:
            error("signature download failed")
            error("removing files...")
            os.unlink(file)
            return 0

        qprint("Verifying signature...")
        r = os.system("%s --verify %s %s" % (options["gpg-path"], sf, file))
        if r:
            error("gpg returned %d" % r)
            error("removing files...")
            os.unlink(file)
            os.unlink(sf)
            return 0
    return 1

def get_patch(ver):
    """Return the path to patch for given ver, downloading if necessary"""
    f = patch_path(ver)
    if os.path.exists(f): return f
    if f[-4:] == ".bz2":
        f2 = f[:-4] + ".gz"
        if os.path.exists(f2): return f2

    url = version_url(ver)
    f = trydownload(url, f)
    if not f:
        error("patch download failed")
        sys.exit(-1)

    if not verify(version_url(ver, 1), f):
        sys.exit(-1)

    return f

def apply_patch(ver, reverse = 0):
    """Find the patch to upgrade from the predecessor of ver to ver and
    apply or reverse it."""
    p = get_patch(ver)

    r = ""
    if reverse: r = "-R"

    qprint("Applying %s %s" % (os.path.basename(p), r))
    if options["dry-run"]: return ver

    if p[-4:] == ".bz2":
        err = os.system("bzcat %s | patch -l -p1 %s > .patchdiag" % (p, r))
    elif p[-3:] == ".gz":
        err = os.system("zcat %s | patch -l -p1 %s > .patchdiag" % (p, r))
    else: err = os.system("patch -l -p1 %s < %s > .patchdiag" % (r, p))

    if err:
        sys.stderr.write(open(".patchdiag").read())
        error("patch %s failed: %d" % (p, err))
        sys.exit(-1)
    os.unlink(".patchdiag")

def install_nearest(ver):
    t = tree(ver)
    tarballs = glob.glob(archive + "/linux-%s.*.tar.bz2" % t)
    list = []

    for f in tarballs:
        m = re.match(r'.*/linux-(.*).tar.bz2$', f)
        v = m.group(1)
        d = abs(rev(v) - rev(ver))
        list.append((d, f, v))
    list.sort()

    if not list or (options["full-tarball"] and list[0][0]):
        file = "linux-%s.tar.bz2" % ver
        url = "%s/v%s/%s" % (kernel_url, t, file)
        file = archive + "/" + file

        file = trydownload(url, file)
        if not file:
            error("Tarball download failed")
            sys.exit(-1)
        if not verify(url + ".sign", file):
            sys.exit(-1)
    else:
        file = list[0][1]
        ver = list[0][2]

    qprint("Unpacking %s" % os.path.basename(file))
    if options["dry-run"]: return ver

    err = os.system("tar xjf %s" % file)
    if err:
        error("Unpacking failed: ", err)
        sys.exit(-1)

    err = os.system("mv linux*/* . ; rmdir linux*")
    if err:
        error("Unpacking failed: ", err)
        sys.exit(-1)

    return ver

def find_ver(ver):
    if ver in version_info.keys():
        v = version_info[ver]
        for n in range(5):
            return v[0](os.path.dirname(v[1]), v[2])
            error('retrying version lookup for %s' % ver)
    else:
        return ver

def transform(a, b):
    if a == b:
        # qprint("Nothing to do!")
        return
    qprint("%s -> %s" % (a, b))
    if not a:
        a = install_nearest(base(b))
    t = tree(a)
    if t != tree(b):
        error("Can't patch %s to %s" % (tree(a), tree(b)))
        sys.exit(-1)
    if fork(a):
        apply_patch(a, 1)
        a = prebase(a)
    if prebase(a) != prebase(b):
        if pre(a):
            apply_patch(a, 1)
            a = base(a)

        if post(a) and post(a) != post(b):
            apply_patch(prebase(a), 1)

        ra, rb = rev(a), rev(b)
        if ra > rb:
            for r in range(ra, rb, -1):
                apply_patch("%s.%s" % (t, r), -1)
        if ra < rb:
            for r in range(ra + 1, rb + 1):
                apply_patch("%s.%s" % (t, r))
        a = revbase(b)

        if post(b) and post(a) != post(b):
            apply_patch(prebase(b), 0)
            a = base(b)

        if pre(b):
            apply_patch(prebase(b))
            a = prebase(b)

    if fork(b):
        a = apply_patch(b)

def rename_dir(v):
    """Rename the current directory to linux-v, where v is the function arg"""
    cwd = os.getcwd()
    basedir = os.path.dirname(cwd)
    newdir = os.path.join(basedir, "linux-" + v)
    if os.access(newdir, os.F_OK):
        # error() concatenates its arguments, so format the path here
        error("Cannot rename directory, destination exists: %s" % newdir)
        return
    os.rename(cwd, newdir)


# Process args

os.chdir(options["directory"])

if options["list-trees"]:
    l = version_info.keys()
    l.sort()
    for tree in l:
        qprint(tree, ["(unsigned)","(signed)"][version_info[tree][3]])
        qprint(" " + version_info[tree][4])

elif options["show-makefile"] and len(args) < 2:
    if not args:
        qprint(get_ver("Makefile"))
    else:
        qprint(get_ver(args[0]))

elif len(args) != 1:
    error("incorrect number of arguments")
    sys.exit(-1)

elif options["show-latest"]:
    qprint(find_ver(args[0]))

elif options["show-url"]:
    qprint(version_url(find_ver(args[0])))

elif options["show-previous"]:
    v = find_ver(args[0])
    p = prebase(v)
    if p == v: p = base(v)
    if p == v:
        if rev(v) > 0: p = "%.1f.%s" % (tree(v), rev(v) -1)
        else: p = "unknown"
    qprint(p)

else:
    if not os.path.exists(options["archive"]):
        qprint("Creating cache directory", options["archive"])
        os.mkdir(options["archive"])

    try: a = get_ver('Makefile')
    except: a = None
    b = find_ver(args[0])
    # qprint("%s -> %s" % (a, b))
    transform(a, b)
    if options["rename-directory"] and not options["dry-run"]:
        rename_dir(b)



Attachments:
ketchup (17.52 kB)
snapshot.sh (1.85 kB)
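
The ordering described in the compare_ver docstring of the attached script can be hard to picture, so here is a simplified sketch of it as a sort key. It is written for Python 3 (the attachment itself is Python 2), the names `VERSION_RE`, `PRE_ORDER` and `version_key` are made up for this illustration, and it compares numeric fields as integers, so it is not guaranteed to reproduce every corner case of ketchup.

```python
import re

# Versions such as 2.6.12, 2.6.11.7, 2.6.12-rc2 or 2.6.12-rc2-mm1.
VERSION_RE = re.compile(
    r'(\d+)\.(\d+)\.(\d+)'      # tree and revision
    r'(?:\.(\d+))?'             # optional post-release number
    r'(?:-(rc|pre)(\d+))?'      # optional prerelease
    r'(?:-([a-z]+?)(\d+))?$'    # optional fork name and number
)

PRE_ORDER = {None: 0, 'pre': 1, 'rc': 2}   # -pre sorts before -rc

def version_key(ver):
    """Return a tuple that sorts kernel versions in release order."""
    m = VERSION_RE.match(ver)
    if m is None:
        raise ValueError("unrecognized version: %r" % (ver,))
    major, minor, rev, post, pretype, prenum, fork, forknum = m.groups()
    # A prerelease of 2.6.12 is a patch on top of 2.6.11, so count it there.
    rev = int(rev) - (1 if pretype else 0)
    return (int(major), int(minor), rev,
            int(post) if post else -1,
            PRE_ORDER[pretype],
            int(prenum) if prenum else -1,
            fork or '',
            int(forknum) if forknum else -1)

versions = ["2.6.12", "2.6.12-rc2-mm1", "2.6.11", "2.6.12-rc2", "2.6.12-pre1"]
print(sorted(versions, key=version_key))
```

The one non-obvious step is lowering the revision for prereleases: that is what makes 2.6.12-rc2 land after 2.6.11 but before 2.6.12.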

2005-04-23 14:22:32

by David Woodhouse

Subject: Re: Git-commits mailing list feed.

On Sat, 2005-04-23 at 14:58 +0200, Jan Dittmer wrote:
> I didn't find above mentioned post, so I hacked up a cruel script
> myself. It relies on ketchup (http://www.selenic.com/ketchup)
> to retrieve the current base version. Also it requires git's
> `checkout-cache --prefix=` to work properly.

Thanks... but it seems a little excessive. I was thinking of something
much simpler; along the lines of...

#!/bin/sh

STAGE=/staging/dwmw2/git

cd /home/dwmw2/git/snapshot-2.6

git pull || exit 1

LASTRELEASE=`ls -rt .git/tags | grep -v git | grep -v MailDone | tail -1`
LASTTAG=`ls -rt .git/tags | grep -v MailDone | tail -1`

CURCOMMIT=`commit-id`
LASTCOMMIT=`cat .git/tags/$LASTTAG`
RELCOMMIT=`cat .git/tags/$LASTRELEASE`

[ "$LASTCOMMIT" = "$CURCOMMIT" ] && exit 0

CURTREE=`tree-id $CURCOMMIT`
#LASTTREE=`tree-id $LASTCOMMIT`
RELTREE=`tree-id $RELCOMMIT`

if echo $LASTTAG | grep -q -- -git ; then
    OLDGITNUM=`echo $LASTTAG | sed s/^.*-git//`
    NEWGITNUM=`expr $OLDGITNUM + 1`
    NEWTAG=`echo $LASTTAG | sed s/-git$OLDGITNUM/-git$NEWGITNUM/`
else
    NEWTAG=$LASTTAG-git1
fi

echo $commit-id > $STAGE/$NEWTAG.id
# This is, unfortunately, in chronological order. Walking the tree would
# be better.
git log $CURCOMMIT ^$RELCOMMIT > $STAGE/$NEWTAG.log
git diff -r $RELTREE -r $CURTREE | gzip -9 > $STAGE/patch-$NEWTAG.gz

echo $CURCOMMIT > .git/tags/$NEWTAG


--
dwmw2
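
The tag-numbering step of the script above (append -git1 to a release tag, or bump the trailing -gitN of a snapshot tag) can be shown in isolation. This is only a sketch of that one step in Python; the function name `next_snapshot_tag` is invented for the example and nothing here touches a repository.

```python
import re

def next_snapshot_tag(last_tag):
    """Return the tag name for the snapshot that follows last_tag."""
    m = re.search(r'-git(\d+)$', last_tag)
    if m:
        # Already a snapshot tag: increment the trailing number.
        return last_tag[:m.start()] + "-git%d" % (int(m.group(1)) + 1)
    # A plain release tag: this is the first snapshot after it.
    return last_tag + "-git1"

print(next_snapshot_tag("2.6.12-rc3"))
print(next_snapshot_tag("2.6.12-rc3-git1"))
```

Parsing the number instead of substituting text also copes with the step from -git9 to -git10.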

2005-04-23 14:30:32

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

David Woodhouse wrote:
> On Sat, 2005-04-23 at 14:58 +0200, Jan Dittmer wrote:
>
>>I didn't find above mentioned post, so I hacked up a cruel script
>>myself. It relies on ketchup (http://www.selenic.com/ketchup)
>>to retrieve the current base version. Also it requires git's
>>`checkout-cache --prefix=` to work properly.
>
>
> Thanks... but it seems a little excessive. I was thinking of something
> much simpler; along the lines of...
>
> #!/bin/sh
>
> STAGE=/staging/dwmw2/git
>
> cd /home/dwmw2/git/snapshot-2.6
>
> git pull || exit 1
>
> LASTRELEASE=`ls -rt .git/tags | grep -v git | grep -v MailDone | tail -1`

My .git/tags is empty. At least 2.6.12-rc3 is not tagged so I wasn't sure
how to extract the latest release from the git tree.
ketchup was the most comfortable way.

--
Jan

2005-04-23 14:35:58

by David Woodhouse

Subject: Re: Git-commits mailing list feed.

On Sat, 2005-04-23 at 16:30 +0200, Jan Dittmer wrote:
> > LASTRELEASE=`ls -rt .git/tags | grep -v git | grep -v MailDone | tail -1`
>
> My .git/tags is empty. At least 2.6.12-rc3 is not tagged so I wasn't sure
> how to extract the latest release from the git tree.
> ketchup was the most comfortable way.

Nah, asking Linus to tag his releases is the most comfortable way.

mkdir .git/tags
echo 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 > .git/tags/2.6.12-rc2
echo a2755a80f40e5794ddc20e00f781af9d6320fafb > .git/tags/2.6.12-rc3

--
dwmw2

2005-04-23 14:43:30

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

David Woodhouse wrote:
> On Sat, 2005-04-23 at 14:58 +0200, Jan Dittmer wrote:
>
>>I didn't find above mentioned post, so I hacked up a cruel script
>>myself. It relies on ketchup (http://www.selenic.com/ketchup)
>>to retrieve the current base version. Also it requires git's
>>`checkout-cache --prefix=` to work properly.
>
>
> Thanks... but it seems a little excessive. I was thinking of something
> much simpler; along the lines of...

> echo $commit-id > $STAGE/$NEWTAG.id

you want $CURCOMMIT here.
Otherwise works fine.

--
Jan

2005-04-23 17:29:52

by Linus Torvalds

Subject: Re: Git-commits mailing list feed.



On Sun, 24 Apr 2005, David Woodhouse wrote:
>
> Nah, asking Linus to tag his releases is the most comfortable way.
>
> mkdir .git/tags
> echo 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 > .git/tags/2.6.12-rc2
> echo a2755a80f40e5794ddc20e00f781af9d6320fafb > .git/tags/2.6.12-rc3

The reason I've not done tags yet is that I haven't decided how to do
them.

The git-pasky "just remember the tag name" approach certainly works, but I
was literally thinking of setting up some signing system, so that a tag
doesn't just say "commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 is
v2.6.12-rc2", but it would actually give stronger guarantees, ie it would
say "Linus says that commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 is
his 2.6.12-rc2 release".

That's something fundamentally more powerful, and it's also something that
I actually can integrate better into git.

In other words, I actually want to create "tag objects", the same way we
have "commit objects". A tag object points to a commit object, but in
addition it contains the tag name _and_ the digital signature of whoever
created the tag.

Then you just distribute these tag objects along with all the other
objects, and fsck-cache can pick them up even without any other knowledge,
but normally you'd actually point to them some other way too, ie you could
have the ".git/tags/xxx" files have the pointers, but now they are
_validated_ pointers.

That was my plan, at least. But I haven't set up any signature generation
thing, and this really isn't my area of expertise any more. But my _plan_
literally was to have the tag object look a lot like a commit object, but
instead of pointing to the tree and the commit parents, it would point to
the commit you are tagging. Something like

commit a2755a80f40e5794ddc20e00f781af9d6320fafb
tag v2.6.12-rc3
signer Linus Torvalds

This is my official original 2.6.12-rc2 release

-----BEGIN PGP SIGNATURE-----
....
-----END PGP SIGNATURE-----

with a few fixed headers and then a place for free-form commentary,
everything signed by the key (and then it ends up being encapsulated as an
object with the object type "tag", and SHA1-csummed and compressed, ie it
ends up being just another object as far as git is concerned, but now it's
an object that tells you about _trust_)

(The "signer" field is just a way to easily figure out which public key to
check the signature against, so that you don't have to try them all. Or
something. My point being that I know what I want, but because I normally
don't actually ever _use_ PGP etc, I don't know the scripts to create
these, so I've been punting on it all).

If somebody writes a script to generate the above kind of thing (and tells
me how to validate it), I'll do the rest, and start tagging things
properly. Oh, and make sure the above sounds sane (ie if somebody has a
better idea for how to more easily identify how to find the public key to
check against, please speak up).

Linus
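
To make the proposal above concrete, here is a rough sketch of how such a tag could be assembled and turned into "just another object". The header fields (commit, tag, signer) follow the layout proposed in the message, which is not necessarily the format git ended up with. The `<type> <size>\0` object header and hashing the uncompressed bytes are assumptions borrowed from present-day git, and the signature is a placeholder string because no signing is done here. The function names are made up for the example.

```python
import hashlib
import zlib

def make_tag_body(commit, tag, signer, comment, signature):
    """Assemble the tag text: fixed headers, free-form comment, signature."""
    headers = "commit %s\ntag %s\nsigner %s\n" % (commit, tag, signer)
    return (headers + "\n" + comment + "\n\n" + signature + "\n").encode("utf-8")

def store_object(obj_type, body):
    """Return (object id, compressed bytes) for an object of the given type."""
    # Assumed header layout: "<type> <size>\0" followed by the body.
    raw = ("%s %d" % (obj_type, len(body))).encode("ascii") + b"\0" + body
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)

body = make_tag_body(
    "a2755a80f40e5794ddc20e00f781af9d6320fafb",
    "v2.6.12-rc3",
    "Linus Torvalds",
    "example release comment",          # placeholder text
    "(detached signature goes here)",   # placeholder, not a real signature
)
object_id, compressed = store_object("tag", body)
print(object_id, len(compressed))
```

Because the object id is derived only from the tag's own bytes, creating or deleting a tag never changes the commit it points to.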

2005-04-23 17:44:16

by Linus Torvalds

Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Linus Torvalds wrote:
>
> commit a2755a80f40e5794ddc20e00f781af9d6320fafb
> tag v2.6.12-rc3
> signer Linus Torvalds
>
> This is my official original 2.6.12-rc2 release
>
> -----BEGIN PGP SIGNATURE-----
> ....
> -----END PGP SIGNATURE-----

Btw, in case it wasn't clear, one of the advantages of this is that these
objects are really _not_ versioned themselves, and that they are totally
independent of the objects that they actually tag.

They spread together with all the other objects, so they fit very well
into the whole git infrastructure, but the real commit objects don't have
any linkages to the tag and the tag objects themselves don't have any
history amongst themselves, so you can create a tag at any (later) time,
and it doesn't actually change the commit in any way or affect other tags
in any way.

In particular, many different people can tag the same commit, and they
don't even need to tag their _own_ commit - you can use these tag objects
to show that you trust somebody else's commit. You can also throw the tag
objects away, since nothing else depends on them and they have nothing
linking to them - so you can make a "one-time" tag object that you can
pass off to somebody else, and then delete it, and now it's just a
"temporary tag" that tells the recipient _something_ about the commit you
tagged, but that doesn't stay around in the archive.

That's important, because I actually want to have the ability for people
who want me to pull from their archive to send me a message that says
"pull from this archive, and btw, here's the tag that not only tells you
which head to merge, but also proves that it was me who created it".

Will we use this? Maybe not. Quite frankly, I think human trust is much
more important than automated trust through some technical means, but I
think it's good to have the _support_ for this kind of trust mechanism
built into the system. And I think it's a good way for distributors etc to
say: "this is the source code we used to build the kernel that we
released, and we tagged it 'v2.6.11-mm6-crazy-fixes-3.96'".

And if my key gets stolen, I can re-generate all the tags (from my archive
of tags that I trust), and sign them with a new key, and revoke the trust
of my old key. This is why it's important that tags don't have
interdependencies, they are just a one-way "this key trusts that release
and calls it xyzzy".

Linus

2005-04-23 17:54:49

by Fabian Franz

Subject: Re: Git-commits mailing list feed.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Am Samstag, 23. April 2005 19:31 schrieb Linus Torvalds:
> On Sun, 24 Apr 2005, David Woodhouse wrote:
> > Nah, asking Linus to tag his releases is the most comfortable way.
> >
> The reason I've not done tags yet is that I haven't decided how to do
> them.
>
> commit a2755a80f40e5794ddc20e00f781af9d6320fafb
> tag v2.6.12-rc3
> signer Linus Torvalds
>
> This is my official original 2.6.12-rc2 release
>
> -----BEGIN PGP SIGNATURE-----
> ....
> -----END PGP SIGNATURE-----
>
> If somebody writes a script to generate the above kind of thing (and tells
> me how to validate it), I'll do the rest, and start tagging things
> properly. Oh, and make sure the above sounds sane (ie if somebody has a
> better idea for how to more easily identify how to find the public key to
> check against, please speak up).

To generate those you do:

# cat unsigned_tag

commit a2755a80f40e5794ddc20e00f781af9d6320fafb
tag v2.6.12-rc3
signer Linus Torvalds
This is my official original 2.6.12-rc2 release

# gpg --clearsign < unsigned_tag > signed_tag
(gpg will ask here for the secret key passphrase)

To verify you do:

# gpg --verify < signed_tag

and check exit status.

Hope that helps,

cu

Fabian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFCaorzI0lSH7CXz7MRAr3QAJ45f2CQTgJ0sYfF9kRyrWHbsazVQQCeMqW7
HCsah/llt/I8sQ36dlDnRWg=
=Fgq1
-----END PGP SIGNATURE-----

2005-04-23 18:02:30

by Thomas Glanzmann

Subject: Re: Git-commits mailing list feed.

Hello,

there is no need to tell the verifier against what key to verify because
the signature already contains this information.

> If somebody writes a script to generate the above kind of thing (and
> tells me how to validate it), I'll do the rest, and start tagging
> things properly. Oh, and make sure the above sounds sane (ie if
> somebody has a better idea for how to more easily identify how to find
> the public key to check against, please speak up).

# This creates the signature.
gpg --clearsign < sign_this > signature

# And this verifies it.
gpg --verify < signature && echo valid

Thomas
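
For anyone scripting the two commands above, a thin wrapper might look like this. It is a sketch that assumes a `gpg` binary on the PATH and a usable secret key; the `gpg` parameter is a hypothetical hook added so a different program path can be supplied, and only the `--clearsign` and `--verify` options shown in this thread are used.

```python
import subprocess

def clearsign(data, gpg=("gpg",)):
    """Return the clearsigned form of data (bytes); gpg may prompt for a passphrase."""
    result = subprocess.run(list(gpg) + ["--clearsign"], input=data,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout

def verify(signed, gpg=("gpg",)):
    """Return True if gpg exits with status 0 for the signed message (bytes)."""
    result = subprocess.run(list(gpg) + ["--verify"], input=signed,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

As in the shell version, the only thing checked is the exit status; whether the signing key itself deserves trust is a separate question.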

2005-04-23 18:28:52

by Linus Torvalds

Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Thomas Glanzmann wrote:
>
> # This creates the signature.
> gpg --clearsign < sign_this > signature

This really doesn't work for me - I do not want to have the gpg header
above it, only the signature below. Since I want git to actually
understand the tags, but do _not_ want git to have to know about whatever
signing method was used, I really want the resulting file to look like

commit ....
tag ...

here goes comment
here goes signature

and no headers.

Whether that can be faked by always forcing SHA1 as the hash, and then
just removing the top lines, and re-inserting them when verifying, or
whether there is some mode to make gpg not do the header crud at all, I
don't know. Which is exactly why I never even got started.

Linus
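
The "remove the top lines, re-insert them when verifying" idea can be sketched as two small text transformations. This assumes the clearsigned text always begins with the BEGIN line, a fixed "Hash: SHA1" line and one blank line, and it ignores the dash-escaping that clearsigning applies to text lines starting with "-". The function names are invented for the example, and no gpg call is made.

```python
BEGIN_MSG = "-----BEGIN PGP SIGNED MESSAGE-----"
HASH_LINE = "Hash: SHA1"   # assumes the hash is always forced to SHA1

def strip_armor_header(clearsigned):
    """Drop the three header lines, leaving text plus trailing signature."""
    lines = clearsigned.split("\n")
    if lines[:3] != [BEGIN_MSG, HASH_LINE, ""]:
        raise ValueError("unexpected clearsign header")
    return "\n".join(lines[3:])

def restore_armor_header(stripped):
    """Put the fixed header back so the text can be handed to gpg --verify."""
    return "\n".join([BEGIN_MSG, HASH_LINE, "", stripped])

example = "\n".join([
    BEGIN_MSG, HASH_LINE, "",
    "commit ....", "tag ...", "", "here goes comment",
    "-----BEGIN PGP SIGNATURE-----", "....", "-----END PGP SIGNATURE-----", "",
])
stored = strip_armor_header(example)
print(stored)
```

This only works because the header is fully predictable; if the hash algorithm ever varied, it would have to be recorded somewhere in the stored form.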

2005-04-23 18:34:37

by Jan Harkes

Subject: Re: Git-commits mailing list feed.

On Sat, Apr 23, 2005 at 10:31:28AM -0700, Linus Torvalds wrote:
> In other words, I actually want to create "tag objects", the same way we
> have "commit objects". A tag object points to a commit object, but in
> addition it contains the tag name _and_ the digital signature of whoever
> created the tag.

I see how we can use such a tag object to find a specific commit object
in the tree. But if you put the tag objects in the tree as well we now
have to figure out a way to find the tag objects.

Why not keep the tag objects outside of the tree in the tags/ directory?
That way it is easy to find them, and simple to validate all tags or
update the signatures if you lost your key.

> properly. Oh, and make sure the above sounds sane (ie if somebody has a
> better idea for how to more easily identify how to find the public key to
> check against, please speak up).

Others already mentioned the gpg clearsign and verify options. To find a
public key that you haven't seen before, it is probably easiest to use a
keyserver. If verify complains that it doesn't know a key, it will print
a key-id that identifies it. That id can then be looked up as follows:

gpg --keyserver wwwkeys.pgp.net --search-keys 0xA86B35C5
gpg: searching for "0xA86B35C5" from hkp server wwwkeys.pgp.net
(1) Linus Torvalds <[email protected]>
1024 bit RSA key A86B35C5, created: 1996-06-08
Keys 1-1 of 1 for "0xA86B35C5". Enter number(s), N)ext, or Q)uit > q

Of course, trusting a key obtained this way is another thing altogether,
and would probably depend on who signed it and such.

Jan

2005-04-23 18:37:35

by Bernd Eckenfels

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

In article <[email protected]> you wrote:
> # This creates the signature.
> gpg --clearsign < sign_this > signature

To avoid destroying the syntax of the original data, you had better generate a
detached signature and append it. However, in that case you have to detach the
signature by hand:

> gpg --detach-sig -a tag-file
<requires passphrase>
> ls tag*
-rw-rw-r-- 1 ecki ecki 45 Apr 23 20:25 tag-file
-rw-rw-r-- 1 ecki ecki 189 Apr 23 20:26 tag-file.asc
> cat tag-file.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQBCapNy/vciZ+ODzX4RAgBcAJ92ku1fc5iwhpZ+BJ18HvRFPYa5FACdG2r0
B22yNdcyi/Opz11nbWd2LaE=
=Zt5v
-----END PGP SIGNATURE-----
> cat tag-file
commit 123
signer Bernd Eckenfels
tag RC-123

If you skip the -a the signature file is binary. You can merge both files,
but you have to separate them before you present them to GPG:

> gpg --verify tag-file.asc tag-file
gpg: Signature made Sat Apr 23 20:26:58 2005 CEST using DSA key ID E383CD7E
gpg: Good signature from "Bernd Eckenfels <[email protected]>"
echo $?
0

If you don't care about the format of the plaintext (i.e. additional GPG
headers, replacement of -- and sensitivity to line endings), then
you can use the clearsign method as well.

Greetings
Bernd

BTW: you can send gpg the passphrase via a specified FD if you want to
cache it; however, that's generally a bad idea. If you want to parse the
results from gpg verify (i.e. expired, who has signed, etc.), it is better to
specify some more options which generate easily parseable extra info:

> gpg --status-fd 1 --verify tag-file.asc tag-file
gpg: Signature made Sat Apr 23 20:26:58 2005 CEST using DSA key ID E383CD7E
[GNUPG:] SIG_ID e8Q/kei6ZdkSPK/7MCyBuXTdJIo 2005-04-23 1114280818
[GNUPG:] GOODSIG FEF72267E383CD7E Bernd Eckenfels <[email protected]>
gpg: Good signature from "Bernd Eckenfels <[email protected]>"
[GNUPG:] VALIDSIG 654F33BCA8B3868852DC731DFEF72267E383CD7E 2005-04-23 1114280818 0 3 0 17 2 00 654F33BCA8B3868852DC731DFEF72267E383CD7E
[GNUPG:] TRUST_ULTIMATE
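
The status lines above are meant for tools rather than people. A minimal sketch of picking the signer's key ID out of them, assuming the output of `gpg --status-fd 1 --verify` arrives on stdin; `good_signer` is a hypothetical helper name, not part of gpg or git:

```shell
#!/bin/sh
# good_signer: read gpg status output on stdin.
# Prints the long key ID from the GOODSIG line and succeeds only if both a
# GOODSIG and a VALIDSIG status line were seen. (Hypothetical helper.)
good_signer() {
    awk '
        $1 == "[GNUPG:]" && $2 == "GOODSIG"  { key = $3 }
        $1 == "[GNUPG:]" && $2 == "VALIDSIG" { valid = 1 }
        END {
            if (key != "" && valid) { print key; exit 0 }
            exit 1
        }
    '
}
```

It could be used as `gpg --status-fd 1 --verify tag-file.asc tag-file 2>/dev/null | good_signer`. Whether the key should be trusted is a separate question that this sketch does not try to answer.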

2005-04-23 18:46:28

by Jan Harkes

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, Apr 23, 2005 at 11:30:36AM -0700, Linus Torvalds wrote:
> On Sat, 23 Apr 2005, Thomas Glanzmann wrote:
> > # This creates the signature.
> > gpg --clearsign < sign_this > signature
>
> This really doesn't work for me - I do not want to have the gpg header
> above it, only the signature below. Since I want git to actually
> understand the tags, but do _not_ want git to have to know about whatever
> signing method was used, I really want the resulting file to look like
>
> commit ....
> tag ...
>
> here goes comment
> here goes signature
>
> and no headers.
>
> Whether that can be faked by always forcing SHA1 as the hash, and then
> just removing the top lines, and re-inserting them when verifying, or
> whether there is some mode to make gpg not do the header crud at all, I
> don't know. Which is exactly why I never even got started.

It is a bit more messy, but it can be done with a detached signature.

To sign,
gpg -ab unsigned_commit
cat unsigned_commit unsigned_commit.asc > signed_commit

To verify,
cat signed_commit | sed '/-----BEGIN PGP/Q' | gpg --verify signed_commit -

Jan

2005-04-23 18:47:27

by Thomas Glanzmann

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

Hello,

> commit ....
> tag ...

> here goes comment
> here goes signature

# This creates only the signature in Ascii Armor.
gpg -a --detach-sign < to_sign > signature

Thomas

2005-04-23 18:51:10

by Sean

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, April 23, 2005 1:31 pm, Linus Torvalds said:

> If somebody writes a script to generate the above kind of thing (and
> tells me how to validate it), I'll do the rest, and start tagging things
> properly. Oh, and make sure the above sounds sane (ie if somebody has a
> better idea for how to more easily identify how to find the public key to
> check against, please speak up).
>

Hi Linus,

Why not leave tags open to being signed or unsigned? Anyone that wants to
create a trusted tag could simply sign their cleartext entry in the tag
object.

Ideally the SHA1 tree reference would be included in the text entry
whether it was signed or not. Thus any script can pull the SHA1 out of
the text entry. And a script that understands the signing method can
verify it. But scripts that don't understand the signing method can still
use the tag.

For presentation in the log or whatever, the script can look inside the
clear text message, grab the SHA1 and display it in the header area; even
though it's not really in the header, always just in the clear text area.

Sean






2005-04-23 18:55:55

by Junio C Hamano

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

>>>>> "LT" == Linus Torvalds <[email protected]> writes:

LT> I really want the resulting file to look like

LT> commit ....
LT> tag ...

LT> here goes comment
LT> here goes signature

LT> and no headers.

You can use --detach-sign with --armor, like this.

Signed-off-by: Junio C Hamano <[email protected]>
---
#!/bin/sh

sq=s/\'/\''\\'\'\'/g
usage="usage: $0 [--signer=...] commit-id tag < message"
while case "$#" in 0) break;; esac
do
    case "$1" in
    -s=*|--s=*|--si=*|--sig=*|--sign=*|--signe=*|--signer=*)
        signer=`expr "$1" : '-[^=]*=\(.*\)'` ;;
    -s|--s|--si|--sig|--sign|--signe|--signer)
        case "$#" in 0 | 1) echo "$usage"; exit 1 ;; esac
        signer="${2?}"
        shift ;;
    --)
        shift
        break ;;
    -*)
        echo "$usage"
        exit 1 ;;
    *)
        break ;;
    esac
    shift
done

# Exactly two arguments (commit-id and tag) must remain.
case "$#" in 2) ;; *) echo >&2 "$usage"; exit 1 ;; esac
commit="$1" tag="$2"

case "$signer" in
'') signer_arg='' ;;
?*) signer_arg="--local-user '$(echo "$signer" | sed -e "$sq")'" ;;
esac

tmp=.jit-tag.$$
trap 'rm -f $tmp-*' 0 1 2 3 15
tagblob=$tmp-tagblob
tagsign=$tmp-tagsign

case $(cat-file -t "$commit" 2>/dev/null) in
commit) ;;
*) echo >&2 "$0: $commit is not a commit object"; exit 1 ;;
esac
{
    echo "commit $commit"
    echo "tag $tag"
    case "$signer" in
    '') ;;
    ?*) echo "signer $signer" ;;
    esac
    echo
    tty -s && echo >&2 "Type your tag message and end with ^D."
    cat
} >$tagblob || exit
gpgcmd="gpg $signer_arg -a --output $tagsign --detach-sign $tagblob"
eval "$gpgcmd" || exit
cat $tagblob $tagsign

2005-04-23 18:54:42

by Thomas Glanzmann

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

Hello,

> # This creates only the signature in Ascii Armor.
> gpg -a --detach-sign < to_sign > signature

# And to verify:
gpg --verify signature to_sign

Thomas

2005-04-23 19:06:01

by Sean

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, April 23, 2005 2:30 pm, Linus Torvalds said:
> On Sat, 23 Apr 2005, Thomas Glanzmann wrote:
>> # This creates the signature.
>> gpg --clearsign < sign_this > signature
>
> This really doesn't work for me - I do not want to have the gpg header
> above it, only the signature below. Since I want git to actually
> understand the tags, but do _not_ want git to have to know about
> whatever signing method was used, I really want the resulting file to
> look like
>
> commit ....
> tag ...
>
> here goes comment
> here goes signature
>
> and no headers.
>
> Whether that can be faked by always forcing SHA1 as the hash, and then
> just removing the top lines, and re-inserting them when verifying, or
> whether there is some mode to make gpg not do the header crud at all, I
> don't know. Which is exactly why I never even got started.

Linus,

A script that knows how to validate signed tags can easily strip off all
the signing overhead for display. Users of scripts that don't understand
will see the cruft, but at least it will still be usable.

Sean


2005-04-23 19:10:44

by Thomas Glanzmann

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

Hello,

> Why not leave tags open to being signed or unsigned?

I think that this is the idea anyway.

> For presentation in the log or whatever, the script can look inside the
> clear text message, grab the SHA1 and display it in the header area; even
> though it's not really in the header, always just in the clear text area.

Having the SHA1 signature in there twice would be confusing and error-prone
when checking is automated.

So establishing the infrastructure is a good thing. To use it for every
commit is another issue.

Thomas

2005-04-23 19:14:32

by Sean

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, April 23, 2005 3:02 pm, Thomas Glanzmann said:
> Hello,
>
>> Why not leave tags open to being signed or unsigned?
>
> I think that this is the idea anyway.
>
>> For presentation in the log or whatever, the script can look inside the
>> clear text message, grab the SHA1 and display it in the header area;
>> even
>> though it's not really in the header, always just in the clear text
>> area.
>
> Having the SHA1 signature twice in would be confusing and error-prone
> when checking is done automated.
>
> So establishing the infrastructure is a good thing. To use it for every
> commit is another issue.


There's no need to have the SHA1 object reference twice. It will only be
in the clear text, nowhere else. Of course scripts that display the log,
could show the object reference in the header area for aesthetics.
Another nice thing is that this works no matter what cleartext signing
methods emerge in the future.

Sean



2005-04-23 19:28:57

by Linus Torvalds

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Jan Harkes wrote:
>
> Why not keep the tags object outside of the tree in the tags/ directory.

Because then you have all those special cases with fetching them and with
fsck, and with shared object directories. In other words: no.

You can have symlinks (or even better, just a single file with all the
tags listed, which you can create with "fsck", for example) from the tags/
directory, but the thing is, objects go in the object directory and
nowhere else.

Linus

2005-04-23 19:32:31

by Linus Torvalds

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Sean wrote:
>
> Why not leave tags open to being signed or unsigned?

That is exactly what my proposal does, except I'd make the normal tags
creation always sign.

But since _git_ won't care (which is why I want the signature at the _end_,
not "surrounding" the thing), you could create a tag that just doesn't have
the signature, and git will never even notice. The people who see the tag
may say "hmm, why couldn't he be bothered to sign it", though.

Linus

2005-04-23 19:36:41

by Linus Torvalds

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Sean wrote:
>
> A script that knows how to validate signed tags, can easily strip off all
> the signing overhead for display. Users of scripts that don't understand
> will see the cruft, but at least it will still be usable.

NO.

Guys, I will say this once more: git will not look at the signature.

That means that we don't "strip them off", because dammit, they DO NOT
EXIST as far as git is concerned. This is why a tag-file will _always_
start with

commit <commit-sha1>
tag <tag-name>

because that way we can use fsck and validate reachability and have things
that want trees (or commits) take tag-files instead, and git will
automatically look up the associated tree/commit. And it will do so
_without_ having to understand about signing, since signing is for trust
between _people_ not for git.

And that is why I from the very beginning tried to make it very clear that
the signature goes at the end. Not at the beginning, not in the middle,
and not in a different file. IT GOES AT THE END.

Linus

2005-04-23 19:43:47

by Sean

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, April 23, 2005 3:38 pm, Linus Torvalds said:
> On Sat, 23 Apr 2005, Sean wrote:
>>
>> A script that knows how to validate signed tags, can easily strip off all
>> the signing overhead for display. Users of scripts that don't
>> understand
>> will see the cruft, but at least it will still be usable.
>
> NO.
>
> Guys, I will say this once more: git will not look at the signature.
>
> That means that we don't "strip them off", because dammit, they DO NOT
> EXIST as far as git is concerned. This is why a tag-file will _always_
> start with
>
> commit <commit-sha1>
> tag <tag-name>
>
> because that way we can use fsck and validate reachability and have things
> that want trees (or commits) take tag-files instead, and git will
> automatically look up the associated tree/commit. And it will do so
> _without_ having to understand about signing, since signing is for trust
> between _people_ not for git.

Yes, totally agreed.

> And that is why I from the very beginning tried to make it very clear
> that the signature goes at the end. Not at the beginning, not in the
> middle, and not in a different file. IT GOES AT THE END.
>

Okay now you're just being difficult <g> You're acting like it's
impossible for git to grab the SHA1 out of the clear text message if there
is signing overhead above the tag reference. That is nonsense. You
simply state that a tag must include a SHA1 object reference preceded by
"REF:" in the comment. Git can surely use this regardless of what
signing overhead is above, below or beside it. The suggestion for
stripping out the signing overhead was for _human_ readability; git won't
care a gnit.

Sean


2005-04-23 19:57:13

by Linus Torvalds

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Sean wrote:
>
> Okay now you're just being difficult <g> You're acting like it's
> impossible for git to grab the SHA1 out of the clear text message if there
> is signing overhead above the tag reference. That is nonsense.

No. It's not "impossible" for git to parse crap. But git won't.

There are two ways you can write programs:
- reliably
- unreliably

and I do the first one. That means that a program I write does something
_repeatable_. It does the same thing, regardless of whether a human
happened to write "REF:" in the comment section, or anything else.

The thing is, great programs come not out of great coding, but out of
great data structures. The whole git philosophy bases itself on getting
the data structure right.

And what you are asking for is doing it _wrong_. So in git I don't just
parse random free-form text and guess that a line that starts with REF: is
a reference to a commit. It has very rigid and well-specified data
structures, and that's how you make reliable programs.

I don't care what anybody else does on top of git, but dammit, I'll make
sure that the core infrastructure is designed the right way.

And that means that we don't guess, and that we don't parse random ASCII
blobs. It means that we have very very fixed formats so that programs can
either do the right thing or unambiguously say "that's crap".

I've said it before, and I'll say it again: we have enough crap that calls
itself SCM's out there already. I want git to be reliable and _simple_,
not a collection of crap that just happens to work.

Linus
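
The "fixed format, or unambiguously reject it" rule can be sketched as a strict header check. This is only an illustration under the assumption that a tag file starts with exactly the two header lines shown in this thread; `parse_tag` is a hypothetical helper, not a git command:

```shell
#!/bin/sh
# parse_tag FILE: accept FILE only if line 1 is "commit <40 hex digits>"
# and line 2 is "tag <name>". On success print "<sha1> <name>".
# Anything else is rejected with a non-zero exit status; no guessing.
parse_tag() {
    awk '
        NR == 1 {
            if ($0 !~ /^commit [0-9a-f]+$/ || length($2) != 40) exit 1
            sha = $2
        }
        NR == 2 {
            if ($0 !~ /^tag [^ ]+$/) exit 1
            print sha, $2
            ok = 1
            exit 0
        }
        END { if (!ok) exit 1 }
    ' "$1"
}
```

A file that carries the reference somewhere in free-form text (for example a "REF:" line in the comment) fails this check instead of being guessed at, which is the behaviour argued for above.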

2005-04-23 19:57:56

by Junio C Hamano

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

>>>>> "LT" == Linus Torvalds <[email protected]> writes:

LT> Guys, I will say this once more: git will not look at the signature.

LT> And that is why I from the very beginning tried to make it very clear that
LT> the signature goes at the end. Not at the beginning, not in the middle,
LT> and not in a different file. IT GOES AT THE END.

If that is the case, can't you do it without introducing this
new tag object, like this?

1. Find existing commit-id that you want to tag.
2. Sign that commit object:

cat-file commit $commit |
gpg --detach-sign --armor -u 'Linus Torvalds' >commit.sig

3. Make another commit, making the original commit as its parent:

{
echo tag This is my tag.
cat commit.sig
} | commit-tree $(cat-file commit $commit |
sed -e 's/^tree //;q') -p $commit

Then you can publish the ID of this commit object, which attests
that the original commit is what you vouch for. Am I missing
something?

2005-04-23 20:00:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Jan Harkes wrote:
>
> It is a bit more messy, but it can be done with a detached signature.

Ok, this looks more like it.

Except:

> To sign,
> gpg -ab unsigned_commit
> cat unsigned_commit unsigned_commit.asc > signed_commit
>
> To verify,
> cat signed_commit | sed '/-----BEGIN PGP/Q' | gpg --verify signed_commit -

Except I think you'd need to search for the "---BEGIN PGP" starting from
the end, rather than the beginning.

Anyway, that should be workable. I'll whip something up.

Linus
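
A minimal sketch of the "search from the end" variant, assuming the signature is an ASCII-armored detached signature appended to the tag text. The helper names `tag_body` and `tag_signature` are hypothetical and not part of git:

```shell
#!/bin/sh
# tag_body FILE: print everything before the LAST
# "-----BEGIN PGP SIGNATURE-----" line (the whole file if there is none).
tag_body() {
    awk '
        { line[NR] = $0 }
        /^-----BEGIN PGP SIGNATURE-----$/ { last = NR }
        END {
            stop = (last ? last - 1 : NR)
            for (i = 1; i <= stop; i++) print line[i]
        }
    ' "$1"
}

# tag_signature FILE: print the trailing signature block
# (nothing if the file is unsigned).
tag_signature() {
    awk '
        { line[NR] = $0 }
        /^-----BEGIN PGP SIGNATURE-----$/ { last = NR }
        END {
            if (last) for (i = last; i <= NR; i++) print line[i]
        }
    ' "$1"
}
```

Verification could then be `tag_body signed_tag > body; tag_signature signed_tag > sig; gpg --verify sig body`. Matching the last marker means a comment that happens to contain the marker text still ends up in the signed body.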

2005-04-23 20:15:53

by Jeff Garzik

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

Linus Torvalds wrote:
> That was my plan, at least. But I haven't set up any signature generation
> thing, and this really isn't my area of expertise any more. But my _plan_
> literally was to have the tag object look a lot like a commit object, but
> instead of pointing to the tree and the commit parents, it would point to
> the commit you are tagging. Something like
>
> commit a2755a80f40e5794ddc20e00f781af9d6320fafb
> tag v2.6.12-rc3
> signer Linus Torvalds
>
> This is my official original 2.6.12-rc3 release
>
> -----BEGIN PGP SIGNATURE-----
> ....
> -----END PGP SIGNATURE-----

> with a few fixed headers and then a place for free-form commentary,

groovy



> If somebody writes a script to generate the above kind of thing (and tells
> me how to validate it), I'll do the rest, and start tagging things
> properly. Oh, and make sure the above sounds sane (ie if somebody has a
> better idea for how to more easily identify how to find the public key to
> check against, please speak up).

[tangent]

Any chance you'll have a tree tagged with older releases?
Is someone with access to BK working on that?

I do a lot of patch merges where someone sends me a 2.6.10 patch.
Presuming the fix is still valid, I'll clone to 2.6.10, merge the patch,
pull 2.6.latest into the 2.6.10-based repo, then push the whole she-bang
into one of my for-upstream repos.

Jeff


2005-04-23 20:21:38

by Linus Torvalds

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Junio C Hamano wrote:
>
> If that is the case, can't you do it without introducing this
> new tag object, like this?

No, because I also want to sign the _name_ I gave it.

Otherwise somebody can take my "signed commit", and claim that I called it
something else.

Just signing the commit is indeed sufficient to just say "I trust this
commit". But I essentially want to also say what I trust it _for_ as well.

And sure, I could make a totally bogus "commit" object that just points to
the original commit, uses the same "tree" from that original commit, and
write what I want to trust into that commit. I then sign that, and create
yet _another_ commit that has the signature (and the pointer to the just
signed commit) in its commit message, and then I point to _that_ commit.

So yes, we can certainly do this with playing games with commits. That
sounds singularly ugly, though, since just doing a "tag" object is a lot
more straightforward, and really tells the world what's going on (and
makes it easy for automated tools to just browse the object database and
see "that's a tag").

Linus

2005-04-23 20:25:02

by Junio C Hamano

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

>>>>> "JCH" == I said:

JCH> If that is the case, can't you do it without introducing this
JCH> new tag object, like this?

Of course It Would Not Work. I am an idiot X-<. Sorry. What I
suggested does not authenticate the tag itself.

2005-04-23 20:50:12

by Jan Harkes

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, Apr 23, 2005 at 12:30:38PM -0700, Linus Torvalds wrote:
> On Sat, 23 Apr 2005, Jan Harkes wrote:
> >
> > Why not keep the tags object outside of the tree in the tags/ directory.
>
> Because then you have all those special cases with fetching them and with
> fsck, and with shared object directories. In other words: no.

I respectfully disagree,

rsync works fine for now, but people are already looking at implementing
smarter (more efficient) ways to synchronize git repositories by
grabbing missing commits, and from there fetching any missing tree and
file blobs. However there is no such linkage to discover missing tag
objects, only a full rsync would be able to get them and for that it has
to send the name of every object in the repository to the other side to
check for any missing ones.

So fetching tags is already going to be a special case.

And any form of validation of a tag is a special operation. In fact tags
could be as simple as the sha of an object (like pasky's tags) followed by
the detached pgp signature of the tagged object instead of trying to
sign the tag itself. That also avoids having to strip the signature
part from the tag when we want to validate it.

Jan

2005-04-23 23:28:14

by Linus Torvalds

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.



On Sat, 23 Apr 2005, Jan Harkes wrote:
>
> I respectfully disagree,
>
> rsync works fine for now, but people are already looking at implementing
> smarter (more efficient) ways to synchronize git repositories by
> grabbing missing commits, and from there fetching any missing tree and
> file blobs.

But this is a _feature_.

Other people normally shouldn't be interested in your tags. I think it's a
mistake to make everybody care.

So you normally would fetch only tags you _know_ about. For example, one
of the reasons we've been _avoiding_ personal tags in the BK trees is that
it just gets really ugly really quickly because they get percolated up to
everybody else. That means that in a BK tree, you can't sanely use tags
for "private" stuff, like telling somebody else "please sync with this
tag".

So having the tag in the object database means that fsck etc will notice
these things, and can build up a list of tags you know about. It also
means that you can have tag-aware synchronization tools, ie exactly the
kind of tools that only grab missing commits can also then be used to
select missing tags according to some _private_ understanding of what tags
you might want to find..

Linus

2005-04-24 23:28:08

by Paul Jakma

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, 23 Apr 2005, Linus Torvalds wrote:

> NO.
>
> Guys, I will say this once more: git will not look at the signature.
>
> That means that we don't "strip them off", because dammit, they DO NOT
> EXIST as far as git is concerned. This is why a tag-file will _always_
> start with
>
> commit <commit-sha1>
> tag <tag-name>
>
> because that way we can use fsck and validate reachability and have
> things that want trees (or commits) take tag-files instead, and git
> will automatically look up the associated tree/commit. And it will
> do so _without_ having to understand about signing, since signing
> is for trust between _people_ not for git.

> And that is why I from the very beginning tried to make it very
> clear that the signature goes at the end. Not at the beginning, not
> in the middle, and not in a different file. IT GOES AT THE END.

Actually, can you make the signature be detached and a separate
object? Ie, add a signature object in its own right, distinct from
tag. They could then:

- be used to sign any kind of object
- allow objects to be signed by multiple people

Ideally, there'd be an index of signature objects by the SHA-1 sum of
the object they sign, as the signed object should not refer to the
signature (or the second of the above is not possible).

The latter of the two points would, in combination with the former,
allow for cryptographic 'signed-off-by' chains. If a 'commit' is
signed by $RANDOM_CONTRIBUTOR and $SUBSYSTEM_MAINTAINER and $ANDREW,
you know it's time to pull it. Would also work for things like "fixes
only" trees, where (say) a change must be approved by X/2+1 of a
group of X hackers providing oversight -> looking up the commit
object's signatures would tell you whether it was approved.

No idea whether this is possible or practical. :) But it would be
good for future flexibility to avoid including the signature in the
object being signed.

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
You give me space to belong to myself yet without separating me
from your own life. May it all turn out to your happiness.
-- Goethe
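
The X/2+1 approval rule mentioned above is plain integer arithmetic. A small sketch, with `quorum` and `approved` as hypothetical helper names and the signature count assumed to come from whatever index of signatures ends up existing:

```shell
#!/bin/sh
# quorum X: number of approvals needed out of X reviewers under the
# "X/2+1" rule (integer division, so a strict majority).
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# approved SIGS X: succeed if SIGS valid signatures reach the quorum of X.
approved() {
    [ "$1" -ge "$(quorum "$2")" ]
}
```

For a group of 5 this requires 3 signatures; for a group of 4 it also requires 3, so an even split never counts as approval.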

2005-04-24 23:58:31

by Paul Jakma

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Mon, 25 Apr 2005, Paul Jakma wrote:

> Ideally, there'd be an index of signature objects by the SHA-1 sum of the
> object they sign, as the signed object should not refer to the signature (or
> the second of the above is not possible).

Ah, this could (obviously) be done generally by providing a general
index of 'referrals' (if desirable).

I have no idea whether git already does this, I haven't checked it out
yet but I'm very interested to see how git will mature and have been
trying to follow its progress - I'm a frustrated admin of a CVS
repository..

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
Does the name Pavlov ring a bell?

2005-04-25 00:59:38

by David A. Wheeler

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.




On Sat, 23 Apr 2005, Linus Torvalds wrote:
>> That means that we don't "strip them off", because dammit, they DO NOT
>> EXIST as far as git is concerned. This is why a tag-file will _always_
>> start with
>>
>> commit <commit-sha1>
>> tag <tag-name>
>>
>> because that way we can use fsck and validate reachability and have
>> things that want trees (or commits) take tag-files instead, and git
>> will automatically look up the associated tree/commit. And it will do
>> so _without_ having to understand about signing, since signing is for
>> trust between _people_ not for git.
>
>> And that is why I from the very beginning tried to make it very clear
>> that the signature goes at the end. Not at the beginning, not in the
>> middle, and not in a different file. IT GOES AT THE END.

It may be better to have them as simple detached signatures, which are
completely separate files (see gpg --detach-sign).
Yeah, gpg currently implements detached signatures
by repeating what gets signed, which is unfortunate,
but the _idea_ is the right one.


Paul Jakma wrote:
> Ideally, there'd be an index of signature objects by the SHA-1 sum of
> the object they sign, as the signed object should not refer to the
> signature (or the second of the above is not possible).

Yes, and see my earlier posting. It'd be easy to store signatures in
the current objects directory, of course. The trick is to be able
to go from signed-object to the signature; this could be done
just by creating a subdirectory using a variant of
the name of the signed-object's file, and in that directory store the
hash values of the signatures. E.G.:
00/
    3b128932189018329839019 <- object to sign
    3b128932189018329839019.d/
        0143709289032890234323451
01/
    43709289032890234323451 <- signature

> The latter of the two points would, in combination with the former,
> allow for cryptographic 'signed-off-by' chains. If a 'commit' is signed
> by $RANDOM_CONTRIBUTOR and $SUBSYSTEM_MAINTAINER and $ANDREW, you know
> it's time to pull it. Would also work for things like "fixes only" trees,
> where (say) a change must be approved by X/2+1 of a group of X hackers
> providing oversight -> looking up the commit object's signatures would
> tell you whether it was approved.

Right. Lots of tricks you can do once the signatures are there,
such as checking to counter repository subversion
(did everything get signed), finding out who introduced a malicious
line of code (& "proving" what key signed it first), etc.
There are LOTS of reasons for storing signatures so that they can
be checked later on, just like there are lots of reasons for storing
old code... they give you evidence that the reputed history is true
(and if you doubt it, they give you a way to limit the doubt).

--- David A. Wheeler
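
The directory layout proposed above can be sketched in a few lines of shell. This assumes the "first two hex digits / rest of the hash" fan-out shown in the example, with a ".d" directory next to the signed object; the helper names `sig_dir`, `record_sig` and `list_sigs` are hypothetical:

```shell
#!/bin/sh
# sig_dir OBJDIR SHA1: print the directory that would list the
# signatures of object SHA1, i.e. OBJDIR/<first two digits>/<rest>.d
sig_dir() {
    rest=${2#??}            # hash without its first two characters
    prefix=${2%"$rest"}     # the first two characters
    printf '%s/%s/%s.d\n' "$1" "$prefix" "$rest"
}

# record_sig OBJDIR SIGNED_SHA1 SIGNATURE_SHA1: note that the signature
# object SIGNATURE_SHA1 signs SIGNED_SHA1 (an empty marker file).
record_sig() {
    d=$(sig_dir "$1" "$2")
    mkdir -p "$d" && : > "$d/$3"
}

# list_sigs OBJDIR SIGNED_SHA1: print the hashes of all recorded
# signatures of SIGNED_SHA1, one per line (fails if there are none).
list_sigs() {
    d=$(sig_dir "$1" "$2")
    [ -d "$d" ] && ls "$d"
}
```

The signed object itself is never modified, so several people can sign the same object independently.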

2005-04-25 01:27:46

by David Woodhouse

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sat, 2005-04-23 at 10:31 -0700, Linus Torvalds wrote:
> In other words, I actually want to create "tag objects", the same way we
> have "commit objects". A tag object points to a commit object, but in
> addition it contains the tag name _and_ the digital signature of whoever
> created the tag.

I'm slightly concerned that finding a given tag by its name would be a
fairly slow process if we do _just_ the above. I suspect you'll want
a .git/tags/ directory _anyway_, but with named files which refer to tag
objects, instead of directly to commit objects as in Petr's current
implementation.

Other operations we might want to be at least _reasonably_ efficient
would include 'show me the latest tag from Linus' and 'show me all
extant tags'.

--
dwmw2
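
A rough sketch of the name-to-object lookup described above, assuming one small file per tag name whose content is the SHA1 of the tag object. The `TAG_DIR` variable and the helper names are hypothetical, and the `.git/tags` default simply follows the path mentioned in this message:

```shell
#!/bin/sh
# tag_dir: directory holding one file per tag name.
tag_dir() {
    printf '%s\n' "${TAG_DIR:-.git/tags}"
}

# set_tag NAME SHA1: record that NAME refers to tag object SHA1.
set_tag() {
    mkdir -p "$(tag_dir)" && printf '%s\n' "$2" > "$(tag_dir)/$1"
}

# tag_object NAME: print the SHA1 of the tag object called NAME.
tag_object() {
    cat "$(tag_dir)/$1" 2>/dev/null
}

# all_tags: list every known tag name.
all_tags() {
    ls "$(tag_dir)" 2>/dev/null
}
```

Looking up a tag by name is then a single file read, and listing all extant tags is a directory listing; finding "the latest tag from Linus" would still need the tag objects themselves to be read.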

2005-04-25 01:37:27

by Paul Jakma

[permalink] [raw]
Subject: Re: Git-commits mailing list feed.

On Sun, 24 Apr 2005, David A. Wheeler wrote:

> It may be better to have them as simple detached signatures, which
> are completely separate files (see gpg --detached). Yeah, gpg
> currently implements detached signatures by repeating what gets
> signed, which is unfortunate, but the _idea_ is the right one.

Hmm, what do you mean by "repeating what gets signed"?

> Yes, and see my earlier posting. It'd be easy to store signatures in
> the current objects directory, of course. The trick is to be able
> to go from signed-object to the signature;

Two ways:

1. An index of sigs to signed-object.

(or more generally: objects to referring-objects)

2. Just give people the URI of the signature, let them (or their
tools) follow the 'parent' link to the object of interest

> this could be done just by creating a subdirectory using a variant
> of the name of the signed-object's file, and in that directory
> store the hash values of the signatures. E.G.:

> 00/
>     3b128932189018329839019 <- object to sign
>     3b128932189018329839019.d/
>         0143709289032890234323451
> 01/
>     43709289032890234323451 <- signature

You could hack it in to the namespace somehow I guess. I'm not sure
hacking it in would be a good thing though.

I think it might be more useful just to provide a general index to
look up 'referring' objects (if git does not already - I don't think it
does, but I don't know enough to know for sure). So you could ask
"which {commit,tag,signature,tree}(s) refer(s) to this object?" -
that general concept will always work. If you wanted to make the
implementation of this index use some kind of sub directory as in the
above, fine..

See also method 2 above, which would be more efficient for tools if,
within a project, some developers sign their 'updates' and some
don't.. (you never need to check whether there's a signature or not -
you'll know it from the URI automatically).

> There are LOTS of reasons for storing signatures so that they can
> be checked later on, just like there are lots of reasons for
> storing old code... they give you evidence that the reputed history
> is true (and if you doubt it, they give you a way to limit the
> doubt).

Indeed.

Anyway, we shall see what Linus does. :)

(But I do hope at least that signatures are /not/ included inline
using BEGIN PGP.. in the object that is signed.)

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
To err is human, to purr feline.
To err is human, two curs canine.
To err is human, to moo bovine.

2005-04-25 01:48:49

by Linus Torvalds

Subject: Re: Git-commits mailing list feed.



On Sun, 24 Apr 2005, David A. Wheeler wrote:
>
> It may be better to have them as simple detached signatures, which are
> completely separate files (see gpg --detached).
> Yeah, gpg currently implements detached signatures
> by repeating what gets signed, which is unfortunate,
> but the _idea_ is the right one.

Actually, if we do totally separate files, then the detached thing is ok,
and we might decide to not call them objects at all, since that seems to be
unnecessarily complex.

Maybe we'll just have signed tags by doing exactly that: just a collection
of detached signature files. The question becomes one of how to name such
things in a distributed tree. That is the thing that using an object for
them would have solved very naturally.

Linus

2005-04-25 02:12:05

by David A. Wheeler

Subject: Re: Git-commits mailing list feed.

Paul Jakma wrote:
> On Sun, 24 Apr 2005, David A. Wheeler wrote:
> Hmm, what do you mean by "repeating what gets signed"?

Forget it, irrelevant. I vaguely remembered some problem with
gpg's detached signatures, but it was probably either a really
early alpha version or someone was using "--clearsign" instead
of "--armor". I just did a quick check with:
gpg --armor --detach -o junk.sig junk.c
and it worked "as expected"; no repeat of the data.

>> Yes, and see my earlier posting. It'd be easy to store signatures in
>> the current objects directory, of course. The trick is to be able
>> to go from signed-object to the signature;
> Two ways:
> 1. An index of sigs to signed-object.
> (or more generally: objects to referring-objects)

Right. I suggested putting it in the same directory as the objects,
so that rsync users get them "for free", but a separate directory
has its own advantages & that'd be fine too.
In fact, the more I think about it, I think it'd be cleaner
to have it separate. You could prepend on top of the signature
(if signatures are separate from assertions) WHAT got signed so
that the index could be recreated from scratch when desired.

> 2. Just give people the URI of the signature, let them (or their
> tools) follow the 'parent' link to the object of interest

If you mean "the signatures aren't stored with the objects", NO.
Please don't! If the signatures are not stored in the database,
then over time they'll get lost. It's important to me to
store the record of trust, as well as what changed, so that
ANYONE can later go back and verify that things are as they're
supposed to be, or exactly who trusted what.

> I think it might be more useful just to provide a general index to
> lookup 'referring' objects (if git does not already - I dont think it
> does, but I dont know enough to know for sure).

git definitely doesn't have this currently, though you could run the
fsck tools which end up creating a lot of the info (but it's then
thrown away).

> So you could ask "which
> {commit,tag,signature,tree}(s) refer(s) to this object?" - that general
> concept will always work.

Yes. The problem is that maintaining the index is a pain.
It's probably worth it for signatures, because the primary use
is the other direction ("who signed this?"); it's not clear that
the other direction is common for other data.

--- David A. Wheeler

2005-04-25 02:21:12

by Fabian Franz

Subject: Re: Git-commits mailing list feed.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Am Montag, 25. April 2005 03:50 schrieb Linus Torvalds:

> Maybe we'll just have signed tags by doing exactly that: just a collection
> of detached signature files. The question becomes one of how to name such
> things in a distributed tree. That is the thing that using an object for
> them would have solved very naturally.

What about just <sha1 hash of object>.sig or <sha1 hash of object>.asc?

Or would this violate the concept of the object database to just contain
hashes?

cu

Fabian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFCbFMsI0lSH7CXz7MRAof0AKCILjPE/M72cMSVNDC/DWYSzmrU/ACggOuS
ogNPwUf2ASAwmbwixzSTuPs=
=pW5D
-----END PGP SIGNATURE-----

2005-04-25 02:35:39

by Matt Domsch

Subject: Re: Git-commits mailing list feed.

On Sun, Apr 24, 2005 at 09:01:28PM -0400, David A. Wheeler wrote:
> It may be better to have them as simple detached signatures, which are
> completely separate files (see gpg --detached).
> Yeah, gpg currently implements detached signatures
> by repeating what gets signed, which is unfortunate,
> but the _idea_ is the right one.

I solve this with two simple scripts, "sign" calls "cutsig".

--------------
sign

#!/bin/sh

# Placeholder: replace with the ID of the key to sign with.
DEFAULT_KEY="my-private-key-string"
CUTSIG=~/bin/cutsig.pl

usage()
{
    echo "usage: $0 filename"
    echo " produces filename.sign"
}

if [ $# -lt 1 ]; then
    usage
    exit 1
fi

# Sign to stdout, then let cutsig keep only the signature block.
gpg --armor --clearsign --detach-sign --default-key "${DEFAULT_KEY}" -v -v -o - ${1} | \
    ${CUTSIG} > ${1}.sign

exit 0


-----------------
cutsig


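#!/usr/bin/perl -w
use strict;

# Skip everything before the signature block; stop at EOF if there is none.
my $line;
while (defined($line = <STDIN>)) {
    last if $line =~ /-----BEGIN PGP SIGNATURE-----/;
}
exit 1 unless defined $line;

# Copy the signature block through to stdout.
print $line;
while (defined($line = <STDIN>)) {
    print $line;
}

exit 0;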

2005-04-25 02:42:43

by Linus Torvalds

Subject: Re: Git-commits mailing list feed.



On Mon, 25 Apr 2005, Fabian Franz wrote:
>
> What about just <sha1 hash of object>.sig or <sha1 hash of object>.asc?

Well, the SHA1 of an object really is not a very good name, unless you
have something to manage it with. Again, the object database has something
to manage and find those objects with - things like .git/HEAD, but also
"fsck" to find dangling and unnamed objects.

Maybe we'll never have so many tags that we need to manage them, and yes,
if so, we can just have ".git/signatures" be a directory with objects that
are just named for their content SHA1, the same way the object database
is, but separately (and probably just using a flat file structure, no need
for the subdirectory fan-out that the object directory has).

No need for a ".sig" thing, since they'd be defined to be signatures just
from their location.
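
A flat, content-named directory of that kind could be sketched as follows. This is only an illustration: the function name store_signature and the directory passed to it are made up here, and sha1sum from GNU coreutils is assumed to be available.

```shell
# store_signature SIGFILE SIGDIR
# Copies a detached signature into SIGDIR under the SHA-1 of its own
# content and prints that name. (Hypothetical helper, not part of git.)
store_signature() {
    sigfile=$1
    sigdir=$2
    # Name the file after the hash of its content, as the object database does.
    name=$(sha1sum < "$sigfile" | cut -d' ' -f1) || return 1
    mkdir -p "$sigdir" || return 1
    # Identical content maps to the same name, so an existing file is left alone.
    [ -e "$sigdir/$name" ] || cp -- "$sigfile" "$sigdir/$name" || return 1
    printf '%s\n' "$name"
}
```

Because the name comes from the content, two people signing the same thing with different keys end up with two different files, and nothing ever has to be overwritten.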

> Or would this violate the concept of the object database to just contain
> hashes?

This wouldn't be an object at all in that case, they'd be totally outside
the scope of the git object model.

And yes, if they were to be git objects, they'd follow totally different
rules: they'd have to have the "tag+length+'\0'" format, and they would be
zlib-compressed.
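
For illustration, the "tag+length+'\0'" header shows up in how an object name is derived. The sketch below assumes the layout used by current git, where the SHA-1 is taken over "<type> <size>\0" followed by the uncompressed content; the zlib step applies to the stored file and is not shown. git_object_id is a hypothetical helper, and sha1sum (GNU coreutils) is assumed.

```shell
# git_object_id TYPE FILE
# Prints the SHA-1 of "<TYPE> <size>\0<content of FILE>".
# (Hypothetical helper; assumes the header layout of current git.)
git_object_id() {
    kind=$1
    file=$2
    # Arithmetic expansion strips any padding some wc implementations add.
    size=$(($(wc -c < "$file")))
    # Header first, then the raw content, hashed as one stream.
    { printf '%s %d\0' "$kind" "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1
}
```

With a current git installed, "git_object_id blob FILE" should print the same name as "git hash-object FILE".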

If they are totally outside of git, then I don't care what the object
format is, and then it could be just a regular text-file with a signature
and content, and just happen to be named for the SHA1 hash so that there
is no confusion about what happens when multiple people happen to create
different tags with the same name.

Linus

2005-04-25 02:44:01

by Jan Harkes

Subject: Re: Git-commits mailing list feed.

On Sun, Apr 24, 2005 at 09:34:20PM -0500, Matt Domsch wrote:
> On Sun, Apr 24, 2005 at 09:01:28PM -0400, David A. Wheeler wrote:
> > It may be better to have them as simple detached signatures, which are
> > completely separate files (see gpg --detached).
> > Yeah, gpg currently implements detached signatures
> > by repeating what gets signed, which is unfortunate,
> > but the _idea_ is the right one.
>
> I solve this with two simple scripts, "sign" calls "cutsig".
...
> gpg --armor --clearsign --detach-sign --default-key "${DEFAULT_KEY}" -v -v -o - ${1} | \
> ${CUTSIG} > ${1}.sign

You could also just leave out the --clearsign option and it will DTRT.

Jan

2005-04-25 03:05:28

by Paul Jakma

Subject: Re: Git-commits mailing list feed.

On Sun, 24 Apr 2005, David A. Wheeler wrote:

> Right. I suggested putting it in the same directory as the
> objects, so that rsync users get them "for free", but a separate
> directory has its own advantages & that'd be fine too. In fact, the
> more I think about it, I think it'd be cleaner to have it separate.
> You could prepend on top of the signature (if signatures are
> separate from assertions) WHAT got signed so that the index could
> be recreated from scratch when desired.

Well, i'm trying to play with git right now to see what would fit
with how it abstracts things.

I think possibly:

- add the 'signature object' to the repository after the signed
object

So a 'signed commit' turns into the

- tool preparing the commit object,
- get the user to sign it
- save the detached signature for later
- adding the commit object to the repository
- prepare the signing object and add to repository

The repository head then refers to the signature object, which could
(handwaving) look something like:

Object Signature
Signing <object ID, in this case of the commit object>
Sign-type GPG

<signature data>

Tools should then treat signature objects as 'stand ins' for the
object they are signing (verify the signature - if desired - and then
just retrieve the 'Signing' object ID and use that further).
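
The 'stand in' behaviour could look roughly like this. The sketch assumes objects are readable as plain-text files named by their ID (real git objects are compressed, so a real tool would have to unpack them first), and resolve_signed is a made-up name. The header layout is the hand-waved one above.

```shell
# resolve_signed OBJFILE
# If OBJFILE is a signature object, print the ID from its 'Signing'
# header; otherwise print the object's own ID (its file name).
# (Hypothetical helper; assumes uncompressed, plain-text object files.)
resolve_signed() {
    objfile=$1
    read -r first < "$objfile" || return 1
    if [ "$first" = "Object Signature" ]; then
        # Only look at the header: stop at the first blank line so that
        # nothing inside the signature data can be mistaken for a header.
        sed -n -e '/^$/q' -e 's/^Signing //p' "$objfile" | head -n 1
    else
        basename "$objfile"
    fi
}
```

Note that nothing here verifies the signature; the function only follows the reference, which is the "if desired" part of the description.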

I have no working knowledge of git though, other than following this
list. So I have no idea whether above is at all appropriate or
workable.

> If you mean "the signatures aren't stored with the objects", NO.
> Please don't! If the signatures are not stored in the database,
> then over time they'll get lost.

No more lost than anything else in the git 'fs'.

If someone prunes old objects, they'll lose the signed objects along
with the signatures. If those files weren't replicated anywhere else,
well they've just blown away history for good, both the history of
the source and corresponding signatures.

> It's important to me to store the record of trust, as well as what
> changed, so that ANYONE can later go back and verify that things
> are as they're supposed to be, or exactly who trusted what.

See above.

> git definitely doesn't have this currently, though you could run
> the fsck tools which end up creating a lot of the info (but it's
> then thrown away).

Well, it could be retained then.

> Yes. The problem is that maintaining the index is a pain.

Possibly.

> It's probably worth it for signatures, because the primary use is
> the other direction ("who signed this?"); it's not clear that the
> other direction is common for other data.

In CVS it is. If you 'cvs log' a file, you can get a report on which
revisions of the file belong to which tags (which can be useful
information sometimes: "ah, so that release had the buggy version"
type of thing. Or as a sanity check to make sure you got a tag right
- particularly when you have to move a wrong tag[1]). So, in addition
to signatures, a general 'referrers of this object' index could be
useful for reports.

1. This might be just a CVS thing, and not wanted for git -> the
ability to tag historical revisions and indeed change what tags refer
to.

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
Decaffeinated coffee? Just Say No.

2005-04-25 03:07:05

by David A. Wheeler

Subject: Re: Git-commits mailing list feed.

Linus Torvalds wrote:
>
> On Sun, 24 Apr 2005, David A. Wheeler wrote:
>
>>It may be better to have them as simple detached signatures, which are
>>completely separate files (see gpg --detached).
>
> Actually, if we do totally separate files, then the detached thing is ok,
> and we might decide to not call them objects at all, since that seems to be
> unnecessarily complex.
>
> Maybe we'll just have signed tags by doing exactly that: just a collection
> of detached signature files. The question becomes one of how to name such
> things in a distributed tree. That is the thing that using an object for
> them would have solved very naturally.

I agree, naming signatures using the same way other objects are named
would be very clean. So, why not? It's perfectly reasonable to
just store detached signatures as hashed objects, just like the rest;
just create a new object type ("signature").
If 3 different keys are used to sign the same object, the detached
signatures will have different hash values, so they'll get named easily.

Now you just have to FIND the signature of a signed object,
i.e. efficiently go the "other way" from signed object to detached
signature. A separate directory with this mapping, or embedding the
mapping inside the object directory (HASH.d/<list>) both solve it.

The more I think about it, the more I think a separate "reverse"
index directory would be a better idea. It just needs to map from
"me" to "who references me", at least so that you can quickly
find all signatures of a given object. If the reverse directory
gets wonky, anyone can just delete the reverse index directory
at any time & reconstruct it by iterating the objects.
Before "-----BEGIN PGP SIGNATURE-----" you should add:
signatureof HASHVALUE
to make reconstruction easy; PGP processors ignore stuff
before "-----". The PGP data does include a hash, but it's not
easy to get it out (I don't see a way to do it in gpg from the
command line), and it's quite possible that a signer won't
use SHA-1 when they sign something (they may not even
realize it; it depends on their implementation's configuration).
Better to include something about what was signed with the signature.

Hmm, probably worth backtracking to see what's needed.
There needs to be a way to identify tags, and a way to sign that
tag so that you can decide to trust some tags & not others.
There needs to be a way to sign commits, and store that info
for later. And really, these are special cases of general
assertions about other things; you might want someone to be
able to make other signed assertions (e.g., that it
passed test suite XYZ).

If tags & commits are all you plan to sign for now, well, you
already have commits. You can just add a "tag" type and a
"signature" type of object (the "signature" is just a detached
OpenPGP signature). "signature" can sign tag or commit types.
I still like the idea of a more general "assertion" type, esp.
for assertions that something passed a test suite on a certain date
or was reviewed at a certain date by someone, but admittedly
that could be added later in the same manner.

Then you need to be able to quickly find a signature, given a
commit or tag. A "reverse" directory then does that nicely,
and if you put enough information in front of the signature,
you can regenerate the reverse directory whenever you wish.

--- David A. Wheeler

2005-04-25 03:12:50

by Paul Jakma

Subject: Re: Git-commits mailing list feed.

Ah, to add to below..

If one wished, one could optionally store the actual signature data
as a separate blob object and refer to it in the signing object. Not
needed really for a GPG ASCII clear-signed detached signature (tiny
and they're ASCII obviously :) ), but who knows.

On Mon, 25 Apr 2005, Paul Jakma wrote:

> - add the 'signature object' to the repository after the signed
> object
>
> So a 'signed commit' turns into the
>
> - tool preparing the commit object,
> - get the user to sign it
> - save the detached signature for later
> - adding the commit object to the repository

- adding the signature blob, if it is to be stored as a blob

> - prepare the signing object and add to repository

> The repository head then refers to the signature object, which could
> (handwaving) look something like:
>
> Object Signature
> Signing <object ID, in this case of the commit object>
> Sign-type GPG

With either a 'Signature <ID of signature data blob>' or else:

> <signature data>


regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
May you have many beautiful and obedient daughters.

2005-04-25 03:24:44

by Paul Jakma

Subject: Re: Git-commits mailing list feed.

On Sun, 24 Apr 2005, David A. Wheeler wrote:

> Now you just have to FIND the signature of a signed object, i.e.
> efficiently go the "other way" from signed object to detached
> signature. A separate directory with this mapping, or embedding
> the mapping inside the object directory (HASH.d/<list>) both solve
> it.

You dont even need it, see my other mail. If:

- the signature is an object and added after the commit object

- tools know that signatures are 'proxies of' or precursors to the
objects they are signing (which makes sense, a signature by
definition refers to something else)

- the signature object refers to the object it is signing (eg a
'Signing <object ID>' header)

Then head can simply be the signature object and tools can find the
commit by following the 'Signing' field of the signature (they dont
even need to check the signature is valid). No index lookup needed.

You only need the index for historical verification really, and you
can always generate an index if needs be. (and have the tools
maintain it).

> The more I think about it, the more I think a separate "reverse"
> index directory would be a better idea. It just needs to map from
> "me" to "who references me", at least so that you can quickly
> find all signatures of a given object. If the reverse directory
> gets wonky, anyone can just delete the reverse index directory
> at any time & reconstruct it by iterating the objects.
> Before "-----BEGIN PGP SIGNATURE-----" you should add:
> signatureof HASHVALUE
> to make reconstruction easy; PGP processors ignore stuff
> before "-----".

Oof, dont do this:

- makes assumptions about the format of the signature
- that it is ASCII
- that you can change it

Just add a git header which is independent of the signature data.

In lieu of the 'signature object as precursor' approach above, just
have the tools maintain an index. It can be maintained as objects as
added, and can always be blown away and recreated by inspection of
the repository data.

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
To doubt everything or to believe everything are two equally convenient
solutions; both dispense with the necessity of reflection.
-- H. Poincaré

2005-04-25 03:31:13

by David A. Wheeler

Subject: Re: Git-commits mailing list feed.

On Mon, 25 Apr 2005, Fabian Franz wrote:
>> What about just <sha1 hash of object>.sig or <sha1 hash of object>.asc?

If you mean "hash of object being signed", the problem is that
there may be more than one signature of a given object.
Keys get stolen, for example, so you want to re-sign the objects.
Yes, you could replace the files, but it's nicer to make it
so there's never a need to replace files in the first place.
That's one of the nice properties of the git object database;
so if we can have that property everywhere, I think we should.

Instead, store the signatures in the normal object database, &
give it type "signature". To speed access FROM a commit or tag
to a signature (and FROM a commit to a tag), create a
separate reverse directory that tells you what objects reference
a given object. Like this:
.git/
  objects/
    00/
      0195297c2a6336c2007548f909769e0862b509 <= a commit object
    02/
      0395297c2a6336c2007548f909769e0862b509 <= signature of commit
    04/
      0595297c2a6336c2007548f909769e0862b509 <= a tag
    06/
      0795297c2a6336c2007548f909769e0862b509 <= signature of tag
  reverse/
    00/
      0195297c2a6336c2007548f909769e0862b509/
        020395297c2a6336c2007548f909769e0862b509 "this signs commit"
        .... other later signatures of this commit go here.
    04/
      0595297c2a6336c2007548f909769e0862b509/
        060795297c2a6336c2007548f909769e0862b509
        .... other later signatures of this tag go here.

The reverse directory's contents are basically the filenames.
The files themselves could be symlinks back up, or not.
Content-free files are probably more portable across filesystems,
and it's probably also good for space efficiency
(though I haven't examined that carefully).

"git"'s knowledge of signatures should be VERY limited, and
not dependent on PGP. I think that'd be easy.
You could prepend some signature data into the "signature" file to
make it much easier to reconstruct the reverse directory and
to make it easy to check things WITHOUT knowledge of PGP or whatever.

Here's potential output:

$ cat-file commit 000195297c2a6336c2007548f909769e0862b509
tree 2aaf94eae20acc451553766f3c063bc46cfa75c6
parent dc459bf85b3ff97333e759d641c5d18f4dad470d
author Petr Baudis <[email protected]> 1114303479 +0200
committer Petr Baudis <[email protected]> 1114303479 +0200

Added the whatsit flag.


$ cat-file signature 000195297c2a6336c2007548f909769e0862b509
signatureof commit 000195297c2a6336c2007548f909769e0862b509
signer Petr Baudis <[email protected]>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQBCbFaRCxlT/+f+SU4RAgYSAKCWpPNlDKDkxuuA649zJop7WkQPnACdF1Fg
JgXatbJU8YJ7JHqvgyGepRU=
=Kttg
-----END PGP SIGNATURE-----


$
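
Rebuilding the reverse directory from the "signatureof" headers could be sketched as below. To keep it short, the sketch assumes the signatures sit as plain-text files in one flat directory, each named by its own full hash; in the real object database they would be fanned out and compressed. rebuild_reverse is a made-up name.

```shell
# rebuild_reverse SIGDIR REVDIR
# Recreates REVDIR from scratch: for every signature file in SIGDIR whose
# first line is "signatureof <type> <hash>", creates the empty file
# REVDIR/<first two hex digits>/<remaining digits>/<signature name>.
# (Hypothetical helper; assumes flat, uncompressed signature files.)
rebuild_reverse() {
    sigdir=$1
    revdir=$2
    [ -n "$revdir" ] || return 1
    rm -rf -- "$revdir"
    for sig in "$sigdir"/*; do
        [ -f "$sig" ] || continue
        read -r keyword kind target < "$sig" || continue
        [ "$keyword" = "signatureof" ] || continue
        rest=${target#??}            # hash without its first two digits
        prefix=${target%"$rest"}     # the first two digits
        mkdir -p "$revdir/$prefix/$rest" || return 1
        # Content-free entry: the file name alone carries the information.
        : > "$revdir/$prefix/$rest/$(basename "$sig")"
    done
}
```

Since the index is derived entirely from the signature headers, deleting it and running this again loses nothing.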

--- David A. Wheeler

2005-04-25 03:42:16

by Paul Jakma

Subject: Re: Git-commits mailing list feed.

On Mon, 25 Apr 2005, Paul Jakma wrote:

> You dont even need it, see my other mail. If:
>
> - the signature is an object and added after the commit object
>
> - tools know that signatures are 'proxies of' or precursors to the
> objects they are signing (which makes sense, a signature by
> definition refers to something else)
>
> - the signature object refers to the object it is signing (eg a
> 'Signing <object ID>' header)
>
> Then head can simply be the signature object and tools can find the
> commit by following the 'Signing' field of the signature (they dont
> even need to check the signature is valid). No index lookup needed.

> You only need the index for historical verification really, and you can
> always generate an index if needs be. (and have the tools maintain it).

Uh, I have no idea whether verifying a signature of a commit object
is sufficient, ie equivalent to signing each file.

commit refers to tree objects, which I presume lists the SHA-1 object
IDs of files, but IIRC Linus already described why a signature of the
commit object should not be used to trust the rest of commit.. (i'll
have to find his mail). If so, an index is required.

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
Old programmers never die, they just hit account block limit.

2005-04-25 03:47:34

by Paul Jakma

Subject: Re: Git-commits mailing list feed.

On Mon, 25 Apr 2005, Paul Jakma wrote:

> Uh, I have no idea whether verifying a signature of a commit object is
> sufficient, ie equivalent to signing each file.
>
> commit refers to tree objects, which I presume lists the SHA-1 object IDs of
> files, but IIRC Linus already described why a signature of the commit object
> should not be used to trust the rest of commit.. (i'll have to find his
> mail). If so, an index is required.

Ah, apparently it is sufficient:

Linus:

“Just signing the commit is indeed sufficient to just say "I trust
this commit". But I essentially want to also say what I trust it
_for_ as well.”

So this would work for commit objects.

It would also work for tag objects, if you pointed people at the signature
object rather than the actual tag object.

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
Humor in the Court:
Q. Were you aquainted with the deceased?
A. Yes, sir.
Q. Before or after he died?

2005-04-25 09:31:44

by David Greaves

Subject: Re: Git-commits mailing list feed.

David A. Wheeler wrote:
> $ cat-file signature 000195297c2a6336c2007548f909769e0862b509
minor comment, cat-file gives you raw access to the object data.

better:
$ cat-file signature $(what-signs 000195297c2a6336c2007548f909769e0862b509)
> signatureof commit 000195297c2a6336c2007548f909769e0862b509
> signer Petr Baudis <[email protected]>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.6 (GNU/Linux)
>
> iD8DBQBCbFaRCxlT/+f+SU4RAgYSAKCWpPNlDKDkxuuA649zJop7WkQPnACdF1Fg
> JgXatbJU8YJ7JHqvgyGepRU=
> =Kttg
> -----END PGP SIGNATURE-----

David


--

2005-04-25 15:50:56

by Bodo Eggert

Subject: Re: Git-commits mailing list feed.

Matt Domsch <[email protected]> wrote:

> --------------
> sign

> gpg --armor --clearsign --detach-sign --default-key "${DEFAULT_KEY}" -v -v -o - ${1} | \
> ${CUTSIG} > ${1}.sign

Use quotes!

> exit 0

The exit code should reflect the status from gpg.
If gpg failed, you might also want to remove the .sign file.
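
Putting the suggestions from this thread together (quoting, propagating gpg's status, removing the output on failure, and dropping --clearsign so no cutsig filter is needed), a revised version might look like this. It is a sketch rather than a tested replacement: sign_file and the GPG override variable are names invented here, and the key string is still a placeholder.

```shell
# GPG and DEFAULT_KEY can be overridden from the environment; both
# defaults are placeholders for illustration.
GPG=${GPG:-gpg}
DEFAULT_KEY=${DEFAULT_KEY:-my-private-key-string}

# sign_file FILENAME
# Writes an ASCII-armoured detached signature to FILENAME.sign and
# returns gpg's exit status. (Hypothetical helper.)
sign_file() {
    if [ $# -lt 1 ]; then
        echo "usage: sign_file filename" >&2
        echo " produces filename.sign" >&2
        return 1
    fi
    # --detach-sign on its own emits only the signature block.
    if "$GPG" --armor --detach-sign --default-key "$DEFAULT_KEY" -o - "$1" > "$1.sign"; then
        return 0
    else
        rc=$?
        # Do not leave an empty or partial signature file behind.
        rm -f -- "$1.sign"
        return "$rc"
    fi
}
```

Writing it as a function that returns a status, instead of a script ending in "exit 0", lets a caller react to a failed signature.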

--
Top 100 things you don't want the sysadmin to say:
37. What is all this I here about static charges destroying computers?

2005-05-04 09:02:48

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

David Woodhouse wrote:
> On Sat, 2005-04-23 at 16:30 +0200, Jan Dittmer wrote:
>
>>>LASTRELEASE=`ls -rt .git/tags | grep -v git | grep -v MailDone | tail -1`
>>
>>My .git/tags is empty. At least 2.6.12-rc3 is not tagged so I wasn't sure
>>how to extract the latest release from the git tree.
>>ketchup was the most comfortable way.
>
>
> Nah, asking Linus to tag his releases is the most comfortable way.
>

Here is an updated version of the script, working with pasky's latest tree.

--
Jan


Attachments:
snapjdi.sh (1.05 kB)

2005-05-04 09:20:27

by David Woodhouse

Subject: Re: Git-commits mailing list feed.

On Wed, 2005-05-04 at 11:02 +0200, Jan Dittmer wrote:
> Here is an updated version of the script, working with paskys latest
> tree.

Thanks. I was planning to get this working today -- looks like you've
saved me the trouble.

The chronological output from cg-log is still the wrong thing to do, but
I suppose it'll have to suffice for now.

--
dwmw2

2005-05-04 09:59:52

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

David Woodhouse wrote:
> On Wed, 2005-05-04 at 11:02 +0200, Jan Dittmer wrote:
>
>>Here is an updated version of the script, working with paskys latest
>>tree.
>
>
> Thanks. I was planning to get this working today -- looks like you've
> saved me the trouble.
>
> The chronological output from cg-log is still the wrong thing to do, but
> I suppose it'll have to suffice for now.

Btw. it depends on cg-tag-ls having chronological order. I _hope_ that's
correct. At least with my 4 test -git tags it was.
Otherwise one would have to use fsck-cache --tags, extract the dates, sort by
date, ...

Jan

2005-05-04 10:42:33

by Jan Dittmer

Subject: Re: Git-commits mailing list feed.

David Woodhouse wrote:
> On Wed, 2005-05-04 at 11:02 +0200, Jan Dittmer wrote:
>
>>Here is an updated version of the script, working with paskys latest
>>tree.
>
>
> Thanks. I was planning to get this working today -- looks like you've
> saved me the trouble.

No, I think not. The script breaks for v2.6.12, because tags are sorted
by name not by value.
Needs some sed magic to get the version numbers in a canonical form
first. But I've currently no time to do so, sorry.
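
One possible shape for that sed step, as a sketch only: it assumes tag names of the form vX.Y.Z or vX.Y.Z-rcN, one per line on stdin, and silently drops anything else. sort_release_tags is a made-up name, and this is not what the attached script does.

```shell
# sort_release_tags
# Reads tag names on stdin and prints them in release order, with
# -rcN tags sorting before the final release they lead up to.
# (Hypothetical helper; assumes vX.Y.Z and vX.Y.Z-rcN tag names.)
sort_release_tags() {
    # Prefix each tag with four numeric sort keys; a final release gets
    # 9999 as its "rc" key so that it sorts after all of its -rc tags.
    sed -n \
        -e 's/^v\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\)-rc\([0-9][0-9]*\)$/\1 \2 \3 \4 &/p' \
        -e 's/^v\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\)$/\1 \2 \3 9999 &/p' |
    sort -k1,1n -k2,2n -k3,3n -k4,4n |
    cut -d' ' -f5
}
```

Piping a list of tag names through "sort_release_tags | tail -1" would then pick the newest release by version rather than by name.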

Jan

> The chronological output from cg-log is still the wrong thing to do, but
> I suppose it'll have to suffice for now.