Date: Mon, 26 Oct 2020 06:55:42 +0100
From: Mauro Carvalho Chehab
To: Jonathan Corbet
Cc: Linux Doc Mailing List, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 01/56] scripts: kernel-doc: fix typedef parsing
Message-ID: <20201026065542.709457a2@coco.lan>
In-Reply-To: <20201023112226.4035e3f7@lwn.net>
References: <20201023112226.4035e3f7@lwn.net>
X-Mailer: Claws Mail 3.17.7 (GTK+ 2.24.32;
 x86_64-redhat-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 23 Oct 2020 11:22:26 -0600
Jonathan Corbet wrote:

> On Fri, 23 Oct 2020 18:32:48 +0200
> Mauro Carvalho Chehab wrote:
>
> > The include/linux/genalloc.h file defined this typedef:
> >
> >     typedef unsigned long (*genpool_algo_t)(unsigned long *map,unsigned long size,unsigned long start,unsigned int nr,void *data, struct gen_pool *pool, unsigned long start_addr);
> >
> > Because it has a type composite of two words (unsigned long),
> > the parser gets the typedef name wrong:
> >
> >     .. c:macro:: long
> >
> >        **Typedef**: Allocation callback function type definition
> >
> > Fix the regex in order to accept composite types when
> > defining a typedef for a function pointer.
> >
> > Signed-off-by: Mauro Carvalho Chehab
> > ---
> >  scripts/kernel-doc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> > index 99cd8418ff8a..311d213ee74d 100755
> > --- a/scripts/kernel-doc
> > +++ b/scripts/kernel-doc
> > @@ -1438,7 +1438,7 @@ sub dump_typedef($$) {
> >      $x =~ s@/\*.*?\*/@@gos; # strip comments.
> >
> >      # Parse function prototypes
> > -    if ($x =~ /typedef\s+(\w+)\s*\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
> > +    if ($x =~ /typedef\s+(\w+\s*){1,}\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
>
> I sure wish we could find a way to make all these regexes more
> understandable and maintainable.  Reviewing a change like this is ... fun.
>
> Anyway, it seems to work, but it does now include trailing whitespace in
> the type portion.  So, for example, from include/linux/xarray.h:
>
>     typedef void (*xa_update_node_t)(struct xa_node *node);
>
> The type is parsed as "void " where it was "void" before.
> The only ill effect I can see is that some non-breaking spaces get
> inserted into the HTML output, but perhaps it's worth stripping off
> that trailing space anyway?

Yeah, this is one of the issues. There's another one, though. While the
above regex recognizes the typedef identifier, it only captures the last
word of "unsigned long" in a case like:

    typedef unsigned long (*genpool_algo_t)(unsigned long *map);

Here, we have no option but to use a hidden (non-capturing) group, e.g.
with this regex:

    typedef\s+((?:\w+\s*){1,})\(\*\s*(\w\S+)\s*\)\s*\((.*)\);

I'm enclosing a second version with the above.

Yeah, reviewing it is even more fun, but regex101 can be used to
double-check what the regex is doing:

    https://regex101.com/r/bPTm18/2

Thanks,
Mauro

[PATCH] scripts: kernel-doc: fix typedef parsing

The include/linux/genalloc.h file defined this typedef:

    typedef unsigned long (*genpool_algo_t)(unsigned long *map,unsigned long size,unsigned long start,unsigned int nr,void *data, struct gen_pool *pool, unsigned long start_addr);

Because its return type is composed of two words (unsigned long),
the parser gets the typedef name wrong:

    .. c:macro:: long

       **Typedef**: Allocation callback function type definition

Fix the regex in order to accept composite types when
defining a typedef for a function pointer.

Signed-off-by: Mauro Carvalho Chehab

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 99cd8418ff8a..b37f3cf8a331 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1438,13 +1438,14 @@ sub dump_typedef($$) {
     $x =~ s@/\*.*?\*/@@gos; # strip comments.
 
     # Parse function prototypes
-    if ($x =~ /typedef\s+(\w+)\s*\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
+    if ($x =~ /typedef\s+((?:\w+\s*){1,})\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
         $x =~ /typedef\s+(\w+)\s*(\w\S+)\s*\s*\((.*)\);/) {
 
         # Function typedefs
         $return_type = $1;
         $declaration_name = $2;
         my $args = $3;
+        $return_type =~ s/\s+$//;
 
         create_parameterlist($args, ',', $file, $declaration_name);
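
For anyone who wants to check the behaviour locally rather than on
regex101, the three variants of the first alternative discussed in this
thread can be compared with a small script. This is only a sketch under
stated assumptions: it uses Python's `re` module as a stand-in for the
Perl engine that scripts/kernel-doc actually runs on (the constructs
used in these patterns are expected to behave the same way in both), and
`parse_function_typedef` and the constant names are hypothetical helpers
made up for illustration, not part of kernel-doc.

```python
import re

# First alternative of the "Parse function prototypes" test, in three versions.
ORIGINAL = re.compile(r"typedef\s+(\w+)\s*\(\*\s*(\w\S+)\s*\)\s*\((.*)\);")
V3_PATCH = re.compile(r"typedef\s+(\w+\s*){1,}\(\*\s*(\w\S+)\s*\)\s*\((.*)\);")
HIDDEN_GROUP = re.compile(
    r"typedef\s+((?:\w+\s*){1,})\(\*\s*(\w\S+)\s*\)\s*\((.*)\);"
)
# Second alternative, unchanged by the patch.
FALLBACK = re.compile(r"typedef\s+(\w+)\s*(\w\S+)\s*\s*\((.*)\);")

# Sample typedefs taken from the thread.
GENPOOL = "typedef unsigned long (*genpool_algo_t)(unsigned long *map);"
XARRAY = "typedef void (*xa_update_node_t)(struct xa_node *node);"


def parse_function_typedef(text, first, strip=False):
    """Hypothetical helper: try `first`, then FALLBACK, like the `||` in the script.

    Returns (return_type, declaration_name, args), or None if neither matches.
    """
    m = first.search(text) or FALLBACK.search(text)
    if m is None:
        return None
    return_type, name, args = m.groups()
    if strip:
        # Rough equivalent of the added `$return_type =~ s/\s+$//;`
        return_type = return_type.rstrip()
    return return_type, name, args


for label, pattern, strip in [
    ("original", ORIGINAL, False),
    ("v3 patch", V3_PATCH, False),
    ("hidden group + strip", HIDDEN_GROUP, True),
]:
    for text in (GENPOOL, XARRAY):
        print(f"{label:22} {parse_function_typedef(text, pattern, strip)!r}")
```

With the original pattern, the first alternative fails on the genpool
typedef and the fallback reports the name as "long". The v3 pattern
finds "genpool_algo_t" but keeps only "long " as the type, because a
repeated capturing group retains its last iteration, and it leaves
"void " with a trailing space for the xarray typedef. The non-capturing
group wrapped in an outer capture, combined with the strip, yields
"unsigned long" and "void".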