2022-12-09 19:33:16

by Paulo Miguel Almeida

[permalink] [raw]
Subject: [PATCH] scripts/tags.sh: allow only selected directories to be indexed

It's common for drivers that share same physical components to also
duplicate source code (or at least portions of it). A good example is
both drivers/gpu/drm/amdgpu/* and drivers/gpu/drm/radeon/* have a header
file called atombios.h.

While their contents aren't the same, a lot of their structs have
the exact same names which makes navigating through the code base a bit
messy as cscope will show up 'references' across drivers which aren't
exactly correct.

This patch makes it possible for the devs to specify which folders
they want to include as part of the find_other_sources function if a
makefile variable OTHERSRCDIRS is present, otherwise the original
behaviour is kept.

Example:
make ARCH=x86 OTHERSRCDIRS=drivers/gpu/drm/radeon,tools cscope

Signed-off-by: Paulo Miguel Almeida <[email protected]>
---
scripts/tags.sh | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/scripts/tags.sh b/scripts/tags.sh
index e137cf15aae9..958c07c4ac4a 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -59,12 +59,17 @@ find_include_sources()
}

# find sources in rest of tree
-# we could benefit from a list of dirs to search in here
find_other_sources()
{
- find ${tree}* $ignore \
- \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
- -name "$1" -not -type l -print;
+ find_def_params="-name $1 -not -type l -print"
+ if [ -n "${OTHERSRCDIRS}" ]; then
+ exp_src_dirs=$(sed 's/,/ /g' <<< ${OTHERSRCDIRS})
+ find ${exp_src_dirs} ${ignore} ${find_def_params};
+ else
+ find ${tree}* ${ignore} \
+ \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) \
+ -prune -o ${find_def_params};
+ fi
}

find_sources()
--
2.38.1


2022-12-10 20:42:49

by Vipin Sharma

[permalink] [raw]
Subject: Re: [PATCH] scripts/tags.sh: allow only selected directories to be indexed

On Fri, Dec 9, 2022 at 11:18 AM Paulo Miguel Almeida
<[email protected]> wrote:
>
> It's common for drivers that share same physical components to also
> duplicate source code (or at least portions of it). A good example is
> both drivers/gpu/drm/amdgpu/* and drivers/gpu/drm/radeon/* have a header
> file called atombios.h.
>
> While their contents aren't the same, a lot of their structs have
> the exact same names which makes navigating through the code base a bit
> messy as cscope will show up 'references' across drivers which aren't
> exactly correct.
>
> This patch makes it possible for the devs to specify which folders
> they want to include as part of the find_other_sources function if a
> makefile variable OTHERSRCDIRS is present, otherwise the original
> behaviour is kept.
>
> Example:
> make ARCH=x86 OTHERSRCDIRS=drivers/gpu/drm/radeon,tools cscope
>

It is better to make the opposite option i.e. ignore directories. By
default, cscope is all inclusive and it is more beneficial to have
more code indexed than less. Default indexed
directories will be different with and without OTHERSRCDIRS.

For example,

make ARCH=x86 cscope

# This includes all of the kernel code except non-x86 arch code.

make ARCH=x86 OTHERSRCSDIRS=drivers/gpu/drm/radeon/tools,tools cscope

# This includes only arch/x86, include/, tools/ and
driver/gpu/drm/radeon/tools. It removes kernel/, block/, lib/,
crypto/, virt/, etc. These are important kernel source code
directories.

My vote is to make something like:
make ARCH=x86 IGNOREDIRS=drivers/gpu/drm/amdgpu cscope

Parse IGNOREDIRS in the scripts/tags.sh and append to $ignore variable.

Also you should write this option in /Documentation/kbuild/kbuild.rst
similar to ALLSOURCE_ARCHS

Thanks


> Signed-off-by: Paulo Miguel Almeida <[email protected]>
> ---
> scripts/tags.sh | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/tags.sh b/scripts/tags.sh
> index e137cf15aae9..958c07c4ac4a 100755
> --- a/scripts/tags.sh
> +++ b/scripts/tags.sh
> @@ -59,12 +59,17 @@ find_include_sources()
> }
>
> # find sources in rest of tree
> -# we could benefit from a list of dirs to search in here
> find_other_sources()
> {
> - find ${tree}* $ignore \
> - \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
> - -name "$1" -not -type l -print;
> + find_def_params="-name $1 -not -type l -print"
> + if [ -n "${OTHERSRCDIRS}" ]; then
> + exp_src_dirs=$(sed 's/,/ /g' <<< ${OTHERSRCDIRS})
> + find ${exp_src_dirs} ${ignore} ${find_def_params};
> + else
> + find ${tree}* ${ignore} \
> + \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) \
> + -prune -o ${find_def_params};
> + fi
> }
>
> find_sources()
> --
> 2.38.1
>

2022-12-10 21:56:39

by Paulo Miguel Almeida

[permalink] [raw]
Subject: Re: [PATCH] scripts/tags.sh: allow only selected directories to be indexed

On Sat, Dec 10, 2022 at 12:31:41PM -0800, Vipin Sharma wrote:
> On Fri, Dec 9, 2022 at 11:18 AM Paulo Miguel Almeida
> <[email protected]> wrote:
> >
> > It's common for drivers that share same physical components to also
> > duplicate source code (or at least portions of it). A good example is
> > both drivers/gpu/drm/amdgpu/* and drivers/gpu/drm/radeon/* have a header
> > file called atombios.h.
> >
> > While their contents aren't the same, a lot of their structs have
> > the exact same names which makes navigating through the code base a bit
> > messy as cscope will show up 'references' across drivers which aren't
> > exactly correct.
> >
> > This patch makes it possible for the devs to specify which folders
> > they want to include as part of the find_other_sources function if a
> > makefile variable OTHERSRCDIRS is present, otherwise the original
> > behaviour is kept.
> >
> > Example:
> > make ARCH=x86 OTHERSRCDIRS=drivers/gpu/drm/radeon,tools cscope
> >
>
> It is better to make the opposite option i.e. ignore directories. By
> default, cscope is all inclusive and it is more beneficial to have
> more code indexed than less. Default indexed
> directories will be different with and without OTHERSRCDIRS.
>
> For example,
>
> make ARCH=x86 cscope
>
> # This includes all of the kernel code except non-x86 arch code.
>
> make ARCH=x86 OTHERSRCSDIRS=drivers/gpu/drm/radeon/tools,tools cscope
>
> # This includes only arch/x86, include/, tools/ and
> driver/gpu/drm/radeon/tools. It removes kernel/, block/, lib/,
> crypto/, virt/, etc. These are important kernel source code
> directories.
>
> My vote is to make something like:
> make ARCH=x86 IGNOREDIRS=drivers/gpu/drm/amdgpu cscope
>
> Parse IGNOREDIRS in the scripts/tags.sh and append to $ignore variable.
>
> Also you should write this option in /Documentation/kbuild/kbuild.rst
> similar to ALLSOURCE_ARCHS
>
> Thanks

Hi Vipin,

Thanks for taking the time to review this patch. I agree with you that
keeping cscope in its all inclusive approach is a better move. I will
make the requested changes and send a new patch.

Thanks!

- Paulo A.

>
>
> > Signed-off-by: Paulo Miguel Almeida <[email protected]>
> > ---
> > scripts/tags.sh | 13 +++++++++----
> > 1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/scripts/tags.sh b/scripts/tags.sh
> > index e137cf15aae9..958c07c4ac4a 100755
> > --- a/scripts/tags.sh
> > +++ b/scripts/tags.sh
> > @@ -59,12 +59,17 @@ find_include_sources()
> > }
> >
> > # find sources in rest of tree
> > -# we could benefit from a list of dirs to search in here
> > find_other_sources()
> > {
> > - find ${tree}* $ignore \
> > - \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
> > - -name "$1" -not -type l -print;
> > + find_def_params="-name $1 -not -type l -print"
> > + if [ -n "${OTHERSRCDIRS}" ]; then
> > + exp_src_dirs=$(sed 's/,/ /g' <<< ${OTHERSRCDIRS})
> > + find ${exp_src_dirs} ${ignore} ${find_def_params};
> > + else
> > + find ${tree}* ${ignore} \
> > + \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) \
> > + -prune -o ${find_def_params};
> > + fi
> > }
> >
> > find_sources()
> > --
> > 2.38.1
> >

2022-12-11 01:00:24

by Paulo Miguel Almeida

[permalink] [raw]
Subject: [PATCH v2] scripts/tags.sh: choose which directories to exclude from being indexed

It's common for drivers that share same physical components to also
duplicate source code (or at least portions of it). A good example is
both drivers/gpu/drm/amdgpu/* and drivers/gpu/drm/radeon/* have a header
file called atombios.h.

While their contents aren't the same, a lot of their structs have
the exact same names which makes navigating through the code base a bit
messy as cscope will show up 'references' across drivers which aren't
exactly correct.

This patch makes it possible for the devs to specify which folders
they don't want to include into database as part of the
find_other_sources func if a makefile variable IGNOREDIRS is present,
otherwise the original behaviour is kept.

Example:
make ARCH=x86 IGNOREDIRS=drivers/gpu/drm/radeon,tools cscope

Signed-off-by: Paulo Miguel Almeida <[email protected]>
---
Changelog:

- v2: change approach to include everything unless specified by the
IGNOREDIRS variable: (Req: Vipin Sharma)
- v1: https://lore.kernel.org/lkml/[email protected]/
---
Documentation/kbuild/kbuild.rst | 7 +++++++
scripts/tags.sh | 11 +++++++++--
2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
index 08f575e6236c..5f99f30e20d8 100644
--- a/Documentation/kbuild/kbuild.rst
+++ b/Documentation/kbuild/kbuild.rst
@@ -278,6 +278,13 @@ To get all available archs you can also specify all. E.g.::

$ make ALLSOURCE_ARCHS=all tags

+IGNOREDIRS
+---------------
+For tags/TAGS/cscope targets, you can choose which directories won't
+be included in the databases, separated by comma. E.g.:
+
+ $ make IGNOREDIRS=drivers/gpu/drm/radeon,tools cscope
+
KBUILD_BUILD_TIMESTAMP
----------------------
Setting this to a date string overrides the timestamp used in the
diff --git a/scripts/tags.sh b/scripts/tags.sh
index e137cf15aae9..554721e9cad2 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -59,10 +59,17 @@ find_include_sources()
}

# find sources in rest of tree
-# we could benefit from a list of dirs to search in here
find_other_sources()
{
- find ${tree}* $ignore \
+ local loc_ignore=${ignore}
+ if [ -n "${IGNOREDIRS}" ]; then
+ exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
+ for i in ${exp_ignored_dirs}; do
+ loc_ignore="${loc_ignore} ( -path $i ) -prune -o"
+ done
+ fi
+
+ find ${tree}* ${loc_ignore} \
\( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
-name "$1" -not -type l -print;
}
--
2.38.1

2022-12-11 05:43:10

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: [PATCH v2] scripts/tags.sh: choose which directories to exclude from being indexed

On Sun, Dec 11, 2022 at 12:02:18PM +1300, Paulo Miguel Almeida wrote:
> This patch makes it possible for the devs to specify which folders
> they don't want to include into database as part of the
> find_other_sources func if a makefile variable IGNOREDIRS is present,
> otherwise the original behaviour is kept.

Better say "Add IGNOREDIRS variable, which specifies which directories
to be ignored from indexing."

> @@ -278,6 +278,13 @@ To get all available archs you can also specify all. E.g.::
>
> $ make ALLSOURCE_ARCHS=all tags
>
> +IGNOREDIRS
> +---------------
> +For tags/TAGS/cscope targets, you can choose which directories won't
> +be included in the databases, separated by comma. E.g.:
> +
> + $ make IGNOREDIRS=drivers/gpu/drm/radeon,tools cscope
> +

Use two-colon syntax (::) for code block above to be consistent with
other code blocks.

Thanks.

--
An old man doll... just what I always wanted! - Clara


Attachments:
(No filename) (965.00 B)
signature.asc (235.00 B)
Download all attachments

2022-12-11 20:45:50

by Paulo Miguel Almeida

[permalink] [raw]
Subject: Re: [PATCH v2] scripts/tags.sh: choose which directories to exclude from being indexed

On Sun, Dec 11, 2022 at 11:21:33AM +0700, Bagas Sanjaya wrote:
> On Sun, Dec 11, 2022 at 12:02:18PM +1300, Paulo Miguel Almeida wrote:
> > This patch makes it possible for the devs to specify which folders
> > they don't want to include into database as part of the
> > find_other_sources func if a makefile variable IGNOREDIRS is present,
> > otherwise the original behaviour is kept.
>
> Better say "Add IGNOREDIRS variable, which specifies which directories
> to be ignored from indexing."
>
> > @@ -278,6 +278,13 @@ To get all available archs you can also specify all. E.g.::
> >
> > $ make ALLSOURCE_ARCHS=all tags
> >
> > +IGNOREDIRS
> > +---------------
> > +For tags/TAGS/cscope targets, you can choose which directories won't
> > +be included in the databases, separated by comma. E.g.:
> > +
> > + $ make IGNOREDIRS=drivers/gpu/drm/radeon,tools cscope
> > +
>
> Use two-colon syntax (::) for code block above to be consistent with
> other code blocks.
>
> Thanks.
>
> --
> An old man doll... just what I always wanted! - Clara

Hi Bagas,

thanks for taking the time to review this patch. I will make the changes
that you've pointed out.

thanks!

- Paulo A,

2022-12-12 21:51:33

by Vipin Sharma

[permalink] [raw]
Subject: Re: [PATCH v2] scripts/tags.sh: choose which directories to exclude from being indexed

On Sat, Dec 10, 2022 at 3:02 PM Paulo Miguel Almeida
<[email protected]> wrote:
>
> It's common for drivers that share same physical components to also
> duplicate source code (or at least portions of it). A good example is
> both drivers/gpu/drm/amdgpu/* and drivers/gpu/drm/radeon/* have a header
> file called atombios.h.
>
> While their contents aren't the same, a lot of their structs have
> the exact same names which makes navigating through the code base a bit
> messy as cscope will show up 'references' across drivers which aren't
> exactly correct.
>
> This patch makes it possible for the devs to specify which folders
> they don't want to include into database as part of the
> find_other_sources func if a makefile variable IGNOREDIRS is present,
> otherwise the original behaviour is kept.
>
> Example:
> make ARCH=x86 IGNOREDIRS=drivers/gpu/drm/radeon,tools cscope
>
> Signed-off-by: Paulo Miguel Almeida <[email protected]>
> ---
> Changelog:
>
> - v2: change approach to include everything unless specified by the
> IGNOREDIRS variable: (Req: Vipin Sharma)
> - v1: https://lore.kernel.org/lkml/[email protected]/
> ---
> Documentation/kbuild/kbuild.rst | 7 +++++++
> scripts/tags.sh | 11 +++++++++--
> 2 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
> index 08f575e6236c..5f99f30e20d8 100644
> --- a/Documentation/kbuild/kbuild.rst
> +++ b/Documentation/kbuild/kbuild.rst
> @@ -278,6 +278,13 @@ To get all available archs you can also specify all. E.g.::
>
> $ make ALLSOURCE_ARCHS=all tags
>
> +IGNOREDIRS
> +---------------
> +For tags/TAGS/cscope targets, you can choose which directories won't
> +be included in the databases, separated by comma. E.g.:
> +
> + $ make IGNOREDIRS=drivers/gpu/drm/radeon,tools cscope
> +
> KBUILD_BUILD_TIMESTAMP
> ----------------------
> Setting this to a date string overrides the timestamp used in the
> diff --git a/scripts/tags.sh b/scripts/tags.sh
> index e137cf15aae9..554721e9cad2 100755
> --- a/scripts/tags.sh
> +++ b/scripts/tags.sh
> @@ -59,10 +59,17 @@ find_include_sources()
> }
>
> # find sources in rest of tree
> -# we could benefit from a list of dirs to search in here
> find_other_sources()
> {
> - find ${tree}* $ignore \
> + local loc_ignore=${ignore}
> + if [ -n "${IGNOREDIRS}" ]; then
> + exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
> + for i in ${exp_ignored_dirs}; do
> + loc_ignore="${loc_ignore} ( -path $i ) -prune -o"
> + done
> + fi
> +

This should be global overwrite instead of just in this function.
Before find_other_sources() is executed, this script finds files in
arch directories. So, if you keep it local then those files cannot be
excluded which makes execution of the command incorrect:

make IGNOREDIRS=arch/x86 cscope

Above command will still index all of the code in arch/x86. Something
like this will be better.

--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -17,6 +17,13 @@ ignore="$(echo "$RCS_FIND_IGNORE" | sed 's|\\||g' )"
# tags and cscope files should also ignore MODVERSION *.mod.c files
ignore="$ignore ( -name *.mod.c ) -prune -o"

+if [ -n "${IGNOREDIRS}" ]; then
+ exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
+ for i in ${exp_ignored_dirs}; do
+ ignore="${ignore} ( -path $i ) -prune -o"
+ done
+fi
+
# Use make KBUILD_ABS_SRCTREE=1 {tags|cscope}
# to force full paths for a non-O= build
if [ "${srctree}" = "." -o -z "${srctree}" ]; then
@@ -62,9 +69,9 @@ find_include_sources()
# we could benefit from a list of dirs to search in here
find_other_sources()
{
- find ${tree}* $ignore \
- \( -path ${tree}include -o -path ${tree}arch -o -name
'.tmp_*' \) -prune -o \
- -name "$1" -not -type l -print;
+ find ${tree}* ${ignore} \
+ \( -path ${tree}include -o -path ${tree}arch -o -name
'.tmp_*' \) -prune -o \
+ -name "$1" -not -type l -print;
}

We will still have to specify arch/x86 and arch/x86/include but this
works and keeps the definition of IGNOREDIRS relatively correct.


> + find ${tree}* ${loc_ignore} \
> \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
> -name "$1" -not -type l -print;
> }
> --
> 2.38.1
>

2022-12-12 22:29:56

by Paulo Miguel Almeida

[permalink] [raw]
Subject: Re: [PATCH v2] scripts/tags.sh: choose which directories to exclude from being indexed

On Mon, Dec 12, 2022 at 01:27:37PM -0800, Vipin Sharma wrote:
> On Sat, Dec 10, 2022 at 3:02 PM Paulo Miguel Almeida
> <[email protected]> wrote:
> > # find sources in rest of tree
> > -# we could benefit from a list of dirs to search in here
> > find_other_sources()
> > {
> > - find ${tree}* $ignore \
> > + local loc_ignore=${ignore}
> > + if [ -n "${IGNOREDIRS}" ]; then
> > + exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
> > + for i in ${exp_ignored_dirs}; do
> > + loc_ignore="${loc_ignore} ( -path $i ) -prune -o"
> > + done
> > + fi
> > +
>
> This should be global overwrite instead of just in this function.
> Before find_other_sources() is executed, this script finds files in
> arch directories. So, if you keep it local then those files cannot be
> excluded which makes execution of the command incorrect:
>
> make IGNOREDIRS=arch/x86 cscope
>

Hi Vipin, thanks for taking the time to review this patch.

I see where you are coming from. I was aware of the 'loophole' that the
current approach could have but, to be honest, I thought that there
would be very little use in being able to exclude arch/.*?/ files.

The reason for that being that I thought the most common usage for this
feature would be to ignore folders within subsystems like drivers and
tools to ensure code navigation would be less 'messy'.

Additionally, if we go with the global IGNOREDIRS approach you just
described, we could have some conflicting options too such as:

make ALLSOURCE_ARCHS="x86 arm" IGNOREDIRS=arch/x86 cscope

My 2 cents is that ALLSOURCE_ARCHS is already the mechanism for
excluding archs so it's 'okay' to keep IGNOREDIRS as is.

Let me know your thoughts.

Thanks!

- Paulo A.

> Above command will still index all of the code in arch/x86. Something
> like this will be better.
>
> --- a/scripts/tags.sh
> +++ b/scripts/tags.sh
> @@ -17,6 +17,13 @@ ignore="$(echo "$RCS_FIND_IGNORE" | sed 's|\\||g' )"
> # tags and cscope files should also ignore MODVERSION *.mod.c files
> ignore="$ignore ( -name *.mod.c ) -prune -o"
>
> +if [ -n "${IGNOREDIRS}" ]; then
> + exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
> + for i in ${exp_ignored_dirs}; do
> + ignore="${ignore} ( -path $i ) -prune -o"
> + done
> +fi
> +
> # Use make KBUILD_ABS_SRCTREE=1 {tags|cscope}
> # to force full paths for a non-O= build
> if [ "${srctree}" = "." -o -z "${srctree}" ]; then
> @@ -62,9 +69,9 @@ find_include_sources()
> # we could benefit from a list of dirs to search in here
> find_other_sources()
> {
> - find ${tree}* $ignore \
> - \( -path ${tree}include -o -path ${tree}arch -o -name
> '.tmp_*' \) -prune -o \
> - -name "$1" -not -type l -print;
> + find ${tree}* ${ignore} \
> + \( -path ${tree}include -o -path ${tree}arch -o -name
> '.tmp_*' \) -prune -o \
> + -name "$1" -not -type l -print;
> }
>
> We will still have to specify arch/x86 and arch/x86/include but this
> works and keeps the definition of IGNOREDIRS relatively correct.
>
>
> > + find ${tree}* ${loc_ignore} \
> > \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
> > -name "$1" -not -type l -print;
> > }
> > --
> > 2.38.1
> >

2022-12-12 22:38:40

by Vipin Sharma

[permalink] [raw]
Subject: Re: [PATCH v2] scripts/tags.sh: choose which directories to exclude from being indexed

On Mon, Dec 12, 2022 at 1:59 PM Paulo Miguel Almeida
<[email protected]> wrote:
>
> On Mon, Dec 12, 2022 at 01:27:37PM -0800, Vipin Sharma wrote:
> > On Sat, Dec 10, 2022 at 3:02 PM Paulo Miguel Almeida
> > <[email protected]> wrote:
> > > # find sources in rest of tree
> > > -# we could benefit from a list of dirs to search in here
> > > find_other_sources()
> > > {
> > > - find ${tree}* $ignore \
> > > + local loc_ignore=${ignore}
> > > + if [ -n "${IGNOREDIRS}" ]; then
> > > + exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
> > > + for i in ${exp_ignored_dirs}; do
> > > + loc_ignore="${loc_ignore} ( -path $i ) -prune -o"
> > > + done
> > > + fi
> > > +
> >
> > This should be global overwrite instead of just in this function.
> > Before find_other_sources() is executed, this script finds files in
> > arch directories. So, if you keep it local then those files cannot be
> > excluded which makes execution of the command incorrect:
> >
> > make IGNOREDIRS=arch/x86 cscope
> >
>
> Hi Vipin, thanks for taking the time to review this patch.
>
> I see where you are coming from. I was aware of the 'loophole' that the
> current approach could have but, to be honest, I thought that there
> would be very little use in being able to exclude arch/.*?/ files.
>
> The reason for that being that I thought the most common usage for this
> feature would be to ignore folders within subsystems like drivers and
> tools to ensure code navigation would be less 'messy'.

Yes, the original intent was to make driver code browsing less messy
but if we are introducing an option we should adapt it for generic
cases and correct the semantics.

>
> Additionally, if we go with the global IGNOREDIRS approach you just
> described, we could have some conflicting options too such as:
>
> make ALLSOURCE_ARCHS="x86 arm" IGNOREDIRS=arch/x86 cscope
>

I don't think this is conflicting, to me it is more complementary.
Above line shows get all code for x86 and arm but don't get x86 source
code ("arch/x86/include" is fine). This can even be fine tuned to sub
directories.

I just now noticed after seeing your command, ALLSOURCE_ARCHS take
space separated values, whereas, IGNOREDIRS take comma separated
values. They both should be in the same format, since ALLSOURCE_ARCHS
was already there, it is better to change IGNOREDIRS.

Can you also change IGNOREDIRS to IGNORE_DIRS? It is much easier to
read this way. Sorry, I should have said this in the beginning.

> My 2 cents is that ALLSOURCE_ARCHS is already the mechanism for
> excluding archs so it's 'okay' to keep IGNOREDIRS as is.
>
> Let me know your thoughts.
>
> Thanks!
>
> - Paulo A.
>
> > Above command will still index all of the code in arch/x86. Something
> > like this will be better.
> >
> > --- a/scripts/tags.sh
> > +++ b/scripts/tags.sh
> > @@ -17,6 +17,13 @@ ignore="$(echo "$RCS_FIND_IGNORE" | sed 's|\\||g' )"
> > # tags and cscope files should also ignore MODVERSION *.mod.c files
> > ignore="$ignore ( -name *.mod.c ) -prune -o"
> >
> > +if [ -n "${IGNOREDIRS}" ]; then
> > + exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
> > + for i in ${exp_ignored_dirs}; do
> > + ignore="${ignore} ( -path $i ) -prune -o"
> > + done
> > +fi
> > +
> > # Use make KBUILD_ABS_SRCTREE=1 {tags|cscope}
> > # to force full paths for a non-O= build
> > if [ "${srctree}" = "." -o -z "${srctree}" ]; then
> > @@ -62,9 +69,9 @@ find_include_sources()
> > # we could benefit from a list of dirs to search in here
> > find_other_sources()
> > {
> > - find ${tree}* $ignore \
> > - \( -path ${tree}include -o -path ${tree}arch -o -name
> > '.tmp_*' \) -prune -o \
> > - -name "$1" -not -type l -print;
> > + find ${tree}* ${ignore} \
> > + \( -path ${tree}include -o -path ${tree}arch -o -name
> > '.tmp_*' \) -prune -o \
> > + -name "$1" -not -type l -print;
> > }
> >
> > We will still have to specify arch/x86 and arch/x86/include but this
> > works and keeps the definition of IGNOREDIRS relatively correct.
> >
> >
> > > + find ${tree}* ${loc_ignore} \
> > > \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
> > > -name "$1" -not -type l -print;
> > > }
> > > --
> > > 2.38.1
> > >

2022-12-13 02:24:15

by Paulo Miguel Almeida

[permalink] [raw]
Subject: Re: [PATCH v2] scripts/tags.sh: choose which directories to exclude from being indexed

On Mon, Dec 12, 2022 at 02:32:41PM -0800, Vipin Sharma wrote:
> On Mon, Dec 12, 2022 at 1:59 PM Paulo Miguel Almeida
> <[email protected]> wrote:
> >
> > On Mon, Dec 12, 2022 at 01:27:37PM -0800, Vipin Sharma wrote:
> > > On Sat, Dec 10, 2022 at 3:02 PM Paulo Miguel Almeida
> > > <[email protected]> wrote:
> > > > # find sources in rest of tree
> > > > -# we could benefit from a list of dirs to search in here
> > > > find_other_sources()
> > > > {
> > > > - find ${tree}* $ignore \
> > > > + local loc_ignore=${ignore}
> > > > + if [ -n "${IGNOREDIRS}" ]; then
> > > > + exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
> > > > + for i in ${exp_ignored_dirs}; do
> > > > + loc_ignore="${loc_ignore} ( -path $i ) -prune -o"
> > > > + done
> > > > + fi
> > > > +
> > >
> > > This should be global overwrite instead of just in this function.
> > > Before find_other_sources() is executed, this script finds files in
> > > arch directories. So, if you keep it local then those files cannot be
> > > excluded which makes execution of the command incorrect:
> > >
> > > make IGNOREDIRS=arch/x86 cscope
> > >
> >
> > Hi Vipin, thanks for taking the time to review this patch.
> >
> > I see where you are coming from. I was aware of the 'loophole' that the
> > current approach could have but, to be honest, I thought that there
> > would be very little use in being able to exclude arch/.*?/ files.
> >
> > The reason for that being that I thought the most common usage for this
> > feature would be to ignore folders within subsystems like drivers and
> > tools to ensure code navigation would be less 'messy'.
>
> Yes, the original intent was to make driver code browsing less messy
> but if we are introducing an option we should adapt it for generic
> cases and correct the semantics.

Agreed.

>
> >
> > Additionally, if we go with the global IGNOREDIRS approach you just
> > described, we could have some conflicting options too such as:
> >
> > make ALLSOURCE_ARCHS="x86 arm" IGNOREDIRS=arch/x86 cscope
> >
>
> I don't think this is conflicting, to me it is more complementary.
> Above line shows get all code for x86 and arm but don't get x86 source
> code ("arch/x86/include" is fine). This can even be fine tuned to sub
> directories.
>

That's a fair point. I had not thought about it that way. Thanks!

Will implement the changes when I get home.

> I just now noticed after seeing your command, ALLSOURCE_ARCHS take
> space separated values, whereas, IGNOREDIRS take comma separated
> values. They both should be in the same format, since ALLSOURCE_ARCHS
> was already there, it is better to change IGNOREDIRS.
>
> Can you also change IGNOREDIRS to IGNORE_DIRS? It is much easier to
> read this way. Sorry, I should have said this in the beginning.
>

Yep, no problem! :-)

> > My 2 cents is that ALLSOURCE_ARCHS is already the mechanism for
> > excluding archs so it's 'okay' to keep IGNOREDIRS as is.
> >
> > Let me know your thoughts.
> >
> > Thanks!
> >
> > - Paulo A.
> >
> > > Above command will still index all of the code in arch/x86. Something
> > > like this will be better.
> > >
> > > --- a/scripts/tags.sh
> > > +++ b/scripts/tags.sh
> > > @@ -17,6 +17,13 @@ ignore="$(echo "$RCS_FIND_IGNORE" | sed 's|\\||g' )"
> > > # tags and cscope files should also ignore MODVERSION *.mod.c files
> > > ignore="$ignore ( -name *.mod.c ) -prune -o"
> > >
> > > +if [ -n "${IGNOREDIRS}" ]; then
> > > + exp_ignored_dirs=$(sed 's/,/ /g' <<< ${IGNOREDIRS})
> > > + for i in ${exp_ignored_dirs}; do
> > > + ignore="${ignore} ( -path $i ) -prune -o"
> > > + done
> > > +fi
> > > +
> > > # Use make KBUILD_ABS_SRCTREE=1 {tags|cscope}
> > > # to force full paths for a non-O= build
> > > if [ "${srctree}" = "." -o -z "${srctree}" ]; then
> > > @@ -62,9 +69,9 @@ find_include_sources()
> > > # we could benefit from a list of dirs to search in here
> > > find_other_sources()
> > > {
> > > - find ${tree}* $ignore \
> > > - \( -path ${tree}include -o -path ${tree}arch -o -name
> > > '.tmp_*' \) -prune -o \
> > > - -name "$1" -not -type l -print;
> > > + find ${tree}* ${ignore} \
> > > + \( -path ${tree}include -o -path ${tree}arch -o -name
> > > '.tmp_*' \) -prune -o \
> > > + -name "$1" -not -type l -print;
> > > }
> > >
> > > We will still have to specify arch/x86 and arch/x86/include but this
> > > works and keeps the definition of IGNOREDIRS relatively correct.
> > >
> > >
> > > > + find ${tree}* ${loc_ignore} \
> > > > \( -path ${tree}include -o -path ${tree}arch -o -name '.tmp_*' \) -prune -o \
> > > > -name "$1" -not -type l -print;
> > > > }
> > > > --
> > > > 2.38.1
> > > >

2022-12-13 20:49:13

by Paulo Miguel Almeida

[permalink] [raw]
Subject: [PATCH v3] scripts/tags.sh: choose which directories to exclude from being indexed

It's common for drivers that share same physical components to also
duplicate source code (or at least portions of it). A good example is
both drivers/gpu/drm/amdgpu/* and drivers/gpu/drm/radeon/* have a header
file called atombios.h.

While their contents aren't the same, a lot of their structs have
the exact same names which makes navigating through the code base a bit
messy as cscope will show up 'references' across drivers which aren't
exactly correct.

Add IGNORE_DIRS variable, which specifies which directories
to be ignored from indexing.

Example:
make ARCH=x86 IGNORE_DIRS="drivers/gpu/drm/radeon tools" cscope

Signed-off-by: Paulo Miguel Almeida <[email protected]>
---
Changelog:

- v3: change commit message wording and rst syntax (Req Bagas Sanjaya)
change makefile variable scope to global, use blank space
separator and change variable name to IGNORE_DIRS.
(Req: Vipin Sharma)
- v2: change approach to include everything unless specified by the
IGNOREDIRS variable: (Req: Vipin Sharma)
- v1: https://lore.kernel.org/lkml/[email protected]/
---
Documentation/kbuild/kbuild.rst | 7 +++++++
scripts/tags.sh | 7 +++++++
2 files changed, 14 insertions(+)

diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
index 08f575e6236c..5202186728b4 100644
--- a/Documentation/kbuild/kbuild.rst
+++ b/Documentation/kbuild/kbuild.rst
@@ -278,6 +278,13 @@ To get all available archs you can also specify all. E.g.::

$ make ALLSOURCE_ARCHS=all tags

+IGNORE_DIRS
+-----------
+For tags/TAGS/cscope targets, you can choose which directories won't
+be included in the databases, separated by blank space. E.g.::
+
+ $ make IGNORE_DIRS="drivers/gpu/drm/radeon tools" cscope
+
KBUILD_BUILD_TIMESTAMP
----------------------
Setting this to a date string overrides the timestamp used in the
diff --git a/scripts/tags.sh b/scripts/tags.sh
index e137cf15aae9..1ad45f17179a 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -17,6 +17,13 @@ ignore="$(echo "$RCS_FIND_IGNORE" | sed 's|\\||g' )"
# tags and cscope files should also ignore MODVERSION *.mod.c files
ignore="$ignore ( -name *.mod.c ) -prune -o"

+# ignore arbitrary directories
+if [ -n "${IGNORE_DIRS}" ]; then
+ for i in ${IGNORE_DIRS}; do
+ ignore="${ignore} ( -path $i ) -prune -o"
+ done
+fi
+
# Use make KBUILD_ABS_SRCTREE=1 {tags|cscope}
# to force full paths for a non-O= build
if [ "${srctree}" = "." -o -z "${srctree}" ]; then
--
2.38.1

2022-12-14 18:02:59

by Vipin Sharma

[permalink] [raw]
Subject: Re: [PATCH v3] scripts/tags.sh: choose which directories to exclude from being indexed

On Tue, Dec 13, 2022 at 12:26 PM Paulo Miguel Almeida
<[email protected]> wrote:
>
> It's common for drivers that share same physical components to also
> duplicate source code (or at least portions of it). A good example is
> both drivers/gpu/drm/amdgpu/* and drivers/gpu/drm/radeon/* have a header
> file called atombios.h.
>
> While their contents aren't the same, a lot of their structs have
> the exact same names which makes navigating through the code base a bit
> messy as cscope will show up 'references' across drivers which aren't
> exactly correct.
>
> Add IGNORE_DIRS variable, which specifies which directories
> to be ignored from indexing.
>
> Example:
> make ARCH=x86 IGNORE_DIRS="drivers/gpu/drm/radeon tools" cscope
>
> Signed-off-by: Paulo Miguel Almeida <[email protected]>
> ---
> Changelog:
>
> - v3: change commit message wording and rst syntax (Req Bagas Sanjaya)
> change makefile variable scope to global, use blank space
> separator and change variable name to IGNORE_DIRS.
> (Req: Vipin Sharma)
> - v2: change approach to include everything unless specified by the
> IGNOREDIRS variable: (Req: Vipin Sharma)
> - v1: https://lore.kernel.org/lkml/[email protected]/
> ---
> Documentation/kbuild/kbuild.rst | 7 +++++++
> scripts/tags.sh | 7 +++++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
> index 08f575e6236c..5202186728b4 100644
> --- a/Documentation/kbuild/kbuild.rst
> +++ b/Documentation/kbuild/kbuild.rst
> @@ -278,6 +278,13 @@ To get all available archs you can also specify all. E.g.::
>
> $ make ALLSOURCE_ARCHS=all tags
>
> +IGNORE_DIRS
> +-----------
> +For tags/TAGS/cscope targets, you can choose which directories won't
> +be included in the databases, separated by blank space. E.g.::
> +
> + $ make IGNORE_DIRS="drivers/gpu/drm/radeon tools" cscope
> +
> KBUILD_BUILD_TIMESTAMP
> ----------------------
> Setting this to a date string overrides the timestamp used in the
> diff --git a/scripts/tags.sh b/scripts/tags.sh
> index e137cf15aae9..1ad45f17179a 100755
> --- a/scripts/tags.sh
> +++ b/scripts/tags.sh
> @@ -17,6 +17,13 @@ ignore="$(echo "$RCS_FIND_IGNORE" | sed 's|\\||g' )"
> # tags and cscope files should also ignore MODVERSION *.mod.c files
> ignore="$ignore ( -name *.mod.c ) -prune -o"
>
> +# ignore arbitrary directories
> +if [ -n "${IGNORE_DIRS}" ]; then
> + for i in ${IGNORE_DIRS}; do
> + ignore="${ignore} ( -path $i ) -prune -o"
> + done
> +fi
> +
> # Use make KBUILD_ABS_SRCTREE=1 {tags|cscope}
> # to force full paths for a non-O= build
> if [ "${srctree}" = "." -o -z "${srctree}" ]; then
> --
> 2.38.1
>

Thanks for the patch.

Reviewed-by: Vipin Sharma <[email protected]>