2021-03-09 11:23:46

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH] [RFC] dt-bindings: Consider DT_SCHEMA_FILES when finding all json-schema

Setting DT_SCHEMA_FILES allows the user to restrict the
"dt_binding_check" make target to a specified set of DT binding files.
However, yamllint is still run on all available files, which can take
quite some time.

Fix this by changing "find_cmd" to only return the specified files.
Note that this also affects the "cmd_chk_bindings" and "cmd_mk_schema"
rules.

This reduces the execution time of

make dt_binding_check DT_SCHEMA_FILES=/path/to/json/schema/file

from ca. 22 to less than 2 seconds on an i7-8700K.

Signed-off-by: Geert Uytterhoeven <[email protected]>
---
Should this be restricted to cmd_yamllint?
I'm not sure which users of find_cmd do and do not need all files.
---
Documentation/devicetree/bindings/Makefile | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/Makefile b/Documentation/devicetree/bindings/Makefile
index 780e5618ec0ae2fc..60ac03bade2da0ad 100644
--- a/Documentation/devicetree/bindings/Makefile
+++ b/Documentation/devicetree/bindings/Makefile
@@ -22,10 +22,18 @@ $(obj)/%.example.dts: $(src)/%.yaml check_dtschema_version FORCE
# Use full schemas when checking %.example.dts
DT_TMP_SCHEMA := $(obj)/processed-schema-examples.json

+ifeq ($(DT_SCHEMA_FILES),)
+
find_cmd = find $(srctree)/$(src) \( -name '*.yaml' ! \
-name 'processed-schema*' ! \
-name '*.example.dt.yaml' \)

+else
+
+find_cmd = echo $(addprefix $(srctree)/, $(DT_SCHEMA_FILES))
+
+endif
+
quiet_cmd_yamllint = LINT $(src)
cmd_yamllint = ($(find_cmd) | \
xargs $(DT_SCHEMA_LINT) -f parsable -c $(srctree)/$(src)/.yamllint) || true
--
2.25.1


2021-03-09 17:44:24

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH] [RFC] dt-bindings: Consider DT_SCHEMA_FILES when finding all json-schema

On Tue, Mar 9, 2021 at 4:21 AM Geert Uytterhoeven
<[email protected]> wrote:
>
> Setting DT_SCHEMA_FILES allows the user to restrict the
> "dt_binding_check" make target to a specified set of DT binding files.
> However, yamllint is still run on all available files, which can take
> quite some time.
>
> Fix this by changing "find_cmd" to only return the specified files.
> Note that this also affects the "cmd_chk_bindings" and "cmd_mk_schema"
> rules.
>
> This reduces the execution time of
>
> make dt_binding_check DT_SCHEMA_FILES=/path/to/json/schema/file
>
> from ca. 22 to less than 2 seconds on an i7-8700K.

We could use xargs sharding like 'chk_bindings' does. That goes from
18s to 5s for me (i7-7700HQ). Good enough? Not sure why I didn't other
than thinking 20sec was fast enough.

Another option would be doing yamllint as part of cmd_extract_ex or we
could have a command line variable to disable yamllint.

> Signed-off-by: Geert Uytterhoeven <[email protected]>
> ---
> Should this be restricted to cmd_yamllint?
> I'm not sure which users of find_cmd do and do not need all files.

cmd_chk_bindings always does. cmd_mk_schema needs both. So I think
this doesn't work for all cases.

> ---
> Documentation/devicetree/bindings/Makefile | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/Makefile b/Documentation/devicetree/bindings/Makefile
> index 780e5618ec0ae2fc..60ac03bade2da0ad 100644
> --- a/Documentation/devicetree/bindings/Makefile
> +++ b/Documentation/devicetree/bindings/Makefile
> @@ -22,10 +22,18 @@ $(obj)/%.example.dts: $(src)/%.yaml check_dtschema_version FORCE
> # Use full schemas when checking %.example.dts
> DT_TMP_SCHEMA := $(obj)/processed-schema-examples.json
>
> +ifeq ($(DT_SCHEMA_FILES),)
> +
> find_cmd = find $(srctree)/$(src) \( -name '*.yaml' ! \
> -name 'processed-schema*' ! \
> -name '*.example.dt.yaml' \)
>
> +else
> +
> +find_cmd = echo $(addprefix $(srctree)/, $(DT_SCHEMA_FILES))
> +
> +endif
> +
> quiet_cmd_yamllint = LINT $(src)
> cmd_yamllint = ($(find_cmd) | \
> xargs $(DT_SCHEMA_LINT) -f parsable -c $(srctree)/$(src)/.yamllint) || true
> --
> 2.25.1
>

2021-03-10 08:46:02

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] [RFC] dt-bindings: Consider DT_SCHEMA_FILES when finding all json-schema

Hi Rob,

On Tue, Mar 9, 2021 at 6:42 PM Rob Herring <[email protected]> wrote:
> On Tue, Mar 9, 2021 at 4:21 AM Geert Uytterhoeven
> <[email protected]> wrote:
> > Setting DT_SCHEMA_FILES allows the user to restrict the
> > "dt_binding_check" make target to a specified set of DT binding files.
> > However, yamllint is still run on all available files, which can take
> > quite some time.
> >
> > Fix this by changing "find_cmd" to only return the specified files.
> > Note that this also affects the "cmd_chk_bindings" and "cmd_mk_schema"
> > rules.
> >
> > This reduces the execution time of
> >
> > make dt_binding_check DT_SCHEMA_FILES=/path/to/json/schema/file
> >
> > from ca. 22 to less than 2 seconds on an i7-8700K.
>
> We could use xargs sharding like 'chk_bindings' does. That goes from
> 18s to 5s for me (i7-7700HQ). Good enough? Not sure why I didn't other
> than thinking 20sec was fast enough.

Sounds better, but yamllint (on all files) would still take 80% of the
time for a single binding check, assuming a reasonably fast multi-core
machine. My main objection is that while working on a new binding, and
using DT_SCHEMA_FILES, I don't want to waste time on checking other
bindings, and being bothered with warnings about them.

> > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > ---
> > Should this be restricted to cmd_yamllint?
> > I'm not sure which users of find_cmd do and do not need all files.
>
> cmd_chk_bindings always does. cmd_mk_schema needs both. So I think
> this doesn't work for all cases.

Thanks, will respin to restrict to yamllint.

> > --- a/Documentation/devicetree/bindings/Makefile
> > +++ b/Documentation/devicetree/bindings/Makefile
> > @@ -22,10 +22,18 @@ $(obj)/%.example.dts: $(src)/%.yaml check_dtschema_version FORCE
> > # Use full schemas when checking %.example.dts
> > DT_TMP_SCHEMA := $(obj)/processed-schema-examples.json
> >
> > +ifeq ($(DT_SCHEMA_FILES),)
> > +
> > find_cmd = find $(srctree)/$(src) \( -name '*.yaml' ! \
> > -name 'processed-schema*' ! \
> > -name '*.example.dt.yaml' \)
> >
> > +else
> > +
> > +find_cmd = echo $(addprefix $(srctree)/, $(DT_SCHEMA_FILES))
> > +
> > +endif
> > +
> > quiet_cmd_yamllint = LINT $(src)
> > cmd_yamllint = ($(find_cmd) | \
> > xargs $(DT_SCHEMA_LINT) -f parsable -c $(srctree)/$(src)/.yamllint) || true

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2021-03-11 21:17:55

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH] [RFC] dt-bindings: Consider DT_SCHEMA_FILES when finding all json-schema

On Wed, Mar 10, 2021 at 1:43 AM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Rob,
>
> On Tue, Mar 9, 2021 at 6:42 PM Rob Herring <[email protected]> wrote:
> > On Tue, Mar 9, 2021 at 4:21 AM Geert Uytterhoeven
> > <[email protected]> wrote:
> > > Setting DT_SCHEMA_FILES allows the user to restrict the
> > > "dt_binding_check" make target to a specified set of DT binding files.
> > > However, yamllint is still run on all available files, which can take
> > > quite some time.
> > >
> > > Fix this by changing "find_cmd" to only return the specified files.
> > > Note that this also affects the "cmd_chk_bindings" and "cmd_mk_schema"
> > > rules.
> > >
> > > This reduces the execution time of
> > >
> > > make dt_binding_check DT_SCHEMA_FILES=/path/to/json/schema/file
> > >
> > > from ca. 22 to less than 2 seconds on an i7-8700K.
> >
> > We could use xargs sharding like 'chk_bindings' does. That goes from
> > 18s to 5s for me (i7-7700HQ). Good enough? Not sure why I didn't other
> > than thinking 20sec was fast enough.
>
> Sounds better, but yamllint (on all files) would still take 80% of the
> time for a single binding check, assuming a reasonably fast multi-core
> machine. My main objection is that while working on a new binding, and
> using DT_SCHEMA_FILES, I don't want to waste time on checking other
> bindings, and being bothered with warnings about them.

We could move yamllint to when we extract the example. Then it only
runs on modified schema files. The problem with that is we have to
watch out for slow python start-up times. I get about 25sec for the
whole tree 1 by 1 with a quick test:

$ time find Documentation/devicetree/bindings/ -name '*.yaml' | xargs
-P8 -n1 yamllint -f parsable -c
Documentation/devicetree/bindings/.yamllint

real 0m24.939s
user 2m55.846s
sys 0m12.738s

Of course, that's a parallel build compared to 18sec single threaded.
I'm seeing about 8 min for full dt_binding_check without yamllint.

Here's what the kbuild changes would look like(whitespace damaged):

diff --git a/Documentation/devicetree/bindings/Makefile
b/Documentation/devicetree/bindings/Makefile
index 780e5618ec0a..3a213343d587 100644
--- a/Documentation/devicetree/bindings/Makefile
+++ b/Documentation/devicetree/bindings/Makefile
@@ -16,8 +16,13 @@ check_dtschema_version:
quiet_cmd_extract_ex = DTEX $@
cmd_extract_ex = $(DT_EXTRACT_EX) $< > $@

+define rule_extract_ex
+ $(if $(DT_SCHEMA_LINT),$(call cmd,yamllint),)
+ $(call cmd,extract_ex)
+endef
+
$(obj)/%.example.dts: $(src)/%.yaml check_dtschema_version FORCE
- $(call if_changed,extract_ex)
+ $(call if_changed_rule,extract_ex)

# Use full schemas when checking %.example.dts
DT_TMP_SCHEMA := $(obj)/processed-schema-examples.json
@@ -26,9 +31,8 @@ find_cmd = find $(srctree)/$(src) \( -name '*.yaml' ! \
-name 'processed-schema*' ! \
-name '*.example.dt.yaml' \)

-quiet_cmd_yamllint = LINT $(src)
- cmd_yamllint = ($(find_cmd) | \
- xargs $(DT_SCHEMA_LINT) -f parsable -c
$(srctree)/$(src)/.yamllint) || true
+quiet_cmd_yamllint = LINT $(patsubst $(srctree)/%,%,$<)
+ cmd_yamllint = $(DT_SCHEMA_LINT) -f parsable -c
$(srctree)/$(src)/.yamllint $(real-prereqs) || true

quiet_cmd_chk_bindings = CHKDT $@
cmd_chk_bindings = ($(find_cmd) | \
@@ -43,7 +47,6 @@ quiet_cmd_mk_schema = SCHEMA $@
rm -f $$f

define rule_chkdt
- $(if $(DT_SCHEMA_LINT),$(call cmd,yamllint),)
$(call cmd,chk_bindings)
$(call cmd,mk_schema)
endef