2021-06-16 13:27:49

by Zhen Lei

Subject: [PATCH v2 1/3] scripts: add spelling_sanitizer.sh script

The file scripts/spelling.txt records a large number of spelling
"mistake||correction" pairs. These entries are nominally kept in sorted
order, but the ordering is not strictly enforced. In addition, anyone who
wants to add new pairs must either sort them manually or write a script,
which is clearly a waste of labor. So add this script. For all spelling
"mistake||correction" pairs, sort based on "correction", then on "mistake",
and remove duplicates. Sorting based on "mistake" first is not chosen
because it is uncontrollable.

Signed-off-by: Zhen Lei <[email protected]>
---
scripts/spelling_sanitizer.sh | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
create mode 100755 scripts/spelling_sanitizer.sh

diff --git a/scripts/spelling_sanitizer.sh b/scripts/spelling_sanitizer.sh
new file mode 100755
index 000000000000..603bb7e0e66b
--- /dev/null
+++ b/scripts/spelling_sanitizer.sh
@@ -0,0 +1,27 @@
+#!/bin/sh -efu
+# SPDX-License-Identifier: GPL-2.0
+
+# To get the traditional sort order that uses native byte values
+export LC_ALL=C
+
+cd ${0%/*}
+
+src=spelling.txt
+comments=`sed -n '/#/p' $src`
+
+# Convert the format of 'codespell' to the current
+sed -r -i 's/ ==> /||/' $src
+
+# For all spelling "mistake||correction" pairs(non-comment lines):
+# Sort based on "correction", then "mistake", and remove duplicates
+sed -n '/#/!p' $src | sort -u -t '|' -k 3 -k 1 -o $src
+
+# Backfill comment lines
+ln=0
+echo "$comments" | while read line
+do
+	ln=$((ln + 1))
+	sed -i "$ln i\\$line" $src
+done
+
+cd - > /dev/null
--
2.25.1
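
To make the chosen sort order concrete, here is a minimal sketch of what `sort -u -t '|' -k 3 -k 1` does to a handful of pairs. The sample entries and the `sort_pairs` helper are illustrative assumptions, not taken from spelling.txt or from the patch; only the sort invocation itself matches the script above.

```shell
#!/bin/sh
# Sketch only: shows the ordering produced by the sort command in the patch.
# The sample pairs below are made up for illustration.

# Byte-wise comparison, as in the patch
export LC_ALL=C

sort_pairs() {
	# With '|' as the separator, "mistake||correction" splits into:
	#   field 1 = mistake, field 2 = empty, field 3 = correction
	# -k 3 sorts on the correction, -k 1 breaks ties on the mistake,
	# -u drops lines that compare equal on both keys.
	sort -u -t '|' -k 3 -k 1
}

sorted=$(printf '%s\n' \
	'recieve||receive' \
	'acess||access' \
	'recive||receive' \
	'accesss||access' \
	'acess||access' | sort_pairs)

printf '%s\n' "$sorted"
```

All misspellings of the same word end up adjacent, grouped under their correction, and the repeated `acess||access` line is emitted once.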



2021-06-16 20:23:13

by Jonathan Corbet

Subject: Re: [PATCH v2 1/3] scripts: add spelling_sanitizer.sh script

Zhen Lei <[email protected]> writes:

> The file scripts/spelling.txt records a large number of spelling
> "mistake||correction" pairs. These entries are nominally kept in sorted
> order, but the ordering is not strictly enforced. In addition, anyone who
> wants to add new pairs must either sort them manually or write a script,
> which is clearly a waste of labor. So add this script. For all spelling
> "mistake||correction" pairs, sort based on "correction", then on "mistake",
> and remove duplicates. Sorting based on "mistake" first is not chosen
> because it is uncontrollable.
>
> Signed-off-by: Zhen Lei <[email protected]>
> ---
> scripts/spelling_sanitizer.sh | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
> create mode 100755 scripts/spelling_sanitizer.sh
>
> diff --git a/scripts/spelling_sanitizer.sh b/scripts/spelling_sanitizer.sh
> new file mode 100755
> index 000000000000..603bb7e0e66b
> --- /dev/null
> +++ b/scripts/spelling_sanitizer.sh
> @@ -0,0 +1,27 @@
> +#!/bin/sh -efu
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# To get the traditional sort order that uses native byte values

So I am of the naive opinion that everything we drop into scripts/
should start with a comment saying why it exists and how to use it.
Otherwise how are people going to benefit from it?

Thanks,

jon

2021-06-17 02:47:35

by Zhen Lei

Subject: Re: [PATCH v2 1/3] scripts: add spelling_sanitizer.sh script



On 2021/6/16 22:53, Jonathan Corbet wrote:
> Zhen Lei <[email protected]> writes:
>
>> The file scripts/spelling.txt records a large number of spelling
>> "mistake||correction" pairs. These entries are nominally kept in sorted
>> order, but the ordering is not strictly enforced. In addition, anyone who
>> wants to add new pairs must either sort them manually or write a script,
>> which is clearly a waste of labor. So add this script. For all spelling
>> "mistake||correction" pairs, sort based on "correction", then on "mistake",
>> and remove duplicates. Sorting based on "mistake" first is not chosen
>> because it is uncontrollable.
>>
>> Signed-off-by: Zhen Lei <[email protected]>
>> ---
>> scripts/spelling_sanitizer.sh | 27 +++++++++++++++++++++++++++
>> 1 file changed, 27 insertions(+)
>> create mode 100755 scripts/spelling_sanitizer.sh
>>
>> diff --git a/scripts/spelling_sanitizer.sh b/scripts/spelling_sanitizer.sh
>> new file mode 100755
>> index 000000000000..603bb7e0e66b
>> --- /dev/null
>> +++ b/scripts/spelling_sanitizer.sh
>> @@ -0,0 +1,27 @@
>> +#!/bin/sh -efu
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +# To get the traditional sort order that uses native byte values
>
> So I am of the naive opinion that everything we drop into scripts/
> should start with a comment saying why it exists and how to use it.
> Otherwise how are people going to benefit from it?

Right, I will add the description, thanks.

>
> Thanks,
>
> jon
>
> .
>

2021-06-17 07:36:20

by Petr Mladek

Subject: Re: [PATCH v2 1/3] scripts: add spelling_sanitizer.sh script

On Thu 2021-06-17 09:11:05, Leizhen (ThunderTown) wrote:
>
>
> On 2021/6/16 22:53, Jonathan Corbet wrote:
> > Zhen Lei <[email protected]> writes:
> >
> >> The file scripts/spelling.txt records a large number of spelling
> >> "mistake||correction" pairs. These entries are nominally kept in sorted
> >> order, but the ordering is not strictly enforced. In addition, anyone who
> >> wants to add new pairs must either sort them manually or write a script,
> >> which is clearly a waste of labor. So add this script. For all spelling
> >> "mistake||correction" pairs, sort based on "correction", then on "mistake",
> >> and remove duplicates. Sorting based on "mistake" first is not chosen
> >> because it is uncontrollable.
> >>
> >> Signed-off-by: Zhen Lei <[email protected]>
> >> ---
> >> scripts/spelling_sanitizer.sh | 27 +++++++++++++++++++++++++++
> >> 1 file changed, 27 insertions(+)
> >> create mode 100755 scripts/spelling_sanitizer.sh
> >>
> >> diff --git a/scripts/spelling_sanitizer.sh b/scripts/spelling_sanitizer.sh
> >> new file mode 100755
> >> index 000000000000..603bb7e0e66b
> >> --- /dev/null
> >> +++ b/scripts/spelling_sanitizer.sh
> >> @@ -0,0 +1,27 @@
> >> +#!/bin/sh -efu
> >> +# SPDX-License-Identifier: GPL-2.0
> >> +
> >> +# To get the traditional sort order that uses native byte values
> >
> > So I am of the naive opinion that everything we drop into scripts/
> > should start with a comment saying why it exists and how to use it.
> > Otherwise how are people going to benefit from it?
>
> Right, I will add the description, thanks.

Ideally, please also add a -h/--help option that prints a short
description and usage.

Best Regards,
Petr
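
For illustration, the two review requests in this thread (a header comment explaining why the script exists, and a -h/--help option) could be combined roughly as follows. This is a hedged sketch: the comment wording, the `usage` function name, and the help text are hypothetical and are not taken from any later version of the patch.

```shell
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
#
# spelling_sanitizer.sh: keep scripts/spelling.txt sorted and duplicate-free.
#
# Sorts all "mistake||correction" pairs by correction, then by mistake,
# removes duplicates, and keeps the comment lines at the top of the file.
#
# Usage: scripts/spelling_sanitizer.sh [-h|--help]
#
# (Hypothetical header and help text, written for this example only.)

usage() {
	cat <<EOF
Usage: ${0##*/} [-h|--help]

Sort the "mistake||correction" pairs in scripts/spelling.txt by correction,
then by mistake, and remove duplicates. Comment lines are kept.
EOF
}

# "${1:-}" expands to an empty string when no argument is given, so this
# check also works when the shell runs with -u (nounset).
case "${1:-}" in
-h|--help)
	usage
	exit 0
	;;
esac

# The sorting logic from the patch would follow here.
```

Handling the option before any other work means that asking for help never touches spelling.txt.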

2021-06-18 03:44:22

by Zhen Lei

Subject: Re: [PATCH v2 1/3] scripts: add spelling_sanitizer.sh script



On 2021/6/17 15:32, Petr Mladek wrote:
> On Thu 2021-06-17 09:11:05, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2021/6/16 22:53, Jonathan Corbet wrote:
>>> Zhen Lei <[email protected]> writes:
>>>
>>>> The file scripts/spelling.txt records a large number of spelling
>>>> "mistake||correction" pairs. These entries are nominally kept in sorted
>>>> order, but the ordering is not strictly enforced. In addition, anyone who
>>>> wants to add new pairs must either sort them manually or write a script,
>>>> which is clearly a waste of labor. So add this script. For all spelling
>>>> "mistake||correction" pairs, sort based on "correction", then on "mistake",
>>>> and remove duplicates. Sorting based on "mistake" first is not chosen
>>>> because it is uncontrollable.
>>>>
>>>> Signed-off-by: Zhen Lei <[email protected]>
>>>> ---
>>>> scripts/spelling_sanitizer.sh | 27 +++++++++++++++++++++++++++
>>>> 1 file changed, 27 insertions(+)
>>>> create mode 100755 scripts/spelling_sanitizer.sh
>>>>
>>>> diff --git a/scripts/spelling_sanitizer.sh b/scripts/spelling_sanitizer.sh
>>>> new file mode 100755
>>>> index 000000000000..603bb7e0e66b
>>>> --- /dev/null
>>>> +++ b/scripts/spelling_sanitizer.sh
>>>> @@ -0,0 +1,27 @@
>>>> +#!/bin/sh -efu
>>>> +# SPDX-License-Identifier: GPL-2.0
>>>> +
>>>> +# To get the traditional sort order that uses native byte values
>>>
>>> So I am of the naive opinion that everything we drop into scripts/
>>> should start with a comment saying why it exists and how to use it.
>>> Otherwise how are people going to benefit from it?
>>
>> Right, I will add the description, thanks.
>
> Ideally, please also add a -h/--help option that prints a short
> description and usage.

OK, I will add it.

>
> Best Regards,
> Petr
>
> .
>