2020-07-26 00:38:48

by Randy Dunlap

Subject: [PATCH 0/9] powerpc: delete duplicated words

Drop duplicated words in arch/powerpc/ header files.

Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]

arch/powerpc/include/asm/book3s/64/mmu-hash.h | 2 +-
arch/powerpc/include/asm/book3s/64/radix-4k.h | 2 +-
arch/powerpc/include/asm/cputime.h | 2 +-
arch/powerpc/include/asm/epapr_hcalls.h | 4 ++--
arch/powerpc/include/asm/hw_breakpoint.h | 2 +-
arch/powerpc/include/asm/ppc_asm.h | 2 +-
arch/powerpc/include/asm/reg.h | 2 +-
arch/powerpc/include/asm/smu.h | 2 +-
arch/powerpc/platforms/powernv/pci.h | 2 +-
9 files changed, 10 insertions(+), 10 deletions(-)


2020-07-26 00:38:57

by Randy Dunlap

Subject: [PATCH 3/9] powerpc: cputime.h: delete duplicated word

Drop the repeated word "use".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/cputime.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/cputime.h
+++ linux-next-20200720/arch/powerpc/include/asm/cputime.h
@@ -67,7 +67,7 @@ static inline void arch_vtime_task_switc

/*
* account_cpu_user_entry/exit runs "unreconciled", so can't trace,
- * can't use use get_paca()
+ * can't use get_paca()
*/
static notrace inline void account_cpu_user_entry(void)
{

2020-07-26 00:39:03

by Randy Dunlap

Subject: [PATCH 2/9] powerpc: book3s: radix-4k.h: delete duplicated word

Drop the repeated word "per".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/book3s/64/radix-4k.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/book3s/64/radix-4k.h
+++ linux-next-20200720/arch/powerpc/include/asm/book3s/64/radix-4k.h
@@ -11,7 +11,7 @@
#define RADIX_PGD_INDEX_SIZE 13 // size: 8B << 13 = 64KB, maps 2^13 x 512GB = 4PB

/*
- * One fragment per per page
+ * One fragment per page
*/
#define RADIX_PTE_FRAG_SIZE_SHIFT (RADIX_PTE_INDEX_SIZE + 3)
#define RADIX_PTE_FRAG_NR (PAGE_SIZE >> RADIX_PTE_FRAG_SIZE_SHIFT)

2020-07-26 00:39:04

by Randy Dunlap

Subject: [PATCH 1/9] powerpc: book3s: mmu-hash.h: delete duplicated word

Drop the repeated word "below".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/book3s/64/mmu-hash.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ linux-next-20200720/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -793,7 +793,7 @@ static inline unsigned long get_vsid(uns
}

/*
- * For kernel space, we use context ids as below
+ * For kernel space, we use context ids as
* below. Range is 512TB per context.
*
* 0x00001 - [ 0xc000000000000000 - 0xc001ffffffffffff]

2020-07-26 00:39:07

by Randy Dunlap

Subject: [PATCH 5/9] powerpc: hw_breakpoint.h: delete duplicated word

Drop the repeated word "the".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/hw_breakpoint.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/hw_breakpoint.h
+++ linux-next-20200720/arch/powerpc/include/asm/hw_breakpoint.h
@@ -17,7 +17,7 @@ struct arch_hw_breakpoint {
u16 hw_len; /* length programmed in hw */
};

-/* Note: Don't change the the first 6 bits below as they are in the same order
+/* Note: Don't change the first 6 bits below as they are in the same order
* as the dabr and dabrx.
*/
#define HW_BRK_TYPE_READ 0x01

2020-07-26 00:39:55

by Randy Dunlap

Subject: [PATCH 8/9] powerpc: smu.h: delete duplicated word

Drop the repeated word "the".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/smu.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/smu.h
+++ linux-next-20200720/arch/powerpc/include/asm/smu.h
@@ -108,7 +108,7 @@
/*
* i2c commands
*
- * To issue an i2c command, first is to send a parameter block to the
+ * To issue an i2c command, first is to send a parameter block to
* the SMU. This is a command of type 0x9a with 9 bytes of header
* eventually followed by data for a write:
*

2020-07-26 00:41:04

by Randy Dunlap

Subject: [PATCH 6/9] powerpc: ppc_asm.h: delete duplicated word

Drop the repeated word "in".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/ppc_asm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/ppc_asm.h
+++ linux-next-20200720/arch/powerpc/include/asm/ppc_asm.h
@@ -774,7 +774,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_B
#define FIXUP_ENDIAN
#else
/*
- * This version may be used in in HV or non-HV context.
+ * This version may be used in HV or non-HV context.
* MSR[EE] must be disabled.
*/
#define FIXUP_ENDIAN \

2020-07-26 00:41:48

by Randy Dunlap

Subject: [PATCH 4/9] powerpc: epapr_hcalls.h: delete duplicated words

Drop the repeated words "file" and "the".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/epapr_hcalls.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/epapr_hcalls.h
+++ linux-next-20200720/arch/powerpc/include/asm/epapr_hcalls.h
@@ -37,7 +37,7 @@
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

-/* A "hypercall" is an "sc 1" instruction. This header file file provides C
+/* A "hypercall" is an "sc 1" instruction. This header file provides C
* wrapper functions for the ePAPR hypervisor interface. It is inteded
* for use by Linux device drivers and other operating systems.
*
@@ -246,7 +246,7 @@ static inline unsigned int ev_int_get_ma
* ev_int_eoi - signal the end of interrupt processing
* @interrupt: the interrupt number
*
- * This function signals the end of processing for the the specified
+ * This function signals the end of processing for the specified
* interrupt, which must be the interrupt currently in service. By
* definition, this is also the highest-priority interrupt.
*

2020-07-26 00:41:53

by Randy Dunlap

Subject: [PATCH 9/9] powerpc: powernv: pci.h: delete duplicated word

Drop the repeated word "for".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/platforms/powernv/pci.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/platforms/powernv/pci.h
+++ linux-next-20200720/arch/powerpc/platforms/powernv/pci.h
@@ -87,7 +87,7 @@ struct pnv_ioda_pe {
bool tce_bypass_enabled;
uint64_t tce_bypass_base;

- /* MSIs. MVE index is identical for for 32 and 64 bit MSI
+ /* MSIs. MVE index is identical for 32 and 64 bit MSI
* and -1 if not supported. (It's actually identical to the
* PE number)
*/

2020-07-26 00:42:09

by Randy Dunlap

Subject: [PATCH 7/9] powerpc: reg.h: delete duplicated word

Drop the repeated word "a".

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/reg.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200720.orig/arch/powerpc/include/asm/reg.h
+++ linux-next-20200720/arch/powerpc/include/asm/reg.h
@@ -1472,7 +1472,7 @@ static inline void update_power8_hid0(un
{
/*
* The HID0 update on Power8 should at the very least be
- * preceded by a a SYNC instruction followed by an ISYNC
+ * preceded by a SYNC instruction followed by an ISYNC
* instruction
*/
asm volatile("sync; mtspr %0,%1; isync":: "i"(SPRN_HID0), "r"(hid0));

2020-07-26 14:29:41

by Christophe Leroy

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

Randy Dunlap <[email protected]> wrote:

> Drop duplicated words in arch/powerpc/ header files.

How did you detect them? Do you have some script for that, or did you
just read all the comments?

>
> Cc: Michael Ellerman <[email protected]>

You say you Cc Michael, but in fact you don't ... although he is the
powerpc maintainer.

Christophe

> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: [email protected]
>
> arch/powerpc/include/asm/book3s/64/mmu-hash.h | 2 +-
> arch/powerpc/include/asm/book3s/64/radix-4k.h | 2 +-
> arch/powerpc/include/asm/cputime.h | 2 +-
> arch/powerpc/include/asm/epapr_hcalls.h | 4 ++--
> arch/powerpc/include/asm/hw_breakpoint.h | 2 +-
> arch/powerpc/include/asm/ppc_asm.h | 2 +-
> arch/powerpc/include/asm/reg.h | 2 +-
> arch/powerpc/include/asm/smu.h | 2 +-
> arch/powerpc/platforms/powernv/pci.h | 2 +-
> 9 files changed, 10 insertions(+), 10 deletions(-)


2020-07-26 17:25:47

by Randy Dunlap

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

On 7/26/20 7:29 AM, Christophe Leroy wrote:
> Randy Dunlap <[email protected]> wrote:
>
>> Drop duplicated words in arch/powerpc/ header files.
>
> How did you detect them? Do you have some script for that, or did you just read all the comments?

Yes, it's a script that finds lots of false positives, so I have to check
each and every one of them for validity.

>> Cc: Michael Ellerman <[email protected]>
>
> You say you Cc Michael, but in fact you don't ... although he is the powerpc maintainer.

Thanks for noticing that.
[time passes]
I checked all of my emails for this patch series and they say that Mike was Cc:ed
on all of them.

I am adding his email address back to this one.
Mike, did you receive this patch series?


> Christophe
>
>> Cc: Benjamin Herrenschmidt <[email protected]>
>> Cc: Paul Mackerras <[email protected]>
>> Cc: [email protected]
>>
>>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |    2 +-
>>  arch/powerpc/include/asm/book3s/64/radix-4k.h |    2 +-
>>  arch/powerpc/include/asm/cputime.h            |    2 +-
>>  arch/powerpc/include/asm/epapr_hcalls.h       |    4 ++--
>>  arch/powerpc/include/asm/hw_breakpoint.h      |    2 +-
>>  arch/powerpc/include/asm/ppc_asm.h            |    2 +-
>>  arch/powerpc/include/asm/reg.h                |    2 +-
>>  arch/powerpc/include/asm/smu.h                |    2 +-
>>  arch/powerpc/platforms/powernv/pci.h          |    2 +-
>>  9 files changed, 10 insertions(+), 10 deletions(-)


thanks.
--
~Randy

2020-07-26 17:50:07

by Joe Perches

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

On Sun, 2020-07-26 at 10:23 -0700, Randy Dunlap wrote:
> On 7/26/20 7:29 AM, Christophe Leroy wrote:
> > Randy Dunlap <[email protected]> wrote:
> >
> > > Drop duplicated words in arch/powerpc/ header files.
> >
> > How did you detect them? Do you have some script for that, or did you just read all the comments?
>
> Yes, it's a script that finds lots of false positives, so I have to check
> each and every one of them for validity.

And it's a lot of work too. (thanks Randy)

It could be something like:

$ grep-2.5.4 -nrP --include=*.[ch] '\b([A-Z]?[a-z]{2,}\b)[ \t]*(?:\n[ \t]*\*[ \t]*|)\1\b' * | \
grep -vP '\b(?:struct|enum|union)\s+([A-Z]?[a-z]{2,})\s+\*?\s*\1\b' | \
grep -vP '\blong\s+long\b' | \
grep -vP '\b([A-Z]?[a-z]{2,})(?:\t+| {2,})\1\b'
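
For readers without an old grep at hand, the same idea can be sketched in
Python. This is only an illustration of the pipeline above, not a script
anyone in the thread used: the function name find_repeated_words and the
constant names are made up here, and the three skips mirror the three
"grep -v" stages (struct/enum/union names, "long long", and column-aligned
repeats).

```python
import re

# Word shape borrowed from the pipeline above:
# an optional capital letter, then two or more lowercase letters.
WORD = r"[A-Z]?[a-z]{2,}"

# A word, then either same-line blanks or a line break followed by a
# comment continuation (" * "), then the same word again.
REPEAT = re.compile(r"\b(" + WORD + r")\b([ \t]*(?:\n[ \t]*\*[ \t]*)?)\1\b")

# Text that ends in "struct ", "enum " or "union " (a type name follows).
KEYWORD_BEFORE = re.compile(r"\b(?:struct|enum|union)\s+$")

# A gap made only of tabs, or of two or more spaces (column alignment).
ALIGNED_GAP = re.compile(r"\t+| {2,}")


def find_repeated_words(text):
    """Return (line_number, word) for each suspicious repeated word."""
    hits = []
    for m in REPEAT.finditer(text):
        word, gap = m.group(1), m.group(2)
        if word == "long":
            continue  # "long long" is a valid C type
        if KEYWORD_BEFORE.search(text[:m.start()]):
            continue  # "struct kref kref" is a valid declaration
        if ALIGNED_GAP.fullmatch(gap):
            continue  # tab- or space-aligned columns, not prose
        line = text.count("\n", 0, m.start()) + 1
        hits.append((line, word))
    return hits
```

Because the whole file is searched as one string, the pattern can cross a
line break without needing a grep that supports multi-line matches. Every
hit still needs a human check.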


2020-07-26 19:11:31

by Randy Dunlap

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

On 7/26/20 10:49 AM, Joe Perches wrote:
> On Sun, 2020-07-26 at 10:23 -0700, Randy Dunlap wrote:
>> On 7/26/20 7:29 AM, Christophe Leroy wrote:
>>> Randy Dunlap <[email protected]> wrote:
>>>
>>>> Drop duplicated words in arch/powerpc/ header files.
>>>
>>> How did you detect them? Do you have some script for that, or did you just read all the comments?
>>
>> Yes, it's a script that finds lots of false positives, so I have to check
>> each and every one of them for validity.
>
> And it's a lot of work too. (thanks Randy)
>
> It could be something like:
>
> $ grep-2.5.4 -nrP --include=*.[ch] '\b([A-Z]?[a-z]{2,}\b)[ \t]*(?:\n[ \t]*\*[ \t]*|)\1\b' * | \
> grep -vP '\b(?:struct|enum|union)\s+([A-Z]?[a-z]{2,})\s+\*?\s*\1\b' | \
> grep -vP '\blong\s+long\b' | \
> grep -vP '\b([A-Z]?[a-z]{2,})(?:\t+| {2,})\1\b'

Hi Joe,

(what is grep-2.5.4 ?)

It looks like you tried a few iterations of this -- since it drops things
like "long long". There are lots of data types that are repeated & valid.
And many struct names, like "struct kref kref", "struct completion completion",
and "struct mutex mutex". I handle (ignore) those manually, although that
could be added to the Perl script.

v0.1 of this script also found lots of repeated numbers and strings of
special characters (ASCII art etc.), so now it ignores duplicated numbers
or special characters -- since it is really looking for duplicate words.

Anyway, I might as well attach it. It's no big deal.
And if someone else wants to tackle using it, go for it.
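
The filtering described above can be sketched as follows. The attachment
itself is not reproduced in this thread, so this is not the Perl script:
it is a hypothetical Python illustration, and the names repeated_words,
VALID_REPEATS and TYPE_KEYWORDS are assumptions made for the example.

```python
import re

# Repeats that are normally legitimate in C source (assumed allowlist).
VALID_REPEATS = {"long"}

# Keywords after which "name name" is a declaration, not a typo.
TYPE_KEYWORDS = {"struct", "union", "enum"}

# Only purely alphabetic tokens count as words.
ALPHA = re.compile(r"[A-Za-z]+")


def repeated_words(line):
    """Return the words that appear twice in a row on one line."""
    tokens = line.split()
    found = []
    for i in range(1, len(tokens)):
        prev, cur = tokens[i - 1], tokens[i]
        if prev != cur:
            continue
        if not ALPHA.fullmatch(cur):
            continue  # numbers, "----", "****" and other ASCII art
        if cur in VALID_REPEATS:
            continue  # e.g. "long long"
        if i >= 2 and tokens[i - 2] in TYPE_KEYWORDS:
            continue  # e.g. "struct mutex mutex"
        found.append(cur)
    return found
```

Splitting on whitespace keeps the sketch short, but a word followed by
punctuation ("the the.") is missed, so a real tool would need a more
careful tokenizer.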

--
~Randy


Attachments:
find_dup_words.pl (3.02 kB)

2020-07-26 20:48:55

by Joe Perches

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

On 2020-07-26 12:08, Randy Dunlap wrote:
> On 7/26/20 10:49 AM, Joe Perches wrote:
>> On Sun, 2020-07-26 at 10:23 -0700, Randy Dunlap wrote:
>>> On 7/26/20 7:29 AM, Christophe Leroy wrote:
>>>> Randy Dunlap <[email protected]> wrote:
>>>>
>>>>> Drop duplicated words in arch/powerpc/ header files.
>>>>
>>>> How did you detect them? Do you have some script for that, or did you
>>>> just read all the comments?
>>>
>>> Yes, it's a script that finds lots of false positives, so I have to check
>>> each and every one of them for validity.
>>
>> And it's a lot of work too. (thanks Randy)
>>
>> It could be something like:
>>
>> $ grep-2.5.4 -nrP --include=*.[ch] '\b([A-Z]?[a-z]{2,}\b)[ \t]*(?:\n[ \t]*\*[ \t]*|)\1\b' * | \
>> grep -vP '\b(?:struct|enum|union)\s+([A-Z]?[a-z]{2,})\s+\*?\s*\1\b' | \
>> grep -vP '\blong\s+long\b' | \
>> grep -vP '\b([A-Z]?[a-z]{2,})(?:\t+| {2,})\1\b'
>
> Hi Joe,

Hi Randy

> (what is grep-2.5.4 ?)

It's the last version of grep that allowed a match to span multiple lines.

That's needed to find a repeat where the second word sits on a comment
continuation line that starts with *.

> It looks like you tried a few iterations of this -- since it drops things
> like "long long". There are lots of data types that are repeated & valid.
> And many struct names, like "struct kref kref", "struct completion completion",
> and "struct mutex mutex". I handle (ignore) those manually

That's what the first exclude pattern (the struct|enum|union one) is for.

2020-07-26 23:56:14

by Michael Ellerman

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

Randy Dunlap <[email protected]> writes:
> On 7/26/20 7:29 AM, Christophe Leroy wrote:
>> Randy Dunlap <[email protected]> wrote:
>>
>>> Drop duplicated words in arch/powerpc/ header files.
>>
>> How did you detect them? Do you have some script for that, or did you just read all the comments?
>
> Yes, it's a script that finds lots of false positives, so I have to check
> each and every one of them for validity.
>
>>> Cc: Michael Ellerman <[email protected]>
>>
>> You say you Cc Michael, but in fact you don't ... although he is the powerpc maintainer.
>
> Thanks for noticing that.
> [time passes]
> I checked all of my emails for this patch series and they say that Mike was Cc:ed
> on all of them.
>
> I am adding his email address back to this one.
> Mike, did you receive this patch series?

Yes.

There's a mailman option which drops me from being explicitly on Cc,
because I'm subscribed to the list. Otherwise I get two copies of
everything.

So as long as it goes to linuxppc-dev I should see it, regardless of
whether I'm explicitly listed in Cc.

cheers

2020-07-27 06:54:50

by Joe Perches

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

On Sun, 2020-07-26 at 12:08 -0700, Randy Dunlap wrote:

> v0.1 of this script also found lots of repeated numbers and strings of
> special characters (ASCII art etc.), so now it ignores duplicated numbers
> or special characters -- since it is really looking for duplicate words.
>
> Anyway, I might as well attach it. It's no big deal.
> And if someone else wants to tackle using it, go for it.

This might be a reasonable thing to add to checkpatch.

And here's another, similar Perl word deduplicator, attached.

Assuming you have git, this could be used like:

$ git ls-files -- <dir> | xargs perl deduplicate_words.pl

It would rewrite, in place, every file in which it finds duplicated words.

No guarantees any changes it makes are right of course.
It still needs a human to verify any change.

For instance:

$ git ls-files kernel/trace/*.[ch] | xargs perl deduplicate_words.pl
$ git diff kernel/trace
kernel/trace/ftrace.c | 2 +-
kernel/trace/trace.c | 2 +-
kernel/trace/trace_dynevent.c | 2 +-
kernel/trace/trace_events_synth.c | 2 +-
kernel/trace/tracing_map.c | 2 +-
5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a3093a84bae3..b7f085a4f71a 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2405,7 +2405,7 @@ struct ftrace_ops direct_ops = {
*
* If the record has the FTRACE_FL_REGS set, that means that it
* wants to convert to a callback that saves all regs. If FTRACE_FL_REGS
- * is not not set, then it wants to convert to the normal callback.
+ * is not set, then it wants to convert to the normal callback.
*
* Returns the address of the trampoline to set to
*/
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 5aa5c01e2fed..4d3dcfb06d6d 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -9253,7 +9253,7 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)

/*
* We need to stop all tracing on all CPUS to read the
- * the next buffer. This is a bit expensive, but is
+ * next buffer. This is a bit expensive, but is
* not done often. We fill all what we can read,
* and then release the locks again.
*/
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index 2c435fdef565..8c1e7e168505 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -402,7 +402,7 @@ void dynevent_arg_init(struct dynevent_arg *arg,
* whitespace, all followed by a separator, if applicable. After the
* first arg string is successfully appended to the command string,
* the optional @operator is appended, followed by the second arg and
- * and optional @separator. If no separator was specified when
+ * optional @separator. If no separator was specified when
* initializing the arg, a space will be appended.
*/
void dynevent_arg_pair_init(struct dynevent_arg_pair *arg_pair,
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index e2a623f2136c..3801d3088744 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -1211,7 +1211,7 @@ __synth_event_trace_start(struct trace_event_file *file,
* ENABLED bit is set (which attaches the probe thus allowing
* this code to be called, etc). Because this is called
* directly by the user, we don't have that but we still need
- * to honor not logging when disabled. For the the iterated
+ * to honor not logging when disabled. For the iterated
* trace case, we save the enabed state upon start and just
* ignore the following data calls.
*/
diff --git a/kernel/trace/tracing_map.c b/kernel/trace/tracing_map.c
index 74738c9856f1..4b50fc0cb12c 100644
--- a/kernel/trace/tracing_map.c
+++ b/kernel/trace/tracing_map.c
@@ -260,7 +260,7 @@ int tracing_map_add_var(struct tracing_map *map)
* to use cmp_fn.
*
* A key can be a subset of a compound key; for that purpose, the
- * offset param is used to describe where within the the compound key
+ * offset param is used to describe where within the compound key
* the key referenced by this key field resides.
*
* Return: The index identifying the field in the map and associated
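
The kind of rewrite shown in the diff above can be sketched in Python.
The attached deduplicate_words.pl is not reproduced in this thread, so
this is not that script: it is a minimal illustration under stated
assumptions, and the names deduplicate and rewrite_file are hypothetical.
It drops the second copy of a word on the same line, or the copy that
opens the next comment line, and leaves "long long" and
"struct name name" alone.

```python
import re

# Same word shape as the grep pattern earlier in the thread.
WORD = r"[A-Z]?[a-z]{2,}"

# "word word" on one line, unless the word is "long" or it follows
# "struct ", "union " or "enum ".
SAME_LINE = re.compile(
    r"(?<!struct )(?<!union )(?<!enum )\b(?!long\b)(" + WORD + r") \1\b"
)

# A word ending one comment line and repeated at the start of the next
# line, after the leading " * ".
NEXT_LINE = re.compile(r"\b(" + WORD + r")([ \t]*\n[ \t]*\*[ \t]*)\1 ")


def deduplicate(text):
    """Return text with simple repeated words collapsed."""
    text = SAME_LINE.sub(r"\1", text)
    # Keep the first copy and the line break; drop the second copy.
    text = NEXT_LINE.sub(r"\1\2", text)
    return text


def rewrite_file(path):
    """Rewrite path in place; return True if anything changed."""
    with open(path, encoding="utf-8") as f:
        old = f.read()
    new = deduplicate(old)
    if new != old:
        with open(path, "w", encoding="utf-8") as f:
            f.write(new)
    return new != old
```

As with the Perl version, nothing here knows whether a change is right.
Run it on a clean tree and review the result with "git diff" before
committing anything.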


Attachments:
deduplicate_words.pl (1.15 kB)

2020-07-27 07:28:59

by Michael Ellerman

Subject: Re: [PATCH 0/9] powerpc: delete duplicated words

On Sat, 25 Jul 2020 17:38:00 -0700, Randy Dunlap wrote:
> Drop duplicated words in arch/powerpc/ header files.
>
> Cc: Michael Ellerman <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: [email protected]
>
> [...]

Applied to powerpc/next.

[1/9] powerpc/book3s/mmu-hash.h: delete duplicated word
https://git.kernel.org/powerpc/c/10a4a016d6a882ba7601159b0f719330b102c41b
[2/9] powerpc/book3s/radix-4k.h: delete duplicated word
https://git.kernel.org/powerpc/c/92be1fca08eabe8ab083b1dfccd3e932b4fb6f1a
[3/9] powerpc/cputime.h: delete duplicated word
https://git.kernel.org/powerpc/c/dc9bf323d6b8996d22c111add0ac8b0c895dcf52
[4/9] powerpc/epapr_hcalls.h: delete duplicated words
https://git.kernel.org/powerpc/c/8965aa4b684f022c4d0bc6429097ddb38a26eaef
[5/9] powerpc/hw_breakpoint.h: delete duplicated word
https://git.kernel.org/powerpc/c/028cc22d29959b501add32fc62660e5484c8092d
[6/9] powerpc/ppc_asm.h: delete duplicated word
https://git.kernel.org/powerpc/c/db10f5500004268b29e3c5bfd1e44ef53a1e25c9
[7/9] powerpc/reg.h: delete duplicated word
https://git.kernel.org/powerpc/c/850659392abc303d41c3f9217d45ab4fa79d201c
[8/9] powerpc/smu.h: delete duplicated word
https://git.kernel.org/powerpc/c/3b56ed4b461fd92b66f6ea44d81837e12878031f
[9/9] powerpc/powernv/pci.h: delete duplicated word
https://git.kernel.org/powerpc/c/86052e407e8e1964c81965de25832258875a0e6d

cheers

2020-07-27 07:35:08

by Joe Perches

Subject: [PATCH] checkpatch: Add test for repeated words

Try to avoid adding repeated words either on the
same line or consecutive comment lines in a block

e.g.:

duplicated word in comment block

/*
* this is a comment block where the last word of the previous
* previous line is also the first word of the next line
*/

and simple duplication

/* test this this again */

Inspired-by: Randy Dunlap <[email protected]>
Signed-off-by: Joe Perches <[email protected]>
---
scripts/checkpatch.pl | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e9fde28eb0de..c6ef76b72bf3 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -591,6 +591,8 @@ our @mode_permission_funcs = (
["__ATTR", 2],
);

+my $word_pattern = '\b[A-Z]?[a-z]{2,}\b';
+
#Create a search pattern for all these functions to speed up a loop below
our $mode_perms_search = "";
foreach my $entry (@mode_permission_funcs) {
@@ -3340,6 +3342,42 @@ sub process {
}
}

+# check for repeated words separated by a single space
+ if ($rawline =~ /^\+/) {
+ while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
+
+ my $first = $1;
+ my $second = $2;
+
+ if ($first =~ /(?:struct|union|enum)/) {
+ pos($rawline) += length($first) + length($second) + 1;
+ next;
+ }
+
+ next if ($first ne $second);
+ next if ($first eq 'long');
+
+ if (WARN("REPEATED_WORD",
+ "Possible repeated word: '$first'\n" . $herecurr) &&
+ $fix) {
+ $fixed[$fixlinenr] =~ s/\b$first $second\b/$first/;
+ }
+ }
+
+ # if it's a repeated word on consecutive lines in a comment block
+ if ($prevline =~ /$;+\s*$/ &&
+ $prevrawline =~ /($word_pattern)\s*$/) {
+ my $last_word = $1;
+ if ($rawline =~ /^\+\s*\*\s*$last_word /) {
+ if (WARN("REPEATED_WORD",
+ "Possible repeated word: '$last_word'\n" . $hereprev) &&
+ $fix) {
+ $fixed[$fixlinenr] =~ s/(\+\s*\*\s*)$last_word /$1/;
+ }
+ }
+ }
+ }
+
# check for space before tabs.
if ($rawline =~ /^\+/ && $rawline =~ / \t/) {
my $herevet = "$here\n" . cat_vet($rawline) . "\n";
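
The two checks in the patch above can be restated as a small standalone
Python sketch for readers who want to try the logic outside checkpatch.
It is an approximation, not a port: check_repeated_words and in_comment
are hypothetical names, the comment test is a simple prefix check rather
than checkpatch's sanitized-line test, and warnings carry the number of
the line holding the second copy of the word.

```python
import re

# Same word shape as $word_pattern in the patch.
WORD = r"\b[A-Z]?[a-z]{2,}\b"

# A word, one space, and (without consuming it) the following word.
PAIR = re.compile(r"(" + WORD + r") (?=(" + WORD + r"))")

# The last word on a line, allowing trailing whitespace.
LAST_WORD = re.compile(r"(" + WORD + r")\s*$")

TYPE_KEYWORDS = ("struct", "union", "enum")


def in_comment(line):
    """Assumption: a line is comment text if it starts with /* or *."""
    body = line[1:] if line[:1] in "+- " else line
    return body.lstrip().startswith(("/*", "*"))


def check_repeated_words(lines):
    """Return (line_number, word) warnings for added ('+') patch lines."""
    warnings = []
    prev = None
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("+"):
            # Repeated words separated by a single space.
            pos = 0
            while True:
                m = PAIR.search(line, pos)
                if not m:
                    break
                first, second = m.group(1), m.group(2)
                pos = m.end()
                if first in TYPE_KEYWORDS:
                    pos += len(second)  # skip the type name as well
                    continue
                if first == second and first != "long":
                    warnings.append((lineno, first))
            # Last word of the previous comment line repeated here.
            if prev is not None and in_comment(prev):
                m = LAST_WORD.search(prev)
                if m and re.match(
                    r"\+\s*\*\s*" + re.escape(m.group(1)) + " ", line
                ):
                    warnings.append((lineno, m.group(1)))
        prev = line
    return warnings
```

Fed the two examples from the commit message as added lines, this
reports "previous" for the comment block and "this" for the one-line
comment, while "struct kref kref" and "long long" pass without a warning.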


2020-07-28 00:03:59

by Randy Dunlap

Subject: Re: [PATCH] checkpatch: Add test for repeated words

On 7/27/20 12:33 AM, Joe Perches wrote:
> Try to avoid adding repeated words either on the
> same line or consecutive comment lines in a block
>
> e.g.:
>
> duplicated word in comment block
>
> /*
> * this is a comment block where the last word of the previous
> * previous line is also the first word of the next line
> */
>
> and simple duplication
>
> /* test this this again */
>
> Inspired-by: Randy Dunlap <[email protected]>
> Signed-off-by: Joe Perches <[email protected]>

Thanks for adding this check, Joe.

> ---
> scripts/checkpatch.pl | 38 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 38 insertions(+)


--
~Randy

2020-07-28 01:26:40

by Joe Perches

Subject: Re: [PATCH] checkpatch: Add test for repeated words

On Mon, 2020-07-27 at 17:03 -0700, Randy Dunlap wrote:
> On 7/27/20 12:33 AM, Joe Perches wrote:
> > Try to avoid adding repeated words either on the
> > same line or consecutive comment lines in a block
> >
> > e.g.:
> >
> > duplicated word in comment block
> >
> > /*
> > * this is a comment block where the last word of the previous
> > * previous line is also the first word of the next line
> > */
> >
> > and simple duplication
> >
> > /* test this this again */
> >
> > Inspired-by: Randy Dunlap <[email protected]>
> > Signed-off-by: Joe Perches <[email protected]>
>
> Thanks for adding this check, Joe.

No charge.

It seemed simple enough and you are/were doing
an awful lot of work to fix these.

Thanks for that too.

Joe