2023-02-28 20:40:09

by kernel test robot

[permalink] [raw]
Subject: drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t

tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head: e492250d5252635b6c97d52eddf2792ec26f1ec1
commit: 74ff8864cc842be994853095dba6db48e716400a PCI: hotplug: Allow marking devices as disconnected during bind/unbind
date: 13 days ago
config: loongarch-randconfig-s042-20230226 (https://download.01.org/0day-ci/archive/20230301/[email protected]/config)
compiler: loongarch64-linux-gcc (GCC) 12.1.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.4-39-gce1a6720-dirty
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=74ff8864cc842be994853095dba6db48e716400a
git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 74ff8864cc842be994853095dba6db48e716400a
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

sparse warnings: (new ones prefixed by >>)
drivers/pci/pcie/err.c: note: in included file:
>> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t
>> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t
>> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t

vim +325 drivers/pci/pcie/../pci.h

306
307 /**
308 * pci_dev_set_io_state - Set the new error state if possible.
309 *
310 * @dev: PCI device to set new error_state
311 * @new: the state we want dev to be in
312 *
313 * If the device is experiencing perm_failure, it has to remain in that state.
314 * Any other transition is allowed.
315 *
316 * Returns true if state has been changed to the requested state.
317 */
318 static inline bool pci_dev_set_io_state(struct pci_dev *dev,
319 pci_channel_state_t new)
320 {
321 pci_channel_state_t old;
322
323 switch (new) {
324 case pci_channel_io_perm_failure:
> 325 xchg(&dev->error_state, pci_channel_io_perm_failure);
326 return true;
327 case pci_channel_io_frozen:
328 old = cmpxchg(&dev->error_state, pci_channel_io_normal,
329 pci_channel_io_frozen);
330 return old != pci_channel_io_perm_failure;
331 case pci_channel_io_normal:
332 old = cmpxchg(&dev->error_state, pci_channel_io_frozen,
333 pci_channel_io_normal);
334 return old != pci_channel_io_perm_failure;
335 default:
336 return false;
337 }
338 }
339

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


2023-02-28 21:33:32

by Lukas Wunner

[permalink] [raw]
Subject: Re: drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t

Hi Dan,

0-day is reporting sparse warnings introduced by commit 74ff8864cc84
("PCI: hotplug: Allow marking devices as disconnected during
bind/unbind"), which landed in Linus' tree this week.

The warnings are caused by invocations of xchg() and cmpxchg()
on an enum type ("cast from restricted pci_channel_state_t").

It seems they are only reported for architectures whose arch_xchg()
and arch_cmpxchg() macros cast the argument to an unsigned long.
Archictures such as x86 don't do that, but a number of others do.
The 0-day report, reproduced below in full, is for loongarch.

I'm wondering why the cast is necessary at all. Digging in the
git history, I noticed that it has existed at least on arm since
forever. I suspect that its use on newer arches such as loongarch
may be due to cargo-culting.

Please advise whether these sparse warnings are false positives which
can be ignored and if they aren't, how to resolve them. If you happen
to know the rationale for the cast, I'd be grateful if you could shed
some light on it. Thanks a lot!


On Wed, Mar 01, 2023 at 04:39:24AM +0800, kernel test robot wrote:
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head: e492250d5252635b6c97d52eddf2792ec26f1ec1
> commit: 74ff8864cc842be994853095dba6db48e716400a PCI: hotplug: Allow marking devices as disconnected during bind/unbind
> date: 13 days ago
> config: loongarch-randconfig-s042-20230226 (https://download.01.org/0day-ci/archive/20230301/[email protected]/config)
> compiler: loongarch64-linux-gcc (GCC) 12.1.0
> reproduce:
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # apt-get install sparse
> # sparse version: v0.6.4-39-gce1a6720-dirty
> # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=74ff8864cc842be994853095dba6db48e716400a
> git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git fetch --no-tags linus master
> git checkout 74ff8864cc842be994853095dba6db48e716400a
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch olddefconfig
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch SHELL=/bin/bash drivers/
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <[email protected]>
> | Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
>
> sparse warnings: (new ones prefixed by >>)
> drivers/pci/pcie/err.c: note: in included file:
> >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t
> >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t
> >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
> drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t
>
> vim +325 drivers/pci/pcie/../pci.h
>
> 306
> 307 /**
> 308 * pci_dev_set_io_state - Set the new error state if possible.
> 309 *
> 310 * @dev: PCI device to set new error_state
> 311 * @new: the state we want dev to be in
> 312 *
> 313 * If the device is experiencing perm_failure, it has to remain in that state.
> 314 * Any other transition is allowed.
> 315 *
> 316 * Returns true if state has been changed to the requested state.
> 317 */
> 318 static inline bool pci_dev_set_io_state(struct pci_dev *dev,
> 319 pci_channel_state_t new)
> 320 {
> 321 pci_channel_state_t old;
> 322
> 323 switch (new) {
> 324 case pci_channel_io_perm_failure:
> > 325 xchg(&dev->error_state, pci_channel_io_perm_failure);
> 326 return true;
> 327 case pci_channel_io_frozen:
> 328 old = cmpxchg(&dev->error_state, pci_channel_io_normal,
> 329 pci_channel_io_frozen);
> 330 return old != pci_channel_io_perm_failure;
> 331 case pci_channel_io_normal:
> 332 old = cmpxchg(&dev->error_state, pci_channel_io_frozen,
> 333 pci_channel_io_normal);
> 334 return old != pci_channel_io_perm_failure;
> 335 default:
> 336 return false;
> 337 }
> 338 }
> 339
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests

2023-03-01 04:44:51

by Dan Carpenter

[permalink] [raw]
Subject: Re: drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t

Hi Lukas,

I do smatch, not sparse. Sparse maintainers are on
[email protected].

On Tue, Feb 28, 2023 at 10:33:25PM +0100, Lukas Wunner wrote:
> Hi Dan,
>
> 0-day is reporting sparse warnings introduced by commit 74ff8864cc84
> ("PCI: hotplug: Allow marking devices as disconnected during
> bind/unbind"), which landed in Linus' tree this week.
>
> The warnings are caused by invocations of xchg() and cmpxchg()
> on an enum type ("cast from restricted pci_channel_state_t").

pci_channel_state_t is not an enum. The problem is the __bit_wise.

>
> It seems they are only reported for architectures whose arch_xchg()
> and arch_cmpxchg() macros cast the argument to an unsigned long.
> Archictures such as x86 don't do that, but a number of others do.
> The 0-day report, reproduced below in full, is for loongarch.
>
> I'm wondering why the cast is necessary at all. Digging in the
> git history, I noticed that it has existed at least on arm since
> forever. I suspect that its use on newer arches such as loongarch
> may be due to cargo-culting.
>

Speaking as an absolutely newbie and ignoramous, I can't see any point
to the cast in arch_xchg(). But I am also surprised that silences the
warning. I would have thought that removing the cast would change the
warning from "warning: cast from restricted my_type_t" to
"warning: incorrect type in argument 1 (different base types)".

> Please advise whether these sparse warnings are false positives which
> can be ignored and if they aren't, how to resolve them. If you happen
> to know the rationale for the cast, I'd be grateful if you could shed
> some light on it. Thanks a lot!

The question is more why is pci_channel_state_t declared as __bit_wise.
__bit_wise data can only be used through accessor functions, like user
pointers have to go through copy_from_user() and endian data has to go
through le32_to_cpu() etc.

I don't know the answer to that though.

regards,
dan carpenter

>
>
> On Wed, Mar 01, 2023 at 04:39:24AM +0800, kernel test robot wrote:
> > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > head: e492250d5252635b6c97d52eddf2792ec26f1ec1
> > commit: 74ff8864cc842be994853095dba6db48e716400a PCI: hotplug: Allow marking devices as disconnected during bind/unbind
> > date: 13 days ago
> > config: loongarch-randconfig-s042-20230226 (https://download.01.org/0day-ci/archive/20230301/[email protected]/config)
> > compiler: loongarch64-linux-gcc (GCC) 12.1.0
> > reproduce:
> > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > # apt-get install sparse
> > # sparse version: v0.6.4-39-gce1a6720-dirty
> > # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=74ff8864cc842be994853095dba6db48e716400a
> > git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > git fetch --no-tags linus master
> > git checkout 74ff8864cc842be994853095dba6db48e716400a
> > # save the config file
> > mkdir build_dir && cp config build_dir/.config
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch olddefconfig
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch SHELL=/bin/bash drivers/
> >
> > If you fix the issue, kindly add following tag where applicable
> > | Reported-by: kernel test robot <[email protected]>
> > | Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> >
> > sparse warnings: (new ones prefixed by >>)
> > drivers/pci/pcie/err.c: note: in included file:
> > >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t
> > >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t
> > >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
> > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t
> >
> > vim +325 drivers/pci/pcie/../pci.h
> >
> > 306
> > 307 /**
> > 308 * pci_dev_set_io_state - Set the new error state if possible.
> > 309 *
> > 310 * @dev: PCI device to set new error_state
> > 311 * @new: the state we want dev to be in
> > 312 *
> > 313 * If the device is experiencing perm_failure, it has to remain in that state.
> > 314 * Any other transition is allowed.
> > 315 *
> > 316 * Returns true if state has been changed to the requested state.
> > 317 */
> > 318 static inline bool pci_dev_set_io_state(struct pci_dev *dev,
> > 319 pci_channel_state_t new)
> > 320 {
> > 321 pci_channel_state_t old;
> > 322
> > 323 switch (new) {
> > 324 case pci_channel_io_perm_failure:
> > > 325 xchg(&dev->error_state, pci_channel_io_perm_failure);
> > 326 return true;
> > 327 case pci_channel_io_frozen:
> > 328 old = cmpxchg(&dev->error_state, pci_channel_io_normal,
> > 329 pci_channel_io_frozen);
> > 330 return old != pci_channel_io_perm_failure;
> > 331 case pci_channel_io_normal:
> > 332 old = cmpxchg(&dev->error_state, pci_channel_io_frozen,
> > 333 pci_channel_io_normal);
> > 334 return old != pci_channel_io_perm_failure;
> > 335 default:
> > 336 return false;
> > 337 }
> > 338 }
> > 339
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests

2023-03-01 04:51:37

by Dan Carpenter

[permalink] [raw]
Subject: Re: drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t

On Wed, Mar 01, 2023 at 07:44:39AM +0300, Dan Carpenter wrote:
> > It seems they are only reported for architectures whose arch_xchg()
> > and arch_cmpxchg() macros cast the argument to an unsigned long.
> > Archictures such as x86 don't do that, but a number of others do.
> > The 0-day report, reproduced below in full, is for loongarch.
> >
> > I'm wondering why the cast is necessary at all. Digging in the
> > git history, I noticed that it has existed at least on arm since
> > forever. I suspect that its use on newer arches such as loongarch
> > may be due to cargo-culting.
> >
>
> Speaking as an absolutely newbie and ignoramous, I can't see any point
> to the cast in arch_xchg(). But I am also surprised that silences the
> warning. I would have thought that removing the cast would change the
> warning from "warning: cast from restricted my_type_t" to
> "warning: incorrect type in argument 1 (different base types)".

Your other option would be to add a __force to the cast. I don't know
if this is a good option. I guess first figure out if the __bit_wise
is really required.

regards,
dan carpenter

diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
index 497acf134d99..387c6af01941 100644
--- a/arch/arm64/include/asm/cmpxchg.h
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -93,7 +93,7 @@ __XCHG_GEN(_mb)
({ \
__typeof__(*(ptr)) __ret; \
__ret = (__typeof__(*(ptr))) \
- __xchg##sfx((unsigned long)(x), (ptr), sizeof(*(ptr))); \
+ __xchg##sfx((__force unsigned long)(x), (ptr), sizeof(*(ptr))); \
__ret; \
})


2023-12-03 17:00:05

by Lukas Wunner

[permalink] [raw]
Subject: Re: drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t

Hi Dan,

0-day has sent out 5 more e-mails about this sparse warning in the last
2 days alone. That prompted me to take a closer look...


On Wed, Mar 01, 2023 at 07:44:39AM +0300, Dan Carpenter wrote:
> On Tue, Feb 28, 2023 at 10:33:25PM +0100, Lukas Wunner wrote:
> I do smatch, not sparse. Sparse maintainers are on
> [email protected].

Right, sorry for the confusion. Adding linux-sparse and Luc to cc.
Quoting previous conversation for context, my findings are further below:


> > 0-day is reporting sparse warnings introduced by commit 74ff8864cc84
> > ("PCI: hotplug: Allow marking devices as disconnected during
> > bind/unbind"), which landed in Linus' tree this week.
> >
> > The warnings are caused by invocations of xchg() and cmpxchg()
> > on an enum type ("cast from restricted pci_channel_state_t").
>
> pci_channel_state_t is not an enum. The problem is the __bit_wise.
>
> >
> > It seems they are only reported for architectures whose arch_xchg()
> > and arch_cmpxchg() macros cast the argument to an unsigned long.
> > Archictures such as x86 don't do that, but a number of others do.
> > The 0-day report, reproduced below in full, is for loongarch.
> >
> > I'm wondering why the cast is necessary at all. Digging in the
> > git history, I noticed that it has existed at least on arm since
> > forever. I suspect that its use on newer arches such as loongarch
> > may be due to cargo-culting.
>
> Speaking as an absolutely newbie and ignoramous, I can't see any point
> to the cast in arch_xchg(). But I am also surprised that silences the
> warning. I would have thought that removing the cast would change the
> warning from "warning: cast from restricted my_type_t" to
> "warning: incorrect type in argument 1 (different base types)".
>
> > Please advise whether these sparse warnings are false positives which
> > can be ignored and if they aren't, how to resolve them. If you happen
> > to know the rationale for the cast, I'd be grateful if you could shed
> > some light on it. Thanks a lot!
>
> The question is more why is pci_channel_state_t declared as __bit_wise.
> __bit_wise data can only be used through accessor functions, like user
> pointers have to go through copy_from_user() and endian data has to go
> through le32_to_cpu() etc.

__bitwise is not only to ensure endian correctness. It can be used to
define data types which are integer-based, but treated as a distinct
data type by sparse. A pci_channel_state_t value cannot be assigned
to an integer variable of a different type, for example.

A few arches define arch_xchg() and arch_cmpxchg() as pure macros.
The sparse warning for pci_channel_state_t does not appear on those
arches. (These are: arc csky riscv x86)

All other arches use a mix of macros and static inlines to define
arch_xchg() and arch_cmpxchg(). The static inlines use unsigned long
as argument and return types and the macros cast the argument type to
unsigned long.

Why are the casts necessary? Because there are callers of xchg() and
cmpxchg() which pass pointers instead of integers as arguments.
Examples include llist_del_all(), __wake_q_add(), mark_oom_victim(),
fsnotify_attach_connector_to_object(). (Note that NULL is defined as
"(void *)0".)

When using xchg() or cmpxchg() with __bitwise arguments (as is done
by commit 74ff8864cc84 in pci_dev_set_io_state()), the arches which
define arch_xchg() and arch_cmpxchg() with static inlines see sparse
warnings because the __bitwise arguments are cast to unsigned long.
Those arches are: alpha arm arm64 hexagon loongarch m68k mips openrisc
parisc powerpc s390 sh sparc xtensa

Indeed adding __force to the cast, as you suggest, should avoid the
issue. sparse cannot parse the inline assembler, so it does not
understand that arch_xchg() and arch_cmpxchg() internally perform a
comparison and/or assignment. By adding __force, we therefore do not
lose any validation coverage.

A better approach might be to teach sparse that arch_xchg() and
arch_cmpxchg() internally perform a comparison and/or assignment.
Then sparse could validate the argument and return value types.

builtin.c in the sparse source code already contains functions
to handle __atomic_exchange() and __atomic_compare_exchange(),
so I think xchg() and cmpxchg() should be handled there as well.

Unfortunately the sparse source code does not feel very approachable
to me: E.g. the builtins_common[] table wants a return type of the
function, but the return type can be either an integer or a pointer
in this case. How do I encode this? Some entries in the table use
NULL as return value, is that a way to tell sparse that the return
type is variable? NULL does not seem to mean void, because there's
a separate void_ctype for that. Documentation is nearly non-existent.

Another issue is that the last commit to the sparse repository was
made 17 months ago. Patches submitted to linux-sparse are ignored,
e.g. this one submitted by you half a year ago:

https://lore.kernel.org/linux-sparse/ZIyt1uUYW%2FYXEluw@moroto/

It appears sparse is unmaintained, so it's unclear if it's worth
investing time to add xchg() / cmpxchg() support. I cannot blame
the maintainer as the source code seems less than simple to understand.

Suddenly the __force solution you've proposed looks very appealing...

Thoughts?

Thanks,

Lukas

> > On Wed, Mar 01, 2023 at 04:39:24AM +0800, kernel test robot wrote:
> > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > head: e492250d5252635b6c97d52eddf2792ec26f1ec1
> > > commit: 74ff8864cc842be994853095dba6db48e716400a PCI: hotplug: Allow marking devices as disconnected during bind/unbind
> > > date: 13 days ago
> > > config: loongarch-randconfig-s042-20230226 (https://download.01.org/0day-ci/archive/20230301/[email protected]/config)
> > > compiler: loongarch64-linux-gcc (GCC) 12.1.0
> > > reproduce:
> > > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > > chmod +x ~/bin/make.cross
> > > # apt-get install sparse
> > > # sparse version: v0.6.4-39-gce1a6720-dirty
> > > # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=74ff8864cc842be994853095dba6db48e716400a
> > > git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > > git fetch --no-tags linus master
> > > git checkout 74ff8864cc842be994853095dba6db48e716400a
> > > # save the config file
> > > mkdir build_dir && cp config build_dir/.config
> > > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch olddefconfig
> > > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=loongarch SHELL=/bin/bash drivers/
> > >
> > > If you fix the issue, kindly add following tag where applicable
> > > | Reported-by: kernel test robot <[email protected]>
> > > | Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > >
> > > sparse warnings: (new ones prefixed by >>)
> > > drivers/pci/pcie/err.c: note: in included file:
> > > >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t
> > > >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast from restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast from restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t
> > > >> drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast to restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:328:23: sparse: sparse: cast to restricted pci_channel_state_t
> > > drivers/pci/pcie/../pci.h:332:23: sparse: sparse: cast to restricted pci_channel_state_t
> > >
> > > vim +325 drivers/pci/pcie/../pci.h
> > >
> > > 306
> > > 307 /**
> > > 308 * pci_dev_set_io_state - Set the new error state if possible.
> > > 309 *
> > > 310 * @dev: PCI device to set new error_state
> > > 311 * @new: the state we want dev to be in
> > > 312 *
> > > 313 * If the device is experiencing perm_failure, it has to remain in that state.
> > > 314 * Any other transition is allowed.
> > > 315 *
> > > 316 * Returns true if state has been changed to the requested state.
> > > 317 */
> > > 318 static inline bool pci_dev_set_io_state(struct pci_dev *dev,
> > > 319 pci_channel_state_t new)
> > > 320 {
> > > 321 pci_channel_state_t old;
> > > 322
> > > 323 switch (new) {
> > > 324 case pci_channel_io_perm_failure:
> > > > 325 xchg(&dev->error_state, pci_channel_io_perm_failure);
> > > 326 return true;
> > > 327 case pci_channel_io_frozen:
> > > 328 old = cmpxchg(&dev->error_state, pci_channel_io_normal,
> > > 329 pci_channel_io_frozen);
> > > 330 return old != pci_channel_io_perm_failure;
> > > 331 case pci_channel_io_normal:
> > > 332 old = cmpxchg(&dev->error_state, pci_channel_io_frozen,
> > > 333 pci_channel_io_normal);
> > > 334 return old != pci_channel_io_perm_failure;
> > > 335 default:
> > > 336 return false;
> > > 337 }
> > > 338 }
> > > 339
> > >
> > > --
> > > 0-DAY CI Kernel Test Service
> > > https://github.com/intel/lkp-tests

2023-12-04 14:10:06

by Luc Van Oostenryck

[permalink] [raw]
Subject: Re: drivers/pci/pcie/../pci.h:325:17: sparse: sparse: cast from restricted pci_channel_state_t

On Sun, Dec 03, 2023 at 05:59:32PM +0100, Lukas Wunner wrote:
> Hi Dan,
>
> > > Please advise whether these sparse warnings are false positives which
> > > can be ignored and if they aren't, how to resolve them. If you happen
> > > to know the rationale for the cast, I'd be grateful if you could shed
> > > some light on it. Thanks a lot!
> >
> > The question is more why is pci_channel_state_t declared as __bit_wise.
> > __bit_wise data can only be used through accessor functions, like user
> > pointers have to go through copy_from_user() and endian data has to go
> > through le32_to_cpu() etc.
>
> __bitwise is not only to ensure endian correctness. It can be used to
> define data types which are integer-based, but treated as a distinct
> data type by sparse. A pci_channel_state_t value cannot be assigned
> to an integer variable of a different type, for example.

Correct. It was done on purpose to make a sort of 'strong' enum type,
and thus warn if pci_channel_state_t values are mixed with some other types.

> A few arches define arch_xchg() and arch_cmpxchg() as pure macros.
> The sparse warning for pci_channel_state_t does not appear on those
> arches. (These are: arc csky riscv x86)
>
> All other arches use a mix of macros and static inlines to define
> arch_xchg() and arch_cmpxchg(). The static inlines use unsigned long
> as argument and return types and the macros cast the argument type to
> unsigned long.
>
> Why are the casts necessary? Because there are callers of xchg() and
> cmpxchg() which pass pointers instead of integers as arguments.

Hmmm, I'm missing the precise context but it make me wonder what's happening
on 64-bit archs where enums are usually only 32-bit ...

> Examples include llist_del_all(), __wake_q_add(), mark_oom_victim(),
> fsnotify_attach_connector_to_object(). (Note that NULL is defined as
> "(void *)0".)
>
> When using xchg() or cmpxchg() with __bitwise arguments (as is done
> by commit 74ff8864cc84 in pci_dev_set_io_state()), the arches which
> define arch_xchg() and arch_cmpxchg() with static inlines see sparse
> warnings because the __bitwise arguments are cast to unsigned long.
> Those arches are: alpha arm arm64 hexagon loongarch m68k mips openrisc
> parisc powerpc s390 sh sparc xtensa

OK, if the underlying macros do as:
switch (size) {
case 1: ...
then things are ok there (but only if the size is given by a sizeof() of
the correct type (not casted to long or something).

> Indeed adding __force to the cast, as you suggest, should avoid the
> issue. sparse cannot parse the inline assembler, so it does not
> understand that arch_xchg() and arch_cmpxchg() internally perform a
> comparison and/or assignment. By adding __force, we therefore do not
> lose any validation coverage.

Yes, indeed, using __force inside specific accessors or any low-level macros,
like here these arch_xchg(), is OK. It's done in plenty of similar cases.

> A better approach might be to teach sparse that arch_xchg() and
> arch_cmpxchg() internally perform a comparison and/or assignment.
> Then sparse could validate the argument and return value types.
>
> builtin.c in the sparse source code already contains functions
> to handle __atomic_exchange() and __atomic_compare_exchange(),
> so I think xchg() and cmpxchg() should be handled there as well.

I don't agree. Sparse shouldn't know about the semantics of functions
in the kernel. Sparse offers some tools (annotations like the __bitwise,
noderef or address_space attributes for __user, __iomem, ... and type
checking stricter than standard/GCC C). By using these annotations, the
kernel can define a stricter semantic, that better suits its needs,
Sparse should know nothing about this, it's not its job.

Best regards,
-- Luc