2011-03-02 07:27:50

by Stephen Rothwell

[permalink] [raw]
Subject: linux-next: Tree for March 2

Hi all,

Changes since 20110301:

Removed trees: bkl-trivial, bkl-llseek and blk-vfs (served their purpose)

Dropped tree: xen

The v4l-dvb tree still has its build failure so I used the version from
next-20110225.

The devicetree tree lost its build failure.

The usb tree gained conflicts against the omap and omap_dss2 trees.

The staging tree gained a conflict against the wireless tree.

The blk-config tree lost its conflicts.

The powerpc allyesconfig build is still broken by some obscure bloating
of the low memory code.

----------------------------------------------------------------------------

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/v2.6/next/ ). If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one. You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source. There are also quilt-import.log and merge.log files
in the Next directory. Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc
and sparc64 defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 184 trees (counting Linus' and 28 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next . If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ . Thanks to Frank Seidel.

--
Cheers,
Stephen Rothwell [email protected]

$ git checkout master
$ git reset --hard stable
Merging origin/master
Merging fixes/fixes
Merging kbuild-current/rc-fixes
Merging arm-current/master
Merging m68k-current/for-linus
Merging powerpc-merge/merge
Merging 52xx-and-virtex-current/powerpc/merge
Merging sparc-current/master
Merging scsi-rc-fixes/master
Merging net-current/master
Merging sound-current/for-linus
Merging pci-current/for-linus
Merging wireless-current/master
Merging driver-core.current/driver-core-linus
Merging tty.current/tty-linus
Merging usb.current/usb-linus
Merging staging.current/staging-linus
Merging cpufreq-current/fixes
Merging input-current/for-linus
Merging md-current/for-linus
Merging audit-current/for-linus
Merging crypto-current/master
Merging ide-curent/master
Merging dwmw2/master
Merging sh-current/sh-fixes-for-linus
Merging rmobile-current/rmobile-fixes-for-linus
Merging fbdev-current/fbdev-fixes-for-linus
Merging devicetree-current/devicetree/merge
Merging spi-current/spi/merge
Merging arm/for-next
Merging davinci/davinci-next
Merging i.MX/for-next
Merging linux-spec/for-next
Merging msm/for-next
CONFLICT (content): Merge conflict in arch/arm/mach-msm/board-msm7x27.c
CONFLICT (content): Merge conflict in arch/arm/mach-msm/board-msm7x30.c
CONFLICT (content): Merge conflict in arch/arm/mach-msm/board-qsd8x50.c
CONFLICT (content): Merge conflict in arch/arm/mach-msm/board-sapphire.c
CONFLICT (content): Merge conflict in arch/arm/mach-msm/include/mach/memory.h
Merging omap/for-next
Merging pxa/for-next
Merging samsung/next-samsung
Merging s5p/for-next
Merging tegra/for-next
Merging ux500-core/ux500-core
CONFLICT (content): Merge conflict in arch/arm/configs/u8500_defconfig
CONFLICT (content): Merge conflict in arch/arm/mach-ux500/cpu-db8500.c
Merging avr32/avr32-arch
Merging blackfin/for-linus
Merging cris/for-next
Merging ia64/test
Merging m68k/for-next
Merging m68knommu/for-next
Merging microblaze/next
Merging mips/mips-for-linux-next
Merging parisc/for-next
Merging powerpc/next
Merging 4xx/next
Merging 52xx-and-virtex/powerpc/next
Merging galak/next
Merging s390/features
Merging sh/sh-latest
Merging rmobile/rmobile-latest
Merging sparc/master
Merging tile/master
Merging unicore32/unicore32
Merging xtensa/master
CONFLICT (content): Merge conflict in arch/xtensa/configs/iss_defconfig
Merging ceph/for-next
Merging cifs/master
Merging configfs/linux-next
Merging ecryptfs/next
Merging ext3/for_next
Merging ext4/next
Merging fatfs/master
Merging fuse/for-next
Merging gfs2/master
Merging hfsplus/for-next
Merging jfs/next
Merging logfs/master
CONFLICT (content): Merge conflict in fs/logfs/logfs.h
Merging nfs/linux-next
Merging nfsd/nfsd-next
Merging nilfs2/for-next
Merging ocfs2/linux-next
Merging omfs/for-next
Merging squashfs/master
Merging udf/for_next
Merging v9fs/for-next
Merging ubifs/linux-next
Merging xfs/master
Merging vfs/for-next
Merging vfs-scale/vfs-scale-working
Merging pci/linux-next
Merging hid/for-next
Merging quilt/i2c
Merging bjdooks-i2c/next-i2c
Merging quilt/jdelvare-hwmon
Merging hwmon-staging/hwmon-next
CONFLICT (content): Merge conflict in drivers/hwmon/Makefile
Merging quilt/kernel-doc
Merging v4l-dvb/master
$ git reset --hard HEAD^
Merging refs/next/20110225/v4l-dvb
Merging kbuild/for-next
Merging kconfig/for-next
Merging ide/master
Merging libata/NEXT
Merging infiniband/for-next
Merging acpi/test
Merging idle-test/idle-test
Merging powertools/tools-test
Merging ieee1394/for-next
Merging ubi/linux-next
Merging kvm/linux-next
Merging dlm/next
Merging swiotlb/master
Merging ibft/master
Merging scsi/master
CONFLICT (content): Merge conflict in drivers/scsi/libsas/sas_ata.c
CONFLICT (content): Merge conflict in drivers/scsi/libsas/sas_scsi_host.c
Merging async_tx/next
Merging net/master
Merging wireless/master
Merging bluetooth/master
Merging mtd/master
Merging crypto/master
Merging sound/for-next
Merging sound-asoc/for-next
Merging cpufreq/next
Merging quilt/rr
Merging input/next
Merging input-mt/next
Merging lsm/for-next
Merging block/for-next
Merging quilt/device-mapper
Merging embedded/master
Merging firmware/master
Merging pcmcia/master
Merging battery/master
Merging leds/for-mm
CONFLICT (content): Merge conflict in drivers/leds/Kconfig
Merging backlight/for-mm
Merging mmc/mmc-next
Merging kgdb/kgdb-next
Merging slab/for-next
Merging uclinux/for-next
Merging md/for-next
Merging mfd/for-next
CONFLICT (content): Merge conflict in arch/arm/mach-imx/mach-mx27_3ds.c
CONFLICT (content): Merge conflict in arch/arm/mach-imx/mach-pcm038.c
CONFLICT (content): Merge conflict in arch/arm/mach-mx3/mach-mx31_3ds.c
CONFLICT (content): Merge conflict in arch/arm/mach-mx3/mach-mx31moboard.c
Merging hdlc/hdlc-next
Merging drm/drm-next
Merging fbdev/master
Merging viafb/viafb-next
Merging omap_dss2/for-next
Merging voltage/for-next
Merging security-testing/next
Merging selinux/master
Merging lblnet/master
Merging agp/agp-next
Merging watchdog/master
Merging bdev/master
Merging dwmw2-iommu/master
Merging cputime/cputime
Merging osd/linux-next
Merging jc_docs/docs-next
Merging nommu/master
Merging trivial/for-next
CONFLICT (content): Merge conflict in MAINTAINERS
CONFLICT (content): Merge conflict in fs/eventpoll.c
Merging audit/for-next
Merging suspend/linux-next
Merging fsnotify/for-next
Merging irda/for-next
Merging catalin/for-next
Merging alacrity/linux-next
CONFLICT (content): Merge conflict in drivers/Makefile
CONFLICT (content): Merge conflict in include/linux/Kbuild
CONFLICT (content): Merge conflict in lib/Kconfig
Merging i7core_edac/linux_next
Merging i7300_edac/linux_next
Merging devicetree/devicetree/next
Merging spi/spi/next
Merging tip/auto-latest
CONFLICT (content): Merge conflict in arch/x86/kernel/acpi/sleep.c
Merging rcu/rcu/next
Merging oprofile/for-next
Merging xen-two/linux-next
Merging xen-pvhvm/linux-next
Merging edac-amd/for-next
CONFLICT (content): Merge conflict in include/linux/pci_ids.h
Merging percpu/for-next
CONFLICT (content): Merge conflict in arch/x86/kernel/vmlinux.lds.S
Merging workqueues/for-next
Merging sfi/sfi-test
Merging asm-generic/next
Merging drivers-x86/linux-next
CONFLICT (content): Merge conflict in drivers/hwmon/lm85.c
Applying: OLPC: fix for removal of run_wake_count
Merging hwpoison/hwpoison
Merging sysctl/master
Merging driver-core/driver-core-next
Merging tty/tty-next
CONFLICT (content): Merge conflict in drivers/tty/serial/Kconfig
CONFLICT (content): Merge conflict in drivers/tty/serial/Makefile
Merging usb/usb-next
CONFLICT (content): Merge conflict in arch/arm/mach-omap2/board-4430sdp.c
CONFLICT (content): Merge conflict in arch/arm/mach-omap2/board-omap3evm.c
CONFLICT (content): Merge conflict in arch/arm/mach-omap2/clock3xxx_data.c
CONFLICT (content): Merge conflict in arch/arm/mach-omap2/usb-musb.c
CONFLICT (content): Merge conflict in arch/arm/plat-omap/include/plat/usb.h
CONFLICT (content): Merge conflict in drivers/usb/gadget/Kconfig
CONFLICT (content): Merge conflict in drivers/usb/musb/musb_core.h
Merging staging/staging-next
CONFLICT (content): Merge conflict in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
Merging slabh/slabh
Merging bkl-config/config
Merging cleancache/linux-next
CONFLICT (content): Merge conflict in fs/ocfs2/super.c
CONFLICT (content): Merge conflict in fs/super.c
CONFLICT (content): Merge conflict in include/linux/fs.h
CONFLICT (content): Merge conflict in mm/Kconfig
Merging scsi-post-merge/merge-base:master
$ git checkout scsi-post-merge/master


Attachments:
(No filename) (9.60 kB)
(No filename) (490.00 B)
Download all attachments

2011-03-02 22:10:31

by Mariusz Kozlowski

[permalink] [raw]
Subject: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing pointer to incomplete type
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: ‘MODULE_STATE_LIVE’ undeclared (first use in this function)

Signed-off-by: Mariusz Kozlowski <[email protected]>
---
drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index e476e87..689db39 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -1812,10 +1812,12 @@ static int bnx2fc_disable(struct net_device *netdev)

mutex_lock(&bnx2fc_dev_lock);

+#ifdef CONFIG_SCSI_BNX2X_FCOE_MODULE
if (THIS_MODULE->state != MODULE_STATE_LIVE) {
rc = -ENODEV;
goto nodev;
}
+#endif

/* obtain physical netdev */
if (netdev->priv_flags & IFF_802_1Q_VLAN)
@@ -1875,10 +1877,12 @@ static int bnx2fc_enable(struct net_device *netdev)
BNX2FC_MISC_DBG("Entered %s\n", __func__);
mutex_lock(&bnx2fc_dev_lock);

+#ifdef CONFIG_SCSI_BNX2X_FCOE_MODULE
if (THIS_MODULE->state != MODULE_STATE_LIVE) {
rc = -ENODEV;
goto nodev;
}
+#endif

/* obtain physical netdev */
if (netdev->priv_flags & IFF_802_1Q_VLAN)
--
1.7.0.4

2011-03-07 20:16:19

by Mariusz Kozlowski

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing pointer to incomplete type
> drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: ‘MODULE_STATE_LIVE’ undeclared (first use in this function)

Hm. Still there in next-20110307. Is this patch wrong or..?

--
Mariusz Kozlowski

2011-03-07 23:54:48

by Bhanu Prakash Gollapudi

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

On Mon, 2011-03-07 at 12:16 -0800, Mariusz Kozlowski wrote:
> On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing pointer to incomplete type
> > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: ?MODULE_STATE_LIVE? undeclared (first use in this function)
>
> Hm. Still there in next-20110307. Is this patch wrong or..?
>

James,

Here is my ack for this patch.

Thanks,
Bhanu

2011-03-08 00:18:28

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

On Mon, 2011-03-07 at 15:54 -0800, Bhanu Gollapudi wrote:
> On Mon, 2011-03-07 at 12:16 -0800, Mariusz Kozlowski wrote:
> > On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing pointer to incomplete type
> > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: ‘MODULE_STATE_LIVE’ undeclared (first use in this function)
> >
> > Hm. Still there in next-20110307. Is this patch wrong or..?
> >
>
> James,
>
> Here is my ack for this patch.

OK, so the patch is actually wrong because adding #ifdefs on modules in
files really impedes readability. The bug is using a direct deref on
module state instead of one of the APIs which work in the non-modular
case, namely try_module_get(). That means the other two need to come out
and be reworked (plus all the others in fcoe).

Reworked looks like it might be a bigger item than bnx2fc. If any of
those tests is ever relevant, it means we have a race in the
fcoe_transport because it shouldn't be calling function pointers on a
dying module (unless it wants to trigger an oops).

So, why are you trying to do this in the first place?

James

2011-03-08 01:09:32

by Love, Robert W

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

On Mon, 2011-03-07 at 16:18 -0800, James Bottomley wrote:
> On Mon, 2011-03-07 at 15:54 -0800, Bhanu Gollapudi wrote:
> > On Mon, 2011-03-07 at 12:16 -0800, Mariusz Kozlowski wrote:
> > > On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing pointer to incomplete type
> > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: ‘MODULE_STATE_LIVE’ undeclared (first use in this function)
> > >
> > > Hm. Still there in next-20110307. Is this patch wrong or..?
> > >
> >
> > James,
> >
> > Here is my ack for this patch.
>
> OK, so the patch is actually wrong because adding #ifdefs on modules in
> files really impedes readability. The bug is using a direct deref on
> module state instead of one of the APIs which work in the non-modular
> case, namely try_module_get(). That means the other two need to come out
> and be reworked (plus all the others in fcoe).
>
> Reworked looks like it might be a bigger item than bnx2fc. If any of
> those tests is ever relevant, it means we have a race in the
> fcoe_transport because it shouldn't be calling function pointers on a
> dying module (unless it wants to trigger an oops).
>
> So, why are you trying to do this in the first place?
>
First, fcoe.c started with these checks. Here is a comment in fcoe.c at
the point of one of the checks.

/*
* Make sure the module has been initialized, and is not about to be
* removed. Module paramter sysfs files are writable before the
* module_init function is called and after module_exit.
*/

I don't know the correct way to fix that race is, but we may be past the
need to fix it in the LLDs.

Next, the fcoe transport was added. Since it (libfcoe.ko) is now calling
what used to be the fcoe.ko sysfs entry points I don't think the problem
exists in fcoe.c or in bnx2fc_fcoe.c, the problem should be in the fcoe
transport code, as James suggested.

The fcoe transport code already has these checks to protect against
sysfs files being writable before module initialization is complete. It
uses the ft_mutex to protect the list of transports(LLDs) so when
'create' is called it knows that the transport is still there to call
down to. It holds the ft_mutex until the LLD's 'create' routine returns.
The transports(LLDs) should be detaching themselves from the fcoe
transport layer before they exit. fcoe_transport_detach will try to
acquire the ft_mutex and block until the 'create' call returns and
releases the ft_mutex. I think this ensures that the transport(LLD) will
be fine when the fcoe transport calls it.

My feeling is that these checks are still needed in the fcoe transport,
but not in the LLDs. If someone can suggest a better way to protect
against writable sysfs files when the module hasn't finished
initializing, we should do that instead of the ifdefs.

Hope this helps,

//Rob

FYI: mnc asked about this code and the trylock code in fcoe and libfcoe.
I have patches in our internal validation to remove the trylock usage,
but I don't have patches to fix the module state checking.

2011-03-08 19:37:18

by Zou, Yi

[permalink] [raw]
Subject: RE: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

> On Mon, 2011-03-07 at 16:18 -0800, James Bottomley wrote:
> > On Mon, 2011-03-07 at 15:54 -0800, Bhanu Gollapudi wrote:
> > > On Mon, 2011-03-07 at 12:16 -0800, Mariusz Kozlowski wrote:
> > > > On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing
> pointer to incomplete type
> > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error:
> ‘MODULE_STATE_LIVE’ undeclared (first use in this function)
> > > >
> > > > Hm. Still there in next-20110307. Is this patch wrong or..?
> > > >
> > >
> > > James,
> > >
> > > Here is my ack for this patch.
> >
> > OK, so the patch is actually wrong because adding #ifdefs on modules in
> > files really impedes readability. The bug is using a direct deref on
> > module state instead of one of the APIs which work in the non-modular
> > case, namely try_module_get(). That means the other two need to come
> out
> > and be reworked (plus all the others in fcoe).
> >
> > Reworked looks like it might be a bigger item than bnx2fc. If any of
> > those tests is ever relevant, it means we have a race in the
> > fcoe_transport because it shouldn't be calling function pointers on a
> > dying module (unless it wants to trigger an oops).
> >
> > So, why are you trying to do this in the first place?
> >
> First, fcoe.c started with these checks. Here is a comment in fcoe.c at
> the point of one of the checks.
>
> /*
> * Make sure the module has been initialized, and is not about to be
> * removed. Module paramter sysfs files are writable before the
> * module_init function is called and after module_exit.
> */
>
> I don't know the correct way to fix that race is, but we may be past the
> need to fix it in the LLDs.
>
> Next, the fcoe transport was added. Since it (libfcoe.ko) is now calling
> what used to be the fcoe.ko sysfs entry points I don't think the problem
> exists in fcoe.c or in bnx2fc_fcoe.c, the problem should be in the fcoe
> transport code, as James suggested.
>
> The fcoe transport code already has these checks to protect against
> sysfs files being writable before module initialization is complete. It
> uses the ft_mutex to protect the list of transports(LLDs) so when
> 'create' is called it knows that the transport is still there to call
> down to. It holds the ft_mutex until the LLD's 'create' routine returns.
> The transports(LLDs) should be detaching themselves from the fcoe
> transport layer before they exit. fcoe_transport_detach will try to
> acquire the ft_mutex and block until the 'create' call returns and
> releases the ft_mutex. I think this ensures that the transport(LLD) will
> be fine when the fcoe transport calls it.
>
> My feeling is that these checks are still needed in the fcoe transport,
> but not in the LLDs. If someone can suggest a better way to protect
> against writable sysfs files when the module hasn't finished
> initializing, we should do that instead of the ifdefs.
>
> Hope this helps,
>
> //Rob
>
> FYI: mnc asked about this code and the trylock code in fcoe and libfcoe.
> I have patches in our internal validation to remove the trylock usage,
> but I don't have patches to fix the module state checking.
>
Yeah, this logic was from original fixing race condition in fcoe.ko, note
that we do need check the MODULE_STATE_LIVE, try_module_get() is not what
we wanted, plus module_is_live () checks if it is !GOING. Anyway, I don't
think this is needed any more for individual fcoe transport driver, e.g.,
fcoe.ko or bnx2fc, as the race is now for sysfs of libfcoe.

I will send out a patch to clean up the fcoe.c for this.

Thanks,
yi






> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2011-03-08 20:29:23

by Bhanu Prakash Gollapudi

[permalink] [raw]
Subject: RE: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

On Tue, 2011-03-08 at 11:36 -0800, Zou, Yi wrote:
> > On Mon, 2011-03-07 at 16:18 -0800, James Bottomley wrote:
> > > On Mon, 2011-03-07 at 15:54 -0800, Bhanu Gollapudi wrote:
> > > > On Mon, 2011-03-07 at 12:16 -0800, Mariusz Kozlowski wrote:
> > > > > On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> > > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing
> > pointer to incomplete type
> > > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error:
> > ?MODULE_STATE_LIVE? undeclared (first use in this function)
> > > > >
> > > > > Hm. Still there in next-20110307. Is this patch wrong or..?
> > > > >
> > > >
> > > > James,
> > > >
> > > > Here is my ack for this patch.
> > >
> > > OK, so the patch is actually wrong because adding #ifdefs on modules in
> > > files really impedes readability. The bug is using a direct deref on
> > > module state instead of one of the APIs which work in the non-modular
> > > case, namely try_module_get(). That means the other two need to come
> > out
> > > and be reworked (plus all the others in fcoe).
> > >
> > > Reworked looks like it might be a bigger item than bnx2fc. If any of
> > > those tests is ever relevant, it means we have a race in the
> > > fcoe_transport because it shouldn't be calling function pointers on a
> > > dying module (unless it wants to trigger an oops).
> > >
> > > So, why are you trying to do this in the first place?
> > >
> > First, fcoe.c started with these checks. Here is a comment in fcoe.c at
> > the point of one of the checks.
> >
> > /*
> > * Make sure the module has been initialized, and is not about to be
> > * removed. Module paramter sysfs files are writable before the
> > * module_init function is called and after module_exit.
> > */
> >
> > I don't know the correct way to fix that race is, but we may be past the
> > need to fix it in the LLDs.
> >
> > Next, the fcoe transport was added. Since it (libfcoe.ko) is now calling
> > what used to be the fcoe.ko sysfs entry points I don't think the problem
> > exists in fcoe.c or in bnx2fc_fcoe.c, the problem should be in the fcoe
> > transport code, as James suggested.
> >
> > The fcoe transport code already has these checks to protect against
> > sysfs files being writable before module initialization is complete. It
> > uses the ft_mutex to protect the list of transports(LLDs) so when
> > 'create' is called it knows that the transport is still there to call
> > down to. It holds the ft_mutex until the LLD's 'create' routine returns.
> > The transports(LLDs) should be detaching themselves from the fcoe
> > transport layer before they exit. fcoe_transport_detach will try to
> > acquire the ft_mutex and block until the 'create' call returns and
> > releases the ft_mutex. I think this ensures that the transport(LLD) will
> > be fine when the fcoe transport calls it.
> >
> > My feeling is that these checks are still needed in the fcoe transport,
> > but not in the LLDs. If someone can suggest a better way to protect
> > against writable sysfs files when the module hasn't finished
> > initializing, we should do that instead of the ifdefs.
> >
> > Hope this helps,
> >
> > //Rob
> >
> > FYI: mnc asked about this code and the trylock code in fcoe and libfcoe.
> > I have patches in our internal validation to remove the trylock usage,
> > but I don't have patches to fix the module state checking.
> >
> Yeah, this logic was from original fixing race condition in fcoe.ko, note
> that we do need check the MODULE_STATE_LIVE, try_module_get() is not what
> we wanted, plus module_is_live () checks if it is !GOING. Anyway, I don't
> think this is needed any more for individual fcoe transport driver, e.g.,
> fcoe.ko or bnx2fc, as the race is now for sysfs of libfcoe.
>
> I will send out a patch to clean up the fcoe.c for this.

I agree. I'll follow it up with bnx2fc patch.

Thanks,
Bhanu
>
> Thanks,
> yi
>
>
>
>
>
>
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html


2011-03-10 15:33:13

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

On Mon, 2011-03-07 at 17:09 -0800, Robert Love wrote:
> On Mon, 2011-03-07 at 16:18 -0800, James Bottomley wrote:
> > On Mon, 2011-03-07 at 15:54 -0800, Bhanu Gollapudi wrote:
> > > On Mon, 2011-03-07 at 12:16 -0800, Mariusz Kozlowski wrote:
> > > > On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing pointer to incomplete type
> > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: ‘MODULE_STATE_LIVE’ undeclared (first use in this function)
> > > >
> > > > Hm. Still there in next-20110307. Is this patch wrong or..?
> > > >
> > >
> > > James,
> > >
> > > Here is my ack for this patch.
> >
> > OK, so the patch is actually wrong because adding #ifdefs on modules in
> > files really impedes readability. The bug is using a direct deref on
> > module state instead of one of the APIs which work in the non-modular
> > case, namely try_module_get(). That means the other two need to come out
> > and be reworked (plus all the others in fcoe).
> >
> > Reworked looks like it might be a bigger item than bnx2fc. If any of
> > those tests is ever relevant, it means we have a race in the
> > fcoe_transport because it shouldn't be calling function pointers on a
> > dying module (unless it wants to trigger an oops).
> >
> > So, why are you trying to do this in the first place?
> >
> First, fcoe.c started with these checks. Here is a comment in fcoe.c at
> the point of one of the checks.
>
> /*
> * Make sure the module has been initialized, and is not about to be
> * removed. Module paramter sysfs files are writable before the
> * module_init function is called and after module_exit.
> */
>
> I don't know the correct way to fix that race is, but we may be past the
> need to fix it in the LLDs.

Well, what you've done isn't even fixing the race ... it's just
narrowing the window. count checks on refcounted objects are almost
always wrong. To see this just think what happens if the module goes
dead the instruction cycle after you do the check.

I don't understand the problem with the comments above. The way we
solve the sysfs race in most systems (including modules) is to make sure
they're initialised with the module and torn down as part of its exit
process ... modules.c follows this pattern. The parameters are supposed
to be initialised before the module has any state (because the init code
may rely on them).

I think the problem is you have what are essentially functional sysfs
files done as module parameters in fcoe_transport.c ... this looks to be
wrong. What you should have is these added as group attributes once the
module is capable of processing them; that way you control the
lifetimes.

James


> Next, the fcoe transport was added. Since it (libfcoe.ko) is now calling
> what used to be the fcoe.ko sysfs entry points I don't think the problem
> exists in fcoe.c or in bnx2fc_fcoe.c, the problem should be in the fcoe
> transport code, as James suggested.
>
> The fcoe transport code already has these checks to protect against
> sysfs files being writable before module initialization is complete. It
> uses the ft_mutex to protect the list of transports(LLDs) so when
> 'create' is called it knows that the transport is still there to call
> down to. It holds the ft_mutex until the LLD's 'create' routine returns.
> The transports(LLDs) should be detaching themselves from the fcoe
> transport layer before they exit. fcoe_transport_detach will try to
> acquire the ft_mutex and block until the 'create' call returns and
> releases the ft_mutex. I think this ensures that the transport(LLD) will
> be fine when the fcoe transport calls it.
>
> My feeling is that these checks are still needed in the fcoe transport,
> but not in the LLDs. If someone can suggest a better way to protect
> against writable sysfs files when the module hasn't finished
> initializing, we should do that instead of the ifdefs.
>
> Hope this helps,
>
> //Rob
>
> FYI: mnc asked about this code and the trylock code in fcoe and libfcoe.
> I have patches in our internal validation to remove the trylock usage,
> but I don't have patches to fix the module state checking.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2011-03-12 02:03:22

by Love, Robert W

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] bnx2fc: fix build error when !CONFIG_MODULES

On Thu, 2011-03-10 at 07:32 -0800, James Bottomley wrote:
> On Mon, 2011-03-07 at 17:09 -0800, Robert Love wrote:
> > On Mon, 2011-03-07 at 16:18 -0800, James Bottomley wrote:
> > > On Mon, 2011-03-07 at 15:54 -0800, Bhanu Gollapudi wrote:
> > > > On Mon, 2011-03-07 at 12:16 -0800, Mariusz Kozlowski wrote:
> > > > > On Wed, Mar 02, 2011 at 11:10:03PM +0100, Mariusz Kozlowski wrote:
> > > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: dereferencing pointer to incomplete type
> > > > > > drivers/scsi/bnx2fc/bnx2fc_fcoe.c:1815: error: ‘MODULE_STATE_LIVE’ undeclared (first use in this function)
> > > > >
> > > > > Hm. Still there in next-20110307. Is this patch wrong or..?
> > > > >
> > > >
> > > > James,
> > > >
> > > > Here is my ack for this patch.
> > >
> > > OK, so the patch is actually wrong because adding #ifdefs on modules in
> > > files really impedes readability. The bug is using a direct deref on
> > > module state instead of one of the APIs which work in the non-modular
> > > case, namely try_module_get(). That means the other two need to come out
> > > and be reworked (plus all the others in fcoe).
> > >
> > > Reworked looks like it might be a bigger item than bnx2fc. If any of
> > > those tests is ever relevant, it means we have a race in the
> > > fcoe_transport because it shouldn't be calling function pointers on a
> > > dying module (unless it wants to trigger an oops).
> > >
> > > So, why are you trying to do this in the first place?
> > >
> > First, fcoe.c started with these checks. Here is a comment in fcoe.c at
> > the point of one of the checks.
> >
> > /*
> > * Make sure the module has been initialized, and is not about to be
> > * removed. Module paramter sysfs files are writable before the
> > * module_init function is called and after module_exit.
> > */
> >
> > I don't know the correct way to fix that race is, but we may be past the
> > need to fix it in the LLDs.
>
> Well, what you've done isn't even fixing the race ... it's just
> narrowing the window. count checks on refcounted objects are almost
> always wrong. To see this just think what happens if the module goes
> dead the instruction cycle after you do the check.
>
> I don't understand the problem with the comments above. The way we
> solve the sysfs race in most systems (including modules) is to make sure
> they're initialised with the module and torn down as part of its exit
> process ... modules.c follows this pattern. The parameters are supposed
> to be initialised before the module has any state (because the init code
> may rely on them).
>
> I think the problem is you have what are essentially functional sysfs
> files done as module parameters in fcoe_transport.c ... this looks to be
> wrong. What you should have is these added as group attributes once the
> module is capable of processing them; that way you control the
> lifetimes.

You're right, this is our problem. Unfortunately, we're in a bit of a
"chicken and egg" situation.

For SW FCoE, we don't know which netdevice the user wants to run FCoE
traffic on so we don't create any devices or other sysfs entries until
the user tells us to 'create'. The user passes the netdevice name to our
sysfs file and then we create the scsi_host/fc_host.

I am not sure what I would attach a group of attributes (create,
destroy, etc...) to. If I had per-instance devices I could add
attributes to them, but I need the 'create' interface before I can
create any per-instance devices.

One option would be to go back to what we used to do (pre-mainline
acceptance) which is the following:

static const struct kobj_attribute fcoe_destroyattr = \
__ATTR(destroy, S_IWUSR, NULL, fcoe_destroy);
static const struct kobj_attribute fcoe_createattr = \
__ATTR(create, S_IWUSR, NULL, fcoe_create);

static int __init fcoe_init(void)
{
...
rc = sysfs_create_file(&THIS_MODULE->mkobj.kobj,
&fcoe_destroyattr.attr);
if (!rc)
rc = sysfs_create_file(&THIS_MODULE->mkobj.kobj,
&fcoe_createattr.attr)
...
}

I think this is bad because we're manipulating kobjects too directly,
right?

A second option would be to keep everything the same but use an atomic
bit to signify when the module has completed initialization. We'd still
need a check in each of the sysfs entries store routines, but we
wouldn't be checking against the module state.

A third option is to do something like bonding does, where it creates
a /sys/class/net/bonding_masters entry. My guess is that this would be
controversial; adding fcoe interfaces class/net could be considered
polluting the networking space with storage information.

A fourth option would be to drop sysfs all together and use netlink,
like iSCSI. Currently we don't need an extensive kernel/user interface,
so I look at this as overkill.

I suppose the second option would solve the current race. I realize it
doesn't necessarily fix the fact that we're adding usable sysfs files
prematurely, but I'm not sure where I'd attach the attributes if I were
do do it at the end of libfcoe's module_init routine. I'll mail my patch
for comments.

Thanks, //Rob