2006-09-19 08:28:54

by Andrew Morton

Subject: 2.6.18-rc7-mm1


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/


- git-input.patch has been dropped due to major mismatches between it
and the driver tree.

- git-alsa.patch has been dropped due to similar mismatches.

- ia64 doesn't build due to bugs in the PCI tree.

- The kernel doesn't work properly on RH FC3 or pretty much anything
which uses old udev, due to improvements in the driver tree.

- `make headers_check' is busted due to various bugs in various trees
and due to collisions between git-magic.patch and git-gfs2.patch
which I couldn't be bothered fixing.

- CONFIG_BLOCK=n is still busted due to mismatches between the NFS
and block trees. Will fix later.

- NFS automounts of subdirectories remain unfixed.

- The large-NR_IRQS-exhausts-per_cpu-memory problem remains unfixed.
I won't merge the genirq changes until it is.

- The i386 genirq MSI bugs have been "fixed" by disabling 4k stacks.

- It took maybe ten hours solid work to get this dogpile vaguely
compiling and limping to a login prompt on x86, x86_64 and powerpc.
I guess it's worth briefly testing if you're keen.



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

git fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
mm-commits mailing list.

echo "subscribe mm-commits" | mail [email protected]

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
most valuable if you can perform a bisection search to identify which patch
introduced the bug. Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

But beware that this process takes some time (around ten rebuilds and
reboots), so consider reporting the bug first and if we cannot immediately
identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
email Subject: in some manner to reflect the nature of the bug. Some
developers filter by Subject: when looking for messages to read.

- Semi-daily snapshots of the -mm lineup are uploaded to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
the mm-commits list.
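
The bisection search recommended above is a binary search over the patch series. The following is only a minimal sketch of that idea, not the procedure from bisecting-mm-trees.txt: `is_bad` is a hypothetical hook that would apply the first N patches, rebuild, reboot and report whether the bug shows up, and here it is simulated with a made-up culprit number. For a series of 1979 patches the loop needs about eleven rounds, in line with the "around ten rebuilds and reboots" estimate.

```shell
#!/bin/sh
# Sketch of a bisection over a patch series (assumptions are labeled).

# Simulation only: pretend patch number 1234 introduced the bug.
first_bad=1234

# Hypothetical hook. A real version would apply the first $1 patches,
# build, boot and return 0 when the bug is present.
is_bad() {
    [ "$1" -ge "$first_bad" ]
}

# bisect TOTAL: print "<first bad patch> <number of test rounds>".
bisect() {
    total=$1
    lo=0            # known good: no patches applied
    hi=$total       # known bad: the whole series applied
    steps=0
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if is_bad "$mid"; then
            hi=$mid     # bug already present with $mid patches
        else
            lo=$mid     # still good with $mid patches
        fi
        steps=$((steps + 1))
    done
    echo "$hi $steps"
}

bisect 1979
```

Each round halves the range of candidate patches, which is why the cost grows only with the logarithm of the series length.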



Changes since 2.6.18-rc6-mm2:


-libata-ignore-cfa-signature-while-sanity-checking-an-atapi-device.patch
-lockdep-double-the-number-of-stack-trace-entries.patch
-we-can-not-allow-anonymous-contributions-to-the-kernel.patch
-alim15x3c-m5229-rev-c8-support-for-dma-cd-writer.patch
-scsi-lockdep-annotation-in-scsi_send_eh_cmnd.patch
-rcu_do_batch-make-qlen-decrement-irq-safe.patch
-x86-reserve-a-boot-loader-id-number-for-xen.patch
-headers_check-improve-include-regexp.patch
-headers_check-clarify-error-message.patch
-headers_check-reduce-user-visible-noise-in-linux-nfs_fsh.patch
-headers_check-remove-asm-timexh-from-user-export.patch
-headers_check-move-inclusion-of-linux-linkageh-in.patch
-headers_check-move-kernel-only-includes-within-asm-i386-elfh.patch
-headers_check-dont-expose-pfn-stuff-to-userspace-in.patch
-headers_check-fix-userspace-build-of-asm-mips-pageh.patch
-cciss-version-update-new-hw.patch
-usbserial-reference-leak.patch
-drivers-base-check-errors.patch
-fix-device_attribute-memory-leak-in-device_del.patch
-git-ieee1394-fixup.patch
-ieee1394-fix-kerneldoc-of-hpsb_alloc_host.patch
-ieee1394-shrink-tlabel-pools-remove-tpool-semaphores.patch
-ieee1394-remove-include-asm-semaphoreh.patch
-ieee1394-sbp2-safer-last_orb-and.patch
-ieee1394-sbp2-discard-return-value-of.patch
-ieee1394-sbp2-optimize-dma-direction-of.patch
-ieee1394-sbp2-safer-initialization-of.patch
-ieee1394-sbp2-more-checks-of-status.patch
-ieee1394-sbp2-convert.patch
-video1394-add-poll-file-operation-support.patch
-ieee1394-safer-definition-of-empty-macros.patch
-ieee1394-sbp2-enable-auto-spin-up-for-all-sbp-2-devices.patch
-config_pm=n-slim-drivers-ieee1394-ohci1394c.patch
-the-scheduled-removal-of-drivers-ieee1394-sbp2cforce_inquiry_hack.patch
-ieee1394-sbp2-handle-sbp2util_node_write_no_wait-failed.patch
-ieee1394-sbp2-safer-agent-reset-in-error-handlers.patch
-ieee1394-sbp2-recheck-node-generation-in-sbp2_update.patch
-ieee1394-sbp2-better-handling-of-transport-errors.patch
-ieee1394-sbp2-update-includes.patch
-ieee1394-sbp2-prevent-rare-deadlock-in-shutdown.patch
-initialize-ieee1394-early-when-built-in.patch
-ieee1394-sbp2-more-help-in-kconfig.patch
-ieee1394-nodemgr-fix-rwsem-recursion.patch
-ieee1394-nodemgr-grab-classsubsysrwsem-in.patch
-ieee1394-sbp2-dont-prefer-mode-sense-10.patch
-ieee1394-ohci1394-fix-endianess-bug-in-debug-message.patch
-ieee1394-ohci1394-more-obvious-endianess-handling.patch
-maintainers-updates-to-ieee-1394-subsystem.patch
-git-libata-all-ata_piix-build-fix.patch
-8139cp-trim-ring_info.patch
-8139cp-remove-gratuitous-indirection.patch
-8139cp-ring_info-removal-for-the-receive-path.patch
-8139cp-sync-the-device-private-data-with-its-r8169-counterpart.patch
-8139cp-removal-of-useless-bug_on-check.patch
-8139cp-pci_get_drvdatapdev-can-not-be-null-in-suspend-handler.patch
-8139cp-use-pci_device-to-shorten-the-pci-device-table.patch
-rtnetlink-fix-netdevice-name-corruption.patch
-fix-gregkh-pci-msi-blacklist-pci-e-chipsets-depending-on-hypertransport-msi-capability.patch
-watchdog-use-enotty-instead-of-enoioctlcmd-in-ioctl.patch
-hostap_cs-added-support-for-proxim-harmony-pci-w-lan.patch
-x86_64-mm-core-2-oprofile-identification.patch
-kernel-bug-fixing-for-kernel-kmodc.patch
-linux-magich-for-magic-numbers.patch
-linux-magich-for-magic-numbers-sparc-fix.patch
-knfsd-have-ext2-reject-file-handles-with-bad-inode-numbers-early.patch
-knfsd-have-ext2-reject-file-handles-with-bad-inode-numbers-early-tidy.patch
-knfsd-make-ext3-reject-filehandles-referring-to-invalid-inode-numbers.patch
-knfsd-make-ext3-reject-filehandles-referring-to-invalid-inode-numbers-tidy.patch
-pr_debug-check-pr_debug-arguments.patch

Merged into mainline or a subsystem tree.

+add-headers_check-target-to-output-of-make-help.patch
+fix-make-headers_check-on-m68k.patch
+headers_check-clean-up-asm-parisc-pageh-for-user-headers.patch
+ext2-remove-superblock-lock-contention-in-ext2_statfs-2.patch

Sent to Linus for 2.6.18.

+autofs4-zero-timeout-prevents-shutdown.patch

Probably for 2.6.18.

+fix-longstanding-load-balancing-bug-in-the-scheduler.patch

sched fix

+update-to-the-kernel-kmap-kunmap-api.patch

Prepare to overload the kmap() API in probably-wrong ways.

+sound-core-use-seek_set-cur.patch
+opl4-use-seek_set-cur.patch
+gus-use-seek_set-cur.patch
+mixart-use-seek_set-cur.patch

Sound stuff.

+cifs-use-seek_end-instead-of-hardcoded-value.patch

CIFS cleanup

+git-cpufreq-sw_any_bug_dmi_table-can-be-used-on-resume.patch

cpufreq fix

+gregkh-driver-driver-core-add-const-to-class_create.patch
+gregkh-driver-sysfs_symlink_in_root.patch
+gregkh-driver-class_device_interface.patch
+gregkh-driver-config_sysfs_deprecated.patch
+gregkh-driver-sound-device.patch
+gregkh-driver-ppp-device.patch
+gregkh-driver-ppdev-device.patch
+gregkh-driver-mmc-device.patch
+gregkh-driver-pcmcia-device.patch
+gregkh-driver-input-device.patch
+gregkh-driver-firmware-device.patch
+gregkh-driver-fb-device.patch

Driver tree updates.

-revert-gregkh-driver-class_device_rename-remove.patch
-revert-gregkh-driver-network-class_device-to-device.patch
-revert-gregkh-driver-tty-device.patch
-revert-gregkh-driver-mem-devices.patch

It became untenable to revert so many things.

-more-driver-core-fixes-for-mm.patch
-yet-further-driver-core-fixes-for-mm.patch
-return-code-checking-for-make_class_name.patch

Some of these were broken by driver-tree changes.

+gregkh-driver-input-device-a3d-fix.patch
+gregkh-driver-input-device-more-fixes.patch
+gregkh-driver-input-device-even-more-fixes.patch
+gregkh-driver-input-device-even-more-fixes-2.patch
+gregkh-driver-fb-device-fixes.patch
+more-driver-tree-fixes.patch

Fix driver-tree mess.

+dvb-usb-vs-driver-tree.patch

More.

-gregkh-i2c-i2c-isa-plan-for-removal.patch

Dropped due to rejects.

+hdapsc-inversion-of-each-axis.patch

hdaps fix

+stowaway-keyboard-support-update.patch
+stowaway-vs-driver-tree.patch

Fix stowaway-keyboard-support.patch

+hdrcheck-permission-fix.patch

Fix `make headers_check'

-revert-libata-ignore-cfa-signature-while-sanity-checking-an-atapi-device.patch
-redo-libata-ignore-cfa-signature-while-sanity-checking-an-atapi-device.patch

Unneeded.

+git-magic.patch
+git-magic-fixup.patch
+git-magic-fixup-2.patch

Consolidate magic numbers.

-tulip-update-tulip-version.patch

Unneeded

-tulip-update-winbond840c-version.patch

Merged, I think.

+ip100a-fix-tx-pause-bug-reset_tx-intr_handler.patch
+ip100a-change-phy-address-search-from-phy=1-to-phy=0.patch
+ip100a-correct-initial-and-close-hardware-step.patch
+ip100a-solve-host-error-problem-in-low-performance.patch

Net driver updates

+net-ipv6-bh_lock_sock_nested-on-tcp_v6_rcv.patch

lockdep fix

+revert-allow-file-systems-to-manually-d_move-inside-of-rename.patch

Revert a patch which is also in the OCFS2 tree.

+git-parisc-powerpc-fix.patch

Fix broken changes to core IRQ code which are (logically) in the parisc tree.

+8250-uart-backup-timer.patch

Serial fix.

+gregkh-pci-msi-rename-pci_cap_id_ht_irqconf-into-pci_cap_id_ht.patch
+gregkh-pci-pci_bridge-device.patch

PCI tree updates

+pci-quirks-update.patch

PCI fixes

-fix-panic-when-reinserting-adaptec-pcmcia-scsi-card.patch

Dropped

+bodge-scsi-misc-module-reference-count-checks-with-no-module_unload.patch
+scsi-remove-seagateh.patch
+scsi-seagate-scsi_cmnd-conversion.patch
+aha152x-fix.patch

SCSI stuff.

+revert-gregkh-usb-usbcore-remove-usb_suspend_root_hub.patch

Revert broken USB patch

+gregkh-usb-usbcore-add-autosuspend-autoresume-infrastructure-2.patch
+microtek-usb-scanner-scsi_cmnd-conversion.patch

USB stuff

+x86-remaining-pda-patches.patch

Fix PDA patches in x86_64 tree.

+cleanup-radix_tree_derefreplace_slot-calling-conventions-warning-fixes.patch

Fix cleanup-radix_tree_derefreplace_slot-calling-conventions.patch

-page-migration-replace-radix_tree_lookup_slot-with-radix_tree_lockup.patch

Dropped.

+have-power-use-add_active_range-and-free_area_init_nodes-ppc-fix.patch

Fix have-power-use-add_active_range-and-free_area_init_nodes.patch

+page-invalidation-cleanup.patch
+slab-fix-kmalloc_node-applying-memory-policies-if-nodeid-==-numa_node_id.patch
+slab-fix-kmalloc_node-applying-memory-policies-if-nodeid-==-numa_node_id-fix.patch
+condense-output-of-show_free_areas.patch
+add-numa_build-definition-in-kernelh-to-avoid-ifdef.patch
+disable-gfp_thisnode-in-the-non-numa-case.patch
+gfp_thisnode-for-the-slab-allocator-v2.patch
+gfp_thisnode-for-the-slab-allocator-v2-fix.patch
+add-node-to-zone-for-the-numa-case.patch
+add-node-to-zone-for-the-numa-case-fix.patch
+get-rid-of-zone_table.patch
+get-rid-of-zone_table-fix.patch
+do-not-allocate-pagesets-for-unpopulated-zones.patch
+zone_statistics-use-hot-node-instead-of-cold-zone_pgdat.patch
+deal-with-cases-of-zone_dma-meaning-the-first-zone.patch
+introduce-config_zone_dma.patch
+optional-zone_dma-in-the-vm.patch
+optional-zone_dma-for-i386.patch
+optional-zone_dma-for-x86_64.patch
+optional-zone_dma-for-ia64.patch
+remove-zone_dma-remains-from-parisc.patch
+remove-zone_dma-remains-from-sh-sh64.patch

MM updates

+frv-fix-fls-to-handle-bit-31-being-set-correctly.patch
+frv-implement-fls64.patch
+frv-optimise-ffs.patch

FRV updates

+alchemy-delete-unused-pt_regs-argument-from-au1xxx_dbdma_chan_alloc.patch

MIPS fix

+avr32-dont-leave-dbe-set-when-resetting-cpu.patch
+avr32-make-prot_write-prot_exec-imply-prot_read.patch
+avr32-remove-set_wmb.patch
+avr32-use-parse_early_param.patch
+avr32-fix-exported-headers.patch
+avr32-fix-__const_udelay-overflow-bug.patch
+avr32-mtd-static-memory-controller-driver-try-2.patch
+avr32-mtd-at49bv6416-platform-device-for-atstk1000.patch
+avr32-mtd-unlock-flash-if-necessary.patch

AVR32 updates

-i386-print-stack-size-in-oops-messages.patch

Dropped due to rejects against x86_64 tree

+x86-restore-i8259a-eoi-status-on-resume.patch

x86 fix

+split-i386-and-x86_64-ptraceh.patch
+split-i386-and-x86_64-ptraceh-fix.patch
+make-uml-use-ptrace-abih.patch

UML work

+inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-vs-gfs2.patch

Fix gfs2 for inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default.patch

-sanitize-3c589_cs.patch

Dropped.

+blockdevc-check-errors-fix.patch

Fix blockdevc-check-errors.patch

+serial-fix-up-offenders-peering-at-baud-bits-directly.patch
+remove-the-old-bd_mutex-lockdep-annotation.patch
+new-bd_mutex-lockdep-annotation.patch
+codingstyle-cleanup-for-kernel-sysc.patch
+allow-proc-configgz-to-be-built-as-a-module.patch
+add-config_headers_check-option-to-automatically-run-make-headers_check.patch
+add-config_headers_check-option-to-automatically-run-make-headers_check-nobble.patch
+pci-via82cxxx_audio-use-pci_get_device.patch
+pci-cs46xx-oss-switch-to-pci_get_device.patch
+#pci-mxser-pci-refcounts.patch
+pci-piix-use-refcounted-interface-when-searching-for-a-450nx.patch
+pci-serverworks-switch-to-pci-refcounted-interfaces.patch
+pci-sis5513-switch-to-pci-refcounting.patch
+pci-mtd-switch-to-pci_get_device-and-do-ref-counting.patch
+pci-via-switch-to-pci_get_device-refcounted-pci-api.patch
+mbcs-use-seek_set-cur.patch
+eicon-isdn-removed-unused-definitions-for-os_seek_.patch
+vfs-use-seek_set-cur.patch
+proper-flags-type-of-spin_lock_irqsave.patch

Misc

+add-missing-page_copy-export-for-ppc-and-powerpc.patch

Fix nfs-use-local-caching-12.patch

-r-o-bind-mount-clean-up-ocfs2-nlink-handling.patch
+r-o-bind-mount-clean-up-ocfs2-nlink-handling-2.patch

Updated due to changes in git-ocfs2.patch

-thinkpad_ec-new-driver-for-thinkpad-embedded-controller-access.patch
-hdaps-use-thinkpad_ec-instead-of-direct-port-access.patch
-hdaps-unify-and-cache-hdaps-readouts.patch
-hdaps-unify-and-cache-hdaps-readouts-fix.patch
-hdaps-correct-readout-and-remove-nonsensical-attributes.patch
-hdaps-remember-keyboard-and-mouse-activity.patch
-hdaps-limit-hardware-query-rate.patch
-hdaps-delay-calibration-to-first-hardware-query.patch
-hdaps-add-explicit-hardware-configuration-functions.patch
-hdaps-add-explicit-hardware-configuration-functions-fix.patch
-hdaps-add-explicit-hardware-configuration-functions-fix-fix.patch
-hdaps-add-new-sysfs-attributes.patch
-hdaps-power-off-accelerometer-on-suspend-and-unload.patch
-hdaps-stop-polling-timer-when-suspended.patch
-hdaps-simplify-whitelist.patch

Dropped.

+s390-update-fs3270-to-use-a-struct-pid.patch

Fix s390 for pid patches in -mm.

+knfsd-replace-two-page-lists-in-struct-svc_rqst-with-one-fix.patch

Fix knfsd-replace-two-page-lists-in-struct-svc_rqst-with-one.patch

+scheduler-numa-aware-placement-of-sched_group_allnodes.patch

sched tweak.

+ecryptfs-versioning-fixes.patch
+ecryptfs-versioning-fixes-tidy.patch

ecryptfs fixes

+namespaces-utsname-implement-clone_newuts-flag-fix.patch

Fix namespaces-utsname-implement-clone_newuts-flag.patch

+rename-the-provided-execve-functions-to-kernel_execve-headers-fix.patch

Fix rename-the-provided-execve-functions-to-kernel_execve.patch some more

+ide-fix-crash-on-repeated-reset-tidy.patch

Clean up ide-fix-crash-on-repeated-reset.patch

+dm-support-ioctls-on-mapped-devices-fix-with-fake-file.patch
+dm-fix-alloc_dev-error-path.patch
+dm-snapshot-fix-invalidation-enomem.patch
+dm-snapshot-allow-zero-chunk_size.patch
+dm-snapshot-fix-metadata-error-handling.patch
+dm-snapshot-make-read-and-write-exception-functions-void.patch
+dm-snapshot-fix-metadata-writing-when-suspending.patch
+dm-snapshot-tidy-snapshot_map.patch
+dm-snapshot-tidy-pending_complete.patch
+dm-snapshot-add-workqueue.patch
+dm-snapshot-tidy-pe-ref-counting.patch
+dm-snapshot-fix-freeing-pending-exception.patch
+dm-mirror-remove-trailing-space-from-table.patch
+dm-mpath-tidy-ctr.patch
+dm-mpath-use-kzalloc.patch
+dm-add-uevent-change-event-on-resume.patch
+dm-add-debug-macro.patch
+dm-table-add-target-preresume.patch
+dm-crypt-add-key-msg.patch
+dm-crypt-restructure-for-workqueue-change.patch
+dm-crypt-restructure-write-processing.patch
+dm-crypt-move-io-to-workqueue.patch
+dm-crypt-use-private-biosets.patch
+dm-use-private-biosets.patch
+dm-extract-device-limit-setting.patch
+dm-table-add-target-flush.patch

Device mapper updates

+statistics-infrastructure-exploitation-zfcp-sched_clock-fix.patch

Fix statistics-infrastructure-exploitation-zfcp.patch

+genirq-msi-restore-__do_irq-compat-logic-temporarily.patch

Kludge around genirq MSI bugs.

+rcu-credits-and-maintainers.patch

RCU update

-nozomi-pci_module_init-conversion.patch

Collaterally damaged by driver tree fun.

+pr_debug-check-pr_debug-arguments-arm-fix.patch
+pr_debug-check-pr_debug-arguments.patch

Fix pr_debug patches in -mm.

+mprotect-patch-for-use-by-slim.patch
+integrity-service-api-and-dummy-provider.patch
+integrity-service-api-and-dummy-provider-compilation-warning-fix.patch
+slim-main-patch.patch
+slim-main-patch-socket_post_create-hook-return-code.patch
+slim-secfs-patch.patch
+slim-make-and-config-stuff.patch
+slim-debug-output.patch
+slim-fix-security-issue-with-the-task_post_setuid-hook.patch
+slim-secfs-inode-i_private-build-fix.patch
+slim-documentation.patch

New security feature.

-input_register_device-debug.patch

Dropped due to rejects.



All 1979 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/patch-list



2006-09-19 13:08:50

by Olivier Galibert

Subject: Re: 2.6.18-rc7-mm1

On Tue, Sep 19, 2006 at 01:28:48AM -0700, Andrew Morton wrote:
> - The kernel doesn't work properly on RH FC3 or pretty much anything
> which uses old udev, due to improvements in the driver tree.

Breaking compatibility again? I thought the sysfs/driver tree
maintainers free pass had expired.

OG.

2006-09-19 14:45:59

by Martin Bligh

Subject: Re: 2.6.18-rc7-mm1


> - It took maybe ten hours solid work to get this dogpile vaguely
> compiling and limping to a login prompt on x86, x86_64 and powerpc.
> I guess it's worth briefly testing if you're keen.

PPC64 blades shit themselves in a strange way. Possibly the udev
breakage you mentioned? Hard to tell really if people are going to
go around breaking userspace compatibility ;-(

http://test.kernel.org/abat/48127/debug/console.log

rpa_vscsi: SPR_VERSION: 16.a
scsi0 : IBM POWER Virtual SCSI Adapter 1.5.8
ibmvscsi: partner initialization complete
ibmvscsic: sent SRP login
ibmvscsi: SRP_LOGIN succeeded
ibmvscsi: host srp version: 16.a, host partition gekko-vios (4), OS 3, max io 262144
scsi 0:0:1:0: Direct-Access AIX VDASD PQ: 0 ANSI: 3
SCSI device sda: 143374000 512-byte hdwr sectors (73407 MB)
sda: Write Protect is off
sda: cache data unavailable
sda: assuming drive cache: write through
SCSI device sda: 143374000 512-byte hdwr sectors (73407 MB)
sda: Write Protect is off
sda: cache data unavailable
sda: assuming drive cache: write through
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
sd 0:0:1:0: Attached scsi disk sda
creating device nodes .[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
[: [0-9]*: bad number
0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
..
mount -o ro /dev/sda2
ReiserFS: sda2: found reiserfs format "3.6" with standard journal
ReiserFS: sda2: using ordered data mode
reiserfs: using flush barriers
ReiserFS: sda2: journal params: device sda2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda2: checking transaction log (sda2)
ReiserFS: sda2: Using r5 hash to sort names
looking for init ...
found /sbin/init
/init: cannot open .//dev//console: no such file
Kernel panic - not syncing: Attempted to kill init!
<0>Rebooting in 180 seconds..-- 0:conmux-control -- time-stamp --
Sep/19/06 4:18:52 --
(bot:conmon-payload) disconnected

2006-09-19 15:18:10

by Greg KH

Subject: Re: 2.6.18-rc7-mm1

On Tue, Sep 19, 2006 at 01:28:48AM -0700, Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/
>
>
> - git-input.patch has been dropped due to major mismatches between it
> and the driver tree.
>
> - git-alsa.patch has been dropped due to similar mismatches.
>
> - ia64 doesn't build due to bugs in the PCI tree.
>
> - The kernel doesn't work properly on RH FC3 or pretty much anything
> which uses old udev, due to improvements in the driver tree.

I've reworked the driver tree, so all 4 of these issues should no longer
happen.

Although the ia64 one should not be due to anything in the driver tree,
I don't know what caused that, the pci tree is pretty tiny right now.

Sorry for the mess.

greg k-h

2006-09-19 15:19:32

by Greg KH

Subject: Re: 2.6.18-rc7-mm1

On Tue, Sep 19, 2006 at 03:08:49PM +0200, Olivier Galibert wrote:
> On Tue, Sep 19, 2006 at 01:28:48AM -0700, Andrew Morton wrote:
> > - The kernel doesn't work properly on RH FC3 or pretty much anything
> > which uses old udev, due to improvements in the driver tree.
>
> Breaking compatibility again? I thought the sysfs/driver tree
> maintainers free pass had expired.

"again"? No, this is the same breakage as before, nothing new here,
move along... :)

thanks,

greg k-h

2006-09-19 15:40:10

by Frederik Deweerdt

Subject: [-mm patch] missing class_dev to dev conversions

On Tue, Sep 19, 2006 at 01:28:48AM -0700, Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/
>
Greg,

There are some net drivers that didn't get their class_device converted to
device, as introduced by the gregkh-driver-network-class_device-to-device
patch.
The arm defconfig build thus fails with the following message:

drivers/net/smc91x.c: In function `smc_ethtool_getdrvinfo':
drivers/net/smc91x.c:1713: error: structure has no member named `class_dev'
make[2]: *** [drivers/net/smc91x.o] Error 1
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2

The following patch fixes at91_ether.c, etherh.c, smc911x.c and smc91x.c.

Regards,
Frederik

Signed-off-by: Frederik Deweerdt <[email protected]>

diff --git a/drivers/net/arm/at91_ether.c b/drivers/net/arm/at91_ether.c
index 95b28aa..0662a72 100644
--- a/drivers/net/arm/at91_ether.c
+++ b/drivers/net/arm/at91_ether.c
@@ -645,7 +645,7 @@ static void at91ether_get_drvinfo(struct
{
strlcpy(info->driver, DRV_NAME, sizeof(info->driver));
strlcpy(info->version, DRV_VERSION, sizeof(info->version));
- strlcpy(info->bus_info, dev->class_dev.dev->bus_id, sizeof(info->bus_info));
+ strlcpy(info->bus_info, dev->dev.parent->bus_id, sizeof(info->bus_info));
}

static const struct ethtool_ops at91ether_ethtool_ops = {
diff --git a/drivers/net/arm/etherh.c b/drivers/net/arm/etherh.c
index 4ae9897..218380a 100644
--- a/drivers/net/arm/etherh.c
+++ b/drivers/net/arm/etherh.c
@@ -580,7 +580,7 @@ static void etherh_get_drvinfo(struct ne
{
strlcpy(info->driver, DRV_NAME, sizeof(info->driver));
strlcpy(info->version, DRV_VERSION, sizeof(info->version));
- strlcpy(info->bus_info, dev->class_dev.dev->bus_id,
+ strlcpy(info->bus_info, dev->dev.parent->bus_id,
sizeof(info->bus_info));
}

diff --git a/drivers/net/smc911x.c b/drivers/net/smc911x.c
index a621b17..b5aafb0 100644
--- a/drivers/net/smc911x.c
+++ b/drivers/net/smc911x.c
@@ -1653,7 +1653,7 @@ smc911x_ethtool_getdrvinfo(struct net_de
{
strncpy(info->driver, CARDNAME, sizeof(info->driver));
strncpy(info->version, version, sizeof(info->version));
- strncpy(info->bus_info, dev->class_dev.dev->bus_id, sizeof(info->bus_info));
+ strncpy(info->bus_info, dev->dev.parent->bus_id, sizeof(info->bus_info));
}

static int smc911x_ethtool_nwayreset(struct net_device *dev)
diff --git a/drivers/net/smc91x.c b/drivers/net/smc91x.c
index d7e5643..810157d 100644
--- a/drivers/net/smc91x.c
+++ b/drivers/net/smc91x.c
@@ -1710,7 +1710,7 @@ smc_ethtool_getdrvinfo(struct net_device
{
strncpy(info->driver, CARDNAME, sizeof(info->driver));
strncpy(info->version, version, sizeof(info->version));
- strncpy(info->bus_info, dev->class_dev.dev->bus_id, sizeof(info->bus_info));
+ strncpy(info->bus_info, dev->dev.parent->bus_id, sizeof(info->bus_info));
}

static int smc_ethtool_nwayreset(struct net_device *dev)

2006-09-19 16:31:38

by Andrew Morton

Subject: Re: 2.6.18-rc7-mm1

On Tue, 19 Sep 2006 07:45:06 -0700
"Martin J. Bligh" <[email protected]> wrote:

>
> > - It took maybe ten hours solid work to get this dogpile vaguely
> > compiling and limping to a login prompt on x86, x86_64 and powerpc.
> > I guess it's worth briefly testing if you're keen.
>
> PPC64 blades shit themselves in a strange way. Possibly the udev
> breakage you mentioned? Hard to tell really if people are going to
> go around breaking userspace compatibility ;-(

What version of udev is it running?

> http://test.kernel.org/abat/48127/debug/console.log
>
> ..
>
> sda: Write Protect is off
> sda: cache data unavailable
> sda: assuming drive cache: write through
> SCSI device sda: 143374000 512-byte hdwr sectors (73407 MB)
> sda: Write Protect is off
> sda: cache data unavailable
> sda: assuming drive cache: write through
> sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> sd 0:0:1:0: Attached scsi disk sda
> creating device nodes .[: [0-9]*: bad number
> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
> [: [0-9]*: bad number
> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
> [: [0-9]*: bad number
> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
> [: [0-9]*: bad number
> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
> [: [0-9]*: bad number
>
>

That all looks rather bad.

> ReiserFS: sda2: Using r5 hash to sort names
> looking for init ...
> found /sbin/init
> /init: cannot open .//dev//console: no such file

Bizarrely-formed pathname. Does it always do that?

> Kernel panic - not syncing: Attempted to kill init!
> <0>Rebooting in 180 seconds..-- 0:conmux-control -- time-stamp --
> Sep/19/06 4:18:52 --
> (bot:conmon-payload) disconnected

Has udev actually attempted to do anything by this stage?

I wasn't seeing anything that spectacular. It used to be the case that
udev simply hung. But in rc7-mm1 the symptoms are that incoming ssh
sessions hang, but most other things work OK.

Oh well - Greg has split that tree apart and I shall not be pulling the
more problematic bits henceforth.

2006-09-19 16:36:45

by Andrew Morton

Subject: Re: 2.6.18-rc7-mm1

On Tue, 19 Sep 2006 07:21:16 -0700
Greg KH <[email protected]> wrote:

> Although the ia64 one should not be due to anything in the driver tree,
> I don't know what caused that, the pci tree is pretty tiny right now.

drivers/pci/probe.c: In function `pci_create_legacy_files':
drivers/pci/probe.c:45: warning: implicit declaration of function `device_create_bin_file'
drivers/pci/probe.c: In function `pci_remove_legacy_files':
drivers/pci/probe.c:61: warning: implicit declaration of function `device_remove_bin_file'
drivers/pci/probe.c: In function `pci_create_bus':
drivers/pci/probe.c:1033: warning: label `sys_create_link_err' defined but not used

The changes inside HAVE_PCI_LEGACY broke.

gregkh-pci-pci_bridge-device.patch
gregkh-pci-pci-sort-device-lists-breadth-first.patch and
gregkh-pci-pci-must_check-fixes.patch

touch that file.

2006-09-19 16:58:22

by Olaf Hering

Subject: Re: 2.6.18-rc7-mm1

On Tue, Sep 19, Andrew Morton wrote:


> What version of udev is it running?

021 likely, a simple udevstart that looks for 'dev' entries.
Where do they hide now in -mm?

> > [: [0-9]*: bad number
> >
> >
>
> That all looks rather bad.

'bad number' is harmless; it affects only the persistent /dev/disk/ symlinks
and has happened since the SCSI target patches in 2.6.9.
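
One plausible way to end up with a message of that shape, offered purely as an assumption since the script that prints it is not shown in this thread: a glob such as `[0-9]*` that matches nothing is left as a literal string by a POSIX shell, and a numeric test on that string then fails. The directory and variable names below are made up for illustration, and the exact error text differs between shells.

```shell
#!/bin/sh
# Hypothetical sketch: an unmatched glob stays literal and reaches a
# numeric test. This is NOT the actual script from the initrd.

tmp=$(mktemp -d)                 # empty directory, so the glob matches nothing

for path in "$tmp"/[0-9]*; do
    part=${path##*/}             # with no match this is the literal "[0-9]*"
    # Comparing a non-number makes `[` report an error; stderr is
    # silenced here because the wording depends on the shell.
    if [ "$part" -gt 0 ] 2>/dev/null; then
        numeric=yes
    else
        numeric=no
    fi
    echo "entry '$part' numeric: $numeric"
done

rm -rf "$tmp"
```

A script that wants to avoid the noise can check that the entry exists (for example with `[ -e "$path" ]`) before treating its name as a number.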

> > ReiserFS: sda2: Using r5 hash to sort names
> > looking for init ...
> > found /sbin/init
> > /init: cannot open .//dev//console: no such file
>
> Bizarrely-formed pathname. Does it always do that?

Yes, I wonder why /dev/console got lost in the first place.

/lib/mkinitrd/kinit.sh
...
rm -rf /bin /lib*
#
exec /run_init "$@" < "./$udev_root/console" > "./$udev_root/console" 2>&1
...
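
The doubled slashes fall out of plain string concatenation in that redirection. A minimal sketch, assuming `udev_root` is set to `/dev/` with both a leading and a trailing slash (that value is an assumption, it is not shown in the snippet above):

```shell
#!/bin/sh
# Assumption: udev_root carries a leading and a trailing slash.
udev_root="/dev/"

# Same expansion as in the redirection: "./" + "/dev/" + "/console".
console_path="./$udev_root/console"
echo "$console_path"

# Repeated slashes inside a pathname are resolved like a single one, so
# the odd-looking name still refers to ./dev/console when that file exists.
tmp=$(mktemp -d)
mkdir "$tmp/dev"
: > "$tmp/dev/console"
if ( cd "$tmp" && [ -e "$console_path" ] ); then
    resolved=yes
else
    resolved=no
fi
echo "resolves to an existing file: $resolved"
rm -rf "$tmp"
```

So the pathname is ugly but valid; the open fails because the console node itself is missing.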

> Has udev actually attempted to do anything by this stage?

udevstart spawns a lot of /sbin/udev processes to populate /dev

2006-09-19 17:00:50

by Martin Bligh

Subject: Re: 2.6.18-rc7-mm1

Andrew Morton wrote:
> On Tue, 19 Sep 2006 07:45:06 -0700
> "Martin J. Bligh" <[email protected]> wrote:
>
>
>>>- It took maybe ten hours solid work to get this dogpile vaguely
>>> compiling and limping to a login prompt on x86, x86_64 and powerpc.
>>> I guess it's worth briefly testing if you're keen.
>>
>>PPC64 blades shit themselves in a strange way. Possibly the udev
>>breakage you mentioned? Hard to tell really if people are going to
>>go around breaking userspace compatibility ;-(
>
>
> What version of udev is it running?
>
>
>>http://test.kernel.org/abat/48127/debug/console.log
>>
>>..
>>
>>sda: Write Protect is off
>>sda: cache data unavailable
>>sda: assuming drive cache: write through
>>SCSI device sda: 143374000 512-byte hdwr sectors (73407 MB)
>>sda: Write Protect is off
>>sda: cache data unavailable
>>sda: assuming drive cache: write through
>> sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
>>sd 0:0:1:0: Attached scsi disk sda
>>creating device nodes .[: [0-9]*: bad number
>>0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>>0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>>[: [0-9]*: bad number
>>0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>>0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>>[: [0-9]*: bad number
>>0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>>0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>>[: [0-9]*: bad number
>>0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>>0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>>[: [0-9]*: bad number
>>
>>
>
>
> That all looks rather bad.
>
>
>>ReiserFS: sda2: Using r5 hash to sort names
>>looking for init ...
>>found /sbin/init
>>/init: cannot open .//dev//console: no such file
>
>
> Bizarrely-formed pathname. Does it always do that?


Working one (-git3): http://test.kernel.org/abat/48064/debug/console.log

Same sgio shit. No mention of /dev/console, but that's an error message,
so not unexpected.

> Has udev actually attempted to do anything by this stage?

Buggered if I know. I always just turn it off on my machines.

> I wasn't seeing anything that spectacular. It used to be the case that
> udev simply hung. But in rc7-mm1 the symptoms are that incoming ssh
> sessions hang, but most other things work OK.
>
> Oh well - Greg has split that tree apart and I shall not be pulling the
> more problematic bits henceforth.

OK, may not be that at all ... could be something entirely different.
Just seemed coincidental to your comments.

M.

2006-09-19 18:38:38

by Greg KH

Subject: Re: 2.6.18-rc7-mm1

On Tue, Sep 19, 2006 at 09:36:41AM -0700, Andrew Morton wrote:
> On Tue, 19 Sep 2006 07:21:16 -0700
> Greg KH <[email protected]> wrote:
>
> > Although the ia64 one should not be due to anything in the driver tree,
> > I don't know what caused that, the pci tree is pretty tiny right now.
>
> drivers/pci/probe.c: In function `pci_create_legacy_files':
> drivers/pci/probe.c:45: warning: implicit declaration of function `device_create_bin_file'
> drivers/pci/probe.c: In function `pci_remove_legacy_files':
> drivers/pci/probe.c:61: warning: implicit declaration of function `device_remove_bin_file'
> drivers/pci/probe.c: In function `pci_create_bus':
> drivers/pci/probe.c:1033: warning: label `sys_create_link_err' defined but not used
>
> The changes inside HAVE_PCI_LEGACY broke.
>
> gregkh-pci-pci_bridge-device.patch
> gregkh-pci-pci-sort-device-lists-breadth-first.patch and
> gregkh-pci-pci-must_check-fixes.patch
>
> touch that file.

Ok, thanks, only ia64 has HAVE_PCI_LEGACY still enabled and I missed
that.

It should now be fixed, sorry for the noise.

thanks,

greg k-h
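
The breakage above lived in code that only one architecture compiles. A minimal userspace sketch of one way to keep such code type-checked on every build, assuming a plain C condition can stand in for the #ifdef. DEMO_HAVE_LEGACY, DEMO_LEGACY_ENABLED and both functions are hypothetical names for illustration; this is not how drivers/pci/probe.c is written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical switch standing in for an option only one arch enables. */
#ifdef DEMO_HAVE_LEGACY
#define DEMO_LEGACY_ENABLED 1
#else
#define DEMO_LEGACY_ENABLED 0
#endif

/* Hypothetical helper. It is declared unconditionally, so a renamed or
 * missing helper shows up as a compile error on every build. */
int demo_create_legacy_file(const char *name)
{
	assert(name != NULL);
	if (name[0] == '\0')
		return -EINVAL;
	return 0;
}

/* The call below is parsed and type-checked even when the option is off;
 * the early return simply makes it unreachable in that configuration. */
int demo_create_legacy_files(const char *name)
{
	if (!DEMO_LEGACY_ENABLED)
		return 0;
	return demo_create_legacy_file(name);
}
```

With an #ifdef around the body instead, a build without the option never sees the call, which is how an interface change can slip through unnoticed.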

2006-09-19 20:22:08

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc7-mm1

On Tuesday, 19 September 2006 10:28, Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/
>
>
> - git-input.patch has been dropped due to major mismatches between it
> and the driver tree.
>
> - git-alsa.patch has been dropped due to similar mismatches.
>
> - ia64 doesn't build due to bugs in the PCI tree.
>
> - The kernel doesn't work properly on RH FC3 or pretty much anything
> which uses old udev, due to improvements in the driver tree.
>
> - `make headers_check' is busted due to various bugs in various trees
> and due to collisions between git-magic.patch and git-gfs2.patch
> which I couldn't be bothered fixing.
>
> - CONFIG_BLOCK=n is still busted due to mismatches between the NFS
> and block trees. Will fix later.
>
> - NFS automounts of subdirectories remain unfixed.
>
> - The large-NR_IRQS-exhausts-per_cpu-memory problem remains unfixed.
> I won't merge the genirq changes until it is.
>
> - The i386 genirq MSI bugs have been "fixed" by disabling 4k stacks.
>
> - It took maybe ten hours solid work to get this dogpile vaguely
> compiling and limping to a login prompt on x86, x86_64 and powerpc.
> I guess it's worth briefly testing if you're keen.

It's not that bad, but unfortunately the networking doesn't work on my system
(HPC nx6325 + SUSE 10.1 w/ updates, 64-bit). Apparently, the interfaces don't
get configured (both tg3 and bcm43xx are affected).

The .config is attached.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller


Attachments:
(No filename) (1.57 kB)
kernel-config.gz (12.61 kB)

2006-09-19 20:36:23

by Andrew Morton

Subject: Re: 2.6.18-rc7-mm1

On Tue, 19 Sep 2006 22:25:21 +0200
"Rafael J. Wysocki" <[email protected]> wrote:

> > - It took maybe ten hours solid work to get this dogpile vaguely
> > compiling and limping to a login prompt on x86, x86_64 and powerpc.
> > I guess it's worth briefly testing if you're keen.
>
> It's not that bad, but unfortunately the networking doesn't work on my system
> (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit). Apparently, the interfaces don't
> get configured (both tg3 and bcm43xx are affected).

Is there anything interesting in the dmesg output?

Perhaps an `strace -f ifup' or whatever would tell us what's failing.

2006-09-19 21:27:23

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

On Tuesday, 19 September 2006 22:36, Andrew Morton wrote:
> On Tue, 19 Sep 2006 22:25:21 +0200
> "Rafael J. Wysocki" <[email protected]> wrote:
>
> > > - It took maybe ten hours solid work to get this dogpile vaguely
> > > compiling and limping to a login prompt on x86, x86_64 and powerpc.
> > > I guess it's worth briefly testing if you're keen.
> >
> > It's not that bad, but unfortunately the networking doesn't work on my system
> > (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit). Apparently, the interfaces don't
> > get configured (both tg3 and bcm43xx are affected).
>
> Is there anything interesting in the dmesg output?

Not to me. :-)

> Perhaps an `strace -f ifup' or whatever would tell us what's failing.

Well, I can configure the interfaces manually, with ifconfig, but SUSE's
configuration tools don't work. For example, "ifup eth0" tells me that
"No configuration found for eth0" and that's all.

Also, powersaved segfaults at startup so I think the problem is with hal
vs sysfs (again).

The output of dmesg after a fresh boot and the "strace ifup eth0" output
are attached.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller


Attachments:
(No filename) (1.18 kB)
dmesg.log.gz (11.25 kB)
strace.log.gz (5.42 kB)

2006-09-19 22:03:41

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

On Tuesday, 19 September 2006 23:30, Rafael J. Wysocki wrote:
> On Tuesday, 19 September 2006 22:36, Andrew Morton wrote:
> > On Tue, 19 Sep 2006 22:25:21 +0200
> > "Rafael J. Wysocki" <[email protected]> wrote:
> >
> > > > - It took maybe ten hours solid work to get this dogpile vaguely
> > > > compiling and limping to a login prompt on x86, x86_64 and powerpc.
> > > > I guess it's worth briefly testing if you're keen.
> > >
> > > It's not that bad, but unfortunately the networking doesn't work on my system
> > > (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit). Apparently, the interfaces don't
> > > get configured (both tg3 and bcm43xx are affected).
> >
> > Is there anything interesting in the dmesg output?
>
> Not to me. :-)
>
> > Perhaps an `strace -f ifup' or whatever would tell us what's failing.
>
> Well, I can configure the interfaces manually, with ifconfig, but SUSE's
> configuration tools don't work. For example, "ifup eth0" tells me that
> "No configuration found for eth0" and that's all.
>
> Also, powersaved segfaults at startup so I think the problem is with hal
> vs sysfs (again).
>
> The output of dmesg after a fresh boot and the "strace ifup eth0" output
> are attached.

I _guess_ the problem is caused by
gregkh-driver-network-class_device-to-device.patch, but I can't verify this,
because the kernel (obviously) doesn't compile if I revert it.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-09-19 22:06:26

by David Miller

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

From: "Rafael J. Wysocki" <[email protected]>
Date: Wed, 20 Sep 2006 00:06:52 +0200

> I _guess_ the problem is caused by
> gregkh-driver-network-class_device-to-device.patch, but I can't verify this,
> because the kernel (obviously) doesn't compile if I revert it.

Indeed.

I thought we threw this patch out because we knew it would cause
problems for existing systems? I do remember Greg making an argument
as to why we needed the change, but that doesn't make breaking people's
systems legitimate in any way.

2006-09-19 22:32:27

by Greg KH

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

On Tue, Sep 19, 2006 at 03:06:29PM -0700, David Miller wrote:
> From: "Rafael J. Wysocki" <[email protected]>
> Date: Wed, 20 Sep 2006 00:06:52 +0200
>
> > I _guess_ the problem is caused by
> > gregkh-driver-network-class_device-to-device.patch, but I can't verify this,
> > because the kernel (obviously) doesn't compile if I revert it.
>
> Indeed.
>
> I thought we threw this patch out because we knew it would cause
> problems for existing systems? I do remember Greg making an argument
> as to why we needed the change, but that doesn't make breaking people's
> systems legitimate in any way.

It's now thrown out, and I think Andrew already had a patch in his tree
that reverted this.

I'll be bringing it back eventually, but first we are going to work out
all the kinks by probably putting these changes in the next few SuSE
alpha releases to see what shakes out in userspace that we need to go
fix.

It's not 2.6.19 material at all, so don't worry :)

thanks,

greg k-h

2006-09-19 22:53:49

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

On Wednesday, 20 September 2006 00:30, Greg KH wrote:
> On Tue, Sep 19, 2006 at 03:06:29PM -0700, David Miller wrote:
> > From: "Rafael J. Wysocki" <[email protected]>
> > Date: Wed, 20 Sep 2006 00:06:52 +0200
> >
> > > I _guess_ the problem is caused by
> > > gregkh-driver-network-class_device-to-device.patch, but I can't verify this,
> > > because the kernel (obviously) doesn't compile if I revert it.
> >
> > Indeed.
> >
> > I thought we threw this patch out because we knew it would cause
> > problems for existing systems? I do remember Greg making an argument
> > as to why we needed the change, but that doesn't make breaking people's
> > systems legitimate in any way.
>
> It's now thrown out, and I think Andrew already had a patch in his tree
> that reverted this.
>
> I'll be bringing it back eventually, but first we are going to work out
> all the kinks by probably putting these changes in the next few SuSE
> alpha releases to see what shakes out in userspace that we need to go
> fix.
>
> It's not 2.6.19 material at all, so don't worry :)

Please note, however, that by including such changes in -mm we cause _other_
things to go untested.

For example, if I can't install a new kernel and use it on my system without
replacing some other pieces of software, I just won't be using it, because I
have no time for playing with udev, hal, powersaved, acpid, ...
Then, if there are any bugs in it that would have shown up on my system,
we won't know about them unless they show up on someone else's system,
which may not happen.

The more changes that break existing setups there are in -mm, the fewer
people will actually try to use -mm kernels and that will result in buggier
-rc kernels and more bugs propagating to the "stable" ones. Do we really
want that to happen?

Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-09-20 01:05:05

by Valdis Klētnieks

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

On Tue, 19 Sep 2006 23:30:34 +0200, "Rafael J. Wysocki" said:

> Well, I can configure the interfaces manually, with ifconfig, but SUSE's
> configuration tools don't work. For example, "ifup eth0" tells me that
> "No configuration found for eth0" and that's all.

I'm seeing issues on a Dell Latitude C840 as well, but I'm not positive
it's the same bug(s). The problem I'm seeing is that device renaming is
failing (I have up to 5 different ethernet-ish interfaces that can be
connected, so I abuse /sbin/nameif extensively). There seem to be some
other issues with pcmcia, but it's not clear what the problem is - it
manages to find the (normally down) ethernet on my Xircom card, but the
orinoco driver seems unable to find my wireless card....

For instance, under 2.6.18-rc6-mm2, I see:

pccard: CardBus card inserted into slot 0
PCI: Enabling device 0000:03:00.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
PCI: Setting latency timer of device 0000:03:00.0 to 64
eth2: Xircom cardbus revision 3 at irq 11
PCI: Enabling device 0000:03:00.1 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.1[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
0000:03:00.1: ttyS1 at I/O 0xe080 (irq = 11) is a 16550A
pccard: PCMCIA card inserted into slot 2
[rename_device:851]: Changing netdevice name from [eth1] to [eth3]
ohci1394: fw-host0: AT dma reset ctx=0, aborting transmission
ieee1394: Current remote IRM is not 1394a-2000 compliant, resetting...
ieee1394: Host added: ID:BUS[0-00:1023] GUID[374fc0002a71c021]
[rename_device:1237]: Changing netdevice name from [eth2] to [eth1]
cs: memory probe 0xf4000000-0xfbffffff: excluding 0xf4000000-0xf8ffffff 0xfa000000-0xfbffffff
pcmcia: registering new device pcmcia2.0
orinoco 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)
orinoco_cs 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)
pcmcia: request for exclusive IRQ could not be fulfilled.
pcmcia: the driver needs updating to supported shared IRQ lines.
cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
cs: IO port probe 0x3e0-0x4ff: clean.
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
cs: IO port probe 0x3e0-0x4ff: clean.
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
cs: IO port probe 0x3e0-0x4ff: clean.
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
eth2: Hardware identity 0005:0004:0005:0000
eth2: Station identity 001f:0001:0008:000a
eth2: Firmware determined as Lucent/Agere 8.10
eth2: Ad-hoc demo mode supported
eth2: IEEE standard IBSS ad-hoc mode supported
eth2: WEP supported, 104-bit key
eth2: MAC address 00:02:2D:5C:11:48
eth2: Station name "HERMES I"
eth2: ready
eth2: orinoco_cs at 2.0, irq 11, io 0xe100-0xe13f
[rename_device:1295]: Changing netdevice name from [eth2] to [eth5]
Non-volatile memory driver v1.2

and under -rc7-mm1, I see:

pccard: CardBus card inserted into slot 0
PCI: Enabling device 0000:03:00.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
PCI: Setting latency timer of device 0000:03:00.0 to 64
eth1: Xircom cardbus revision 3 at irq 11
PCI: Enabling device 0000:03:00.1 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.1[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
0000:03:00.1: ttyS1 at I/O 0xe080 (irq = 11) is a 16550A
pccard: PCMCIA card inserted into slot 2
ohci1394: fw-host0: AT dma reset ctx=0, aborting transmission
ieee1394: Current remote IRM is not 1394a-2000 compliant, resetting...
ieee1394: Host added: ID:BUS[0-00:1023] GUID[374fc0002a71c021]
Non-volatile memory driver v1.2

Amazingly less chatty. Much later, when /etc/rc5.d/S10network runs, we finally
see:

orinoco 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)
orinoco_cs 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)

but no output for the wireless configuring.

Unless somebody has a better idea overnight, I'll start a bisect of -rc7-mm1
in the morning...


Attachments:
(No filename) (226.00 B)

2006-09-20 01:31:13

by Dmitry Torokhov

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

On Tuesday 19 September 2006 18:30, Greg KH wrote:
> On Tue, Sep 19, 2006 at 03:06:29PM -0700, David Miller wrote:
> > From: "Rafael J. Wysocki" <[email protected]>
> > Date: Wed, 20 Sep 2006 00:06:52 +0200
> >
> > > I _guess_ the problem is caused by
> > > gregkh-driver-network-class_device-to-device.patch, but I can't verify this,
> > > because the kernel (obviously) doesn't compile if I revert it.
> >
> > Indeed.
> >
> > I thought we threw this patch out because we knew it would cause
> > problems for existing systems? I do remember Greg making an argument
> > as to why we needed the change, but that doesn't make breaking people's
> > systems legitimate in any way.
>
> It's now thrown out, and I think Andrew already had a patch in his tree
> that reverted this.
>
> I'll be bringing it back eventually, but first we are going to work out
> all the kinks by probably putting these changes in the next few SuSE
> alpha releases to see what shakes out in userspace that we need to go
> fix.
>

Greg,

Could you please comment on the patch below, which I believe achieves
the desired result - a unified sysfs representation of the kernel
device tree - without a major reshuffle of every kernel subsystem.

--
Dmitry


Driver core: move class_device to /sys/devices/... part of the tree

Move the sysfs representation of the class_device structure from
/sys/class/... to /sys/devices/... to provide a unified device tree;
create symlinks in /sys/class pointing to /sys/devices/... to preserve
the existing classification of devices.

Create a /sys/devices/virtual device, which is the parent of all
class_devices that do not have a real parent device.

Signed-off-by: Dmitry Torokhov <[email protected]>
---

drivers/base/class.c | 154 ++++++++++++++++++++++++---------------------------
1 files changed, 73 insertions(+), 81 deletions(-)

Index: work/drivers/base/class.c
===================================================================
--- work.orig/drivers/base/class.c
+++ work/drivers/base/class.c
@@ -521,60 +521,73 @@ char *make_class_name(const char *name,
return class_name;
}

+static struct device virtual_device = {
+ .bus_id = "virtual",
+};
+
int class_device_add(struct class_device *class_dev)
{
- struct class *parent_class = NULL;
+ struct class *parent_class;
struct class_device *parent_class_dev = NULL;
+ struct device *parent_dev = NULL;
struct class_interface *class_intf;
- char *class_name = NULL;
int error = -EINVAL;

- class_dev = class_device_get(class_dev);
- if (!class_dev)
- return -EINVAL;
-
if (!strlen(class_dev->class_id))
- goto out1;
+ return -EINVAL;

parent_class = class_get(class_dev->class);
if (!parent_class)
- goto out1;
-
- parent_class_dev = class_device_get(class_dev->parent);
+ return -EINVAL;

pr_debug("CLASS: registering class device: ID = '%s'\n",
class_dev->class_id);

+ parent_class_dev = class_device_get(class_dev->parent);
+
+ if (!class_dev->dev)
+ class_dev->dev = &virtual_device;
+ parent_dev = get_device(class_dev->dev);
+
/* first, register with generic layer. */
error = kobject_set_name(&class_dev->kobj, "%s", class_dev->class_id);
if (error)
- goto out2;
+ goto err_put_parents;

- if (parent_class_dev)
- class_dev->kobj.parent = &parent_class_dev->kobj;
- else
- class_dev->kobj.parent = &parent_class->subsys.kset.kobj;
+ class_dev->kobj.parent = parent_class_dev ?
+ &parent_class_dev->kobj : &parent_dev->kobj;

error = kobject_add(&class_dev->kobj);
if (error)
- goto out2;
+ goto err_put_parents;

/* add the needed attributes to this device */
- sysfs_create_link(&class_dev->kobj, &parent_class->subsys.kset.kobj, "subsystem");
+ error = sysfs_create_link(&class_dev->kobj,
+ &parent_class->subsys.kset.kobj,
+ "subsystem");
+ if (error)
+ goto err_del_kobject;
+
+ error = sysfs_create_link(&parent_class->subsys.kset.kobj,
+ &class_dev->kobj,
+ class_dev->class_id);
+ if (error)
+ goto err_del_subsys_link;
+
class_dev->uevent_attr.attr.name = "uevent";
class_dev->uevent_attr.attr.mode = S_IWUSR;
class_dev->uevent_attr.attr.owner = parent_class->owner;
class_dev->uevent_attr.store = store_uevent;
error = class_device_create_file(class_dev, &class_dev->uevent_attr);
if (error)
- goto out3;
+ goto err_del_class_link;

if (MAJOR(class_dev->devt)) {
struct class_device_attribute *attr;
attr = kzalloc(sizeof(*attr), GFP_KERNEL);
if (!attr) {
error = -ENOMEM;
- goto out4;
+ goto err_del_uevent_attr;
}
attr->attr.name = "dev";
attr->attr.mode = S_IRUGO;
@@ -583,7 +596,7 @@ int class_device_add(struct class_device
error = class_device_create_file(class_dev, attr);
if (error) {
kfree(attr);
- goto out4;
+ goto err_del_uevent_attr;
}

class_dev->devt_attr = attr;
@@ -591,24 +604,11 @@ int class_device_add(struct class_device

error = class_device_add_attrs(class_dev);
if (error)
- goto out5;
-
- if (class_dev->dev) {
- class_name = make_class_name(class_dev->class->name,
- &class_dev->kobj);
- error = sysfs_create_link(&class_dev->kobj,
- &class_dev->dev->kobj, "device");
- if (error)
- goto out6;
- error = sysfs_create_link(&class_dev->dev->kobj, &class_dev->kobj,
- class_name);
- if (error)
- goto out7;
- }
+ goto err_del_devt_attr;

error = class_device_add_groups(class_dev);
if (error)
- goto out8;
+ goto err_del_attrs;

kobject_uevent(&class_dev->kobj, KOBJ_ADD);

@@ -621,30 +621,26 @@ int class_device_add(struct class_device
}
up(&parent_class->sem);

- goto out1;
+ return 0;

- out8:
- if (class_dev->dev)
- sysfs_remove_link(&class_dev->kobj, class_name);
- out7:
- if (class_dev->dev)
- sysfs_remove_link(&class_dev->kobj, "device");
- out6:
+ err_del_attrs:
class_device_remove_attrs(class_dev);
- out5:
+ err_del_devt_attr:
if (class_dev->devt_attr)
class_device_remove_file(class_dev, class_dev->devt_attr);
- out4:
+ err_del_uevent_attr:
class_device_remove_file(class_dev, &class_dev->uevent_attr);
- out3:
+ err_del_class_link:
+ sysfs_remove_link(&parent_class->subsys.kset.kobj, class_dev->class_id);
+ err_del_subsys_link:
+ sysfs_remove_link(&class_dev->kobj, "subsystem");
+ err_del_kobject:
kobject_del(&class_dev->kobj);
- out2:
- if(parent_class_dev)
- class_device_put(parent_class_dev);
+ err_put_parents:
+ class_device_put(parent_class_dev);
+ put_device(parent_dev);
class_put(parent_class);
- out1:
- class_device_put(class_dev);
- kfree(class_name);
+
return error;
}

@@ -718,7 +714,8 @@ error:
void class_device_del(struct class_device *class_dev)
{
struct class *parent_class = class_dev->class;
- struct class_device *parent_device = class_dev->parent;
+ struct class_device *parent_class_device = class_dev->parent;
+ struct device *parent_device = class_dev->dev;
struct class_interface *class_intf;
char *class_name = NULL;

@@ -731,12 +728,8 @@ void class_device_del(struct class_devic
up(&parent_class->sem);
}

- if (class_dev->dev) {
- class_name = make_class_name(class_dev->class->name,
- &class_dev->kobj);
- sysfs_remove_link(&class_dev->kobj, "device");
- sysfs_remove_link(&class_dev->dev->kobj, class_name);
- }
+ sysfs_remove_link(&parent_class->subsys.kset.kobj,
+ class_dev->class_id);
sysfs_remove_link(&class_dev->kobj, "subsystem");
class_device_remove_file(class_dev, &class_dev->uevent_attr);
if (class_dev->devt_attr)
@@ -747,7 +740,8 @@ void class_device_del(struct class_devic
kobject_uevent(&class_dev->kobj, KOBJ_REMOVE);
kobject_del(&class_dev->kobj);

- class_device_put(parent_device);
+ put_device(parent_device);
+ class_device_put(parent_class_device);
class_put(parent_class);
kfree(class_name);
}
@@ -788,36 +782,30 @@ void class_device_destroy(struct class *

int class_device_rename(struct class_device *class_dev, char *new_name)
{
- int error = 0;
- char *old_class_name = NULL, *new_class_name = NULL;
-
- class_dev = class_device_get(class_dev);
- if (!class_dev)
- return -EINVAL;
+ int error;
+ char *old_name;

pr_debug("CLASS: renaming '%s' to '%s'\n", class_dev->class_id,
new_name);

- if (class_dev->dev)
- old_class_name = make_class_name(class_dev->class->name,
- &class_dev->kobj);
+ old_name = kstrdup(class_dev->class_id, GFP_KERNEL);
+ if (!old_name)
+ return -ENOMEM;

strlcpy(class_dev->class_id, new_name, KOBJ_NAME_LEN);

error = kobject_rename(&class_dev->kobj, new_name);
-
- if (class_dev->dev) {
- new_class_name = make_class_name(class_dev->class->name,
- &class_dev->kobj);
- sysfs_create_link(&class_dev->dev->kobj, &class_dev->kobj,
- new_class_name);
- sysfs_remove_link(&class_dev->dev->kobj, old_class_name);
+ if (error) {
+ strlcpy(class_dev->class_id, old_name, KOBJ_NAME_LEN);
+ goto out;
}
- class_device_put(class_dev);

- kfree(old_class_name);
- kfree(new_class_name);
+ sysfs_create_link(&class_dev->class->subsys.kset.kobj,
+ &class_dev->kobj, new_name);
+ sysfs_remove_link(&class_dev->class->subsys.kset.kobj, old_name);

+ out:
+ kfree(old_name);
return error;
}

@@ -877,8 +865,6 @@ void class_interface_unregister(struct c
class_put(parent);
}

-
-
int __init classes_init(void)
{
int retval;
@@ -892,6 +878,12 @@ int __init classes_init(void)
subsystem_init(&class_obj_subsys);
if (!class_obj_subsys.kset.subsys)
class_obj_subsys.kset.subsys = &class_obj_subsys;
+
+ retval = device_register(&virtual_device);
+ if (retval)
+ printk(KERN_ERR "Failed to register virtual device, err: %d\n",
+ retval);
+
return 0;
}
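
The patch replaces the numbered labels out1..out8 with labels named after what they undo. A minimal userspace sketch of that unwinding pattern follows. struct demo_obj, demo_obj_add(), demo_obj_del() and the fail_step knob are hypothetical and exist only for illustration; they are not part of the driver core.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical object with three setup steps; not a kernel structure. */
struct demo_obj {
	char *name;
	int *table;
	int registered;
};

/* `fail_step` is a knob for this sketch only: 0 means no failure,
 * 1..3 force the corresponding step to fail. */
int demo_obj_add(struct demo_obj *obj, int fail_step)
{
	int error;

	assert(obj != NULL);
	obj->name = NULL;
	obj->table = NULL;
	obj->registered = 0;

	/* Step 1: nothing acquired yet, so a plain return is enough. */
	obj->name = (fail_step == 1) ? NULL : malloc(16);
	if (!obj->name)
		return -ENOMEM;

	/* Step 2: on failure, undo step 1 only. */
	obj->table = (fail_step == 2) ? NULL : calloc(8, sizeof(*obj->table));
	if (!obj->table) {
		error = -ENOMEM;
		goto err_free_name;
	}

	/* Step 3: on failure, undo step 2 and then step 1. */
	if (fail_step == 3) {
		error = -EINVAL;
		goto err_free_table;
	}
	obj->registered = 1;
	return 0;

err_free_table:
	free(obj->table);
	obj->table = NULL;
err_free_name:
	free(obj->name);
	obj->name = NULL;
	return error;
}

/* Tear down in the reverse order of demo_obj_add(). */
void demo_obj_del(struct demo_obj *obj)
{
	assert(obj != NULL);
	obj->registered = 0;
	free(obj->table);
	obj->table = NULL;
	free(obj->name);
	obj->name = NULL;
}
```

Each label names the cleanup it starts, so adding a new setup step means adding one label in the matching position rather than renumbering the existing ones.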

2006-09-20 02:30:47

by Greg KH

Subject: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1

On Wed, Sep 20, 2006 at 12:56:57AM +0200, Rafael J. Wysocki wrote:
> On Wednesday, 20 September 2006 00:30, Greg KH wrote:
> > On Tue, Sep 19, 2006 at 03:06:29PM -0700, David Miller wrote:
> > > From: "Rafael J. Wysocki" <[email protected]>
> > > Date: Wed, 20 Sep 2006 00:06:52 +0200
> > >
> > > > I _guess_ the problem is caused by
> > > > gregkh-driver-network-class_device-to-device.patch, but I can't verify this,
> > > > because the kernel (obviously) doesn't compile if I revert it.
> > >
> > > Indeed.
> > >
> > > I thought we threw this patch out because we knew it would cause
> > > problems for existing systems? I do remember Greg making an argument
> > > as to why we needed the change, but that doesn't make breaking people's
> > > systems legitimate in any way.
> >
> > It's now thrown out, and I think Andrew already had a patch in his tree
> > that reverted this.
> >
> > I'll be bringing it back eventually, but first we are going to work out
> > all the kinks by probably putting these changes in the next few SuSE
> > alpha releases to see what shakes out in userspace that we need to go
> > fix.
> >
> > It's not 2.6.19 material at all, so don't worry :)
>
> Please note, however, that by including such changes in -mm we cause _other_
> things to go untested.
>
> For example, if I can't install a new kernel and use it on my system without
> replacing some other pieces of software, I just won't be using it, because I
> have no time for playing with udev, hal, powersaved, acpid, ...
> Then, if there are any bugs in it that would have shown up on my system,
> we won't know about them unless they show up on someone else's system,
> which may not happen.
>
> The more changes that break existing setups there are in -mm, the fewer
> people will actually try to use -mm kernels and that will result in buggier
> -rc kernels and more bugs propagating to the "stable" ones. Do we really
> want that to happen?

When it comes back, I will have updates for all versions of broken udev
packages so that it will not break older distros. Then it will be able
to be tested.

thanks,

greg k-h

2006-09-20 12:11:49

by Mike Galbraith

Subject: Re: 2.6.18-rc7-mm1

On Tue, 2006-09-19 at 13:36 -0700, Andrew Morton wrote:
> On Tue, 19 Sep 2006 22:25:21 +0200
> "Rafael J. Wysocki" <[email protected]> wrote:
>
> > > - It took maybe ten hours solid work to get this dogpile vaguely
> > > compiling and limping to a login prompt on x86, x86_64 and powerpc.
> > > I guess it's worth briefly testing if you're keen.
> >
> > It's not that bad, but unfortunately the networking doesn't work on my system
> > (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit). Apparently, the interfaces don't
> > get configured (both tg3 and bcm43xx are affected).
>
> Is there anything interesting in the dmesg output?
>
> Perhaps an `strace -f ifup' or whatever would tell us what's failing.

FYI, it's SuSE's /sbin/getcfg binary that doesn't like the changes. It
sees /sys/class/net/eth0 as a symlink, and reels off into /sys/block (?)
looking for a directory.

lstat64("/sys/class/net/eth0", {st_dev=makedev(0, 0), st_ino=5968, st_mode=S_IFLNK|0777, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:13, st_mtime=2006/09/20-13:58:57, st_ctime=2006/09/20-13:58:57}) = 0
lstat64("/sys/block/eth0", 0xbf9e432c) = -1 ENOENT (No such file or directory)
open("/proc/mounts", O_RDONLY) = 3
fstat64(3, {st_dev=makedev(0, 3), st_ino=22711, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-14:00:35, st_mtime=2006/09/20-14:00:35, st_ctime=2006/09/20-14:00:35}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f59000
read(3, "rootfs / rootfs rw 0 0\nudev /dev"..., 4096) = 601
close(3) = 0
munmap(0xb7f59000, 4096) = 0
lstat64("/sys/block", {st_dev=makedev(0, 0), st_ino=256, st_mode=S_IFDIR|0755, st_nlink=18, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-14:00:17, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64("/sys/block", {st_dev=makedev(0, 0), st_ino=256, st_mode=S_IFDIR|0755, st_nlink=18, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-14:00:17, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
open("/dev/null", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = -1 ENOTDIR (Not a directory)
open("/sys/block", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_dev=makedev(0, 0), st_ino=256, st_mode=S_IFDIR|0755, st_nlink=18, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-14:00:17, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
getdents64(3, {{d_ino=256, d_off=1, d_type=DT_DIR, d_reclen=24, d_name="."} {d_ino=1, d_off=2, d_type=DT_DIR, d_reclen=24, d_name=".."} {d_ino=11521, d_off=3, d_type=DT_DIR, d_reclen=24, d_name="sde"} {d_ino=11455, d_off=4, d_type=DT_DIR, d_reclen=24, d_name="sdd"} {d_ino=11416, d_off=5, d_type=DT_DIR, d_reclen=24, d_name="sdc"} {d_ino=11358, d_off=6, d_type=DT_DIR, d_reclen=24, d_name="sdb"} {d_ino=11311, d_off=7, d_type=DT_DIR, d_reclen=24, d_name="sda"} {d_ino=1784, d_off=8, d_type=DT_DIR, d_reclen=24, d_name="hdd"} {d_ino=1770, d_off=9, d_type=DT_DIR, d_reclen=24, d_name="hdc"} {d_ino=1757, d_off=10, d_type=DT_DIR, d_reclen=24, d_name="hda"} {d_ino=1725, d_off=11, d_type=DT_DIR, d_reclen=32, d_name="loop7"} {d_ino=1722, d_off=12, d_type=DT_DIR, d_reclen=32, d_name="loop6"} {d_ino=1719, d_off=13, d_type=DT_DIR, d_reclen=32, d_name="loop5"} {d_ino=1716, d_off=14, d_type=DT_DIR, d_reclen=32, d_name="loop4"} {d_ino=1713, d_off=15, d_type=DT_DIR, d_reclen=32, d_name="loop3"} {d_ino=1710, d_off=16, d_type=DT_DIR, d_reclen=32, d_name="loop2"} {d_ino=1707, d_off=17, d_type=DT_DIR, d_reclen=32, d_name="loop1"} {d_ino=1704, d_off=18, d_type=DT_DIR, d_reclen=32, d_name="loop0"}}, 4096) = 496
lstat64("/sys/block/sde", {st_dev=makedev(0, 0), st_ino=11521, st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64("/sys/block/sde", {st_dev=makedev(0, 0), st_ino=11521, st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64("/sys/block/sdd", {st_dev=makedev(0, 0), st_ino=11455, st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64("/sys/block/sdd", {st_dev=makedev(0, 0), st_ino=11455, st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64("/sys/block/sdc", {st_dev=makedev(0, 0), st_ino=11416, st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64("/sys/block/sdc", {st_dev=makedev(0, 0), st_ino=11416, st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
lstat64("/sys/block/sdb", {st_dev=makedev(0, 0), st_ino=11358, st_mode=S_IFDIR|0755, st_nlink=5, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2006/09/20-13:59:14, st_mtime=2006/09/20-13:58:59, st_ctime=2006/09/20-13:58:59}) = 0
... fruitless search


2006-09-20 13:15:18

by Rafael J. Wysocki

Subject: Re: 2.6.18-rc7-mm1

On Wednesday, 20 September 2006 16:23, Mike Galbraith wrote:
> On Tue, 2006-09-19 at 13:36 -0700, Andrew Morton wrote:
> > On Tue, 19 Sep 2006 22:25:21 +0200
> > "Rafael J. Wysocki" <[email protected]> wrote:
> >
> > > > - It took maybe ten hours solid work to get this dogpile vaguely
> > > > compiling and limping to a login prompt on x86, x86_64 and powerpc.
> > > > I guess it's worth briefly testing if you're keen.
> > >
> > > It's not that bad, but unfortunately the networking doesn't work on my system
> > > (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit). Apparently, the interfaces don't
> > > get configured (both tg3 and bcm43xx are affected).
> >
> > Is there anything interesting in the dmesg output?
> >
> > Perhaps an `strace -f ifup' or whatever would tell us what's failing.
>
> FYI, it`s SuSE`s /sbin/getcfg binary that doesn't like the changes. It
> sees /sys/class/net/eth0 as a symlink, and reels off into sys/block (?)
> looking for a directory.

I have filed a report in the SUSE bugzilla. Let's see what happens.

Greetings,
Rafael


--
You never change things by fighting the existing reality.
R. Buckminster Fuller

2006-09-21 09:45:16

by Andi Kleen

Subject: Re: 2.6.18-rc7-mm1

On Wednesday 20 September 2006 16:23, Mike Galbraith wrote:
> On Tue, 2006-09-19 at 13:36 -0700, Andrew Morton wrote:
> > On Tue, 19 Sep 2006 22:25:21 +0200
> > "Rafael J. Wysocki" <[email protected]> wrote:
> >
> > > > - It took maybe ten hours solid work to get this dogpile vaguely
> > > > compiling and limping to a login prompt on x86, x86_64 and powerpc.
> > > > I guess it's worth briefly testing if you're keen.
> > >
> > > It's not that bad, but unfortunately the networking doesn't work on my system
> > > (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit). Apparently, the interfaces don't
> > > get configured (both tg3 and bcm43xx are affected).
> >
> > Is there anything interesting in the dmesg output?
> >
> > Perhaps an `strace -f ifup' or whatever would tell us what's failing.
>
> FYI, it`s SuSE`s /sbin/getcfg binary that doesn't like the changes. It
> sees /sys/class/net/eth0 as a symlink, and reels off into sys/block (?)
> looking for a directory.

It's a known problem. It's actually libsysfs' fault, which somehow manages
to not support symlinks properly. Unfortunately getcfg made the mistake of
using libsysfs instead of accessing /sys directly.

-Andi

2006-09-21 12:55:55

by Andy Whitcroft

Subject: Re: 2.6.18-rc7-mm1

Andrew Morton wrote:
> On Tue, 19 Sep 2006 07:45:06 -0700
> "Martin J. Bligh" <[email protected]> wrote:
>
>>> - It took maybe ten hours solid work to get this dogpile vaguely
>>> compiling and limping to a login prompt on x86, x86_64 and powerpc.
>>> I guess it's worth briefly testing if you're keen.
>> PPC64 blades shit themselves in a strange way. Possibly the udev
>> breakage you mentioned? Hard to tell really if people are going to
>> go around breaking userspace compatibility ;-(
>
> What version of udev is it running?

Ok, this is not a blade, but a ppc lpar. It's running the following
version of udev:

udevinfo, version 021_bk

(Assuming of course the help for udevinfo -V is not lying when it says
"-V print udev version".)

>> http://test.kernel.org/abat/48127/debug/console.log
>>
>> ..
>>
>> sda: Write Protect is off
>> sda: cache data unavailable
>> sda: assuming drive cache: write through
>> SCSI device sda: 143374000 512-byte hdwr sectors (73407 MB)
>> sda: Write Protect is off
>> sda: cache data unavailable
>> sda: assuming drive cache: write through
>> sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
>> sd 0:0:1:0: Attached scsi disk sda
>> creating device nodes .[: [0-9]*: bad number

I was assuming this message was from udev? I can't find it in the
kernel anyhow. Might just be noise on the initrd.

>> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>> [: [0-9]*: bad number
>> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>> [: [0-9]*: bad number
>> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>> [: [0-9]*: bad number
>> 0:0:1:0: sg_io failed status 0x8 0x0 0x0 0x2
>> 0:0:1:0: sense key 0x5 ASC 0x24 ASCQ 0x0
>> [: [0-9]*: bad number
>>
>>
>
> That all looks rather bad.
>
>> ReiserFS: sda2: Using r5 hash to sort names
>> looking for init ...
>> found /sbin/init
>> /init: cannot open .//dev//console: no such file
>
> Bizarrely-formed pathname. Does it always do that?
>
>> Kernel panic - not syncing: Attempted to kill init!
>> <0>Rebooting in 180 seconds..-- 0:conmux-control -- time-stamp --
>> Sep/19/06 4:18:52 --
>> (bot:conmon-payload) disconnected
>
> Has udev actually attempted to do anything by this stage?
>
> I wasn't seeing anything that spectacular. It used to be the case that
> udev simply hung. But in rc7-mm1 the symptoms are that incoming ssh
> sessions hang, but most other things work OK.
>
> Oh well - Greg has split that tree apart and I shall not be pulling the
> more problematic bits henceforth.

/me pins his hopes on rc7-mm2.

-apw

2006-09-21 13:12:12

by Andy Whitcroft

Subject: Re: 2.6.18-rc7-mm1 -- ppc64 crash in slab_node ??

Hmmm seeing this on a ppc64 lpar.

PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour dummy device 80x25
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
freeing bootmem node 0
freeing bootmem node 1
Memory: 2042288k/2097152k available (5752k kernel code, 55392k reserved,
1456k data, 875k bss, 252k init)
Unable to handle kernel paging request for data at address 0x00000004
Faulting instruction address: 0xc0000000000bc830
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA
Modules linked in:
NIP: C0000000000BC830 LR: C0000000000C7DF4 CTR: 0000000000000000
REGS: c00000000070f990 TRAP: 0300 Not tainted (2.6.18-rc7-mm1-autokern1)
MSR: 8000000000001032 <ME,IR,DR> CR: 24004022 XER: 0000000B
DAR: 0000000000000004, DSISR: 0000000040000000
TASK = c0000000005c0900[0] 'swapper' THREAD: c00000000070c000 CPU: 0
GPR00: C0000000000C80DC C00000000070FC10 C00000000070B1A0 0000000000000000
GPR04: 00000000000000D0 0000000000000000 0000000000000000 0000000000000042
GPR08: 0000000000000000 C0000000005C0900 0000000000000000 C00000007FFF3800
GPR12: 0000000024004022 C0000000005C1480 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 4000000001C00000
GPR20: 0000000000000000 0000000000000000 0000000000141000 C0000000004DA2D0
GPR24: 000000000199FB40 0000000000000000 0000000000042000 C0000000005F31A8
GPR28: C0000000005F31A8 C0000000007416E8 C0000000005FCC60 00000000000000D0
NIP [C0000000000BC830] .slab_node+0x10/0x78
LR [C0000000000C7DF4] .fallback_alloc+0x3c/0x100
Call Trace:
[C00000000070FC10] [8000000000001032] 0x8000000000001032 (unreliable)
[C00000000070FCB0] [C0000000000C80DC] .kmem_cache_zalloc+0x128/0x150
[C00000000070FD50] [C0000000000C90BC] .kmem_cache_create+0x2a0/0x6ac
[C00000000070FE30] [C00000000057BF90] .kmem_cache_init+0x1b4/0x4f8
[C00000000070FEF0] [C00000000055F7BC] .start_kernel+0x214/0x33c
[C00000000070FF90] [C0000000000084F4] .start_here_common+0x50/0x5c
Instruction dump:
7fc3f378 60000000 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020
fbc1fff0 ebc2ce20 60000000 60000000 <a8030004> 2f800002 419e0038 2c800001
<0>Kernel panic - not syncing: Attempted to kill the idle task!

Given all the problems with -mm1 I'm not sure how hard to search for this.
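
For readers decoding the oops above: the bracketed word in the instruction dump, a8030004, is the faulting instruction. The following is a minimal sketch of splitting a PowerPC D-form instruction word into its fields; the struct and function names are made up for illustration. Primary opcode 42 is lha (load halfword algebraic), so the word reads as lha r0,4(r3). With GPR03 shown as zero in the register dump, that is a 2-byte load from address 4, which matches the reported DAR.

```c
#include <stdint.h>

/* Fields of a PowerPC D-form instruction (hypothetical helper type). */
struct dform {
    unsigned opcode;  /* primary opcode, top 6 bits */
    unsigned rt;      /* target register */
    unsigned ra;      /* base register */
    int disp;         /* sign-extended 16-bit displacement */
};

/* Split a 32-bit instruction word into its D-form fields. */
struct dform decode_dform(uint32_t insn)
{
    struct dform f;

    f.opcode = insn >> 26;
    f.rt = (insn >> 21) & 0x1f;
    f.ra = (insn >> 16) & 0x1f;
    f.disp = (int16_t)(insn & 0xffff);
    return f;
}
```

Calling decode_dform(0xa8030004) gives opcode 42, rt 0, ra 3 and displacement 4.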

-apw

2006-09-21 13:41:00

by Ian Kent

Subject: Re: 2.6.18-rc7-mm1

On Tue, 2006-09-19 at 01:28 -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/
>

> - NFS automounts of subdirectories remain unfixed.

I'm guessing this one is mine.
Should have first pass patch tomorrow.

Sorry for the delay.

Ian


2006-09-21 17:28:31

by Andrew Morton

Subject: Re: 2.6.18-rc7-mm1 -- ppc64 crash in slab_node ??

On Thu, 21 Sep 2006 14:11:48 +0100
Andy Whitcroft <[email protected]> wrote:

> Hmmm seeing this on a ppc64 lpar.
>
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Console: colour dummy device 80x25
> Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
> freeing bootmem node 0
> freeing bootmem node 1
> Memory: 2042288k/2097152k available (5752k kernel code, 55392k reserved,
> 1456k data, 875k bss, 252k init)
> Unable to handle kernel paging request for data at address 0x00000004
> Faulting instruction address: 0xc0000000000bc830
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=128 NUMA
> Modules linked in:
> NIP: C0000000000BC830 LR: C0000000000C7DF4 CTR: 0000000000000000
> REGS: c00000000070f990 TRAP: 0300 Not tainted (2.6.18-rc7-mm1-autokern1)
> MSR: 8000000000001032 <ME,IR,DR> CR: 24004022 XER: 0000000B
> DAR: 0000000000000004, DSISR: 0000000040000000
> TASK = c0000000005c0900[0] 'swapper' THREAD: c00000000070c000 CPU: 0
> GPR00: C0000000000C80DC C00000000070FC10 C00000000070B1A0 0000000000000000
> GPR04: 00000000000000D0 0000000000000000 0000000000000000 0000000000000042
> GPR08: 0000000000000000 C0000000005C0900 0000000000000000 C00000007FFF3800
> GPR12: 0000000024004022 C0000000005C1480 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 4000000001C00000
> GPR20: 0000000000000000 0000000000000000 0000000000141000 C0000000004DA2D0
> GPR24: 000000000199FB40 0000000000000000 0000000000042000 C0000000005F31A8
> GPR28: C0000000005F31A8 C0000000007416E8 C0000000005FCC60 00000000000000D0
> NIP [C0000000000BC830] .slab_node+0x10/0x78
> LR [C0000000000C7DF4] .fallback_alloc+0x3c/0x100
> Call Trace:
> [C00000000070FC10] [8000000000001032] 0x8000000000001032 (unreliable)
> [C00000000070FCB0] [C0000000000C80DC] .kmem_cache_zalloc+0x128/0x150
> [C00000000070FD50] [C0000000000C90BC] .kmem_cache_create+0x2a0/0x6ac
> [C00000000070FE30] [C00000000057BF90] .kmem_cache_init+0x1b4/0x4f8
> [C00000000070FEF0] [C00000000055F7BC] .start_kernel+0x214/0x33c
> [C00000000070FF90] [C0000000000084F4] .start_here_common+0x50/0x5c
> Instruction dump:
> 7fc3f378 60000000 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020
> fbc1fff0 ebc2ce20 60000000 60000000 <a8030004> 2f800002 419e0038 2c800001
> <0>Kernel panic - not syncing: Attempted to kill the idle task!
>
> Given all the problems with -mm1 I'm not sure how hard to search for this.
>

I guess the below will fix it. But Christoph's machine would have oopsed
too, if it had called fallback_alloc() this early. So presumably it did
not. But yours does. I wonder why?


diff -puN mm/slab.c~gfp_thisnode-for-the-slab-allocator-v2-fix-2 mm/slab.c
--- a/mm/slab.c~gfp_thisnode-for-the-slab-allocator-v2-fix-2
+++ a/mm/slab.c
@@ -3103,17 +3103,21 @@ static void *alternate_node_alloc(struct

/*
* Fallback function if there was no memory available and no objects on a
- * certain node and we are allowed to fall back. We mimick the behavior of
+ * certain node and we are allowed to fall back. We mimic the behavior of
* the page allocator. We fall back according to a zonelist determined by
* the policy layer while obeying cpuset constraints.
*/
void *fallback_alloc(struct kmem_cache *cache, gfp_t flags)
{
- struct zonelist *zonelist = &NODE_DATA(slab_node(current->mempolicy))
- ->node_zonelists[gfp_zone(flags)];
+ struct zonelist *zonelist;
struct zone **z;
void *obj = NULL;

+ if (!current->mempolicy)
+ return NULL;
+
+ zonelist = &NODE_DATA(slab_node(current->mempolicy))
+ ->node_zonelists[gfp_zone(flags)];
for (z = zonelist->zones; *z && !obj; z++)
if (zone_idx(*z) <= ZONE_NORMAL &&
cpuset_zone_allowed(*z, flags))
_

2006-09-21 18:02:49

by Christoph Lameter

Subject: Re: 2.6.18-rc7-mm1 -- ppc64 crash in slab_node ??

On Thu, 21 Sep 2006, Andrew Morton wrote:

> I guess the below will fix it. But Christoph's machine would have oopsed
> too, if it had called fallback_alloc() this early. So presumably it did
> not. But yours does. I wonder why?

Hmmm... Fallback during boot? Any zones that have no ZONE_NORMAL memory?

The right fix though is to check for a NULL memory policy in slab_node.
This is the way other mempol functions behave.

Signed-off-by: Christoph Lameter <[email protected]>

Index: linux-2.6.18-rc7-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.18-rc7-mm1.orig/mm/mempolicy.c 2006-09-19 09:27:03.000000000 -0500
+++ linux-2.6.18-rc7-mm1/mm/mempolicy.c 2006-09-21 12:59:09.385528424 -0500
@@ -1136,7 +1136,9 @@ static unsigned interleave_nodes(struct
*/
unsigned slab_node(struct mempolicy *policy)
{
- switch (policy->policy) {
+ int pol = policy ? policy->policy : MPOL_DEFAULT;
+
+ switch (pol) {
case MPOL_INTERLEAVE:
return interleave_nodes(policy);
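
The idea in the patch above, treating a NULL policy as the default policy inside the function rather than in every caller, can be sketched in user space as follows. This is a simplified model, not the kernel code: struct mempolicy_model, slab_node_model and the local_node parameter are invented for the example, and the enum values are illustrative only.

```c
#include <stddef.h>

/* Policy kinds, named after the kernel's MPOL_* constants
 * (values here are illustrative only). */
enum { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND, MPOL_INTERLEAVE };

/* Cut-down, hypothetical stand-in for struct mempolicy. */
struct mempolicy_model {
    short policy;
    int preferred_node;
};

/* Pick a node for a slab allocation; a NULL policy means "default". */
unsigned slab_node_model(const struct mempolicy_model *pol,
                         unsigned local_node)
{
    /* The NULL check happens once, before any dereference. */
    int kind = pol ? pol->policy : MPOL_DEFAULT;

    switch (kind) {
    case MPOL_PREFERRED:
        /* Only reached with a non-NULL policy. */
        if (pol->preferred_node >= 0)
            return (unsigned)pol->preferred_node;
        return local_node;
    default:
        return local_node;
    }
}
```

With this shape, an early-boot caller that has no policy yet simply gets the local node back instead of faulting.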

2006-09-21 18:04:01

by Andy Whitcroft

Subject: Re: 2.6.18-rc7-mm1 -- ppc64 crash in slab_node ??

Andrew Morton wrote:
> On Thu, 21 Sep 2006 14:11:48 +0100
> Andy Whitcroft <[email protected]> wrote:
>
>> Hmmm seeing this on a ppc64 lpar.
>>
>> PID hash table entries: 4096 (order: 12, 32768 bytes)
>> Console: colour dummy device 80x25
>> Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
>> Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
>> freeing bootmem node 0
>> freeing bootmem node 1
>> Memory: 2042288k/2097152k available (5752k kernel code, 55392k reserved,
>> 1456k data, 875k bss, 252k init)
>> Unable to handle kernel paging request for data at address 0x00000004
>> Faulting instruction address: 0xc0000000000bc830
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> SMP NR_CPUS=128 NUMA
>> Modules linked in:
>> NIP: C0000000000BC830 LR: C0000000000C7DF4 CTR: 0000000000000000
>> REGS: c00000000070f990 TRAP: 0300 Not tainted (2.6.18-rc7-mm1-autokern1)
>> MSR: 8000000000001032 <ME,IR,DR> CR: 24004022 XER: 0000000B
>> DAR: 0000000000000004, DSISR: 0000000040000000
>> TASK = c0000000005c0900[0] 'swapper' THREAD: c00000000070c000 CPU: 0
>> GPR00: C0000000000C80DC C00000000070FC10 C00000000070B1A0 0000000000000000
>> GPR04: 00000000000000D0 0000000000000000 0000000000000000 0000000000000042
>> GPR08: 0000000000000000 C0000000005C0900 0000000000000000 C00000007FFF3800
>> GPR12: 0000000024004022 C0000000005C1480 0000000000000000 0000000000000000
>> GPR16: 0000000000000000 0000000000000000 0000000000000000 4000000001C00000
>> GPR20: 0000000000000000 0000000000000000 0000000000141000 C0000000004DA2D0
>> GPR24: 000000000199FB40 0000000000000000 0000000000042000 C0000000005F31A8
>> GPR28: C0000000005F31A8 C0000000007416E8 C0000000005FCC60 00000000000000D0
>> NIP [C0000000000BC830] .slab_node+0x10/0x78
>> LR [C0000000000C7DF4] .fallback_alloc+0x3c/0x100
>> Call Trace:
>> [C00000000070FC10] [8000000000001032] 0x8000000000001032 (unreliable)
>> [C00000000070FCB0] [C0000000000C80DC] .kmem_cache_zalloc+0x128/0x150
>> [C00000000070FD50] [C0000000000C90BC] .kmem_cache_create+0x2a0/0x6ac
>> [C00000000070FE30] [C00000000057BF90] .kmem_cache_init+0x1b4/0x4f8
>> [C00000000070FEF0] [C00000000055F7BC] .start_kernel+0x214/0x33c
>> [C00000000070FF90] [C0000000000084F4] .start_here_common+0x50/0x5c
>> Instruction dump:
>> 7fc3f378 60000000 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020
>> fbc1fff0 ebc2ce20 60000000 60000000 <a8030004> 2f800002 419e0038 2c800001
>> <0>Kernel panic - not syncing: Attempted to kill the idle task!
>>
>> Given all the problems with -mm1 I'm not sure how hard to search for this.
>>
>
> I guess the below will fix it. But Christoph's machine would have oopsed
> too, if it had called fallback_alloc() this early. So presumably it did
> not. But yours does. I wonder why?

Thanks I'll push it into the testing system and see what happens.

The following at least feels suspicious to my mind. This appears to say
that this machine has most of its memory in node 1. I am pretty sure
that this machine is in fact a single node lpar and shouldn't be NUMA at all.

early_node_map[3] active PFN ranges
1: 0 -> 32768
0: 32768 -> 40960
1: 40960 -> 524288

If I am doing the math right this machine only has 32MB in node 0.
Yeah, according to the system we have one node of 32MB with both CPUs
in it, and another node with no CPUs with the rest of its 2GB of RAM.

# cat /sys/devices/system/node/node*/*
00000000,00000000,00000000,00000003
10 20

Node 0 MemTotal: 32768 kB
[...]
00000000,00000000,00000000,00000000
20 10

Node 1 MemTotal: 2064384 kB
[...]
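
The arithmetic above can be checked with a small sketch that sums the pages of each node from the early_node_map ranges. It assumes 4 kB pages, which is an assumption but is consistent with the MemTotal figures shown; the struct and function names are made up for the example.

```c
#include <stddef.h>

/* One early_node_map entry: node id and a [start, end) PFN range. */
struct pfn_range {
    int nid;
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/* Total memory of node 'nid' in kB, given the page size in kB. */
unsigned long node_kb(const struct pfn_range *map, size_t n,
                      int nid, unsigned long page_kb)
{
    unsigned long pages = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (map[i].nid == nid)
            pages += map[i].end_pfn - map[i].start_pfn;

    return pages * page_kb;
}
```

Feeding in the three ranges above with page_kb = 4 gives 32768 kB for node 0 (8192 pages) and 2064384 kB for node 1 (516096 pages), the same numbers the node MemTotal lines report.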

I'll have a look at it tomorrow and see if I can figure out what's wrong
with the layout.

:/

-apw

2006-09-21 18:08:22

by Andy Whitcroft

Subject: Re: 2.6.18-rc7-mm1 -- ppc64 crash in slab_node ??

Christoph Lameter wrote:
> On Thu, 21 Sep 2006, Andrew Morton wrote:
>
>> I guess the below will fix it. But Christoph's machine would have oopsed
>> too, if it had called fallback_alloc() this early. So presumably it did
>> not. But yours does. I wonder why?
>
> Hmmm... Fallback during boot? Any zones that have no ZONE_NORMAL memory?

I think there is some kind of memory layout issue with the machine (see
my reply to akpm), which I'll look into tomorrow. But as the machine
is tripping this bug, I'll throw this patch at it also.

-apw

2006-09-21 21:07:42

by Mark Haverkamp

Subject: Re: 2.6.18-rc7-mm1

On Tue, 2006-09-19 at 01:28 -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/
>

>
> +rename-the-provided-execve-functions-to-kernel_execve-headers-fix.patch
>
> Fix rename-the-provided-execve-functions-to-kernel_execve.patch some more
>

While running cross compile tests for prpmc750 we got the following
error that I think is related to the above patches.


UPD include/linux/compile.h
CC init/version.o
LD init/built-in.o
LD vmlinux
init/built-in.o: In function `run_init_process':
main.c:(.text+0x20): undefined reference to `kernel_execve'
init/built-in.o: In function `do_linuxrc':
do_mounts_initrd.c:(.init.text+0x3a98): undefined reference to `kernel_execve'
arch/ppc/kernel/built-in.o: In function `execve':
arch/ppc/kernel/entry.S:(.text+0x24a2): undefined reference to `errno'
arch/ppc/kernel/entry.S:(.text+0x24a6): undefined reference to `errno'
kernel/built-in.o: In function `____call_usermodehelper':
kmod.c:(.text+0x163f8): undefined reference to `kernel_execve'
make: [vmlinux] Error 1 (ignored)
SYSMAP System.map
powerpc-750-linux-gnu-nm: 'vmlinux': No such file
make: [vmlinux] Error 1 (ignored)
MODPOST vmlinux


Here is a patch that fixes the compile errors. I took the code from
misc_32.S.

Signed-off-by: Mark Haverkamp <[email protected]>

---

--- linux-2.6.17.orig/arch/ppc/kernel/misc.S 2006-09-21 08:43:08.000000000 -0700
+++ linux-2.6.17/arch/ppc/kernel/misc.S 2006-09-21 12:48:56.000000000 -0700
@@ -1030,20 +1030,16 @@
addi r1,r1,16
blr

+_GLOBAL(kernel_execve)
+ li r0,__NR_execve
+ sc
+ bnslr
+ neg r3,r3
+ blr
+
/*
* This routine is just here to keep GCC happy - sigh...
*/
_GLOBAL(__main)
blr

-#define SYSCALL(name) \
-_GLOBAL(name) \
- li r0,__NR_##name; \
- sc; \
- bnslr; \
- lis r4,errno@ha; \
- stw r3,errno@l(r4); \
- li r3,-1; \
- blr
-
-SYSCALL(execve)
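
The difference between the removed SYSCALL() wrapper and the new kernel_execve can be modelled in C. On PowerPC the sc instruction reports failure by setting the summary-overflow bit in CR0 and leaving a positive error code in r3. The old wrapper stored that code in errno and returned -1; the new one negates it and returns it directly, so no errno symbol is needed. The function below is a hypothetical model of the new behaviour only, with the register and flag passed in as plain arguments.

```c
/*
 * Model of the value kernel_execve hands back, given the contents of
 * r3 and the state of CR0.SO after the 'sc' instruction.
 */
long kernel_execve_result(long r3, int so_set)
{
    if (!so_set)    /* bnslr: summary overflow clear, return r3 as is */
        return r3;

    return -r3;     /* neg r3,r3: positive error code becomes -errno */
}
```

For example, an r3 of 2 with the overflow bit set comes back as -2, the usual in-kernel negative errno convention.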

--
Mark Haverkamp <[email protected]>

2006-09-21 22:19:55

by Arnd Bergmann

Subject: Re: 2.6.18-rc7-mm1

On Thursday 21 September 2006 23:07, Mark Haverkamp wrote:
> Here is a patch that fixes the compile errors. I took the code from
> misc_32.S.
>
> Signed-off-by: Mark Haverkamp <[email protected]>

Acked-by: Arnd Bergmann <[email protected]>

I must have missed this one because I went through all architectures
that have an include/asm-*/unistd.h file, which ppc no longer has
since the consolidation of arch/powerpc.

Arnd <><

2006-09-22 11:47:37

by Cédric Le Goater

Subject: [PATCH -mm] x86_64 mm generic getcpu syscall fix

Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/

While working on a new syscall, I noticed that the getcpu
patch x86_64-mm-generic-getcpu-syscall.patch does not increase
NR_syscalls. Shouldn't it?

C.

Signed-off-by: Cedric Le Goater <[email protected]>
---
include/asm-i386/unistd.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: 2.6.18-rc7-mm1/include/asm-i386/unistd.h
===================================================================
--- 2.6.18-rc7-mm1.orig/include/asm-i386/unistd.h
+++ 2.6.18-rc7-mm1/include/asm-i386/unistd.h
@@ -327,7 +327,7 @@

#ifdef __KERNEL__

-#define NR_syscalls 318
+#define NR_syscalls 319
#include <linux/err.h>

/*
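
Why the stale limit matters can be shown with a small model of the bounds check made on syscall entry. This is a sketch under two assumptions: that the new syscall takes number 318, as the one-line change above implies, and that the entry path rejects any number at or above NR_syscalls with -ENOSYS. The function name is made up for the example.

```c
#include <errno.h>

/*
 * Hypothetical model of the entry-path check: a syscall number is only
 * dispatched if it is strictly below the table size.
 */
long check_syscall_nr(unsigned long nr, unsigned long nr_syscalls)
{
    return nr < nr_syscalls ? 0 : -ENOSYS;
}
```

With the limit left at 318, a call using number 318 is refused even though a table entry exists for it; raising the limit to 319 lets it through.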

2006-09-22 12:32:43

by Andi Kleen

Subject: Re: [PATCH -mm] x86_64 mm generic getcpu syscall fix

On Friday 22 September 2006 13:47, Cedric Le Goater wrote:
> Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/
>
> while working on a new syscall, i've noticed that the getcpu
> patch x86_64-mm-generic-getcpu-syscall.patch does not increase
> NR_syscalls. shouldn't it ?

Yes. Fixed thanks.

-Andi

2006-09-23 11:04:08

by Valdis Klētnieks

Subject: Re: 2.6.18-rc7-mm1 - gregkh-driver-pcmcia-device.patch breaks orinoco card

On Tue, 19 Sep 2006 01:28:48 PDT, Andrew Morton said:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/

> +gregkh-driver-pcmcia-device.patch

This one breaks the orinoco wireless card on my Dell Latitude C840. Oddly
enough, it *doesn't* break the ethernet on a Xircom multi-function card I
also have. I tried it both with and without CONFIG_SYSFS_DEPRECATED.
Userspace udev is 095 as shipped in Fedora Core 6 test3.

(slot 0 is a Xircom 10/100/56k modem card, slot 2 is the Orinoco-based
Dell TruMobile 1150 card)

Under 2.6.18-rc6-mm2, I see:

pccard: CardBus card inserted into slot 0
PCI: Enabling device 0000:03:00.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
PCI: Setting latency timer of device 0000:03:00.0 to 64
eth2: Xircom cardbus revision 3 at irq 11
PCI: Enabling device 0000:03:00.1 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.1[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
0000:03:00.1: ttyS1 at I/O 0xe080 (irq = 11) is a 16550A
pccard: PCMCIA card inserted into slot 2
[rename_device:851]: Changing netdevice name from [eth1] to [eth3]
ohci1394: fw-host0: AT dma reset ctx=0, aborting transmission
ieee1394: Current remote IRM is not 1394a-2000 compliant, resetting...
ieee1394: Host added: ID:BUS[0-00:1023] GUID[374fc0002a71c021]
[rename_device:1237]: Changing netdevice name from [eth2] to [eth1]
cs: memory probe 0xf4000000-0xfbffffff: excluding 0xf4000000-0xf8ffffff 0xfa000000-0xfbffffff
pcmcia: registering new device pcmcia2.0
orinoco 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)
orinoco_cs 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)
pcmcia: request for exclusive IRQ could not be fulfilled.
pcmcia: the driver needs updating to supported shared IRQ lines.
cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
cs: IO port probe 0x3e0-0x4ff: clean.
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
cs: IO port probe 0x3e0-0x4ff: clean.
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
cs: IO port probe 0x3e0-0x4ff: clean.
cs: IO port probe 0x820-0x8ff: clean.
cs: IO port probe 0xc00-0xcf7: clean.
cs: IO port probe 0xa00-0xaff: clean.
eth2: Hardware identity 0005:0004:0005:0000
eth2: Station identity 001f:0001:0008:000a
eth2: Firmware determined as Lucent/Agere 8.10
eth2: Ad-hoc demo mode supported
eth2: IEEE standard IBSS ad-hoc mode supported
eth2: WEP supported, 104-bit key
eth2: MAC address 00:02:2D:5C:11:48
eth2: Station name "HERMES I"
eth2: ready
eth2: orinoco_cs at 2.0, irq 11, io 0xe100-0xe13f
[rename_device:1295]: Changing netdevice name from [eth2] to [eth5]
Non-volatile memory driver v1.2

and under -rc7-mm1, I see:

pccard: CardBus card inserted into slot 0
PCI: Enabling device 0000:03:00.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
PCI: Setting latency timer of device 0000:03:00.0 to 64
eth1: Xircom cardbus revision 3 at irq 11
PCI: Enabling device 0000:03:00.1 (0000 -> 0003)
ACPI: PCI Interrupt 0000:03:00.1[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
0000:03:00.1: ttyS1 at I/O 0xe080 (irq = 11) is a 16550A
pccard: PCMCIA card inserted into slot 2
ohci1394: fw-host0: AT dma reset ctx=0, aborting transmission
ieee1394: Current remote IRM is not 1394a-2000 compliant, resetting...
ieee1394: Host added: ID:BUS[0-00:1023] GUID[374fc0002a71c021]
Non-volatile memory driver v1.2

Hmm.. a lot quieter...

Incidentally, this bisection took about 8 more compiles than it should
have because of compile-time breakage right in that section of the 'series'
file. To bisect through it, I had to apply the following tweaks to the
series file:

--- patches/series.orig 2006-09-21 15:06:13.000000000 -0400
+++ patches/series 2006-09-23 00:18:54.000000000 -0400
@@ -104,6 +104,8 @@
gregkh-driver-v4l-dev2-handle-__must_check.patch
gregkh-driver-drivers-base-platform-notify-needs-to-occur-before-drivers-attach-to-the-device.patch
gregkh-driver-drivers-base-check-errors.patch
+gregkh-driver-network-class_device-to-device.patch
+gregkh-driver-class_device_rename-remove.patch
gregkh-driver-sysfs-add-proper-sysfs_init-prototype.patch
gregkh-driver-config_sysfs_deprecated.patch
gregkh-driver-udev-devices.patch
@@ -123,12 +125,11 @@
gregkh-driver-input-device.patch
gregkh-driver-firmware-device.patch
gregkh-driver-fb-device.patch
+gregkh-driver-fb-device-fixes.patch
gregkh-driver-usb-move-usb_device_class-class-devices-to-be-real-devices.patch
gregkh-driver-usb-convert-usb-class-devices-to-real-devices.patch
gregkh-driver-driver-multithread.patch
gregkh-driver-pci-multithreaded-probe.patch
-gregkh-driver-network-class_device-to-device.patch
-gregkh-driver-class_device_rename-remove.patch
gregkh-driver-put_device-might_sleep.patch
gregkh-driver-sysfs-crash-debugging.patch
gregkh-driver-kobject-warn.patch
@@ -140,7 +141,6 @@
gregkh-driver-input-device-more-fixes.patch
gregkh-driver-input-device-even-more-fixes.patch
gregkh-driver-input-device-even-more-fixes-2.patch
-gregkh-driver-fb-device-fixes.patch
more-driver-tree-fixes.patch
#drivers-base-check-errors.patch
#fix-device_attribute-memory-leak-in-device_del.patch



2006-09-27 04:30:34

by Greg KH

Subject: Re: 2.6.18-rc7-mm1 - gregkh-driver-pcmcia-device.patch breaks orinoco card

On Sat, Sep 23, 2006 at 07:03:26AM -0400, [email protected] wrote:
> orinoco 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)
> orinoco_cs 0.15 (David Gibson <[email protected]>, Pavel Roskin <[email protected]>, et al)
> pcmcia: request for exclusive IRQ could not be fulfilled.
> pcmcia: the driver needs updating to supported shared IRQ lines.
> cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
> cs: IO port probe 0x3e0-0x4ff: clean.
> cs: IO port probe 0x820-0x8ff: clean.
> cs: IO port probe 0xc00-0xcf7: clean.
> cs: IO port probe 0xa00-0xaff: clean.
> cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
> cs: IO port probe 0x3e0-0x4ff: clean.
> cs: IO port probe 0x820-0x8ff: clean.
> cs: IO port probe 0xc00-0xcf7: clean.
> cs: IO port probe 0xa00-0xaff: clean.
> cs: IO port probe 0x100-0x3af: excluding 0x370-0x37f
> cs: IO port probe 0x3e0-0x4ff: clean.
> cs: IO port probe 0x820-0x8ff: clean.
> cs: IO port probe 0xc00-0xcf7: clean.
> cs: IO port probe 0xa00-0xaff: clean.
> eth2: Hardware identity 0005:0004:0005:0000
> eth2: Station identity 001f:0001:0008:000a
> eth2: Firmware determined as Lucent/Agere 8.10
> eth2: Ad-hoc demo mode supported
> eth2: IEEE standard IBSS ad-hoc mode supported
> eth2: WEP supported, 104-bit key
> eth2: MAC address 00:02:2D:5C:11:48
> eth2: Station name "HERMES I"
> eth2: ready
> eth2: orinoco_cs at 2.0, irq 11, io 0xe100-0xe13f
> [rename_device:1295]: Changing netdevice name from [eth2] to [eth5]
> Non-volatile memory driver v1.2
>
> and under -rc7-mm1, I see:
>
> pccard: CardBus card inserted into slot 0
> PCI: Enabling device 0000:03:00.0 (0000 -> 0003)
> ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
> PCI: Setting latency timer of device 0000:03:00.0 to 64
> eth1: Xircom cardbus revision 3 at irq 11
> PCI: Enabling device 0000:03:00.1 (0000 -> 0003)
> ACPI: PCI Interrupt 0000:03:00.1[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
> 0000:03:00.1: ttyS1 at I/O 0xe080 (irq = 11) is a 16550A
> pccard: PCMCIA card inserted into slot 2
> ohci1394: fw-host0: AT dma reset ctx=0, aborting transmission
> ieee1394: Current remote IRM is not 1394a-2000 compliant, resetting...
> ieee1394: Host added: ID:BUS[0-00:1023] GUID[374fc0002a71c021]
> Non-volatile memory driver v1.2
>
> Hmm.. a lot quieter...

So, you have a pcmcia or cardbus card here? I've tried this with a
cardbus card and it worked fine.

Are you sure you have the latest userspace tools? I didn't think that cs
was needed anymore, but again, without a pcmcia device to test this
with, I really am not sure :(

thanks,

greg k-h

2006-10-04 05:10:23

by Greg KH

Subject: Re: [-mm patch] missing class_dev to dev conversions

On Tue, Sep 19, 2006 at 05:39:03PM +0000, Frederik Deweerdt wrote:
> On Tue, Sep 19, 2006 at 01:28:48AM -0700, Andrew Morton wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/
> >
> Greg,
>
> There are some net drivers that didn't get their class_device converted to
> device, as introduced by the gregkh-driver-network-class_device-to-device
> patch.
> The arm defconfig build thus fails with the following message:
>
> drivers/net/smc91x.c: In function `smc_ethtool_getdrvinfo':
> drivers/net/smc91x.c:1713: error: structure has no member named
> `class_dev'
> make[2]: *** [drivers/net/smc91x.o] Error 1
> make[1]: *** [drivers/net] Error 2
> make: *** [drivers] Error 2
>
> The following patch fixes at91_ether.c, etherh.c, smc911x.c and smc91x.c.

Thanks a lot, I've merged this in with the original patch that caused
this problem.

greg k-h

2006-10-07 20:32:59

by Greg KH

Subject: Re: 2.6.18-rc7-mm1

On Thu, Sep 21, 2006 at 01:55:17PM +0100, Andy Whitcroft wrote:
> Andrew Morton wrote:
> > On Tue, 19 Sep 2006 07:45:06 -0700
> > "Martin J. Bligh" <[email protected]> wrote:
> >
> >>> - It took maybe ten hours solid work to get this dogpile vaguely
> >>> compiling and limping to a login prompt on x86, x86_64 and powerpc.
> >>> I guess it's worth briefly testing if you're keen.
> >> PPC64 blades shit themselves in a strange way. Possibly the udev
> >> breakage you mentioned? Hard to tell really if people are going to
> >> go around breaking userspace compatibility ;-(
> >
> > What version of udev is it running?
>
> Ok, this is not a blade, but a ppc lpar. It's running the following
> version of udev:
>
> udevinfo, version 021_bk
>
> (Assuming of course the help for udevinfo -V is not lying when it says
> "-V print udev version".)

What distro shipped 021_bk for a version of udev? What is running on
this machine?

(yeah, I know this is an old message, but I'm trying to fix up the udev
issues right now...)

thanks,

greg k-h

2006-10-09 12:31:42

by Andy Whitcroft

Subject: Re: 2.6.18-rc7-mm1

Greg KH wrote:
> On Thu, Sep 21, 2006 at 01:55:17PM +0100, Andy Whitcroft wrote:
>> Andrew Morton wrote:
>>> On Tue, 19 Sep 2006 07:45:06 -0700
>>> "Martin J. Bligh" <[email protected]> wrote:
>>>
>>>>> - It took maybe ten hours solid work to get this dogpile vaguely
>>>>> compiling and limping to a login prompt on x86, x86_64 and powerpc.
>>>>> I guess it's worth briefly testing if you're keen.
>>>> PPC64 blades shit themselves in a strange way. Possibly the udev
>>>> breakage you mentioned? Hard to tell really if people are going to
>>>> go around breaking userspace compatibility ;-(
>>> What version of udev is it running?
>> Ok, this is not a blade, but a ppc lpar. It's running the following
>> version of udev:
>>
>> udevinfo, version 021_bk
>>
>> (Assuming of course the help for udevinfo -V is not lying when it says
>> "-V print udev version".)
>
> What distro shipped 021_bk for a version of udev? What is running on
> this machine?
>
> (yeah, I know this is an old message, but I'm trying to fix up the udev
> issues right now...)

Hmmm. The machine claims to be running SuSE. We have it recorded as
SLES9, but I actually can't find any way to tell from the machine which
actual release thereof it is.

This version of ud
gekko-lp1:~ # udevinfo -V
udevinfo, version 021_bk
gekko-lp1:~ # rpm -qa | grep udev
udev-021-36.32

This seems to be the correct version for the first GA of SLES9.

-apw

2006-10-09 16:09:55

by Greg KH

Subject: Re: 2.6.18-rc7-mm1

On Mon, Oct 09, 2006 at 01:31:04PM +0100, Andy Whitcroft wrote:
> Greg KH wrote:
> > On Thu, Sep 21, 2006 at 01:55:17PM +0100, Andy Whitcroft wrote:
> >> Andrew Morton wrote:
> >>> On Tue, 19 Sep 2006 07:45:06 -0700
> >>> "Martin J. Bligh" <[email protected]> wrote:
> >>>
> >>>>> - It took maybe ten hours solid work to get this dogpile vaguely
> >>>>> compiling and limping to a login prompt on x86, x86_64 and powerpc.
> >>>>> I guess it's worth briefly testing if you're keen.
> >>>> PPC64 blades shit themselves in a strange way. Possibly the udev
> >>>> breakage you mentioned? Hard to tell really if people are going to
> >>>> go around breaking userspace compatibility ;-(
> >>> What version of udev is it running?
> >> Ok, this is not a blade, but a ppc lpar. It's running the following
> >> version of udev:
> >>
> >> udevinfo, version 021_bk
> >>
> >> (Assuming of course the help for udevinfo -V is not lying when it says
> >> "-V print udev version".)
> >
> > What distro shipped 021_bk for a version of udev? What is running on
> > this machine?
> >
> > (yeah, I know this is an old message, but I'm trying to fix up the udev
> > issues right now...)
>
> Hmmm. The machine claims to be running SuSE. We have it recorded as
> SLES9, but I actually can't find any way to tell from the machine which
> actual release thereof it is.
>
> This version of ud
> gekko-lp1:~ # udevinfo -V
> udevinfo, version 021_bk
> gekko-lp1:~ # rpm -qa | grep udev
> udev-021-36.32
>
> This seems to be the correct version for the first GA of SLES9.

Ah, ok, thanks. But udev on SLES9 does not actually control /dev, only
/dev/disk/, which is not used for booting in any manner, so you should
not have any issues with that old udev.

thanks,

greg k-h