So it's almost getting to be a habit: yet another -rc release that is
delayed by a couple of days.
However, this time I actually have an excuse. Well, an excuse other
than my usual "Sue me, I'm disorganized and forgot".
This time the reason for the delay is that we spent several days
chasing down a nasty floating point state corruption that happens on
32-bit x86 - but only if you have a modern CPU (why are you using
32-bit kernels?) that supports the AES-NI instructions. And then you
have to enable support for them *and* use a wireless driver that uses
them. The most likely reason for that is using the mac80211
infrastructure with WPA with AES encryption (i.e. usually WPA2).
Anyway, if you are using wireless networking, have a modern CPU that
you cripple by running in 32-bit mode, and have seen odd FP-related
crashes (the usual symptoms seem to be flash problems in the
browser or the mouse in X suddenly moving to a corner or similar, but
anything goes, really), that might be due to this.
The workaround is either to just compile a 64-bit kernel (hey, you can
leave your 32-bit userland alone if you are so emotionally attached to
last century) or upgrade to 3.3-rc4.
Sure, we'll backport the patches to -stable too for the boring people
who don't want to help test development kernels. But wouldn't it be
nice to have the bug fixed *and* feel like you are helping development
by testing shiny new -rc kernels?
You know you want to.
Anyway, while I'm just blathering about the FPU state fix (because
that was what *I* was doing), other people worked on other things too.
So -rc4 also does various other things, like remove the old stale
gma500 staging driver that got properly merged. Or remove the old
pohmelfs staging code, since that will get a complete rewrite.
Or if you're not interested in the big patches, but in the small ones
that actually matter to you, try to find them in the shortlog below.
There are various network driver updates, some ecryptfs and xfs
updates, ARM fixes, sha512 stack usage fixes, yadda yadda. Something
for almost everyone, I'm sure.
Go forth and test.
Linus
---
Adrian Hunter (1):
mmc: sdhci-pci: set Medfield SDIO as non-removable
Al Viro (1):
CONFIG_TR/CONFIG_LLC: work around the problem with select
Alex Deucher (2):
drm/radeon/kms/atom: bios scratch reg handling updates
drm/radeon/kms: fix MSI re-arm on rv370+
Alexander Duyck (1):
ixgbe: Fix broken dependency on MAX_SKB_FRAGS being related to page size
Alexey Dobriyan (1):
crypto: sha512 - use standard ror64()
Ameya Palande (1):
MAINTAINERS: staging: iio: add iio information
Amitkumar Karwar (2):
mwifiex: handle association failure case correctly
mwifiex: add NULL checks in driver unload path
Andrew Lunn (2):
ARM: orion: Fix Orion5x GPIO regression from MPP cleanup
ARM: orion: Fix USB phy for orion5x.
Anisse Astier (1):
net: Fix build regression when INET_UDP_DIAG=y and IPV6=m
Anton Blanchard (1):
powerpc/perf: power_pmu_start restores incorrect values,
breaking frequency events
Anton Vorontsov (2):
Revert "bq27x00_battery: Fix reporting status value for bq27500 battery"
staging: android/ram_console: Don't build on arches w/o ioremap
Arun Sharma (1):
net: Disambiguate kernel message
Arve Hjønnevåg (3):
Staging: android: binder: Don't call dump_stack in binder_vma_open
Staging: android: binder: Fix crashes when sharing a binder file
between processes
staging: android: lowmemorykiller: Don't wait more than one
second for a process to die
Asai Thambi S P (1):
mtip32xx: removed the irrelevant argument of mtip_hw_submit_io()
and the unused member of struct driver_data
Atsushi Nemoto (1):
net: enable TC35815 for MIPS again
Axel Lin (6):
power_supply: Fix modalias for charger-manager
lp8727_charger: Add terminating entry for i2c_device_id table
mmc: cb710 core: Add missing spin_lock_init for irq_lock of
struct cb710_chip
regulator: Fix getting voltage in max8649_enable_time()
mlx4: Fix kcalloc parameters swapped
RxRPC: Fix kcalloc parameters swapped
Bart Westgeest (1):
staging: usbip: fix to prevent potentially using uninitialized spinlock
Ben Hutchings (1):
ethtool: Null-terminate filename passed to ethtool_ops::flash_device
Benjamin Herrenschmidt (5):
powerpc/wsp: Permanently enable PCI class code workaround
powerpc/wsp: Fix IRQ affinity setting
powerpc: Fix WARN_ON in decrementer_check_overflow
powerpc/fsl/pci: Fix PCIe fixup regression
powerpc: Disable interrupts early in Program Check
Benoit Cousson (1):
ARM: OMAP2+: board-generic: Add missing handle_irq callbacks
Brian King (1):
powerpc/pseries: Fix partition migration hang in stop_topology_update
Chris Wilson (1):
drm/i915:: Disable FBC on SandyBridge
Christoph Hellwig (1):
xfs: use a normal shrinker for the dquot freelist
Cong Wang (3):
usb: musb: fix a build error on mips
tty: fix a build failure on sparc
ecryptfs: remove the second argument of k[un]map_atomic()
Cousson, Benoit (1):
ks8851: Fix NOHZ local_softirq_pending 08 warning
Dan Carpenter (4):
cdrom: use copy_to_user() without the underscores
isdn: type bug in isdn_net_header()
bna: fix error handling of bnad_get_flash_partition_by_offset()
relay: prevent integer overflow in relay_open()
Dan Magenheimer (2):
zcache: fix deadlock condition
zcache: Set SWIZ_BITS to 8 to reduce tmem bucket lock contention.
Daniel T Chen (1):
ALSA: intel8x0: Fix default inaudible sound on Gateway M520
Daniel Vetter (2):
drm/i915: fixup interlaced bits clearing in PIPECONF on PCH_SPLIT
drm/i915: no lvds quirk for AOpen MP45
Danny Kukawka (2):
vmw_balloon: fix for a -Wuninitialized warning
cs5535-mfgpt: don't call __init function from __devinit
Dave Airlie (1):
drm/radeon/kms: drop lock in return path of radeon_fence_count_emitted.
Dave Young (2):
loop: zero fill bio instead of return -EIO for partial read
module: make module param bint handle nul value
David Howells (1):
Reduce the number of expensive division instructions done by
_parse_integer()
David Lv (1):
via-velocity: S3 resume fix.
David Miller (1):
regulator: Fix mc13xxx regulator modular build (again)
David S. Miller (1):
net: Make qdisc_skb_cb upper size bound explicit.
Dean Nelson (1):
e1000: add dropped DMA receive enable back in for WoL
Dmitry Tarnyagin (1):
caif: Bugfix double kfree_skb upon xmit failure
Don Skidmore (1):
ixgbe: update copyright to 2012
Eliad Peller (1):
mac80211: timeout a single frame in the rx reorder buffer
Emmanuel Grumbach (1):
iwlwifi: don't mess up QoS counters with non-QoS frames
Eric Dumazet (4):
gro: more generic L2 header check
bnx2x: fix bnx2x_storm_stats_update() on big endian
netpoll: netpoll_poll_dev() should access dev->flags
3c59x: shorten timer period for slave devices
Eugenia Emantayev (6):
mlx4_core: fix memory leak at multi_func_cleanup
mlx4_core: use correct flag for unicast_promisc
mlx4_core: use correct port for steering
mlx4: fix buffer overrun
mlx4: fix QP tree trashing
mlx4: add unicast steering entries to resource_tracker
Evgeniy Polyakov (1):
staging: pohmelfs: remove drivers/staging/pohmelfs
Fabio Estevam (2):
drivers: misc: Remove MISC_DEVICES config option
usb: host: Distinguish Kconfig text for Freescale controllers
Felix Fietkau (2):
ath9k: fix a WEP crypto related regression
ath9k_hw: fix a RTS/CTS timeout regression
Florian Fainelli (5):
cpmac: fix PHY name to match MDIO bus name
bcm63xx-enet: fix PHY name to match MDIO bus name
fec: fix PHY name to match fixed MDIO bus name
octeon: fix PHY name to match MDIO bus name
ixp4xx-eth: fix PHY name to match MDIO bus name
Francesco Virlinzi (1):
stmmac: request_irq when use an ext wake irq line (v2)
Girish K S (2):
mmc: core: Fix low speed mmc card detection failure
mmc: core: Fix PowerOff Notify suspend/resume
Giuseppe CAVALLARO (3):
stmmac: do not discard frame on dribbling bit assert
stmmac: move hw init in the probe (v2)
stmmac: update the driver version to Feb 2012 (v2)
Grazvydas Ignotas (1):
bq27x00_battery: Fix flag register read
Greg Kroah-Hartman (3):
driver core: cpu: remove kernel warning when removing a cpu
staging: delete gma500 driver
driver-core: cpu: fix kobject warning when hotplugging a cpu
Greg Rose (6):
ixgbe: Add warning when no space left for more MAC filters
ixgbevf: Fix mailbox interrupt ack bug
ixgbevf: Update copyright notices
igb: fix vf lookup
ixgbe: fix vf lookup
ixgbe: Fix case of Tx Hang in PF with 32 VFs
Guennadi Liakhovetski (2):
mmc: tmio_mmc: fix card eject during IO with DMA
mmc: sh_mmcif: fix late delayed work initialisation
Guenter Roeck (1):
hwmon: (w83627ehf) Remove duplicate code
H Hartley Sweeten (1):
ep93xx: fix build of vision_ep93xx.c
Haiyang Zhang (2):
net/hyperv: Use netif_tx_disable() instead of netif_stop_queue()
when necessary
net/hyperv: Fix the page buffer when an RNDIS message goes
beyond page boundary
Hauke Mehrtens (1):
ssb: fix cardbus slot in hostmode
Heiko Stuebner (1):
ARM: SAMSUNG: Fix missing api-change from subsys_interface change
Henrik Rydberg (1):
bcma: don't fail for bad SPROM CRC
Herbert Xu (2):
crypto: sha512 - Use binary and instead of modulus
crypto: sha512 - Avoid stack bloat on i386
Igor Grinberg (1):
ARM: OMAP3: cm-t35: fix section mismatch warning
Ira Snyder (1):
powerpc: Fix kernel log of oops/panic instruction dump
Jaehoon Chung (1):
mmc: core: add the capability for broken voltage
Jan Beulich (1):
xenbus_dev: add missing error check to watch handling
Jan Weitzel (1):
net/ethernet: ks8851_mll fix irq handling
Jason Wang (1):
tcp: properly initialize tcp memory limits
Jayachandran C (1):
usb: Skip PCI USB quirk handling for Netlogic XLP
Jean-Christophe PLAGNIOL-VILLARD (5):
ARM: at91:rtc/rtc-at91sam9: ioremap register bank
ARM: at91: add accessor to manage SMC
pata/at91: use newly introduced SMC accessors
ARM: at91: drop ide driver in favor of the pata one
mmc: of_mmc_spi: fix little endian support
Jeff Layton (3):
cifs: fix error handling when cifscreds key payload is an error
cifs: request oplock when doing open on lookup
cifs: don't return error from standard_receive3 after marking
response malformed
Jerry Huang (2):
mmc: esdhc: add PIO mode support
mmc: esdhc: set the timeout to the max value
Jesper Juhl (2):
bcma: Fix mem leak in bcma_bus_scan()
bnx2x: Fix mem leak in bnx2x_tpa_stop() if build_skb() fails.
Jiri Olsa (2):
perf tools: Fix perf stack to non executable on x86_64
perf tools: Fix prefix matching for kernel maps
John Fastabend (2):
ixgbe: dcb: up2tc mapping lost on disable/enable CEE DCB state
ixgbe: ethtool: stats user buffer overrun
John W. Linville (2):
iwlwifi: make "Tx aggregation enabled on ra =" be at DEBUG level
ath9k: use WARN_ON_ONCE in ath_rc_get_highest_rix
Jonghwan Choi (1):
ARM: EXYNOS: Fix "warning: initialization from incompatible pointer type"
Julia Lawall (3):
drivers/net/ethernet/ti: Move call to PTR_ERR after reassignment
drivers/gpu/drm/drm_ioc32.c: initialize all fields
drivers/staging/omapdrm/omap_fbdev.c: move free after uses
Julian Anastasov (1):
ipv4: reset flowi parameters on route connect
Jurgen Heeks (1):
mmc: core: Fix comparison issue in mmc_compare_ext_csds
Karol Lewandowski (1):
ARM: EXYNOS: Bring exynos4-dt up to date
Keith Packard (2):
drm/i915: Force explicit bpp selection for intel_dp_link_required
drm/i915: fixup interlaced bits clearing in PIPECONF on PCH_SPLIT (v2)
Kent Overstreet (1):
bio: don't overflow in bio_get_nr_vecs()
Kim, Milo (1):
lp8727_chager: Fix permissions on a header file
Konrad Rzeszutek Wilk (3):
xen/bootup: During bootup suppress XENBUS: Unable to read cpu state
xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling
while atomic.
xen/pci[front|back]: Use %d instead of %1x for displaying PCI devfn.
Kukjin Kim (4):
ARM: EXYNOS: Remove build warning without enabling PM
ARM: S5PV210: Fix the name of exynos4_clk_hdmiphy_ctrl() for S5PV210
serial: samsung: Add support for EXYNOS4212 and EXYNOS4412
serial: samsung: Add support for EXYNOS5250
Kuninori Morimoto (3):
usb: ch9.h: usb_endpoint_maxp() uses __le16_to_cpu()
usb: ch9.h: usb_endpoint_maxp() uses __le16_to_cpu()
ASoC: fsi: fixup fsi_pointer() calculation method
Larry Finger (3):
staging: r8712u: Add new Sitecom UsB ID
staging: r8712u: Fix problem when CONFIG_R8712_AP is set
staging: r8712u: Use asynchronous firmware loading
Lars-Peter Clausen (1):
regmap: Fix cache defaults initialization from raw cache defaults
Li Wei (1):
ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr.
Linus Torvalds (11):
i387: math_state_restore() isn't called from asm
i387: make irq_fpu_usable() tests more robust
i387: fix sense of sanity check
i387: fix x86-64 preemption-unsafe user stack save/restore
i387: move TS_USEDFPU clearing out of __save_init_fpu and into callers
i387: don't ever touch TS_USEDFPU directly, use helper functions
i387: do not preload FPU state at task switch time
i387: move AMD K7/K8 fpu fxsave/fxrstor workaround from save to restore
i387: move TS_USEDFPU flag from thread_info to task_struct
i387: re-introduce FPU state preloading at context switch time
Linux 3.3-rc4
Linus Walleij (1):
pinctrl: restore pin naming
Ludovic Desroches (1):
mmc: atmel-mci: save and restore sdioirq when soft reset is performed
Luigi Tarenga (1):
rt2800lib: fix wrong -128dBm when signal is stronger than -12dBm
Marc Dietrich (2):
ARM: tegra: paz00: fix wrong SD1 power gpio
ARM: tegra: paz00: fix wrong UART port on mini-pcie plug
Marc Zyngier (1):
ARM: 7320/1: Fix proc_info table alignment
Marek Szyprowski (1):
ARM: EXYNOS: fix non-SMP builds for EXYNOS4
Mark Brown (2):
ARM: S3C64XX: Make s3c64xx_init_uarts() static
ARM: S3C6410: Use device names for both I2C clocks
Masanari Iida (1):
ixgbe: Fix typo in ixgbe_common.h
Matthijs Kooijman (1):
drm/radeon: do not continue after error from r600_ib_test
Michael Ellerman (1):
powerpc/powernv: Disable interrupts while taking phb->lock
Michal Schmidt (1):
bnx2x: remove the 'poll' module option
Milan Kocian (1):
USB: usbserial: add new PID number (0xa951) to the ftdi driver
Mitch A Williams (1):
igbvf: change copyright date
Mitsuo Hayasaka (1):
xfs: pass KM_SLEEP flag to kmem_realloc() in
xlog_recover_add_to_cnt_trans()
Mohammed Shafi Shajakhan (2):
ath9k: Fix kernel panic during driver initilization
mac80211: Fix a rwlock bad magic bug
Naveen N. Rao (1):
perf evsel: Fix an issue where perf report fails to show the
proper percentage
Neal Cardwell (3):
tcp: allow tcp_sacktag_one() to tag ranges not aligned with skbs
tcp: fix range tcp_shifted_skb() passes to tcp_sacktag_one()
tcp: fix tcp_shifted_skb() adjustment of lost_cnt_hint for FACK
Neil Horman (4):
netprio_cgroup: Fix obo in get_prioidx
netprio_cgroup: fix an off-by-one bug
netprio_cgroup: don't allocate prio table when a device is registered
netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=m
Neil Zhang (2):
usb: otg: mv_otg: Add dependence
usb: otg: mv_otg: Add dependence
Nicolas Ferre (1):
ARM: at91: USB AT91 gadget registration for module
Nikolaus Schulz (4):
hwmon: (f75375s) Fix automatic pwm mode setting for F75373 & F75375
hwmon: (f75375s) Fix reading of wrong register when initializing
the F75387
hwmon: (f75375s) Fix bit shifting in f75375_write16
hwmon: (f75375s) Let f75375_update_device treat pwmX as a measured value
Olof Johansson (1):
ARM: tegra: dma: fix buildbreak for !CONFIG_TEGRA_SYSTEM_DMA
Omar Ramirez Luna (2):
staging: tidspbridge: fix bridge_open memory leaks
staging: tidspbridge: fix incorrect free to drv_datap
Ondrej Zary (1):
module: fix broken isapnp handling in file2alias
Paolo Bonzini (1):
cdrom: move shared static to cdrom_device_info
Paul Gortmaker (2):
c2port: fix build error for duramar2150 due to missing header.
m32r: relocate drivers back out of 8250 dir
Paul Walmsley (4):
tty: serial: OMAP: use a 1-byte RX FIFO threshold in PIO mode
tty: serial: OMAP: block idle while the UART is transferring
data in PIO mode
tty: serial: omap-serial: wakeup latency constraint is in
microseconds, not milliseconds
ARM: OMAP2xxx: PM: fix OMAP2xxx-specific UART idle bug in v3.3
Pekka Paalanen (2):
Staging: asus_oled: fix image processing
Staging: asus_oled: fix NULL-ptr crash on unloading
Philip Rakity (1):
mmc: core: UHS sdio card that fails should not exceed 50MHz
Rabin Vincent (2):
backing-dev: fix wakeup timer races with bdi_unregister()
mmc: block: Init ro_lock sysfs attr to fix lockdep warnings
Randy Dunlap (3):
uwb & wusb & usb wireless controllers: fix kconfig error & build errors
docbook: fix fatal errors in device-drivers docbook and add DMA
Management section
staging: fix go7007-usb license
Richard Zhao (1):
net: fec: correct phy_name buffer length when init phy_name
Rob Clark (7):
staging: drm/omap: drm API update: make fops struct const
staging: drm/omap: drm API update: addfb2
staging: drm/omap: add drm_plane support
staging: drm/omap: multiplanar and YUV support
staging: drm/omap: updates for DSS fifomerge API changes
staging: drm/omap: fix minimum width/height
staging: drm/omap: fix locking issue
Roland Dreier (1):
IPoIB: Stop lying about hard_header_len and use skb->cb to stash
LL addresses
Roy Zang (1):
mmc: esdhc: fix errors when booting kernel on Freescale eSDHC version 2.3
Rui li (1):
USB: add new zte 3g-dongle's pid to option.c
Russell King (15):
ARM: omap: fix oops in arch/arm/mach-omap2/vp.c when pmic is not found
ARM: omap: fix oops in drivers/video/omap2/dss/dpi.c
ARM: omap: fix broken twl-core dependencies and ifdefs
ARM: omap: fix prm44xx.c OMAP44XX_IRQ_PRCM build error
ARM: omap: fix vc.c PMIC error message
ARM: omap: fix uninformative vc/i2c configuration error message
ARM: omap: fix section mismatch errors in TWL PMIC driver
ARM: omap: fix section mismatch warning in mux.c
ARM: omap: preemptively fix section mismatch in
omap4_sdp4430_wifi_mux_init()
ARM: omap: fix section mismatch warning for omap_secondary_startup()
ARM: omap: fix section mismatch error for omap_4430sdp_display_init()
ARM: omap: fix section mismatch warning for sdp3430_twl_gpio_setup()
ARM: omap: fix section mismatch warnings in mux.c caused by hsmmc.c
ARM: omap: fix wrapped error messages in omap_hwmod.c
ARM: omap: resolve nebulous 'Error setting wl12xx data'
Samuel Thibault (1):
drivers/tty/vt/vt_ioctl.c: fix KDFONTOP 32bit compatibility layer
Santosh Shilimkar (1):
ARM: OMAP2: Fix the OMAP2 only build break seen with 2011+ ARM tool-chains
Sebastian Haas (1):
can: ems_usb: Removed double netif_device_detach
Seth Jennings (1):
staging: zcache: fix serialization bug in zv stats
Seungwon Jeon (1):
mmc: dw_mmc: Fix PIO mode with support of highmem
Shaohua Li (3):
block,cfq: change code order
block: fix NULL icq_cache reference
block: fix ioc locking warning
Shawn Lu (1):
tcp_v4_send_reset: binding oif to iif in no sock case
Shengzhou Liu (1):
powerpc/usb: fix issue of CPU halt when missing USB PHY clock
Shuah Khan (1):
Staging: android: Remove pmem driver
Shubhrajyoti Datta (1):
i2c: tegra: Add devexit_p() for remove
Simon Graham (1):
rtlwifi: Modify rtl_pci_init to return 0 on success
Srikar Dronamraju (1):
powerpc: Implement GET_IP/SET_IP
Stanislaw Gruszka (1):
bsg: fix sysfs link remove warning
Stefano Stabellini (1):
xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
Stephane Eranian (2):
perf: Remove deprecated WARN_ON_ONCE()
perf: Fix double start/stop in x86_pmu_start()
Stephane Grosjean (1):
can: peak_pci: Fix the way channels are linked together
Stephen Boyd (2):
ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR
ARM: 7322/1: Print BUG instead of undefined instruction on BUG_ON()
Stephen Hemminger (1):
ixgbe: make ethtool strings table const
Stephen Rothwell (1):
powerpc: Remove legacy iSeries from ppc64_defconfig
Sujit Reddy Thumma (1):
mmc: core: Ensure clocks are always enabled before host interaction
Sylwester Nawrocki (3):
ARM: SAMSUNG: Fix platform data setup for I2C adapter 0
ARM: EXYNOS: Correct framebuffer window size on Nuri board
ARM: EXYNOS: Correct M-5MOLS sensor clock frequency on Universal
C210 board
Takashi Iwai (3):
ALSA: hda - Fix mute-LED VREF value for new HP laptops
ALSA: hda - Fix initialization of secondary capture source on VT1705
ALSA: hda - Fix silent speaker output on Acer Aspire 6935
Tejun Heo (4):
block: strip out locking optimization in put_io_context()
block: separate out blk_rq_merge_ok() and blk_try_merge() from
elevator functions
block: don't call elevator callbacks for plug merges
block: fix lockdep warning on io_context release put_io_context()
Thadeu Lima de Souza Cascardo (3):
mlx4: allow device removal by fixing dma unmap size
mlx4: fix DMA mapping leak when allocation fails
powerpc/pseries/eeh: Fix crash when error happens during device probe
Thomas Abraham (1):
ARM: EXYNOS: Add cpu-offset property in gic device tree node
Thomas Graf (2):
net: Don't proxy arp respond if iif == rt->dst.dev if private
VLAN is disabled
veth: Enforce minimum size of VETH_INFO_PEER
Thomas Tuttle (2):
USB: qcserial: add several new serial devices
USB: qcserial: don't enable autosuspend
Tim Gardner (1):
ipheth: Add iPhone 4S
Timo Juhani Lindfors (1):
usb: gadget: zero: fix bug in loopback autoresume handling
Tomas Vanek (1):
zd1211rw: firmware needs duration_id set to zero for non-pspoll frames
Tyler Hicks (2):
eCryptfs: Improve statfs reporting
eCryptfs: Copy up lower inode attrs after setting lower xattr
Vaidyanathan Srinivasan (1):
PCI: set pci sriov page size before reading SRIOV BAR
Vivek Goyal (2):
floppy: Cleanup disk->queue before caling put_disk() if
add_disk() was never called
floppy: Fix a crash during rmmod
Wei Yongjun (2):
net/hyperv: rx_bytes should account the ether header size
net/hyperv: fix the issue that large packets be dropped under bridge
Wolfgang Grandegger (3):
can: flexcan: fix irq flooding by clearing all interrupt sources
can: cc770: store echo skb before starting the transfer
can: ti_hecc: use netif_rx in the interrupt
Wolfgang Zarre (1):
can: cc770: Fix indirect access deadlock on ISA cards
Wu Fengguang (3):
writeback: fix NULL bdi->dev in trace writeback_single_inode
lib: proportion: lower PROP_MAX_SHIFT to 32 on 64-bit kernel
writeback: fix dereferencing NULL bdi->dev on trace_writeback_queue
Xi Wang (1):
can: pch_can: fix error passive level test
Yi Zou (1):
ixgbe: do not update real num queues when netdev is going away
Yinghai Lu (4):
drivers/base/memory.c: fix memory_dev_init() long delay
ACPI: remove duplicated lines of merging problems with acpi_processor_add
PCI: workaround hard-wired bus number V2
PCI: Fix pci cardbus removal
Yoshihiro Shimoda (1):
net: sh_eth: fix skb_over_panic happen
majianpeng (1):
powerpc/adb: Use set_current_state()
[email protected] (1):
caif: Bugfix list_del_rcu race in cfmuxl_ctrlcmd.
stephen hemminger (1):
Revert "skge: check for PCI dma mapping errors"
On 19.02.2012 at 01:27, Linus Torvalds wrote:
>
> The workaround is either to just compile a 64-bit kernel (hey, you can
> leave your 32-bit userland alone if you are so emotionally attached to
> last century) or upgrade to 3.3-rc4.
>
Theoretically yes, practically (at least with Fedora 15/16) no. I hit these two bugs:
1.) The autofs4 interface is broken between x86 and x86_64. As systemd uses autofs, this bug hangs the boot process, since e.g. binfmt is mounted via autofs. See also http://lists.freedesktop.org/archives/systemd-devel/2011-September/003396.html
- The autofs4 maintainer says that this minor kernel bug could be worked around in userland.
- The systemd developers say that this kernel bug should be fixed in the kernel.
2.) While debugging the above issue, I found a minor bug in sys_poll(). Nobody picked up my proposed patch: https://lkml.org/lkml/2011/9/24/35
with kind regards
thomas
On Sun, Feb 19, 2012 at 5:42 AM, Thomas Meyer <[email protected]> wrote:
>
> Theoretically yes, practically (at least with Fedora 15/16) no. I hit these two bugs:
>
> 1.) The autofs4 interface is broken between x86 and x86_64. As systemd uses autofs, this bug hangs the boot process, since e.g. binfmt is mounted via autofs. See also http://lists.freedesktop.org/archives/systemd-devel/2011-September/003396.html
Duh.
That is just broken.
The code even *talks* about how the packet layout is the same on
32-bit and 64-bit architectures, and that's largely true.
However, while true, x86-64 has 8-byte alignment for 'long', and
x86-32 has 4-byte alignment. Which means that even though the
structure layout is exactly the same, on x86-64 the *alignment* issue
will push it out to 304 bytes.
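The 300 versus 304 byte difference can be reproduced on a single machine. The following is a minimal sketch, not the kernel header: the field list is an assumption modeled on the v5 packet discussed in this thread, and `pkt_abi32`, `pkt_abi64`, `u64_align4` and `u64_align8` are hypothetical names. It uses the GCC/Clang `aligned` attribute to give the one 64-bit field the alignment each ABI would give it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NAME_LEN 256	/* assumed: NAME_MAX + 1 */

/* A 64-bit integer aligned the way each ABI would align it. */
typedef uint64_t __attribute__((aligned(4))) u64_align4;	/* i386 rule */
typedef uint64_t __attribute__((aligned(8))) u64_align8;	/* x86-64 rule */

/* Same member list for both structs; only the 64-bit type differs. */
#define DEMO_PACKET_FIELDS(u64_type)		\
	int32_t proto_version;			\
	int32_t type;				\
	uint32_t wait_queue_token;		\
	uint32_t dev;				\
	u64_type ino;				\
	uint32_t uid, gid, pid, tgid;		\
	uint32_t len;				\
	char name[DEMO_NAME_LEN];

struct pkt_abi32 { DEMO_PACKET_FIELDS(u64_align4) };
struct pkt_abi64 { DEMO_PACKET_FIELDS(u64_align8) };

static size_t pkt_size_abi32(void) { return sizeof(struct pkt_abi32); }
static size_t pkt_size_abi64(void) { return sizeof(struct pkt_abi64); }

/* Every member sits at the same offset; only the tail padding differs. */
static void check_same_layout(void)
{
	assert(offsetof(struct pkt_abi32, ino) == offsetof(struct pkt_abi64, ino));
	assert(offsetof(struct pkt_abi32, name) == offsetof(struct pkt_abi64, name));
}
```

The members add up to 300 bytes in both cases. With 8-byte struct alignment the compiler rounds the total up to the next multiple of 8, which is where the extra 4 bytes come from.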
That's just stupid. We've had that problem before. It's easy to
overlook, but that packet is just mis-designed.
The attached patch isn't pretty, but this is definitely a kernel bug.
Binary compatibility is *important*, dammit.
Does this fix it?
Sadly, the *right* fix would have been to just mark the structure
packed, or select a maximum name length that padded things out to 8
bytes on all architectures, but we can't change that any more, because
changing the size on native 32-bit would break binaries there.
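Both of those "right" fixes can be sketched in a few lines. This is an illustration under assumptions, not the real header: the field list is modeled on the packet discussed here, all `pkt_*`, `u64_align*` and `DEMO_*` names are hypothetical, and 260 is just one name length that happens to make the total a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit integers aligned the way each ABI would align them. */
typedef uint64_t __attribute__((aligned(4))) u64_align4;	/* i386 rule */
typedef uint64_t __attribute__((aligned(8))) u64_align8;	/* x86-64 rule */

/* Everything in the assumed packet that precedes the name: 44 bytes. */
#define DEMO_FIXED_FIELDS(u64_type)		\
	int32_t proto_version;			\
	int32_t type;				\
	uint32_t wait_queue_token;		\
	uint32_t dev;				\
	u64_type ino;				\
	uint32_t uid, gid, pid, tgid;		\
	uint32_t len;

/* Option 1: packed, so no tail padding is ever added. */
struct pkt_packed {
	DEMO_FIXED_FIELDS(uint64_t)
	char name[256];
} __attribute__((packed));

/* Option 2: a name length that makes the total a multiple of 8. */
#define DEMO_PADDED_NAME_LEN 260	/* 44 + 260 = 304 */
struct pkt_padded32 { DEMO_FIXED_FIELDS(u64_align4) char name[DEMO_PADDED_NAME_LEN]; };
struct pkt_padded64 { DEMO_FIXED_FIELDS(u64_align8) char name[DEMO_PADDED_NAME_LEN]; };

/* Either option yields one size regardless of the 64-bit alignment rule. */
static void check_fixed_layouts(void)
{
	assert(sizeof(struct pkt_packed) == 300);
	assert(sizeof(struct pkt_padded32) == sizeof(struct pkt_padded64));
}
```

Either choice would have had to be made when the packet was first defined, since both change what an existing native 32-bit daemon sees.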
> 2.) While debugging the above issue, I found a minor bug in sys_poll(). Nobody picked up my proposed patch: https://lkml.org/lkml/2011/9/24/35
Looks correct, and searching my lkml archives shows that it was acked
by Eric. But it never went any further.
Mind re-sending that patch updated for the x86 system call
re-organization? Sure, I could do it myself, but it would be much
better if somebody who then actually *tests* the end result did it
(hpa cc'd, just because he's the one that touched the x86 compat layer
system call table thing. "Tag, you're it")
Anyway, if that, together with the sizeof hack in the attached patch,
makes everything work for you, I'll happily apply both of them.
Linus
From: Linus Torvalds <[email protected]>
Date: Sun, 19 Feb 2012 10:45:26 -0800
> However, while true, x86-64 has 8-byte alignment for 'long', and
> x86-32 has 4-byte alignment. Which means that even though the
> structure layout is exactly the same, on x86-64 the *alignment* issue
> will push it out to 304 bytes.
>
> That's just stupid. We've had that problem before. It's easy to
> overlook, but that packet is just mis-designed.
A real shame, this should have used "__aligned_u64" from the
beginning.
On Sun, Feb 19, 2012 at 11:49 AM, David Miller <[email protected]> wrote:
>
> A real shame, this should have used "__aligned_u64" from the
> beginning.
I agree. Sadly, this is exactly the kind of thing that is *really*
easy to overlook, and once it is overlooked we're screwed because
fixing it just breaks the native 32-bit case.
We probably should have made __u64 itself be marked as aligned, but
that's too late now too, unless somebody wants to go through and fix
any cases like this ;(
Binary compatibility is really important, and while arguably
compat-compatibility is slightly less critical, I think we should aim
to DTRT there too. I don't think we need to necessarily bend over
quite as far backwards for the compat case, but in places like this
where we are so close to being compatible - and not being compatible
kills the boot sequence and isn't just some theoretical thing - I do
think that it's worth
Linus
On Sun, 2012-02-19 at 12:02 -0800, Linus Torvalds wrote:
> On Sun, Feb 19, 2012 at 11:49 AM, David Miller <[email protected]> wrote:
> >
> > A real shame, this should have used "__aligned_u64" from the
> > beginning.
>
> I agree. Sadly, this is exactly the kind of thing that is *really*
> easy to overlook, and once it is overlooked we're screwed because
> fixing it just breaks the native 32-bit case.
It sure is.
There was a suggestion from the systemd folks to bump the kernel
protocol major version to 6 and add a packed structure for use with that
version and beyond. That's a bit ugly too but won't break things that
already work around it in user space for major version 5 and avoids.
I've not got around to checking if the patch works correctly and
finishing it.
Ian
On Sun, 2012-02-19 at 12:02 -0800, Linus Torvalds wrote:
> On Sun, Feb 19, 2012 at 11:49 AM, David Miller <[email protected]> wrote:
> >
> > A real shame, this should have used "__aligned_u64" from the
> > beginning.
>
> I agree. Sadly, this is exactly the kind of thing that is *really*
> easy to overlook, and once it is overlooked we're screwed because
> fixing it just breaks the native 32-bit case.
>
> We probably should have made __u64 itself be marked as aligned, but
> that's too late now too, unless somebody wants to go through and fix
> any cases like this ;(
>
> Binary compatibility is really important, and while arguably
> compat-compatibility is slightly less critical, I think we should aim
> to DTRT there too. I don't think we need to necessarily bend over
> quite as far backwards for the compat case, but in places like this
> where we are so close to being compatible - and not being compatible
> kills the boot sequence and isn't just some theoretical thing - I do
> think that it's worth
Not sure this is the way you'd like to go with this but here is a patch
that bumps the major autofs kernel communications version (as yet not
even compile tested).
The advantage of this is that users of the v5 protocol that rely on a
workaround like the one in the list archive post linked from the first
message in this thread (autofs itself is one of them) won't be broken,
and users can elect to use the packed data structure from now on.
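The negotiation idea can be sketched outside the kernel as follows. This is only an illustration: the `DEMO_*` constants and `demo_*` functions are hypothetical, and the minimum and maximum version values are assumptions picked for the example, not taken from the autofs headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only. */
#define DEMO_MIN_PROTO 3
#define DEMO_MAX_PROTO 6

enum demo_layout { DEMO_LAYOUT_NATIVE, DEMO_LAYOUT_PACKED };

/* True if the daemon's [minproto, maxproto] range is sane and overlaps ours. */
static bool demo_versions_compatible(int minproto, int maxproto)
{
	if (minproto > maxproto)
		return false;
	return maxproto >= DEMO_MIN_PROTO && minproto <= DEMO_MAX_PROTO;
}

/* Highest version both sides support, or -1 if there is none. */
static int demo_pick_version(int minproto, int maxproto)
{
	if (!demo_versions_compatible(minproto, maxproto))
		return -1;
	return maxproto > DEMO_MAX_PROTO ? DEMO_MAX_PROTO : maxproto;
}

/* Version 5 keeps the old native layout; later versions get the packed one. */
static enum demo_layout demo_layout_for(int version)
{
	return version >= 6 ? DEMO_LAYOUT_PACKED : DEMO_LAYOUT_NATIVE;
}

static void demo_self_check(void)
{
	/* A daemon that only speaks v5 never sees the packed layout. */
	assert(demo_layout_for(demo_pick_version(3, 5)) == DEMO_LAYOUT_NATIVE);
}
```

A daemon that advertises a maximum of 5 keeps getting the packet it already handles, workaround included, while one that advertises 6 opts in to the packed layout.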
Comments please.
autofs4 - add packed autofs_v6_packet to avoid struct alignment problems.
From: Thomas Meyer <[email protected]>
Updated by Ian Kent <[email protected]>
autofs_v5_packet is 300 bytes on architectures that are 32-bit aligned and
304 bytes on architectures that are 64-bit aligned. This leads to an error
in at least systemd when running an x86_64 kernel with an x86 userspace.
Fix this by adding a new protocol version 6 packet that is packed so that
it will have the same size on all architectures regardless of alignment.
Signed-off-by: Thomas Meyer <[email protected]>
Signed-off-by: Ian Kent <[email protected]>
---
fs/autofs4/inode.c | 36 ++++++++++++++++++++++++++++--------
fs/autofs4/waitq.c | 46 +++++++++++++++++++++++++++++++---------------
include/linux/auto_fs4.h | 36 +++++++++++++++++++++++++++++++++---
3 files changed, 92 insertions(+), 26 deletions(-)
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index e16980b..18b5b89 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -197,6 +197,28 @@ static int parse_options(char *options, int *pipefd, uid_t *uid, gid_t *gid,
return (*pipefd < 0);
}
+static bool check_protocol_version(int minproto, int maxproto)
+{
+ bool rc = true;
+
+ if (minproto > maxproto) {
+ printk("autofs: protocol min(%d)/max(%d) version error!\n",
+ minproto, maxproto);
+ rc = false;
+ }
+
+ if (maxproto < AUTOFS_MIN_PROTO_VERSION ||
+ minproto > AUTOFS_MAX_PROTO_VERSION) {
+ printk("autofs: kernel does not match daemon version "
+ "daemon (%d, %d) kernel (%d, %d)\n",
+ minproto, maxproto,
+ AUTOFS_MIN_PROTO_VERSION, AUTOFS_MAX_PROTO_VERSION);
+ rc = false;
+ }
+
+ return rc;
+}
+
int autofs4_fill_super(struct super_block *s, void *data, int silent)
{
struct inode * root_inode;
@@ -270,21 +292,19 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
root_inode->i_op = &autofs4_dir_inode_operations;
/* Couldn't this be tested earlier? */
- if (sbi->max_proto < AUTOFS_MIN_PROTO_VERSION ||
- sbi->min_proto > AUTOFS_MAX_PROTO_VERSION) {
- printk("autofs: kernel does not match daemon version "
- "daemon (%d, %d) kernel (%d, %d)\n",
- sbi->min_proto, sbi->max_proto,
- AUTOFS_MIN_PROTO_VERSION, AUTOFS_MAX_PROTO_VERSION);
+ if(check_protocol_version(sbi->min_proto, sbi->max_proto) == false)
goto fail_dput;
- }
/* Establish highest kernel protocol version */
if (sbi->max_proto > AUTOFS_MAX_PROTO_VERSION)
sbi->version = AUTOFS_MAX_PROTO_VERSION;
else
sbi->version = sbi->max_proto;
- sbi->sub_version = AUTOFS_PROTO_SUBVERSION;
+
+ if (sbi->version == 5)
+ sbi->sub_version = AUTOFS_V5_PROTO_SUBVERSION;
+ else
+ sbi->sub_version = AUTOFS_MAX_PROTO_SUBVERSION;
DPRINTK("pipe fd = %d, pgrp = %u", pipefd, sbi->oz_pgrp);
pipe = fget(pipefd);
diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
index da8876d..361a9bf 100644
--- a/fs/autofs4/waitq.c
+++ b/fs/autofs4/waitq.c
@@ -100,6 +100,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
struct autofs_packet_hdr hdr;
union autofs_packet_union v4_pkt;
union autofs_v5_packet_union v5_pkt;
+ union autofs_v6_packet_union v6_pkt;
} pkt;
struct file *pipe = NULL;
size_t pktsz;
@@ -145,7 +146,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
break;
}
/*
- * Kernel protocol v5 packet for handling indirect and direct
+ * Kernel protocol v5/v6 packet for handling indirect and direct
* mount missing and expire requests
*/
case autofs_ptype_missing_indirect:
@@ -153,20 +154,35 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
case autofs_ptype_missing_direct:
case autofs_ptype_expire_direct:
{
- struct autofs_v5_packet *packet = &pkt.v5_pkt.v5_packet;
-
- pktsz = sizeof(*packet);
-
- packet->wait_queue_token = wq->wait_queue_token;
- packet->len = wq->name.len;
- memcpy(packet->name, wq->name.name, wq->name.len);
- packet->name[wq->name.len] = '\0';
- packet->dev = wq->dev;
- packet->ino = wq->ino;
- packet->uid = wq->uid;
- packet->gid = wq->gid;
- packet->pid = wq->pid;
- packet->tgid = wq->tgid;
+ if (sbi->version == 5) {
+ struct autofs_v5_packet *packet = &pkt.v5_pkt.v5_packet;
+
+ packet->wait_queue_token = wq->wait_queue_token;
+ packet->len = wq->name.len;
+ memcpy(packet->name, wq->name.name, wq->name.len);
+ packet->name[wq->name.len] = '\0';
+ packet->dev = wq->dev;
+ packet->ino = wq->ino;
+ packet->uid = wq->uid;
+ packet->gid = wq->gid;
+ packet->pid = wq->pid;
+ packet->tgid = wq->tgid;
+ pktsz = sizeof(*packet);
+ } else { /* all other versions, currently only version 6 */
+ struct autofs_v6_packet *packet = &pkt.v6_pkt.v6_packet;
+
+ packet->wait_queue_token = wq->wait_queue_token;
+ packet->len = wq->name.len;
+ memcpy(packet->name, wq->name.name, wq->name.len);
+ packet->name[wq->name.len] = '\0';
+ packet->dev = wq->dev;
+ packet->ino = wq->ino;
+ packet->uid = wq->uid;
+ packet->gid = wq->gid;
+ packet->pid = wq->pid;
+ packet->tgid = wq->tgid;
+ pktsz = sizeof(*packet);
+ }
break;
}
default:
diff --git a/include/linux/auto_fs4.h b/include/linux/auto_fs4.h
index e02982f..e8c644a 100644
--- a/include/linux/auto_fs4.h
+++ b/include/linux/auto_fs4.h
@@ -20,11 +20,12 @@
#undef AUTOFS_MIN_PROTO_VERSION
#undef AUTOFS_MAX_PROTO_VERSION
-#define AUTOFS_PROTO_VERSION 5
+#define AUTOFS_PROTO_VERSION 6
#define AUTOFS_MIN_PROTO_VERSION 3
-#define AUTOFS_MAX_PROTO_VERSION 5
+#define AUTOFS_MAX_PROTO_VERSION 6
-#define AUTOFS_PROTO_SUBVERSION 2
+#define AUTOFS_V5_PROTO_SUBVERSION 2
+#define AUTOFS_MAX_PROTO_SUBVERSION 0
/* Mask for expire behaviour */
#define AUTOFS_EXP_IMMEDIATE 1
@@ -154,6 +155,35 @@ union autofs_v5_packet_union {
autofs_packet_expire_direct_t expire_direct;
};
+/* autofs v6 common packet struct */
+/* packed structure for same packet size on all archs */
+struct autofs_v6_packet {
+ struct autofs_packet_hdr hdr;
+ autofs_wqt_t wait_queue_token;
+ __u32 dev;
+ __u64 ino;
+ __u32 uid;
+ __u32 gid;
+ __u32 pid;
+ __u32 tgid;
+ __u32 len;
+ char name[NAME_MAX+1];
+} __attribute__ ((packed));
+
+typedef struct autofs_v6_packet autofs_v6_packet_missing_indirect_t;
+typedef struct autofs_v6_packet autofs_v6_packet_expire_indirect_t;
+typedef struct autofs_v6_packet autofs_v6_packet_missing_direct_t;
+typedef struct autofs_v6_packet autofs_v6_packet_expire_direct_t;
+
+union autofs_v6_packet_union {
+ struct autofs_packet_hdr hdr;
+ struct autofs_v6_packet v6_packet;
+ autofs_v6_packet_missing_indirect_t missing_indirect;
+ autofs_v6_packet_expire_indirect_t expire_indirect;
+ autofs_v6_packet_missing_direct_t missing_direct;
+ autofs_v6_packet_expire_direct_t expire_direct;
+};
+
#define AUTOFS_IOC_EXPIRE_MULTI _IOW(0x93,0x66,int)
#define AUTOFS_IOC_EXPIRE_INDIRECT AUTOFS_IOC_EXPIRE_MULTI
#define AUTOFS_IOC_EXPIRE_DIRECT AUTOFS_IOC_EXPIRE_MULTI
On Mon, Feb 20, 2012 at 7:29 PM, Ian Kent <[email protected]> wrote:
>
> Not sure this is the way you'd like to go with this but here is a patch
> that bumps the major autofs kernel communications version (as yet not
> even compile tested).
.. and exactly how does this fix existing binaries?
Linus
On Mon, 2012-02-20 at 19:38 -0800, Linus Torvalds wrote:
> On Mon, Feb 20, 2012 at 7:29 PM, Ian Kent <[email protected]> wrote:
> >
> > Not sure this is the way you'd like to go with this but here is a patch
> > that bumps the major autofs kernel communications version (as yet not
> > even compile tested).
>
> .. and exactly how does this fix existing binaries?
It doesn't, but changing it for existing binaries will break existing
binaries that use a workaround.
I'm proposing this because the systemd folks were happy to do it this
way. But if you would like any other existing user space users to change
to using a correctly sized packet then, yes, it isn't what you want to
happen.
Ian
On Mon, Feb 20, 2012 at 8:02 PM, Ian Kent <[email protected]> wrote:
>
> It doesn't, but changing it for existing binaries will break existing
> binaries that use a workaround.
What existing binaries?
> I'm proposing this because the systemd folks were happy to do it this
> way. But if you would like any other existing user space users to change
> to using a correctly sized packet then, yes, it isn't what you want to
> happen.
Existing binaries *do* use the correct size packet - it's the correct
size for native x86-32!
It's our x86-64 compat layer that is wrong. It's a clear bug. Nothing else.
We don't start making up new interfaces because we have clear bugs: we
fix the damn bugs.
How could you even sanely do "workarounds"? A x86-32 binary shouldn't
even be able to *tell* that the kernel is 64-bit. And if it does that
somehow, we should fix that too!
Linus
On 02/20/2012 07:29 PM, Ian Kent wrote:
>
> Fix this by adding a new protocol version 6 packet that is packed so that
> it will have the same size on all architectures regardless of alignment.
>
You probably also want to make sure there aren't any holes in it, ever.
Alignment holes not only cause these kinds of problems, but also cause
security holes.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On 02/19/2012 12:02 PM, Linus Torvalds wrote:
> On Sun, Feb 19, 2012 at 11:49 AM, David Miller <[email protected]> wrote:
>>
>> A real shame, this should have used "__aligned_u64" from the
>> beginning.
>
> I agree. Sadly, this is exactly the kind of thing that is *really*
> easy to overlook, and once it is overlooked we're screwed because
> fixing it just breaks the native 32-bit case.
>
I'm starting to think we should compile the kernel with -Wpadded by
default (currently it's only done at "warning level 3", which I doubt
anyone ever uses, especially since that also includes -Wpacked which is
an actively toxic warning) and force people to add explicit padding
where it needs to go, if necessary. Unfortunately even that doesn't
guarantee compatibility with userspace, but it's at least something.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Mon, 2012-02-20 at 20:17 -0800, Linus Torvalds wrote:
> On Mon, Feb 20, 2012 at 8:02 PM, Ian Kent <[email protected]> wrote:
> >
> > It doesn't, but changing it for existing binaries will break existing
> > binaries that use a workaround.
>
> What existing binaries?
My bad, autofs for one, but obviously when running in compat mode.
And any other user space users, such as the auto_dir package, but I
don't know if that is maintained any more.
>
> > I'm proposing this because the systemd folks were happy to do it this
> > way. But if you would like any other existing user space users to change
> > to using a correctly sized packet then, yes, it isn't what you want to
> > happen.
>
> Existing binaries *do* use the correct size packet - it's the correct
> size for native x86-32!
>
> It's our x86-64 compat layer that is wrong. It's a clear bug. Nothing else.
>
> We don't start making up new interfaces because we have clear bugs: we
> fix the damn bugs.
>
> How could you even sanely do "workarounds"? A x86-32 binary shouldn't
> even be able to *tell* that the kernel is 64-bit. And if it does that
> somehow, we should fix that too!
Sure, not an acceptable excuse I know, but it was too late when I
realized my mistake.
So, as you say, let's fix the bug.
autofs is by far the biggest user and I can manage the change for it and
I can at least post to the auto_dir list (if it still exists) and Thomas
can feedback to the systemd folks.
Then there is the question of how it should be done.
I think the patch attached to your original post needs a little work if
that is to be used. Correct me if I'm wrong but AFAICT there are more
architectures that use 8-byte alignment than just x86-64, such as alpha,
ia64 and ppc64 and I believe they may also be used in a compat mode.
Is there a better way to do this change, anyone?
Ian
On Mon, Feb 20, 2012 at 8:52 PM, Ian Kent <[email protected]> wrote:
>
> I think the patch attached to your original post needs a little work if
> that is to be used. Correct me if I'm wrong but AFAICT there are more
> architectures that use 8-byte alignment than just x86-64, such as alpha,
> ia64 and ppc64 and I believe they may also be used in a compat mode.
The only issue is compat mode, and afaik, all other architectures
except for x86-32 do __u64 with natural alignment.
So all 64-bit architectures use natural alignment, the only issue is
the alignment of __u64 in 32-bit mode.
So it really is *not* about 8-byte alignment. Quite the reverse. It's
about 4-byte alignment of 64-bit entities, and I suspect x86-32 is the
only one that does that.
See "compat_u64", and notice how only in arch/x86/include/asm/compat.h
do we have
typedef u64 __attribute__((aligned(4))) compat_u64;
So it really is limited to only x86.
Linus
On Mon, Feb 20, 2012 at 9:06 PM, Linus Torvalds
<[email protected]> wrote:
>
> So it really is limited to only x86.
Oh, and if people really are using "uname()" to figure out that they
are running a 64-bit kernel, then we should probably make uname() use
"is_compat_task()" instead of checking the PER_LINUX32 personality.
So we already do have support for returning a different machine-name
to 32-bit binaries, but it uses the "personality" thing that nobody
cares about, rather than the compat layer. Looks like purely
historical reasons.
Linus
On 02/20/2012 09:21 PM, Linus Torvalds wrote:
> On Mon, Feb 20, 2012 at 9:06 PM, Linus Torvalds
> <[email protected]> wrote:
>>
>> So it really is limited to only x86.
>
> Oh, and if people really are using "uname()" to figure out that they
> are running a 64-bit kernel, then we should probably make uname() use
> "is_compat_task()" instead of checking the PER_LINUX32 personality.
>
> So we already do have support for returning a different machine-name
> to 32-bit binaries, but it uses the "personality" thing that nobody
> cares about, rather than the compat layer. Looks like purely
> historical reasons.
>
No, it serves a real function. I use both directions of this to deal
with various compatibility things. PER_LINUX32 lets you run, say, an
installer as if it was on a 32-bit system, even if it is written in a
scripting language (and hence running a native 64-bit interpreter).
Similarly, a 32-bit legacy binary can still function as part of a bigger
64-bit system.
So let's not change that just because someone did something idiotic.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On 02/20/2012 09:06 PM, Linus Torvalds wrote:
> On Mon, Feb 20, 2012 at 8:52 PM, Ian Kent <[email protected]> wrote:
>>
>> I think the patch attached to your original post needs a little work if
>> that is to be used. Correct me if I'm wrong but AFAICT there are more
>> architectures that use 8-byte alignment than just x86-64, such as alpha,
>> ia64 and ppc64 and I believe they may also be used in a compat mode.
>
> The only issue is compat mode, and afaik, all other architectures
> except for x86-32 do __u64 with natural alignment.
>
> So all 64-bit architectures use natural alignment, the only issue is
> the alignment of __u64 in 32-bit mode.
>
> So it really is *not* about 8-byte alignment. Quite the reverse. It's
> about 4-byte alignment of 64-bit entities, and I suspect x86-32 is the
> only one that does that.
>
> See "compat_u64", and notice how only in arch/x86/include/asm/compat.h
> do we have
>
> typedef u64 __attribute__((aligned(4))) compat_u64;
>
> So it really is limited to only x86.
>
m68k has alignment 2 for 32- and 64-bit quantities, so it's not just
x86; the only reason you don't see that one is because m68k doesn't have
a compat layer to worry about.
Holes are highly undesirable for another reason: they create security
holes where kernel information leaks out.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Mon, Feb 20, 2012 at 9:28 PM, H. Peter Anvin <[email protected]> wrote:
>
> Holes are highly undesirable for another reason: they create security
> holes where kernel information leaks out.
.. however, this is not an argument for adding a *new* interface.
You are still stuck handling the old one, so adding a new interface
without holes doesn't help *anything*. It's just a bad idea.
So we're stuck with the interfaces we have. Don't say "let's fix the
problems by adding new ones". It doesn't work, it doesn't solve
anything, and all it results in is even *more* interfaces to maintain
and find bugs in.
It's also a major pain for testing, since different people will
invariably use different interfaces. So a person running an older
distro will see a bug that the maintainer cannot reproduce, because
the maintainer has in the meantime updated to all the new and
"improved" interfaces.
So "new and improved" is just bad. Fix the existing ones, instead of
saying "oops, that was a bad interface so let's make a new one".
Always.
Linus
On Mon, 2012-02-20 at 21:06 -0800, Linus Torvalds wrote:
> On Mon, Feb 20, 2012 at 8:52 PM, Ian Kent <[email protected]> wrote:
> >
> > I think the patch attached to your original post needs a little work if
> > that is to be used. Correct me if I'm wrong but AFAICT there are more
> > architectures that use 8-byte alignment than just x86-64, such as alpha,
> > ia64 and ppc64 and I believe they may also be used in a compat mode.
>
> The only issue is compat mode, and afaik, all other architectures
> except for x86-32 do __u64 with natural alignment.
>
> So all 64-bit architectures use natural alignment, the only issue is
> the alignment of __u64 in 32-bit mode.
>
> So it really is *not* about 8-byte alignment. Quite the reverse. It's
> about 4-byte alignment of 64-bit entities, and I suspect x86-32 is the
> only one that does that.
>
> See "compat_u64", and notice how only in arch/x86/include/asm/compat.h
> do we have
>
> typedef u64 __attribute__((aligned(4))) compat_u64;
>
> So it really is limited to only x86.
Thanks, I wasn't aware of this.
>
> Linus
> I'm starting to think we should compile the kernel with -Wpadded by
> default (currently it's only done at "warning level 3", which I doubt
> anyone ever uses, especially since that also includes -Wpacked which is
> an actively toxic warning) and force people to add explicit padding
> where it needs to go, if necessary. Unfortunately even that doesn't
> guarantee compatibility with userspace, but it's at least something.
We generate a set of believed correct export headers and check them these
days as an option. Can that work not be turned to do this? I.e. create a
temporary .c file that includes all the user visible header data (without
__KERNEL__ being defined) and run it through -Wpacked etc.
Alan
On Tue, Feb 21, 2012 at 01:03:45PM +0000, Alan Cox wrote:
> > I'm starting to think we should compile the kernel with -Wpadded by
> > default (currently it's only done at "warning level 3", which I doubt
> > anyone ever uses, especially since that also includes -Wpacked which is
> > an actively toxic warning) and force people to add explicit padding
> > where it needs to go, if necessary. Unfortunately even that doesn't
> > guarantee compatibility with userspace, but it's at least something.
>
> We generate a set of believed correct export headers and check them these
> days as an option. Can that work not be turned to do this? I.e. create a
> temporary .c file that includes all the user visible header data (without
> __KERNEL__ being defined) and run it through -Wpacked etc.
Hi Alan.
If we require every exported file to be buildable on its own
then it is simple to run gcc on every exported file.
We have before discussed that our header files should include everything
they need - and at least forcing this for our user-space headers would not be bad.
Simple hack I cooked up (will not work if you use O=...):
diff --git a/scripts/headers_check.pl b/scripts/headers_check.pl
index 7957e7a..b479575 100644
--- a/scripts/headers_check.pl
+++ b/scripts/headers_check.pl
@@ -27,12 +27,16 @@ my $line;
my $lineno = 0;
my $filename;
+my $gcc_options = "-Wall -Wpadded";
+my $gcc_include = "-I usr/include";
+
foreach my $file (@files) {
$filename = $file;
open(my $fh, '<', $filename)
or die "$filename: $!\n";
$lineno = 0;
+ &check_build();
while ($line = <$fh>) {
$lineno++;
&check_include();
@@ -45,6 +49,13 @@ foreach my $file (@files) {
}
exit $ret;
+# Check that the header-file can build
+# All exported headers are assumed to include what they need
+sub check_build
+{
+ system("gcc -xc -c $gcc_include $gcc_options $filename")
+}
+
sub check_include
{
if ($line =~ m/^\s*#\s*include\s+<((asm|linux).*)>/) {
Now we just need someone to fix up the headers....
I counted around 870 errors in 160 files.
But 99% looked trivial to fix...
Almost all is a matter of adding one or a few missing includes.
Sam
Am Sonntag, den 19.02.2012, 10:45 -0800 schrieb Linus Torvalds:
> On Sun, Feb 19, 2012 at 5:42 AM, Thomas Meyer <[email protected]> wrote:
> >
> > 1.) autofs4 interface is broken between x86 and x86_64. as systemd uses autofs, this bug hangs the boot process as e.g. binfmt is mounted via autofs. see also http://lists.freedesktop.org/archives/systemd-devel/2011-September/003396.html
>
> Duh.
>
> That is just broken.
>
> The code even *talks* about how the packet layout is the same on
> 32-bit and 64-bit architectures, and that's largely true.
>
> However, while true, x86-64 has 8-byte alignment for 'long', and
> x86-32 has 4-byte alignment. Which means that even though the
> structure layout is exactly the same, on x86-64 the *alignment* issue
> will push it out to 304 bytes.
>
> That's just stupid. We've had that problem before. It's easy to
> overlook, but that packet is just mis-designed.
>
> The attached patch isn't pretty, but this is definitely a kernel bug.
> Binary compatibility is *important*, dammit.
>
> Does this fix it?
yes, it does!
thanks.
On Tue, Feb 21, 2012 at 10:58 AM, Thomas Meyer <[email protected]> wrote:
> Am Sonntag, den 19.02.2012, 10:45 -0800 schrieb Linus Torvalds:
>>
>> Does this fix it?
>
> yes, it does!
Do you know if anybody ever applied that disgusting (and incorrect -
the 4-byte off thing *only* happens in 32-bit mode with an x86-64
kernel - so the other architecture strcmp's are wrong, and even the
x86-64 strcmp is valid only when compiling as a legacy 32-bit x86 app)
patch of yours?
Because if that actually did happen, there are i386 binaries that are
now broken and know of the kernel bug as Ian was afraid there would
be.
But hopefully that patch never got anywhere, and we could still fix
this in the kernel with my suggested patch.
Not to say that *my* patch isn't also disgusting, of course. But at
least my patch fixes a real compat task problem.
Linus
On Sun, Feb 19, 2012 at 5:42 AM, Thomas Meyer <[email protected]> wrote:
>
> 2.) while debugging the above issue I found a minor bug in sys_poll() - nobody took care of my proposed patch: https://lkml.org/lkml/2011/9/24/35
Ok, so I started out forward-porting that patch to current -git
(trivial: it's just that the system call tables are differently
generated now), but the more I look at it, the more I suspect that we
should perhaps just globally fix "sys_poll()" to have the timeout
argument be 'int'.
Because that *is* the standard user interface (just do "man 2 poll"),
and while all of the git history (and all of the BK history) we've had
it as "long", I suspect we should just fix it.
So I suspect the correct patch is just as attached instead: make
sys_poll() just take an "int timeout". Any user who tried to use a
long value would already have got truncated by glibc - I just checked.
Of course, there is a remote possibility that somebody might not use
glibc, and have used "poll()" with the raw system call interface, and
depended on using a 64-bit "long timeout" on 64-bit architectures.
But quite frankly, that sounds rather unlikely in the extreme.
Comments? If we do this, and somebody actually reports that they use a
64-bit timeout, we could always go back to the broken 'long' argument,
and take your patch to fix the compat case.
Linus
Le mardi 21 février 2012 à 14:43 -0800, Linus Torvalds a écrit :
> On Sun, Feb 19, 2012 at 5:42 AM, Thomas Meyer <[email protected]> wrote:
> >
> > 2.) while debugging the above issue I found a minor bug in sys_poll() - nobody took care of my proposed patch: https://lkml.org/lkml/2011/9/24/35
>
> Ok, so I started out forward-porting that patch to current -git
> (trivial: it's just that the system call tables are differently
> generated now), but the more I look at it, the more I suspect that we
> should perhaps just globally fix "sys_poll()" to have the timeout
> argument be 'int'.
>
> Because that *is* the standard user interface (just do "man 2 poll"),
> and while all of the git history (and all of the BK history) we've had
> it as "long", I suspect we should just fix it.
>
> So I suspect the correct patch is just as attached instead: make
> sys_poll() just take an "int timeout". Any user who tried to use a
> long value would already have got truncated by glibc - I just checked.
>
> Of course, there is a remote possibility that somebody might not use
> glibc, and have used "poll()" with the raw system call interface, and
> depended on using a 64-bit "long timeout" on 64-bit architectures.
>
> But quite frankly, that sounds rather unlikely in the extreme.
>
> Comments? If we do this, and somebody actually reports that they use a
> 64-bit timeout, we could always go back to the broken 'long' argument,
> and take your patch to fix the compat case.
>
> Linus
Yep, this is what I thought, but when this was raised last September,
both Andrew and Andi disagreed.
https://lkml.org/lkml/2011/10/6/389
On Tue, Feb 21, 2012 at 3:10 PM, Eric Dumazet <[email protected]> wrote:
>
> Yep, this is what I thought, but when this was raised last September,
> both Andrew and Andi disagreed.
>
> https://lkml.org/lkml/2011/10/6/389
Well, I agree that it *could* break things, but considering that at
least glibc does the sign-extension, any code that puts a large
number in the 'timeout' field would *already* have broken.
Which is why I think we should first try to fix the system call
interface - because it's the simpler patch, and it's the
RightThing(tm) to do from a standards standpoint. It's also almost
guaranteed to work, exactly because of how glibc already does that
conversion.
But if something does break - however unlikely and perverse the code
has to be to be able to do that - we'd clearly have to undo that "just
fix sys_poll()" and use Thomas' patch to have a compat_sys_poll()
instead.
I just don't like the notion of doing that silly compat thing when it
really shouldn't be needed to begin with.
Linus
On Sat, 18 Feb 2012, Linus Torvalds wrote:
> So it's almost getting to be a habit: yet another -rc release that is
> delayed by a couple of days.
I just got the BUG below (with g45196ce being the topmost commit).
It happened when trying to start 'gwenview', but I am not able to
reproduce it again. Adding a few people to CC just in case someone
immediately sees what might be the problem.
The IP resolves to
#ifdef CONFIG_MMU
static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
{
[ ... snip ... ]
if (file) {
===> this line struct inode *inode = file->f_path.dentry->d_inode;
struct address_space *mapping = file->f_mapping;
get_file(file);
if (tmp->vm_flags & VM_DENYWRITE)
atomic_dec(&inode->i_writecount);
mutex_lock(&mapping->i_mmap_mutex);
if (tmp->vm_flags & VM_SHARED)
mapping->i_mmap_writable++;
flush_dcache_mmap_lock(mapping);
/* insert tmp into the share list, just after mpnt */
vma_prio_tree_add(tmp, mpnt);
flush_dcache_mmap_unlock(mapping);
mutex_unlock(&mapping->i_mmap_mutex);
}
more precisely:
[ ... snip ... ]
0xffffffff8103a4f9 <+409>: andq $0xffffffffffffdfff,0x30(%rbx)
0xffffffff8103a501 <+417>: movq $0x0,0x20(%rbx)
0xffffffff8103a509 <+425>: movq $0x0,0x18(%rbx)
0xffffffff8103a511 <+433>: test %rdx,%rdx
0xffffffff8103a514 <+436>: je 0xffffffff8103a565 <dup_mmap+517>
0xffffffff8103a516 <+438>: mov 0x18(%rdx),%rax
0xffffffff8103a51a <+442>: mov 0x130(%rdx),%r12
===> this line 0xffffffff8103a521 <+449>: mov 0x30(%rax),%rax
0xffffffff8103a525 <+453>: lock incq 0x68(%rdx)
0xffffffff8103a52a <+458>: testb $0x8,0x31(%rbx)
[ ... snip ... ]
The machine has gone through several suspend-resume cycles before this
happened, so it might well also be some memory corruption caused by a
random driver.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: [<ffffffff8103a521>] dup_mmap+0x1c1/0x3b0
PGD 3774f067 PUD 36cf7067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in: af_packet iwlwifi tun iptable_mangle xt_DSCP xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tab
conntrack cpufreq_conservative iptable_filter cpufreq_userspace cpufreq_powersave acpi_cpufreq ip_tables mperf x_tables microcode
ooth snd_hda_codec_conexant pcspkr iTCO_wdt iTCO_vendor_support i2c_i801 cfg80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm sn
l ac snd tpm_tis soundcore tpm tpm_bios battery wmi autofs4 uhci_hcd i915 drm_kms_helper drm i2c_algo_bit ehci_hcd button video us
ermal thermal_sys [last unloaded: iwlwifi]
Pid: 1993, comm: Xorg Not tainted 3.3.0-rc4-00074-g45196ce #1 LENOVO 7470BN2/7470BN2
RIP: 0010:[<ffffffff8103a521>] [<ffffffff8103a521>] dup_mmap+0x1c1/0x3b0
RSP: 0018:ffff8800780bdd50 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff880077f25d98 RCX: 0000000000000000
RDX: ffff88003767ed00 RSI: ffff880037b36298 RDI: ffff880077f25d98
RBP: ffff8800780bddb0 R08: ffff880067ded4e0 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800767a5d50
R13: ffff880037b36298 R14: ffff880056d520c0 R15: 0000000000000000
FS: 00007f96b2bd6880(0000) GS:ffff88007c200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000030 CR3: 00000000372a3000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 1993, threadinfo ffff8800780bc000, task ffff880078044040)
Stack:
ffff880037b7ba80 ffff880037b7bb18 ffff880056d52158 ffff880077f25e48
ffff880077f25e60 ffff880077f25e88 ffff880077f25e80 ffff880056d520c0
ffff880037b7ba80 ffff880041afe040 0000000000000000 00007f96b2bd6b50
Call Trace:
[<ffffffff8103ab5f>] dup_mm+0xbf/0x150
[<ffffffff8103bb72>] copy_process+0xf82/0xfa0
[<ffffffff8103bf78>] do_fork+0xb8/0x300
[<ffffffff8104f94c>] ? do_sigaction+0x13c/0x1e0
[<ffffffff81164040>] ? fd_install+0x30/0x60
[<ffffffff812eb3c9>] ? lockdep_sys_exit_thunk+0x35/0x67
[<ffffffff8100af83>] sys_clone+0x23/0x30
[<ffffffff8157b553>] stub_clone+0x13/0x20
[<ffffffff8157b1f9>] ? system_call_fastpath+0x16/0x1b
Code: 00 00 00 48 81 63 30 ff df ff ff 48 c7 43 20 00 00 00 00 48 c7 43 18 00 00 00 00 48 85 d2 74 4f 48 8b 42 18 4c 8b a2 30 01 00 00 <48> 8b 40 30 f0 48 ff 42 68 f6 43 31 08 74 07 f0 ff 88 cc 01 00
RIP [<ffffffff8103a521>] dup_mmap+0x1c1/0x3b0
RSP <ffff8800780bdd50>
CR2: 0000000000000030
--
Jiri Kosina
SUSE Labs
On Fri, Feb 24, 2012 at 2:39 AM, Jiri Kosina <[email protected]> wrote:
>
> I just got the BUG below (with g45196ce being the topmost commit).
>
> It happened when trying to start 'gwenview', but I am not able to
> reproduce it again. Adding a few people to CC just in case someone
> immediately sees what might be the problem.
Hmm. That is the code that increments the file counter, afaik:
0: 48 81 63 30 ff df ff ff andq $0xffffffffffffdfff,0x30(%rbx)
8: 48 c7 43 20 00 00 00 00 movq $0x0,0x20(%rbx)
10: 48 c7 43 18 00 00 00 00 movq $0x0,0x18(%rbx)
18: 48 85 d2 test %rdx,%rdx
1b: 74 4f je 0x6c
1d: 48 8b 42 18 mov 0x18(%rdx),%rax
21: 4c 8b a2 30 01 00 00 mov 0x130(%rdx),%r12
28:* 48 8b 40 30 mov 0x30(%rax),%rax <-- trapping instruction
2c: f0 48 ff 42 68 lock incq 0x68(%rdx)
31: f6 43 31 08 testb $0x8,0x31(%rbx)
35: 74 07 je 0x3e
and that preceding test is testing for a NULL "file", and then the
mov 0x18(%rdx),%rax
is "dentry = file->f_path.dentry", while the trapping "mov
0x30(%rax),%rax" is the continuation of that: "dentry->d_inode" (and
the "lock incq" is the get_file() - it's incrementing the file
counter). That "mov 0x130(%rdx),%r12" in between is doing "mapping =
file->f_mapping"
So dentry seems to be NULL for you.
> The machine has gone through several suspend-resume cycles before this
> happened, so it might well also be some memory corruption caused by a
> random driver.
I almost think it is, because "file->dentry" should never be NULL in a
mapping afaik. Especially as your "mapping" certainly isn't NULL (it's
in %r12, so you can see it in your register dump).
This isn't some unusual code sequence either, so I don't see it as
some random latent bug that is just very unlikely and hard to trigger
in that code itself.
I'll think about it, but my first reaction is "memory corruption". Do
you think you could try to run with a kernel that has SLAB debugging
and poisoning on? If it's a stale pointer dereference that has cleared
that dentry, that _might_ show it closer to the actual bug (rather
than a long time later when the NULL dereference happens).
Linus
On Fri, 24 Feb 2012, Linus Torvalds wrote:
> > The machine has gone through several suspend-resume cycles before this
> > happened, so it might well also be some memory corruption caused by a
> > random driver.
>
> I almost think it is, because "file->dentry" should never be NULL in a
> mapping afaik. Especially as your "mapping" certainly isn't NULL (it's
> in %r12, so you can see it in your register dump).
>
> This isn't some unusual code sequence either, so I don't see it as
> some random latent bug that is just very unlikely and hard to trigger
> in that code itself.
>
> I'll think about it, but my first reaction is "memory corruption". Do
> you think you could try to run with a kernel that has SLAB debugging
> and poisoning on? If it's a stale pointer dereference that has cleared
> that dentry, that _might_ show it closer to the actual bug (rather
> than a long time later when the NULL dereference happens).
Running DEBUG_SLAB kernel since I have first hit the bug, but nothing
popped up yet. Seems undebuggable so far.
On the other hand I wouldn't blame HW for a bit-flip, as it was a clear
NULL pointer (plus 0x30 offset), not a random garbage.
--
Jiri Kosina
SUSE Labs
On Fri, Feb 24, 2012 at 10:01 AM, Jiri Kosina <[email protected]> wrote:
>
> On the other hand I wouldn't blame HW for a bit-flip, as it was a clear
> NULL pointer (plus 0x30 offset), not a random garbage.
Agreed. Actual NULL pointers tend to be us screwing things up.
Linus
On Fri, 24 Feb 2012, Jiri Kosina wrote:
> Running DEBUG_SLAB kernel since I have first hit the bug, but nothing
> popped up yet. Seems undebuggable so far.
>
> On the other hand I wouldn't blame HW for a bit-flip, as it was a clear
> NULL pointer (plus 0x30 offset), not a random garbage.
Hmm, I just got this as a result of running lsmod (topmost commit g586c6e7):
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff810a6380>] m_show+0x70/0x190
PGD 77c60067 PUD 3756c067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in: af_packet rfcomm bnep tun iptable_mangle xt_DSCP nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt
nf_conntrack iptable_filter ip_tables x_tables cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microc
btusb bluetooth i2c_i801 iTCO_wdt snd_hda_codec_conexant e1000e pcspkr iTCO_vendor_support cfg80211 snd_hda_intel snd_hda_codec s
d_acpi rfkill battery snd ac soundcore tpm_tis wmi tpm tpm_bios autofs4 uhci_hcd i915 drm_kms_helper ehci_hcd drm i2c_algo_bit usb
a_generic thermal thermal_sys
Pid: 3960, comm: lsmod Not tainted 3.3.0-rc5-00088-g586c6e7 #10 LENOVO 7470BN2/7470BN2
RIP: 0010:[<ffffffff810a6380>] [<ffffffff810a6380>] m_show+0x70/0x190
RSP: 0018:ffff880037b8fe08 EFLAGS: 00010203
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8800379b332c
RBP: ffff880037b8fe48 R08: 0000000000000020 R09: 000000000000ffff
R10: 0000000000000001 R11: 0000ffffffff6c0a R12: ffffffffa0175490
R13: ffff880036c44d80 R14: ffffffffa0175280 R15: ffffffffa0175288
FS: 00007f78f52a0700(0000) GS:ffff88007c200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000007563a000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process lsmod (pid: 3960, threadinfo ffff880037b8e000, task ffff880073624040)
Stack:
ffff880037b8fe28 ffffffff814fa56e ffff880037b8fe88 ffffffffa0175288
ffff880036c44d80 0000000000000320 ffff880037b8fe80 000000000000002b
ffff880037b8feb8 ffffffff8118dda7 ffff880037b8ff48 00000000000003d5
Call Trace:
[<ffffffff814fa56e>] ? mutex_lock_nested+0x3e/0x50
[<ffffffff8118dda7>] seq_read+0x287/0x400
[<ffffffff8118db20>] ? seq_lseek+0x110/0x110
[<ffffffff811cea6d>] proc_reg_read+0x7d/0xc0
[<ffffffff8116b258>] vfs_read+0xc8/0x130
[<ffffffff8116b3b0>] sys_read+0x50/0x90
[<ffffffff81505779>] system_call_fastpath+0x16/0x1b
Code: e8 06 ff ff ff 48 c7 c6 c6 50 79 81 48 89 c2 4c 89 ef 31 c0 e8 12 76 0e 00 49 8b 9e 10 02 00 00 31 c0 4c 39 e3 74 2a 0f 1f 4
89 ef 48 83 c2 18 e8
RIP [<ffffffff810a6380>] m_show+0x70/0x190
RSP <ffff880037b8fe08>
CR2: 0000000000000020
So again NULL+offset, again a vfs-related structure, but a completely
different codepath.
This particular kernel didn't have DEBUG_SLAB turned on, unfortunately.
--
Jiri Kosina
SUSE Labs