Greetings,
After a recent pull (9a4c854..5d9c4a7), K3b began aborting every attempt
to burn a CD with a buffer underrun complaint. I can't fully bisect the
problem because the intervening kernels soft-hang during boot. Here is
the remaining range from git bisect visualize, converted to postable text:
bisect/bad block: add request->raw_data_len (6b00769fe1502b4ad97bb327ef7ac971b208bfb5)
bisect block: update bio according to DMA alignment padding (40b01b9bbdf51ae543a04744283bf2d56c4a6afa)
libata: update ATAPI overflow draining
bisect/good-e164094964e6e20fe7fce418e06a9dce952bb7a4
Serial console log of the hung kernel (40b01b9bbdf51ae543a04744283bf2d56c4a6afa), including a SysRq show-state dump, follows:
[ 0.000000] Linux version 2.6.25-rc2-smp (root@homer) (gcc version 4.2.1 (SUSE Linux)) #14 SMP PREEMPT Thu Feb 21 08:49:51 CET 2008
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
[ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[ 0.000000] 0MB HIGHMEM available.
[ 0.000000] 1023MB LOWMEM available.
[ 0.000000] Scan SMP from b0000000 for 1024 bytes.
[ 0.000000] Scan SMP from b009fc00 for 1024 bytes.
[ 0.000000] Scan SMP from b00f0000 for 65536 bytes.
[ 0.000000] found SMP MP-table at [b00f5320] 000f5320
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0 -> 4096
[ 0.000000] Normal 4096 -> 262128
[ 0.000000] HighMem 262128 -> 262128
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0 -> 262128
[ 0.000000] DMI 2.3 present.
[ 0.000000] ACPI: RSDP 000F6CC0, 0014 (r0 IntelR)
[ 0.000000] ACPI: RSDT 3FFF3000, 002C (r1 IntelR AWRDACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: FACP 3FFF3040, 0074 (r1 IntelR AWRDACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: DSDT 3FFF30C0, 4139 (r1 INTELR AWRDACPI 1000 MSFT 100000E)
[ 0.000000] ACPI: FACS 3FFF0000, 0040
[ 0.000000] ACPI: APIC 3FFF7200, 0068 (r1 IntelR AWRDACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: PM-Timer IO Port: 0x408
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] Processor #0 15:2 APIC version 20
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] Processor #1 15:2 APIC version 20
[ 0.000000] WARNING: maxcpus limit of 1 reached. Processor ignored.
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] Enabling APIC mode: Flat. Using 1 I/O APICs
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
[ 0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
[ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
[ 0.000000] PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260081
[ 0.000000] Kernel command line: root=/dev/sdb3 rootflags=data=writeback vga=0x314 resume=/dev/sdb2 console=ttyS0,115200n8 console=tty splash=silent PROFILE=default 1 maxcpus=1
[ 0.000000] Enabling fast FPU save and restore... done.
[ 0.000000] Enabling unmasked SIMD FPU exception support... done.
[ 0.000000] Initializing CPU#0
[ 0.000000] Preemptible RCU implementation.
[ 0.000000] CPU 0 irqstacks, hard=b0427000 soft=b0425000
[ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes)
[ 0.000000] Detected 2992.603 MHz processor.
[ 0.000999] Console: colour dummy device 80x25
[ 0.000999] console [tty0] enabled
[ 0.000999] console [ttyS0] enabled
[ 0.000999] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[ 0.000999] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000999] Memory: 1028968k/1048512k available (1998k kernel code, 18904k reserved, 955k data, 236k init, 0k highmem)
[ 0.000999] virtual kernel memory layout:
[ 0.000999] fixmap : 0xfff9b000 - 0xfffff000 ( 400 kB)
[ 0.000999] pkmap : 0xff800000 - 0xffc00000 (4096 kB)
[ 0.000999] vmalloc : 0xf0800000 - 0xff7fe000 ( 239 MB)
[ 0.000999] lowmem : 0xb0000000 - 0xefff0000 (1023 MB)
[ 0.000999] .init : 0xb03e7000 - 0xb0422000 ( 236 kB)
[ 0.000999] .data : 0xb02f3b26 - 0xb03e29a8 ( 955 kB)
[ 0.000999] .text : 0xb0100000 - 0xb02f3b26 (1998 kB)
[ 0.000999] Checking if this processor honours the WP bit even in supervisor mode...Ok.
[ 0.060993] Calibrating delay using timer specific routine.. 5987.55 BogoMIPS (lpj=2993775)
[ 0.063022] Security Framework initialized
[ 0.064010] Mount-cache hash table entries: 512
[ 0.065129] CPU: Trace cache: 12K uops, L1 D cache: 8K
[ 0.066992] CPU: L2 cache: 512K
[ 0.067992] CPU: Physical Processor ID: 0
[ 0.068993] Intel machine check architecture supported.
[ 0.069994] Intel machine check reporting enabled on CPU#0.
[ 0.070991] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
[ 0.071992] CPU0: Thermal monitoring enabled
[ 0.072993] Compat vDSO mapped to ffffe000.
[ 0.073996] Checking 'hlt' instruction... OK.
[ 0.079743] SMP alternatives: switching to UP code
[ 0.079998] Freeing SMP alternatives: 9k freed
[ 0.080991] ACPI: Core revision 20070126
[ 0.091025] CPU0: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
[ 0.094019] Total of 1 processors activated (5987.55 BogoMIPS).
[ 0.095110] ENABLING IO-APIC IRQs
[ 0.096152] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.107983] Brought up 1 CPUs
[ 0.108208] net_namespace: 552 bytes
[ 0.110058] NET: Registered protocol family 16
[ 0.111195] ACPI: bus type pci registered
[ 0.114122] PCI: PCI BIOS revision 2.10 entry at 0xfb980, last bus=2
[ 0.114984] PCI: Using configuration type 1
[ 0.115983] Setting up standard PCI resources
[ 0.139635] ACPI: Interpreter enabled
[ 0.139985] ACPI: (supports S0 S3 S4 S5)
[ 0.142261] ACPI: Using IOAPIC for interrupt routing
[ 0.148393] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 0.149406] pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH4 ACPI/GPIO/TCO
[ 0.149981] pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH4 GPIO
[ 0.151395] PCI: Transparent bridge - 0000:00:1e.0
[ 0.160727] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 7 9 10 11 12 14 15)
[ 0.164091] ACPI: PCI Interrupt Link [LNKB] (IRQs *3 4 5 7 9 10 11 12 14 15)
[ 0.168383] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 9 10 11 12 14 15)
[ 0.171666] ACPI: PCI Interrupt Link [LNKD] (IRQs *3 4 5 7 9 10 11 12 14 15)
[ 0.175277] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 10 *11 12 14 15)
[ 0.179656] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 *9 10 11 12 14 15)
[ 0.183050] ACPI: PCI Interrupt Link [LNK0] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 0.188275] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 *11 12 14 15)
[ 0.192014] Linux Plug and Play Support v0.97 (c) Adam Belay
[ 0.193005] pnp: PnP ACPI init
[ 0.193979] ACPI: bus type pnp registered
[ 0.199104] pnpacpi: exceeded the max number of mem resources: 12
[ 0.200035] pnp: PnP ACPI: found 13 devices
[ 0.200971] ACPI: ACPI bus type pnp unregistered
[ 0.202306] PCI: Using ACPI for IRQ routing
[ 0.202977] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
[ 0.226967] NetLabel: Initializing
[ 0.226969] NetLabel: domain hash size = 128
[ 0.227968] NetLabel: protocols = UNLABELED CIPSOv4
[ 0.228981] NetLabel: unlabeled traffic allowed by default
[ 0.230055] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 11
[ 0.232158] hpet0: 3 64-bit timers, 14318180 Hz
[ 0.235002] ACPI: RTC can wake from S4
[ 0.235966] Time: tsc clocksource has been installed.
[ 0.243966] system 00:01: ioport range 0xb78-0xb7b has been reserved
[ 0.243969] system 00:01: ioport range 0xf78-0xf7b has been reserved
[ 0.244971] system 00:01: ioport range 0xa78-0xa7b has been reserved
[ 0.245968] system 00:01: ioport range 0xe78-0xe7b has been reserved
[ 0.246967] system 00:01: ioport range 0xbbc-0xbbf has been reserved
[ 0.247967] system 00:01: ioport range 0xfbc-0xfbf has been reserved
[ 0.248968] system 00:01: ioport range 0x4d0-0x4d1 has been reserved
[ 0.249967] system 00:01: ioport range 0x200-0x200 has been reserved
[ 0.250967] system 00:01: ioport range 0x202-0x208 has been reserved
[ 0.251967] system 00:01: ioport range 0x320-0x32f has been reserved
[ 0.252966] system 00:01: ioport range 0x295-0x296 has been reserved
[ 0.254970] system 00:0b: ioport range 0x400-0x4bf could not be reserved
[ 0.255973] system 00:0c: iomem range 0xf0000-0xf3fff could not be reserved
[ 0.256966] system 00:0c: iomem range 0xf4000-0xf7fff could not be reserved
[ 0.257966] system 00:0c: iomem range 0xf8000-0xfbfff could not be reserved
[ 0.258966] system 00:0c: iomem range 0xfc000-0xfffff could not be reserved
[ 0.259966] system 00:0c: iomem range 0x3fff0000-0x3fffffff could not be reserved
[ 0.260965] system 00:0c: iomem range 0x0-0x9ffff could not be reserved
[ 0.261965] system 00:0c: iomem range 0x100000-0x3ffeffff could not be reserved
[ 0.262965] system 00:0c: iomem range 0xfec00000-0xfec00fff could not be reserved
[ 0.263965] system 00:0c: iomem range 0xfec01000-0xfed8ffff could not be reserved
[ 0.264965] system 00:0c: iomem range 0xfee00000-0xfee00fff could not be reserved
[ 0.265965] system 00:0c: iomem range 0xffb00000-0xffbfffff could not be reserved
[ 0.266964] system 00:0c: iomem range 0xfff00000-0xffffffff could not be reserved
[ 0.298911] PCI: Bridge: 0000:00:01.0
[ 0.298960] IO window: a000-afff
[ 0.299963] MEM window: 0xf8000000-0xf9ffffff
[ 0.300961] PREFETCH window: 0x00000000e8000000-0x00000000f7ffffff
[ 0.301963] PCI: Bridge: 0000:00:1e.0
[ 0.302959] IO window: b000-bfff
[ 0.303961] MEM window: 0xfa000000-0xfa0fffff
[ 0.304960] PREFETCH window: disabled.
[ 0.305994] NET: Registered protocol family 2
[ 0.325957] IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.326233] TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.328385] TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.329394] TCP: Hash tables configured (established 131072 bind 65536)
[ 0.329963] TCP reno registered
[ 0.337961] Unpacking initramfs... done
[ 0.574326] Freeing initrd memory: 6128k freed
[ 0.576000] Machine check exception polling timer started.
[ 0.577268] audit: initializing netlink socket (disabled)
[ 0.577937] type=2000 audit(1203584121.788:1): initialized
[ 0.579060] Total HugeTLB memory allocated, 0
[ 0.579988] VFS: Disk quotas dquot_6.5.1
[ 0.580947] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[ 0.582937] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
[ 0.583930] io scheduler noop registered
[ 0.584925] io scheduler anticipatory registered
[ 0.585924] io scheduler deadline registered
[ 0.586932] io scheduler cfq registered (default)
[ 0.589129] vesafb: framebuffer at 0xe8000000, mapped to 0xf0880000, using 1875k, total 16384k
[ 0.589925] vesafb: mode is 800x600x16, linelength=1600, pages=16
[ 0.590924] vesafb: protected mode interface info at c000:b544
[ 0.591926] vesafb: pmi: set display start = b00cb5d2, set palette = b00cb612
[ 0.592923] vesafb: scrolling: redraw
[ 0.593924] vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0
[ 0.612485] Console: switching to colour frame buffer device 100x37
[ 0.627919] fb0: VESA VGA frame buffer device
[ 0.638247] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
[ 0.639029] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 0.641591] 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 0.642200] PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
[ 0.642917] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[ 0.644280] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 0.645234] mice: PS/2 mouse device common for all mice
[ 0.675685] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 0.687684] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
[ 0.687709] rtc0: alarms up to one month
[ 0.688747] cpuidle: using governor ladder
[ 0.689689] cpuidle: using governor menu
[ 0.690751] oprofile: using NMI interrupt.
[ 0.693003] NET: Registered protocol family 1
[ 0.693759] p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available
[ 0.694686] Using IPI No-Shortcut mode
[ 0.695814] registered taskstats version 1
[ 0.696813] rtc_cmos 00:03: setting system clock to 2008-02-21 08:55:23 UTC (1203584123)
[ 0.698738] Freeing unused kernel memory: 236k freed
[ 0.699701] Write protecting the kernel text: 2000k
[ 0.700690] Write protecting the kernel read-only data: 792k
[ 0.762206] ACPI: ACPI0007:00 is registered as cooling_device0
[ 0.768027] ACPI: LNXTHERM:01 is registered as thermal_zone0
[ 0.768870] ACPI: Thermal Zone [THRM] (40 C)
[ 0.785027] SCSI subsystem initialized
[ 0.806767] ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
[ 0.808835] scsi0 : ata_piix
[ 0.809721] scsi1 : ata_piix
[ 0.812130] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
[ 0.812702] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
[ 1.290553] ata1.00: ATA-6: ST3160021A, 3.04, max UDMA/100
[ 1.290558] ata1.00: 312581808 sectors, multi 16: LBA48
[ 1.291580] ata1.01: ATAPI: BENQ DVD DD DW1625, BBIA, max UDMA/33
[ 1.314801] ata1.00: configured for UDMA/100
[ 1.473512] ata1.01: configured for UDMA/33
[ 3.788245] ata2.00: ATA-6: ST3120022A, 3.06, max UDMA/100
[ 3.788248] ata2.00: 234441648 sectors, multi 16: LBA48
[ 3.811575] ata2.00: configured for UDMA/100
[ 3.823087] scsi 0:0:0:0: Direct-Access ATA ST3160021A 3.04 PQ: 0 ANSI: 5
[ 3.825453] scsi 0:0:1:0: CD-ROM BENQ DVD DD DW1625 BBIA PQ: 0 ANSI: 5
[ 3.825587] scsi 1:0:0:0: Direct-Access ATA ST3120022A 3.06 PQ: 0 ANSI: 5
[ 3.831138] ACPI: PNP0C0B:00 is registered as cooling_device1
[ 3.831481] ACPI: Fan [FAN] (on)
[ 3.856766] BIOS EDD facility v0.16 2004-Jun-25, 6 devices found
[ 3.999333] Driver 'sd' needs updating - please use bus_type methods
[ 3.999551] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[ 4.000464] sd 0:0:0:0: [sda] Write Protect is off
[ 4.001459] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4.002528] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[ 4.003475] sd 0:0:0:0: [sda] Write Protect is off
[ 4.004458] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4.005586] sda: sda1 sda2 < sda5 sda6 >
[ 4.048569] sd 0:0:0:0: [sda] Attached SCSI disk
[ 4.049519] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
[ 4.050439] sd 1:0:0:0: [sdb] Write Protect is off
[ 4.051473] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4.052505] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
[ 4.053436] sd 1:0:0:0: [sdb] Write Protect is off
[ 4.054450] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4.055498] sdb: sdb1 sdb2 sdb3
[ 4.063528] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 37.473476] SysRq : Show State
[ 37.474338] task PC stack pid father
[ 37.474338] init S ef836eac 0 1 0
[ 37.474338] ef836f00 00000082 ffffffff ef836eac b01224d3 00000000 00000000 00000001
[ 37.474338] e80cb97b 00000000 b0421180 b0421180 b0421180 ef835020 ef83527c b180b180
[ 37.474338] 00000000 ef836000 ee6e0c80 b1031ae0 00000000 00000000 ee5b9148 ef836f14
[ 37.474338] Call Trace:
[ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
[ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5
[ 37.474338] [<b01d218e>] ? security_task_wait+0xf/0x11
[ 37.474338] [<b0129e1f>] do_wait+0x470/0xa1a
[ 37.474338] [<b01249e9>] ? wake_up_new_task+0x77/0x91
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b012a431>] sys_wait4+0x68/0x9f
[ 37.474338] [<b012a48f>] sys_waitpid+0x27/0x29
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] [<b02f0000>] ? relay_hotcpu_callback+0x51/0xb1
[ 37.474338] =======================
[ 37.474338] kthreadd S 00000000 0 2 0
[ 37.474338] ef83bfc8 00000046 ef8b50a0 00000000 00000092 ef83bf80 00000000 00000000
[ 37.474338] ee5f776c 00000000 b0421180 b0421180 b0421180 ef83a0a0 ef83a2fc b180b180
[ 37.474338] 00000000 ef83b000 ee564d00 00000000 00000ae7 00000000 ee5f6e1c ee5f6e20
[ 37.474338] Call Trace:
[ 37.474338] [<b011d295>] ? complete+0x43/0x4b
[ 37.474338] [<b0139450>] kthreadd+0x13a/0x13f
[ 37.474338] [<b0139316>] ? kthreadd+0x0/0x13f
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] migration/0 S 0102c262 0 3 2
[ 37.474338] ef83df98 00000046 ef83c120 0102c262 00000246 ef83df4c 00000000 ef83df4c
[ 37.474338] 066fb2fc 00000000 b0421180 b0421180 b0421180 ef83c120 ef83c37c b180b180
[ 37.474338] 00000000 ef83d000 b03ba200 00000000 00000000 00000000 00000000 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210
[ 37.474338] [<b012304e>] migration_thread+0x124/0x210
[ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] ksoftirqd/0 R running 0 4 2
[ 37.474338] f4744940 00000008 ef841f14 b01424d9 f4650f99 00000008 00000000 01808638
[ 37.474338] b18086c0 b1808638 b1808600 ef841f50 b013ca83 b1808600 b1808604 f4744940
[ 37.474338] 00000008 f465092e 00000008 f465092e 00000046 b1807120 00000000 00000246
[ 37.474338] Call Trace:
[ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61
[ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164
[ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79
[ 37.474338] [<b0113103>] ? smp_apic_timer_interrupt+0x5c/0x89
[ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79
[ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30
[ 37.474338] [<b01079d9>] ? do_softirq+0x35/0x8c
[ 37.474338] [<b012c515>] ? ksoftirqd+0xad/0x17f
[ 37.474338] [<b012c468>] ? ksoftirqd+0x0/0x17f
[ 37.474338] [<b01392f4>] ? kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] ? kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] events/0 R running 0 5 2
[ 37.474338] ef843fa0 00000046 ef8112c0 ef843f44 b0136d0e 00000a75 00000000 00000a75
[ 37.474338] 60e9afad 00000008 b0421180 b0421180 b0421180 ef8420a0 ef8422fc b180b180
[ 37.474338] 00000000 ef843000 ee6e0880 b18089e0 3399ec88 00000002 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b0136d0e>] ? queue_delayed_work+0x40/0x48
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] khelper S b02f12e5 0 6 2
[ 37.474338] ef845fa0 00000046 ef845f3c b02f12e5 ef836d40 ef836d44 00000000 ef845f44
[ 37.474338] 2a9ba799 00000000 b0421180 b0421180 b0421180 ef844120 ef84437c b180b180
[ 37.474338] 00000000 ef845000 ee6e0880 ef822480 00000000 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] kblockd/0 S ef887f4c 0 35 2
[ 37.474338] ef887fa0 00000046 00000246 ef887f4c b01224ed ef887f4c 00000000 00000000
[ 37.474338] 0800f107 00000000 b0421180 b0421180 b0421180 ef886120 ef88637c b180b180
[ 37.474338] 00000000 ef887000 b03ba200 08006f94 00000c39 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] kacpid S ef88cf4c 0 37 2
[ 37.474338] ef88cfa0 00000046 00000246 ef88cf4c b01224ed ef88cf4c 00000000 00000000
[ 37.474338] 0801a506 00000000 b0421180 b0421180 b0421180 ef88b0a0 ef88b2fc b180b180
[ 37.474338] 00000000 ef88c000 b03ba200 08018266 00000000 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] kacpi_notify S ef88ff4c 0 38 2
[ 37.474338] ef88ffa0 00000046 00000246 ef88ff4c b01224ed ef88ff4c 00000000 00000000
[ 37.474338] 0886f6cd 00000000 b0421180 b0421180 b0421180 ef88e120 ef88e37c b180b180
[ 37.474338] 00000000 ef88f000 b03ba200 0801c44e 00000000 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] cqueue/0 S ef8e0f4c 0 111 2
[ 37.474338] ef8e0fa0 00000046 00000246 ef8e0f4c b01224ed ef8e0f4c 00000000 00000000
[ 37.474338] 0c0b0eaf 00000000 b0421180 b0421180 b0421180 ef8bd0a0 ef8bd2fc b180b180
[ 37.474338] 00000000 ef8e0000 b03ba200 0c0abebc 00000cf2 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] kseriod S b011d321 0 115 2
[ 37.474338] ef8e4f8c 00000046 ef8e4f34 b011d321 00000000 00000000 00000000 ef88d440
[ 37.474338] 28fd3c39 00000000 b0421180 b0421180 b0421180 ef874120 ef87437c b180b180
[ 37.474338] 00000000 ef8e4000 b03ba200 00000000 00000000 00000000 b03cc440 ef88d43c
[ 37.474338] Call Trace:
[ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0255c33>] serio_thread+0xc2/0x32f
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0255b71>] ? serio_thread+0x0/0x32f
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] pdflush S 00000000 0 148 2
[ 37.474338] ee52cfa4 00000046 b180b180 00000000 ee52c000 ee52cf68 00000000 000001fa
[ 37.474338] 22833768 00000000 b0421180 b0421180 b0421180 ef864120 ef86437c b180b180
[ 37.474338] 00000000 ee52c000 b03ba200 b012229c 00000000 00000000 ef864120 b180b180
[ 37.474338] Call Trace:
[ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8
[ 37.474338] [<b02f2c57>] ? _spin_unlock_irqrestore+0x1f/0x21
[ 37.474338] [<b011eb19>] ? set_user_nice+0xcf/0xdf
[ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4
[ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] pdflush S 00000286 0 149 2
[ 37.474338] ee52dfa4 00000046 b03c16e0 00000286 ee52df54 b0130184 00000000 00000000
[ 37.474338] 60e9f95c 00000008 b0421180 b0421180 b0421180 ef8620a0 ef8622fc b180b180
[ 37.474338] 00000000 ee52d000 ee6e0680 00000000 00000000 00000000 00000000 00000000
[ 37.474338] Call Trace:
[ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf
[ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4
[ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] kswapd0 S 00000000 0 150 2
[ 37.474338] ee52ef2c 00000046 00000000 00000000 00000000 00000000 00000000 000004fa
[ 37.474338] 22918b1f 00000000 b0421180 b0421180 b0421180 ef85e020 ef85e27c b180b180
[ 37.474338] 00000000 ee52e000 b03ba200 b012229c 00000000 00000000 b1807b00 b180b180
[ 37.474338] Call Trace:
[ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8
[ 37.474338] [<b011ab2c>] ? __dequeue_entity+0x31/0x35
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2
[ 37.474338] [<b016b38e>] kswapd+0x490/0x4a2
[ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b011d295>] ? complete+0x43/0x4b
[ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] aio/0 S ee530f4c 0 151 2
[ 37.474338] ee530fa0 00000046 00000246 ee530f4c b01224ed ee530f4c 00000000 00000000
[ 37.474338] 22bec65c 00000000 b0421180 b0421180 b0421180 ef85c120 ef85c37c b180b180
[ 37.474338] 00000000 ee530000 b03ba200 2291af16 00000000 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] kpsmoused S ee653f4c 0 378 2
[ 37.474338] ee653fa0 00000046 00000246 ee653f4c b01224ed ee653f4c 00000000 00000000
[ 37.474338] 26811a10 00000000 b0421180 b0421180 b0421180 ee6ba020 ee6ba27c b180b180
[ 37.474338] 00000000 ee653000 b03ba200 2680eca0 00000000 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] kondemand/0 S ee5b0f4c 0 384 2
[ 37.474338] ee5b0fa0 00000046 00000246 ee5b0f4c b01224ed ee5b0f4c 00000000 00000000
[ 37.474338] 292b36fc 00000000 b0421180 b0421180 b0421180 ee5cc120 ee5cc37c b180b180
[ 37.474338] 00000000 ee5b0000 b03ba200 290d4c43 00000b94 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] blogd S 00000000 0 423 1
[ 37.474338] ee6d0b10 00000082 000004b3 00000000 b1807120 15b2b2c0 00000000 ef8420a0
[ 37.474338] 60e9cf9d 00000008 b0421180 b0421180 b0421180 ee5e60a0 ee5e62fc b180b180
[ 37.474338] 00000000 ee6d0000 ee6e0680 b0130059 339a08bf 00000002 ee6d0b20 00000286
[ 37.474338] Call Trace:
[ 37.474338] [<b0130059>] ? lock_timer_base+0x1f/0x40
[ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf
[ 37.474338] [<b02f14ba>] schedule_timeout+0x44/0xa4
[ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
[ 37.474338] [<b012fcda>] ? process_timeout+0x0/0xa
[ 37.474338] [<b02f14b5>] ? schedule_timeout+0x3f/0xa4
[ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
[ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
[ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
[ 37.474338] [<b020e60c>] ? cfb_fillrect+0x138/0x2bd
[ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3
[ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa
[ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf
[ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49
[ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30
[ 37.474338] [<b011b82b>] ? hrtick_start_fair+0x10d/0x144
[ 37.474338] [<b011b900>] ? enqueue_task_fair+0x52/0x56
[ 37.474338] [<b011a3f1>] ? enqueue_task+0x4c/0x58
[ 37.474338] [<b011eb76>] ? try_to_wake_up+0x4d/0x1be
[ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
[ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c
[ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
[ 37.474338] [<b01042b4>] ? do_notify_resume+0x55/0x79e
[ 37.474338] [<b0127ad4>] ? release_console_sem+0x1c4/0x1d4
[ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
[ 37.474338] [<b0232d6a>] ? tty_ldisc_deref+0x55/0x6e
[ 37.474338] [<b018e3fc>] sys_select+0xd7/0x1a2
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] =======================
[ 37.474338] blogd R running 0 424 1
[ 37.474338] ee6cbdf8 00000082 b18086c0 ee6cbda0 b01e99b7 b1808640 00000000 b180b5f4
[ 37.474338] 60e8dd8e 00000008 b0421180 b0421180 b0421180 ee5d70a0 ee5d72fc b180b180
[ 37.474338] 00000000 ee6cb000 ee6e0680 ee6cbdf8 375f44e1 00000003 ee6cbdec b041e600
[ 37.474338] Call Trace:
[ 37.474338] [<b01e99b7>] ? rb_insert_color+0x77/0xd8
[ 37.474338] [<b0143bfe>] futex_wait+0x285/0x2d3
[ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c
[ 37.474338] [<b0143b3d>] ? futex_wait+0x1c4/0x2d3
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b0144c2d>] do_futex+0x20d/0xa5f
[ 37.474338] [<b0168667>] ? pagevec_lookup_tag+0x25/0x2e
[ 37.474338] [<b016158a>] ? wait_on_page_writeback_range+0x5b/0xf0
[ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa
[ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf
[ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49
[ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30
[ 37.474338] [<b0145502>] sys_futex+0x83/0xe8
[ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] =======================
[ 37.474338] ata/0 S ee7b3f8c 0 438 2
[ 37.474338] ee7b3fa0 00000046 00000002 ee7b3f8c ee7b3f80 ee7b3f44 00000000 002dae6f
[ 37.474338] e3f3f21f 00000000 b0421180 b0421180 b0421180 ee6be120 ee6be37c b180b180
[ 37.474338] 00000000 ee7b3000 ee6e0a80 ef906480 00000000 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] ata_aux S ee7b2f4c 0 439 2
[ 37.474338] ee7b2fa0 00000046 ffffffff ee7b2f4c b01224d3 00000000 00000000 00000000
[ 37.474338] 2fc09ed2 00000000 b0421180 b0421180 b0421180 ee5dd020 ee5dd27c b180b180
[ 37.474338] 00000000 ee7b2000 ee6e0280 2f81ef36 00000000 00000000 b0421180 b0421180
[ 37.474338] Call Trace:
[ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
[ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
[ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
[ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
[ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] scsi_eh_0 S ef906520 0 445 2
[ 37.474338] ee585f64 00000046 ef906524 ef906520 00000000 00000092 00000000 b011d321
[ 37.474338] e4331293 00000000 b0421180 b0421180 b0421180 ee5e4020 ee5e427c b180b180
[ 37.474338] 00000000 ee585000 ee6e0480 ee702008 00108a29 00000000 f084c295 ee702000
[ 37.474338] Call Trace:
[ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
[ 37.474338] [<f084c295>] ? __scsi_iterate_devices+0x5d/0x6b [scsi_mod]
[ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
[ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
[ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod]
[ 37.474338] [<b011d295>] ? complete+0x43/0x4b
[ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] scsi_eh_1 S ee622520 0 446 2
[ 37.474338] ee5acf64 00000046 ee622524 ee622520 00000000 00000092 00000000 b011d321
[ 37.474338] e4333e8f 00000000 b0421180 b0421180 b0421180 ee5d10a0 ee5d12fc b180b180
[ 37.474338] 00000000 ee5ac000 ee6e0480 ee702808 00000000 00000000 f084c295 ee702800
[ 37.474338] Call Trace:
[ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
[ 37.474338] [<f084c295>] ? __scsi_iterate_devices+0x5d/0x6b [scsi_mod]
[ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
[ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
[ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod]
[ 37.474338] [<b011d295>] ? complete+0x43/0x4b
[ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
[ 37.474338] [<b01392f4>] kthread+0x37/0x59
[ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
[ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
[ 37.474338] =======================
[ 37.474338] udevd S 00000000 0 461 1
[ 37.474338] ee5d8b10 00000082 00000000 00000000 00000000 00000000 00000000 ee5d8ab4
[ 37.474338] 93a637c5 00000001 b0421180 b0421180 b0421180 ee6c20a0 ee6c22fc b180b180
[ 37.474338] 00000000 ee5d8000 ee6e0480 b1807980 00000000 00000000 000f41a9 00000000
[ 37.474338] Call Trace:
[ 37.474338] [<b011b920>] ? __update_rq_clock+0x1c/0x157
[ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
[ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
[ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb
[ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f
[ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
[ 37.474338] [<b013fdbc>] ? clocksource_get_next+0x3d/0x44
[ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108
[ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61
[ 37.474338] [<b01eab55>] ? number+0x2a3/0x2b5
[ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30
[ 37.474338] [<b019007b>] ? fcntl_getlk64+0x4e/0x159
[ 37.474338] [<b01eb3a7>] ? vsnprintf+0x2e8/0x5ea
[ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8
[ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
[ 37.474338] [<b016625f>] ? __alloc_pages+0x57/0x32d
[ 37.474338] [<b016c409>] ? __inc_zone_page_state+0x18/0x1a
[ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
[ 37.474338] [<b0165c04>] ? free_hot_page+0xa/0xc
[ 37.474338] [<b0168bf3>] ? put_page+0x2d/0xac
[ 37.474338] [<b0176340>] ? free_page_and_swap_cache+0x1e/0x3e
[ 37.474338] [<b016e44a>] ? unmap_vmas+0x317/0x54b
[ 37.474338] [<b0117410>] ? pgd_dtor+0x0/0x4a
[ 37.474338] [<b011740e>] ? check_pgt_cache+0x1e/0x20
[ 37.474338] [<b01711bf>] ? unmap_region+0xdc/0x12f
[ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] =======================
[ 37.474338] udevsettle R running 0 465 1
[ 37.474338] ee524f1c 00000082 ee524ed4 00000000 00010542 b1808640 00000000 ee6cbe14
[ 37.474338] 60e8f831 00000008 b0421180 b0421180 b0421180 ee5ce020 ee5ce27c b180b180
[ 37.474338] 00000000 ee524000 ee6e0880 ee524f1c 33bd800d 00000003 00000008 b041e600
[ 37.474338] Call Trace:
[ 37.474338] [<b02f1a50>] do_nanosleep+0x70/0x9a
[ 37.474338] [<b013c7ae>] hrtimer_nanosleep+0x4c/0xaf
[ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c
[ 37.474338] [<b02f1a3d>] ? do_nanosleep+0x5d/0x9a
[ 37.474338] [<b013c868>] sys_nanosleep+0x57/0x5b
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] [<b02f0000>] ? relay_hotcpu_callback+0x51/0xb1
[ 37.474338] =======================
[ 37.474338] udevd S 00000000 0 873 461
[ 37.474338] e7c7eb10 00000082 00000000 00000000 00000000 00000000 00000000 00000000
[ 37.474338] 9385214a 00000001 b0421180 b0421180 b0421180 ef8d40a0 ef8d42fc b180b180
[ 37.474338] 00000000 e7c7e000 ee6e0280 00000000 00000000 00000000 00000000 00000000
[ 37.474338] Call Trace:
[ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
[ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
[ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb
[ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f
[ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
[ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
[ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
[ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8
[ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
[ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
[ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3
[ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c
[ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
[ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
[ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5
[ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4
[ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55
[ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2
[ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69
[ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] =======================
[ 37.474338] udevd S 00000000 0 876 461
[ 37.474338] ee780b10 00000086 00000000 00000000 00000000 00000000 00000000 00000000
[ 37.474338] 93a2c2db 00000001 b0421180 b0421180 b0421180 ef8a1120 ef8a137c b180b180
[ 37.474338] 00000000 ee780000 ee564100 00000000 00000000 00000000 00000000 00000000
[ 37.474338] Call Trace:
[ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
[ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
[ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb
[ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f
[ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
[ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b011a8a0>] ? update_curr+0x12f/0x136
[ 37.474338] [<b012381e>] ? task_tick_fair+0x59/0x86
[ 37.474338] [<b01227b3>] ? scheduler_tick+0x268/0x3cc
[ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa
[ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf
[ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108
[ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61
[ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164
[ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79
[ 37.474338] [<b0113103>] ? smp_apic_timer_interrupt+0x5c/0x89
[ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
[ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
[ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8
[ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
[ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
[ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3
[ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c
[ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
[ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
[ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5
[ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4
[ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55
[ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2
[ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69
[ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] =======================
[ 37.474338] scsi_id D 00000000 0 878 873
[ 37.474338] ee60bb7c 00000086 00000046 00000000 00000000 ef904000 00000000 ee66d910
[ 37.474338] 93796336 00000001 b0421180 b0421180 b0421180 ee5d3120 ee5d337c b180b180
[ 37.474338] 00000000 ee60b000 ee564700 ee5c2800 0000065a 00000000 ee60bb5c b024e1b8
[ 37.474338] Call Trace:
[ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11
[ 37.474338] [<f0852878>] ? scsi_request_fn+0x218/0x345 [scsi_mod]
[ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
[ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b
[ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d
[ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a
[ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14
[ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c
[ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29
[ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a
[ 37.474338] [<b01d9672>] ? blk_rq_bio_prep+0x9e/0xb2
[ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49
[ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4
[ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1
[ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
[ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5
[ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
[ 37.474338] [<b0188770>] ? do_lookup+0x5a/0x162
[ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod]
[ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e
[ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814
[ 37.474338] [<b01e7574>] ? kobject_get+0x12/0x17
[ 37.474338] [<b0160fc3>] ? find_lock_page+0x72/0x8d
[ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449
[ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b
[ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28
[ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377
[ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58
[ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4
[ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21
[ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21
[ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71
[ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290
[ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb
[ 37.474338] [<b0189bfc>] ? putname+0x25/0x30
[ 37.474338] [<b01800e0>] ? do_sys_open+0xb1/0xc7
[ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] =======================
[ 37.474338] scsi_id D ee621bd4 0 879 876
[ 37.474338] ee636b7c 00000086 ee621bac ee621bd4 00000000 f084c4f9 00000000 ee66d0c8
[ 37.474338] 9394eaa1 00000001 b0421180 b0421180 b0421180 ef888020 ef88827c b180b180
[ 37.474338] 00000000 ee636000 ee564300 ee6e8c00 00000000 00000000 ee636b5c b024e1b8
[ 37.474338] Call Trace:
[ 37.474338] [<f084c4f9>] ? scsi_done+0x0/0x19 [scsi_mod]
[ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11
[ 37.474338] [<f0852878>] ? scsi_request_fn+0x218/0x345 [scsi_mod]
[ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
[ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b
[ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d
[ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a
[ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e
[ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
[ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14
[ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c
[ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29
[ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a
[ 37.474338] [<b01d9672>] ? blk_rq_bio_prep+0x9e/0xb2
[ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49
[ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4
[ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1
[ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
[ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5
[ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
[ 37.474338] [<b0188770>] ? do_lookup+0x5a/0x162
[ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod]
[ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e
[ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814
[ 37.474338] [<b0168bbe>] ? activate_page+0xb1/0xb9
[ 37.474338] [<b0168cce>] ? mark_page_accessed+0x27/0x2e
[ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449
[ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b
[ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28
[ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377
[ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58
[ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4
[ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21
[ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21
[ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71
[ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290
[ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb
[ 37.474338] [<b0189bfc>] ? putname+0x25/0x30
[ 37.474338] [<b01800e0>] ? do_sys_open+0xb1/0xc7
[ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63
[ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
[ 37.474338] =======================
[ 37.474338] Sched Debug Version: v0.07, 2.6.25-rc2-smp #14
[ 37.474338] now at 44900.194289 msecs
[ 37.474338] .sysctl_sched_latency : 20.000000
[ 37.474338] .sysctl_sched_min_granularity : 4.000000
[ 37.474338] .sysctl_sched_wakeup_granularity : 10.000000
[ 37.474338] .sysctl_sched_batch_wakeup_granularity : 10.000000
[ 37.474338] .sysctl_sched_child_runs_first : 0.000001
[ 37.474338] .sysctl_sched_features : 39
[ 37.474338]
[ 37.474338] cpu#0, 2992.603 MHz
[ 37.474338] .nr_running : 4
[ 37.474338] .load : 2048
[ 37.474338] .nr_switches : 3363
[ 37.474338] .nr_load_updates : 34557
[ 37.474338] .nr_uninterruptible : 2
[ 37.474338] .jiffies : 4294705811
[ 37.474338] .next_balance : 4294.705968
[ 37.474338] .curr->pid : 4
[ 37.474338] .clock : 37474.338672
[ 37.474338] .idle_clock : 3065.714769
[ 37.474338] .prev_clock_raw : 78772.978492
[ 37.474338] .clock_warps : 0
[ 37.474338] .clock_overflows : 3996
[ 37.474338] .clock_underflows : 31781
[ 37.474338] .clock_deep_idle_events : 1
[ 37.474338] .clock_max_delta : 0.999848
[ 37.474338] .cpu_load[0] : 2048
[ 37.474338] .cpu_load[1] : 2048
[ 37.474338] .cpu_load[2] : 2048
[ 37.474338] .cpu_load[3] : 2048
[ 37.474338] .cpu_load[4] : 2048
[ 37.474338]
[ 37.474338] cfs_rq
[ 37.474338] .exec_clock : 34293.783916
[ 37.474338] .MIN_vruntime : 0.000001
[ 37.474338] .min_vruntime : 17146.893105
[ 37.474338] .max_vruntime : 0.000001
[ 37.474338] .spread : 0.000000
[ 37.474338] .spread0 : 0.000000
[ 37.474338] .nr_running : 1
[ 37.474338] .load : 2048
[ 37.474338] .bkl_count : 405
[ 37.474338] .nr_spread_over : 0
[ 37.474338]
[ 37.474338] cfs_rq
[ 37.474338] .exec_clock : 34293.783916
[ 37.474338] .MIN_vruntime : 13830.833996
[ 37.474338] .min_vruntime : 17146.893105
[ 37.474338] .max_vruntime : 13830.833996
[ 37.474338] .spread : 0.000000
[ 37.474338] .spread0 : 0.000000
[ 37.474338] .nr_running : 4
[ 37.474338] .load : 8290
[ 37.474338] .bkl_count : 405
[ 37.474338] .nr_spread_over : 6
[ 37.474338]
[ 37.474338] runnable tasks:
[ 37.474338] task PID tree-key switches prio exec-runtime sum-exec sum-sleep
[ 37.474338] ----------------------------------------------------------------------------------------------------------
[ 37.474338] R ksoftirqd/0 4 14329.115724 37 115 14329.115724 33248.417353 4113.418770
[ 37.474338] events/0 5 13830.833996 35 115 13830.833996 0.284125 4265.748410
[ 37.474338] blogd 424 13830.833996 135 120 13830.833996 0.460905 1680.558420
[ 37.474338] udevsettle 465 13830.833996 17 120 13830.833996 0.838894 643.594622
[ 37.474338]

On Thu, Feb 21 2008, Mike Galbraith wrote:
> Greetings,
>
> K3b recently (9a4c854..5d9c4a7 pull) began terminally griping about
> buffer underrun upon every attempt to burn a CD. I can't fully bisect
> the problem because intervening kernels hang soft during boot. Using
> git bisect visualize, and converting to postable text:
>
> bisect/bad block: add request->raw_data_len (6b00769fe1502b4ad97bb327ef7ac971b208bfb5)
> bisect block: update bio according to DMA alignment padding (40b01b9bbdf51ae543a04744283bf2d56c4a6afa)
> libata: update ATAPI overflow draining
> bisect/good-e164094964e6e20fe7fce418e06a9dce952bb7a4

Tejun?

>
> Serial console log of hung kernel 40b01b9bbdf51ae543a04744283bf2d56c4a6afa below
>
> [ 0.000000] Linux version 2.6.25-rc2-smp (root@homer) (gcc version 4.2.1 (SUSE Linux)) #14 SMP PREEMPT Thu Feb 21 08:49:51 CET 2008
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> [ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
> [ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
> [ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
> [ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
> [ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> [ 0.000000] 0MB HIGHMEM available.
> [ 0.000000] 1023MB LOWMEM available.
> [ 0.000000] Scan SMP from b0000000 for 1024 bytes.
> [ 0.000000] Scan SMP from b009fc00 for 1024 bytes.
> [ 0.000000] Scan SMP from b00f0000 for 65536 bytes.
> [ 0.000000] found SMP MP-table at [b00f5320] 000f5320
> [ 0.000000] Zone PFN ranges:
> [ 0.000000] DMA 0 -> 4096
> [ 0.000000] Normal 4096 -> 262128
> [ 0.000000] HighMem 262128 -> 262128
> [ 0.000000] Movable zone start PFN for each node
> [ 0.000000] early_node_map[1] active PFN ranges
> [ 0.000000] 0: 0 -> 262128
> [ 0.000000] DMI 2.3 present.
> [ 0.000000] ACPI: RSDP 000F6CC0, 0014 (r0 IntelR)
> [ 0.000000] ACPI: RSDT 3FFF3000, 002C (r1 IntelR AWRDACPI 42302E31 AWRD 0)
> [ 0.000000] ACPI: FACP 3FFF3040, 0074 (r1 IntelR AWRDACPI 42302E31 AWRD 0)
> [ 0.000000] ACPI: DSDT 3FFF30C0, 4139 (r1 INTELR AWRDACPI 1000 MSFT 100000E)
> [ 0.000000] ACPI: FACS 3FFF0000, 0040
> [ 0.000000] ACPI: APIC 3FFF7200, 0068 (r1 IntelR AWRDACPI 42302E31 AWRD 0)
> [ 0.000000] ACPI: PM-Timer IO Port: 0x408
> [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> [ 0.000000] Processor #0 15:2 APIC version 20
> [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> [ 0.000000] Processor #1 15:2 APIC version 20
> [ 0.000000] WARNING: maxcpus limit of 1 reached. Processor ignored.
> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
> [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
> [ 0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> [ 0.000000] Enabling APIC mode: Flat. Using 1 I/O APICs
> [ 0.000000] Using ACPI (MADT) for SMP configuration information
> [ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
> [ 0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
> [ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
> [ 0.000000] PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
> [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260081
> [ 0.000000] Kernel command line: root=/dev/sdb3 rootflags=data=writeback vga=0x314 resume=/dev/sdb2 console=ttyS0,115200n8 console=tty splash=silent PROFILE=default 1 maxcpus=1
> [ 0.000000] Enabling fast FPU save and restore... done.
> [ 0.000000] Enabling unmasked SIMD FPU exception support... done.
> [ 0.000000] Initializing CPU#0
> [ 0.000000] Preemptible RCU implementation.
> [ 0.000000] CPU 0 irqstacks, hard=b0427000 soft=b0425000
> [ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes)
> [ 0.000000] Detected 2992.603 MHz processor.
> [ 0.000999] Console: colour dummy device 80x25
> [ 0.000999] console [tty0] enabled
> [ 0.000999] console [ttyS0] enabled
> [ 0.000999] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
> [ 0.000999] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> [ 0.000999] Memory: 1028968k/1048512k available (1998k kernel code, 18904k reserved, 955k data, 236k init, 0k highmem)
> [ 0.000999] virtual kernel memory layout:
> [ 0.000999] fixmap : 0xfff9b000 - 0xfffff000 ( 400 kB)
> [ 0.000999] pkmap : 0xff800000 - 0xffc00000 (4096 kB)
> [ 0.000999] vmalloc : 0xf0800000 - 0xff7fe000 ( 239 MB)
> [ 0.000999] lowmem : 0xb0000000 - 0xefff0000 (1023 MB)
> [ 0.000999] .init : 0xb03e7000 - 0xb0422000 ( 236 kB)
> [ 0.000999] .data : 0xb02f3b26 - 0xb03e29a8 ( 955 kB)
> [ 0.000999] .text : 0xb0100000 - 0xb02f3b26 (1998 kB)
> [ 0.000999] Checking if this processor honours the WP bit even in supervisor mode...Ok.
> [ 0.060993] Calibrating delay using timer specific routine.. 5987.55 BogoMIPS (lpj=2993775)
> [ 0.063022] Security Framework initialized
> [ 0.064010] Mount-cache hash table entries: 512
> [ 0.065129] CPU: Trace cache: 12K uops, L1 D cache: 8K
> [ 0.066992] CPU: L2 cache: 512K
> [ 0.067992] CPU: Physical Processor ID: 0
> [ 0.068993] Intel machine check architecture supported.
> [ 0.069994] Intel machine check reporting enabled on CPU#0.
> [ 0.070991] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
> [ 0.071992] CPU0: Thermal monitoring enabled
> [ 0.072993] Compat vDSO mapped to ffffe000.
> [ 0.073996] Checking 'hlt' instruction... OK.
> [ 0.079743] SMP alternatives: switching to UP code
> [ 0.079998] Freeing SMP alternatives: 9k freed
> [ 0.080991] ACPI: Core revision 20070126
> [ 0.091025] CPU0: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
> [ 0.094019] Total of 1 processors activated (5987.55 BogoMIPS).
> [ 0.095110] ENABLING IO-APIC IRQs
> [ 0.096152] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> [ 0.107983] Brought up 1 CPUs
> [ 0.108208] net_namespace: 552 bytes
> [ 0.110058] NET: Registered protocol family 16
> [ 0.111195] ACPI: bus type pci registered
> [ 0.114122] PCI: PCI BIOS revision 2.10 entry at 0xfb980, last bus=2
> [ 0.114984] PCI: Using configuration type 1
> [ 0.115983] Setting up standard PCI resources
> [ 0.139635] ACPI: Interpreter enabled
> [ 0.139985] ACPI: (supports S0 S3 S4 S5)
> [ 0.142261] ACPI: Using IOAPIC for interrupt routing
> [ 0.148393] ACPI: PCI Root Bridge [PCI0] (0000:00)
> [ 0.149406] pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH4 ACPI/GPIO/TCO
> [ 0.149981] pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH4 GPIO
> [ 0.151395] PCI: Transparent bridge - 0000:00:1e.0
> [ 0.160727] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 7 9 10 11 12 14 15)
> [ 0.164091] ACPI: PCI Interrupt Link [LNKB] (IRQs *3 4 5 7 9 10 11 12 14 15)
> [ 0.168383] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 9 10 11 12 14 15)
> [ 0.171666] ACPI: PCI Interrupt Link [LNKD] (IRQs *3 4 5 7 9 10 11 12 14 15)
> [ 0.175277] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 10 *11 12 14 15)
> [ 0.179656] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 *9 10 11 12 14 15)
> [ 0.183050] ACPI: PCI Interrupt Link [LNK0] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
> [ 0.188275] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 *11 12 14 15)
> [ 0.192014] Linux Plug and Play Support v0.97 (c) Adam Belay
> [ 0.193005] pnp: PnP ACPI init
> [ 0.193979] ACPI: bus type pnp registered
> [ 0.199104] pnpacpi: exceeded the max number of mem resources: 12
> [ 0.200035] pnp: PnP ACPI: found 13 devices
> [ 0.200971] ACPI: ACPI bus type pnp unregistered
> [ 0.202306] PCI: Using ACPI for IRQ routing
> [ 0.202977] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
> [ 0.226967] NetLabel: Initializing
> [ 0.226969] NetLabel: domain hash size = 128
> [ 0.227968] NetLabel: protocols = UNLABELED CIPSOv4
> [ 0.228981] NetLabel: unlabeled traffic allowed by default
> [ 0.230055] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 11
> [ 0.232158] hpet0: 3 64-bit timers, 14318180 Hz
> [ 0.235002] ACPI: RTC can wake from S4
> [ 0.235966] Time: tsc clocksource has been installed.
> [ 0.243966] system 00:01: ioport range 0xb78-0xb7b has been reserved
> [ 0.243969] system 00:01: ioport range 0xf78-0xf7b has been reserved
> [ 0.244971] system 00:01: ioport range 0xa78-0xa7b has been reserved
> [ 0.245968] system 00:01: ioport range 0xe78-0xe7b has been reserved
> [ 0.246967] system 00:01: ioport range 0xbbc-0xbbf has been reserved
> [ 0.247967] system 00:01: ioport range 0xfbc-0xfbf has been reserved
> [ 0.248968] system 00:01: ioport range 0x4d0-0x4d1 has been reserved
> [ 0.249967] system 00:01: ioport range 0x200-0x200 has been reserved
> [ 0.250967] system 00:01: ioport range 0x202-0x208 has been reserved
> [ 0.251967] system 00:01: ioport range 0x320-0x32f has been reserved
> [ 0.252966] system 00:01: ioport range 0x295-0x296 has been reserved
> [ 0.254970] system 00:0b: ioport range 0x400-0x4bf could not be reserved
> [ 0.255973] system 00:0c: iomem range 0xf0000-0xf3fff could not be reserved
> [ 0.256966] system 00:0c: iomem range 0xf4000-0xf7fff could not be reserved
> [ 0.257966] system 00:0c: iomem range 0xf8000-0xfbfff could not be reserved
> [ 0.258966] system 00:0c: iomem range 0xfc000-0xfffff could not be reserved
> [ 0.259966] system 00:0c: iomem range 0x3fff0000-0x3fffffff could not be reserved
> [ 0.260965] system 00:0c: iomem range 0x0-0x9ffff could not be reserved
> [ 0.261965] system 00:0c: iomem range 0x100000-0x3ffeffff could not be reserved
> [ 0.262965] system 00:0c: iomem range 0xfec00000-0xfec00fff could not be reserved
> [ 0.263965] system 00:0c: iomem range 0xfec01000-0xfed8ffff could not be reserved
> [ 0.264965] system 00:0c: iomem range 0xfee00000-0xfee00fff could not be reserved
> [ 0.265965] system 00:0c: iomem range 0xffb00000-0xffbfffff could not be reserved
> [ 0.266964] system 00:0c: iomem range 0xfff00000-0xffffffff could not be reserved
> [ 0.298911] PCI: Bridge: 0000:00:01.0
> [ 0.298960] IO window: a000-afff
> [ 0.299963] MEM window: 0xf8000000-0xf9ffffff
> [ 0.300961] PREFETCH window: 0x00000000e8000000-0x00000000f7ffffff
> [ 0.301963] PCI: Bridge: 0000:00:1e.0
> [ 0.302959] IO window: b000-bfff
> [ 0.303961] MEM window: 0xfa000000-0xfa0fffff
> [ 0.304960] PREFETCH window: disabled.
> [ 0.305994] NET: Registered protocol family 2
> [ 0.325957] IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
> [ 0.326233] TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
> [ 0.328385] TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
> [ 0.329394] TCP: Hash tables configured (established 131072 bind 65536)
> [ 0.329963] TCP reno registered
> [ 0.337961] Unpacking initramfs... done
> [ 0.574326] Freeing initrd memory: 6128k freed
> [ 0.576000] Machine check exception polling timer started.
> [ 0.577268] audit: initializing netlink socket (disabled)
> [ 0.577937] type=2000 audit(1203584121.788:1): initialized
> [ 0.579060] Total HugeTLB memory allocated, 0
> [ 0.579988] VFS: Disk quotas dquot_6.5.1
> [ 0.580947] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
> [ 0.582937] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
> [ 0.583930] io scheduler noop registered
> [ 0.584925] io scheduler anticipatory registered
> [ 0.585924] io scheduler deadline registered
> [ 0.586932] io scheduler cfq registered (default)
> [ 0.589129] vesafb: framebuffer at 0xe8000000, mapped to 0xf0880000, using 1875k, total 16384k
> [ 0.589925] vesafb: mode is 800x600x16, linelength=1600, pages=16
> [ 0.590924] vesafb: protected mode interface info at c000:b544
> [ 0.591926] vesafb: pmi: set display start = b00cb5d2, set palette = b00cb612
> [ 0.592923] vesafb: scrolling: redraw
> [ 0.593924] vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0
> [ 0.612485] Console: switching to colour frame buffer device 100x37
> [ 0.627919] fb0: VESA VGA frame buffer device
> [ 0.638247] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
> [ 0.639029] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> [ 0.641591] 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> [ 0.642200] PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
> [ 0.642917] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
> [ 0.644280] serio: i8042 KBD port at 0x60,0x64 irq 1
> [ 0.645234] mice: PS/2 mouse device common for all mice
> [ 0.675685] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
> [ 0.687684] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0
> [ 0.687709] rtc0: alarms up to one month
> [ 0.688747] cpuidle: using governor ladder
> [ 0.689689] cpuidle: using governor menu
> [ 0.690751] oprofile: using NMI interrupt.
> [ 0.693003] NET: Registered protocol family 1
> [ 0.693759] p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available
> [ 0.694686] Using IPI No-Shortcut mode
> [ 0.695814] registered taskstats version 1
> [ 0.696813] rtc_cmos 00:03: setting system clock to 2008-02-21 08:55:23 UTC (1203584123)
> [ 0.698738] Freeing unused kernel memory: 236k freed
> [ 0.699701] Write protecting the kernel text: 2000k
> [ 0.700690] Write protecting the kernel read-only data: 792k
> [ 0.762206] ACPI: ACPI0007:00 is registered as cooling_device0
> [ 0.768027] ACPI: LNXTHERM:01 is registered as thermal_zone0
> [ 0.768870] ACPI: Thermal Zone [THRM] (40 C)
> [ 0.785027] SCSI subsystem initialized
> [ 0.806767] ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
> [ 0.808835] scsi0 : ata_piix
> [ 0.809721] scsi1 : ata_piix
> [ 0.812130] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
> [ 0.812702] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
> [ 1.290553] ata1.00: ATA-6: ST3160021A, 3.04, max UDMA/100
> [ 1.290558] ata1.00: 312581808 sectors, multi 16: LBA48
> [ 1.291580] ata1.01: ATAPI: BENQ DVD DD DW1625, BBIA, max UDMA/33
> [ 1.314801] ata1.00: configured for UDMA/100
> [ 1.473512] ata1.01: configured for UDMA/33
> [ 3.788245] ata2.00: ATA-6: ST3120022A, 3.06, max UDMA/100
> [ 3.788248] ata2.00: 234441648 sectors, multi 16: LBA48
> [ 3.811575] ata2.00: configured for UDMA/100
> [ 3.823087] scsi 0:0:0:0: Direct-Access ATA ST3160021A 3.04 PQ: 0 ANSI: 5
> [ 3.825453] scsi 0:0:1:0: CD-ROM BENQ DVD DD DW1625 BBIA PQ: 0 ANSI: 5
> [ 3.825587] scsi 1:0:0:0: Direct-Access ATA ST3120022A 3.06 PQ: 0 ANSI: 5
> [ 3.831138] ACPI: PNP0C0B:00 is registered as cooling_device1
> [ 3.831481] ACPI: Fan [FAN] (on)
> [ 3.856766] BIOS EDD facility v0.16 2004-Jun-25, 6 devices found
> [ 3.999333] Driver 'sd' needs updating - please use bus_type methods
> [ 3.999551] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
> [ 4.000464] sd 0:0:0:0: [sda] Write Protect is off
> [ 4.001459] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 4.002528] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
> [ 4.003475] sd 0:0:0:0: [sda] Write Protect is off
> [ 4.004458] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 4.005586] sda: sda1 sda2 < sda5 sda6 >
> [ 4.048569] sd 0:0:0:0: [sda] Attached SCSI disk
> [ 4.049519] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
> [ 4.050439] sd 1:0:0:0: [sdb] Write Protect is off
> [ 4.051473] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 4.052505] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
> [ 4.053436] sd 1:0:0:0: [sdb] Write Protect is off
> [ 4.054450] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 4.055498] sdb: sdb1 sdb2 sdb3
> [ 4.063528] sd 1:0:0:0: [sdb] Attached SCSI disk
> [ 37.473476] SysRq : Show State
> [ 37.474338] task PC stack pid father
> [ 37.474338] init S ef836eac 0 1 0
> [ 37.474338] ef836f00 00000082 ffffffff ef836eac b01224d3 00000000 00000000 00000001
> [ 37.474338] e80cb97b 00000000 b0421180 b0421180 b0421180 ef835020 ef83527c b180b180
> [ 37.474338] 00000000 ef836000 ee6e0c80 b1031ae0 00000000 00000000 ee5b9148 ef836f14
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
> [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5
> [ 37.474338] [<b01d218e>] ? security_task_wait+0xf/0x11
> [ 37.474338] [<b0129e1f>] do_wait+0x470/0xa1a
> [ 37.474338] [<b01249e9>] ? wake_up_new_task+0x77/0x91
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b012a431>] sys_wait4+0x68/0x9f
> [ 37.474338] [<b012a48f>] sys_waitpid+0x27/0x29
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] [<b02f0000>] ? relay_hotcpu_callback+0x51/0xb1
> [ 37.474338] =======================
> [ 37.474338] kthreadd S 00000000 0 2 0
> [ 37.474338] ef83bfc8 00000046 ef8b50a0 00000000 00000092 ef83bf80 00000000 00000000
> [ 37.474338] ee5f776c 00000000 b0421180 b0421180 b0421180 ef83a0a0 ef83a2fc b180b180
> [ 37.474338] 00000000 ef83b000 ee564d00 00000000 00000ae7 00000000 ee5f6e1c ee5f6e20
> [ 37.474338] Call Trace:
> [ 37.474338] [<b011d295>] ? complete+0x43/0x4b
> [ 37.474338] [<b0139450>] kthreadd+0x13a/0x13f
> [ 37.474338] [<b0139316>] ? kthreadd+0x0/0x13f
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] migration/0 S 0102c262 0 3 2
> [ 37.474338] ef83df98 00000046 ef83c120 0102c262 00000246 ef83df4c 00000000 ef83df4c
> [ 37.474338] 066fb2fc 00000000 b0421180 b0421180 b0421180 ef83c120 ef83c37c b180b180
> [ 37.474338] 00000000 ef83d000 b03ba200 00000000 00000000 00000000 00000000 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210
> [ 37.474338] [<b012304e>] migration_thread+0x124/0x210
> [ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] ksoftirqd/0 R running 0 4 2
> [ 37.474338] f4744940 00000008 ef841f14 b01424d9 f4650f99 00000008 00000000 01808638
> [ 37.474338] b18086c0 b1808638 b1808600 ef841f50 b013ca83 b1808600 b1808604 f4744940
> [ 37.474338] 00000008 f465092e 00000008 f465092e 00000046 b1807120 00000000 00000246
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61
> [ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164
> [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79
> [ 37.474338] [<b0113103>] ? smp_apic_timer_interrupt+0x5c/0x89
> [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79
> [ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30
> [ 37.474338] [<b01079d9>] ? do_softirq+0x35/0x8c
> [ 37.474338] [<b012c515>] ? ksoftirqd+0xad/0x17f
> [ 37.474338] [<b012c468>] ? ksoftirqd+0x0/0x17f
> [ 37.474338] [<b01392f4>] ? kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] ? kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] events/0 R running 0 5 2
> [ 37.474338] ef843fa0 00000046 ef8112c0 ef843f44 b0136d0e 00000a75 00000000 00000a75
> [ 37.474338] 60e9afad 00000008 b0421180 b0421180 b0421180 ef8420a0 ef8422fc b180b180
> [ 37.474338] 00000000 ef843000 ee6e0880 b18089e0 3399ec88 00000002 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b0136d0e>] ? queue_delayed_work+0x40/0x48
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] khelper S b02f12e5 0 6 2
> [ 37.474338] ef845fa0 00000046 ef845f3c b02f12e5 ef836d40 ef836d44 00000000 ef845f44
> [ 37.474338] 2a9ba799 00000000 b0421180 b0421180 b0421180 ef844120 ef84437c b180b180
> [ 37.474338] 00000000 ef845000 ee6e0880 ef822480 00000000 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] kblockd/0 S ef887f4c 0 35 2
> [ 37.474338] ef887fa0 00000046 00000246 ef887f4c b01224ed ef887f4c 00000000 00000000
> [ 37.474338] 0800f107 00000000 b0421180 b0421180 b0421180 ef886120 ef88637c b180b180
> [ 37.474338] 00000000 ef887000 b03ba200 08006f94 00000c39 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] kacpid S ef88cf4c 0 37 2
> [ 37.474338] ef88cfa0 00000046 00000246 ef88cf4c b01224ed ef88cf4c 00000000 00000000
> [ 37.474338] 0801a506 00000000 b0421180 b0421180 b0421180 ef88b0a0 ef88b2fc b180b180
> [ 37.474338] 00000000 ef88c000 b03ba200 08018266 00000000 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] kacpi_notify S ef88ff4c 0 38 2
> [ 37.474338] ef88ffa0 00000046 00000246 ef88ff4c b01224ed ef88ff4c 00000000 00000000
> [ 37.474338] 0886f6cd 00000000 b0421180 b0421180 b0421180 ef88e120 ef88e37c b180b180
> [ 37.474338] 00000000 ef88f000 b03ba200 0801c44e 00000000 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] cqueue/0 S ef8e0f4c 0 111 2
> [ 37.474338] ef8e0fa0 00000046 00000246 ef8e0f4c b01224ed ef8e0f4c 00000000 00000000
> [ 37.474338] 0c0b0eaf 00000000 b0421180 b0421180 b0421180 ef8bd0a0 ef8bd2fc b180b180
> [ 37.474338] 00000000 ef8e0000 b03ba200 0c0abebc 00000cf2 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] kseriod S b011d321 0 115 2
> [ 37.474338] ef8e4f8c 00000046 ef8e4f34 b011d321 00000000 00000000 00000000 ef88d440
> [ 37.474338] 28fd3c39 00000000 b0421180 b0421180 b0421180 ef874120 ef87437c b180b180
> [ 37.474338] 00000000 ef8e4000 b03ba200 00000000 00000000 00000000 b03cc440 ef88d43c
> [ 37.474338] Call Trace:
> [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0255c33>] serio_thread+0xc2/0x32f
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0255b71>] ? serio_thread+0x0/0x32f
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] pdflush S 00000000 0 148 2
> [ 37.474338] ee52cfa4 00000046 b180b180 00000000 ee52c000 ee52cf68 00000000 000001fa
> [ 37.474338] 22833768 00000000 b0421180 b0421180 b0421180 ef864120 ef86437c b180b180
> [ 37.474338] 00000000 ee52c000 b03ba200 b012229c 00000000 00000000 ef864120 b180b180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8
> [ 37.474338] [<b02f2c57>] ? _spin_unlock_irqrestore+0x1f/0x21
> [ 37.474338] [<b011eb19>] ? set_user_nice+0xcf/0xdf
> [ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4
> [ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] pdflush S 00000286 0 149 2
> [ 37.474338] ee52dfa4 00000046 b03c16e0 00000286 ee52df54 b0130184 00000000 00000000
> [ 37.474338] 60e9f95c 00000008 b0421180 b0421180 b0421180 ef8620a0 ef8622fc b180b180
> [ 37.474338] 00000000 ee52d000 ee6e0680 00000000 00000000 00000000 00000000 00000000
> [ 37.474338] Call Trace:
> [ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf
> [ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4
> [ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] kswapd0 S 00000000 0 150 2
> [ 37.474338] ee52ef2c 00000046 00000000 00000000 00000000 00000000 00000000 000004fa
> [ 37.474338] 22918b1f 00000000 b0421180 b0421180 b0421180 ef85e020 ef85e27c b180b180
> [ 37.474338] 00000000 ee52e000 b03ba200 b012229c 00000000 00000000 b1807b00 b180b180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8
> [ 37.474338] [<b011ab2c>] ? __dequeue_entity+0x31/0x35
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2
> [ 37.474338] [<b016b38e>] kswapd+0x490/0x4a2
> [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b011d295>] ? complete+0x43/0x4b
> [ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] aio/0 S ee530f4c 0 151 2
> [ 37.474338] ee530fa0 00000046 00000246 ee530f4c b01224ed ee530f4c 00000000 00000000
> [ 37.474338] 22bec65c 00000000 b0421180 b0421180 b0421180 ef85c120 ef85c37c b180b180
> [ 37.474338] 00000000 ee530000 b03ba200 2291af16 00000000 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] kpsmoused S ee653f4c 0 378 2
> [ 37.474338] ee653fa0 00000046 00000246 ee653f4c b01224ed ee653f4c 00000000 00000000
> [ 37.474338] 26811a10 00000000 b0421180 b0421180 b0421180 ee6ba020 ee6ba27c b180b180
> [ 37.474338] 00000000 ee653000 b03ba200 2680eca0 00000000 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] kondemand/0 S ee5b0f4c 0 384 2
> [ 37.474338] ee5b0fa0 00000046 00000246 ee5b0f4c b01224ed ee5b0f4c 00000000 00000000
> [ 37.474338] 292b36fc 00000000 b0421180 b0421180 b0421180 ee5cc120 ee5cc37c b180b180
> [ 37.474338] 00000000 ee5b0000 b03ba200 290d4c43 00000b94 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] blogd S 00000000 0 423 1
> [ 37.474338] ee6d0b10 00000082 000004b3 00000000 b1807120 15b2b2c0 00000000 ef8420a0
> [ 37.474338] 60e9cf9d 00000008 b0421180 b0421180 b0421180 ee5e60a0 ee5e62fc b180b180
> [ 37.474338] 00000000 ee6d0000 ee6e0680 b0130059 339a08bf 00000002 ee6d0b20 00000286
> [ 37.474338] Call Trace:
> [ 37.474338] [<b0130059>] ? lock_timer_base+0x1f/0x40
> [ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf
> [ 37.474338] [<b02f14ba>] schedule_timeout+0x44/0xa4
> [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
> [ 37.474338] [<b012fcda>] ? process_timeout+0x0/0xa
> [ 37.474338] [<b02f14b5>] ? schedule_timeout+0x3f/0xa4
> [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
> [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
> [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
> [ 37.474338] [<b020e60c>] ? cfb_fillrect+0x138/0x2bd
> [ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3
> [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa
> [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf
> [ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49
> [ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30
> [ 37.474338] [<b011b82b>] ? hrtick_start_fair+0x10d/0x144
> [ 37.474338] [<b011b900>] ? enqueue_task_fair+0x52/0x56
> [ 37.474338] [<b011a3f1>] ? enqueue_task+0x4c/0x58
> [ 37.474338] [<b011eb76>] ? try_to_wake_up+0x4d/0x1be
> [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
> [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c
> [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
> [ 37.474338] [<b01042b4>] ? do_notify_resume+0x55/0x79e
> [ 37.474338] [<b0127ad4>] ? release_console_sem+0x1c4/0x1d4
> [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
> [ 37.474338] [<b0232d6a>] ? tty_ldisc_deref+0x55/0x6e
> [ 37.474338] [<b018e3fc>] sys_select+0xd7/0x1a2
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] =======================
> [ 37.474338] blogd R running 0 424 1
> [ 37.474338] ee6cbdf8 00000082 b18086c0 ee6cbda0 b01e99b7 b1808640 00000000 b180b5f4
> [ 37.474338] 60e8dd8e 00000008 b0421180 b0421180 b0421180 ee5d70a0 ee5d72fc b180b180
> [ 37.474338] 00000000 ee6cb000 ee6e0680 ee6cbdf8 375f44e1 00000003 ee6cbdec b041e600
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01e99b7>] ? rb_insert_color+0x77/0xd8
> [ 37.474338] [<b0143bfe>] futex_wait+0x285/0x2d3
> [ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c
> [ 37.474338] [<b0143b3d>] ? futex_wait+0x1c4/0x2d3
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b0144c2d>] do_futex+0x20d/0xa5f
> [ 37.474338] [<b0168667>] ? pagevec_lookup_tag+0x25/0x2e
> [ 37.474338] [<b016158a>] ? wait_on_page_writeback_range+0x5b/0xf0
> [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa
> [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf
> [ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49
> [ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30
> [ 37.474338] [<b0145502>] sys_futex+0x83/0xe8
> [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] =======================
> [ 37.474338] ata/0 S ee7b3f8c 0 438 2
> [ 37.474338] ee7b3fa0 00000046 00000002 ee7b3f8c ee7b3f80 ee7b3f44 00000000 002dae6f
> [ 37.474338] e3f3f21f 00000000 b0421180 b0421180 b0421180 ee6be120 ee6be37c b180b180
> [ 37.474338] 00000000 ee7b3000 ee6e0a80 ef906480 00000000 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] ata_aux S ee7b2f4c 0 439 2
> [ 37.474338] ee7b2fa0 00000046 ffffffff ee7b2f4c b01224d3 00000000 00000000 00000000
> [ 37.474338] 2fc09ed2 00000000 b0421180 b0421180 b0421180 ee5dd020 ee5dd27c b180b180
> [ 37.474338] 00000000 ee7b2000 ee6e0280 2f81ef36 00000000 00000000 b0421180 b0421180
> [ 37.474338] Call Trace:
> [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
> [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d
> [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7
> [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38
> [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] scsi_eh_0 S ef906520 0 445 2
> [ 37.474338] ee585f64 00000046 ef906524 ef906520 00000000 00000092 00000000 b011d321
> [ 37.474338] e4331293 00000000 b0421180 b0421180 b0421180 ee5e4020 ee5e427c b180b180
> [ 37.474338] 00000000 ee585000 ee6e0480 ee702008 00108a29 00000000 f084c295 ee702000
> [ 37.474338] Call Trace:
> [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
> [ 37.474338] [<f084c295>] ? __scsi_iterate_devices+0x5d/0x6b [scsi_mod]
> [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
> [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
> [ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod]
> [ 37.474338] [<b011d295>] ? complete+0x43/0x4b
> [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] scsi_eh_1 S ee622520 0 446 2
> [ 37.474338] ee5acf64 00000046 ee622524 ee622520 00000000 00000092 00000000 b011d321
> [ 37.474338] e4333e8f 00000000 b0421180 b0421180 b0421180 ee5d10a0 ee5d12fc b180b180
> [ 37.474338] 00000000 ee5ac000 ee6e0480 ee702808 00000000 00000000 f084c295 ee702800
> [ 37.474338] Call Trace:
> [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42
> [ 37.474338] [<f084c295>] ? __scsi_iterate_devices+0x5d/0x6b [scsi_mod]
> [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
> [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
> [ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod]
> [ 37.474338] [<b011d295>] ? complete+0x43/0x4b
> [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod]
> [ 37.474338] [<b01392f4>] kthread+0x37/0x59
> [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59
> [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18
> [ 37.474338] =======================
> [ 37.474338] udevd S 00000000 0 461 1
> [ 37.474338] ee5d8b10 00000082 00000000 00000000 00000000 00000000 00000000 ee5d8ab4
> [ 37.474338] 93a637c5 00000001 b0421180 b0421180 b0421180 ee6c20a0 ee6c22fc b180b180
> [ 37.474338] 00000000 ee5d8000 ee6e0480 b1807980 00000000 00000000 000f41a9 00000000
> [ 37.474338] Call Trace:
> [ 37.474338] [<b011b920>] ? __update_rq_clock+0x1c/0x157
> [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
> [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
> [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb
> [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f
> [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
> [ 37.474338] [<b013fdbc>] ? clocksource_get_next+0x3d/0x44
> [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108
> [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61
> [ 37.474338] [<b01eab55>] ? number+0x2a3/0x2b5
> [ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30
> [ 37.474338] [<b019007b>] ? fcntl_getlk64+0x4e/0x159
> [ 37.474338] [<b01eb3a7>] ? vsnprintf+0x2e8/0x5ea
> [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8
> [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
> [ 37.474338] [<b016625f>] ? __alloc_pages+0x57/0x32d
> [ 37.474338] [<b016c409>] ? __inc_zone_page_state+0x18/0x1a
> [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
> [ 37.474338] [<b0165c04>] ? free_hot_page+0xa/0xc
> [ 37.474338] [<b0168bf3>] ? put_page+0x2d/0xac
> [ 37.474338] [<b0176340>] ? free_page_and_swap_cache+0x1e/0x3e
> [ 37.474338] [<b016e44a>] ? unmap_vmas+0x317/0x54b
> [ 37.474338] [<b0117410>] ? pgd_dtor+0x0/0x4a
> [ 37.474338] [<b011740e>] ? check_pgt_cache+0x1e/0x20
> [ 37.474338] [<b01711bf>] ? unmap_region+0xdc/0x12f
> [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] =======================
> [ 37.474338] udevsettle R running 0 465 1
> [ 37.474338] ee524f1c 00000082 ee524ed4 00000000 00010542 b1808640 00000000 ee6cbe14
> [ 37.474338] 60e8f831 00000008 b0421180 b0421180 b0421180 ee5ce020 ee5ce27c b180b180
> [ 37.474338] 00000000 ee524000 ee6e0880 ee524f1c 33bd800d 00000003 00000008 b041e600
> [ 37.474338] Call Trace:
> [ 37.474338] [<b02f1a50>] do_nanosleep+0x70/0x9a
> [ 37.474338] [<b013c7ae>] hrtimer_nanosleep+0x4c/0xaf
> [ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c
> [ 37.474338] [<b02f1a3d>] ? do_nanosleep+0x5d/0x9a
> [ 37.474338] [<b013c868>] sys_nanosleep+0x57/0x5b
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] [<b02f0000>] ? relay_hotcpu_callback+0x51/0xb1
> [ 37.474338] =======================
> [ 37.474338] udevd S 00000000 0 873 461
> [ 37.474338] e7c7eb10 00000082 00000000 00000000 00000000 00000000 00000000 00000000
> [ 37.474338] 9385214a 00000001 b0421180 b0421180 b0421180 ef8d40a0 ef8d42fc b180b180
> [ 37.474338] 00000000 e7c7e000 ee6e0280 00000000 00000000 00000000 00000000 00000000
> [ 37.474338] Call Trace:
> [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
> [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
> [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb
> [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f
> [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
> [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
> [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
> [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8
> [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
> [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
> [ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3
> [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c
> [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
> [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
> [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5
> [ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4
> [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55
> [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2
> [ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69
> [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] =======================
> [ 37.474338] udevd S 00000000 0 876 461
> [ 37.474338] ee780b10 00000086 00000000 00000000 00000000 00000000 00000000 00000000
> [ 37.474338] 93a2c2db 00000001 b0421180 b0421180 b0421180 ef8a1120 ef8a137c b180b180
> [ 37.474338] 00000000 ee780000 ee564100 00000000 00000000 00000000 00000000 00000000
> [ 37.474338] Call Trace:
> [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
> [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36
> [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb
> [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f
> [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c
> [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b011a8a0>] ? update_curr+0x12f/0x136
> [ 37.474338] [<b012381e>] ? task_tick_fair+0x59/0x86
> [ 37.474338] [<b01227b3>] ? scheduler_tick+0x268/0x3cc
> [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa
> [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf
> [ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108
> [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61
> [ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164
> [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79
> [ 37.474338] [<b0113103>] ? smp_apic_timer_interrupt+0x5c/0x89
> [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
> [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
> [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8
> [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
> [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb
> [ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3
> [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c
> [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7
> [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e
> [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5
> [ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4
> [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55
> [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2
> [ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69
> [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] =======================
> [ 37.474338] scsi_id D 00000000 0 878 873
> [ 37.474338] ee60bb7c 00000086 00000046 00000000 00000000 ef904000 00000000 ee66d910
> [ 37.474338] 93796336 00000001 b0421180 b0421180 b0421180 ee5d3120 ee5d337c b180b180
> [ 37.474338] 00000000 ee60b000 ee564700 ee5c2800 0000065a 00000000 ee60bb5c b024e1b8
> [ 37.474338] Call Trace:
> [ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11
> [ 37.474338] [<f0852878>] ? scsi_request_fn+0x218/0x345 [scsi_mod]
> [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
> [ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b
> [ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d
> [ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a
> [ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14
> [ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c
> [ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29
> [ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a
> [ 37.474338] [<b01d9672>] ? blk_rq_bio_prep+0x9e/0xb2
> [ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49
> [ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4
> [ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1
> [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
> [ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5
> [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
> [ 37.474338] [<b0188770>] ? do_lookup+0x5a/0x162
> [ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod]
> [ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e
> [ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814
> [ 37.474338] [<b01e7574>] ? kobject_get+0x12/0x17
> [ 37.474338] [<b0160fc3>] ? find_lock_page+0x72/0x8d
> [ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449
> [ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b
> [ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28
> [ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377
> [ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58
> [ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4
> [ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21
> [ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21
> [ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71
> [ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290
> [ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb
> [ 37.474338] [<b0189bfc>] ? putname+0x25/0x30
> [ 37.474338] [<b01800e0>] ? do_sys_open+0xb1/0xc7
> [ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] =======================
> [ 37.474338] scsi_id D ee621bd4 0 879 876
> [ 37.474338] ee636b7c 00000086 ee621bac ee621bd4 00000000 f084c4f9 00000000 ee66d0c8
> [ 37.474338] 9394eaa1 00000001 b0421180 b0421180 b0421180 ef888020 ef88827c b180b180
> [ 37.474338] 00000000 ee636000 ee564300 ee6e8c00 00000000 00000000 ee636b5c b024e1b8
> [ 37.474338] Call Trace:
> [ 37.474338] [<f084c4f9>] ? scsi_done+0x0/0x19 [scsi_mod]
> [ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11
> [ 37.474338] [<f0852878>] ? scsi_request_fn+0x218/0x345 [scsi_mod]
> [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4
> [ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b
> [ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d
> [ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a
> [ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e
> [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd
> [ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14
> [ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c
> [ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29
> [ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a
> [ 37.474338] [<b01d9672>] ? blk_rq_bio_prep+0x9e/0xb2
> [ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49
> [ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4
> [ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1
> [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443
> [ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5
> [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124
> [ 37.474338] [<b0188770>] ? do_lookup+0x5a/0x162
> [ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod]
> [ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e
> [ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814
> [ 37.474338] [<b0168bbe>] ? activate_page+0xb1/0xb9
> [ 37.474338] [<b0168cce>] ? mark_page_accessed+0x27/0x2e
> [ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449
> [ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b
> [ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28
> [ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377
> [ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58
> [ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4
> [ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21
> [ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21
> [ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71
> [ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290
> [ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb
> [ 37.474338] [<b0189bfc>] ? putname+0x25/0x30
> [ 37.474338] [<b01800e0>] ? do_sys_open+0xb1/0xc7
> [ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63
> [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb
> [ 37.474338] =======================
> [ 37.474338] Sched Debug Version: v0.07, 2.6.25-rc2-smp #14
> [ 37.474338] now at 44900.194289 msecs
> [ 37.474338] .sysctl_sched_latency : 20.000000
> [ 37.474338] .sysctl_sched_min_granularity : 4.000000
> [ 37.474338] .sysctl_sched_wakeup_granularity : 10.000000
> [ 37.474338] .sysctl_sched_batch_wakeup_granularity : 10.000000
> [ 37.474338] .sysctl_sched_child_runs_first : 0.000001
> [ 37.474338] .sysctl_sched_features : 39
> [ 37.474338]
> [ 37.474338] cpu#0, 2992.603 MHz
> [ 37.474338] .nr_running : 4
> [ 37.474338] .load : 2048
> [ 37.474338] .nr_switches : 3363
> [ 37.474338] .nr_load_updates : 34557
> [ 37.474338] .nr_uninterruptible : 2
> [ 37.474338] .jiffies : 4294705811
> [ 37.474338] .next_balance : 4294.705968
> [ 37.474338] .curr->pid : 4
> [ 37.474338] .clock : 37474.338672
> [ 37.474338] .idle_clock : 3065.714769
> [ 37.474338] .prev_clock_raw : 78772.978492
> [ 37.474338] .clock_warps : 0
> [ 37.474338] .clock_overflows : 3996
> [ 37.474338] .clock_underflows : 31781
> [ 37.474338] .clock_deep_idle_events : 1
> [ 37.474338] .clock_max_delta : 0.999848
> [ 37.474338] .cpu_load[0] : 2048
> [ 37.474338] .cpu_load[1] : 2048
> [ 37.474338] .cpu_load[2] : 2048
> [ 37.474338] .cpu_load[3] : 2048
> [ 37.474338] .cpu_load[4] : 2048
> [ 37.474338]
> [ 37.474338] cfs_rq
> [ 37.474338] .exec_clock : 34293.783916
> [ 37.474338] .MIN_vruntime : 0.000001
> [ 37.474338] .min_vruntime : 17146.893105
> [ 37.474338] .max_vruntime : 0.000001
> [ 37.474338] .spread : 0.000000
> [ 37.474338] .spread0 : 0.000000
> [ 37.474338] .nr_running : 1
> [ 37.474338] .load : 2048
> [ 37.474338] .bkl_count : 405
> [ 37.474338] .nr_spread_over : 0
> [ 37.474338]
> [ 37.474338] cfs_rq
> [ 37.474338] .exec_clock : 34293.783916
> [ 37.474338] .MIN_vruntime : 13830.833996
> [ 37.474338] .min_vruntime : 17146.893105
> [ 37.474338] .max_vruntime : 13830.833996
> [ 37.474338] .spread : 0.000000
> [ 37.474338] .spread0 : 0.000000
> [ 37.474338] .nr_running : 4
> [ 37.474338] .load : 8290
> [ 37.474338] .bkl_count : 405
> [ 37.474338] .nr_spread_over : 6
> [ 37.474338]
> [ 37.474338] runnable tasks:
> [ 37.474338] task PID tree-key switches prio exec-runtime sum-exec sum-sleep
> [ 37.474338] ----------------------------------------------------------------------------------------------------------
> [ 37.474338] R ksoftirqd/0 4 14329.115724 37 115 14329.115724 33248.417353 4113.418770
> [ 37.474338] events/0 5 13830.833996 35 115 13830.833996 0.284125 4265.748410
> [ 37.474338] blogd 424 13830.833996 135 120 13830.833996 0.460905 1680.558420
> [ 37.474338] udevsettle 465 13830.833996 17 120 13830.833996 0.838894 643.594622
> [ 37.474338]
--
Jens Axboe
On Fri, 2008-02-22 at 08:32 +0100, Jens Axboe wrote:
> On Thu, Feb 21 2008, Mike Galbraith wrote:
> > Greetings,
> >
> > K3b recently (9a4c854..5d9c4a7 pull) began terminally griping about
> > buffer underrun upon every attempt to burn a CD. I can't fully bisect
> > the problem because intervening kernels hang soft during boot. Using
> > git bisect visualize, and converting to postable text:
> >
> > bisect/bad block: add request->raw_data_len (6b00769fe1502b4ad97bb327ef7ac971b208bfb5)
> > bisect block: update bio according to DMA alignment padding (40b01b9bbdf51ae543a04744283bf2d56c4a6afa)
> > libata: update ATAPI overflow draining
> > bisect/good-e164094964e6e20fe7fce418e06a9dce952bb7a4
>
> Tejun?
<crickets chirping> He must be off having a life or something ;-)
Meanwhile back at the ranch, reverting
6b00769fe1502b4ad97bb327ef7ac971b208bfb5
40b01b9bbdf51ae543a04744283bf2d56c4a6afa and the one entangled line from
dde2020754aeb14e17052d61784dcb37f252aac2 did restore my burner.
-Mike
On Sat, 2008-02-23 at 08:42 +0100, Mike Galbraith wrote:
> Meanwhile back at the ranch, reverting
> 6b00769fe1502b4ad97bb327ef7ac971b208bfb5
> 40b01b9bbdf51ae543a04744283bf2d56c4a6afa and the one entangled line from
> dde2020754aeb14e17052d61784dcb37f252aac2 did restore my burner.
It looks like the reason for boot failure with
40b01b9bbdf51ae543a04744283bf2d56c4a6afa may be that one hunk of
6b00769fe1502b4ad97bb327ef7ac971b208bfb5 was supposed to land in
40b01b9bbdf51ae543a04744283bf2d56c4a6afa (per comment):
diff --git a/block/blk-map.c b/block/blk-map.c
index a7cf63c..09f7fd0 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -154,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
bio->bi_size += pad_len;
+ rq->data_len += pad_len;
}
rq->buffer = rq->data = NULL;
Something else looks funny with
6b00769fe1502b4ad97bb327ef7ac971b208bfb5, did something go missing?
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 135c1d0..ba21d97 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1014,10 +1014,6 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb,
}
req->buffer = NULL;
- if (blk_pc_request(req))
- sdb->length = req->data_len;
- else
- sdb->length = req->nr_sectors << 9;
/*
* Next, walk the list, and fill in the addresses and sizes of <== here
@@ -1026,6 +1022,10 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb,
count = blk_rq_map_sg(req->q, req, sdb->table.sgl);
BUG_ON(count > sdb->table.nents);
sdb->table.nents = count;
+ if (blk_pc_request(req))
+ sdb->length = req->data_len;
+ else
+ sdb->length = req->nr_sectors << 9;
return BLKPREP_OK;
}
Greetings,
I straced both a good and a bad kernel (good being .git with attached
revert patch applied) and filtered/diffed/merged the output. Scroll
down to "HERE" to see the problem (resid).
I'm poking around, but not having much luck.
--- good 2008-02-26 09:11:08.000000000 +0100
+++ bad 2008-02-26 09:03:44.000000000 +0100
@@ -1,48 +1,44 @@
open("/dev/sr0", O_RDWR|O_NONBLOCK) = 3
fcntl64(3, F_GETFL) = 0x8802 (flags O_RDWR|O_NONBLOCK|O_LARGEFILE)
fcntl64(3, F_SETFL, O_RDWR|O_LARGEFILE) = 0
-ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xaf8d9194) = 0
-ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xaf8d9190) = 0
-ioctl(3, SG_GET_VERSION_NUM, 0xaf8d9198) = 0
+ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xafa1a2d4) = 0
+ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xafa1a2d0) = 0
+ioctl(3, SG_GET_VERSION_NUM, 0xafa1a2d8) = 0
write(2, "Linux sg driver version: 3.5.27\n", 32Linux sg driver version: 3.5.27
) = 32
-ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xaf8d9134) = 0
-ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xaf8d9130) = 0
-ioctl(3, SG_SET_TIMEOUT, 0xaf8d9030) = 0
-fstat64(3, {st_dev=makedev(0, 13), st_ino=4758, st_mode=S_IFBLK|0640, st_nlink=1, st_uid=0, st_gid=6, st_blksize=4096, st_blocks=0, st_rdev=makedev(11, 0), st_atime=2008/02/26-08:45:17, st_mtime=2008/02/26-08:45:17, st_ctime=2008/02/26-08:45:17}) = 0
+ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xafa1a274) = 0
+ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xafa1a270) = 0
+ioctl(3, SG_SET_TIMEOUT, 0xafa1a170) = 0
+fstat64(3, {st_dev=makedev(0, 13), st_ino=4572, st_mode=S_IFBLK|0640, st_nlink=1, st_uid=0, st_gid=6, st_blksize=4096, st_blocks=0, st_rdev=makedev(11, 0), st_atime=2008/02/26-09:36:43, st_mtime=2008/02/26-09:36:43, st_ctime=2008/02/26-09:36:43}) = 0
geteuid32() = 0
getuid32() = 0
write(1, "Using libscg version \'schily-0.9"..., 35) = 35
write(1, "Driveropts: \'burnfree\'\n", 23) = 23
-ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93d4) = 0
-ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93d8) = 0
-ioctl(3, SG_GET_PACK_ID, 0xaf8d93d0) = -1 ENOTTY (Inappropriate ioctl for device)
+ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a514) = 0
+ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a518) = 0
+ioctl(3, SG_GET_PACK_ID, 0xafa1a510) = -1 ENOTTY (Inappropriate ioctl for device)
write(2, "SCSI buffer size: 64512\n", 24SCSI buffer size: 64512
) = 24
-ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93b4) = 0
-ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93b8) = 0
-ioctl(3, SG_GET_PACK_ID, 0xaf8d93b0) = -1 ENOTTY (Inappropriate ioctl for device)
-brk(0x9520000) = 0x9520000
-ioctl(3, SG_EMULATED_HOST, 0xaf8d93ec) = 0
+ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a4f4) = 0
+ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a4f8) = 0
+ioctl(3, SG_GET_PACK_ID, 0xafa1a4f0) = -1 ENOTTY (Inappropriate ioctl for device)
+brk(0x9fa8000) = 0x9fa8000
+ioctl(3, SG_EMULATED_HOST, 0xafa1a52c) = 0
HERE
write(1, "atapi: 1\n", 9) = 9
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 00, 00, 00, 24, 00], mx_sb_len=16, iovec_count=0, dxfer_len=36, timeout=200000, flags=0x1, data[36]=["\5\200\0052[\0\0\0BENQ DVD DD DW1625 "...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 00, 00, 00, 24, 00], mx_sb_len=16, iovec_count=0, dxfer_len=36, timeout=200000, flags=0x1, data[36]=["\5\200\0052[\0\0\0BENQ DVD DD DW1625 "...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=36, duration=2, info=0}) = 0
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 3f, 00, 00, 00, 00, 00, 08, 00], mx_sb_len=16, iovec_count=0, dxfer_len=8, timeout=200000, flags=0x1, data[8]=["\1\36\21\0\0\0\0\0"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=6, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 3f, 00, 00, 00, 00, 00, 08, 00], mx_sb_len=16, iovec_count=0, dxfer_len=8, timeout=200000, flags=0x1, data[8]=["\1\36\21\0\0\0\0\0"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=8, duration=4, info=0}) = 0
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 02, 00], mx_sb_len=16, iovec_count=0, dxfer_len=2, timeout=200000, flags=0x1, data[2]=["\0>"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=2, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 02, 00], mx_sb_len=16, iovec_count=0, dxfer_len=2, timeout=200000, flags=0x1, data[2]=["\0>"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=2, duration=3, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=2, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=3, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=64, duration=3, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=3, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 0a, 00], mx_sb_len=16, iovec_count=0, dxfer_len=10, timeout=200000, flags=0x1, data[10]=["\0>\21\0\0\0\0\0*6"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=10, duration=3, info=0}) = 0
-ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
+ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0
-
-write(1, "Device type : Removable CD-RO"..., 34) = 34
-write(1, "Version : 5\n", 19) = 19
-write(1, "Response Format: 2\n", 19) = 19
-write(1, "Capabilities : \n", 18) = 18
-write(1, "Vendor_info : \'BENQ \'\n", 28) = 28
-write(1, "Identifikation : \'DVD DD DW1625 "..., 36) = 36
-write(1, "Revision : \'BBIA\'\n", 24) = 24
-write(1, "Device seems to be: Generic mmc2"..., 55) = 55
+ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=64, duration=3, info=0}) = 0
+write(2, "/usr/bin/cdrecord: Warning: cont"..., 80/usr/bin/cdrecord: Warning: controller returns zero sized CD capabilities page.
+) = 80
+write(2, "/usr/bin/cdrecord: Warning: cont"..., 91/usr/bin/cdrecord: Warning: controller returns wrong page 0 for CD capabilities page (2A).
+) = 91
On Tue, 2008-02-26 at 10:48 +0100, Mike Galbraith wrote:
> Greetings,
>
> I straced both a good and a bad kernel (good being .git with attached
> revert patch applied) and filtered/diffed/merged the output. Scroll
> down to "HERE" to see the problem (resid).
>
> I'm poking around, but not having much luck.
Seems the problem is data_len changes, but raw_data_len doesn't. I've
not the foggiest IO-land clue, but k3b works again, so the below may
have some diagnostic value.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ba21d97..7a6f784 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
scsi_end_bidi_request(cmd);
return;
}
- req->data_len = scsi_get_resid(cmd);
+ req->data_len = req->raw_data_len = scsi_get_resid(cmd);
}
BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */
-Mike
On Tue, 26 Feb 2008 14:36:43 +0100 Mike Galbraith <[email protected]> wrote:
>
> On Tue, 2008-02-26 at 10:48 +0100, Mike Galbraith wrote:
> > Greetings,
> >
> > I straced both a good and a bad kernel (good being .git with attached
> > revert patch applied) and filtered/diffed/merged the output. Scroll
> > down to "HERE" to see the problem (resid).
> >
> > I'm poking around, but not having much luck.
cc's added.
I'm told this is part of "Tejun's DMA drain handling".
> Seems the problem is data_len changes, but raw_data_len doesn't. I've
> not the foggiest IO-land clue, but k3b works again, so the below may
> have some diagnostic value.
So this change fixes a bug? Can we have a recap of how it does this?
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index ba21d97..7a6f784 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
> scsi_end_bidi_request(cmd);
> return;
> }
> - req->data_len = scsi_get_resid(cmd);
> + req->data_len = req->raw_data_len = scsi_get_resid(cmd);
> }
>
> BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */
>
Thanks.
Andrew Morton wrote:
> On Tue, 26 Feb 2008 14:36:43 +0100 Mike Galbraith <[email protected]> wrote:
>
>> On Tue, 2008-02-26 at 10:48 +0100, Mike Galbraith wrote:
>>> Greetings,
>>>
>>> I straced both a good and a bad kernel (good being .git with attached
>>> revert patch applied) and filtered/diffed/merged the output. Scroll
>>> down to "HERE" to see the problem (resid).
>>>
>>> I'm poking around, but not having much luck.
>
> cc's added.
>
> I'm told this is part of "Tejun's DMA drain handling".
Correct.
>> Seems the problem is data_len changes, but raw_data_len doesn't. I've
>> not the foggiest IO-land clue, but k3b works again, so the below may
>> have some diagnostic value.
>
> So this change fixes a bug? Can we have a recap of how it does this?
>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index ba21d97..7a6f784 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>> scsi_end_bidi_request(cmd);
>> return;
>> }
>> - req->data_len = scsi_get_resid(cmd);
>> + req->data_len = req->raw_data_len = scsi_get_resid(cmd);
>> }
I would love to get an answer as to what data_len (and of course
raw_data_len) should be set to AFTER the command completes, which is
what is going on here.
I can see the above being correct -- scsi_get_resid() returns the length
of the left-over data after the command is processed -- but I am mainly
curious why setting [raw_]data_len matters after I/O completion.
Jeff
On Tue, 2008-02-26 at 15:08 -0800, Andrew Morton wrote:
> On Tue, 26 Feb 2008 14:36:43 +0100 Mike Galbraith <[email protected]> wrote:
> > Seems the problem is data_len changes, but raw_data_len doesn't. I've
> > not the foggiest IO-land clue, but k3b works again, so the below may
> > have some diagnostic value.
>
> So this change fixes a bug? Can we have a recap of how it does this?
Yeah, it fixes the problem. (wrt recap, if I could write it, it would
be a changelog;)
-Mike
On Tue, 2008-02-26 at 19:46 -0500, Jeff Garzik wrote:
> I would love to get an answer as to what data_len (and of course
> raw_data_len) should be set to AFTER the command completes, which is
> what is going on here.
Yeah, blk_complete_sghdr_rq() used to do hdr->resid = rq->data_len,
and rq->data_len is what gets modified further down. How/where that
hdr->resid percolates back up, and turns into a retry/nogo, I don't know.
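
A userspace model of the failure described above may help. It is only a sketch: the struct and function names are hypothetical stand-ins, not kernel code, and padding/draining is ignored, so data_len and raw_data_len start out equal. It assumes, per the patches in this thread, that the completion path stores the residual in data_len while blk_complete_sghdr_rq() reports raw_data_len.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * discussed in this thread are modelled. */
struct model_request {
	unsigned int data_len;     /* overwritten with the residual on completion */
	unsigned int raw_data_len; /* length originally requested by the submitter */
};

struct model_sg_hdr {
	int resid;                 /* residual count that userland reads back */
};

/* Map a user buffer: both lengths start out as the requested size. */
void model_map(struct model_request *rq, unsigned int len)
{
	assert(rq);
	rq->data_len = len;
	rq->raw_data_len = len;
}

/* Completion as in the regressed kernel: only data_len gets the residual. */
void model_complete_regressed(struct model_request *rq, unsigned int resid)
{
	assert(rq);
	rq->data_len = resid;
}

/* Completion with the one-line patchlet: both fields get the residual. */
void model_complete_patched(struct model_request *rq, unsigned int resid)
{
	assert(rq);
	rq->data_len = rq->raw_data_len = resid;
}

/* SG_IO header completion after 6b00769f: resid is taken from raw_data_len. */
void model_complete_sghdr(const struct model_request *rq,
			  struct model_sg_hdr *hdr)
{
	assert(rq && hdr);
	hdr->resid = (int)rq->raw_data_len;
}
```

With a 36-byte request that the device fully satisfies (residual 0), the regressed path leaves raw_data_len at 36, so the header reports resid=36, which matches the INQUIRY line in the strace of the bad kernel. The patched path reports 0.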
-Mike
On Wed, 2008-02-27 at 03:24 +0100, Mike Galbraith wrote:
> On Tue, 2008-02-26 at 15:08 -0800, Andrew Morton wrote:
> > So this change fixes a bug? Can we have a recap of how it does this?
>
> Yeah, it fixes the problem. (wrt recap, if I could write it, it would
> be a changelog;)
Hm. After rummaging around some more in both kernel and userland, I
think this patchlet is not only functional, but (random accident)
technically correct. What the heck, let's see if it flies...
snippet from userland:
/*
* Return the residual DMA count for last command.
* If this count is < 0, then a DMA overrun occured.
*/
EXPORT int
scg_getresid(scgp)
SCSI *scgp;
{
return (scgp->scmd->resid);
}
This function is used all over the place in cdrecord to determine
transfer size.
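
To illustrate why a wrong resid is fatal for a caller that works this way, here is a minimal sketch of the arithmetic. bytes_transferred() is a hypothetical helper, not a function from cdrecord or libscg; treating a negative resid as a full transfer is an assumption made for the sketch, based only on the comment quoted above.

```c
#include <assert.h>

/*
 * Hypothetical helper: derive the number of bytes actually moved from
 * the requested length and the residual count reported for the command.
 */
int bytes_transferred(int dxfer_len, int resid)
{
	assert(dxfer_len >= 0);

	/* Negative resid means an overrun; assume the buffer was filled. */
	if (resid < 0)
		return dxfer_len;

	/* A residual larger than the request cannot be trusted; report 0. */
	if (resid > dxfer_len)
		return 0;

	return dxfer_len - resid;
}
```

Plugging in the strace numbers: on the good kernel the INQUIRY has dxfer_len=36 and resid=0, giving 36 bytes. On the bad kernel resid=36 gives 0 bytes even though the data buffer was filled, which would be consistent with cdrecord's "zero sized CD capabilities page" warning.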
(patchlet takes wing, and... goes splat?)
Fix CD burning regression introduced by
6b00769fe1502b4ad97bb327ef7ac971b208bfb5. raw_data_len must be updated
to reflect residual data upon IO completion because it is used by
blk_complete_sghdr_rq() to set hdr->resid which eventually becomes
visible to userland.
Signed-off-by: Mike Galbraith <[email protected]>
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ba21d97..7a6f784 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
unsigned int good_bytes)
scsi_end_bidi_request(cmd);
return;
}
- req->data_len = scsi_get_resid(cmd);
+ req->data_len = req->raw_data_len = scsi_get_resid(cmd);
}
BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet
*/
-Mike
On Wed, 2008-02-27 at 07:00 +0100, Mike Galbraith wrote:
> (patchlet takes wing, and... goes splat?)
Bugger, went splat... forgot preformat for patchlet insert. <quintuple
checks>
Fix CD burning regression introduced by
6b00769fe1502b4ad97bb327ef7ac971b208bfb5. raw_data_len must be updated
to reflect residual data upon IO completion because it is used by
blk_complete_sghdr_rq() to set hdr->resid which eventually becomes
visible to userland.
Signed-off-by: Mike Galbraith <[email protected]>
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ba21d97..7a6f784 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
scsi_end_bidi_request(cmd);
return;
}
- req->data_len = scsi_get_resid(cmd);
+ req->data_len = req->raw_data_len = scsi_get_resid(cmd);
}
BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */
diff --git a/block/blk-core.c b/block/blk-core.c
index 775c851..929ab61 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->nr_hw_segments = 0;
rq->ioprio = 0;
rq->special = NULL;
- rq->raw_data_len = 0;
+ rq->extra_len = 0;
rq->buffer = NULL;
rq->tag = -1;
rq->errors = 0;
@@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
- rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
rq->bio = rq->biotail = bio;
diff --git a/block/blk-map.c b/block/blk-map.c
index 09f7fd0..c67a75f 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq,
rq->biotail->bi_next = bio;
rq->biotail = bio;
- rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
bio->bi_size += pad_len;
rq->data_len += pad_len;
+ rq->extra_len += pad_len;
}
rq->buffer = rq->data = NULL;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 7506c4f..efb5b4d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -232,6 +232,7 @@ new_segment:
(PAGE_SIZE - 1));
nsegs++;
rq->data_len += q->dma_drain_size;
+ rq->extra_len += q->dma_drain_size;
}
if (sg)
diff --git a/block/bsg.c b/block/bsg.c
index 7f3c095..81b2133 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
}
if (rq->next_rq) {
- hdr->dout_resid = rq->raw_data_len;
- hdr->din_resid = rq->next_rq->raw_data_len;
+ hdr->dout_resid = blk_rq_raw_data_len(rq);
+ hdr->din_resid = blk_rq_raw_data_len(rq->next_rq);
blk_rq_unmap_user(bidi_bio);
blk_put_request(rq->next_rq);
} else if (rq_data_dir(rq) == READ)
- hdr->din_resid = rq->raw_data_len;
+ hdr->din_resid = blk_rq_raw_data_len(rq);
else
- hdr->dout_resid = rq->raw_data_len;
+ hdr->dout_resid = blk_rq_raw_data_len(rq);
/*
* If the request generated a negative error number, return it
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index e993cac..32424b3 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
hdr->info = 0;
if (hdr->masked_status || hdr->host_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
- hdr->resid = rq->raw_data_len;
+ hdr->resid = blk_rq_raw_data_len(rq);
hdr->sb_len_wr = 0;
if (rq->sense_len && hdr->sbp) {
@@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk,
rq = blk_get_request(q, WRITE, __GFP_WAIT);
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->data = NULL;
- rq->raw_data_len = 0;
rq->data_len = 0;
+ rq->extra_len = 0;
rq->timeout = BLK_DEFAULT_SG_TIMEOUT;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd[0] = cmd;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0562b0a..5cab84c 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2539,7 +2539,8 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
* want to set it properly, and for DMA where it is
* effectively meaningless.
*/
- nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024);
+ nbytes = min(blk_rq_raw_data_len(scmd->request),
+ (unsigned int)63 * 1024);
/* Most ATAPI devices which honor transfer chunk size don't
* behave according to the spec when odd chunk size which
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6fe67d1..57e2a9e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -216,8 +216,8 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
- unsigned int raw_data_len;
unsigned int data_len;
+ unsigned int extra_len;
unsigned int sense_len;
void *data;
void *sense;
@@ -477,6 +477,11 @@ enum {
#define rq_data_dir(rq) ((rq)->cmd_flags & 1)
+static inline unsigned int blk_rq_raw_data_len(struct request *rq)
+{
+ return rq->data_len - min(rq->extra_len, rq->data_len);
+}
+
/*
* We regard a request as sync, if it's a READ or a SYNC write.
*/
On Thu, 2008-02-28 at 16:43 +0900, Tejun Heo wrote:
> Hello, all.
>
> Sorry about the delay. Was buried under other stuff. Mike, thanks a
> lot for reporting and analyzing the problem; however, the patch is
> slightly incorrect. rq->data_len is rq->raw_data_len + extra stuff for
> alignment and padding, so the correct thing to do is...
>
> req->raw_data_len -= req->data_len - scsi_get_resid(cmd);
> req->data_len = scsi_get_resid(cmd);
Ah, close but no banana. (feeds poor wingless patchlet to bit-wolf)
> which is ugly and error-prone. In addition, this isn't the only place
> where resid is set. Other block drivers do this too. This definitely
> should be done in block layer.
>
> With rq->data_len and rq->raw_data_len, it's impossible to translate
> resid of rq->data_len to resid of rq->raw_data_len as block layer
> doesn't know how much was extra data after rq->data_len is modified.
> The attached patch substitutes rq->raw_data_len w/ rq->extra_len and
> adds blk_rq_raw_data_len(). Things look cleaner this way and the resid
> problem should be solved with this.
>
> Can you please verify the attached patch fixes the problem?
>
> Thanks.
Thank you, works fine.
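
For readers following along, the extra_len arithmetic can be checked in userspace. This is a sketch under stated assumptions: struct padded_request and the padded_*() functions are hypothetical names, and only the subtraction in padded_raw_len() mirrors blk_rq_raw_data_len() from the patch. The driver is assumed to report its residual relative to the padded data_len.

```c
#include <assert.h>

/* Hypothetical model of the two request fields used by the patch. */
struct padded_request {
	unsigned int data_len;  /* payload plus padding/drain; later the residual */
	unsigned int extra_len; /* padding/drain bytes only */
};

/* Same arithmetic as blk_rq_raw_data_len(): strip the extra bytes,
 * clamping so the unsigned subtraction can never wrap around. */
unsigned int padded_raw_len(const struct padded_request *rq)
{
	unsigned int extra;

	assert(rq);
	extra = rq->extra_len < rq->data_len ? rq->extra_len : rq->data_len;
	return rq->data_len - extra;
}

/* Map len payload bytes and append pad bytes of alignment padding. */
void padded_map(struct padded_request *rq, unsigned int len, unsigned int pad)
{
	assert(rq);
	rq->data_len = len + pad;
	rq->extra_len = pad;
}

/* Driver completion: data_len is overwritten with the residual count. */
void padded_complete(struct padded_request *rq, unsigned int resid)
{
	assert(rq);
	rq->data_len = resid;
}
```

Take a 10-byte request padded by 2 bytes. If the device moves all 10 payload bytes, the driver's residual against the padded length is 2, and padded_raw_len() yields 0. Copying the residual straight into raw_data_len, as the earlier patchlet did, would have reported 2. If only 4 bytes move, the residual is 8 and the helper yields 6, which is 10 - 4.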
> plain text document attachment (patch)
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 775c851..929ab61 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, struct request *rq)
> rq->nr_hw_segments = 0;
> rq->ioprio = 0;
> rq->special = NULL;
> - rq->raw_data_len = 0;
> + rq->extra_len = 0;
> rq->buffer = NULL;
> rq->tag = -1;
> rq->errors = 0;
> @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
> rq->hard_cur_sectors = rq->current_nr_sectors;
> rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
> rq->buffer = bio_data(bio);
> - rq->raw_data_len = bio->bi_size;
> rq->data_len = bio->bi_size;
>
> rq->bio = rq->biotail = bio;
> diff --git a/block/blk-map.c b/block/blk-map.c
> index 09f7fd0..c67a75f 100644
> --- a/block/blk-map.c
> +++ b/block/blk-map.c
> @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq,
> rq->biotail->bi_next = bio;
> rq->biotail = bio;
>
> - rq->raw_data_len += bio->bi_size;
> rq->data_len += bio->bi_size;
> }
> return 0;
> @@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
> bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
> bio->bi_size += pad_len;
> rq->data_len += pad_len;
> + rq->extra_len += pad_len;
> }
>
> rq->buffer = rq->data = NULL;
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 7506c4f..efb5b4d 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -232,6 +232,7 @@ new_segment:
> (PAGE_SIZE - 1));
> nsegs++;
> rq->data_len += q->dma_drain_size;
> + rq->extra_len += q->dma_drain_size;
> }
>
> if (sg)
> diff --git a/block/bsg.c b/block/bsg.c
> index 7f3c095..81b2133 100644
> --- a/block/bsg.c
> +++ b/block/bsg.c
> @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
> }
>
> if (rq->next_rq) {
> - hdr->dout_resid = rq->raw_data_len;
> - hdr->din_resid = rq->next_rq->raw_data_len;
> + hdr->dout_resid = blk_rq_raw_data_len(rq);
> + hdr->din_resid = blk_rq_raw_data_len(rq->next_rq);
> blk_rq_unmap_user(bidi_bio);
> blk_put_request(rq->next_rq);
> } else if (rq_data_dir(rq) == READ)
> - hdr->din_resid = rq->raw_data_len;
> + hdr->din_resid = blk_rq_raw_data_len(rq);
> else
> - hdr->dout_resid = rq->raw_data_len;
> + hdr->dout_resid = blk_rq_raw_data_len(rq);
>
> /*
> * If the request generated a negative error number, return it
> diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
> index e993cac..32424b3 100644
> --- a/block/scsi_ioctl.c
> +++ b/block/scsi_ioctl.c
> @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
> hdr->info = 0;
> if (hdr->masked_status || hdr->host_status || hdr->driver_status)
> hdr->info |= SG_INFO_CHECK;
> - hdr->resid = rq->raw_data_len;
> + hdr->resid = blk_rq_raw_data_len(rq);
> hdr->sb_len_wr = 0;
>
> if (rq->sense_len && hdr->sbp) {
> @@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk,
> rq = blk_get_request(q, WRITE, __GFP_WAIT);
> rq->cmd_type = REQ_TYPE_BLOCK_PC;
> rq->data = NULL;
> - rq->raw_data_len = 0;
> rq->data_len = 0;
> + rq->extra_len = 0;
> rq->timeout = BLK_DEFAULT_SG_TIMEOUT;
> memset(rq->cmd, 0, sizeof(rq->cmd));
> rq->cmd[0] = cmd;
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 0562b0a..5cab84c 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -2539,7 +2539,8 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
> * want to set it properly, and for DMA where it is
> * effectively meaningless.
> */
> - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024);
> + nbytes = min(blk_rq_raw_data_len(scmd->request),
> + (unsigned int)63 * 1024);
>
> /* Most ATAPI devices which honor transfer chunk size don't
> * behave according to the spec when odd chunk size which
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 6fe67d1..57e2a9e 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -216,8 +216,8 @@ struct request {
> unsigned int cmd_len;
> unsigned char cmd[BLK_MAX_CDB];
>
> - unsigned int raw_data_len;
> unsigned int data_len;
> + unsigned int extra_len;
> unsigned int sense_len;
> void *data;
> void *sense;
> @@ -477,6 +477,11 @@ enum {
>
> #define rq_data_dir(rq) ((rq)->cmd_flags & 1)
>
> +static inline unsigned int blk_rq_raw_data_len(struct request *rq)
> +{
> + return rq->data_len - min(rq->extra_len, rq->data_len);
> +}
> +
> /*
> * We regard a request as sync, if it's a READ or a SYNC write.
> */
rq->raw_data_len, introduced for block layer padding and draining
(commit 6b00769fe1502b4ad97bb327ef7ac971b208bfb5), broke residual byte
count handling. Block drivers modify rq->data_len to report the residual
byte count to the block layer, which blindly reported the unmodified
rq->raw_data_len to userland.
To keep block drivers dealing only with rq->data_len, this should be
handled inside the block layer. However, how much extra buffer was
appended is lost after rq->data_len is modified.
This patch replaces rq->raw_data_len with rq->extra_len and adds a
blk_rq_raw_data_len() helper to calculate the raw data size from
rq->data_len and rq->extra_len. The helper returns the correct raw
residual byte count when called on a rq whose data_len has been modified
to carry the residual byte count.
This problem was reported and diagnosed by Mike Galbraith.
Signed-off-by: Tejun Heo <[email protected]>
Cc: Mike Galbraith <[email protected]>
---
block/blk-core.c | 3 +--
block/blk-map.c | 2 +-
block/blk-merge.c | 1 +
block/bsg.c | 8 ++++----
block/scsi_ioctl.c | 4 ++--
drivers/ata/libata-scsi.c | 3 ++-
include/linux/blkdev.h | 8 +++++++-
7 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 775c851..929ab61 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->nr_hw_segments = 0;
rq->ioprio = 0;
rq->special = NULL;
- rq->raw_data_len = 0;
+ rq->extra_len = 0;
rq->buffer = NULL;
rq->tag = -1;
rq->errors = 0;
@@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
- rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
rq->bio = rq->biotail = bio;
diff --git a/block/blk-map.c b/block/blk-map.c
index 09f7fd0..c67a75f 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq,
rq->biotail->bi_next = bio;
rq->biotail = bio;
- rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
bio->bi_size += pad_len;
rq->data_len += pad_len;
+ rq->extra_len += pad_len;
}
rq->buffer = rq->data = NULL;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 7506c4f..efb5b4d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -232,6 +232,7 @@ new_segment:
(PAGE_SIZE - 1));
nsegs++;
rq->data_len += q->dma_drain_size;
+ rq->extra_len += q->dma_drain_size;
}
if (sg)
diff --git a/block/bsg.c b/block/bsg.c
index 7f3c095..81b2133 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
}
if (rq->next_rq) {
- hdr->dout_resid = rq->raw_data_len;
- hdr->din_resid = rq->next_rq->raw_data_len;
+ hdr->dout_resid = blk_rq_raw_data_len(rq);
+ hdr->din_resid = blk_rq_raw_data_len(rq->next_rq);
blk_rq_unmap_user(bidi_bio);
blk_put_request(rq->next_rq);
} else if (rq_data_dir(rq) == READ)
- hdr->din_resid = rq->raw_data_len;
+ hdr->din_resid = blk_rq_raw_data_len(rq);
else
- hdr->dout_resid = rq->raw_data_len;
+ hdr->dout_resid = blk_rq_raw_data_len(rq);
/*
* If the request generated a negative error number, return it
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index e993cac..32424b3 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
hdr->info = 0;
if (hdr->masked_status || hdr->host_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
- hdr->resid = rq->raw_data_len;
+ hdr->resid = blk_rq_raw_data_len(rq);
hdr->sb_len_wr = 0;
if (rq->sense_len && hdr->sbp) {
@@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk,
rq = blk_get_request(q, WRITE, __GFP_WAIT);
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->data = NULL;
- rq->raw_data_len = 0;
rq->data_len = 0;
+ rq->extra_len = 0;
rq->timeout = BLK_DEFAULT_SG_TIMEOUT;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd[0] = cmd;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0562b0a..5cab84c 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2539,7 +2539,8 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
* want to set it properly, and for DMA where it is
* effectively meaningless.
*/
- nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024);
+ nbytes = min(blk_rq_raw_data_len(scmd->request),
+ (unsigned int)63 * 1024);
/* Most ATAPI devices which honor transfer chunk size don't
* behave according to the spec when odd chunk size which
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6fe67d1..917b97f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -216,8 +216,8 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
- unsigned int raw_data_len;
unsigned int data_len;
+ unsigned int extra_len; /* length of alignment and padding */
unsigned int sense_len;
void *data;
void *sense;
@@ -477,6 +477,12 @@ enum {
#define rq_data_dir(rq) ((rq)->cmd_flags & 1)
+/* data_len of the request sans extra stuff for alignment and padding */
+static inline unsigned int blk_rq_raw_data_len(struct request *rq)
+{
+ return rq->data_len - min(rq->extra_len, rq->data_len);
+}
+
/*
* We regard a request as sync, if it's a READ or a SYNC write.
*/
On Thu, Feb 28 2008, Tejun Heo wrote:
> rq->raw_data_len introduced for block layer padding and draining
> (commit 6b00769fe1502b4ad97bb327ef7ac971b208bfb5) broke residual byte
> count handling. Block drivers modify rq->data_len to notify residual
> byte count to the block layer which blindly reported unmodified
> rq->raw_data_len to userland.
>
> To keep block drivers dealing only with rq->data_len, this should be
> handled inside block layer. However, how much extra buffer was
> appended is lost after rq->data_len is modified.
>
> This patch replaces rq->raw_data_len with rq->extra_len and add
> blk_rq_raw_data_len() helper to calculate raw data size from
> rq->data_len and rq->extra_len. The helper returns correct raw
> residual byte count when called on a rq whose data_len is modified to
> carry residual byte count.
>
> This problem was reported and diagnosed by Mike Galbraith.
Tejun, this patch isn't much cleaner at all. It really shows the pain of
these two separate, yet related, variables.
--
Jens Axboe
Jens Axboe wrote:
>> This problem was reported and diagnosed by Mike Galbraith.
>
> Tejun, this patch isn't much cleaner at all. It really shows the pain of
> these two separate, yet related, variables.
Not much cleaner compared to what? I think padding stuff is bound to be
somewhat complex. It's a nasty thing in nature. I think ->extra_len is
better than ->raw_data_len because ->extra_len only needs to be updated
where the dirty jobs are done and extra buffer areas are added. Any
better suggestions?
Thanks.
--
tejun
On Fri, 2008-02-29 at 00:46 +0900, Tejun Heo wrote:
> Jens Axboe wrote:
> >> This problem was reported and diagnosed by Mike Galbraith.
> >
> > Tejun, this patch isn't much cleaner at all. It really shows the pain of
> > these two separate, yet related, variables.
>
> Not much cleaner compared to what? I think padding stuff is bound to be
> somewhat complex. It's a nasty thing in nature. I think ->extra_len is
> better than ->raw_data_len because ->extra_len only needs to be updated
> where the dirty jobs are done and extra buffer areas are added. Any
> better suggestions?
Well, I just investigated a bug report in the SCSI transport class. Our
SMP handler is broken in exactly the same way. We rely on the incoming
reported request lengths to size our request data, and they've blown up
from the true length to 512 bytes (the size of our alignment).
With the original patch, I have to run through the whole of libsas and
scsi_transport_sas doing
s/data_len/raw_data_len/
With your update it looks like I have to run through them all doing
s/data_len/data_len - extra_len/
which is even worse. Can't we put things back to a point where data_len
means exactly that and extra_len means how much we have spare on the
end, so you know you can DMA up to data_len + extra_len if need be?
That way we don't have to sweep through every block driver altering the
way it uses data_len.
James
On Fri, Feb 29 2008, James Bottomley wrote:
>
> On Fri, 2008-02-29 at 00:46 +0900, Tejun Heo wrote:
> > Jens Axboe wrote:
> > >> This problem was reported and diagnosed by Mike Galbraith.
> > >
> > > Tejun, this patch isn't much cleaner at all. It really shows the pain of
> > > these two separate, yet related, variables.
> >
> > Not much cleaner compared to what? I think padding stuff is bound to be
> > somewhat complex. It's a nasty thing in nature. I think ->extra_len is
> > better than ->raw_data_len because ->extra_len only needs to be updated
> > where the dirty jobs are done and extra buffer areas are added. Any
> > better suggestions?
>
> Well, I just investigated a bug report in the SCSI transport class. Our
> SMP handler is broken in exactly the same way. We rely on the incoming
> reported request lengths to size our request data, and they've blown up
> from the true length to 512 bytes (the size of our alignment).
>
> With the original patch, I have to run through the whole of libsas and
> scsi_transport_sas doing
>
> s/data_len/raw_data_len/
>
> With your update it looks like I have to run through them all doing
>
> s/data_len/data_len - extra_len/
>
> which is even worse. Can't we put things back to a point where data_len
> means exactly that and extra_len means how much we have spare on the
> end, so you know you can DMA up to data_len + extra_len if need be?
>
> That way we don't have to sweep through every block driver altering the
> way it uses data_len.
Fully agree. The reason why I think it's so ugly is that you have to
keep these two separate variables in sync. The burning was just one bug,
there will be others...
--
Jens Axboe
Hello, Jens, James.
Jens Axboe wrote:
>> With the original patch, I have to run through the whole of libsas and
>> scsi_transport_sas doing
>>
>> s/data_len/raw_data_len/
>>
>> With your update it looks like I have to run through them all doing
>>
>> s/data_len/data_len - extra_len/
blk_rq_raw_data_len() should do.
>> which is even worse. Can't we put things back to a point where data_len
>> means exactly that and extra_len means how much we have spare on the
>> end, so you know you can DMA up to data_len + extra_len if need be?
>>
>> That way we don't have to sweep through every block driver altering the
>> way it uses data_len.
If SMP is broken because it needs start address alignment but not
padding to align the size, what should be done is to make that exact
requirement visible to the block layer. Say,
blk_queue_dma_start_alignment() or maybe change
blk_queue_dma_alignment() such that it only indicates start address
alignment and add blk_queue_dma_size_alignment() for drivers which
require size to be aligned too. I think those are few.
I think the decision which value rq->data_len represents comes down to
which size is used more in low level drivers because no matter which way
we choose we'll have to update some of the drivers which expect the
other thing from rq->data_len.
blk_rq_raw_data_len() is needed iff a driver needs dummy buffers
attached at the end and still needs to know the original request size
which isn't the common case.
> Fully agree. The reason why I think it's so ugly is that you have to
> keep these two separate variables in sync. The burning was just one bug,
> there will be others...
The posted modification isn't too bad as the maintenance of the two
variables is at places where the nasty things happen. I think what
rq->data_len should represent when seen from LLDs is more important and
please note that if SMP is broken because it simply doesn't require
512byte size alignment, it's a different issue.
As long as both raw_data_len and data_len are accessible, I'm okay
either way. My biggest reluctance is against breaking sum(sg) ==
rq->data_len. I think this can lead to much more subtle problems such
as programming the controller with the wrong byte count and wrapped-around
resid calculation.
Thanks.
--
tejun
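To make the arithmetic behind blk_rq_raw_data_len() concrete, here is a minimal userspace sketch of the helper from the patch. The struct rq_model type, the complete_model() function and the byte counts mentioned afterwards are made up for illustration; only the subtraction with the min() clamp is taken from the patch.

```c
#include <assert.h>

/* Userspace stand-in for the two struct request fields (illustrative). */
struct rq_model {
	unsigned int data_len;   /* padded length, later the residual */
	unsigned int extra_len;  /* bytes added by padding and draining */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Same arithmetic as blk_rq_raw_data_len() in the patch above. */
static unsigned int raw_data_len_model(const struct rq_model *rq)
{
	return rq->data_len - min_uint(rq->extra_len, rq->data_len);
}

/* Hypothetical driver completion: data_len is overwritten with the residual. */
static void complete_model(struct rq_model *rq, unsigned int transferred)
{
	assert(transferred <= rq->data_len);
	rq->data_len -= transferred;
}
```

Taking a 60-byte request padded by 452 bytes (data_len 512), a driver that transfers 40 bytes leaves data_len at 472 and the helper yields a raw residual of 20. If the device fills the whole padded buffer, data_len drops to 0 and the clamp keeps the result at 0 instead of wrapping around.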
On Sat, 2008-03-01 at 15:17 +0900, Tejun Heo wrote:
> Hello, Jens, James.
>
> Jens Axboe wrote:
> >> With the original patch, I have to run through the whole of libsas and
> >> scsi_transport_sas doing
> >>
> >> s/data_len/raw_data_len/
> >>
> >> With your update it looks like I have to run through them all doing
> >>
> >> s/data_len/data_len - extra_len/
>
> blk_rq_raw_data_len() should do.
I know we *could* sweep through all the block drivers altering them; my
point is that I don't think we *should*. Fundamentally, every driver
that cares is assuming req->data_len is the length of the request that
came down. The fact that it got padded is irrelevant (and actually
detrimental) to most of them as the SMP driver illustrates.
We use a high dma_alignment not because we care about padding, but
because we want to avoid scatter gather. So we care about alignment of
the start of the buffer (to avoid sg), but fundamentally, we need to
know what its true length (not its padded length) is. The true length
feeds into the smp frame size and is checked by the interfaces, which is
why the changes caused an SMP failure.
Just for the principle of least surprise, can we not keep req->data_len
what it has always been, namely the true data length of the request, and
express the fact that we've padded it by req->extra_len or something, so
we don't have to do all of these driver changes?
> >> which is even worse. Can't we put things back to a point where data_len
> >> means exactly that and extra_len means how much we have spare on the
> >> end, so you know you can DMA up to data_len + extra_len if need be?
> >>
> >> That way we don't have to sweep through every block driver altering the
> >> way it uses data_len.
>
> If SMP is broken because it needs start address alignment but not
> padding to align the size, what should be done is to make that exact
> requirement visible to the block layer. Say,
> blk_queue_dma_start_alignment() or maybe change
> blk_queue_dma_alignment() such that it only indicates start address
> alignment and add blk_queue_dma_size_alignment() for drivers which
> require size to be aligned too. I think those are few.
But this is true of *every* current user of the block layer apart from
IDE ... we all care about alignment not padding. Any current user that
actually cares about padding will be doing their own adjustments, so
they need changing anyway.
We can frame that with a different API, but blk_queue_dma_alignment()
better be the common case (start but not pad alignment).
> I think the decision which value rq->data_len represents comes down to
> which size is used more in low level drivers because no matter which way
> we choose we'll have to update some of the drivers which expects the
> other thing from rq->data_len.
Right, and currently, apart from IDE, they all want it to mean the true
data length.
> blk_rq_raw_data_len() is needed iff a driver needs dummy buffers
> attached at the end and still needs to know the original request size
> which isn't the common case.
I think it *is* the common case.
> > Fully agree. The reason why I think it's so ugly is that you have to
> > keep these two separate variables in sync. The burning was just one bug,
> > there will be others...
>
> The posted modification isn't too bad as the maintenance of the two
> variables is at places where the nasty things happen. I think what
> rq->data_len should represent when seen from LLDs is more important and
> please note that if SMP is broken because it simply doesn't require
> 512byte size alignment, it's a different issue.
But we still have to find all the bugs this causes in all the block
drivers ... that's my biggest concern right now.
> As long as both raw_data_len and data_len are accessible, I'm okay
> either way. My biggest reluctance is against breaking sum(sg) ==
> rq->data_len. I think this can lead to much more subtle problems such
> as programming the controller w/ wrong bytes count and wrapped-around
> resid calculation.
OK, so can we go back to data_len being the true value and add an
extra_len for drivers who want to know where the padding lies?
James
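The distinction drawn above between start alignment and size padding can be sketched in plain C. This is an illustration under assumptions, not the block layer's code: start_is_aligned() and size_pad_len() are hypothetical names, and mask is taken to be the alignment minus one (511 for 512-byte alignment).

```c
#include <assert.h>
#include <stdint.h>

/* Does the buffer start on the required boundary? mask = alignment - 1. */
static int start_is_aligned(uintptr_t addr, unsigned int mask)
{
	assert((mask & (mask + 1)) == 0); /* alignment must be a power of two */
	return (addr & mask) == 0;
}

/* Bytes needed to round len up to the next multiple of the alignment. */
static unsigned int size_pad_len(unsigned int len, unsigned int mask)
{
	assert((mask & (mask + 1)) == 0);
	return (0u - len) & mask;
}
```

A driver that only cares about the first check can accept a 60-byte request as it is; a driver that also needs the second would see 452 bytes of padding added to reach 512.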
On Sat, 01 Mar 2008 15:17:32 +0900
Tejun Heo <[email protected]> wrote:
> Hello, Jens, James.
>
> Jens Axboe wrote:
> >> With the original patch, I have to run through the whole of libsas and
> >> scsi_transport_sas doing
> >>
> >> s/data_len/raw_data_len/
> >>
> >> With your update it looks like I have to run through them all doing
> >>
> >> s/data_len/data_len - extra_len/
>
> blk_rq_raw_data_len() should do.
>
> >> which is even worse. Can't we put things back to a point where data_len
> >> means exactly that and extra_len means how much we have spare on the
> >> end, so you know you can DMA up to data_len + extra_len if need be?
> >>
> >> That way we don't have to sweep through every block driver altering the
> >> way it uses data_len.
>
> If SMP is broken because it needs start address alignment but not
> padding to align the size, what should be done is to make that exact
> requirement visible to the block layer. Say,
> blk_queue_dma_start_alignment() or maybe change
> blk_queue_dma_alignment() such that it only indicates start address
> alignment and add blk_queue_dma_size_alignment() for drivers which
> require size to be aligned too. I think those are few.
>
> I think the decision which value rq->data_len represents comes down to
> which size is used more in low level drivers because no matter which way
> we choose we'll have to update some of the drivers which expects the
> other thing from rq->data_len.
>
> blk_rq_raw_data_len() is needed iff a driver needs dummy buffers
> attached at the end and still needs to know the original request size
> which isn't the common case.
>
> > Fully agree. The reason why I think it's so ugly is that you have to
> > keep these two separate variables in sync. The burning was just one bug,
> > there will be others...
>
> The posted modification isn't too bad as the maintenance of the two
> variables is at places where the nasty things happen. I think what
> rq->data_len should represent when seen from LLDs is more important and
> please note that if SMP is broken because it simply doesn't require
> 512byte size alignment, it's a different issue.
>
> As long as both raw_data_len and data_len are accessible, I'm okay
> either way. My biggest reluctance is against breaking sum(sg) ==
> rq->data_len. I think this can lead to much more subtle problems such
> as programming the controller w/ wrong bytes count and wrapped-around
> resid calculation.
sum(sg) == rq->data_len is already broken; sg sends such requests
(though it would be nice if it didn't).
I've not followed the earlier discussion (because I thought the drain
buffer stuff affected only libata but seems it doesn't ...). Why did
we need to change the meaning of rq->data_len?
rq->data_len meant the true data length and the patch to change it
doesn't appear to make anything simpler. Can we revert the meaning of
rq->data_len? I'm not sure that we need to add rq->extra_len but it's
fine as long as it's only for drivers that want to use it.
This is only compile tested.
=
diff --git a/block/blk-core.c b/block/blk-core.c
index 775c851..bfec406 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -127,7 +127,6 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->nr_hw_segments = 0;
rq->ioprio = 0;
rq->special = NULL;
- rq->raw_data_len = 0;
rq->buffer = NULL;
rq->tag = -1;
rq->errors = 0;
@@ -135,6 +134,7 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->cmd_len = 0;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->data_len = 0;
+ rq->extra_len = 0;
rq->sense_len = 0;
rq->data = NULL;
rq->sense = NULL;
@@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
- rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
rq->bio = rq->biotail = bio;
diff --git a/block/blk-map.c b/block/blk-map.c
index 09f7fd0..3287637 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq,
rq->biotail->bi_next = bio;
rq->biotail = bio;
- rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -155,7 +154,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
bio->bi_size += pad_len;
- rq->data_len += pad_len;
+ rq->extra_len += pad_len;
}
rq->buffer = rq->data = NULL;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 7506c4f..0f58616 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -231,7 +231,7 @@ new_segment:
((unsigned long)q->dma_drain_buffer) &
(PAGE_SIZE - 1));
nsegs++;
- rq->data_len += q->dma_drain_size;
+ rq->extra_len += q->dma_drain_size;
}
if (sg)
diff --git a/block/bsg.c b/block/bsg.c
index 7f3c095..8917c51 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
}
if (rq->next_rq) {
- hdr->dout_resid = rq->raw_data_len;
- hdr->din_resid = rq->next_rq->raw_data_len;
+ hdr->dout_resid = rq->data_len;
+ hdr->din_resid = rq->next_rq->data_len;
blk_rq_unmap_user(bidi_bio);
blk_put_request(rq->next_rq);
} else if (rq_data_dir(rq) == READ)
- hdr->din_resid = rq->raw_data_len;
+ hdr->din_resid = rq->data_len;
else
- hdr->dout_resid = rq->raw_data_len;
+ hdr->dout_resid = rq->data_len;
/*
* If the request generated a negative error number, return it
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index e993cac..a2c3a93 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
hdr->info = 0;
if (hdr->masked_status || hdr->host_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
- hdr->resid = rq->raw_data_len;
+ hdr->resid = rq->data_len;
hdr->sb_len_wr = 0;
if (rq->sense_len && hdr->sbp) {
@@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk,
rq = blk_get_request(q, WRITE, __GFP_WAIT);
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->data = NULL;
- rq->raw_data_len = 0;
rq->data_len = 0;
+ rq->extra_len = 0;
rq->timeout = BLK_DEFAULT_SG_TIMEOUT;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd[0] = cmd;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7b1f1ee..fe47922 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2538,7 +2538,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
}
qc->tf.command = ATA_CMD_PACKET;
- qc->nbytes = scsi_bufflen(scmd);
+ qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len;
/* check whether ATAPI DMA is safe */
if (!using_pio && ata_check_atapi_dma(qc))
@@ -2549,7 +2549,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
* want to set it properly, and for DMA where it is
* effectively meaningless.
*/
- nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024);
+ nbytes = min(scmd->request->data_len, (unsigned int)63 * 1024);
/* Most ATAPI devices which honor transfer chunk size don't
* behave according to the spec when odd chunk size which
@@ -2875,7 +2875,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
* TODO: find out if we need to do more here to
* cover scatter/gather case.
*/
- qc->nbytes = scsi_bufflen(scmd);
+ qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len;
/* request result TF and be quiet about device error */
qc->flags |= ATA_QCFLAG_RESULT_TF | ATA_QCFLAG_QUIET;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6fe67d1..b72526c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -216,8 +216,8 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
- unsigned int raw_data_len;
unsigned int data_len;
+ unsigned int extra_len; /* length of alignment and padding */
unsigned int sense_len;
void *data;
void *sense;
FUJITA Tomonori wrote:
> sum(sg) == rq->data_len is already broken; sg sends such requests
> (though it would be nice if it doesn't).
>
Actually, I think I was half wrong on that when you asked about
scsi_debug. The scatterlist that sg.c uses is never seen by the block
layer or scsi layer. It is just used as a container to hold segments.
sg.c and st.c use their scatterlist to manage their preallocated
pages/segments. When they pass it to scsi_execute_async, that function
will create a request struct and add bios for the pages.
In 2.6.24 and below, sg.c will send a scatterlist length that does not
match the IO length, and scsi_execute_async will goof up and send a
rq->data_len that does not match the sum of the bios. That is what I was
trying to fix in 2.6.24, but the patch got messed up. In 2.6.25-rc2 and
above that is fixed and scsi_execute_async will catch sg.c doing this
and set rq->data_len and the bio lengths correctly.
So hopefully that helps any fixes you might have planned.
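The invariant discussed here, that rq->data_len matches the summed length of the mapped segments, can be stated as a small userspace check. This is a sketch only: sg_total_len() is a hypothetical name and the segment lengths are plain integers, not struct scatterlist or bio entries.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the lengths of all segments in a (modelled) scatterlist. */
static unsigned int sg_total_len(const unsigned int *seg_len, size_t nsegs)
{
	unsigned int total = 0;

	for (size_t i = 0; i < nsegs; i++) {
		assert(seg_len[i] <= ~0u - total); /* no unsigned wrap-around */
		total += seg_len[i];
	}
	return total;
}
```

For a 60-byte payload followed by a 452-byte pad segment the sum is 512, so the equality with data_len holds only if data_len also counts the padding. If data_len stays at 60, the relation becomes sum(sg) == data_len + extra_len.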
FUJITA Tomonori wrote:
> sum(sg) == rq->data_len is already broken; sg sends such requests
> (though it would be nice if it doesn't).
>
> I've not followed the earlier discussion (because I thought the drain
> buffer stuff affected only libata but seems it doesn't ...). Why did
> we need to change the meaning of rq->data_len?
At this point, it's not clear what the original meaning of rq->data_len
is because before moving alignment and padding to the block layer,
rq->data_len equaled both the requested data length and the length of the
sg list. AFAIK, it's the SCSI midlayer, not the block layer, that makes the
sg list and the data length mismatch.
From the POV of the block layer, as now it might extend the sg list, it
has to decide what rq->data_len means in this case - the requested
transfer length from userland or the length of mapped sg list.
I think that currently the biggest problem is that drivers which don't
require request size adjustment are getting it because alignment setting
doesn't distinguish between start address alignment and size alignment.
For drivers which don't require data size adjustment from block layer,
data_len or raw_data_len doesn't matter. They're equal anyway. I'm
prepping a patch for this now.
For drivers which do require request size adjustments, I think it's
better to keep rq->data_len in line with the size of mapped sg list.
The rationales are...
- Those are dumb controllers which want to see requests which meet
certain size requirements and they're likely to care more about actual
data buffer size than user requested buffer size. IOW, they wanna see
sizes which meet certain requirements, so give them those values.
- I think bugs caused by using raw_data_len instead of data_len are more
subtle than the other way around. Using data_len instead of
raw_data_len usually affects the application layer while using
raw_data_len instead of data_len affects the DMA engine and transport layer.
> rq->data_len meant the true data length and the patch to change it
> doesn't look to make anything simple. Can we revert the meaning of
> rq->data_len? I'm not sure that we need to add rq->extra_len but it's
> fine as long as it's only for drivers that want to use it.
>
> This is only compile tested.
If we're gonna go this way, we'll need blk_rq_total_data_len() and use
it for drivers which require request size adjustments.
Thanks.
--
tejun
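blk_rq_total_data_len() is only named in this thread, not defined, so the following is a guess at its shape and not the eventual kernel helper. It assumes the convention from the patch being replied to, where data_len is the requested length and extra_len counts the bytes appended by padding and draining. struct rq_model and total_data_len_model() are illustrative userspace names.

```c
#include <assert.h>

/* Userspace stand-in for the two struct request fields (illustrative). */
struct rq_model {
	unsigned int data_len;   /* length requested by the caller */
	unsigned int extra_len;  /* bytes appended by padding and draining */
};

/* Guess at the proposed helper: total size of the mapped buffer. */
static unsigned int total_data_len_model(const struct rq_model *rq)
{
	assert(rq->extra_len <= ~0u - rq->data_len); /* no unsigned wrap-around */
	return rq->data_len + rq->extra_len;
}
```

Under this convention a driver that has to program the controller with the padded size would call the helper, while every other driver keeps reading data_len unchanged.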
Is it possible to teach Thunderturd to NOT munge the cc line? It
stripped names from cc addresses, and here when that happens the message
lands (intentionally) in my spam grinder. I just happened to see this
one before flushing, but now, thanks to Thunderturd, every follow-up
will also land there. (well, not any more since I restored it)
-Mike
On Mon, 03 Mar 2008 11:40:08 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> > sum(sg) == rq->data_len is already broken; sg sends such requests
> > (though it would be nice if it doesn't).
> >
> > I've not followed the earlier discussion (because I thought the drain
> > buffer stuff affected only libata but seems it doesn't ...). Why did
> > we need to change the meaning of rq->data_len?
>
> At this point, it's not clear what the original meaning of rq->data_len
> is because before moving alignment and padding to block layer,
> rq->data_len equaled both the requested data length and the length of sg
> list. AFAIK, it's SCSI midlayer which makes sg list and data length
> mismatch not block layer.
>
> From the POV of the block layer, as now it might extend the sg list, it
> has to decide what rq->data_len means in this case - the requested
> transfer length from userland or the length of mapped sg list.
Yeah. It meant the requested transfer length from userland in the past
and I think that changing it to the length of the mapped sg list doesn't
make anything simpler.
> I think that currently the biggest problem is that drivers which don't
> require request size adjustment are getting it because alignment setting
> doesn't distinguish between start address alignment and size alignment.
> For drivers which don't require data size adjustment from block layer,
> data_len or raw_data_len doesn't matter. They're equal anyway. I'm
> prepping a patch for this now.
>
> For drivers which do require request size adjustments, I think it's
> better to keep rq->data_len in line with the size of mapped sg list.
> The rationales are...
>
> - Those are dumb controllers which want to see requests which meet
> certain size requirements and they're likely to care more about actual
> data buffer size than user requested buffer size. IOW, they wanna see
> sizes which meet certain requirements, so give them those values.
The drivers care about the actual data buffer size.
Drivers for the dumb controllers can get what they want by using
rq->extra_len or by walking through the sg list.
> - I think bugs caused by using raw_data_len instead of data_len are more
> subtle than the other way around. Using data_len instead of
> raw_data_len usually affects the application layer while using
> raw_data_len instead of data_len affects the DMA engine and transport layer.
If we add extra_len, we can get what raw_data_len and data_len
provide.
I can't see what changing the meaning of rq->data_len (and
investigating all the block drivers) gives us.
> > rq->data_len meant the true data length and the patch to change it
> > doesn't look to make anything simple. Can we revert the meaning of
> > rq->data_len? I'm not sure that we need to add rq->extra_len but it's
> > fine as long as it's only for drivers that want to use it.
> >
> > This is only compile tested.
>
> If we're gonna go this way, we'll need blk_rq_total_data_len() and use
> it for drivers which requires request size adjustments.
No problem. It would be much better to add blk_rq_total_data_len
rather than changing the meaning of rq->data_len and all the block
drivers.
Here's an updated patch (I forgot to remove the bi_size adjustment in
blk_rq_map_user in the previous patch). Can we agree on it if we add
blk_rq_total_data_len()?
diff --git a/block/blk-core.c b/block/blk-core.c
index 775c851..bfec406 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -127,7 +127,6 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->nr_hw_segments = 0;
rq->ioprio = 0;
rq->special = NULL;
- rq->raw_data_len = 0;
rq->buffer = NULL;
rq->tag = -1;
rq->errors = 0;
@@ -135,6 +134,7 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->cmd_len = 0;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->data_len = 0;
+ rq->extra_len = 0;
rq->sense_len = 0;
rq->data = NULL;
rq->sense = NULL;
@@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
- rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
rq->bio = rq->biotail = bio;
diff --git a/block/blk-map.c b/block/blk-map.c
index 09f7fd0..f559832 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq,
rq->biotail->bi_next = bio;
rq->biotail = bio;
- rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -151,11 +150,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
*/
if (len & queue_dma_alignment(q)) {
unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
- struct bio *bio = rq->biotail;
- bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
- bio->bi_size += pad_len;
- rq->data_len += pad_len;
+ rq->extra_len += pad_len;
}
rq->buffer = rq->data = NULL;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 7506c4f..0f58616 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -231,7 +231,7 @@ new_segment:
((unsigned long)q->dma_drain_buffer) &
(PAGE_SIZE - 1));
nsegs++;
- rq->data_len += q->dma_drain_size;
+ rq->extra_len += q->dma_drain_size;
}
if (sg)
diff --git a/block/bsg.c b/block/bsg.c
index 7f3c095..8917c51 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
}
if (rq->next_rq) {
- hdr->dout_resid = rq->raw_data_len;
- hdr->din_resid = rq->next_rq->raw_data_len;
+ hdr->dout_resid = rq->data_len;
+ hdr->din_resid = rq->next_rq->data_len;
blk_rq_unmap_user(bidi_bio);
blk_put_request(rq->next_rq);
} else if (rq_data_dir(rq) == READ)
- hdr->din_resid = rq->raw_data_len;
+ hdr->din_resid = rq->data_len;
else
- hdr->dout_resid = rq->raw_data_len;
+ hdr->dout_resid = rq->data_len;
/*
* If the request generated a negative error number, return it
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index e993cac..a2c3a93 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
hdr->info = 0;
if (hdr->masked_status || hdr->host_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
- hdr->resid = rq->raw_data_len;
+ hdr->resid = rq->data_len;
hdr->sb_len_wr = 0;
if (rq->sense_len && hdr->sbp) {
@@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk,
rq = blk_get_request(q, WRITE, __GFP_WAIT);
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->data = NULL;
- rq->raw_data_len = 0;
rq->data_len = 0;
+ rq->extra_len = 0;
rq->timeout = BLK_DEFAULT_SG_TIMEOUT;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd[0] = cmd;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7b1f1ee..fe47922 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2538,7 +2538,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
}
qc->tf.command = ATA_CMD_PACKET;
- qc->nbytes = scsi_bufflen(scmd);
+ qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len;
/* check whether ATAPI DMA is safe */
if (!using_pio && ata_check_atapi_dma(qc))
@@ -2549,7 +2549,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
* want to set it properly, and for DMA where it is
* effectively meaningless.
*/
- nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024);
+ nbytes = min(scmd->request->data_len, (unsigned int)63 * 1024);
/* Most ATAPI devices which honor transfer chunk size don't
* behave according to the spec when odd chunk size which
@@ -2875,7 +2875,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
* TODO: find out if we need to do more here to
* cover scatter/gather case.
*/
- qc->nbytes = scsi_bufflen(scmd);
+ qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len;
/* request result TF and be quiet about device error */
qc->flags |= ATA_QCFLAG_RESULT_TF | ATA_QCFLAG_QUIET;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6fe67d1..b72526c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -216,8 +216,8 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
- unsigned int raw_data_len;
unsigned int data_len;
+ unsigned int extra_len; /* length of alignment and padding */
unsigned int sense_len;
void *data;
void *sense;
FUJITA Tomonori wrote:
>> - I think bugs caused by using raw_data_len instead of data_len are more
>> subtle than the other way around. Using data_len instead of
>> raw_data_len usually affects the application layer while using
>> raw_data_len instead of data_len affects the DMA engine and transport layer.
>
> If we add extra_len, we can get what raw_data_len and data_len
> provide.
>
> I can't see what changing the meaning of rq->data_len (and
> investigating all the block drivers) gives us.
No matter which way you go, you change the meaning of rq->data_len and
you MUST inspect rq->data_len usage whichever way you go. Apply your
patch and try to do sg IO on IDE cdrom w/ various transfer lengths.
--
tejun
rq->raw_data_len introduced for block layer padding and draining
(commit 6b00769fe1502b4ad97bb327ef7ac971b208bfb5) broke residual byte
count handling. Block drivers modify rq->data_len to notify residual
byte count to the block layer which blindly reported unmodified
rq->raw_data_len to userland.
To keep block drivers dealing only with rq->data_len, this should be
handled inside block layer. However, how much extra buffer was
appended is lost after rq->data_len is modified.
This patch replaces rq->raw_data_len with rq->extra_len and adds
blk_rq_raw_data_len() helper to calculate raw data size from
rq->data_len and rq->extra_len. The helper returns correct raw
residual byte count when called on a rq whose data_len is modified to
carry residual byte count.
This problem was reported and diagnosed by Mike Galbraith.
Signed-off-by: Tejun Heo <[email protected]>
Cc: Mike Galbraith <[email protected]>
---
Comments updated compared to the previous version.
block/blk-core.c | 3 +--
block/blk-map.c | 2 +-
block/blk-merge.c | 1 +
block/blk-settings.c | 4 ++++
block/bsg.c | 8 ++++----
block/scsi_ioctl.c | 4 ++--
drivers/ata/libata-scsi.c | 3 ++-
include/linux/blkdev.h | 8 +++++++-
8 files changed, 22 insertions(+), 11 deletions(-)
Index: work/block/blk-core.c
===================================================================
--- work.orig/block/blk-core.c
+++ work/block/blk-core.c
@@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, st
rq->nr_hw_segments = 0;
rq->ioprio = 0;
rq->special = NULL;
- rq->raw_data_len = 0;
+ rq->extra_len = 0;
rq->buffer = NULL;
rq->tag = -1;
rq->errors = 0;
@@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queu
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
- rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
rq->bio = rq->biotail = bio;
Index: work/block/blk-map.c
===================================================================
--- work.orig/block/blk-map.c
+++ work/block/blk-map.c
@@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_que
rq->biotail->bi_next = bio;
rq->biotail = bio;
- rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue
bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
bio->bi_size += pad_len;
rq->data_len += pad_len;
+ rq->extra_len += pad_len;
}
rq->buffer = rq->data = NULL;
Index: work/block/blk-merge.c
===================================================================
--- work.orig/block/blk-merge.c
+++ work/block/blk-merge.c
@@ -232,6 +232,7 @@ new_segment:
(PAGE_SIZE - 1));
nsegs++;
rq->data_len += q->dma_drain_size;
+ rq->extra_len += q->dma_drain_size;
}
if (sg)
Index: work/block/bsg.c
===================================================================
--- work.orig/block/bsg.c
+++ work/block/bsg.c
@@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(stru
}
if (rq->next_rq) {
- hdr->dout_resid = rq->raw_data_len;
- hdr->din_resid = rq->next_rq->raw_data_len;
+ hdr->dout_resid = blk_rq_raw_data_len(rq);
+ hdr->din_resid = blk_rq_raw_data_len(rq->next_rq);
blk_rq_unmap_user(bidi_bio);
blk_put_request(rq->next_rq);
} else if (rq_data_dir(rq) == READ)
- hdr->din_resid = rq->raw_data_len;
+ hdr->din_resid = blk_rq_raw_data_len(rq);
else
- hdr->dout_resid = rq->raw_data_len;
+ hdr->dout_resid = blk_rq_raw_data_len(rq);
/*
* If the request generated a negative error number, return it
Index: work/block/scsi_ioctl.c
===================================================================
--- work.orig/block/scsi_ioctl.c
+++ work/block/scsi_ioctl.c
@@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct
hdr->info = 0;
if (hdr->masked_status || hdr->host_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
- hdr->resid = rq->raw_data_len;
+ hdr->resid = blk_rq_raw_data_len(rq);
hdr->sb_len_wr = 0;
if (rq->sense_len && hdr->sbp) {
@@ -528,8 +528,8 @@ static int __blk_send_generic(struct req
rq = blk_get_request(q, WRITE, __GFP_WAIT);
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->data = NULL;
- rq->raw_data_len = 0;
rq->data_len = 0;
+ rq->extra_len = 0;
rq->timeout = BLK_DEFAULT_SG_TIMEOUT;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd[0] = cmd;
Index: work/drivers/ata/libata-scsi.c
===================================================================
--- work.orig/drivers/ata/libata-scsi.c
+++ work/drivers/ata/libata-scsi.c
@@ -2549,7 +2549,8 @@ static unsigned int atapi_xlat(struct at
* want to set it properly, and for DMA where it is
* effectively meaningless.
*/
- nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024);
+ nbytes = min(blk_rq_raw_data_len(scmd->request),
+ (unsigned int)63 * 1024);
/* Most ATAPI devices which honor transfer chunk size don't
* behave according to the spec when odd chunk size which
Index: work/include/linux/blkdev.h
===================================================================
--- work.orig/include/linux/blkdev.h
+++ work/include/linux/blkdev.h
@@ -216,8 +216,8 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
- unsigned int raw_data_len;
unsigned int data_len;
+ unsigned int extra_len; /* length of padding and draining buffers */
unsigned int sense_len;
void *data;
void *sense;
@@ -477,6 +477,12 @@ enum {
#define rq_data_dir(rq) ((rq)->cmd_flags & 1)
+/* data_len of the request sans extra stuff for padding and draining */
+static inline unsigned int blk_rq_raw_data_len(struct request *rq)
+{
+ return rq->data_len - min(rq->extra_len, rq->data_len);
+}
+
/*
* We regard a request as sync, if it's a READ or a SYNC write.
*/
Index: work/block/blk-settings.c
===================================================================
--- work.orig/block/blk-settings.c
+++ work/block/blk-settings.c
@@ -309,6 +309,10 @@ EXPORT_SYMBOL(blk_queue_stack_limits);
* does is adjust the queue so that the buf is always appended
* silently to the scatterlist.
*
+ * Appending draining buffer to a request modifies ->data_len such
+ * that it includes the drain buffer. The original requested data
+ * length can be obtained using blk_rq_raw_data_len().
+ *
* Note: This routine adjusts max_hw_segments to make room for
* appending the drain buffer. If you call
* blk_queue_max_hw_segments() or blk_queue_max_phys_segments() after
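To illustrate the helper added in the patch above, here is a minimal
userspace sketch, not kernel code. struct rq_model, min_uint() and
raw_data_len() are hypothetical stand-ins for struct request, min() and
blk_rq_raw_data_len(); only the arithmetic is taken from the patch.

```c
/* Userspace model of the length bookkeeping; not kernel code. */
struct rq_model {
	unsigned int data_len;	/* requested length plus padding/draining */
	unsigned int extra_len;	/* bytes added for padding and draining */
};

/* Hypothetical stand-in for the kernel's min() macro. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Same arithmetic as blk_rq_raw_data_len() in the patch. */
static unsigned int raw_data_len(const struct rq_model *rq)
{
	return rq->data_len - min_uint(rq->extra_len, rq->data_len);
}
```

With data_len = 8 and extra_len = 3 (5 bytes requested, 3 bytes of
padding) the sketch returns 5. If a driver later stores a residual of 2
in data_len, the min() clamp makes it return 0 instead of wrapping
around.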
Block layer alignment was used for two different purposes - memory
alignment and padding. This causes problems in lower layers because
drivers which only require memory alignment end up with adjusted
rq->data_len. Separate out padding such that padding occurs iff
driver explicitly requests it.
Signed-off-by: Tejun Heo <[email protected]>
---
As I wrote before, the major problem was that drivers which don't want
size adjustment got it accidentally by mixing up aligning and padding
which are two conceptually separate things. Let padding occur iff the
driver explicitly requested it. This makes both parties happy.
block/blk-map.c | 16 +++++++++-------
block/blk-settings.c | 17 +++++++++++++++++
drivers/ata/libata-scsi.c | 3 ++-
include/linux/blkdev.h | 2 ++
4 files changed, 30 insertions(+), 8 deletions(-)
Index: work/block/blk-settings.c
===================================================================
--- work.orig/block/blk-settings.c
+++ work/block/blk-settings.c
@@ -293,6 +293,23 @@ void blk_queue_stack_limits(struct reque
EXPORT_SYMBOL(blk_queue_stack_limits);
/**
+ * blk_queue_dma_pad - set pad mask
+ * @q: the request queue for the device
+ * @mask: pad mask
+ *
+ * Set pad mask. Direct IO requests are padded to the mask specified.
+ *
+ * Appending pad buffer to a request modifies ->data_len such that it
+ * includes the pad buffer. The original requested data length can be
+ * obtained using blk_rq_raw_data_len().
+ **/
+void blk_queue_dma_pad(struct request_queue *q, unsigned int mask)
+{
+ q->dma_pad_mask = mask;
+}
+EXPORT_SYMBOL(blk_queue_dma_pad);
+
+/**
* blk_queue_dma_drain - Set up a drain buffer for excess dma.
*
* @q: the request queue for the device
Index: work/block/blk-map.c
===================================================================
--- work.orig/block/blk-map.c
+++ work/block/blk-map.c
@@ -43,6 +43,7 @@ static int __blk_rq_map_user(struct requ
void __user *ubuf, unsigned int len)
{
unsigned long uaddr;
+ unsigned int alignment;
struct bio *bio, *orig_bio;
int reading, ret;
@@ -53,8 +54,8 @@ static int __blk_rq_map_user(struct requ
* direct dma. else, set up kernel bounce buffers
*/
uaddr = (unsigned long) ubuf;
- if (!(uaddr & queue_dma_alignment(q)) &&
- !(len & queue_dma_alignment(q)))
+ alignment = queue_dma_alignment(q) | q->dma_pad_mask;
+ if (!(uaddr & alignment) && !(len & alignment))
bio = bio_map_user(q, NULL, uaddr, len, reading);
else
bio = bio_copy_user(q, uaddr, len, reading);
@@ -141,15 +142,16 @@ int blk_rq_map_user(struct request_queue
/*
* __blk_rq_map_user() copies the buffers if starting address
- * or length isn't aligned. As the copied buffer is always
- * page aligned, we know that there's enough room for padding.
- * Extend the last bio and update rq->data_len accordingly.
+ * or length isn't aligned to dma_pad_mask. As the copied
+ * buffer is always page aligned, we know that there's enough
+ * room for padding. Extend the last bio and update
+ * rq->data_len accordingly.
*
* On unmap, bio_uncopy_user() will use unmodified
* bio_map_data pointed to by bio->bi_private.
*/
- if (len & queue_dma_alignment(q)) {
- unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
+ if (len & q->dma_pad_mask) {
+ unsigned int pad_len = (q->dma_pad_mask & ~len) + 1;
struct bio *bio = rq->biotail;
bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
Index: work/include/linux/blkdev.h
===================================================================
--- work.orig/include/linux/blkdev.h
+++ work/include/linux/blkdev.h
@@ -362,6 +362,7 @@ struct request_queue
unsigned long seg_boundary_mask;
void *dma_drain_buffer;
unsigned int dma_drain_size;
+ unsigned int dma_pad_mask;
unsigned int dma_alignment;
struct blk_queue_tag *queue_tags;
@@ -707,6 +708,7 @@ extern void blk_queue_max_hw_segments(st
extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
extern void blk_queue_hardsect_size(struct request_queue *, unsigned short);
extern void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b);
+extern void blk_queue_dma_pad(struct request_queue *, unsigned int);
extern int blk_queue_dma_drain(struct request_queue *q,
dma_drain_needed_fn *dma_drain_needed,
void *buf, unsigned int size);
Index: work/drivers/ata/libata-scsi.c
===================================================================
--- work.orig/drivers/ata/libata-scsi.c
+++ work/drivers/ata/libata-scsi.c
@@ -862,9 +862,10 @@ static int ata_scsi_dev_config(struct sc
struct request_queue *q = sdev->request_queue;
void *buf;
- /* set the min alignment */
+ /* set the min alignment and padding */
blk_queue_update_dma_alignment(sdev->request_queue,
ATA_DMA_PAD_SZ - 1);
+ blk_queue_dma_pad(sdev->request_queue, ATA_DMA_PAD_SZ - 1);
/* configure draining */
buf = kmalloc(ATAPI_MAX_DRAIN, q->bounce_gfp | GFP_KERNEL);
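The separation made by the patch above can be sketched in userspace as
follows. This is only a model under stated assumptions: struct
queue_model, can_map_directly() and needs_padding() are hypothetical
names, and the two mask tests are copied from the blk-map.c hunks.

```c
#include <stdbool.h>

/* Hypothetical model of the two queue masks discussed in the patch. */
struct queue_model {
	unsigned int dma_alignment;	/* start address alignment mask */
	unsigned int dma_pad_mask;	/* length padding mask, 0 if unused */
};

/*
 * True if the user buffer could be mapped directly, false if it would
 * have to be copied into a kernel bounce buffer.
 */
static bool can_map_directly(const struct queue_model *q,
			     unsigned long uaddr, unsigned int len)
{
	unsigned int alignment = q->dma_alignment | q->dma_pad_mask;

	return !(uaddr & alignment) && !(len & alignment);
}

/* True only if the driver set a pad mask and len is not a multiple. */
static bool needs_padding(const struct queue_model *q, unsigned int len)
{
	return (len & q->dma_pad_mask) != 0;
}
```

A queue that sets only dma_alignment still bounces a 6-byte request
when the mask is 3, but needs_padding() stays false for it, so its
length is left alone. Only a queue that also sets dma_pad_mask gets
the length adjusted.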
On Mon, 03 Mar 2008 13:09:13 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> >> - I think bugs caused by using raw_data_len instead of data_len are more
> >> subtle than the other way around. Using data_len instead of
> >> raw_data_len usually affects the application layer while using
> >> raw_data_len instead of data_len affects the DMA engine and transport layer.
> >
> > If we add extra_len, we can get what raw_data_len and data_len
> > provide.
> >
> > I can't see what changing the meaning of rq->data_len (and
> > investigating all the block drivers) gives us.
>
> No matter which way you go, you change the meaning of rq->data_len and
> you MUST inspect rq->data_len usage whichever way you go.
The patch doesn't change that rq->data_len means the true data
length. But yeah, it breaks rq->data_len == sum(sg). So it might break
some drivers.
> Apply your patch and try to do sg IO on IDE cdrom w/ various
> transfer lengths.
I've just tried the patch with both ata and libata and it seems to
work.
For anyone hitting this problem, please try the following patch:
http://lkml.org/lkml/2008/3/2/218
Thanks,
FUJITA Tomonori wrote:
>>> I can't see what changing the meaning of rq->data_len (and
>>> investigating all the block drivers) gives us.
>> No matter which way you go, you change the meaning of rq->data_len and
>> you MUST inspect rq->data_len usage whichever way you go.
>
> The patch doens't change that rq->data_len means the true data
> length. But yeah, it breaks rq->data_len == sum(sg). So it might break
> some drivers.
Yeah, that's what I was saying. You end up breaking one of the two
assumptions. As sglist is getting modified for any driver if it has DMA
alignment set, whether rq->data_len is adjusted together or not, sglist
and data_len usages have to be audited.
>> Apply your patch and try to do sg IO on IDE cdrom w/ various
>> transfer lengths.
>
> I've just tried the patch with both ata and libata and it seems to
> work.
Right, I missed you added extra_len in libata and IDE isn't using block
layer stuff yet.
> For anyone hitting this problem, please try the following patch:
>
> http://lkml.org/lkml/2008/3/2/218
Whether rq->data_len stays with requested data buffer size or sum(sg), I
think we need to separate out padding from address alignment; otherwise,
we'll have to audit every block driver to make sure they can deal with
extended sglist no matter which value rq->data_len ends up indicating.
If padding is applied iff explicitly requested, what rq->data_len
indicates matters only to the drivers which want to see the data length
adjusted, so most of the problems go away.
--
tejun
On Mon, 03 Mar 2008 18:21:13 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> >>> I can't see what changing the meaning of rq->data_len (and
> >>> investigating all the block drivers) gives us.
> >> No matter which way you go, you change the meaning of rq->data_len and
> >> you MUST inspect rq->data_len usage whichever way you go.
> >
> > The patch doens't change that rq->data_len means the true data
> > length. But yeah, it breaks rq->data_len == sum(sg). So it might break
> > some drivers.
>
> Yeah, that's what I was saying. You end up breaking one of the two
> assumptions. As sglist is getting modified for any driver if it has DMA
> alignment set, whether rq->data_len is adjusted together or not, sglist
> and data_len usages have to be audited.
My patch (well, James' original approach) doesn't affect drivers that
don't use drain buffer. rq->data_len still means the true data length
and rq->data_len is equal to sum(sg) for them. So right now we need to
audit only libata.
But your patch changes the meaning of rq->data_len. It affects all the
drivers. So it breaks non libata stuff, like the SMP handler. We need
to audit all the drivers.
FUJITA Tomonori wrote:
> On Mon, 03 Mar 2008 18:21:13 +0900
> Tejun Heo <[email protected]> wrote:
>
>> FUJITA Tomonori wrote:
>>>>> I can't see what changing the meaning of rq->data_len (and
>>>>> investigating all the block drivers) gives us.
>>>> No matter which way you go, you change the meaning of rq->data_len and
>>>> you MUST inspect rq->data_len usage whichever way you go.
>>> The patch doens't change that rq->data_len means the true data
>>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break
>>> some drivers.
>> Yeah, that's what I was saying. You end up breaking one of the two
>> assumptions. As sglist is getting modified for any driver if it has DMA
>> alignment set, whether rq->data_len is adjusted together or not, sglist
>> and data_len usages have to be audited.
>
> My patch (well, James' original approach) doesn't affect drivers that
> don't use drain buffer. rq->data_len still means the true data length
> and rq->data_len is equal to sum(sg) for them. So right now we need to
> audit only libata.
Your patch does change sglist for any driver which sets DMA alignment.
You'll definitely need to audit more than libata.
> But your patch changes the meaning of rq->data_len. It affects all the
> drivers. So it breaks non libata stuff, like the SMP handler. We need
> to audit all the drivers.
With both patches applied, sglist and data_len are adjusted only for
libata, so only drivers which explicitly requested buffer size
manipulation (currently only libata) need to be audited / updated.
--
tejun
On Mon, 03 Mar 2008 22:38:55 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> > On Mon, 03 Mar 2008 18:21:13 +0900
> > Tejun Heo <[email protected]> wrote:
> >
> >> FUJITA Tomonori wrote:
> >>>>> I can't see what changing the meaning of rq->data_len (and
> >>>>> investigating all the block drivers) gives us.
> >>>> No matter which way you go, you change the meaning of rq->data_len and
> >>>> you MUST inspect rq->data_len usage whichever way you go.
> >>> The patch doens't change that rq->data_len means the true data
> >>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break
> >>> some drivers.
> >> Yeah, that's what I was saying. You end up breaking one of the two
> >> assumptions. As sglist is getting modified for any driver if it has DMA
> >> alignment set, whether rq->data_len is adjusted together or not, sglist
> >> and data_len usages have to be audited.
> >
> > My patch (well, James' original approach) doesn't affect drivers that
> > don't use drain buffer. rq->data_len still means the true data length
> > and rq->data_len is equal to sum(sg) for them. So right now we need to
> > audit only libata.
>
> Your patch does change sglist for any driver which sets DMA alignment.
I overlooked it. Where does it change the sglist?
FUJITA Tomonori wrote:
>>>> FUJITA Tomonori wrote:
>>>>>>> I can't see what changing the meaning of rq->data_len (and
>>>>>>> investigating all the block drivers) gives us.
>>>>>> No matter which way you go, you change the meaning of rq->data_len and
>>>>>> you MUST inspect rq->data_len usage whichever way you go.
>>>>> The patch doens't change that rq->data_len means the true data
>>>>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break
>>>>> some drivers.
>>>> Yeah, that's what I was saying. You end up breaking one of the two
>>>> assumptions. As sglist is getting modified for any driver if it has DMA
>>>> alignment set, whether rq->data_len is adjusted together or not, sglist
>>>> and data_len usages have to be audited.
>>> My patch (well, James' original approach) doesn't affect drivers that
>>> don't use drain buffer. rq->data_len still means the true data length
>>> and rq->data_len is equal to sum(sg) for them. So right now we need to
>>> audit only libata.
>> Your patch does change sglist for any driver which sets DMA alignment.
>
> I overlook it. Where does it changes sglist?
At the end of blk_rq_map_user() together with data_len / extra_len
mangling or were you talking about James' original patch?
--
tejun
On Mon, 03 Mar 2008 22:55:56 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> >>>> FUJITA Tomonori wrote:
> >>>>>>> I can't see what changing the meaning of rq->data_len (and
> >>>>>>> investigating all the block drivers) gives us.
> >>>>>> No matter which way you go, you change the meaning of rq->data_len and
> >>>>>> you MUST inspect rq->data_len usage whichever way you go.
> >>>>> The patch doens't change that rq->data_len means the true data
> >>>>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break
> >>>>> some drivers.
> >>>> Yeah, that's what I was saying. You end up breaking one of the two
> >>>> assumptions. As sglist is getting modified for any driver if it has DMA
> >>>> alignment set, whether rq->data_len is adjusted together or not, sglist
> >>>> and data_len usages have to be audited.
> >>> My patch (well, James' original approach) doesn't affect drivers that
> >>> don't use drain buffer. rq->data_len still means the true data length
> >>> and rq->data_len is equal to sum(sg) for them. So right now we need to
> >>> audit only libata.
> >> Your patch does change sglist for any driver which sets DMA alignment.
> >
> > I overlook it. Where does it changes sglist?
>
> At the end of blk_rq_map_user() together with data_len / extra_len
> mangling or were you talking about James' original patch?
With my patch, at the end of blk_rq_map_user, we have:
if (len & queue_dma_alignment(q)) {
unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
rq->extra_len += pad_len;
}
So no change as compared with 2.6.24?
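For reference, the pad_len expression quoted above can be checked in
userspace. This is a sketch only: pad_len_for() is a hypothetical name,
and it assumes the mask has the form 2^n - 1, as an alignment mask does.

```c
/*
 * Padding needed to round len up to the next multiple of (mask + 1).
 * mask is assumed to be of the form 2^n - 1.
 */
static unsigned int pad_len_for(unsigned int mask, unsigned int len)
{
	if (!(len & mask))
		return 0;	/* already a multiple, nothing to add */
	return (mask & ~len) + 1;
}
```

With mask = 3, lengths 5, 6 and 7 get 3, 2 and 1 bytes of padding, so
each is rounded up to 8, while a length of 8 gets none.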
FUJITA Tomonori wrote:
>> At the end of blk_rq_map_user() together with data_len / extra_len
>> mangling or were you talking about James' original patch?
>
> With my patch, at the end of blk_rq_map_user, we have:
>
> if (len & queue_dma_alignment(q)) {
> unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
>
> rq->extra_len += pad_len;
> }
>
>
> So no change as compared with 2.6.24?
Oh.. you killed sg list manipulation. Many controllers do allow odd
bytes as the last sg entry but not all. Also, if you append drain
buffer after it, it ends up with unaligned sg entry in the middle and
rq->data_len + rq->extra_len will overrun the sg entry after the drain
page which is really dangerous.
--
tejun
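The overrun concern can be put into numbers with a small userspace
model. It is only a sketch: the sg list is modelled as a plain array
of entry lengths, sg_total() and sg_overrun() are hypothetical helpers,
and the sizes used below are made up for illustration.

```c
#include <stddef.h>

/* Sum of the entry lengths in a modelled scatter/gather list. */
static unsigned int sg_total(const unsigned int *sg_len, size_t nents)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < nents; i++)
		total += sg_len[i];
	return total;
}

/* Bytes by which data_len + extra_len exceeds the sg list; 0 if it fits. */
static unsigned int sg_overrun(const unsigned int *sg_len, size_t nents,
			       unsigned int data_len, unsigned int extra_len)
{
	unsigned int want = data_len + extra_len;
	unsigned int have = sg_total(sg_len, nents);

	return want > have ? want - have : 0;
}
```

Take a 5-byte request with 3 bytes of padding and a 16-byte drain
buffer, so data_len = 5 and extra_len = 19. If the last data entry is
not extended, the list is {5, 16} and the driver would run 3 bytes past
it. If the entry is extended to 8, the list is {8, 16} and the lengths
agree.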
On Mon, 03 Mar 2008 23:22:46 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> >> At the end of blk_rq_map_user() together with data_len / extra_len
> >> mangling or were you talking about James' original patch?
> >
> > With my patch, at the end of blk_rq_map_user, we have:
> >
> > if (len & queue_dma_alignment(q)) {
> > unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
> >
> > rq->extra_len += pad_len;
> > }
> >
> >
> > So no change as compared with 2.6.24?
>
> Oh.. you killed sg list manipulation. Many controllers do allow odd
> bytes as the last sg entry but not all. Also, if you append drain
Until 2.6.24, these drivers have taken care of the issue by
themselves. There is no change as compared with 2.6.24.
> buffer after it, it ends up with unaligned sg entry in the middle and
> rq->data_len + rq->extra_len will overrun the sg entry after the drain
> page which is really dangerous.
The drivers know that they use drain buffer. They can take care of
this themselves too. If we want to do it explicitly, we could have
rq->pad_len and rq->drain_len instead of rq->extra_len, though I think
that we are fine without these values because these drivers already
tell the block layer what they want and know that the block layer
gives it.
Jens, what's your verdict on this?
On Mon, 2008-03-03 at 15:10 +0900, Tejun Heo wrote:
> Block layer alignment was used for two different purposes - memory
> alignment and padding. This causes problems in lower layers because
> drivers which only require memory alignment ends up with adjusted
> rq->data_len. Separate out padding such that padding occurs iff
> driver explicitly requests it.
This puts the libsas SMP handler back into a working state again.
Thanks,
James
FUJITA Tomonori wrote:
> On Mon, 03 Mar 2008 23:22:46 +0900
> Tejun Heo <[email protected]> wrote:
>
>> FUJITA Tomonori wrote:
>>>> At the end of blk_rq_map_user() together with data_len / extra_len
>>>> mangling or were you talking about James' original patch?
>>> With my patch, at the end of blk_rq_map_user, we have:
>>>
>>> if (len & queue_dma_alignment(q)) {
>>> unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
>>>
>>> rq->extra_len += pad_len;
>>> }
>>>
>>>
>>> So no change as compared with 2.6.24?
>> Oh.. you killed sg list manipulation. Many controllers do allow odd
>> bytes as the last sg entry but not all. Also, if you append drain
>
> Until 2.6.24, these drivers have taken care about the issue by
> themselves. There is no change as compared with 2.6.24.
Yeah, libata did its own padding and needed to add draining. Private
implementation was complex as hell and James suggested moving them to
block layer. Are you suggesting moving them back to drivers?
>> buffer after it, it ends up with unaligned sg entry in the middle and
>> rq->data_len + rq->extra_len will overrun the sg entry after the drain
>> page which is really dangerous.
>
> The drivers know that they use drain buffer. They can take care about
> themselves on this too. If we want to do explicitly, we could have
> rq->pad_len and rq->drain_len instead of rq->extra_len, though I think
> that we are fine without these values because these drivers already
> tell the block layer what they want and know that the block layer
> gives it.
So, if a driver has requested aligning and draining, the driver should
extend the sg entry before the last one by the alignment if draining was
used for the request and extend the last sg if the draining wasn't used.
I'd rather just implement them in the drivers.
--
tejun
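For readers following the sg discussion above, here is a minimal sketch of the two pieces being argued about: the pad length arithmetic from the quoted snippet, and the rule for which sg entry would have to grow (the one before the drain entry if draining was used, the last one otherwise). The names struct sg_ent, pad_len_for() and pad_data_tail() are hypothetical and exist only for illustration; this is not the kernel scatterlist API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scatterlist entry; only the length matters. */
struct sg_ent {
	unsigned int length;
};

/*
 * Same arithmetic as the quoted snippet: bytes needed to round len up to
 * a multiple of (mask + 1). Returns 0 when len is already aligned.
 */
static unsigned int pad_len_for(unsigned int len, unsigned int mask)
{
	assert(((mask + 1) & mask) == 0);	/* mask + 1 must be a power of two */
	if (!(len & mask))
		return 0;
	return (mask & ~len) + 1;
}

/*
 * Extend the entry holding the tail of the real data. If a drain entry
 * was appended it is the last entry, so the data tail is the one before
 * it; otherwise the data tail is the last entry.
 * Returns the index that was extended, or -1 if nothing was changed.
 */
static int pad_data_tail(struct sg_ent *sg, size_t nents, int drain_used,
			 unsigned int pad_len)
{
	size_t tail;

	if (pad_len == 0 || nents == 0)
		return -1;
	if (drain_used) {
		if (nents < 2)
			return -1;
		tail = nents - 2;
	} else {
		tail = nents - 1;
	}
	sg[tail].length += pad_len;
	return (int)tail;
}
```

With a 10-byte data tail, a 4-byte pad mask (mask = 3) and a drain entry at the end, the pad length is 2 and it lands on the second-to-last entry, leaving the drain entry untouched.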
On Tue, 04 Mar 2008 07:44:13 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> > On Mon, 03 Mar 2008 23:22:46 +0900
> > Tejun Heo <[email protected]> wrote:
> >
> >> FUJITA Tomonori wrote:
> >>>> At the end of blk_rq_map_user() together with data_len / extra_len
> >>>> mangling or were you talking about James' original patch?
> >>> With my patch, at the end of blk_rq_map_user, we have:
> >>>
> >>> if (len & queue_dma_alignment(q)) {
> >>> unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
> >>>
> >>> rq->extra_len += pad_len;
> >>> }
> >>>
> >>>
> >>> So no change as compared with 2.6.24?
> >> Oh.. you killed sg list manipulation. Many controllers do allow odd
> >> bytes as the last sg entry but not all. Also, if you append drain
> >
> > Until 2.6.24, these drivers have taken care about the issue by
> > themselves. There is no change as compared with 2.6.24.
>
> Yeah, libata did its own padding and needed to add draining. Private
> implementation was complex as hell and James suggested moving them to
> block layer. Are you suggesting moving them back to drivers?
No, I'm not. I've been working on the IOMMUs to remove such
workarounds in LLDs.
What drivers need to do on this is just adding a padding length, that
is, drivers don't need to change the structure of the sg list (like
splitting a sg entry), right? And it doesn't break the SAS drivers
that support SATAPI, does it?
But I agree that drivers want to get a complete sglist so I'm fine
with adjusting sglist entries in the block layer with your second
patch (separate out padding from alignment). As we discussed, I'm fine
with breaking sum(sg) == rq->data_len as long as rq->data_len means
the true data length.
> >> buffer after it, it ends up with unaligned sg entry in the middle and
> >> rq->data_len + rq->extra_len will overrun the sg entry after the drain
> >> page which is really dangerous.
> >
> > The drivers know that they use drain buffer. They can take care about
> > themselves on this too. If we want to do explicitly, we could have
> > rq->pad_len and rq->drain_len instead of rq->extra_len, though I think
> > that we are fine without these values because these drivers already
> > tell the block layer what they want and know that the block layer
> > gives it.
>
> So, if a driver has requested aligning and draining, the driver should
> extend the sg entry before the last one by the alignment if draining was
> used for the request and extend the last sg if the draining wasn't used.
> I'd rather just implement them in the drivers.
The block layer extends the sg entry? The drivers just adjust
sg->length?
FUJITA Tomonori wrote:
>> Yeah, libata did its own padding and needed to add draining. Private
>> implementation was complex as hell and James suggested moving them to
>> block layer. Are you suggesting moving them back to drivers?
>
> No, I'm not. I've been working on the IOMMUs to remove such
> workarounds in LLDs.
>
> What drivers need to do on this is just adding a padding length, that
> is, drivers don't need to change the structure of the sg list (like
> splitting a sg entry), right? And it doesn't break the SAS drivers
> that support SATAPI, does it?
>
> But I agree that drivers want to get a complete sglist so I'm fine
> with adjusting sglist entries in the block layer with your second
> patch (separate out padding from alignment). As we discussed, I'm fine
> with breaking sum(sg) == rq->data_len as long as rq->data_len means
> the true data length.
As long as the second patch is in, what value rq->data_len indicates
doesn't matter to drivers which don't use explicit padding or draining,
so the situation is much more controlled. I don't care which value
rq->data_len would indicate. I'd prefer it equal sum(sg) as that value
is what IDE and libata which will be the major users of padding and/or
draining expect in rq->data_len but fixing up that shouldn't be too
difficult. I guess this can be determined by Jens. If Jens likes
rq->data_len to contain requested transfer size, I'll post updated patches.
>>>> buffer after it, it ends up with unaligned sg entry in the middle and
>>>> rq->data_len + rq->extra_len will overrun the sg entry after the drain
>>>> page which is really dangerous.
>>> The drivers know that they use drain buffer. They can take care about
>>> themselves on this too. If we want to do explicitly, we could have
>>> rq->pad_len and rq->drain_len instead of rq->extra_len, though I think
>>> that we are fine without these values because these drivers already
>>> tell the block layer what they want and know that the block layer
>>> gives it.
>> So, if a driver has requested aligning and draining, the driver should
>> extend the sg entry before the last one by the alignment if draining was
>> used for the request and extend the last sg if the draining wasn't used.
>> I'd rather just implement them in the drivers.
>
> The block layer extends the sg entry? The drivers just adjust
> sg->length?
Still, do you really wanna force such things into low level drivers?
That will be one extremely fragile API and will be really difficult to
tell when things go wrong.
--
tejun
On Tue, Mar 04 2008, FUJITA Tomonori wrote:
> On Tue, 04 Mar 2008 11:32:56 +0900
> Tejun Heo <[email protected]> wrote:
>
> > FUJITA Tomonori wrote:
> > >> Yeah, libata did its own padding and needed to add draining. Private
> > >> implementation was complex as hell and James suggested moving them to
> > >> block layer. Are you suggesting moving them back to drivers?
> > >
> > > No, I'm not. I've been working on the IOMMUs to remove such
> > > workarounds in LLDs.
> > >
> > > What drivers need to do on this is just adding a padding length, that
> > > is, drivers don't need to change the structure of the sg list (like
> > > splitting a sg entry), right? And it doesn't break the SAS drivers
> > > that support SATAPI, does it?
> > >
> > > But I agree that drivers want to get a complete sglist so I'm fine
> > > with adjusting sglist entries in the block layer with your second
> > > patch (separate out padding from alignment). As we discussed, I'm fine
> > > with breaking sum(sg) == rq->data_len as long as rq->data_len means
> > > the true data length.
> >
> > As long as the second patch is in, what value rq->data_len indicates
> > doesn't matter to drivers which don't use explicit padding or draining,
> > so the situation is much more controlled. I don't care which value
> > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value
> > is what IDE and libata which will be the major users of padding and/or
> > draining expect in rq->data_len but fixing up that shouldn't be too
> > difficult. I guess this can be determined by Jens. If Jens likes
> > rq->data_len to contain requested transfer size, I'll post updated patches.
>
> OK, I prefer rq->data_len means the true data length though you prefer
> rq->data_len means the allocated buffer length (the true data length
> plus padding and drain). We agree on other things. We can live with
> either way.
>
> Jens, what's your preference?
I completely agree with you, ->data_len meaning true data length is way
cleaner imho. Only the driver should care for the padded length, all
other parts of the kernel only need to know what they actually got.
--
Jens Axboe
Hello, Jens.
Jens Axboe wrote:
> I completely agree with you, ->data_len meaning true data length is way
> cleaner imho. Only the driver should care for the padded length, all
> other parts of the kernel only need to know what they actually got.
Oh well, I guess I'm the one with strange taste here. My logic is that
the only thing below the block layer is the driver
which requested size adjustment. This means residual bytes calculation
is pushed to low level drivers which isn't anything major but still.
Anyways, I'll review FUJITA's modified patch.
Thanks.
--
tejun
FUJITA Tomonori wrote:
> OK, I've updated his patch. Tejun, can you audit this?
Looks good to me.
Thanks.
--
tejun
On Tue, Mar 04 2008, FUJITA Tomonori wrote:
> On Tue, 04 Mar 2008 18:06:48 +0900
> FUJITA Tomonori <[email protected]> wrote:
>
> > On Tue, 4 Mar 2008 09:59:46 +0100
> > Jens Axboe <[email protected]> wrote:
> >
> > > On Tue, Mar 04 2008, FUJITA Tomonori wrote:
> > > > On Tue, 04 Mar 2008 11:32:56 +0900
> > > > Tejun Heo <[email protected]> wrote:
> > > >
> > > > > FUJITA Tomonori wrote:
> > > > > >> Yeah, libata did its own padding and needed to add draining. Private
> > > > > >> implementation was complex as hell and James suggested moving them to
> > > > > >> block layer. Are you suggesting moving them back to drivers?
> > > > > >
> > > > > > No, I'm not. I've been working on the IOMMUs to remove such
> > > > > > workarounds in LLDs.
> > > > > >
> > > > > > What drivers need to do on this is just adding a padding length, that
> > > > > > is, drivers don't need to change the structure of the sg list (like
> > > > > > splitting a sg entry), right? And it doesn't break the SAS drivers
> > > > > > that support SATAPI, does it?
> > > > > >
> > > > > > But I agree that drivers want to get a complete sglist so I'm fine
> > > > > > with adjusting sglist entries in the block layer with your second
> > > > > > patch (separate out padding from alignment). As we discussed, I'm fine
> > > > > > with breaking sum(sg) == rq->data_len as long as rq->data_len means
> > > > > > the true data length.
> > > > >
> > > > > As long as the second patch is in, what value rq->data_len indicates
> > > > > doesn't matter to drivers which don't use explicit padding or draining,
> > > > > so the situation is much more controlled. I don't care which value
> > > > > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value
> > > > > is what IDE and libata which will be the major users of padding and/or
> > > > > draining expect in rq->data_len but fixing up that shouldn't be too
> > > > > difficult. I guess this can be determined by Jens. If Jens likes
> > > > > rq->data_len to contain requested transfer size, I'll post updated patches.
> > > >
> > > > OK, I prefer rq->data_len means the true data length though you prefer
> > > > rq->data_len means the allocated buffer length (the true data length
> > > > plus padding and drain). We agree on other things. We can live with
> > > > either way.
> > > >
> > > > Jens, what's your preference?
> > >
> > > I completely agree with you, ->data_len meaning true data length is way
> > > cleaner imho. Only the driver should care for the padded length, all
> > > other parts of the kernel only need to know what they actually got.
> >
> > OK, now we can fix the whole SG_IO (and bsg handler) mess.
> >
> > Here's my patch with a proper description, which several people have
> > already tested (thanks!). Then we need an updated version of Tejun's
> > separate out padding from alignment patch.
>
> OK, I've updated his patch. Tejun, can you audit this?
Looks excellent to me, has a variant of this been tested as OK by the
users reporting the regression?
--
Jens Axboe
Jens Axboe wrote:
> Looks excellent to me, has a variant of this been tested as OK by the
> users reporting the regression?
Yeah, the other version which added extra_len to data_len has been
verified to work. The only difference is now libata is adding
extra_len, so this one should be safe.
--
tejun
On Tue, Mar 04 2008, Tejun Heo wrote:
> Jens Axboe wrote:
> > Looks excellent to me, has a variant of this been tested as OK by the
> > users reporting the regression?
>
> Yeah, the other version which added extra_len to data_len has been
> verified to work. The only difference is now libata is adding
> extra_len, so this one should be safe.
Great, since we all agree, I'll merge it up and pass it on.
--
Jens Axboe
On Tue, 04 Mar 2008 11:32:56 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> >> Yeah, libata did its own padding and needed to add draining. Private
> >> implementation was complex as hell and James suggested moving them to
> >> block layer. Are you suggesting moving them back to drivers?
> >
> > No, I'm not. I've been working on the IOMMUs to remove such
> > workarounds in LLDs.
> >
> > What drivers need to do on this is just adding a padding length, that
> > is, drivers don't need to change the structure of the sg list (like
> > splitting a sg entry), right? And it doesn't break the SAS drivers
> > that support SATAPI, does it?
> >
> > But I agree that drivers want to get a complete sglist so I'm fine
> > with adjusting sglist entries in the block layer with your second
> > patch (separate out padding from alignment). As we discussed, I'm fine
> > with breaking sum(sg) == rq->data_len as long as rq->data_len means
> > the true data length.
>
> As long as the second patch is in, what value rq->data_len indicates
> doesn't matter to drivers which don't use explicit padding or draining,
> so the situation is much more controlled. I don't care which value
> rq->data_len would indicate. I'd prefer it equal sum(sg) as that value
> is what IDE and libata which will be the major users of padding and/or
> draining expect in rq->data_len but fixing up that shouldn't be too
> difficult. I guess this can be determined by Jens. If Jens likes
> rq->data_len to contain requested transfer size, I'll post updated patches.
OK, I prefer rq->data_len means the true data length though you prefer
rq->data_len means the allocated buffer length (the true data length
plus padding and drain). We agree on other things. We can live with
either way.
Jens, what's your preference?
> >>>> buffer after it, it ends up with unaligned sg entry in the middle and
> >>>> rq->data_len + rq->extra_len will overrun the sg entry after the drain
> >>>> page which is really dangerous.
> >>> The drivers know that they use drain buffer. They can take care about
> >>> themselves on this too. If we want to do explicitly, we could have
> >>> rq->pad_len and rq->drain_len instead of rq->extra_len, though I think
> >>> that we are fine without these values because these drivers already
> >>> tell the block layer what they want and know that the block layer
> >>> gives it.
> >> So, if a driver has requested aligning and draining, the driver should
> >> extend the sg entry before the last one by the alignment if draining was
> >> used for the request and extend the last sg if the draining wasn't used.
> >> I'd rather just implement them in the drivers.
> >
> > The block layer extends the sg entry? The drivers just adjust
> > sg->length?
>
> Still, do you really wanna force such things into low level drivers?
> That will be one extremely fragile API and will be really difficult to
> tell when things go wrong.
No, I don't, as I explained above. As long as rq->data_len means the
true data length, I'm fine. I knew that James' drain buffer patch
breaks rq->data_len == sum(sg). I don't care about it. I can
understand that drivers want a perfect sglist.
On Tue, 04 Mar 2008 18:06:48 +0900
FUJITA Tomonori <[email protected]> wrote:
> On Tue, 4 Mar 2008 09:59:46 +0100
> Jens Axboe <[email protected]> wrote:
>
> > On Tue, Mar 04 2008, FUJITA Tomonori wrote:
> > > On Tue, 04 Mar 2008 11:32:56 +0900
> > > Tejun Heo <[email protected]> wrote:
> > >
> > > > FUJITA Tomonori wrote:
> > > > >> Yeah, libata did its own padding and needed to add draining. Private
> > > > >> implementation was complex as hell and James suggested moving them to
> > > > >> block layer. Are you suggesting moving them back to drivers?
> > > > >
> > > > > No, I'm not. I've been working on the IOMMUs to remove such
> > > > > workarounds in LLDs.
> > > > >
> > > > > What drivers need to do on this is just adding a padding length, that
> > > > > is, drivers don't need to change the structure of the sg list (like
> > > > > splitting a sg entry), right? And it doesn't break the SAS drivers
> > > > > that support SATAPI, does it?
> > > > >
> > > > > But I agree that drivers want to get a complete sglist so I'm fine
> > > > > with adjusting sglist entries in the block layer with your second
> > > > > patch (separate out padding from alignment). As we discussed, I'm fine
> > > > > with breaking sum(sg) == rq->data_len as long as rq->data_len means
> > > > > the true data length.
> > > >
> > > > As long as the second patch is in, what value rq->data_len indicates
> > > > doesn't matter to drivers which don't use explicit padding or draining,
> > > > so the situation is much more controlled. I don't care which value
> > > > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value
> > > > is what IDE and libata which will be the major users of padding and/or
> > > > draining expect in rq->data_len but fixing up that shouldn't be too
> > > > difficult. I guess this can be determined by Jens. If Jens likes
> > > > rq->data_len to contain requested transfer size, I'll post updated patches.
> > >
> > > OK, I prefer rq->data_len means the true data length though you prefer
> > > rq->data_len means the allocated buffer length (the true data length
> > > plus padding and drain). We agree on other things. We can live with
> > > either way.
> > >
> > > Jens, what's your preference?
> >
> > I completely agree with you, ->data_len meaning true data length is way
> > cleaner imho. Only the driver should care for the padded length, all
> > other parts of the kernel only need to know what they actually got.
>
> OK, now we can fix the whole SG_IO (and bsg handler) mess.
>
> Here's my patch with a proper description, which several people have
> already tested (thanks!). Then we need an updated version of Tejun's
> separate out padding from alignment patch.
OK, I've updated his patch. Tejun, can you audit this?
Thanks,
=
From: Tejun Heo <[email protected]>
Subject: [PATCH] block: separate out padding from alignment
Block layer alignment was used for two different purposes - memory
alignment and padding. This causes problems in lower layers because
drivers which only require memory alignment end up with adjusted
rq->data_len. Separate out padding such that padding occurs iff
driver explicitly requests it.
Tomo: restore the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa
according to padding alignment.
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: FUJITA Tomonori <[email protected]>
---
block/blk-map.c | 20 +++++++++++++-------
block/blk-settings.c | 17 +++++++++++++++++
drivers/ata/libata-scsi.c | 3 ++-
include/linux/blkdev.h | 2 ++
4 files changed, 34 insertions(+), 8 deletions(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index f559832..4e17dfd 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -43,6 +43,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
void __user *ubuf, unsigned int len)
{
unsigned long uaddr;
+ unsigned int alignment;
struct bio *bio, *orig_bio;
int reading, ret;
@@ -53,8 +54,8 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
* direct dma. else, set up kernel bounce buffers
*/
uaddr = (unsigned long) ubuf;
- if (!(uaddr & queue_dma_alignment(q)) &&
- !(len & queue_dma_alignment(q)))
+ alignment = queue_dma_alignment(q) | q->dma_pad_mask;
+ if (!(uaddr & alignment) && !(len & alignment))
bio = bio_map_user(q, NULL, uaddr, len, reading);
else
bio = bio_copy_user(q, uaddr, len, reading);
@@ -141,15 +142,20 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
/*
* __blk_rq_map_user() copies the buffers if starting address
- * or length isn't aligned. As the copied buffer is always
- * page aligned, we know that there's enough room for padding.
- * Extend the last bio and update rq->data_len accordingly.
+ * or length isn't aligned to dma_pad_mask. As the copied
+ * buffer is always page aligned, we know that there's enough
+ * room for padding. Extend the last bio and update
+ * rq->extra_len accordingly.
*
* On unmap, bio_uncopy_user() will use unmodified
* bio_map_data pointed to by bio->bi_private.
*/
- if (len & queue_dma_alignment(q)) {
- unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
+ if (len & q->dma_pad_mask) {
+ unsigned int pad_len = (q->dma_pad_mask & ~len) + 1;
+ struct bio *bio = rq->biotail;
+
+ bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
+ bio->bi_size += pad_len;
rq->extra_len += pad_len;
}
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 9a8ffdd..5fcb625 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -293,6 +293,23 @@ void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b)
EXPORT_SYMBOL(blk_queue_stack_limits);
/**
+ * blk_queue_dma_pad - set pad mask
+ * @q: the request queue for the device
+ * @mask: pad mask
+ *
+ * Set pad mask. Direct IO requests are padded to the mask specified.
+ *
+ * Appending pad buffer to a request extends the last bio and adds the
+ * pad length to ->extra_len. ->data_len keeps the original requested
+ * data length.
+ **/
+void blk_queue_dma_pad(struct request_queue *q, unsigned int mask)
+{
+ q->dma_pad_mask = mask;
+}
+EXPORT_SYMBOL(blk_queue_dma_pad);
+
+/**
* blk_queue_dma_drain - Set up a drain buffer for excess dma.
*
* @q: the request queue for the device
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index fe47922..8f0e8f2 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -862,9 +862,10 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
struct request_queue *q = sdev->request_queue;
void *buf;
- /* set the min alignment */
+ /* set the min alignment and padding */
blk_queue_update_dma_alignment(sdev->request_queue,
ATA_DMA_PAD_SZ - 1);
+ blk_queue_dma_pad(sdev->request_queue, ATA_DMA_PAD_SZ - 1);
/* configure draining */
buf = kmalloc(ATAPI_MAX_DRAIN, q->bounce_gfp | GFP_KERNEL);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b72526c..6f79d40 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -362,6 +362,7 @@ struct request_queue
unsigned long seg_boundary_mask;
void *dma_drain_buffer;
unsigned int dma_drain_size;
+ unsigned int dma_pad_mask;
unsigned int dma_alignment;
struct blk_queue_tag *queue_tags;
@@ -701,6 +702,7 @@ extern void blk_queue_max_hw_segments(struct request_queue *, unsigned short);
extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
extern void blk_queue_hardsect_size(struct request_queue *, unsigned short);
extern void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b);
+extern void blk_queue_dma_pad(struct request_queue *, unsigned int);
extern int blk_queue_dma_drain(struct request_queue *q,
dma_drain_needed_fn *dma_drain_needed,
void *buf, unsigned int size);
--
1.5.3.6
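To make the effect of the patch above easier to see outside the kernel tree, here is a minimal user-space sketch of the two decisions it changes: whether a user buffer can be mapped directly or must be bounced, and how much padding goes into extra_len. struct queue_sketch, can_map_directly() and extra_pad() are hypothetical names for illustration only; the fields mirror dma_alignment and dma_pad_mask from the patch, and the arithmetic is copied from the blk-map.c hunks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two request_queue fields involved. */
struct queue_sketch {
	unsigned int dma_alignment;	/* memory alignment mask */
	unsigned int dma_pad_mask;	/* padding mask, 0 if no padding requested */
};

/*
 * Mirrors the check in __blk_rq_map_user(): direct mapping is allowed
 * only if both address and length satisfy the combined mask; otherwise
 * the data would go through a copied (bounce) buffer.
 */
static int can_map_directly(const struct queue_sketch *q, uintptr_t uaddr,
			    unsigned int len)
{
	unsigned int alignment = q->dma_alignment | q->dma_pad_mask;

	return !(uaddr & alignment) && !(len & alignment);
}

/*
 * Mirrors the pad computation in blk_rq_map_user(): padding depends on
 * dma_pad_mask only, so a queue that sets just dma_alignment gets none.
 */
static unsigned int extra_pad(const struct queue_sketch *q, unsigned int len)
{
	assert(((q->dma_pad_mask + 1) & q->dma_pad_mask) == 0);
	if (!(len & q->dma_pad_mask))
		return 0;
	return (q->dma_pad_mask & ~len) + 1;
}
```

In this sketch a queue with dma_alignment = 511 and no pad mask still bounces a 10-byte transfer but adds no padding, whereas a queue with both masks set to 3 (as libata does with ATA_DMA_PAD_SZ - 1) pads the same transfer by 2 bytes.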
On Tue, 4 Mar 2008 09:59:46 +0100
Jens Axboe <[email protected]> wrote:
> On Tue, Mar 04 2008, FUJITA Tomonori wrote:
> > On Tue, 04 Mar 2008 11:32:56 +0900
> > Tejun Heo <[email protected]> wrote:
> >
> > > FUJITA Tomonori wrote:
> > > >> Yeah, libata did its own padding and needed to add draining. Private
> > > >> implementation was complex as hell and James suggested moving them to
> > > >> block layer. Are you suggesting moving them back to drivers?
> > > >
> > > > No, I'm not. I've been working on the IOMMUs to remove such
> > > > workarounds in LLDs.
> > > >
> > > > What drivers need to do on this is just adding a padding length, that
> > > > is, drivers don't need to change the structure of the sg list (like
> > > > splitting a sg entry), right? And it doesn't break the SAS drivers
> > > > that support SATAPI, does it?
> > > >
> > > > But I agree that drivers want to get a complete sglist so I'm fine
> > > > with adjusting sglist entries in the block layer with your second
> > > > patch (separate out padding from alignment). As we discussed, I'm fine
> > > > with breaking sum(sg) == rq->data_len as long as rq->data_len means
> > > > the true data length.
> > >
> > > As long as the second patch is in, what value rq->data_len indicates
> > > doesn't matter to drivers which don't use explicit padding or draining,
> > > so the situation is much more controlled. I don't care which value
> > > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value
> > > is what IDE and libata which will be the major users of padding and/or
> > > draining expect in rq->data_len but fixing up that shouldn't be too
> > > difficult. I guess this can be determined by Jens. If Jens likes
> > > rq->data_len to contain requested transfer size, I'll post updated patches.
> >
> > OK, I prefer rq->data_len means the true data length though you prefer
> > rq->data_len means the allocated buffer length (the true data length
> > plus padding and drain). We agree on other things. We can live with
> > either way.
> >
> > Jens, what's your preference?
>
> I completely agree with you, ->data_len meaning true data length is way
> cleaner imho. Only the driver should care for the padded length, all
> other parts of the kernel only need to know what they actually got.
OK, now we can fix the whole SG_IO (and bsg handler) mess.
Here's my patch with a proper description, which several people have
already tested (thanks!). Then we need an updated version of Tejun's
separate out padding from alignment patch.
=
From: FUJITA Tomonori <[email protected]>
Subject: [PATCH] block: restore the meaning of rq->data_len to the true data length
The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and
bsg. This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).
This patch also removes the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).
Signed-off-by: FUJITA Tomonori <[email protected]>
---
block/blk-core.c | 3 +--
block/blk-map.c | 6 +-----
block/blk-merge.c | 2 +-
block/bsg.c | 8 ++++----
block/scsi_ioctl.c | 4 ++--
drivers/ata/libata-scsi.c | 6 +++---
include/linux/blkdev.h | 2 +-
7 files changed, 13 insertions(+), 18 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 775c851..bfec406 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -127,7 +127,6 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->nr_hw_segments = 0;
rq->ioprio = 0;
rq->special = NULL;
- rq->raw_data_len = 0;
rq->buffer = NULL;
rq->tag = -1;
rq->errors = 0;
@@ -135,6 +134,7 @@ void rq_init(struct request_queue *q, struct request *rq)
rq->cmd_len = 0;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->data_len = 0;
+ rq->extra_len = 0;
rq->sense_len = 0;
rq->data = NULL;
rq->sense = NULL;
@@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
- rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
rq->bio = rq->biotail = bio;
diff --git a/block/blk-map.c b/block/blk-map.c
index 09f7fd0..f559832 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq,
rq->biotail->bi_next = bio;
rq->biotail = bio;
- rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -151,11 +150,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
*/
if (len & queue_dma_alignment(q)) {
unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
- struct bio *bio = rq->biotail;
- bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
- bio->bi_size += pad_len;
- rq->data_len += pad_len;
+ rq->extra_len += pad_len;
}
rq->buffer = rq->data = NULL;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 7506c4f..0f58616 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -231,7 +231,7 @@ new_segment:
((unsigned long)q->dma_drain_buffer) &
(PAGE_SIZE - 1));
nsegs++;
- rq->data_len += q->dma_drain_size;
+ rq->extra_len += q->dma_drain_size;
}
if (sg)
diff --git a/block/bsg.c b/block/bsg.c
index 7f3c095..8917c51 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
}
if (rq->next_rq) {
- hdr->dout_resid = rq->raw_data_len;
- hdr->din_resid = rq->next_rq->raw_data_len;
+ hdr->dout_resid = rq->data_len;
+ hdr->din_resid = rq->next_rq->data_len;
blk_rq_unmap_user(bidi_bio);
blk_put_request(rq->next_rq);
} else if (rq_data_dir(rq) == READ)
- hdr->din_resid = rq->raw_data_len;
+ hdr->din_resid = rq->data_len;
else
- hdr->dout_resid = rq->raw_data_len;
+ hdr->dout_resid = rq->data_len;
/*
* If the request generated a negative error number, return it
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index e993cac..a2c3a93 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr,
hdr->info = 0;
if (hdr->masked_status || hdr->host_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
- hdr->resid = rq->raw_data_len;
+ hdr->resid = rq->data_len;
hdr->sb_len_wr = 0;
if (rq->sense_len && hdr->sbp) {
@@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk,
rq = blk_get_request(q, WRITE, __GFP_WAIT);
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->data = NULL;
- rq->raw_data_len = 0;
rq->data_len = 0;
+ rq->extra_len = 0;
rq->timeout = BLK_DEFAULT_SG_TIMEOUT;
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd[0] = cmd;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7b1f1ee..fe47922 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2538,7 +2538,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
}
qc->tf.command = ATA_CMD_PACKET;
- qc->nbytes = scsi_bufflen(scmd);
+ qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len;
/* check whether ATAPI DMA is safe */
if (!using_pio && ata_check_atapi_dma(qc))
@@ -2549,7 +2549,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
* want to set it properly, and for DMA where it is
* effectively meaningless.
*/
- nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024);
+ nbytes = min(scmd->request->data_len, (unsigned int)63 * 1024);
/* Most ATAPI devices which honor transfer chunk size don't
* behave according to the spec when odd chunk size which
@@ -2875,7 +2875,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
* TODO: find out if we need to do more here to
* cover scatter/gather case.
*/
- qc->nbytes = scsi_bufflen(scmd);
+ qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len;
/* request result TF and be quiet about device error */
qc->flags |= ATA_QCFLAG_RESULT_TF | ATA_QCFLAG_QUIET;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6fe67d1..b72526c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -216,8 +216,8 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
- unsigned int raw_data_len;
unsigned int data_len;
+ unsigned int extra_len; /* length of alignment and padding */
unsigned int sense_len;
void *data;
void *sense;
--
1.5.3.6
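As a rough illustration of the accounting this patch settles on, the sketch below keeps the true data length and the extra length (padding plus drain) in separate counters and derives the two values libata computes from them. struct rq_sketch and the helper names are hypothetical, not kernel API; the two expressions they mirror are taken from the libata-scsi.c hunks above.

```c
#include <assert.h>

/* Hypothetical stand-in for the two struct request fields involved. */
struct rq_sketch {
	unsigned int data_len;	/* true data length, what SG_IO and bsg see */
	unsigned int extra_len;	/* padding plus drain, driver-only concern */
};

/* Real payload: counted in data_len only. */
static void rq_add_data(struct rq_sketch *rq, unsigned int bytes)
{
	assert(rq->data_len + bytes >= rq->data_len);	/* no wraparound */
	rq->data_len += bytes;
}

/* Padding or drain buffer: counted in extra_len only. */
static void rq_add_extra(struct rq_sketch *rq, unsigned int bytes)
{
	assert(rq->extra_len + bytes >= rq->extra_len);	/* no wraparound */
	rq->extra_len += bytes;
}

/* Mirrors: qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len */
static unsigned int dma_nbytes(const struct rq_sketch *rq)
{
	return rq->data_len + rq->extra_len;
}

/* Mirrors: nbytes = min(scmd->request->data_len, (unsigned int)63 * 1024) */
static unsigned int atapi_byte_limit(const struct rq_sketch *rq)
{
	unsigned int cap = 63u * 1024u;

	return rq->data_len < cap ? rq->data_len : cap;
}
```

For a 10-byte request with 2 bytes of padding and a 512-byte drain buffer, the size handed to the controller is 524 bytes while the ATAPI byte count limit and anything reported back to user space are still based on 10.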
On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
> Looks excellent to me, has a variant of this been tested as OK by the
> users reporting the regression?
K3b burning seems to be a nogo here. This is git pulled this morning
though, so it's a somewhat different tree than previously tested fwtw.
[ 136.440021] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 136.440043] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
[ 136.440045] cdb 51 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
[ 136.440047] res 58/00:02:00:02:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation)
[ 136.440053] ata1.01: status: { DRDY DRQ }
[ 136.440086] ata1: soft resetting link
[ 165.327627] ata1.01: qc timeout (cmd 0xa1)
[ 165.327627] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 165.327627] ata1.01: revalidation failed (errno=-5)
[ 165.327627] ata1: failed to recover some devices, retrying in 5 secs
[ 177.272373] ata1: port is slow to respond, please be patient (Status 0x80)
[ 180.388879] ata1: device not ready (errno=-16), forcing hardreset
[ 180.388879] ata1: soft resetting link
[ 210.832471] ata1.01: qc timeout (cmd 0xa1)
[ 210.832471] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 210.832471] ata1.01: revalidation failed (errno=-5)
[ 210.832471] ata1: failed to recover some devices, retrying in 5 secs
[ 223.392899] ata1: port is slow to respond, please be patient (Status 0x80)
[ 225.920376] ata1: device not ready (errno=-16), forcing hardreset
[ 225.920376] ata1: soft resetting link
[ 256.542565] ata1.01: qc timeout (cmd 0xa1)
[ 256.542565] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 256.542565] ata1.01: revalidation failed (errno=-5)
[ 256.542565] ata1.01: disabled
[ 259.995199] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
[ 259.995214] ata1.00: revalidation failed (errno=-5)
[ 259.995219] ata1: failed to recover some devices, retrying in 5 secs
[ 265.047502] ata1: soft resetting link
[ 262.397570] ata1.00: limited to UDMA/33 due to 40-wire cable
[ 262.420039] ata1.00: configured for UDMA/33
[ 262.420039] sr 0:0:1:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[ 262.420039] sr 0:0:1:0: [sr0] Sense Key : Aborted Command [current] [descriptor]
[ 262.420039] Descriptor sense data with sense descriptors (in hex):
[ 262.420039] 72 0b 47 00 00 00 00 0e 09 0c 00 00 00 02 00 00
[ 262.420039] 00 02 00 00 b0 58
[ 262.420039] sr 0:0:1:0: [sr0] Add. Sense: Scsi parity error
[ 262.420039] ata1: EH complete
[ 262.420257] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[ 262.420320] sd 0:0:0:0: [sda] Write Protect is off
[ 262.420326] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 262.420390] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
>
> > Looks excellent to me, has a variant of this been tested as OK by the
> > users reporting the regression?
>
> K3b burning seems to be a nogo here. This is git pulled this morning
> though, so it's a somewhat different tree than previously tested fwtw.
can you please try git as of this morning without any patches applied,
and then pull
git://git.kernel.dk/linux-2.6-block.git for-linus
into that and see if that works?
--
Jens Axboe
Mike Galbraith wrote:
> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
>
>> Looks excellent to me, has a variant of this been tested as OK by the
>> users reporting the regression?
>
> K3b burning seems to be a nogo here. This is git pulled this morning
> though, so it's a somewhat different tree than previously tested fwtw.
>
> [ 136.440021] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> [ 136.440043] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
> [ 136.440045] cdb 51 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
> [ 136.440047] res 58/00:02:00:02:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation)
> [ 136.440053] ata1.01: status: { DRDY DRQ }
> [ 136.440086] ata1: soft resetting link
> [ 165.327627] ata1.01: qc timeout (cmd 0xa1)
> [ 165.327627] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 165.327627] ata1.01: revalidation failed (errno=-5)
> [ 165.327627] ata1: failed to recover some devices, retrying in 5 secs
> [ 177.272373] ata1: port is slow to respond, please be patient (Status 0x80)
> [ 180.388879] ata1: device not ready (errno=-16), forcing hardreset
> [ 180.388879] ata1: soft resetting link
> [ 210.832471] ata1.01: qc timeout (cmd 0xa1)
> [ 210.832471] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 210.832471] ata1.01: revalidation failed (errno=-5)
> [ 210.832471] ata1: failed to recover some devices, retrying in 5 secs
> [ 223.392899] ata1: port is slow to respond, please be patient (Status 0x80)
> [ 225.920376] ata1: device not ready (errno=-16), forcing hardreset
> [ 225.920376] ata1: soft resetting link
> [ 256.542565] ata1.01: qc timeout (cmd 0xa1)
> [ 256.542565] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 256.542565] ata1.01: revalidation failed (errno=-5)
> [ 256.542565] ata1.01: disabled
Aiee... device going down after timing out on READ_DISC_INFO. That's
gruesome. Can you please try the other patches?
Thanks.
--
tejun
On Tue, 2008-03-04 at 13:39 +0100, Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
> >
> > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
> >
> > > Looks excellent to me, has a variant of this been tested as OK by the
> > > users reporting the regression?
> >
> > K3b burning seems to be a nogo here. This is git pulled this morning
> > though, so it's a somewhat different tree than previously tested fwtw.
>
> can you please try git as of this morning without any patches applied,
> and then pull
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> into that and see if that works?
I'll give it a shot in a bit.
-Mike
On Tue, 2008-03-04 at 21:40 +0900, Tejun Heo wrote:
> Aiee... device going down after timing out on READ_DISC_INFO. That's
> gruesome. Can you please try the other patches?
I tried your last yesterday, and k3b worked fine.
-Mike
On Tue, 2008-03-04 at 13:43 +0100, Mike Galbraith wrote:
> > can you please try git as of this morning without any patches applied,
> > and then pull
> >
> > git://git.kernel.dk/linux-2.6-block.git for-linus
> >
> > into that and see if that works?
>
> I'll give it a shot in a bit.
Aw poo, so many choices.
I did:
git remote add block-for-linus git://git.kernel.dk/linux-2.6-block.git
git remote update
Now, which one do I check out? block-for-linus/master maybe, or
block-for-linus/for-linus?
homer:..git/linux-2.6 # git checkout block-for-linus
error: pathspec 'block-for-linus' did not match any file(s) known to git.
Did you forget to 'git add'?
homer:..git/linux-2.6 # git branch -a
* master
x86/master
x86/mm
block-for-linus/blktrace
block-for-linus/cmdfilter
block-for-linus/dynpipe
block-for-linus/fcache
block-for-linus/for-akpm
block-for-linus/for-linus
block-for-linus/io-cpu-affinity
block-for-linus/io-cpu-affinity-kthread
block-for-linus/loop-extent_map
block-for-linus/loop-fastfs
block-for-linus/master
block-for-linus/plug
block-for-linus/splice
block-for-linus/syslet
block-for-linus/syslet-share
block-for-linus/timeout
linux-next/master
linux-next/stable
origin/HEAD
origin/master
x86/base
x86/for-akpm
x86/for-linus
x86/latest
x86/master
x86/mm
x86/origin
x86/testing
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 13:43 +0100, Mike Galbraith wrote:
>
> > > can you please try git as of this morning without any patches applied,
> > > and then pull
> > >
> > > git://git.kernel.dk/linux-2.6-block.git for-linus
> > >
> > > into that and see if that works?
> >
> > I'll give it a shot in a bit.
>
> Aw poo, so many choices.
> I did:
> git remote add block-for-linus git://git.kernel.dk/linux-2.6-block.git
> git remote update
> Now, which one do I check out? block-for-linus/master maybe, or
> block-for-linus/for-linus?
Re-read my original mail! It states that you should just pull:
git://git.kernel.dk/linux-2.6-block.git for-linus
into your linus branch, or just create a test branch off linus' master
and pull into that. IOW, it's the for-linus branch that you should pull,
nothing else.
--
Jens Axboe
On Tue, 04 Mar 2008 21:40:53 +0900
Tejun Heo <[email protected]> wrote:
> Mike Galbraith wrote:
> > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
> >
> >> Looks excellent to me, has a variant of this been tested as OK by the
> >> users reporting the regression?
> >
> > K3b burning seems to be a nogo here. This is git pulled this morning
> > though, so it's a somewhat different tree than previously tested fwtw.
> >
> > [ 136.440021] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > [ 136.440043] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
> > [ 136.440045] cdb 51 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
> > [ 136.440047] res 58/00:02:00:02:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation)
> > [ 136.440053] ata1.01: status: { DRDY DRQ }
> > [ 136.440086] ata1: soft resetting link
> > [ 165.327627] ata1.01: qc timeout (cmd 0xa1)
> > [ 165.327627] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
> > [ 165.327627] ata1.01: revalidation failed (errno=-5)
> > [ 165.327627] ata1: failed to recover some devices, retrying in 5 secs
> > [ 177.272373] ata1: port is slow to respond, please be patient (Status 0x80)
> > [ 180.388879] ata1: device not ready (errno=-16), forcing hardreset
> > [ 180.388879] ata1: soft resetting link
> > [ 210.832471] ata1.01: qc timeout (cmd 0xa1)
> > [ 210.832471] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
> > [ 210.832471] ata1.01: revalidation failed (errno=-5)
> > [ 210.832471] ata1: failed to recover some devices, retrying in 5 secs
> > [ 223.392899] ata1: port is slow to respond, please be patient (Status 0x80)
> > [ 225.920376] ata1: device not ready (errno=-16), forcing hardreset
> > [ 225.920376] ata1: soft resetting link
> > [ 256.542565] ata1.01: qc timeout (cmd 0xa1)
> > [ 256.542565] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
> > [ 256.542565] ata1.01: revalidation failed (errno=-5)
> > [ 256.542565] ata1.01: disabled
>
> Aiee... device going down after timing out on READ_DISC_INFO. That's
> gruesome. Can you please try the other patches?
Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
Now Jens' git tree should work with all the non libata stuff, ide,
firewire, bsg, etc. But I'm not sure about libata.
FUJITA Tomonori wrote:
>> Aiee... device going down after timing out on READ_DISC_INFO. That's
>> gruesome. Can you please try the other patches?
>
> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
The extra_len you added to qc->nbytes should be it. The only other
place to pay attention is the ATAPI transfer chunk size and your patch
seems to get it right.
> Now Jens' git tree should work with all the non libata stuff, ide,
> firewire, bsg, etc. But I'm not sure about libata.
With the second patch, all others should be fine no matter what. I'll
go check libata part again.
Thanks.
--
tejun
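For readers following the length bookkeeping, here is a minimal sketch of the two libata sizes discussed above. It is not the kernel code: struct toy_request and both toy_* helpers are hypothetical names, and rounding an odd chunk size up to an even one is an assumption based on the truncated comment in the atapi_xlat() hunk.

```c
#include <assert.h>

/* Hypothetical stand-in for the two length fields of struct request. */
struct toy_request {
	unsigned int data_len;   /* size the issuer asked for */
	unsigned int extra_len;  /* length of alignment and padding */
};

/*
 * Bytes the transfer has to cover, mirroring
 * qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len,
 * where scsi_bufflen() equals data_len for PC commands.
 */
static unsigned int toy_qc_nbytes(const struct toy_request *rq)
{
	return rq->data_len + rq->extra_len;
}

/*
 * Transfer chunk size handed to the device: derived from the requested
 * length only (no padding) and capped at 63 KiB, as in the hunk.
 */
static unsigned int toy_atapi_chunk(const struct toy_request *rq)
{
	const unsigned int cap = 63u * 1024u;
	unsigned int nbytes = rq->data_len < cap ? rq->data_len : cap;

	/* Assumption: an odd size is rounded up to the next even value. */
	if (nbytes & 1)
		nbytes++;
	assert(nbytes <= cap);
	return nbytes;
}
```

With data_len = 34 and extra_len = 2, the sketch gives 36 bytes for the transfer but a chunk size of 34, which is the distinction between the two places the patch touches.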
On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote:
> Re-read my original mail! It states that you should just pull:
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> into your linus branch, or just create a test branch off linus' master
> and pull into that. IOW, it's the for-linus branch that you should pull,
> nothing else.
Well, I had a good reason. You know how to un-pull, I know how to
un-remote to get back to pristine after I'm done testing... guaranteed
without whimpering pathetically on the git list ;-)
Anyway, I checked out the one with the big-fat-hint in its name
(block-for-linus/for-linus).
Same error. Git this morning with patches...
restore_meaning_of_data_len.diff
seperate_out_padding_from_alignment.diff
...reverted restored me to the originally reported k3b error, nothing
new noted.
If I tested the wrong branch, whack me upside the head, and I'll follow
your pull destructions, and figure out how to un-pull later.
-Mike
On Tue, 2008-03-04 at 13:39 +0100, Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
> >
> > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
> >
> > > Looks excellent to me, has a variant of this been tested as OK by the
> > > users reporting the regression?
> >
> > K3b burning seems to be a nogo here. This is git pulled this morning
> > though, so it's a somewhat different tree than previously tested fwtw.
>
> can you please try git as of this morning without any patches applied,
> and then pull
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> into that and see if that works?
Works for me with the SAS SMP handler. Both input request and output
response frame sizes are picked up and returned with the correct
residues.
James
Tejun Heo wrote:
> FUJITA Tomonori wrote:
>>> Aiee... device going down after timing out on READ_DISC_INFO. That's
>>> gruesome. Can you please try the other patches?
>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
>
> The extra_len you added to qc->nbytes should be it. The only other
> place to pay attention is the ATAPI transfer chunk size and your patch
> seems to get it right.
>
>> Now Jens' git tree should work with all the non libata stuff, ide,
>> firewire, bsg, etc. But I'm not sure about libata.
>
> With the second patch, all others should be fine no matter what. I'll
> go check libata part again.
I can reproduce the problem here and it's very weird. I'll report back
when I know more.
--
tejun
Tejun Heo wrote:
> Tejun Heo wrote:
>> FUJITA Tomonori wrote:
>>>> Aiee... device going down after timing out on READ_DISC_INFO. That's
>>>> gruesome. Can you please try the other patches?
>>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
>> The extra_len you added to qc->nbytes should be it. The only other
>> place to pay attention is the ATAPI transfer chunk size and your patch
>> seems to get it right.
>>
>>> Now Jens' git tree should work with all the non libata stuff, ide,
>>> firewire, bsg, etc. But I'm not sure about libata.
>> With the second patch, all others should be fine no matter what. I'll
>> go check libata part again.
>
> I can reproduce the problem here and it's very weird. I'll report back
> when I know more.
Okay, I got it. Heh, it turns out the SCSI and/or block layer is not
ready for rq->data_len != sum(sg). When the adjusted command completes,
the SCSI midlayer completes it with rq->data_len for PC commands,
which eventually ends up in __end_that_request_first(). As there is
extra sg area left after completing rq->data_len, the blk layer says so
to the SCSI layer, and the SCSI layer retries the command with only the
appended area.
The following patch gets the writing going. I really think it's a
serious mistake to break rq->data_len == sum(sg). If we break
rq->data_len == requested size, the worst bugs are giving the wrong size
when issuing commands to the application layer of devices, which is
relatively easy to spot and not all that common anyway. If we break
rq->data_len == sum(sg), the bugs will be in internal mechanics, DMA
engine programming and the transport layer. Oh well...
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fecba05..32439ac 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
"Notifying upper driver of completion "
"(result %x)\n", cmd->result));
- good_bytes = scsi_bufflen(cmd);
+ good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
drv = scsi_cmd_to_driver(cmd);
if (drv->done)
--
tejun
walt wrote:
> Jens Axboe wrote:
>> On Tue, Mar 04 2008, Mike Galbraith wrote:
>>> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
>>>
>>>> Looks excellent to me, has a variant of this been tested as OK by the
>>>> users reporting the regression?
>>> K3b burning seems to be a nogo here. This is git pulled this morning
>>> though, so it's a somewhat different tree than previously tested fwtw.
>>
>> can you please try git as of this morning without any patches applied,
>> and then pull
>>
>> git://git.kernel.dk/linux-2.6-block.git for-linus
>>
>> into that and see if that works?
>
> Unfortunately this doesn't fix a problem I've discussed off-list with
> Kiyoshi Ueda, who suggested that I should follow this thread and try
> any patches posted here.
>
> Here is what happens when I try to mount a CD (before and after I
> pull 'for-linus'):
>
> hdc: ide_cd_check_ireason: wrong transfer direction!
> cdrom: failed setting lba address space
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> hdc: status timeout: status=0xd0 { Busy }
> ide: failed opcode was: unknown
> hdc: DMA disabled
> hdc: drive not ready for command
> hdc: ATAPI reset complete
> ISO 9660 Extensions: Microsoft Joliet Level 3
> ISOFS: changing to secondary root
> VFS: busy inodes on changed media.
>
> The mount can take from 5 seconds on up to a minute or so before the
> CD can be accessed.
Which version did you try? There was a recent IDE bug fix which
affected CD recording: commit bcd88ac3b2ff2eae3d0fa57a6b02d4fce5392f32,
which is included in 2.6.25-rc3. Also, did 2.6.24 work?
--
tejun
Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
>> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
>>
>>> Looks excellent to me, has a variant of this been tested as OK by the
>>> users reporting the regression?
>> K3b burning seems to be a nogo here. This is git pulled this morning
>> though, so it's a somewhat different tree than previously tested fwtw.
>
> can you please try git as of this morning without any patches applied,
> and then pull
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> into that and see if that works?
Unfortunately this doesn't fix a problem I've discussed off-list with
Kiyoshi Ueda, who suggested that I should follow this thread and try
any patches posted here.
Here is what happens when I try to mount a CD (before and after I
pull 'for-linus'):
hdc: ide_cd_check_ireason: wrong transfer direction!
cdrom: failed setting lba address space
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status timeout: status=0xd0 { Busy }
ide: failed opcode was: unknown
hdc: DMA disabled
hdc: drive not ready for command
hdc: ATAPI reset complete
ISO 9660 Extensions: Microsoft Joliet Level 3
ISOFS: changing to secondary root
VFS: busy inodes on changed media.
The mount can take from 5 seconds on up to a minute or so before the
CD can be accessed.
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote:
>
> > Re-read my original mail! It states that you should just pull:
> >
> > git://git.kernel.dk/linux-2.6-block.git for-linus
> >
> > into your linus branch, or just create a test branch off linus' master
> > and pull into that. IOW, it's the for-linus branch that you should pull,
> > nothing else.
>
> Well, I had a good reason. You know how to un-pull, I know how to
> un-remote to get back to pristine after I'm done testing... guaranteed
> without whimpering pathetically on the git list ;-)
OK, if you're on master, it's pretty easy:
$ git branch test-branch
$ git checkout test-branch
$ git pull git://git.kernel.dk/linux-2.6-block.git for-linus
[build, boot, test]
$ git checkout master
$ git branch -D test-branch
> Anyway, I checked out the one with the big-fat-hint in its name
> (block-for-linus/for-linus).
> Same error. Git this morning with patches...
> restore_meaning_of_data_len.diff
> seperate_out_padding_from_alignment.diff
> ...reverted restored me to the originally reported k3b error, nothing
> new noted.
>
> If I tested the wrong branch, whack me upside the head, and I'll follow
> your pull destructions, and figure out how to un-pull later.
for-linus is the right branch, but I'm just a little worried that you
didn't test what you think you tested. What does cat .git/HEAD say? If
that is a ref to a file (eg refs/heads/master), what does that file
contain?
--
Jens Axboe
On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote:
> Tejun Heo wrote:
> > Tejun Heo wrote:
> >> FUJITA Tomonori wrote:
> >>>> Aiee... device going down after timing out on READ_DISC_INFO. That's
> >>>> gruesome. Can you please try the other patches?
> >>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
> >> The extra_len you added to qc->nbytes should be it. The only other
> >> place to pay attention is the ATAPI transfer chunk size and your patch
> >> seems to get it right.
> >>
> >>> Now Jens' git tree should work with all the non libata stuff, ide,
> >>> firewire, bsg, etc. But I'm not sure about libata.
> >> With the second patch, all others should be fine no matter what. I'll
> >> go check libata part again.
> >
> > I can reproduce the problem here and it's very weird. I'll report back
> > when I know more.
>
> Okay, I got it. Heh, it turns out SCSI and/or block layer is not
> ready for rq->data_len != sum(sg). When adjusted command completes,
> SCSI midlayer completes the command with rq->data_len for PC commands
> which eventually ends up in __end_that_request_first(). As there are
> extra sg area left after completing rq->data_len, blk layer says so to
> SCSI layer and SCSI layer retries the command only with the appended
> area.
>
> The following patch gets the writing going. I really think it's a
> serious mistake to break rq->data_len == sum(sg). If we break
> rq->data_len == requested size, the worst bugs are giving wrong size
> when issuing commands to application layer of devices which is
> relatively easy to spot and not all that common anyway. Breaking
> rq->data_len == sum(sg), bugs will be in internal mechanics, DMA
> engine programming and transport layer. Oh well...
>
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index fecba05..32439ac 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
> "Notifying upper driver of completion "
> "(result %x)\n", cmd->result));
>
> - good_bytes = scsi_bufflen(cmd);
> + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
This doesn't look right. scsi_bufflen(cmd) is req->data_len for PC
commands ... did you mean to add extra_len here?
James
On Tue, Mar 04 2008, Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
> >
> > On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote:
> >
> > > Re-read my original mail! It states that you should just pull:
> > >
> > > git://git.kernel.dk/linux-2.6-block.git for-linus
> > >
> > > into your linus branch, or just create a test branch off linus' master
> > > and pull into that. IOW, it's the for-linus branch that you should pull,
> > > nothing else.
> >
> > Well, I had a good reason. You know how to un-pull, I know how to
> > un-remote to get back to pristine after I'm done testing... guaranteed
> > without whimpering pathetically on the git list ;-)
>
> OK, if you're on master, it's pretty easy:
>
> $ git branch test-branch
> $ git checkout test-branch
> $ git pull git://git.kernel.dk/linux-2.6-block.git for-linus
>
> [build, boot, test]
> $ git checkout master
> $ git branch -D test-branch
>
> > Anyway, I checked out the one with the big-fat-hint in its name
> > (block-for-linus/for-linus).
> > Same error. Git this morning with patches...
> > restore_meaning_of_data_len.diff
> > seperate_out_padding_from_alignment.diff
> > ...reverted restored me to the originally reported k3b error, nothing
> > new noted.
> >
> > If I tested the wrong branch, whack me upside the head, and I'll follow
> > your pull destructions, and figure out how to un-pull later.
>
> for-linus is the right branch, but I'm just a little worried that you
> didn't test what you think you tested. What does cat .git/HEAD say? If
> that is a ref to a file (eg refs/heads/master), what does that file
> contain?
Or just re-pull Linus' tree, the stuff is in now.
--
Jens Axboe
James Bottomley wrote:
> On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote:
>> Tejun Heo wrote:
>>> Tejun Heo wrote:
>>>> FUJITA Tomonori wrote:
>>>>>> Aiee... device going down after timing out on READ_DISC_INFO. That's
>>>>>> gruesome. Can you please try the other patches?
>>>>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
>>>> The extra_len you added to qc->nbytes should be it. The only other
>>>> place to pay attention is the ATAPI transfer chunk size and your patch
>>>> seems to get it right.
>>>>
>>>>> Now Jens' git tree should work with all the non libata stuff, ide,
>>>>> firewire, bsg, etc. But I'm not sure about libata.
>>>> With the second patch, all others should be fine no matter what. I'll
>>>> go check libata part again.
>>> I can reproduce the problem here and it's very weird. I'll report back
>>> when I know more.
>> Okay, I got it. Heh, it turns out SCSI and/or block layer is not
>> ready for rq->data_len != sum(sg). When adjusted command completes,
>> SCSI midlayer completes the command with rq->data_len for PC commands
>> which eventually ends up in __end_that_request_first(). As there are
>> extra sg area left after completing rq->data_len, blk layer says so to
>> SCSI layer and SCSI layer retries the command only with the appended
>> area.
>>
>> The following patch gets the writing going. I really think it's a
>> serious mistake to break rq->data_len == sum(sg). If we break
>> rq->data_len == requested size, the worst bugs are giving wrong size
>> when issuing commands to application layer of devices which is
>> relatively easy to spot and not all that common anyway. Breaking
>> rq->data_len == sum(sg), bugs will be in internal mechanics, DMA
>> engine programming and transport layer. Oh well...
>>
>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
>> index fecba05..32439ac 100644
>> --- a/drivers/scsi/scsi.c
>> +++ b/drivers/scsi/scsi.c
>> @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
>> "Notifying upper driver of completion "
>> "(result %x)\n", cmd->result));
>>
>> - good_bytes = scsi_bufflen(cmd);
>> + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
>
> This doesn't look right. scsi_bufflen(cmd) is req->data_len for PC
> commands ... did you mean to add extra_len here?
Yeap, sorry about the confusion. Adding two times data_len accidentally
worked tho. :-)
--
tejun
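A rough model of why the completion size matters, and why doubling data_len happened to work. This is only a sketch under assumptions: struct toy_request and the toy_* helpers are hypothetical, and clamping an over-sized completion to the sg total is a simplification of what __end_that_request_first() does.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the two length fields of struct request. */
struct toy_request {
	unsigned int data_len;   /* size the issuer asked for */
	unsigned int extra_len;  /* length of alignment and padding */
};

/* Total bytes mapped in the scatter/gather list, i.e. sum(sg). */
static unsigned int toy_sg_total(const struct toy_request *rq)
{
	assert(rq->extra_len <= UINT_MAX - rq->data_len);
	return rq->data_len + rq->extra_len;
}

/*
 * Bytes still pending after completing good_bytes. Assumption: a
 * completion larger than the sg total simply finishes the request.
 */
static unsigned int toy_bytes_left(const struct toy_request *rq,
				   unsigned int good_bytes)
{
	unsigned int total = toy_sg_total(rq);

	return good_bytes >= total ? 0 : total - good_bytes;
}

/* Non-zero when leftover bytes would cause the command to be reissued. */
static int toy_needs_retry(const struct toy_request *rq,
			   unsigned int good_bytes)
{
	return toy_bytes_left(rq, good_bytes) != 0;
}
```

Taking data_len = 2 and extra_len = 2 as illustrative numbers: completing with data_len alone leaves 2 bytes and triggers a retry of the appended area; data_len + extra_len leaves nothing; data_len + data_len also leaves nothing here, but in this model it would fall short whenever extra_len is larger than data_len.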
On Tue, Mar 04 2008 at 18:42 +0200, Tejun Heo <[email protected]> wrote:
> Tejun Heo wrote:
>> Tejun Heo wrote:
>>> FUJITA Tomonori wrote:
>>>>> Aiee... device going down after timing out on READ_DISC_INFO. That's
>>>>> gruesome. Can you please try the other patches?
>>>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
>>> The extra_len you added to qc->nbytes should be it. The only other
>>> place to pay attention is the ATAPI transfer chunk size and your patch
>>> seems to get it right.
>>>
>>>> Now Jens' git tree should work with all the non libata stuff, ide,
>>>> firewire, bsg, etc. But I'm not sure about libata.
>>> With the second patch, all others should be fine no matter what. I'll
>>> go check libata part again.
>> I can reproduce the problem here and it's very weird. I'll report back
>> when I know more.
>
> Okay, I got it. Heh, it turns out SCSI and/or block layer is not
> ready for rq->data_len != sum(sg). When adjusted command completes,
> SCSI midlayer completes the command with rq->data_len for PC commands
> which eventually ends up in __end_that_request_first(). As there are
> extra sg area left after completing rq->data_len, blk layer says so to
> SCSI layer and SCSI layer retries the command only with the appended
> area.
>
> The following patch gets the writing going. I really think it's a
> serious mistake to break rq->data_len == sum(sg). If we break
> rq->data_len == requested size, the worst bugs are giving wrong size
> when issuing commands to application layer of devices which is
> relatively easy to spot and not all that common anyway. Breaking
> rq->data_len == sum(sg), bugs will be in internal mechanics, DMA
> engine programming and transport layer. Oh well...
>
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index fecba05..32439ac 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
> "Notifying upper driver of completion "
> "(result %x)\n", cmd->result));
>
> - good_bytes = scsi_bufflen(cmd);
> + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
Are you sure? is it not:
+ good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
> if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
> drv = scsi_cmd_to_driver(cmd);
> if (drv->done)
>
>
I hate this patch. I wish you could maybe take the extra_len into
account inside blk_end_request. The padding should be transparent
to all concerned but the requesting LLD and the internals of the
block layer. If block layer added padding it should take that into
account on completion. My $0.02.
Boaz
On Tue, 2008-03-04 at 19:17 +0100, Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
> >
> > On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote:
> >
> > > Re-read my original mail! It states that you should just pull:
> > >
> > > git://git.kernel.dk/linux-2.6-block.git for-linus
> > >
> > > into your linus branch, or just create a test branch off linus' master
> > > and pull into that. IOW, it's the for-linus branch that you should pull,
> > > nothing else.
> >
> > Well, I had a good reason. You know how to un-pull, I know how to
> > un-remote to get back to pristine after I'm done testing... guaranteed
> > without whimpering pathetically on the git list ;-)
>
> OK, if you're on master, it's pretty easy:
>
> $ git branch test-branch
> $ git checkout test-branch
> $ git pull git://git.kernel.dk/linux-2.6-block.git for-linus
>
> [build, boot, test]
> $ git checkout master
> $ git branch -D test-branch
Hm, that's simple enough. I'll do this for the edification. Thanks.
Maybe some day, I'll cease to be so paranoid that my test setup may
become compromised. (at which time...)
> > Anyway, I checked out the one with the big-fat-hint in its name
> > (block-for-linus/for-linus).
> > Same error. Git this morning with patches...
> > restore_meaning_of_data_len.diff
> > seperate_out_padding_from_alignment.diff
> > ...reverted restored me to the originally reported k3b error, nothing
> > new noted.
> >
> > If I tested the wrong branch, whack me upside the head, and I'll follow
> > your pull destructions, and figure out how to un-pull later.
>
> for-linus is the right branch, but I'm just a little worried that you
> didn't test what you think you tested. What does cat .git/HEAD say? If
> that is a ref to a file (eg refs/heads/master), what does that file
> contain?
That wouldn't surprise me one bit. (ergo...)
It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll?
-Mike
Boaz Harrosh wrote:
>> - good_bytes = scsi_bufflen(cmd);
>> + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
>
> Are you sure? is it not:
> + good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len
You're right. Sorry about the confusion.
>> if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
>> drv = scsi_cmd_to_driver(cmd);
>> if (drv->done)
>>
>>
>
> I hate this patch. I wish you could maybe take the extra_len into
> account inside blk_end_request. The padding should be transparent
> to all concerned but the requesting LLD and the internals of the
> block layer. If block layer added padding it should take that into
> account on completion. My $0.2.
Yeah, I hate it too. As I've been saying all along, I think it just
should be rq->data_len.
--
tejun
On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote:
> The following patch gets the writing going.
Bingo.
-Mike
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 19:17 +0100, Jens Axboe wrote:
> > On Tue, Mar 04 2008, Mike Galbraith wrote:
> > >
> > > On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote:
> > >
> > > > Re-read my original mail! It states that you should just pull:
> > > >
> > > > git://git.kernel.dk/linux-2.6-block.git for-linus
> > > >
> > > > into your linus branch, or just create a test branch off linus' master
> > > > and pull into that. IOW, it's the for-linus branch that you should pull,
> > > > nothing else.
> > >
> > > Well, I had a good reason. You know how to un-pull, I know how to
> > > un-remote to get back to pristine after I'm done testing... guaranteed
> > > without whimpering pathetically on the git list ;-)
> >
> > OK, if you're on master, it's pretty easy:
> >
> > $ git branch test-branch
> > $ git checkout test-branch
> > $ git pull git://git.kernel.dk/linux-2.6-block.git for-linus
> >
> > [build, boot, test]
> > $ git checkout master
> > $ git branch -D test-branch
>
> Hm, that's simple enough. I'll do this for the edification. Thanks.
> Maybe some day, I'll cease to be so paranoid that my test setup may
> become compromised. (at which time...)
>
> > > Anyway, I checked out the one with the big-fat-hint in it's name
> > > (block-for-linus/for-linus).
> > > Same error. Git this morning with patches...
> > > restore_meaning_of_data_len.diff
> > > seperate_out_padding_from_alignment.diff
> > > ...reverted restored me to the originally reported k3b error, nothing
> > > new noted.
> > >
> > > If I tested the wrong branch, whack me upside the head, and I'll follow
> > > your pull destructions, and figure out how to un-pull later.
> >
> > for-linus is the right branch, but I'm just a little worried that you
> > didn't test what you think you tested. What does cat .git/HEAD say? If
> > that is a ref to a file (eg refs/heads/master), what does that file
> > contain?
>
> That wouldn't surprise me one bit. (ergo...)
>
> It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll?
That looks right, then perhaps there's still an issue there :/
Logs?
--
Jens Axboe
On Tue, Mar 04 2008, James Bottomley wrote:
> On Tue, 2008-03-04 at 13:39 +0100, Jens Axboe wrote:
> > On Tue, Mar 04 2008, Mike Galbraith wrote:
> > >
> > > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
> > >
> > > > Looks excellent to me, has a variant of this been tested as OK by the
> > > > users reporting the regression?
> > >
> > > K3b burning seems to be a nogo here. This is git pulled this morning
> > > though, so it's a somewhat different tree than previously tested fwtw.
> >
> > can you please try git as of this morning without any patches applied,
> > and then pull
> >
> > git://git.kernel.dk/linux-2.6-block.git for-linus
> >
> > into that and see if that works?
>
> Works for me with the SAS SMP handler. Both input request and output
> response frame sizes are picked up and returned with the correct
> residues.
Goodie, now we just need to figure out why it doesn't work for Mike
yet...
--
Jens Axboe
On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote:
> > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll?
>
> That looks right, then perhaps there's still an issue there :/
> Logs?
Tejun's patchlet (below) fixed it here.
Date: Wed, 05 Mar 2008 01:42:45 +0900
From: Tejun Heo <[email protected]>
To: FUJITA Tomonori <[email protected]>
CC: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: [PATCH] block: fix residual byte count handling
Tejun Heo wrote:
> Tejun Heo wrote:
>> FUJITA Tomonori wrote:
>>>> Aiee... device going down after timing out on READ_DISC_INFO. That's
>>>> gruesome. Can you please try the other patches?
>>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
>> The extra_len you added to qc->nbytes should be it. The only other
>> place to pay attention is the ATAPI transfer chunk size and your patch
>> seems to get it right.
>>
>>> Now Jens' git tree should work with all the non libata stuff, ide,
>>> firewire, bsg, etc. But I'm not sure about libata.
>> With the second patch, all others should be fine no matter what. I'll
>> go check libata part again.
>
> I can reproduce the problem here and it's very weird. I'll report back
> when I know more.
Okay, I got it. Heh, it turns out the SCSI and/or block layer is not
ready for rq->data_len != sum(sg). When an adjusted command completes,
the SCSI midlayer completes it with rq->data_len for PC commands,
which eventually ends up in __end_that_request_first(). As there is
extra sg area left after completing rq->data_len, the blk layer says so to
the SCSI layer, and the SCSI layer retries the command with only the appended
area.
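To make the retry mechanism concrete, here is a minimal sketch in plain C. It is not kernel code: toy_request, toy_sg_total() and toy_bytes_left() are hypothetical names, and the model simply assumes the sg list covers data_len plus extra_len bytes, as described above.

```c
#include <assert.h>

/* Hypothetical, simplified model of a request; not the kernel's struct request. */
struct toy_request {
	unsigned int data_len;   /* bytes the caller asked for */
	unsigned int extra_len;  /* padding/drain bytes appended to the sg list */
};

/* Total bytes mapped in the sg list once padding has been appended. */
static unsigned int toy_sg_total(const struct toy_request *rq)
{
	return rq->data_len + rq->extra_len;
}

/*
 * Model of the completion step: returns how many mapped bytes are still
 * outstanding after the midlayer reports good_bytes. A non-zero result
 * means the request looks unfinished, so the leftover part gets reissued.
 */
static unsigned int toy_bytes_left(const struct toy_request *rq,
				   unsigned int good_bytes)
{
	unsigned int total = toy_sg_total(rq);

	return good_bytes >= total ? 0 : total - good_bytes;
}
```

With data_len = 34 and extra_len = 2, completing only data_len leaves 2 bytes outstanding, which corresponds to the retry of the appended area; completing data_len + extra_len leaves nothing.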
The following patch gets the writing going. I really think it's a
serious mistake to break rq->data_len == sum(sg). If we break
rq->data_len == requested size, the worst bugs are giving the wrong size
when issuing commands to the application layer of devices, which is
relatively easy to spot and not all that common anyway. If we break
rq->data_len == sum(sg), the bugs will be in internal mechanics, DMA
engine programming and the transport layer. Oh well...
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fecba05..32439ac 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
"Notifying upper driver of completion "
"(result %x)\n", cmd->result));
- good_bytes = scsi_bufflen(cmd);
+ good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
drv = scsi_cmd_to_driver(cmd);
if (drv->done)
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote:
>
> > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll?
> >
> > That looks right, then perhaps there's still an issue there :/
> > Logs?
>
> Tejuns patchlet (below) fixed it here.
OK, can you try changing that to
good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
and retest?
--
Jens Axboe
On Wed, 05 Mar 2008 01:42:45 +0900
Tejun Heo <[email protected]> wrote:
> Tejun Heo wrote:
> > Tejun Heo wrote:
> >> FUJITA Tomonori wrote:
> >>>> Aiee... device going down after timing out on READ_DISC_INFO. That's
> >>>> gruesome. Can you please try the other patches?
> >>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No?
> >> The extra_len you added to qc->nbytes should be it. The only other
> >> place to pay attention is the ATAPI transfer chunk size and your patch
> >> seems to get it right.
> >>
> >>> Now Jens' git tree should work with all the non libata stuff, ide,
> >>> firewire, bsg, etc. But I'm not sure about libata.
> >> With the second patch, all others should be fine no matter what. I'll
> >> go check libata part again.
> >
> > I can reproduce the problem here and it's very weird. I'll report back
> > when I know more.
>
> Okay, I got it. Heh, it turns out SCSI and/or block layer is not
> ready for rq->data_len != sum(sg). When adjusted command completes,
> SCSI midlayer completes the command with rq->data_len for PC commands
> which eventually ends up in __end_that_request_first(). As there are
> extra sg area left after completing rq->data_len, blk layer says so to
> SCSI layer and SCSI layer retries the command only with the appended
> area.
Ah, thanks!
> The following patch gets the writing going. I really think it's a
> serious mistake to break rq->data_len == sum(sg). If we break
> rq->data_len == requested size, the worst bugs are giving wrong size
> when issuing commands to application layer of devices which is
> relatively easy to spot and not all that command anyway. Breaking
> rq->data_len == sum(sg), bugs will be in internal mechanics, DMA
> engine programming and transport layer. Oh well...
>
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index fecba05..32439ac 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
> "Notifying upper driver of completion "
> "(result %x)\n", cmd->result));
>
> - good_bytes = scsi_bufflen(cmd);
> + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
> if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
> drv = scsi_cmd_to_driver(cmd);
> if (drv->done)
>
>
Hmm, does the SCSI mid-layer need to care about how many bytes the block
layer allocates? I don't think that extra_len is part of good_bytes.
I think that the block layer had better take care of it (fix
__end_that_request_first?).
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote:
>
> > The following patch gets the writing going.
>
> Bingo.
Pretty please test this on top of current -git?
I'll merge this up, it should do the trick. Would just be nice if you
could verify! :-)
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fecba05..e5c6f6a 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
"Notifying upper driver of completion "
"(result %x)\n", cmd->result));
- good_bytes = scsi_bufflen(cmd);
+ good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
drv = scsi_cmd_to_driver(cmd);
if (drv->done)
--
Jens Axboe
On Tue, 2008-03-04 at 19:54 +0100, Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
> >
> > On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote:
> >
> > > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll?
> > >
> > > That looks right, then perhaps there's still an issue there :/
> > > Logs?
> >
> > Tejuns patchlet (below) fixed it here.
>
> OK, can you try changing that to
>
> good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
>
> and retest?
Yup, disk #42 is happily burning away.
-Mike
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 19:54 +0100, Jens Axboe wrote:
> > On Tue, Mar 04 2008, Mike Galbraith wrote:
> > >
> > > On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote:
> > >
> > > > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll?
> > > >
> > > > That looks right, then perhaps there's still an issue there :/
> > > > Logs?
> > >
> > > Tejuns patchlet (below) fixed it here.
> >
> > OK, can you try changing that to
> >
> > good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
> >
> > and retest?
>
> Yup, disk #42 is happily burning away.
Super, patch heading to Linus now. Thanks for all your testing, Mike!
--
Jens Axboe
On Tue, 2008-03-04 at 20:25 +0100, Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
> >
> > On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote:
> >
> > > The following patch gets the writing going.
> >
> > Bingo.
>
> Pretty please test this on top of current -git?
? That's the patch I just tested, and the tree. Oh.. 976dde0..87baa2b
just a sec....
-Mike
On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 20:25 +0100, Jens Axboe wrote:
> > On Tue, Mar 04 2008, Mike Galbraith wrote:
> > >
> > > On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote:
> > >
> > > > The following patch gets the writing going.
> > >
> > > Bingo.
> >
> > Pretty please test this on top of current -git?
>
> ? That's the patch I just tested, and the tree. Oh.. 976dde0..87baa2b
> just a sec....
Yeah it is, mid-air collision of emails. So just disregard this one!
--
Jens Axboe
Hi,
On Tue, 04 Mar 2008 09:34:56 -0800, walt wrote:
> Jens Axboe wrote:
> > On Tue, Mar 04 2008, Mike Galbraith wrote:
> >> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
> >>
> >>> Looks excellent to me, has a variant of this been tested as OK by the
> >>> users reporting the regression?
> >> K3b burning seems to be a nogo here. This is git pulled this morning
> >> though, so it's a somewhat different tree than previously tested fwtw.
> >
> > can you please try git as of this morning without any patches applied,
> > and then pull
> >
> > git://git.kernel.dk/linux-2.6-block.git for-linus
> >
> > into that and see if that works?
>
> Unfortunately this doesn't fix a problem I've discussed off-list with
> Kiyoshi Ueda, who suggested that I should follow this thread and try
> any patches posted here.
I think there was a misunderstanding between us.
Off-list, I meant:
o Your original problem was CD burning, and it looked like the same problem
being discussed in this thread, according to this message:
> cdrecord: Warning: controller returns zero sized CD capabilities page.
> cdrecord: Warning: controller returns wrong page 0 for CD capabilities page (2A).
So I suggested that you watch this thread and try the patches from this
thread for the CD burning problem.
o The ide_cd_check_ireason problem looked different from the
CD burning one.
So I suggested that you report it as a different problem.
Thanks,
Kiyoshi Ueda
FUJITA Tomonori wrote:
> Hmm, does SCSI mid-layer need to care about how many bytes the block
> layer allocates? I don't think that extra_len is NOT good_bytes.
>
> I think that the block layer had better take care about it (fix
> __end_that_request_first?).
Yeah, probably calling completion functions w/o a byte count is the right
thing to do, but what I was talking about was what could break when the
semantics of rq->data_len change. If we keep rq->data_len ==
sum(sg), it's business as usual for everything except the
device application layer; if we don't, it's the reverse, and SCSI midlayer
completion was a good example, I think.
Things going the other way is fine with me, but I at least want to hear a
valid rationale. Till now all I got is "because that's the true size",
which doesn't really make much sense to me.
Thanks.
--
tejun
Tejun Heo wrote:
> FUJITA Tomonori wrote:
>> Hmm, does SCSI mid-layer need to care about how many bytes the block
>> layer allocates? I don't think that extra_len is NOT good_bytes.
>>
>> I think that the block layer had better take care about it (fix
>> __end_that_request_first?).
>
> Yeah, probably calling completion functions w/o bytes count is the right
> thing to do but what I was talking about was what could break when the
> semantics of rq->data_len changed. If we keep rq->data_len() ==
> sum(sg), we keep it business as usual for all the rest except for the
> device application layer if we don't we do the reverse and SCSI midlayer
> completion was a good example, I think.
>
> Things going the other way is fine with me but I at least want to hear a
> valid rationale. Till now all I got is "because that's the true size"
> which doesn't really make much sense to me.
I'm giving it another shot. When the padding / draining thing was in
libata (or IDE, for that matter), the whole thing looked like this.
user - blk - SCSI - libata - LLD - controller - device
<---------------------><----------------------><----->
           a                      b               c
a: Uses the 'true' request size and matching sg
b: Requires adjusted request size and matching sg
c: Doesn't really care about sg, but sometimes needs the true size. For
anything which gets attached behind ATA and which may require
padding, the transfer size is also sent in the CDB, which not
all devices honor, and that's one of the reasons why size adjustment
is necessary.
If we move the adjustment to the block layer and keep data_len == sum(sg),
it looks like this.
user - blk - SCSI - libata - LLD - controller - device
<------><-------------------------------------><----->
   a                       b                      c
And a, b and c stay the same. If we keep the requested size in
data_len, the whole of b gets inconsistent values in the middle while c gets
the value it wants in data_len, so we're risking much more to keep the
true size in rq->data_len when we could simply make it mean sum(sg).
Before, the only thing which needed updating was to correctly determine
the transfer size to feed to the device. Now we need to audit the whole of b.
In addition, such adjustments are made only when the driver explicitly
requested it, so for all others it doesn't really matter.
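The trade-off between the two conventions can be sketched in plain C. This is only an illustration under stated assumptions: toy_request and the conv1_/conv2_ helpers are hypothetical names, not kernel interfaces, and the fields merely mirror the names used in this thread.

```c
#include <assert.h>

/* Hypothetical model; field names mirror the thread, not the real struct request. */
struct toy_request {
	unsigned int data_len;
	unsigned int extra_len;
};

/*
 * Convention 1: data_len == sum(sg).
 * Everything in region b can use data_len directly; only the code that
 * tells the device the true transfer size has to subtract the padding.
 */
static unsigned int conv1_sg_bytes(const struct toy_request *rq)
{
	return rq->data_len;
}

static unsigned int conv1_true_bytes(const struct toy_request *rq)
{
	return rq->data_len - rq->extra_len;
}

/*
 * Convention 2: data_len == requested (true) size.
 * The device-facing code uses data_len directly, but every place in
 * region b that walks or programs the sg list has to add extra_len.
 */
static unsigned int conv2_sg_bytes(const struct toy_request *rq)
{
	return rq->data_len + rq->extra_len;
}

static unsigned int conv2_true_bytes(const struct toy_request *rq)
{
	return rq->data_len;
}
```

For a 34-byte request padded by 2 bytes, both conventions describe the same 36 mapped bytes and the same 34 true bytes; they differ only in which callers have to remember the adjustment.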
Thanks.
--
tejun
On Wed, 05 Mar 2008 08:33:05 +0900
Tejun Heo <[email protected]> wrote:
> FUJITA Tomonori wrote:
> > Hmm, does SCSI mid-layer need to care about how many bytes the block
> > layer allocates? I don't think that extra_len is NOT good_bytes.
> >
> > I think that the block layer had better take care about it (fix
> > __end_that_request_first?).
>
> Yeah, probably calling completion functions w/o bytes count is the right
> thing to do but what I was talking about was what could break when the
> semantics of rq->data_len changed. If we keep rq->data_len() ==
> sum(sg), we keep it business as usual for all the rest except for the
> device application layer if we don't we do the reverse and SCSI midlayer
> completion was a good example, I think.
sglist is a low-level I/O representation for device drivers. SCSI
midlayer should not care about sglist. We should not fix SCSI midlayer
for rq->data_len != sum(sg) change (so I can't agree with your
diagrams in another mail).
When we change a rule, we need to fix something.
If we keep rq->data_len == sum(sg), we need to fix the device
application layer. If we keep rq->data_len == the true data length, we
need to fix the low-level drivers.
Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709
since we are in -rc stages. But I plan to send a patch to revert it
and fix this issue in the block layer. I'd like to test it in -mm for
a while.
The only sglist stuff in the SCSI midlayer now is scsi_req_map_sg. As you
know, we really want to remove it.
> Things going the other way is fine with me but I at least want to hear a
> valid rationale. Till now all I got is "because that's the true size"
> which doesn't really make much sense to me.
Most users of the request structure care only about the real data
length and don't care about padding and drain length. Why should they
bother to use a helper function to get the real data length?
Hello,
FUJITA Tomonori wrote:
> sglist is a low-level I/O representation for device drivers. SCSI
> midlayer should not care about sglist. We should not fix SCSI midlayer
> for rq->data_len != sum(sg) change (so I can't agree with your
> diagrams in another mail).
But that's not the way things currently are.
> When if we change a rule, we need to fix something.
>
> If we keep rq->data_len == sum(sg), we need to fix the device
> application layer. If we keep rq->data_len == the true data length, we
> need to fix the low-level drivers.
Basically everything under block layer.
> Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709
> since we are in -rc stages. But I plan to send a patch to revert it
> and fix this issue in the block layer. I'd like to test it in -mm for
> a while.
>
> Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you
> know, we really want to remove it.
If the way forward is to make everything but the low level drivers not
care about sglist, then in the long term the current scheme is fine, but I
still don't think this way of doing things is a safe one. We're affecting
a large portion of code based on what things should be in the future, not what
they currently are.
>> Things going the other way is fine with me but I at least want to hear a
>> valid rationale. Till now all I got is "because that's the true size"
>> which doesn't really make much sense to me.
>
> Most of users of request structure care about only the real data
> length, don't care about padding and drain length. Why do they bother
> to use a helper function to get the real data length?
I think this is where the difference comes from. To me it seems
internal usage is more widespread and more delicate, and not too many
care about the true size, and when they do, only in well defined places.
Maybe it comes from the difference between your most and my most.
Thanks.
--
tejun
On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <[email protected]> wrote:
> On Wed, 05 Mar 2008 08:33:05 +0900
> Tejun Heo <[email protected]> wrote:
>
>> FUJITA Tomonori wrote:
>>> Hmm, does SCSI mid-layer need to care about how many bytes the block
>>> layer allocates? I don't think that extra_len is NOT good_bytes.
>>>
>>> I think that the block layer had better take care about it (fix
>>> __end_that_request_first?).
>> Yeah, probably calling completion functions w/o bytes count is the right
>> thing to do but what I was talking about was what could break when the
>> semantics of rq->data_len changed. If we keep rq->data_len() ==
>> sum(sg), we keep it business as usual for all the rest except for the
>> device application layer if we don't we do the reverse and SCSI midlayer
>> completion was a good example, I think.
>
> sglist is a low-level I/O representation for device drivers. SCSI
> midlayer should not care about sglist. We should not fix SCSI midlayer
> for rq->data_len != sum(sg) change (so I can't agree with your
> diagrams in another mail).
>
> When if we change a rule, we need to fix something.
>
> If we keep rq->data_len == sum(sg), we need to fix the device
> application layer. If we keep rq->data_len == the true data length, we
> need to fix the low-level drivers.
>
> Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709
> since we are in -rc stages. But I plan to send a patch to revert it
> and fix this issue in the block layer. I'd like to test it in -mm for
> a while.
No, this commit is a serious bug, and the only fix is like you suggested
in __end_that_request_first. This is because it breaks that scsi-ml loop
where scsi_bufflen() can be less than blk_rq_bytes(). In that case this
commit causes data corruption.
> Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you
> know, we really want to remove it.
>
>
>> Things going the other way is fine with me but I at least want to hear a
>> valid rationale. Till now all I got is "because that's the true size"
>> which doesn't really make much sense to me.
>
> Most of users of request structure care about only the real data
> length, don't care about padding and drain length. Why do they bother
> to use a helper function to get the real data length?
> --
Submitted below is the right fix for this problem, as pointed out by TOMO.
Please test that it solves the CD burning problem.
(The patch includes the revert of commit e97a294e)
---
From: Boaz Harrosh <[email protected]>
Date: Wed, 5 Mar 2008 12:07:12 +0200
Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
Commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
because scsi-ml supports the ability to split a request into smaller chunks,
in which case scsi_bufflen() is smaller than the request length. Then at completion
time the remainder can be issued as a new scsi command. In that case the above
commit causes data corruption.
Also, in this fix all users of the block layer are taken care of, and not only
scsi devices.
Signed-off-by: Boaz Harrosh <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
block/blk-core.c | 4 ++++
drivers/scsi/scsi.c | 2 +-
2 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 2a438a9..37fcccc 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
nr_bytes >> 9, req->sector);
}
+ if (nr_bytes >= blk_rq_bytes(req))
+ nr_bytes += req->extra_len;
+
total_bytes = bio_nbytes = 0;
while ((bio = req->bio) != NULL) {
int nbytes;
@@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error,
if (!req->bio)
return 0;
+ BUG_ON(total_bytes >= blk_rq_bytes(req));
/*
* if the request wasn't completed, update state
*/
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index e5c6f6a..fecba05 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
"Notifying upper driver of completion "
"(result %x)\n", cmd->result));
- good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
+ good_bytes = scsi_bufflen(cmd);
if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
drv = scsi_cmd_to_driver(cmd);
if (drv->done)
--
1.5.3.3
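The difference between the reverted scsi.c change and the blk-core.c change can be sketched in plain C. This is a simplified model, not the kernel implementation: completed_scsi_fix() and completed_blk_fix() are hypothetical names, and rq_bytes stands in for whatever blk_rq_bytes() would report for the request.

```c
#include <assert.h>

/*
 * Model of the reverted approach: the midlayer always adds extra_len to
 * the bytes it reports, even when the command covered only part of the
 * request.
 */
static unsigned int completed_scsi_fix(unsigned int bufflen,
				       unsigned int extra_len)
{
	return bufflen + extra_len;
}

/*
 * Model of the block-layer approach from the patch: the padding is
 * accounted for only when the reported bytes cover the whole request.
 */
static unsigned int completed_blk_fix(unsigned int nr_bytes,
				      unsigned int rq_bytes,
				      unsigned int extra_len)
{
	if (nr_bytes >= rq_bytes)
		nr_bytes += extra_len;
	return nr_bytes;
}
```

For an 8192-byte request with 4 bytes of padding that is issued as a 4096-byte chunk, the first model reports 4100 bytes done although only 4096 were transferred, while the second reports 4096 and adds the padding only once the full 8192 bytes have completed.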
On Wed, 2008-03-05 at 12:16 +0200, Boaz Harrosh wrote:
> Please test it solves the CD burning problem.
Works for me.
-Mike
> (The patch includes the revert of commit e97a294e)
> ---
> From: Boaz Harrosh <[email protected]>
> Date: Wed, 5 Mar 2008 12:07:12 +0200
> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
>
> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
> because scsi-ml supports the ability to split a request into smaller chunks,
> in which case scsi_bufflen() is smaller then request length. Then at completion
> time the remainder can be issued as a new scsi command. In that case the above
> commit is a data corruption.
>
> Also in this fix all users of block layer are taken care of, and not only
> scsi devices.
>
> Signed-off-by: Boaz Harrosh <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> ---
> block/blk-core.c | 4 ++++
> drivers/scsi/scsi.c | 2 +-
> 2 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 2a438a9..37fcccc 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
> nr_bytes >> 9, req->sector);
> }
>
> + if (nr_bytes >= blk_rq_bytes(req))
> + nr_bytes += req->extra_len;
> +
> total_bytes = bio_nbytes = 0;
> while ((bio = req->bio) != NULL) {
> int nbytes;
> @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error,
> if (!req->bio)
> return 0;
>
> + BUG_ON(total_bytes >= blk_rq_bytes(req));
> /*
> * if the request wasn't completed, update state
> */
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index e5c6f6a..fecba05 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
> "Notifying upper driver of completion "
> "(result %x)\n", cmd->result));
>
> - good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
> + good_bytes = scsi_bufflen(cmd);
> if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
> drv = scsi_cmd_to_driver(cmd);
> if (drv->done)
On Wed, Mar 05 2008, Boaz Harrosh wrote:
> On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <[email protected]> wrote:
> > On Wed, 05 Mar 2008 08:33:05 +0900
> > Tejun Heo <[email protected]> wrote:
> >
> >> FUJITA Tomonori wrote:
> >>> Hmm, does SCSI mid-layer need to care about how many bytes the block
> >>> layer allocates? I don't think that extra_len is NOT good_bytes.
> >>>
> >>> I think that the block layer had better take care about it (fix
> >>> __end_that_request_first?).
> >> Yeah, probably calling completion functions w/o bytes count is the right
> >> thing to do but what I was talking about was what could break when the
> >> semantics of rq->data_len changed. If we keep rq->data_len() ==
> >> sum(sg), we keep it business as usual for all the rest except for the
> >> device application layer if we don't we do the reverse and SCSI midlayer
> >> completion was a good example, I think.
> >
> > sglist is a low-level I/O representation for device drivers. SCSI
> > midlayer should not care about sglist. We should not fix SCSI midlayer
> > for rq->data_len != sum(sg) change (so I can't agree with your
> > diagrams in another mail).
> >
> > When if we change a rule, we need to fix something.
> >
> > If we keep rq->data_len == sum(sg), we need to fix the device
> > application layer. If we keep rq->data_len == the true data length, we
> > need to fix the low-level drivers.
> >
> > Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709
> > since we are in -rc stages. But I plan to send a patch to revert it
> > and fix this issue in the block layer. I'd like to test it in -mm for
> > a while.
>
> No this commit is a serious bug, and the only fix is like you suggested
> in __end_that_request_first. This is because it breaks that scsi-ml loop
> where scsi_bufflen() can be less then blk_rq_bytes(). In that case this
> commit is a data corruption.
>
> > Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you
> > know, we really want to remove it.
> >
> >
> >> Things going the other way is fine with me but I at least want to hear a
> >> valid rationale. Till now all I got is "because that's the true size"
> >> which doesn't really make much sense to me.
> >
> > Most of users of request structure care about only the real data
> > length, don't care about padding and drain length. Why do they bother
> > to use a helper function to get the real data length?
> > --
>
> Submitted is the right fix to this problem, as pointed out by TOMO.
> Please test it solves the CD burning problem.
> (The patch includes the revert of commit e97a294e)
> ---
> From: Boaz Harrosh <[email protected]>
> Date: Wed, 5 Mar 2008 12:07:12 +0200
> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
>
> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
> because scsi-ml supports the ability to split a request into smaller chunks,
> in which case scsi_bufflen() is smaller then request length. Then at completion
> time the remainder can be issued as a new scsi command. In that case the above
> commit is a data corruption.
We needed something for -rc4, so it had to be rushed a bit...
> Also in this fix all users of block layer are taken care of, and not only
> scsi devices.
>
> Signed-off-by: Boaz Harrosh <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> ---
> block/blk-core.c | 4 ++++
> drivers/scsi/scsi.c | 2 +-
> 2 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 2a438a9..37fcccc 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
> nr_bytes >> 9, req->sector);
> }
>
> + if (nr_bytes >= blk_rq_bytes(req))
> + nr_bytes += req->extra_len;
> +
> total_bytes = bio_nbytes = 0;
> while ((bio = req->bio) != NULL) {
> int nbytes;
> @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error,
> if (!req->bio)
> return 0;
>
> + BUG_ON(total_bytes >= blk_rq_bytes(req));
Make that a WARN_ON() first please. It's indeed a bug, but it won't be
critical, and it's not fair killing everything since this padding stuff
is so fresh and may still need a tweak or two.
I'd be fine with making it a BUG_ON() post 2.6.25.
--
Jens Axboe
On Wed, Mar 05 2008, Boaz Harrosh wrote:
> On Wed, Mar 05 2008 at 14:33 +0200, Jens Axboe <[email protected]> wrote:
> > On Wed, Mar 05 2008, Boaz Harrosh wrote:
> >> On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <[email protected]> wrote:
> >>> On Wed, 05 Mar 2008 08:33:05 +0900
> >>> Tejun Heo <[email protected]> wrote:
> >>>
> >>>> FUJITA Tomonori wrote:
> >>>>> Hmm, does SCSI mid-layer need to care about how many bytes the block
> >>>>> layer allocates? I don't think that extra_len is good_bytes.
> >>>>>
> >>>>> I think that the block layer had better take care about it (fix
> >>>>> __end_that_request_first?).
> >>>> Yeah, probably calling completion functions w/o bytes count is the right
> >>>> thing to do but what I was talking about was what could break when the
> >>>> semantics of rq->data_len changed. If we keep rq->data_len ==
> >>>> sum(sg), we keep it business as usual for all the rest except for the
> >>>> device application layer; if we don't, we do the reverse, and SCSI midlayer
> >>>> completion was a good example, I think.
> >>> sglist is a low-level I/O representation for device drivers. SCSI
> >>> midlayer should not care about sglist. We should not fix SCSI midlayer
> >>> for rq->data_len != sum(sg) change (so I can't agree with your
> >>> diagrams in another mail).
> >>>
> >>> If we change a rule, we need to fix something.
> >>>
> >>> If we keep rq->data_len == sum(sg), we need to fix the device
> >>> application layer. If we keep rq->data_len == the true data length, we
> >>> need to fix the low-level drivers.
> >>>
> >>> Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709
> >>> since we are in -rc stages. But I plan to send a patch to revert it
> >>> and fix this issue in the block layer. I'd like to test it in -mm for
> >>> a while.
> >> No this commit is a serious bug, and the only fix is like you suggested
> >> in __end_that_request_first. This is because it breaks that scsi-ml loop
> >> where scsi_bufflen() can be less than blk_rq_bytes(). In that case this
> >> commit causes data corruption.
> >>
> >>> Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you
> >>> know, we really want to remove it.
> >>>
> >>>
> >>>> Things going the other way is fine with me but I at least want to hear a
> >>>> valid rationale. Till now all I got is "because that's the true size"
> >>>> which doesn't really make much sense to me.
> >>> Most of users of request structure care about only the real data
> >>> length, don't care about padding and drain length. Why do they bother
> >>> to use a helper function to get the real data length?
> >>> --
> >> Submitted is the right fix to this problem, as pointed out by TOMO.
> >> Please test it solves the CD burning problem.
> >> (The patch includes the revert of commit e97a294e)
> >> ---
> >> From: Boaz Harrosh <[email protected]>
> >> Date: Wed, 5 Mar 2008 12:07:12 +0200
> >> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
> >>
> >> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
> >> because scsi-ml supports the ability to split a request into smaller chunks,
> >> in which case scsi_bufflen() is smaller then request length. Then at completion
> >> time the remainder can be issued as a new scsi command. In that case the above
> >> commit is a data corruption.
> >
> > We needed something for -rc4, so it had to be rushed a bit...
> >
> >> Also in this fix all users of block layer are taken care of, and not only
> >> scsi devices.
> >>
> >> Signed-off-by: Boaz Harrosh <[email protected]>
> >> Signed-off-by: Benny Halevy <[email protected]>
> >> ---
> >> block/blk-core.c | 4 ++++
> >> drivers/scsi/scsi.c | 2 +-
> >> 2 files changed, 5 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/block/blk-core.c b/block/blk-core.c
> >> index 2a438a9..37fcccc 100644
> >> --- a/block/blk-core.c
> >> +++ b/block/blk-core.c
> >> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
> >> nr_bytes >> 9, req->sector);
> >> }
> >>
> >> + if (nr_bytes >= blk_rq_bytes(req))
> >> + nr_bytes += req->extra_len;
> >> +
> >> total_bytes = bio_nbytes = 0;
> >> while ((bio = req->bio) != NULL) {
> >> int nbytes;
> >> @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error,
> >> if (!req->bio)
> >> return 0;
> >>
> >> + BUG_ON(total_bytes >= blk_rq_bytes(req));
> >
> > Make that a WARN_ON() first please. It's indeed a bug, but it wont be
> > critical and it's not fair killing everything since this padding stuff
> > is so fresh and may still need a tweak or two.
> >
> > I'd be fine with making it a BUG_ON() post 2.6.25.
> >
> Updated, you are absolutely right, thanks.
>
> Will you commit below patch for 2.6.25? I know that, at the time, I have
> seen this scsi-ml-loop in action on a sata drive here in the lab, on an
> x86_64 machine. The current solution will silently corrupt data, which
> is very hard to find.
Yes, was just hoping you'd resend with the above corrected, so thanks!
I'll add it to the pending queue for 2.6.25.
--
Jens Axboe
On Wed, Mar 05 2008 at 14:33 +0200, Jens Axboe <[email protected]> wrote:
> On Wed, Mar 05 2008, Boaz Harrosh wrote:
>> On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <[email protected]> wrote:
>>> On Wed, 05 Mar 2008 08:33:05 +0900
>>> Tejun Heo <[email protected]> wrote:
>>>
>>>> FUJITA Tomonori wrote:
>>>>> Hmm, does SCSI mid-layer need to care about how many bytes the block
>>>>> layer allocates? I don't think that extra_len is good_bytes.
>>>>>
>>>>> I think that the block layer had better take care about it (fix
>>>>> __end_that_request_first?).
>>>> Yeah, probably calling completion functions w/o bytes count is the right
>>>> thing to do but what I was talking about was what could break when the
>>>> semantics of rq->data_len changed. If we keep rq->data_len ==
>>>> sum(sg), we keep it business as usual for all the rest except for the
>>>> device application layer; if we don't, we do the reverse, and SCSI midlayer
>>>> completion was a good example, I think.
>>> sglist is a low-level I/O representation for device drivers. SCSI
>>> midlayer should not care about sglist. We should not fix SCSI midlayer
>>> for rq->data_len != sum(sg) change (so I can't agree with your
>>> diagrams in another mail).
>>>
>>> If we change a rule, we need to fix something.
>>>
>>> If we keep rq->data_len == sum(sg), we need to fix the device
>>> application layer. If we keep rq->data_len == the true data length, we
>>> need to fix the low-level drivers.
>>>
>>> Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709
>>> since we are in -rc stages. But I plan to send a patch to revert it
>>> and fix this issue in the block layer. I'd like to test it in -mm for
>>> a while.
>> No this commit is a serious bug, and the only fix is like you suggested
>> in __end_that_request_first. This is because it breaks that scsi-ml loop
>> where scsi_bufflen() can be less than blk_rq_bytes(). In that case this
>> commit causes data corruption.
>>
>>> Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you
>>> know, we really want to remove it.
>>>
>>>
>>>> Things going the other way is fine with me but I at least want to hear a
>>>> valid rationale. Till now all I got is "because that's the true size"
>>>> which doesn't really make much sense to me.
>>> Most of users of request structure care about only the real data
>>> length, don't care about padding and drain length. Why do they bother
>>> to use a helper function to get the real data length?
>>> --
>> Submitted is the right fix to this problem, as pointed out by TOMO.
>> Please test it solves the CD burning problem.
>> (The patch includes the revert of commit e97a294e)
>> ---
>> From: Boaz Harrosh <[email protected]>
>> Date: Wed, 5 Mar 2008 12:07:12 +0200
>> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
>>
>> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
>> because scsi-ml supports the ability to split a request into smaller chunks,
>> in which case scsi_bufflen() is smaller then request length. Then at completion
>> time the remainder can be issued as a new scsi command. In that case the above
>> commit is a data corruption.
>
> We needed something for -rc4, so it had to be rushed a bit...
>
>> Also in this fix all users of block layer are taken care of, and not only
>> scsi devices.
>>
>> Signed-off-by: Boaz Harrosh <[email protected]>
>> Signed-off-by: Benny Halevy <[email protected]>
>> ---
>> block/blk-core.c | 4 ++++
>> drivers/scsi/scsi.c | 2 +-
>> 2 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 2a438a9..37fcccc 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
>> nr_bytes >> 9, req->sector);
>> }
>>
>> + if (nr_bytes >= blk_rq_bytes(req))
>> + nr_bytes += req->extra_len;
>> +
>> total_bytes = bio_nbytes = 0;
>> while ((bio = req->bio) != NULL) {
>> int nbytes;
>> @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error,
>> if (!req->bio)
>> return 0;
>>
>> + BUG_ON(total_bytes >= blk_rq_bytes(req));
>
> Make that a WARN_ON() first please. It's indeed a bug, but it wont be
> critical and it's not fair killing everything since this padding stuff
> is so fresh and may still need a tweak or two.
>
> I'd be fine with making it a BUG_ON() post 2.6.25.
>
Updated, you are absolutely right, thanks.
Will you commit the patch below for 2.6.25? I know that I have, at one
point, seen this scsi-ml loop in action on a SATA drive here in the lab,
on an x86_64 machine. The current solution will silently corrupt data,
which is very hard to find.
Boaz
---
From: Boaz Harrosh <[email protected]>
Date: Wed, 5 Mar 2008 12:07:12 +0200
Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
Commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
because scsi-ml supports the ability to split a request into smaller chunks,
in which case scsi_bufflen() is smaller than the request length. Then at
completion time the remainder can be issued as a new scsi command. In that
case the above commit causes data corruption.
Also, this fix takes care of all users of the block layer, not only
scsi devices.
Signed-off-by: Boaz Harrosh <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
block/blk-core.c | 4 ++++
drivers/scsi/scsi.c | 2 +-
2 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 2a438a9..c82e68a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
nr_bytes >> 9, req->sector);
}
+ if (nr_bytes >= blk_rq_bytes(req))
+ nr_bytes += req->extra_len;
+
total_bytes = bio_nbytes = 0;
while ((bio = req->bio) != NULL) {
int nbytes;
@@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error,
if (!req->bio)
return 0;
+ WARN_ON(total_bytes >= blk_rq_bytes(req));
/*
* if the request wasn't completed, update state
*/
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index e5c6f6a..fecba05 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
"Notifying upper driver of completion "
"(result %x)\n", cmd->result));
- good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
+ good_bytes = scsi_bufflen(cmd);
if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
drv = scsi_cmd_to_driver(cmd);
if (drv->done)
--
1.5.3.3
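To make the difference between the two completion schemes in this patch concrete, here is a minimal userspace sketch in C. It is not kernel code: struct model_request, good_bytes_old() and good_bytes_new() are hypothetical names invented for illustration, and only the two arithmetic rules are taken from the diff. It assumes data_len holds the true (unpadded) length still outstanding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct request fields involved. */
struct model_request {
	unsigned int data_len;   /* true data length still outstanding */
	unsigned int extra_len;  /* padding + drain bytes added for the device */
};

/* Reverted rule: extra_len is added to every completion, even a partial one. */
unsigned int good_bytes_old(const struct model_request *rq, unsigned int bufflen)
{
	assert(rq != NULL);
	return bufflen + rq->extra_len;
}

/* Patched rule: extra_len is added only when the whole request completes. */
unsigned int good_bytes_new(const struct model_request *rq, unsigned int nr_bytes)
{
	assert(rq != NULL);
	if (nr_bytes >= rq->data_len)
		nr_bytes += rq->extra_len;
	return nr_bytes;
}
```

With data_len = 4096 and extra_len = 3, completing a 2048-byte chunk yields 2051 under the old rule (three bytes that belong to the remainder are reported as done) and 2048 under the new one.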
Hello, Jens, Boaz.
Jens Axboe wrote:
>>>> From: Boaz Harrosh <[email protected]>
>>>> Date: Wed, 5 Mar 2008 12:07:12 +0200
>>>> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
>>>>
>>>> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
>>>> because scsi-ml supports the ability to split a request into smaller chunks,
>>>> in which case scsi_bufflen() is smaller then request length. Then at completion
>>>> time the remainder can be issued as a new scsi command. In that case the above
>>>> commit is a data corruption.
Thanks for catching the stupidity. Did it actually happen? PC commands
are not completed in pieces and padding / draining should only happen
for those. In all current cases, qc->extra_len should be zero wherever
commands can be split.
>>> We needed something for -rc4, so it had to be rushed a bit...
>>>
>>>> Also in this fix all users of block layer are taken care of, and not only
>>>> scsi devices.
>>>>
>>>> Signed-off-by: Boaz Harrosh <[email protected]>
>>>> Signed-off-by: Benny Halevy <[email protected]>
>>>> ---
>>>> block/blk-core.c | 4 ++++
>>>> drivers/scsi/scsi.c | 2 +-
>>>> 2 files changed, 5 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 2a438a9..37fcccc 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
>>>> nr_bytes >> 9, req->sector);
>>>> }
>>>>
>>>> + if (nr_bytes >= blk_rq_bytes(req))
>>>> + nr_bytes += req->extra_len;
>>>> +
This is getting insanely subtle. Let's say there's a PIO driver which
transfers certain sized chunks at a time and completes the request partially
after completing each chunk, and the driver uses draining to eat up
whatever excess data, which seems like a legit use case to me. But it
won't work because __end_that_request_first() will terminate when it
reaches the 'true' transfer size. That's just a broken API. FWIW,
Nacked-by: Tejun Heo <[email protected]>
--
tejun
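The chunked PIO scenario can be sketched with a small userspace model in C. This is an illustration under stated assumptions, not the kernel function: model_end_request() and model_chunks_until_done() are hypothetical names, the request is reduced to its two length fields, and the return convention (1 while bytes remain, 0 when finished) is only meant to resemble __end_that_request_first().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a request: true length plus pad/drain bytes. */
struct model_request {
	unsigned int data_len;   /* true bytes still outstanding */
	unsigned int extra_len;  /* pad + drain bytes the driver also moves */
};

/* Returns 1 while true bytes remain, 0 once the request is finished. */
int model_end_request(struct model_request *rq, unsigned int nr_bytes)
{
	assert(rq != NULL);
	if (nr_bytes >= rq->data_len) {
		/* The patched check fires here: extra_len is swallowed at once. */
		rq->data_len = 0;
		return 0;
	}
	rq->data_len -= nr_bytes;
	return 1;
}

/* Number of chunk completions made before the request reports finished. */
unsigned int model_chunks_until_done(struct model_request *rq, unsigned int chunk)
{
	unsigned int calls = 1;

	assert(chunk > 0);
	while (model_end_request(rq, chunk))
		calls++;
	return calls;
}
```

For data_len = 10, extra_len = 6 and 4-byte chunks, the device moves 16 bytes in four chunks, but the model reports the request finished after the third completion, which is the mismatch described above.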
On Wed, Mar 05 2008, Tejun Heo wrote:
> Hello, Jens, Boaz.
>
> Jens Axboe wrote:
> >>>> From: Boaz Harrosh <[email protected]>
> >>>> Date: Wed, 5 Mar 2008 12:07:12 +0200
> >>>> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
> >>>>
> >>>> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
> >>>> because scsi-ml supports the ability to split a request into smaller chunks,
> >>>> in which case scsi_bufflen() is smaller then request length. Then at completion
> >>>> time the remainder can be issued as a new scsi command. In that case the above
> >>>> commit is a data corruption.
>
> Thanks for catching the stupidity. Did it actually happen? PC commands
> are not completed in pieces and padding / draining should only happen
> for those. qc->extra_len should be zero where commands can be splitted
> for all current cases.
>
> >>> We needed something for -rc4, so it had to be rushed a bit...
> >>>
> >>>> Also in this fix all users of block layer are taken care of, and not only
> >>>> scsi devices.
> >>>>
> >>>> Signed-off-by: Boaz Harrosh <[email protected]>
> >>>> Signed-off-by: Benny Halevy <[email protected]>
> >>>> ---
> >>>> block/blk-core.c | 4 ++++
> >>>> drivers/scsi/scsi.c | 2 +-
> >>>> 2 files changed, 5 insertions(+), 1 deletions(-)
> >>>>
> >>>> diff --git a/block/blk-core.c b/block/blk-core.c
> >>>> index 2a438a9..37fcccc 100644
> >>>> --- a/block/blk-core.c
> >>>> +++ b/block/blk-core.c
> >>>> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
> >>>> nr_bytes >> 9, req->sector);
> >>>> }
> >>>>
> >>>> + if (nr_bytes >= blk_rq_bytes(req))
> >>>> + nr_bytes += req->extra_len;
> >>>> +
>
> This is getting insanely subtle. Let's say there's PIO driver which
> transfer certain sized chunks at a time and completes request partially
> after completing each chunk and the driver uses draining to eat up
> whatever excess data, which seems like a legit use case to me. But it
> won't work because __end_that_request_first() will terminate when it
> reaches reaches the 'true' transfer size. That's just broken API. FWIW,
>
> Nacked-by: Tejun Heo <[email protected]>
Yeah, I think I may have gone a bit overboard in applying this so
quickly. It's just not a good interface, silently adding the extra
length if asked to complete more. It may even happen right now, for a
driver that does no padding (it probably won't do any harm here either,
but still).
I'll try and see if I can come up with something cleaner.
My basic design paradigm for this is that the _driver_ (or mid layer, if
SCSI wants to handle it) should care about the padding. So make it easy
for them to pad, but have it 'unrolled' by completion time. We should
NOT need any extra_len checks or additions in the block/ directory,
period.
--
Jens Axboe
Hello, Jens.
Jens Axboe wrote:
>> This is getting insanely subtle. Let's say there's PIO driver which
>> transfer certain sized chunks at a time and completes request partially
>> after completing each chunk and the driver uses draining to eat up
>> whatever excess data, which seems like a legit use case to me. But it
>> won't work because __end_that_request_first() will terminate when it
>> reaches reaches the 'true' transfer size. That's just broken API. FWIW,
>>
>> Nacked-by: Tejun Heo <[email protected]>
>
> Yeah, I think I may have gone a bit overboard in applying this so
> quickly. It's just not a good interface, silently adding the extra
> length if asked to complete more. It may even happen right now, for a
> driver that does no padding (it probably wont do any harm here either,
> but still).
Unless a driver explicitly requests padding, it shouldn't be a problem:
extra_len will always be zero, and currently the only driver which uses
padding and draining is libata.
> I'll try and see if I can come up with something cleaner.
>
> My basic design paradigm for this is that the _driver_ (or mid layer, if
> SCSI wants to handle it) should care about the padding. So make it easy
> for them to pad, but have it 'unrolled' by completion time. We should
> NOT need any extra_len checks or additions in the block/ directory,
> period.
Maybe I'm from Mars but I don't really understand all this fuss. The
two patches I posted way back work perfectly fine and don't have any of
these problems and, as I have said again and again, that's because they
don't break the assumption which our internal mechanics depend on.
Can you please put the "true" size aside for a while and consider those
patches? There's nothing fundamentally wrong with letting
rq->data_len be sum(sg), which can differ from the user requested data
length if and only if the low level driver requests so.
If you can come up with something nicer, that will be great too but I
really don't think the current scheme will work.
Thanks.
--
tejun
On Wed, Mar 05 2008 at 15:45 +0200, Tejun Heo <[email protected]> wrote:
> Hello, Jens, Boaz.
>
> Jens Axboe wrote:
>>>>> From: Boaz Harrosh <[email protected]>
>>>>> Date: Wed, 5 Mar 2008 12:07:12 +0200
>>>>> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count
>>>>>
>>>>> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is
>>>>> because scsi-ml supports the ability to split a request into smaller chunks,
>>>>> in which case scsi_bufflen() is smaller then request length. Then at completion
>>>>> time the remainder can be issued as a new scsi command. In that case the above
>>>>> commit is a data corruption.
>
> Thanks for catching the stupidity. Did it actually happen? PC commands
> are not completed in pieces and padding / draining should only happen
> for those. qc->extra_len should be zero where commands can be splitted
> for all current cases.
So qc->extra_len == 0 and nothing is done in that case.
>
>>>> We needed something for -rc4, so it had to be rushed a bit...
>>>>
>>>>> Also in this fix all users of block layer are taken care of, and not only
>>>>> scsi devices.
>>>>>
>>>>> Signed-off-by: Boaz Harrosh <[email protected]>
>>>>> Signed-off-by: Benny Halevy <[email protected]>
>>>>> ---
>>>>> block/blk-core.c | 4 ++++
>>>>> drivers/scsi/scsi.c | 2 +-
>>>>> 2 files changed, 5 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index 2a438a9..37fcccc 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error,
>>>>> nr_bytes >> 9, req->sector);
>>>>> }
>>>>>
>>>>> + if (nr_bytes >= blk_rq_bytes(req))
>>>>> + nr_bytes += req->extra_len;
>>>>> +
>
> This is getting insanely subtle. Let's say there's PIO driver which
> transfer certain sized chunks at a time and completes request partially
> after completing each chunk and the driver uses draining to eat up
> whatever excess data, which seems like a legit use case to me. But it
> won't work because __end_that_request_first() will terminate when it
> reaches reaches the 'true' transfer size. That's just broken API. FWIW,
>
> Nacked-by: Tejun Heo <[email protected]>
>
I don't understand. Drivers can still do that. Do you mean that the driver
also wants to complete the draining portion in smaller chunks? I thought the
draining is always done at once, at most. Is that theoretical, or is it so in
any of the drivers?
Anyway, Nack from my side on the scsi_finish_command() change: it makes too many
assumptions that are unchecked anywhere, and it's a terrible layering violation.
scsi is a pass-through block device; the fix should be in block or in the device
drivers using it (e.g. libata), which know what is going on.
Anyway, you are always saying req->data_len == sum(sg), but that was certainly
never true for scsi_bufflen() == sum(sg), so leave that alone please.
Any other block layer fixes are welcome. But for now this is the best fix we have,
one that only breaks theoretical, yet to be submitted drivers.
Boaz
Hello, Boaz.
Boaz Harrosh wrote:
>> This is getting insanely subtle. Let's say there's PIO driver which
>> transfer certain sized chunks at a time and completes request partially
>> after completing each chunk and the driver uses draining to eat up
>> whatever excess data, which seems like a legit use case to me. But it
>> won't work because __end_that_request_first() will terminate when it
>> reaches reaches the 'true' transfer size. That's just broken API. FWIW,
>>
>> Nacked-by: Tejun Heo <[email protected]>
>>
>
> I don't understand? Drivers can still do that. Do you mean That it wants
> to also complete the draining portion in smaller chunks? I thought the draining
> is always done at once, at most. Is that theoretical or is it so in any of the
> drivers.
Ah... I wasn't really Nacking your patch specifically. I was trying to
say "this scheme isn't gonna work". Your patch does make good sense
given the situation (and I think I did acknowledge that above). Sorry
about the miscommunication.
> Any way Nack from my side on the scsi_finish_command(), it makes too many
> assumptions that are unchecked anywhere. And it's a terrible layering violation.
> scsi is a pass-threw block device, the fix should be in block or in using device
> drivers (eg libata), that know what is going on.
Yeap, completely agreed. That one gets my big Nack-You-Idiot.
> Any way you are always saying req->data_len == sum(sg) but that was certainly
> never true for scsi_bufflen() == sum(sg) so leave that alone please.
I don't really care about scsi_bufflen() and I'm not willing to change
any of that. If SCSI LLDs are happy with scsi_bufflen() != sum(sg), no
problem at all. What I'm against is pushing that into block layer,
which until now had "true" size == rq->data_len == sum(sg).
We're about to break one of the two equals if we're gonna do sg
manipulation in block layer (Jens seems to be planning something
different) and all I'm saying is we're far better off breaking the
former one.
First, I don't really think SCSI LLDs will make much use of explicit
padding or draining. Secondly, even when such need arises, keeping
scsi_bufflen() at the "true" size is easy no matter which way we go with
rq->data_len.
Anyways, let's wait and see what Jens comes up with.
> Any other block layer fixes are welcome. But for now this is the best fix we have
> that only breaks theoretical, yet to be submitted drivers.
Yeap, given the current code, I agree.
Thanks.
--
tejun
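The two options being argued over can be written down as a tiny C model. This is only a sketch for illustration: struct model_lengths and both helper names are hypothetical, and extra_len stands for whatever padding plus drain the low level driver asked for.

```c
#include <assert.h>
#include <limits.h>

/* The three quantities that used to be equal in the block layer. */
struct model_lengths {
	unsigned int true_len;  /* what the submitter asked for */
	unsigned int data_len;  /* what rq->data_len would hold */
	unsigned int sg_sum;    /* sum of the scatterlist entry lengths */
};

/* Option 1: keep data_len == sum(sg), so true_len != data_len. */
struct model_lengths model_keep_sg_equal(unsigned int true_len,
					 unsigned int extra_len)
{
	struct model_lengths l;

	assert(true_len <= UINT_MAX - extra_len);  /* no wrap-around */
	l.true_len = true_len;
	l.data_len = true_len + extra_len;
	l.sg_sum = true_len + extra_len;
	return l;
}

/* Option 2: keep data_len == true length, so data_len != sum(sg). */
struct model_lengths model_keep_true_equal(unsigned int true_len,
					   unsigned int extra_len)
{
	struct model_lengths l;

	assert(true_len <= UINT_MAX - extra_len);  /* no wrap-around */
	l.true_len = true_len;
	l.data_len = true_len;
	l.sg_sum = true_len + extra_len;
	return l;
}
```

In both options sum(sg) grows by extra_len; the disagreement is only about which side of the old double equality rq->data_len should stay on.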
On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote:
> On Wed, Mar 05 2008, Tejun Heo wrote:
> > This is getting insanely subtle. Let's say there's PIO driver which
> > transfer certain sized chunks at a time and completes request partially
> > after completing each chunk and the driver uses draining to eat up
> > whatever excess data, which seems like a legit use case to me. But it
> > won't work because __end_that_request_first() will terminate when it
> > reaches reaches the 'true' transfer size. That's just broken API. FWIW,
> >
> > Nacked-by: Tejun Heo <[email protected]>
>
> Yeah, I think I may have gone a bit overboard in applying this so
> quickly. It's just not a good interface, silently adding the extra
> length if asked to complete more. It may even happen right now, for a
> driver that does no padding (it probably wont do any harm here either,
> but still).
>
> I'll try and see if I can come up with something cleaner.
>
> My basic design paradigm for this is that the _driver_ (or mid layer, if
> SCSI wants to handle it) should care about the padding. So make it easy
> for them to pad, but have it 'unrolled' by completion time. We should
> NOT need any extra_len checks or additions in the block/ directory,
> period.
Right, that's why my original proposal was to do nothing for padding
(other than ensure the driver could adjust the length if it wanted to)
and to add an extra element always for draining, which the driver could
ignore. It basically pushed the use paradigm onto the driver.
If we want the use paradigm shared between block and driver, then I
think the best approach is to keep all the bios the same (so not adjust
for padding), but do adjust in the blk_rq_map_sg(). That way we have
the padding and draining unwind information by comparing with the bio.
For passing on to the driver: req->data_len still needs to be the input
(bio) length. req->extra_len can record how much padding and draining
was added. The completion length also needs to be in terms of the true
(bio) length. Now, here's the subtlety. Because of the way transfers
work, we expect the padded length not to contribute to overrun (because
it represents transfers that were successfully completed at the correct
length), but we *do* expect drain usage to be recorded as overrun.
However, if we keep the bios intact, we have all the information to make
this determination in the block layer at completion time, with the
expectation that the lower layers report the exact amount they
transferred.
James
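One possible reading of the accounting described in this message, as a userspace C sketch. Everything here is an assumption made for illustration: struct model_completion and model_account() are hypothetical names, and the rule encoded is simply "bytes up to the bio length are good, bytes that fall in the padding are ignored, bytes beyond the padding (drain usage) count as overrun".

```c
#include <assert.h>

/* Hypothetical result of completing one request. */
struct model_completion {
	unsigned int good_bytes;  /* completed, in terms of the bio length */
	unsigned int overrun;     /* bytes that landed in the drain buffer */
};

/*
 * bio_len:     true length taken from the untouched bios
 * pad_len:     padding added when the scatterlist was built
 * transferred: exact byte count reported by the lower layer
 */
struct model_completion model_account(unsigned int bio_len,
				      unsigned int pad_len,
				      unsigned int transferred)
{
	struct model_completion c;
	unsigned int padded_len = bio_len + pad_len;

	assert(padded_len >= bio_len);  /* no unsigned wrap-around */
	c.good_bytes = transferred < bio_len ? transferred : bio_len;
	c.overrun = transferred > padded_len ? transferred - padded_len : 0;
	return c;
}
```

For bio_len = 10 and pad_len = 2, a 12-byte transfer completes 10 good bytes with no overrun, while a 16-byte transfer completes the same 10 bytes and records 4 bytes of overrun.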
On Wed, 05 Mar 2008 09:21:24 -0600
James Bottomley <[email protected]> wrote:
> On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote:
> > On Wed, Mar 05 2008, Tejun Heo wrote:
> > > This is getting insanely subtle. Let's say there's PIO driver which
> > > transfer certain sized chunks at a time and completes request partially
> > > after completing each chunk and the driver uses draining to eat up
> > > whatever excess data, which seems like a legit use case to me. But it
> > > won't work because __end_that_request_first() will terminate when it
> > > reaches reaches the 'true' transfer size. That's just broken API. FWIW,
> > >
> > > Nacked-by: Tejun Heo <[email protected]>
> >
> > Yeah, I think I may have gone a bit overboard in applying this so
> > quickly. It's just not a good interface, silently adding the extra
> > length if asked to complete more. It may even happen right now, for a
> > driver that does no padding (it probably wont do any harm here either,
> > but still).
> >
> > I'll try and see if I can come up with something cleaner.
> >
> > My basic design paradigm for this is that the _driver_ (or mid layer, if
> > SCSI wants to handle it) should care about the padding. So make it easy
> > for them to pad, but have it 'unrolled' by completion time. We should
> > NOT need any extra_len checks or additions in the block/ directory,
> > period.
>
> Right, that's why my original proposal was to do nothing for padding
> (other than ensure the driver could adjust the length if it wanted to)
> and to add an extra element always for draining, which the driver could
> ignore. It basically pushed the use paradigm onto the driver.
>
> If we want the use paradigm shared between block and driver, then I
> think the best approach is to keep all the bios the same (so not adjust
> for padding), but do adjust in the blk_rq_map_sg(). That way we have
> the padding and draining unwind information by comparing with the bio.
Adjusting only sg in blk_rq_map_sg (like drain) looks much
better. This works with libata for me.
diff --git a/block/blk-map.c b/block/blk-map.c
index c07d9c8..e949969 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -140,26 +140,6 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
ubuf += ret;
}
- /*
- * __blk_rq_map_user() copies the buffers if starting address
- * or length isn't aligned to dma_pad_mask. As the copied
- * buffer is always page aligned, we know that there's enough
- * room for padding. Extend the last bio and update
- * rq->data_len accordingly.
- *
- * On unmap, bio_uncopy_user() will use unmodified
- * bio_map_data pointed to by bio->bi_private.
- */
- if (len & q->dma_pad_mask) {
- unsigned int pad_len = (q->dma_pad_mask & ~len) + 1;
- struct bio *tail = rq->biotail;
-
- tail->bi_io_vec[tail->bi_vcnt - 1].bv_len += pad_len;
- tail->bi_size += pad_len;
-
- rq->extra_len += pad_len;
- }
-
rq->buffer = rq->data = NULL;
return 0;
unmap_rq:
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 0f58616..2a81c87 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -220,6 +220,13 @@ new_segment:
bvprv = bvec;
} /* segments in rq */
+ if (sg && (q->dma_pad_mask & rq->data_len)) {
+ unsigned int pad_len = (q->dma_pad_mask & ~rq->data_len) + 1;
+
+ sg->length += pad_len;
+ rq->extra_len += pad_len;
+ }
+
if (q->dma_drain_size && q->dma_drain_needed(rq)) {
if (rq->cmd_flags & REQ_RW)
memset(q->dma_drain_buffer, 0, q->dma_drain_size);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index e5c6f6a..fecba05 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
"Notifying upper driver of completion "
"(result %x)\n", cmd->result));
- good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
+ good_bytes = scsi_bufflen(cmd);
if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
drv = scsi_cmd_to_driver(cmd);
if (drv->done)
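The pad_len arithmetic in the blk-merge.c hunk can be checked in isolation with a small userspace C sketch. The helper name model_pad_len() is hypothetical; the expression itself is the one from the diff, and the sketch assumes the pad mask has the form 2^n - 1, as an alignment mask would.

```c
#include <assert.h>

/*
 * Bytes needed to round len up to the next multiple of (pad_mask + 1).
 * Returns 0 when len is already aligned, matching the guard in the diff.
 */
unsigned int model_pad_len(unsigned int len, unsigned int pad_mask)
{
	/* Assumption: pad_mask is 2^n - 1 (e.g. 3 for 4-byte alignment). */
	assert((pad_mask & (pad_mask + 1)) == 0);

	if (!(len & pad_mask))
		return 0;
	return (pad_mask & ~len) + 1;
}
```

With pad_mask = 3, a length of 5 needs 3 pad bytes and a length of 6 needs 2, so both end up at 8; a length of 8 is already aligned and gets none.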
On Wed, 05 Mar 2008 09:44:01 +0900
Tejun Heo <[email protected]> wrote:
> >> Things going the other way is fine with me but I at least want to hear a
> >> valid rationale. Till now all I got is "because that's the true size"
> >> which doesn't really make much sense to me.
> >
> > Most of users of request structure care about only the real data
> > length, don't care about padding and drain length. Why do they bother
> > to use a helper function to get the real data length?
>
> I think this is where the difference comes from. To me it seems
> internal usage seems more wide-spread and more delicate and not too many
> care about the true size and when they do only in well defined places.
> Maybe it comes from the difference between your most and my most.
I don't think that they care only in well-defined places.
If you look at the scsi mid-layer (and LLDs), you can find several places
that use rq->data_len as the true data length.
Breaking rq->data_len == the true data length is theoretically
wrong. Even if it affects only libata now, it will hurt us, I think.
Hello, FUJITA.
FUJITA Tomonori wrote:
>>>> Things going the other way is fine with me but I at least want to hear a
>>>> valid rationale. Till now all I got is "because that's the true size"
>>>> which doesn't really make much sense to me.
>>> Most of users of request structure care about only the real data
>>> length, don't care about padding and drain length. Why do they bother
>>> to use a helper function to get the real data length?
>> I think this is where the difference comes from. To me it seems
>> internal usage seems more wide-spread and more delicate and not too many
>> care about the true size and when they do only in well defined places.
>> Maybe it comes from the difference between your most and my most.
>
> I don't think that they only in well defined places.
>
> If you see scsi mid-layer (and LLDs), you can find several places that
> use rq->data_len as the true data length.
>
> Breaking rq->data_len == the true data length theoretically
> wrong. Even if it affects only libata now, it will hurt us, I think.
Yeap, I fully agree it's much better not to break either of the two
assumptions unless it's actually needed. Both padding and draining
are requirements of the low-level driver, which usually stem from
hardware quirks, so adjusting sg and length there and letting the rest
of the system not care about it sounds like a good idea to me. Maybe
something good can come out of this long thread. :-)
Thanks.
--
tejun
On Wed, 05 Mar 2008 12:16:15 +0200
Boaz Harrosh <[email protected]> wrote:
> On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <[email protected]> wrote:
> > On Wed, 05 Mar 2008 08:33:05 +0900
> > Tejun Heo <[email protected]> wrote:
> >
> >> FUJITA Tomonori wrote:
> >>> Hmm, does SCSI mid-layer need to care about how many bytes the block
> >>> layer allocates? I don't think that extra_len is NOT good_bytes.
> >>>
> >>> I think that the block layer had better take care about it (fix
> >>> __end_that_request_first?).
> >> Yeah, probably calling completion functions w/o bytes count is the right
> >> thing to do but what I was talking about was what could break when the
> >> semantics of rq->data_len changed. If we keep rq->data_len() ==
> >> sum(sg), we keep it business as usual for all the rest except for the
> >> device application layer if we don't we do the reverse and SCSI midlayer
> >> completion was a good example, I think.
> >
> > sglist is a low-level I/O representation for device drivers. SCSI
> > midlayer should not care about sglist. We should not fix SCSI midlayer
> > for rq->data_len != sum(sg) change (so I can't agree with your
> > diagrams in another mail).
> >
> > When if we change a rule, we need to fix something.
> >
> > If we keep rq->data_len == sum(sg), we need to fix the device
> > application layer. If we keep rq->data_len == the true data length, we
> > need to fix the low-level drivers.
> >
> > Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709
> > since we are in -rc stages. But I plan to send a patch to revert it
> > and fix this issue in the block layer. I'd like to test it in -mm for
> > a while.
>
> No this commit is a serious bug, and the only fix is like you suggested
> in __end_that_request_first. This is because it breaks that scsi-ml loop
> where scsi_bufflen() can be less then blk_rq_bytes(). In that case this
> commit is a data corruption.
Ah, I knew that the patch doesn't work with partial completion, but I
thought that partial completion doesn't happen with PC commands...
And touching __end_that_request_first looked really hacky, so I didn't
send such a patch.
Moving the padding adjustment to blk_rq_map_sg (James' proposal) looks
fine. Maybe Jens will come up with something better.
On Thu, Mar 06 2008, FUJITA Tomonori wrote:
> On Wed, 05 Mar 2008 09:21:24 -0600
> James Bottomley <[email protected]> wrote:
>
> > On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote:
> > > On Wed, Mar 05 2008, Tejun Heo wrote:
> > > > This is getting insanely subtle. Let's say there's PIO driver which
> > > > transfer certain sized chunks at a time and completes request partially
> > > > after completing each chunk and the driver uses draining to eat up
> > > > whatever excess data, which seems like a legit use case to me. But it
> > > > won't work because __end_that_request_first() will terminate when it
> > > > > reaches the 'true' transfer size. That's just broken API. FWIW,
> > > >
> > > > Nacked-by: Tejun Heo <[email protected]>
> > >
> > > Yeah, I think I may have gone a bit overboard in applying this so
> > > quickly. It's just not a good interface, silently adding the extra
> > > length if asked to complete more. It may even happen right now, for a
> > > driver that does no padding (it probably wont do any harm here either,
> > > but still).
> > >
> > > I'll try and see if I can come up with something cleaner.
> > >
> > > My basic design paradigm for this is that the _driver_ (or mid layer, if
> > > SCSI wants to handle it) should care about the padding. So make it easy
> > > for them to pad, but have it 'unrolled' by completion time. We should
> > > NOT need any extra_len checks or additions in the block/ directory,
> > > period.
> >
> > Right, that's why my original proposal was to do nothing for padding
> > (other than ensure the driver could adjust the length if it wanted to)
> > and to add an extra element always for draining, which the driver could
> > ignore. It basically pushed the use paradigm onto the driver.
> >
> > If we want the use paradigm shared between block and driver, then I
> > think the best approach is to keep all the bios the same (so not adjust
> > for padding), but do adjust in the blk_rq_map_sg(). That way we have
> > the padding and draining unwind information by comparing with the bio.
>
> Adjusting only sg in blk_rq_map_sg (like drain) looks much
> better. This works with libata for me.
Looks like a much better solution to me. Anyone have any valid
objections against moving the padding to the sg map time?
--
Jens Axboe
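
For readers following the thread, here is a minimal sketch of what
"moving the padding to the sg map time" could look like. It is a toy
model in plain C, not kernel code: struct toy_request, struct toy_sg,
toy_map_sg() and toy_sg_total() are hypothetical names invented for
illustration. The only parts taken from the thread are the idea (pad the
sg list, leave the true data length alone) and the extra_len bookkeeping
visible in the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT the kernel's struct scatterlist / struct request. */
struct toy_sg {
    unsigned int length;
};

struct toy_request {
    unsigned int data_len;  /* true payload length, never padded here */
    unsigned int extra_len; /* bytes added only for the low-level driver */
};

/*
 * Build an sg list from per-segment lengths and pad only the LAST entry
 * so that the sg total is a multiple of (pad_mask + 1).
 * pad_mask is assumed to be of the form 2^n - 1.
 * Returns the number of sg entries written.
 */
static size_t toy_map_sg(struct toy_request *rq, const unsigned int *seg_len,
                         size_t nsegs, unsigned int pad_mask,
                         struct toy_sg *sg)
{
    unsigned int total = 0;
    size_t i;

    for (i = 0; i < nsegs; i++) {
        sg[i].length = seg_len[i];
        total += seg_len[i];
    }

    rq->data_len = total; /* stays equal to the true data length */
    rq->extra_len = 0;

    if (nsegs && (total & pad_mask)) {
        unsigned int pad_len = (pad_mask & ~total) + 1;

        sg[nsegs - 1].length += pad_len; /* only the sg view grows */
        rq->extra_len = pad_len;
    }
    return nsegs;
}

/* Sum of all sg entry lengths, i.e. what the hardware would transfer. */
static unsigned int toy_sg_total(const struct toy_sg *sg, size_t nents)
{
    unsigned int total = 0;
    size_t i;

    for (i = 0; i < nents; i++)
        total += sg[i].length;
    return total;
}
```

With segments of 256 and 254 bytes and a pad mask of 3, data_len stays
510 while the sg total becomes 512, so the difference can be recovered
at completion time by comparing the two.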
Jens Axboe wrote:
>>> If we want the use paradigm shared between block and driver, then I
>>> think the best approach is to keep all the bios the same (so not adjust
>>> for padding), but do adjust in the blk_rq_map_sg(). That way we have
>>> the padding and draining unwind information by comparing with the bio.
>> Adjusting only sg in blk_rq_map_sg (like drain) looks much
>> better. This works with libata for me.
>
> Looks like a much better solution to me. Anyone have any valid
> objections against moving the padding to the sg map time?
Not necessarily objections but some concerns.
* As completion is done in bio terms, it makes completion from LLDs a
bit cumbersome, but this is unavoidable if we break sum(bio) == sum(sg).
* I've been wondering why we are not using sg chain / table or whatever
directly in bios and maybe rq_map_sg can go away in future.
How about separating out the padding / draining adjustment into a
separate interface? Say, blk_rq_apply_extra() and blk_rq_undo_extra()
and make it the responsibility of the LLD which requested
padding/draining to apply and undo the adjustments? It can undo the
adjustments when it returns the request to its upper layer. If rq
completion is handled by upper layer, it will do the right thing. If rq
completion is handled by LLD, it can see the bio it wants to see.
Thanks.
--
tejun
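
A rough sketch of the apply/undo idea proposed above, under heavy
assumptions. blk_rq_apply_extra() and blk_rq_undo_extra() are only names
floated in this message and do not exist as kernel APIs here; the toy_*
functions and struct toy_request below are hypothetical stand-ins written
in plain C to show the intended symmetry, nothing more.

```c
#include <assert.h>

/* Toy stand-in; NOT the kernel's struct request. */
struct toy_request {
    unsigned int data_len;  /* length as seen by whoever holds the rq */
    unsigned int extra_len; /* padding + drain currently applied */
};

/*
 * Called by the low-level driver that asked for padding/draining,
 * right before it programs the hardware.
 * pad_mask is assumed to be of the form 2^n - 1.
 */
static void toy_rq_apply_extra(struct toy_request *rq, unsigned int pad_mask,
                               unsigned int drain_size)
{
    unsigned int extra = drain_size;

    assert(rq->extra_len == 0); /* must not be applied twice */

    if (rq->data_len & pad_mask)
        extra += (pad_mask & ~rq->data_len) + 1;

    rq->data_len += extra;
    rq->extra_len = extra;
}

/*
 * Called by the same driver before handing the request back to its
 * upper layer, so that layer sees the true data length again.
 */
static void toy_rq_undo_extra(struct toy_request *rq)
{
    assert(rq->data_len >= rq->extra_len);

    rq->data_len -= rq->extra_len;
    rq->extra_len = 0;
}
```

In this model a 510-byte request with a pad mask of 3 and a 512-byte
drain buffer grows to 1024 bytes while the driver owns it and shrinks
back to 510 bytes after the undo call.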
On Fri, 07 Mar 2008 09:07:23 +0900
Tejun Heo <[email protected]> wrote:
> Jens Axboe wrote:
> >>> If we want the use paradigm shared between block and driver, then I
> >>> think the best approach is to keep all the bios the same (so not adjust
> >>> for padding), but do adjust in the blk_rq_map_sg(). That way we have
> >>> the padding and draining unwind information by comparing with the bio.
> >> Adjusting only sg in blk_rq_map_sg (like drain) looks much
> >> better. This works with libata for me.
> >
> > Looks like a much better solution to me. Anyone have any valid
> > objections against moving the padding to the sg map time?
>
> Not necessarily objections but some concerns.
>
> * As completion is done in bio terms, it makes completion from LLDs a
> bit cumbersome, but this is unavoidable if we break sum(bio) == sum(sg).
What do you mean? How does sum(bio) affect LLDs?
> * I've been wondering why we are not using sg chain / table or whatever
> directly in bios and maybe rq_map_sg can go away in future.
You mean that LLDs use bios directly? For me, sg and bio have very
different objectives and it's a clean layer separation.
> How about separating out the padding / draining adjustment into a
> separate interface? Say, blk_rq_apply_extra() and blk_rq_undo_extra()
> and make it the responsibility of the LLD which requested
> padding/draining to apply and undo the adjustments? It can undo the
> adjustments when it returns the request to its upper layer. If rq
> completion is handled by upper layer, it will do the right thing. If rq
> completion is handled by LLD, it can see the bio it wants to see.
If possible, I'd like to avoid creating APIs for them. I think that
the current approach is much better than such APIs.
FUJITA Tomonori wrote:
> On Fri, 07 Mar 2008 09:07:23 +0900
> Tejun Heo <[email protected]> wrote:
>
>> Jens Axboe wrote:
>>>>> If we want the use paradigm shared between block and driver, then I
>>>>> think the best approach is to keep all the bios the same (so not adjust
>>>>> for padding), but do adjust in the blk_rq_map_sg(). That way we have
>>>>> the padding and draining unwind information by comparing with the bio.
>>>> Adjusting only sg in blk_rq_map_sg (like drain) looks much
>>>> better. This works with libata for me.
>>> Looks like a much better solution to me. Anyone have any valid
>>> objections against moving the padding to the sg map time?
>> Not necessarily objections but some concerns.
>>
>> * As completion is done in bio terms, it makes completion from LLDs a
>> bit cumbersome, but this is unavoidable if we break sum(bio) == sum(sg).
>
> What do you mean? How does sum(bio) affect LLDs?
LLDs which loop over sg's trying to complete rq incrementally will see
rq going away sooner than they expected.
>> * I've been wondering why we are not using sg chain / table or whatever
>> directly in bios and maybe rq_map_sg can go away in future.
>
> You mean that LLDs use bios directly? For me, sg and bio have very
> different objectives and it's a clean layer separation.
Actually the other way around: the block layer would use sg instead of
bio_vec in bio. Layer separation doesn't necessarily require copying
roughly the same information into a differently formatted data
structure. I'm not sure it would be a clean win, though. Requests hang
longer in the scheduler queue, and bio_vec is smaller than scatterlist.
The thing is that, to me, blk_rq_map_sg() doesn't really look necessary;
it can be done just as well when the request is fetched from the queue
by the block driver. (continued below...)
>> How about separating out the padding / draining adjustment into a
>> separate interface? Say, blk_rq_apply_extra() and blk_rq_undo_extra()
>> and make it the responsibility of the LLD which requested
>> padding/draining to apply and undo the adjustments? It can undo the
>> adjustments when it returns the request to its upper layer. If rq
>> completion is handled by upper layer, it will do the right thing. If rq
>> completion is handled by LLD, it can see the bio it wants to see.
>
> If possible, I'd like to avoid creating APIs for them. I think that
> the current approach is much better than such APIs.
And, so, I'm not too sure whether putting more mechanisms into it is a
good idea.
Thanks.
--
tejun
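
The concern about LLDs that complete a request one sg entry at a time
can be illustrated with a small toy model. Everything here is
hypothetical (struct toy_request, toy_end_request(),
toy_complete_by_sg()); it only mimics the behaviour described in the
thread, where completion is counted against the true data length while
the sg list also carries padding and a drain element.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in; NOT the kernel's struct request. */
struct toy_request {
    unsigned int remaining; /* true data bytes not yet completed */
};

/*
 * Complete nbytes of the request.
 * Returns 1 while data remains, 0 once the request is finished.
 * Completing more than what remains is clamped (a modelling assumption).
 */
static int toy_end_request(struct toy_request *rq, unsigned int nbytes)
{
    if (nbytes >= rq->remaining) {
        rq->remaining = 0;
        return 0;
    }
    rq->remaining -= nbytes;
    return 1;
}

/*
 * A driver loop that completes the request after each sg entry.
 * Returns how many sg entries had been processed when the request
 * finished (or nents if it never finished).
 */
static size_t toy_complete_by_sg(struct toy_request *rq,
                                 const unsigned int *sg_len, size_t nents)
{
    size_t i;

    for (i = 0; i < nents; i++) {
        if (!toy_end_request(rq, sg_len[i]))
            return i + 1; /* request is gone from here on */
    }
    return nents;
}
```

For a 510-byte request mapped to sg lengths {256, 256, 512} (2 bytes of
padding in the second entry, a 512-byte drain element last), the request
finishes after the second entry, before the driver has reached the drain
element.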
From: Jens Axboe <[email protected]>
Subject: Re: [PATCH] blk: missing add of padded bytes to io completion byte count
Date: Thu, 6 Mar 2008 14:41:39 +0100
> On Thu, Mar 06 2008, FUJITA Tomonori wrote:
> > On Wed, 05 Mar 2008 09:21:24 -0600
> > James Bottomley <[email protected]> wrote:
> >
> > > On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote:
> > > > On Wed, Mar 05 2008, Tejun Heo wrote:
> > > > > This is getting insanely subtle. Let's say there's PIO driver which
> > > > > transfer certain sized chunks at a time and completes request partially
> > > > > after completing each chunk and the driver uses draining to eat up
> > > > > whatever excess data, which seems like a legit use case to me. But it
> > > > > won't work because __end_that_request_first() will terminate when it
> > > > > > reaches the 'true' transfer size. That's just broken API. FWIW,
> > > > >
> > > > > Nacked-by: Tejun Heo <[email protected]>
> > > >
> > > > Yeah, I think I may have gone a bit overboard in applying this so
> > > > quickly. It's just not a good interface, silently adding the extra
> > > > length if asked to complete more. It may even happen right now, for a
> > > > driver that does no padding (it probably wont do any harm here either,
> > > > but still).
> > > >
> > > > I'll try and see if I can come up with something cleaner.
> > > >
> > > > My basic design paradigm for this is that the _driver_ (or mid layer, if
> > > > SCSI wants to handle it) should care about the padding. So make it easy
> > > > for them to pad, but have it 'unrolled' by completion time. We should
> > > > NOT need any extra_len checks or additions in the block/ directory,
> > > > period.
> > >
> > > Right, that's why my original proposal was to do nothing for padding
> > > (other than ensure the driver could adjust the length if it wanted to)
> > > and to add an extra element always for draining, which the driver could
> > > ignore. It basically pushed the use paradigm onto the driver.
> > >
> > > If we want the use paradigm shared between block and driver, then I
> > > think the best approach is to keep all the bios the same (so not adjust
> > > for padding), but do adjust in the blk_rq_map_sg(). That way we have
> > > the padding and draining unwind information by comparing with the bio.
> >
> > Adjusting only sg in blk_rq_map_sg (like drain) looks much
> > better. This works with libata for me.
>
> Looks like a much better solution to me. Anyone have any valid
> objections against moving the padding to the sg map time?
What's the situation with this fix?