This is the start of the stable review cycle for the 4.4.60 release.
There are 26 patches in this series; all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat Apr 8 08:35:54 UTC 2017.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.60-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <[email protected]>
Linux 4.4.60-rc1
Jason A. Donenfeld <[email protected]>
padata: avoid race in reordering
NeilBrown <[email protected]>
blk: Ensure users for current->bio_list can see the full list.
NeilBrown <[email protected]>
blk: improve order of bio handling in generic_make_request()
Alexandre Belloni <[email protected]>
power: reset: at91-poweroff: timely shutdown LPDDR memories
David Hildenbrand <[email protected]>
KVM: kvm_io_bus_unregister_dev() should never fail
Uwe Kleine-König <[email protected]>
rtc: s35390a: improve irq handling
Uwe Kleine-König <[email protected]>
rtc: s35390a: implement reset routine as suggested by the reference
Uwe Kleine-König <[email protected]>
rtc: s35390a: make sure all members in the output are set
Uwe Kleine-König <[email protected]>
rtc: s35390a: fix reading out alarm
Felix Fietkau <[email protected]>
MIPS: Lantiq: Fix cascaded IRQ setup
Naoya Horiguchi <[email protected]>
mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd()
Michel Dänzer <[email protected]>
drm/radeon: Override fpfn for all VRAM placements in radeon_evict_flags
Peter Xu <[email protected]>
KVM: x86: clear bus pointer when destroyed
Alan Stern <[email protected]>
USB: fix linked-list corruption in rh_call_control()
Nicolas Ferre <[email protected]>
tty/serial: atmel: fix TX path in atmel_console_write()
Richard Genoud <[email protected]>
tty/serial: atmel: fix race condition (TX+DMA)
Joerg Roedel <[email protected]>
ACPI: Do not create a platform_device for IOAPIC/IOxAPIC
Josh Poimboeuf <[email protected]>
ACPI: Fix incompatibility with mcount-based function graph tracing
Songjun Wu <[email protected]>
ASoC: atmel-classd: fix audio clock rate
Hui Wang <[email protected]>
ALSA: hda - fix a problem for lineout on a Dell AIO machine
Takashi Iwai <[email protected]>
ALSA: seq: Fix race during FIFO resize
John Garry <[email protected]>
scsi: libsas: fix ata xfer length
peter chang <[email protected]>
scsi: sg: check length passed to SG_NEXT_CMD_LEN
James Bottomley <[email protected]>
scsi: mpt3sas: fix hang on ata passthrough commands
Ross Lagerwall <[email protected]>
xen/setup: Don't relocate p2m over existing one
Ilya Dryomov <[email protected]>
libceph: force GFP_NOIO for socket allocations
-------------
Diffstat:
Makefile | 4 +-
arch/mips/lantiq/irq.c | 38 ++++----
arch/x86/xen/setup.c | 6 +-
block/bio.c | 12 ++-
block/blk-core.c | 40 +++++++--
drivers/acpi/Makefile | 1 -
drivers/acpi/acpi_platform.c | 8 +-
drivers/gpu/drm/radeon/radeon_ttm.c | 4 +-
drivers/md/raid1.c | 3 +-
drivers/md/raid10.c | 3 +-
drivers/power/reset/at91-poweroff.c | 54 ++++++++++-
drivers/rtc/rtc-s35390a.c | 167 +++++++++++++++++++++++++++--------
drivers/scsi/libsas/sas_ata.c | 2 +-
drivers/scsi/mpt3sas/mpt3sas_base.h | 12 +++
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 ++++++---
drivers/scsi/sg.c | 2 +
drivers/tty/serial/atmel_serial.c | 8 ++
drivers/usb/core/hcd.c | 7 +-
include/linux/kvm_host.h | 4 +-
kernel/padata.c | 5 +-
mm/hugetlb.c | 6 +-
net/ceph/messenger.c | 6 ++
sound/core/seq/seq_fifo.c | 4 +
sound/pci/hda/patch_realtek.c | 12 ++-
sound/soc/atmel/atmel-classd.c | 2 +-
virt/kvm/eventfd.c | 3 +-
virt/kvm/kvm_main.c | 40 ++++++---
27 files changed, 372 insertions(+), 121 deletions(-)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov <[email protected]>
commit 633ee407b9d15a75ac9740ba9d3338815e1fcb95 upstream.
sock_alloc_inode() allocates socket+inode and socket_wq with
GFP_KERNEL, which is not allowed on the writeback path:
Workqueue: ceph-msgr con_work [libceph]
ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
Call Trace:
[<ffffffff816dd629>] schedule+0x29/0x70
[<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
[<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
[<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
[<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
[<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
[<ffffffff81086335>] flush_work+0x165/0x250
[<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
[<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
[<ffffffff816d6b42>] ? __slab_free+0xee/0x234
[<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
[<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
[<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
[<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
[<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
[<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
[<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
[<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
[<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
[<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[<ffffffff811c0c18>] super_cache_scan+0x178/0x180
[<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
[<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
[<ffffffff8115af70>] shrink_slab+0x100/0x140
[<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
[<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
[<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
[<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
[<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
[<ffffffff811a0ac5>] new_slab+0x2c5/0x390
[<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
[<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
[<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
[<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
[<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
[<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
[<ffffffff811d8566>] alloc_inode+0x26/0xa0
[<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
[<ffffffff815b933e>] sock_alloc+0x1e/0x80
[<ffffffff815ba855>] __sock_create+0x95/0x220
[<ffffffff815baa04>] sock_create_kern+0x24/0x30
[<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
[<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
[<ffffffff81084c19>] process_one_work+0x159/0x4f0
[<ffffffff8108561b>] worker_thread+0x11b/0x530
[<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
[<ffffffff8108b6f9>] kthread+0xc9/0xe0
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
[<ffffffff816e1b98>] ret_from_fork+0x58/0x90
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here.
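As a rough userspace illustration of the save/restore pattern (not the
kernel implementation), the sketch below models a per-task flag word.
The names TASK_NOIO, task_flags, noio_save(), noio_restore() and
alloc_may_do_io() are hypothetical stand-ins. It shows why the saved
value is passed back to restore: a nested section must not clear a
restriction set by an outer one.

```c
#include <assert.h>

/* Hypothetical stand-in for the NOIO bit in the task's flag word. */
#define TASK_NOIO 0x1u

/* Hypothetical stand-in for the per-task flags. */
static unsigned int task_flags;

/* Set the NOIO bit and return its previous state. */
static unsigned int noio_save(void)
{
	unsigned int old = task_flags & TASK_NOIO;

	task_flags |= TASK_NOIO;
	return old;
}

/* Put the NOIO bit back to the state returned by noio_save(). */
static void noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~TASK_NOIO) | old;
}

/* Model of an allocation site: may reclaim start I/O right now? */
static int alloc_may_do_io(void)
{
	return !(task_flags & TASK_NOIO);
}
```

Calling noio_save() twice and restoring in reverse order leaves the
restriction in place until the outermost restore.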
Link: http://tracker.ceph.com/issues/19309
Reported-by: Sergey Jerusalimov <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ceph/messenger.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -7,6 +7,7 @@
#include <linux/kthread.h>
#include <linux/net.h>
#include <linux/nsproxy.h>
+#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/socket.h>
#include <linux/string.h>
@@ -478,11 +479,16 @@ static int ceph_tcp_connect(struct ceph_
{
struct sockaddr_storage *paddr = &con->peer_addr.in_addr;
struct socket *sock;
+ unsigned int noio_flag;
int ret;
BUG_ON(con->sock);
+
+ /* sock_create_kern() allocates with GFP_KERNEL */
+ noio_flag = memalloc_noio_save();
ret = sock_create_kern(read_pnet(&con->msgr->net), paddr->ss_family,
SOCK_STREAM, IPPROTO_TCP, &sock);
+ memalloc_noio_restore(noio_flag);
if (ret)
return ret;
sock->sk->sk_allocation = GFP_NOFS;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Genoud <[email protected]>
commit 31ca2c63fdc0aee725cbd4f207c1256f5deaabde upstream.
If uart_flush_buffer() is called between atmel_tx_dma() and
atmel_complete_tx_dma(), the circular buffer has been cleared, but not
atmel_port->tx_len.
That leads to a circular buffer overflow (dumping (UART_XMIT_SIZE -
atmel_port->tx_len) bytes).
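The arithmetic behind that figure can be checked with a small userspace
model. This is only a sketch: struct circ_model and its helpers are
hypothetical, and XMIT_SIZE is assumed to be 4096 (any power of two
behaves the same way). It models a flush that empties the buffer and a
DMA completion that advances the tail by the length recorded at submit
time.

```c
#include <assert.h>

/* Assumed buffer size; must be a power of two for the mask arithmetic. */
#define XMIT_SIZE 4096u

struct circ_model {
	unsigned int head;   /* producer index */
	unsigned int tail;   /* consumer index */
	unsigned int tx_len; /* bytes handed to the in-flight DMA transfer */
};

/* Bytes the driver believes are still waiting to be sent. */
static unsigned int chars_pending(const struct circ_model *c)
{
	return (c->head - c->tail) & (XMIT_SIZE - 1);
}

/* Flush: empty the buffer; optionally reset tx_len as well. */
static void flush(struct circ_model *c, int reset_tx_len)
{
	c->head = c->tail = 0;
	if (reset_tx_len)
		c->tx_len = 0;
}

/* DMA completion: advance tail by the length recorded at submit time. */
static void complete_tx(struct circ_model *c)
{
	c->tail = (c->tail + c->tx_len) & (XMIT_SIZE - 1);
	c->tx_len = 0;
}
```

With tx_len left at 100 across a flush, the completion moves tail past
head and chars_pending() reports XMIT_SIZE - 100 stale bytes; with
tx_len reset during the flush it reports 0.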
Tested-by: Nicolas Ferre <[email protected]>
Signed-off-by: Richard Genoud <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/tty/serial/atmel_serial.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -1987,6 +1987,11 @@ static void atmel_flush_buffer(struct ua
atmel_uart_writel(port, ATMEL_PDC_TCR, 0);
atmel_port->pdc_tx.ofs = 0;
}
+ /*
+ * in uart_flush_buffer(), the xmit circular buffer has just
+ * been cleared, so we have to reset tx_len accordingly.
+ */
+ atmel_port->tx_len = 0;
}
/*
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Ferre <[email protected]>
commit 497e1e16f45c70574dc9922c7f75c642c2162119 upstream.
A side effect of 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA
from transmitting in stop_tx") is that the console can be called with
TX path disabled. Then the system would hang trying to push characters
out in atmel_console_putchar().
Signed-off-by: Nicolas Ferre <[email protected]>
Fixes: 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA from transmitting in stop_tx")
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/tty/serial/atmel_serial.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -2504,6 +2504,9 @@ static void atmel_console_write(struct c
pdc_tx = atmel_uart_readl(port, ATMEL_PDC_PTSR) & ATMEL_PDC_TXTEN;
atmel_uart_writel(port, ATMEL_PDC_PTCR, ATMEL_PDC_TXTDIS);
+ /* Make sure that tx path is actually able to send characters */
+ atmel_uart_writel(port, ATMEL_US_CR, ATMEL_US_TXEN);
+
uart_console_write(port, s, count, atmel_console_putchar);
/*
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Fietkau <[email protected]>
commit 6c356eda225e3ee134ed4176b9ae3a76f793f4dd upstream.
With the IRQ stack changes integrated, the XRX200 devices started
emitting a constant stream of kernel messages like this:
[ 565.415310] Spurious IRQ: CAUSE=0x1100c300
This is caused by IP0 getting handled by plat_irq_dispatch() rather than
its vectored interrupt handler, which is fixed by commit de856416e714
("MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch").
Fix plat_irq_dispatch() to handle non-vectored IPI interrupts correctly
by setting up IP2-6 as proper chained IRQ handlers and calling do_IRQ
for all MIPS CPU interrupts.
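The dispatch loop described above, which services every pending line
starting with the highest-numbered one, can be sketched in userspace as
follows. fls_sketch() and dispatch_pending() are hypothetical names;
recording into an array stands in for the actual interrupt dispatch.

```c
#include <assert.h>

/* Portable find-last-set: 1-based index of the highest set bit, 0 if none. */
static int fls_sketch(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Record each pending line, highest first; returns how many were recorded. */
static int dispatch_pending(unsigned int pending, int *order, int max)
{
	int n = 0;

	while (pending && n < max) {
		int irq = fls_sketch(pending) - 1;

		order[n++] = irq;        /* stands in for dispatching line irq */
		pending &= ~(1u << irq); /* this line is handled, drop its bit */
	}
	return n;
}
```

For a pending mask of 0xA4 (bits 7, 5 and 2 set) the lines are recorded
in the order 7, 5, 2, and an empty mask dispatches nothing.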
Signed-off-by: Felix Fietkau <[email protected]>
Acked-by: John Crispin <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/15077/
[[email protected]: tweaked commit message]
Signed-off-by: James Hogan <[email protected]>
Signed-off-by: Amit Pundir <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/mips/lantiq/irq.c | 36 ++++++++++++++++--------------------
1 file changed, 16 insertions(+), 20 deletions(-)
--- a/arch/mips/lantiq/irq.c
+++ b/arch/mips/lantiq/irq.c
@@ -269,6 +269,11 @@ static void ltq_hw5_irqdispatch(void)
DEFINE_HWx_IRQDISPATCH(5)
#endif
+static void ltq_hw_irq_handler(struct irq_desc *desc)
+{
+ ltq_hw_irqdispatch(irq_desc_get_irq(desc) - 2);
+}
+
#ifdef CONFIG_MIPS_MT_SMP
void __init arch_init_ipiirq(int irq, struct irqaction *action)
{
@@ -313,23 +318,19 @@ static struct irqaction irq_call = {
asmlinkage void plat_irq_dispatch(void)
{
unsigned int pending = read_c0_status() & read_c0_cause() & ST0_IM;
- unsigned int i;
+ int irq;
- if ((MIPS_CPU_TIMER_IRQ == 7) && (pending & CAUSEF_IP7)) {
- do_IRQ(MIPS_CPU_TIMER_IRQ);
- goto out;
- } else {
- for (i = 0; i < MAX_IM; i++) {
- if (pending & (CAUSEF_IP2 << i)) {
- ltq_hw_irqdispatch(i);
- goto out;
- }
- }
+ if (!pending) {
+ spurious_interrupt();
+ return;
}
- pr_alert("Spurious IRQ: CAUSE=0x%08x\n", read_c0_status());
-out:
- return;
+ pending >>= CAUSEB_IP;
+ while (pending) {
+ irq = fls(pending) - 1;
+ do_IRQ(MIPS_CPU_IRQ_BASE + irq);
+ pending &= ~BIT(irq);
+ }
}
static int icu_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t hw)
@@ -354,11 +355,6 @@ static const struct irq_domain_ops irq_d
.map = icu_map,
};
-static struct irqaction cascade = {
- .handler = no_action,
- .name = "cascade",
-};
-
int __init icu_of_init(struct device_node *node, struct device_node *parent)
{
struct device_node *eiu_node;
@@ -390,7 +386,7 @@ int __init icu_of_init(struct device_nod
mips_cpu_irq_init();
for (i = 0; i < MAX_IM; i++)
- setup_irq(i + 2, &cascade);
+ irq_set_chained_handler(i + 2, ltq_hw_irq_handler);
if (cpu_has_vint) {
pr_info("Setting up vectored interrupts\n");
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naoya Horiguchi <[email protected]>
commit c9d398fa237882ea07167e23bcfc5e6847066518 upstream.
I found a race condition that triggers the following bug when
move_pages() and soft offline are called on a single hugetlb page
concurrently.
Soft offlining page 0x119400 at 0x700000000000
BUG: unable to handle kernel paging request at ffffea0011943820
IP: follow_huge_pmd+0x143/0x190
PGD 7ffd2067
PUD 7ffd1067
PMD 0
[61163.582052] Oops: 0000 [#1] SMP
Modules linked in: binfmt_misc ppdev virtio_balloon parport_pc pcspkr i2c_piix4 parport i2c_core acpi_cpufreq ip_tables xfs libcrc32c ata_generic pata_acpi virtio_blk 8139too crc32c_intel ata_piix serio_raw libata virtio_pci 8139cp virtio_ring virtio mii floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: cap_check]
CPU: 0 PID: 22573 Comm: iterate_numa_mo Tainted: P OE 4.11.0-rc2-mm1+ #2
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:follow_huge_pmd+0x143/0x190
RSP: 0018:ffffc90004bdbcd0 EFLAGS: 00010202
RAX: 0000000465003e80 RBX: ffffea0004e34d30 RCX: 00003ffffffff000
RDX: 0000000011943800 RSI: 0000000000080001 RDI: 0000000465003e80
RBP: ffffc90004bdbd18 R08: 0000000000000000 R09: ffff880138d34000
R10: ffffea0004650000 R11: 0000000000c363b0 R12: ffffea0011943800
R13: ffff8801b8d34000 R14: ffffea0000000000 R15: 000077ff80000000
FS: 00007fc977710740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffea0011943820 CR3: 000000007a746000 CR4: 00000000001406f0
Call Trace:
follow_page_mask+0x270/0x550
SYSC_move_pages+0x4ea/0x8f0
SyS_move_pages+0xe/0x10
do_syscall_64+0x67/0x180
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7fc976e03949
RSP: 002b:00007ffe72221d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000117
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc976e03949
RDX: 0000000000c22390 RSI: 0000000000001400 RDI: 0000000000005827
RBP: 00007ffe72221e00 R08: 0000000000c2c3a0 R09: 0000000000000004
R10: 0000000000c363b0 R11: 0000000000000246 R12: 0000000000400650
R13: 00007ffe72221ee0 R14: 0000000000000000 R15: 0000000000000000
Code: 81 e4 ff ff 1f 00 48 21 c2 49 c1 ec 0c 48 c1 ea 0c 4c 01 e2 49 bc 00 00 00 00 00 ea ff ff 48 c1 e2 06 49 01 d4 f6 45 bc 04 74 90 <49> 8b 7c 24 20 40 f6 c7 01 75 2b 4c 89 e7 8b 47 1c 85 c0 7e 2a
RIP: follow_huge_pmd+0x143/0x190 RSP: ffffc90004bdbcd0
CR2: ffffea0011943820
---[ end trace e4f81353a2d23232 ]---
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
This bug is triggered when pmd_present() returns true for non-present
hugetlb, so fixing the present check in follow_huge_pmd() prevents it.
Using pmd_present() to determine present/non-present for hugetlb is not
correct, because pmd_present() checks multiple bits (not only
_PAGE_PRESENT) for historical reasons and it can misjudge hugetlb state.
Fixes: e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Acked-by: Hillf Danton <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/hugetlb.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4362,6 +4362,7 @@ follow_huge_pmd(struct mm_struct *mm, un
{
struct page *page = NULL;
spinlock_t *ptl;
+ pte_t pte;
retry:
ptl = pmd_lockptr(mm, pmd);
spin_lock(ptl);
@@ -4371,12 +4372,13 @@ retry:
*/
if (!pmd_huge(*pmd))
goto out;
- if (pmd_present(*pmd)) {
+ pte = huge_ptep_get((pte_t *)pmd);
+ if (pte_present(pte)) {
page = pmd_page(*pmd) + ((address & ~PMD_MASK) >> PAGE_SHIFT);
if (flags & FOLL_GET)
get_page(page);
} else {
- if (is_hugetlb_entry_migration(huge_ptep_get((pte_t *)pmd))) {
+ if (is_hugetlb_entry_migration(pte)) {
spin_unlock(ptl);
__migration_entry_wait(mm, (pte_t *)pmd, ptl);
goto retry;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Garry <[email protected]>
commit 9702c67c6066f583b629cf037d2056245bb7a8e6 upstream.
The total ata xfer length may not be calculated properly, in that we do
not use the proper method to get an sg element dma length.
According to the code comment, sg_dma_len() should be used after
dma_map_sg() is called.
This issue was found by turning on the SMMUv3 in front of the hisi_sas
controller in hip07. Multiple sg elements were being combined into a
single element, but the original first element length was being use as
the total xfer length.
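A simplified model of the problem, assuming a mapping step that merges
three 4096-byte elements into one: struct sg_model and both helpers are
hypothetical and only mirror the idea of a CPU-side length and a
device-side DMA length per element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified scatterlist entry. */
struct sg_model {
	unsigned int length;     /* CPU-side length, set before mapping */
	unsigned int dma_length; /* device-side length, set by the mapping */
};

/* Sum the CPU-side lengths over the mapped elements (the old behaviour). */
static unsigned int sum_cpu_len(const struct sg_model *sg, size_t mapped)
{
	unsigned int xfer = 0;
	size_t i;

	for (i = 0; i < mapped; i++)
		xfer += sg[i].length;
	return xfer;
}

/* Sum the DMA lengths over the mapped elements (the fixed behaviour). */
static unsigned int sum_dma_len(const struct sg_model *sg, size_t mapped)
{
	unsigned int xfer = 0;
	size_t i;

	for (i = 0; i < mapped; i++)
		xfer += sg[i].dma_length;
	return xfer;
}
```

With one mapped element whose DMA length is 12288, the CPU-side sum
reports only 4096 bytes while the DMA-side sum reports the full 12288.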
Fixes: ff2aeb1eb64c8a4770a6 ("libata: convert to chained sg")
Signed-off-by: John Garry <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/libsas/sas_ata.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -218,7 +218,7 @@ static unsigned int sas_ata_qc_issue(str
task->num_scatter = qc->n_elem;
} else {
for_each_sg(qc->sg, sg, qc->n_elem, si)
- xfer += sg->length;
+ xfer += sg_dma_len(sg);
task->total_xfer_len = xfer;
task->num_scatter = si;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Bottomley <[email protected]>
commit ffb58456589443ca572221fabbdef3db8483a779 upstream.
mpt3sas has a firmware failure where it can only handle one pass through
ATA command at a time. If another comes in, contrary to the SAT
standard, it will hang until the first one completes (causing long
commands like secure erase to timeout). The original fix was to block
the device when an ATA command came in, but this caused a regression
with
commit 669f044170d8933c3d66d231b69ea97cb8447338
Author: Bart Van Assche <[email protected]>
Date: Tue Nov 22 16:17:13 2016 -0800
scsi: srp_transport: Move queuecommand() wait code to SCSI core
So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends. The original patch
also had a concurrency problem since scsih_qcmd is lockless at that
point (this is fixed by using atomic bitops to set and test the flag).
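The atomic set-and-test idea can be sketched in userspace with C11
atomics. This is an illustration only: struct dev_model,
claim_passthrough() and release_passthrough() are hypothetical, and the
sketch models just the claim/release of the single passthrough slot,
not the BUSY completion path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device state: bit 0 set while a passthrough is pending. */
struct dev_model {
	atomic_ulong ata_pending;
};

/*
 * Atomically set bit 0 and look at its previous value.
 * Returns true if the caller claimed the slot, false if a passthrough
 * command was already pending.
 */
static bool claim_passthrough(struct dev_model *d)
{
	return (atomic_fetch_or(&d->ata_pending, 1ul) & 1ul) == 0;
}

/* Clear bit 0 once the passthrough command has completed. */
static void release_passthrough(struct dev_model *d)
{
	atomic_fetch_and(&d->ata_pending, ~1ul);
}
```

Because the set and the test happen in one atomic step, two submitters
racing without a lock cannot both see the slot as free.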
[mkp: addressed feedback wrt. test_bit and fixed whitespace]
Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination)
Signed-off-by: James Bottomley <[email protected]>
Acked-by: Sreekanth Reddy <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reported-by: Ingo Molnar <[email protected]>
Tested-by: Ingo Molnar <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Cc: Joe Korty <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/mpt3sas/mpt3sas_base.h | 12 ++++++++++
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 ++++++++++++++++++++++-------------
2 files changed, 38 insertions(+), 14 deletions(-)
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -390,6 +390,7 @@ struct MPT3SAS_TARGET {
* @eedp_enable: eedp support enable bit
* @eedp_type: 0(type_1), 1(type_2), 2(type_3)
* @eedp_block_length: block size
+ * @ata_command_pending: SATL passthrough outstanding for device
*/
struct MPT3SAS_DEVICE {
struct MPT3SAS_TARGET *sas_target;
@@ -398,6 +399,17 @@ struct MPT3SAS_DEVICE {
u8 configured_lun;
u8 block;
u8 tlr_snoop_check;
+ /*
+ * Bug workaround for SATL handling: the mpt2/3sas firmware
+ * doesn't return BUSY or TASK_SET_FULL for subsequent
+ * commands while a SATL pass through is in operation as the
+ * spec requires, it simply does nothing with them until the
+ * pass through completes, causing them possibly to timeout if
+ * the passthrough is a long executing command (like format or
+ * secure erase). This variable allows us to do the right
+ * thing while a SATL command is pending.
+ */
+ unsigned long ata_command_pending;
};
#define MPT3_CMD_NOT_USED 0x8000 /* free */
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3707,9 +3707,18 @@ _scsih_temp_threshold_events(struct MPT3
}
}
-static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending)
{
- return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+ struct MPT3SAS_DEVICE *priv = scmd->device->hostdata;
+
+ if (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16)
+ return 0;
+
+ if (pending)
+ return test_and_set_bit(0, &priv->ata_command_pending);
+
+ clear_bit(0, &priv->ata_command_pending);
+ return 0;
}
/**
@@ -3733,9 +3742,7 @@ _scsih_flush_running_cmds(struct MPT3SAS
if (!scmd)
continue;
count++;
- if (ata_12_16_cmd(scmd))
- scsi_internal_device_unblock(scmd->device,
- SDEV_RUNNING);
+ _scsih_set_satl_pending(scmd, false);
mpt3sas_base_free_smid(ioc, smid);
scsi_dma_unmap(scmd);
if (ioc->pci_error_recovery)
@@ -3866,13 +3873,6 @@ scsih_qcmd(struct Scsi_Host *shost, stru
if (ioc->logging_level & MPT_DEBUG_SCSI)
scsi_print_command(scmd);
- /*
- * Lock the device for any subsequent command until command is
- * done.
- */
- if (ata_12_16_cmd(scmd))
- scsi_internal_device_block(scmd->device);
-
sas_device_priv_data = scmd->device->hostdata;
if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
scmd->result = DID_NO_CONNECT << 16;
@@ -3886,6 +3886,19 @@ scsih_qcmd(struct Scsi_Host *shost, stru
return 0;
}
+ /*
+ * Bug work around for firmware SATL handling. The loop
+ * is based on atomic operations and ensures consistency
+ * since we're lockless at this point
+ */
+ do {
+ if (test_bit(0, &sas_device_priv_data->ata_command_pending)) {
+ scmd->result = SAM_STAT_BUSY;
+ scmd->scsi_done(scmd);
+ return 0;
+ }
+ } while (_scsih_set_satl_pending(scmd, true));
+
sas_target_priv_data = sas_device_priv_data->sas_target;
/* invalid device handle */
@@ -4445,8 +4458,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *i
if (scmd == NULL)
return 1;
- if (ata_12_16_cmd(scmd))
- scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+ _scsih_set_satl_pending(scmd, false);
mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf <[email protected]>
commit 61b79e16c68d703dde58c25d3935d67210b7d71b upstream.
Paul Menzel reported a warning:
WARNING: CPU: 0 PID: 774 at /build/linux-ROBWaj/linux-4.9.13/kernel/trace/trace_functions_graph.c:233 ftrace_return_to_handler+0x1aa/0x1e0
Bad frame pointer: expected f6919d98, received f6919db0
from func acpi_pm_device_sleep_wake return to c43b6f9d
The warning means that function graph tracing is broken for the
acpi_pm_device_sleep_wake() function. That's because the ACPI Makefile
unconditionally sets the '-Os' gcc flag to optimize for size. That's an
issue because mcount-based function graph tracing is incompatible with
'-Os' on x86, thanks to the following gcc bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109
I have another patch pending which will ensure that mcount-based
function graph tracing is never used with CONFIG_CC_OPTIMIZE_FOR_SIZE on
x86.
But this patch is needed in addition to that one because the ACPI
Makefile overrides that config option for no apparent reason. It has
had this flag since the beginning of git history, and there's no related
comment, so I don't know why it's there. As far as I can tell, there's
no reason for it to be there. The appropriate behavior is for it to
honor CONFIG_CC_OPTIMIZE_FOR_{SIZE,PERFORMANCE} like the rest of the
kernel.
Reported-by: Paul Menzel <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Acked-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/acpi/Makefile | 1 -
1 file changed, 1 deletion(-)
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -2,7 +2,6 @@
# Makefile for the Linux ACPI interpreter
#
-ccflags-y := -Os
ccflags-$(CONFIG_ACPI_DEBUG) += -DACPI_DEBUG_OUTPUT
#
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Songjun Wu <[email protected]>
commit cd3ac9affc43b44f49d7af70d275f0bd426ba643 upstream.
Fix the audio clock rate according to the datasheet.
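For reference, the arithmetic of the two constants can be checked in
isolation. The macro names below are shortened, hypothetical copies of
the driver's definitions; aclk_error_hz() is added only for this
illustration.

```c
#include <assert.h>

/* 11.2896 MHz times 8 */
#define ACLK_11M2896_MPY_8      (112896 * 100 * 8)
/* Old value: 12.228 MHz times 8 (digits transposed) */
#define ACLK_12M288_MPY_8_OLD   (12228 * 1000 * 8)
/* Fixed value: 12.288 MHz times 8 */
#define ACLK_12M288_MPY_8_FIXED (12288 * 1000 * 8)

/* How far off the old constant was, in Hz. */
static int aclk_error_hz(void)
{
	return ACLK_12M288_MPY_8_FIXED - ACLK_12M288_MPY_8_OLD;
}
```

The old constant evaluates to 97824000 and the fixed one to 98304000,
a difference of 480000 Hz.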
Reported-by: Dushara Jayasinghe <[email protected]>
Signed-off-by: Songjun Wu <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/soc/atmel/atmel-classd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/soc/atmel/atmel-classd.c
+++ b/sound/soc/atmel/atmel-classd.c
@@ -343,7 +343,7 @@ static int atmel_classd_codec_dai_digita
}
#define CLASSD_ACLK_RATE_11M2896_MPY_8 (112896 * 100 * 8)
-#define CLASSD_ACLK_RATE_12M288_MPY_8 (12228 * 1000 * 8)
+#define CLASSD_ACLK_RATE_12M288_MPY_8 (12288 * 1000 * 8)
static struct {
int rate;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai <[email protected]>
commit 2d7d54002e396c180db0c800c1046f0a3c471597 upstream.
When a new event is queued while processing to resize the FIFO in
snd_seq_fifo_clear(), it may lead to a use-after-free, as the old pool
that is being queued gets removed. To avoid this race, we need to
close the pool to be deleted and sync its usage before actually
deleting it.
The issue was spotted by syzkaller.
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/core/seq/seq_fifo.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/sound/core/seq/seq_fifo.c
+++ b/sound/core/seq/seq_fifo.c
@@ -265,6 +265,10 @@ int snd_seq_fifo_resize(struct snd_seq_f
/* NOTE: overflow flag is not cleared */
spin_unlock_irqrestore(&f->lock, flags);
+ /* close the old pool and wait until all users are gone */
+ snd_seq_pool_mark_closing(oldpool);
+ snd_use_lock_sync(&f->use_lock);
+
/* release cells in old pool */
for (cell = oldhead; cell; cell = next) {
next = cell->next;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hui Wang <[email protected]>
commit 2f726aec19a9d2c63bec9a8a53a3910ffdcd09f8 upstream.
On this Dell AIO machine, the lineout jack does not work.
We found that pin 0x1a is assigned to lineout on this machine. In
the past, we applied ALC298_FIXUP_DELL1_MIC_NO_PRESENCE to fix the
headset mic problem for this machine. That fixup redefines
pin 0x1a as headphone-mic; as a result, the lineout doesn't
work anymore.
After consulting with Dell, they told us this machine doesn't support
microphone via headset jack, so we add a new fixup which only defines
the pin 0x18 as the headset-mic.
[rearranged the fixup insertion position by tiwai in order to make the
merge with other branches easier -- tiwai]
Fixes: 59ec4b57bcae ("ALSA: hda - Fix headset mic detection problem for two dell machines")
Signed-off-by: Hui Wang <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/pci/hda/patch_realtek.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -4831,6 +4831,7 @@ enum {
ALC292_FIXUP_DISABLE_AAMIX,
ALC293_FIXUP_DISABLE_AAMIX_MULTIJACK,
ALC298_FIXUP_DELL1_MIC_NO_PRESENCE,
+ ALC298_FIXUP_DELL_AIO_MIC_NO_PRESENCE,
ALC275_FIXUP_DELL_XPS,
ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE,
ALC293_FIXUP_LENOVO_SPK_NOISE,
@@ -5429,6 +5430,15 @@ static const struct hda_fixup alc269_fix
.chained = true,
.chain_id = ALC269_FIXUP_HEADSET_MODE
},
+ [ALC298_FIXUP_DELL_AIO_MIC_NO_PRESENCE] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x18, 0x01a1913c }, /* use as headset mic, without its own jack detect */
+ { }
+ },
+ .chained = true,
+ .chain_id = ALC269_FIXUP_HEADSET_MODE
+ },
[ALC275_FIXUP_DELL_XPS] = {
.type = HDA_FIXUP_VERBS,
.v.verbs = (const struct hda_verb[]) {
@@ -5501,7 +5511,7 @@ static const struct hda_fixup alc269_fix
.type = HDA_FIXUP_FUNC,
.v.func = alc298_fixup_speaker_volume,
.chained = true,
- .chain_id = ALC298_FIXUP_DELL1_MIC_NO_PRESENCE,
+ .chain_id = ALC298_FIXUP_DELL_AIO_MIC_NO_PRESENCE,
},
[ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER] = {
.type = HDA_FIXUP_PINS,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: peter chang <[email protected]>
commit bf33f87dd04c371ea33feb821b60d63d754e3124 upstream.
The user can control the size of the next command passed along, but the
value passed to the ioctl isn't checked against the usable max command
size.
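A minimal userspace sketch of the added bounds check follows.
MAX_CDB_SIZE is assumed here to be 252 and set_next_cmd_len() is a
hypothetical helper; only the shape of the check (reject values above
the maximum, clamp negative values to 0) is taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Assumed maximum command size for this illustration. */
#define MAX_CDB_SIZE 252

/*
 * Validate a user-supplied command length before storing it.
 * Returns 0 on success or -ENOMEM if the value exceeds the maximum.
 */
static int set_next_cmd_len(int val, unsigned char *next_cmd_len)
{
	if (val > MAX_CDB_SIZE)
		return -ENOMEM;
	*next_cmd_len = (val > 0) ? (unsigned char)val : 0;
	return 0;
}
```

Without the first check, an oversized value would be stored and later
used as a copy length for the next command.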
Signed-off-by: Peter Chang <[email protected]>
Acked-by: Douglas Gilbert <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/sg.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1008,6 +1008,8 @@ sg_ioctl(struct file *filp, unsigned int
result = get_user(val, ip);
if (result)
return result;
+ if (val > SG_MAX_CDB_SIZE)
+ return -ENOMEM;
sfp->next_cmd_len = (val > 0) ? val : 0;
return 0;
case SG_GET_VERSION_NUM:
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason A. Donenfeld <[email protected]>
commit de5540d088fe97ad583cc7d396586437b32149a5 upstream.
Under extremely heavy use of padata, crashes occur, and with list
debugging turned on, this happens instead:
[87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
__list_add+0xae/0x130
[87487.301868] list_add corruption. prev->next should be next
(ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
[87487.339011] [<ffffffff9a53d075>] dump_stack+0x68/0xa3
[87487.342198] [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
[87487.345364] [<ffffffff99d6b91f>] __warn+0xff/0x140
[87487.348513] [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
[87487.351659] [<ffffffff9a58b5de>] __list_add+0xae/0x130
[87487.354772] [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
[87487.357915] [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
[87487.361084] [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
padata_reorder calls list_add_tail with the list to which it's adding
locked, which seems correct:
spin_lock(&squeue->serial.lock);
list_add_tail(&padata->list, &squeue->serial.list);
spin_unlock(&squeue->serial.lock);
This therefore leaves only one place where such inconsistency could occur:
if padata->list is added at the same time on two different threads.
This padata pointer comes from the function call to
padata_get_next(pd), which has in it the following block:
next_queue = per_cpu_ptr(pd->pqueue, cpu);
padata = NULL;
reorder = &next_queue->reorder;
if (!list_empty(&reorder->list)) {
padata = list_entry(reorder->list.next,
struct padata_priv, list);
spin_lock(&reorder->lock);
list_del_init(&padata->list);
atomic_dec(&pd->reorder_objects);
spin_unlock(&reorder->lock);
pd->processed++;
goto out;
}
out:
return padata;
I strongly suspect that the problem here is that two threads can race
on the reorder list. Even though the deletion is locked, the call to
list_entry is not locked, which means it's feasible that two threads
pick up the same padata object and subsequently call list_add_tail on
it at the same time. The fix is thus to hoist that lock outside of
that block.
Signed-off-by: Jason A. Donenfeld <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/padata.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -189,19 +189,20 @@ static struct padata_priv *padata_get_ne
reorder = &next_queue->reorder;
+ spin_lock(&reorder->lock);
if (!list_empty(&reorder->list)) {
padata = list_entry(reorder->list.next,
struct padata_priv, list);
- spin_lock(&reorder->lock);
list_del_init(&padata->list);
atomic_dec(&pd->reorder_objects);
- spin_unlock(&reorder->lock);
pd->processed++;
+ spin_unlock(&reorder->lock);
goto out;
}
+ spin_unlock(&reorder->lock);
if (__this_cpu_read(pd->pqueue->cpu_index) == next_queue->cpu_index) {
padata = ERR_PTR(-ENODATA);
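
The locking rule behind this fix (take the lock before looking at the list head, not only around the removal) can be sketched outside the kernel. This is a minimal model, assuming a C11 atomic_flag spinlock in place of the kernel spinlock; every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_item {
	struct demo_item *next;
	int seq;
};

struct demo_queue {
	atomic_flag lock;		/* stand-in for the kernel spinlock */
	struct demo_item *head;
	struct demo_item *tail;
};

void demo_queue_init(struct demo_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = q->tail = NULL;
}

static void demo_lock(struct demo_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void demo_unlock(struct demo_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

void demo_queue_push(struct demo_queue *q, struct demo_item *it)
{
	demo_lock(q);
	it->next = NULL;
	if (q->tail)
		q->tail->next = it;
	else
		q->head = it;
	q->tail = it;
	demo_unlock(q);
}

/*
 * The emptiness test, the read of the head and the unlink all happen
 * under one lock hold, so two callers cannot pick up the same item.
 */
struct demo_item *demo_queue_pop(struct demo_queue *q)
{
	struct demo_item *it;

	demo_lock(q);
	it = q->head;
	if (it) {
		q->head = it->next;
		if (!q->head)
			q->tail = NULL;
		it->next = NULL;
	}
	demo_unlock(q);
	return it;
}
```

If the lock in demo_queue_pop() were taken only around the unlink, two threads could both read the same head first, which is the situation the commit message suspects.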
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown <[email protected]>
commit f5fe1b51905df7cfe4fdfd85c5fb7bc5b71a094f upstream.
Commit 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
changed current->bio_list so that it did not contain *all* of the
queued bios, but only those submitted by the currently running
make_request_fn.
There are two places which walk the list and requeue selected bios,
and others that check if the list is empty. These are no longer
correct.
So redefine current->bio_list to point to an array of two lists, which
contain all queued bios, and adjust various code to test or walk both
lists.
Signed-off-by: NeilBrown <[email protected]>
Fixes: 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
Signed-off-by: Jens Axboe <[email protected]>
[jwang: backport to 4.4]
Signed-off-by: Jack Wang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
block/bio.c | 12 +++++++++---
block/blk-core.c | 31 +++++++++++++++++++------------
drivers/md/raid1.c | 3 ++-
drivers/md/raid10.c | 3 ++-
4 files changed, 32 insertions(+), 17 deletions(-)
--- a/block/bio.c
+++ b/block/bio.c
@@ -373,10 +373,14 @@ static void punt_bios_to_rescuer(struct
bio_list_init(&punt);
bio_list_init(&nopunt);
- while ((bio = bio_list_pop(current->bio_list)))
+ while ((bio = bio_list_pop(&current->bio_list[0])))
bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
+ current->bio_list[0] = nopunt;
- *current->bio_list = nopunt;
+ bio_list_init(&nopunt);
+ while ((bio = bio_list_pop(&current->bio_list[1])))
+ bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
+ current->bio_list[1] = nopunt;
spin_lock(&bs->rescue_lock);
bio_list_merge(&bs->rescue_list, &punt);
@@ -464,7 +468,9 @@ struct bio *bio_alloc_bioset(gfp_t gfp_m
* we retry with the original gfp_flags.
*/
- if (current->bio_list && !bio_list_empty(current->bio_list))
+ if (current->bio_list &&
+ (!bio_list_empty(&current->bio_list[0]) ||
+ !bio_list_empty(&current->bio_list[1])))
gfp_mask &= ~__GFP_DIRECT_RECLAIM;
p = mempool_alloc(bs->bio_pool, gfp_mask);
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2021,7 +2021,14 @@ end_io:
*/
blk_qc_t generic_make_request(struct bio *bio)
{
- struct bio_list bio_list_on_stack;
+ /*
+ * bio_list_on_stack[0] contains bios submitted by the current
+ * make_request_fn.
+ * bio_list_on_stack[1] contains bios that were submitted before
+ * the current make_request_fn, but that haven't been processed
+ * yet.
+ */
+ struct bio_list bio_list_on_stack[2];
blk_qc_t ret = BLK_QC_T_NONE;
if (!generic_make_request_checks(bio))
@@ -2038,7 +2045,7 @@ blk_qc_t generic_make_request(struct bio
* should be added at the tail
*/
if (current->bio_list) {
- bio_list_add(current->bio_list, bio);
+ bio_list_add(&current->bio_list[0], bio);
goto out;
}
@@ -2057,17 +2064,17 @@ blk_qc_t generic_make_request(struct bio
* bio_list, and call into ->make_request() again.
*/
BUG_ON(bio->bi_next);
- bio_list_init(&bio_list_on_stack);
- current->bio_list = &bio_list_on_stack;
+ bio_list_init(&bio_list_on_stack[0]);
+ current->bio_list = bio_list_on_stack;
do {
struct request_queue *q = bdev_get_queue(bio->bi_bdev);
if (likely(blk_queue_enter(q, __GFP_DIRECT_RECLAIM) == 0)) {
- struct bio_list lower, same, hold;
+ struct bio_list lower, same;
/* Create a fresh bio_list for all subordinate requests */
- hold = bio_list_on_stack;
- bio_list_init(&bio_list_on_stack);
+ bio_list_on_stack[1] = bio_list_on_stack[0];
+ bio_list_init(&bio_list_on_stack[0]);
ret = q->make_request_fn(q, bio);
@@ -2077,19 +2084,19 @@ blk_qc_t generic_make_request(struct bio
*/
bio_list_init(&lower);
bio_list_init(&same);
- while ((bio = bio_list_pop(&bio_list_on_stack)) != NULL)
+ while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
if (q == bdev_get_queue(bio->bi_bdev))
bio_list_add(&same, bio);
else
bio_list_add(&lower, bio);
/* now assemble so we handle the lowest level first */
- bio_list_merge(&bio_list_on_stack, &lower);
- bio_list_merge(&bio_list_on_stack, &same);
- bio_list_merge(&bio_list_on_stack, &hold);
+ bio_list_merge(&bio_list_on_stack[0], &lower);
+ bio_list_merge(&bio_list_on_stack[0], &same);
+ bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
} else {
bio_io_error(bio);
}
- bio = bio_list_pop(current->bio_list);
+ bio = bio_list_pop(&bio_list_on_stack[0]);
} while (bio);
current->bio_list = NULL; /* deactivate */
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -877,7 +877,8 @@ static sector_t wait_barrier(struct r1co
((conf->start_next_window <
conf->next_resync + RESYNC_SECTORS) &&
current->bio_list &&
- !bio_list_empty(current->bio_list))),
+ (!bio_list_empty(&current->bio_list[0]) ||
+ !bio_list_empty(&current->bio_list[1])))),
conf->resync_lock);
conf->nr_waiting--;
}
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -946,7 +946,8 @@ static void wait_barrier(struct r10conf
!conf->barrier ||
(conf->nr_pending &&
current->bio_list &&
- !bio_list_empty(current->bio_list)),
+ (!bio_list_empty(&current->bio_list[0]) ||
+ !bio_list_empty(&current->bio_list[1]))),
conf->resync_lock);
conf->nr_waiting--;
}
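
The repeated "is anything queued?" test in this patch can be read as one predicate over the two-element array. The following is a user-space sketch only; demo_list and demo_has_queued_bios() are hypothetical names, and a NULL pointer is assumed to model a task that is not inside generic_make_request().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_bio {
	struct demo_bio *next;
};

struct demo_list {
	struct demo_bio *head, *tail;
};

bool demo_list_empty(const struct demo_list *l)
{
	return l->head == NULL;
}

/*
 * lists points at an array of two lists: [0] holds bios from the current
 * make_request_fn, [1] holds bios that were already pending. Both must be
 * checked to know whether anything is still queued.
 */
bool demo_has_queued_bios(const struct demo_list *lists)
{
	return lists &&
	       (!demo_list_empty(&lists[0]) || !demo_list_empty(&lists[1]));
}
```

Checking only lists[0] would miss bios parked in lists[1], which is the gap the commit message describes.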
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown <[email protected]>
commit 79bd99596b7305ab08109a8bf44a6a4511dbf1cd upstream.
To avoid recursion on the kernel stack when stacked block devices
are in use, generic_make_request() will, when called recursively,
queue new requests for later handling. They will be handled when the
make_request_fn for the current bio completes.
If any bios are submitted by a make_request_fn, these will ultimately
be handled sequentially. If the handling of one of those generates
further requests, they will be added to the end of the queue.
This strict first-in-first-out behaviour can lead to deadlocks in
various ways, normally because a request might need to wait for a
previous request to the same device to complete. This can happen when
they share a mempool, and can happen due to interdependencies
particular to the device. Both md and dm have examples where this happens.
These deadlocks can be eradicated by more selective ordering of bios.
Specifically by handling them in depth-first order. That is: when the
handling of one bio generates one or more further bios, they are
handled immediately after the parent, before any siblings of the
parent. That way, when generic_make_request() calls make_request_fn
for some particular device, we can be certain that all previously
submitted requests for that device have been completely handled and are
not waiting for anything in the queue of requests maintained in
generic_make_request().
An easy way to achieve this would be to use a last-in-first-out stack
instead of a queue. However this will change the order of consecutive
bios submitted by a make_request_fn, which could have unexpected consequences.
Instead we take a slightly more complex approach.
A fresh queue is created for each call to a make_request_fn. After it completes,
any bios for a different device are placed on the front of the main queue, followed
by any bios for the same device, followed by all bios that were already on
the queue before the make_request_fn was called.
This provides the depth-first approach without reordering bios on the same level.
This, by itself, is not enough to remove all deadlocks. It just makes
it possible for drivers to take the extra step required themselves.
To avoid deadlocks, drivers must never risk waiting for a request
after submitting one to generic_make_request. This includes never
allocating from a mempool twice in a single call to a make_request_fn.
A common pattern in drivers is to call bio_split() in a loop, handling
the first part and then looping around to possibly split the next part.
Instead, a driver that finds it needs to split a bio should queue
(with generic_make_request) the second part, handle the first part,
and then return. The new code in generic_make_request will ensure the
requests to underlying bios are processed first, then the second bio
that was split off. If it splits again, the same process happens. In
each case one bio will be completely handled before the next one is attempted.
With this in place, it should be possible to disable the
punt_bios_to_rescuer() recovery thread for many block devices, and
eventually it may be possible to remove it completely.
Ref: http://www.spinics.net/lists/raid/msg54680.html
Tested-by: Jinpu Wang <[email protected]>
Inspired-by: Lars Ellenberg <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
[jwang: backport to 4.4]
Signed-off-by: Jack Wang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
block/blk-core.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2063,18 +2063,33 @@ blk_qc_t generic_make_request(struct bio
struct request_queue *q = bdev_get_queue(bio->bi_bdev);
if (likely(blk_queue_enter(q, __GFP_DIRECT_RECLAIM) == 0)) {
+ struct bio_list lower, same, hold;
+
+ /* Create a fresh bio_list for all subordinate requests */
+ hold = bio_list_on_stack;
+ bio_list_init(&bio_list_on_stack);
ret = q->make_request_fn(q, bio);
blk_queue_exit(q);
-
- bio = bio_list_pop(current->bio_list);
+ /* sort new bios into those for a lower level
+ * and those for the same level
+ */
+ bio_list_init(&lower);
+ bio_list_init(&same);
+ while ((bio = bio_list_pop(&bio_list_on_stack)) != NULL)
+ if (q == bdev_get_queue(bio->bi_bdev))
+ bio_list_add(&same, bio);
+ else
+ bio_list_add(&lower, bio);
+ /* now assemble so we handle the lowest level first */
+ bio_list_merge(&bio_list_on_stack, &lower);
+ bio_list_merge(&bio_list_on_stack, &same);
+ bio_list_merge(&bio_list_on_stack, &hold);
} else {
- struct bio *bio_next = bio_list_pop(current->bio_list);
-
bio_io_error(bio);
- bio = bio_next;
}
+ bio = bio_list_pop(current->bio_list);
} while (bio);
current->bio_list = NULL; /* deactivate */
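
The reassembly order described in the commit message (bios for a lower device first, then bios for the same device, then whatever was already pending) can be modelled with a small user-space list. This is a sketch, not the kernel code: the demo_* types and helpers are hypothetical, an integer dev field stands in for the request queue comparison, and unlike the kernel helper this merge also empties its source list.

```c
#include <assert.h>
#include <stddef.h>

struct demo_bio {
	struct demo_bio *next;
	int dev;	/* hypothetical device number the bio targets */
	int id;
};

struct demo_list {
	struct demo_bio *head, *tail;
};

void demo_list_init(struct demo_list *l)
{
	l->head = l->tail = NULL;
}

void demo_list_add(struct demo_list *l, struct demo_bio *b)
{
	b->next = NULL;
	if (l->tail)
		l->tail->next = b;
	else
		l->head = b;
	l->tail = b;
}

struct demo_bio *demo_list_pop(struct demo_list *l)
{
	struct demo_bio *b = l->head;

	if (b) {
		l->head = b->next;
		if (!l->head)
			l->tail = NULL;
		b->next = NULL;
	}
	return b;
}

/* Append src to dst and leave src empty. */
void demo_list_merge(struct demo_list *dst, struct demo_list *src)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	demo_list_init(src);
}

/*
 * fresh: bios submitted by the make_request_fn that just ran for cur_dev.
 * hold:  bios that were pending before it ran.
 * Afterwards fresh holds: lower-device bios, same-device bios, then hold.
 */
void demo_reorder(struct demo_list *fresh, struct demo_list *hold, int cur_dev)
{
	struct demo_list lower, same;
	struct demo_bio *b;

	demo_list_init(&lower);
	demo_list_init(&same);
	while ((b = demo_list_pop(fresh)) != NULL)
		demo_list_add(b->dev == cur_dev ? &same : &lower, b);

	demo_list_merge(fresh, &lower);
	demo_list_merge(fresh, &same);
	demo_list_merge(fresh, hold);
}
```

Within each group the submission order is preserved, which is why this differs from simply switching to a last-in-first-out stack.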
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexandre Belloni <[email protected]>
commit 0b0408745e7ff24757cbfd571d69026c0ddb803c upstream.
LPDDR memories can only handle up to 400 uncontrolled power-offs. Ensure the
proper power off sequence is used before shutting down the platform.
Signed-off-by: Alexandre Belloni <[email protected]>
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/power/reset/at91-poweroff.c | 54 +++++++++++++++++++++++++++++++++++-
1 file changed, 53 insertions(+), 1 deletion(-)
--- a/drivers/power/reset/at91-poweroff.c
+++ b/drivers/power/reset/at91-poweroff.c
@@ -14,9 +14,12 @@
#include <linux/io.h>
#include <linux/module.h>
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/platform_device.h>
#include <linux/printk.h>
+#include <soc/at91/at91sam9_ddrsdr.h>
+
#define AT91_SHDW_CR 0x00 /* Shut Down Control Register */
#define AT91_SHDW_SHDW BIT(0) /* Shut Down command */
#define AT91_SHDW_KEY (0xa5 << 24) /* KEY Password */
@@ -50,6 +53,7 @@ static const char *shdwc_wakeup_modes[]
static void __iomem *at91_shdwc_base;
static struct clk *sclk;
+static void __iomem *mpddrc_base;
static void __init at91_wakeup_status(void)
{
@@ -73,6 +77,29 @@ static void at91_poweroff(void)
writel(AT91_SHDW_KEY | AT91_SHDW_SHDW, at91_shdwc_base + AT91_SHDW_CR);
}
+static void at91_lpddr_poweroff(void)
+{
+ asm volatile(
+ /* Align to cache lines */
+ ".balign 32\n\t"
+
+ /* Ensure AT91_SHDW_CR is in the TLB by reading it */
+ " ldr r6, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
+
+ /* Power down SDRAM0 */
+ " str %1, [%0, #" __stringify(AT91_DDRSDRC_LPR) "]\n\t"
+ /* Shutdown CPU */
+ " str %3, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
+
+ " b .\n\t"
+ :
+ : "r" (mpddrc_base),
+ "r" cpu_to_le32(AT91_DDRSDRC_LPDDR2_PWOFF),
+ "r" (at91_shdwc_base),
+ "r" cpu_to_le32(AT91_SHDW_KEY | AT91_SHDW_SHDW)
+ : "r0");
+}
+
static int at91_poweroff_get_wakeup_mode(struct device_node *np)
{
const char *pm;
@@ -124,6 +151,8 @@ static void at91_poweroff_dt_set_wakeup_
static int __init at91_poweroff_probe(struct platform_device *pdev)
{
struct resource *res;
+ struct device_node *np;
+ u32 ddr_type;
int ret;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -150,12 +179,30 @@ static int __init at91_poweroff_probe(st
pm_power_off = at91_poweroff;
+ np = of_find_compatible_node(NULL, NULL, "atmel,sama5d3-ddramc");
+ if (!np)
+ return 0;
+
+ mpddrc_base = of_iomap(np, 0);
+ of_node_put(np);
+
+ if (!mpddrc_base)
+ return 0;
+
+ ddr_type = readl(mpddrc_base + AT91_DDRSDRC_MDR) & AT91_DDRSDRC_MD;
+ if ((ddr_type == AT91_DDRSDRC_MD_LPDDR2) ||
+ (ddr_type == AT91_DDRSDRC_MD_LPDDR3))
+ pm_power_off = at91_lpddr_poweroff;
+ else
+ iounmap(mpddrc_base);
+
return 0;
}
static int __exit at91_poweroff_remove(struct platform_device *pdev)
{
- if (pm_power_off == at91_poweroff)
+ if (pm_power_off == at91_poweroff ||
+ pm_power_off == at91_lpddr_poweroff)
pm_power_off = NULL;
clk_disable_unprepare(sclk);
@@ -163,6 +210,11 @@ static int __exit at91_poweroff_remove(s
return 0;
}
+static const struct of_device_id at91_ramc_of_match[] = {
+ { .compatible = "atmel,sama5d3-ddramc", },
+ { /* sentinel */ }
+};
+
static const struct of_device_id at91_poweroff_of_match[] = {
{ .compatible = "atmel,at91sam9260-shdwc", },
{ .compatible = "atmel,at91sam9rl-shdwc", },
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Hildenbrand <[email protected]>
commit 90db10434b163e46da413d34db8d0e77404cc645 upstream.
No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus and
will be used again, at the latest when the iobus gets torn down in
kvm_destroy_vm(), leading to use-after-free errors.
There is nothing the callers could do, except retrying over and over
again.
So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).
Fixes: e93f8a0f821e ("KVM: convert io_bus to SRCU")
Reported-by: Dmitry Vyukov <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/kvm_host.h | 4 ++--
virt/kvm/eventfd.c | 3 ++-
virt/kvm/kvm_main.c | 40 +++++++++++++++++++++++-----------------
3 files changed, 27 insertions(+), 20 deletions(-)
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -182,8 +182,8 @@ int kvm_io_bus_read(struct kvm_vcpu *vcp
int len, void *val);
int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
int len, struct kvm_io_device *dev);
-int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
- struct kvm_io_device *dev);
+void kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
+ struct kvm_io_device *dev);
#ifdef CONFIG_KVM_ASYNC_PF
struct kvm_async_pf {
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -868,7 +868,8 @@ kvm_deassign_ioeventfd_idx(struct kvm *k
continue;
kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
- kvm->buses[bus_idx]->ioeventfd_count--;
+ if (kvm->buses[bus_idx])
+ kvm->buses[bus_idx]->ioeventfd_count--;
ioeventfd_release(p);
ret = 0;
break;
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -655,7 +655,8 @@ static void kvm_destroy_vm(struct kvm *k
spin_unlock(&kvm_lock);
kvm_free_irq_routing(kvm);
for (i = 0; i < KVM_NR_BUSES; i++) {
- kvm_io_bus_destroy(kvm->buses[i]);
+ if (kvm->buses[i])
+ kvm_io_bus_destroy(kvm->buses[i]);
kvm->buses[i] = NULL;
}
kvm_coalesced_mmio_free(kvm);
@@ -3273,6 +3274,8 @@ int kvm_io_bus_write(struct kvm_vcpu *vc
};
bus = srcu_dereference(vcpu->kvm->buses[bus_idx], &vcpu->kvm->srcu);
+ if (!bus)
+ return -ENOMEM;
r = __kvm_io_bus_write(vcpu, bus, &range, val);
return r < 0 ? r : 0;
}
@@ -3290,6 +3293,8 @@ int kvm_io_bus_write_cookie(struct kvm_v
};
bus = srcu_dereference(vcpu->kvm->buses[bus_idx], &vcpu->kvm->srcu);
+ if (!bus)
+ return -ENOMEM;
/* First try the device referenced by cookie. */
if ((cookie >= 0) && (cookie < bus->dev_count) &&
@@ -3340,6 +3345,8 @@ int kvm_io_bus_read(struct kvm_vcpu *vcp
};
bus = srcu_dereference(vcpu->kvm->buses[bus_idx], &vcpu->kvm->srcu);
+ if (!bus)
+ return -ENOMEM;
r = __kvm_io_bus_read(vcpu, bus, &range, val);
return r < 0 ? r : 0;
}
@@ -3352,6 +3359,9 @@ int kvm_io_bus_register_dev(struct kvm *
struct kvm_io_bus *new_bus, *bus;
bus = kvm->buses[bus_idx];
+ if (!bus)
+ return -ENOMEM;
+
/* exclude ioeventfd which is limited by maximum fd */
if (bus->dev_count - bus->ioeventfd_count > NR_IOBUS_DEVS - 1)
return -ENOSPC;
@@ -3371,45 +3381,41 @@ int kvm_io_bus_register_dev(struct kvm *
}
/* Caller must hold slots_lock. */
-int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
- struct kvm_io_device *dev)
+void kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
+ struct kvm_io_device *dev)
{
- int i, r;
+ int i;
struct kvm_io_bus *new_bus, *bus;
bus = kvm->buses[bus_idx];
-
- /*
- * It's possible the bus being released before hand. If so,
- * we're done here.
- */
if (!bus)
- return 0;
+ return;
- r = -ENOENT;
for (i = 0; i < bus->dev_count; i++)
if (bus->range[i].dev == dev) {
- r = 0;
break;
}
- if (r)
- return r;
+ if (i == bus->dev_count)
+ return;
new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count - 1) *
sizeof(struct kvm_io_range)), GFP_KERNEL);
- if (!new_bus)
- return -ENOMEM;
+ if (!new_bus) {
+ pr_err("kvm: failed to shrink bus, removing it completely\n");
+ goto broken;
+ }
memcpy(new_bus, bus, sizeof(*bus) + i * sizeof(struct kvm_io_range));
new_bus->dev_count--;
memcpy(new_bus->range + i, bus->range + i + 1,
(new_bus->dev_count - i) * sizeof(struct kvm_io_range));
+broken:
rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
synchronize_srcu_expedited(&kvm->srcu);
kfree(bus);
- return r;
+ return;
}
static struct notifier_block kvm_cpu_notifier = {
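
The shrink-or-drop behaviour of the new kvm_io_bus_unregister_dev() can be illustrated in user space. This is only a sketch under assumptions: demo_bus, demo_range and the alloc hook are hypothetical, malloc/free stand in for kmalloc/kfree, and the SRCU publication step is reduced to a plain pointer store.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_range {
	int dev;
};

struct demo_bus {
	int dev_count;
	struct demo_range range[];
};

struct demo_bus *demo_bus_create(const int *devs, int n)
{
	struct demo_bus *bus = malloc(sizeof(*bus) + n * sizeof(struct demo_range));

	if (!bus)
		return NULL;
	bus->dev_count = n;
	for (int i = 0; i < n; i++)
		bus->range[i].dev = devs[i];
	return bus;
}

/* Allocator that always fails, used to simulate memory pressure. */
void *demo_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Remove dev from *busp by copying everything except entry i into a
 * smaller bus. If the allocation fails, the bus is dropped entirely
 * (*busp becomes NULL) rather than keeping a stale entry around.
 */
void demo_bus_unregister(struct demo_bus **busp, int dev,
			 void *(*alloc)(size_t))
{
	struct demo_bus *bus = *busp, *new_bus;
	int i;

	if (!bus)
		return;
	for (i = 0; i < bus->dev_count; i++)
		if (bus->range[i].dev == dev)
			break;
	if (i == bus->dev_count)
		return;		/* device not registered: nothing to do */

	new_bus = alloc(sizeof(*bus) +
			(bus->dev_count - 1) * sizeof(struct demo_range));
	if (new_bus) {
		/* header plus the entries before i */
		memcpy(new_bus, bus, sizeof(*bus) + i * sizeof(struct demo_range));
		new_bus->dev_count--;
		/* the entries after i, shifted down by one */
		memcpy(new_bus->range + i, bus->range + i + 1,
		       (new_bus->dev_count - i) * sizeof(struct demo_range));
	}
	*busp = new_bus;
	free(bus);
}
```

Because the pointer may now be NULL, every reader has to check it first, which is what the added "if (!bus)" tests in the patch do.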
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ross Lagerwall <[email protected]>
commit 7ecec8503af37de6be4f96b53828d640a968705f upstream.
When relocating the p2m, take special care not to relocate it so
that it overlaps with the current location of the p2m/initrd. This is
needed since the full extent of the current location is not marked as a
reserved region in the e820.
This was seen to happen to a dom0 with a large initial p2m and a small
reserved region in the middle of the initial p2m.
Signed-off-by: Ross Lagerwall <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/xen/setup.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -713,10 +713,9 @@ static void __init xen_reserve_xen_mfnli
size = PFN_PHYS(xen_start_info->nr_p2m_frames);
}
- if (!xen_is_e820_reserved(start, size)) {
- memblock_reserve(start, size);
+ memblock_reserve(start, size);
+ if (!xen_is_e820_reserved(start, size))
return;
- }
#ifdef CONFIG_X86_32
/*
@@ -727,6 +726,7 @@ static void __init xen_reserve_xen_mfnli
BUG();
#else
xen_relocate_p2m();
+ memblock_free(start, size);
#endif
}
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alan Stern <[email protected]>
commit 1633682053a7ee8058e10c76722b9b28e97fb73f upstream.
Using KASAN, Dmitry found a bug in the rh_call_control() routine: If
buffer allocation fails, the routine returns immediately without
unlinking its URB from the control endpoint, eventually leading to
linked-list corruption.
This patch fixes the problem by jumping to the end of the routine
(where the URB is unlinked) when an allocation failure occurs.
Signed-off-by: Alan Stern <[email protected]>
Reported-and-tested-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/usb/core/hcd.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -499,8 +499,10 @@ static int rh_call_control (struct usb_h
*/
tbuf_size = max_t(u16, sizeof(struct usb_hub_descriptor), wLength);
tbuf = kzalloc(tbuf_size, GFP_KERNEL);
- if (!tbuf)
- return -ENOMEM;
+ if (!tbuf) {
+ status = -ENOMEM;
+ goto err_alloc;
+ }
bufp = tbuf;
@@ -705,6 +707,7 @@ error:
}
kfree(tbuf);
+ err_alloc:
/* any errors get returned through the urb completion */
spin_lock_irq(&hcd_root_hub_lock);
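
The shape of this fix (jump to the common unlink path instead of returning early) is a general cleanup pattern. A minimal user-space sketch follows, assuming a hypothetical demo_urb with a "linked" flag in place of the real endpoint list and an injectable allocator so the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in: linked models "URB is on the endpoint's list". */
struct demo_urb {
	int linked;
	int status;
};

/* Allocator that always fails, used to simulate the error path. */
void *demo_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

int demo_call_control(struct demo_urb *urb, void *(*alloc)(size_t))
{
	char *tbuf;
	int status = 0;

	urb->linked = 1;		/* link the URB before doing any work */

	tbuf = alloc(64);
	if (!tbuf) {
		status = -ENOMEM;
		goto err_alloc;		/* do not return with the URB still linked */
	}

	tbuf[0] = 0;			/* placeholder for the real request handling */
	free(tbuf);
err_alloc:
	/* errors are reported through the URB status, as in the original */
	urb->linked = 0;
	urb->status = status;
	return 0;
}
```

A bare "return -ENOMEM" at the allocation check would skip the unlink step, leaving the object on a list it no longer belongs to.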
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joerg Roedel <[email protected]>
commit 08f63d97749185fab942a3a47ed80f5bd89b8b7d upstream.
No platform-device is required for IO(x)APICs, so don't even
create them.
[ rjw: This fixes a problem with leaking platform device objects
after IOAPIC/IOxAPIC hot-removal events.]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/acpi/acpi_platform.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/acpi/acpi_platform.c
+++ b/drivers/acpi/acpi_platform.c
@@ -24,9 +24,11 @@
ACPI_MODULE_NAME("platform");
static const struct acpi_device_id forbidden_id_list[] = {
- {"PNP0000", 0}, /* PIC */
- {"PNP0100", 0}, /* Timer */
- {"PNP0200", 0}, /* AT DMA Controller */
+ {"PNP0000", 0}, /* PIC */
+ {"PNP0100", 0}, /* Timer */
+ {"PNP0200", 0}, /* AT DMA Controller */
+ {"ACPI0009", 0}, /* IOxAPIC */
+ {"ACPI000A", 0}, /* IOAPIC */
{"", 0},
};
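
For context, a table like forbidden_id_list is typically walked until its empty-string sentinel. The lookup below is a user-space sketch of that idea only; demo_id and demo_is_forbidden() are hypothetical and do not reproduce the ACPI core's own matching code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct demo_id {
	const char *id;
};

/* Same IDs as the patched table; the empty string ends the list. */
static const struct demo_id demo_forbidden[] = {
	{ "PNP0000" },	/* PIC */
	{ "PNP0100" },	/* Timer */
	{ "PNP0200" },	/* AT DMA Controller */
	{ "ACPI0009" },	/* IOxAPIC */
	{ "ACPI000A" },	/* IOAPIC */
	{ "" },
};

/* Return true if hid matches any entry before the sentinel. */
bool demo_is_forbidden(const char *hid)
{
	for (const struct demo_id *p = demo_forbidden; p->id[0]; p++)
		if (strcmp(p->id, hid) == 0)
			return true;
	return false;
}
```

New entries have to go before the sentinel, as the two added lines in the patch do, or the walk would stop before reaching them.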
On Thu, 2017-04-06 at 10:38 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Alexandre Belloni <[email protected]>
>
> commit 0b0408745e7ff24757cbfd571d69026c0ddb803c upstream.
>
> LPDDR memories can only handle up to 400 uncontrolled power-offs. Ensure the
> proper power off sequence is used before shutting down the platform.
>
> Signed-off-by: Alexandre Belloni <[email protected]>
> Signed-off-by: Sebastian Reichel <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[...]
> +static void at91_lpddr_poweroff(void)
> +{
> + asm volatile(
> + /* Align to cache lines */
> + ".balign 32\n\t"
> +
> + /* Ensure AT91_SHDW_CR is in the TLB by reading it */
> + " ldr r6, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
This clobbers r6...
> + /* Power down SDRAM0 */
> + " str %1, [%0, #" __stringify(AT91_DDRSDRC_LPR) "]\n\t"
> + /* Shutdown CPU */
> + " str %3, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
> +
> + " b .\n\t"
> + :
> + : "r" (mpddrc_base),
> + "r" cpu_to_le32(AT91_DDRSDRC_LPDDR2_PWOFF),
> + "r" (at91_shdwc_base),
> + "r" cpu_to_le32(AT91_SHDW_KEY | AT91_SHDW_SHDW)
> + : "r0");
[...]
...but the clobber list has r0.
Ben.
--
Ben Hutchings
Software Developer, Codethink Ltd.
On 04/06/2017 02:38 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.60 release.
> There are 26 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat Apr 8 08:35:54 UTC 2017.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.60-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
Compiled and booted on my test system. No dmesg regressions.
thanks,
-- Shuah
On 06/04/2017 at 18:45:39 +0100, Ben Hutchings wrote:
> On Thu, 2017-04-06 at 10:38 +0200, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch. If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Alexandre Belloni <[email protected]>
> >
> > commit 0b0408745e7ff24757cbfd571d69026c0ddb803c upstream.
> >
> > LPDDR memories can only handle up to 400 uncontrolled power-offs. Ensure the
> > proper power off sequence is used before shutting down the platform.
> >
> > Signed-off-by: Alexandre Belloni <[email protected]>
> > Signed-off-by: Sebastian Reichel <[email protected]>
> > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> [...]
> > +static void at91_lpddr_poweroff(void)
> > +{
> > + asm volatile(
> > + /* Align to cache lines */
> > + ".balign 32\n\t"
> > +
> > + /* Ensure AT91_SHDW_CR is in the TLB by reading it */
> > + " ldr r6, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
>
> This clobbers r6...
>
> > + /* Power down SDRAM0 */
> > + " str %1, [%0, #" __stringify(AT91_DDRSDRC_LPR) "]\n\t"
> > + /* Shutdown CPU */
> > + " str %3, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
> > +
> > + " b .\n\t"
> > + :
> > + : "r" (mpddrc_base),
> > + "r" cpu_to_le32(AT91_DDRSDRC_LPDDR2_PWOFF),
> > + "r" (at91_shdwc_base),
> > + "r" cpu_to_le32(AT91_SHDW_KEY | AT91_SHDW_SHDW)
> > + : "r0");
> [...]
>
> ...but the clobber list has r0.
>
Indeed. However, it doesn't matter much as nothing can possibly run
afterwards.
--
Alexandre Belloni, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
On Thu, 2017-04-06 at 10:38 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: NeilBrown <[email protected]>
>
> commit f5fe1b51905df7cfe4fdfd85c5fb7bc5b71a094f upstream.
>
> Commit 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
> changed current->bio_list so that it did not contain *all* of the
> queued bios, but only those submitted by the currently running
> make_request_fn.
>
> There are two places which walk the list and requeue selected bios,
> and others that check if the list is empty. These are no longer
> correct.
>
> So redefine current->bio_list to point to an array of two lists, which
> contain all queued bios, and adjust various code to test or walk both
> lists.
>
> Signed-off-by: NeilBrown <[email protected]>
> Fixes: 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
> Signed-off-by: Jens Axboe <[email protected]>
> [jwang: backport to 4.4]
> Signed-off-by: Jack Wang <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> ---
> block/bio.c | 12 +++++++++---
> block/blk-core.c | 31 +++++++++++++++++++------------
> drivers/md/raid1.c | 3 ++-
> drivers/md/raid10.c | 3 ++-
> 4 files changed, 32 insertions(+), 17 deletions(-)
Why did you drop the changes in drivers/md/dm.c?
Ben.
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -373,10 +373,14 @@ static void punt_bios_to_rescuer(struct
> bio_list_init(&punt);
> bio_list_init(&nopunt);
>
> - while ((bio = bio_list_pop(current->bio_list)))
> + while ((bio = bio_list_pop(&current->bio_list[0])))
> bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
> + current->bio_list[0] = nopunt;
>
> - *current->bio_list = nopunt;
> + bio_list_init(&nopunt);
> + while ((bio = bio_list_pop(&current->bio_list[1])))
> + bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
> + current->bio_list[1] = nopunt;
>
> spin_lock(&bs->rescue_lock);
> bio_list_merge(&bs->rescue_list, &punt);
> @@ -464,7 +468,9 @@ struct bio *bio_alloc_bioset(gfp_t gfp_m
> * we retry with the original gfp_flags.
> */
>
> - if (current->bio_list && !bio_list_empty(current->bio_list))
> + if (current->bio_list &&
> + (!bio_list_empty(&current->bio_list[0]) ||
> + !bio_list_empty(&current->bio_list[1])))
> gfp_mask &= ~__GFP_DIRECT_RECLAIM;
>
> p = mempool_alloc(bs->bio_pool, gfp_mask);
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2021,7 +2021,14 @@ end_io:
> */
> blk_qc_t generic_make_request(struct bio *bio)
> {
> - struct bio_list bio_list_on_stack;
> + /*
> + * bio_list_on_stack[0] contains bios submitted by the current
> + * make_request_fn.
> + * bio_list_on_stack[1] contains bios that were submitted before
> + * the current make_request_fn, but that haven't been processed
> + * yet.
> + */
> + struct bio_list bio_list_on_stack[2];
> blk_qc_t ret = BLK_QC_T_NONE;
>
> if (!generic_make_request_checks(bio))
> @@ -2038,7 +2045,7 @@ blk_qc_t generic_make_request(struct bio
> * should be added at the tail
> */
> if (current->bio_list) {
> - bio_list_add(current->bio_list, bio);
> + bio_list_add(&current->bio_list[0], bio);
> goto out;
> }
>
> @@ -2057,17 +2064,17 @@ blk_qc_t generic_make_request(struct bio
> * bio_list, and call into ->make_request() again.
> */
> BUG_ON(bio->bi_next);
> - bio_list_init(&bio_list_on_stack);
> - current->bio_list = &bio_list_on_stack;
> + bio_list_init(&bio_list_on_stack[0]);
> + current->bio_list = bio_list_on_stack;
> do {
> struct request_queue *q = bdev_get_queue(bio->bi_bdev);
>
> if (likely(blk_queue_enter(q, __GFP_DIRECT_RECLAIM) == 0)) {
> - struct bio_list lower, same, hold;
> + struct bio_list lower, same;
>
> /* Create a fresh bio_list for all subordinate requests */
> - hold = bio_list_on_stack;
> - bio_list_init(&bio_list_on_stack);
> + bio_list_on_stack[1] = bio_list_on_stack[0];
> + bio_list_init(&bio_list_on_stack[0]);
>
> ret = q->make_request_fn(q, bio);
>
> @@ -2077,19 +2084,19 @@ blk_qc_t generic_make_request(struct bio
> */
> bio_list_init(&lower);
> bio_list_init(&same);
> - while ((bio = bio_list_pop(&bio_list_on_stack)) != NULL)
> + while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
> if (q == bdev_get_queue(bio->bi_bdev))
> bio_list_add(&same, bio);
> else
> bio_list_add(&lower, bio);
> /* now assemble so we handle the lowest level first */
> - bio_list_merge(&bio_list_on_stack, &lower);
> - bio_list_merge(&bio_list_on_stack, &same);
> - bio_list_merge(&bio_list_on_stack, &hold);
> + bio_list_merge(&bio_list_on_stack[0], &lower);
> + bio_list_merge(&bio_list_on_stack[0], &same);
> + bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
> } else {
> bio_io_error(bio);
> }
> - bio = bio_list_pop(current->bio_list);
> + bio = bio_list_pop(&bio_list_on_stack[0]);
> } while (bio);
> current->bio_list = NULL; /* deactivate */
>
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -877,7 +877,8 @@ static sector_t wait_barrier(struct r1co
> ((conf->start_next_window <
> conf->next_resync + RESYNC_SECTORS) &&
> current->bio_list &&
> - !bio_list_empty(current->bio_list))),
> + (!bio_list_empty(&current->bio_list[0]) ||
> + !bio_list_empty(&current->bio_list[1])))),
> conf->resync_lock);
> conf->nr_waiting--;
> }
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -946,7 +946,8 @@ static void wait_barrier(struct r10conf
> !conf->barrier ||
> (conf->nr_pending &&
> current->bio_list &&
> - !bio_list_empty(current->bio_list)),
> + (!bio_list_empty(&current->bio_list[0]) ||
> + !bio_list_empty(&current->bio_list[1]))),
> conf->resync_lock);
> conf->nr_waiting--;
> }
>
>
>
--
Ben Hutchings
Software Developer, Codethink Ltd.
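The reordering done in the quoted generic_make_request() hunk can be modelled in userspace. The following is only a sketch: struct item, struct item_list and reorder_after_submit() are hypothetical stand-ins for struct bio, struct bio_list and the loop body, and an integer "queue" field replaces bdev_get_queue(bio->bi_bdev). It shows how slot [0] is rebuilt as lower-level items, then same-level items, then whatever was parked in slot [1].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio: only what the sketch needs. */
struct item {
	struct item *next;
	int queue;	/* stand-in for bdev_get_queue(bio->bi_bdev) */
	int id;
};

/* Hypothetical stand-in for struct bio_list (singly linked, head + tail). */
struct item_list {
	struct item *head, *tail;
};

static void item_list_init(struct item_list *l)
{
	l->head = l->tail = NULL;
}

static void item_list_add(struct item_list *l, struct item *it)
{
	assert(it != NULL);
	it->next = NULL;
	if (l->tail)
		l->tail->next = it;
	else
		l->head = it;
	l->tail = it;
}

static struct item *item_list_pop(struct item_list *l)
{
	struct item *it = l->head;

	if (it) {
		l->head = it->next;
		if (!l->head)
			l->tail = NULL;
		it->next = NULL;
	}
	return it;
}

/* Append src to dst; src is left untouched, as in the quoted patch. */
static void item_list_merge(struct item_list *dst, struct item_list *src)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
}

/*
 * lists[0]: items submitted by the handler that just ran for queue q.
 * lists[1]: items submitted earlier and not yet processed.
 * Rebuild lists[0] so the lowest level is handled first.
 */
static void reorder_after_submit(struct item_list lists[2], int q)
{
	struct item_list lower, same;
	struct item *it;

	item_list_init(&lower);
	item_list_init(&same);
	while ((it = item_list_pop(&lists[0])) != NULL)
		item_list_add(it->queue == q ? &same : &lower, it);

	item_list_merge(&lists[0], &lower);
	item_list_merge(&lists[0], &same);
	item_list_merge(&lists[0], &lists[1]);
}
```

Because slot [1] is a real list reachable through the same pointer as slot [0], code that inspects the pending work can see the parked items too, which a local "hold" variable did not allow.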
On Thu, 2017-04-06 at 20:13 +0200, Alexandre Belloni wrote:
> On 06/04/2017 at 18:45:39 +0100, Ben Hutchings wrote:
> > On Thu, 2017-04-06 at 10:38 +0200, Greg Kroah-Hartman wrote:
> > > 4.4-stable review patch. If anyone has any objections, please let me know.
> > >
> > > ------------------
> > >
> > > From: Alexandre Belloni <[email protected]>
> > >
> > > commit 0b0408745e7ff24757cbfd571d69026c0ddb803c upstream.
> > >
> > > LPDDR memories can only handle up to 400 uncontrolled power off. Ensure the
> > > proper power off sequence is used before shutting down the platform.
> > >
> > > Signed-off-by: Alexandre Belloni <[email protected]>
> > > Signed-off-by: Sebastian Reichel <[email protected]>
> > > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > [...]
> > > +static void at91_lpddr_poweroff(void)
> > > +{
> > > + asm volatile(
> > > + /* Align to cache lines */
> > > + ".balign 32\n\t"
> > > +
> > > + /* Ensure AT91_SHDW_CR is in the TLB by reading it */
> > > + " ldr r6, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
> >
> > This clobbers r6...
> >
> > > + /* Power down SDRAM0 */
> > > + " str %1, [%0, #" __stringify(AT91_DDRSDRC_LPR) "]\n\t"
> > > + /* Shutdown CPU */
> > > + " str %3, [%2, #" __stringify(AT91_SHDW_CR) "]\n\t"
> > > +
> > > + " b .\n\t"
> > > + :
> > > + : "r" (mpddrc_base),
> > > + "r" cpu_to_le32(AT91_DDRSDRC_LPDDR2_PWOFF),
> > > + "r" (at91_shdwc_base),
> > > + "r" cpu_to_le32(AT91_SHDW_KEY | AT91_SHDW_SHDW)
> > > + : "r0");
> > [...]
> >
> > ...but the clobber list has r0.
> >
>
> Indeed. However, it doesn't matter much as nothing can possibly run
> afterwards.
It does matter because the compiler can use r6 for one of the inputs.
Ben.
--
Ben Hutchings
Software Developer, Codethink Ltd.
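To illustrate the point about the clobber list: if the asm body writes a hard-coded register that is not declared as clobbered, the compiler is free to place one of the "r" inputs in that same register, and the later stores would then use a corrupted base or value. The smallest fix is to name the register actually written ("r6") in the clobber list. The sketch below shows another common option, which is not necessarily what upstream chose: let the compiler pick the scratch register through an early-clobber output operand. touch_then_store() is a hypothetical helper, and the non-ARM branch exists only so the sketch builds on other architectures.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: read a register once (for example to pull its
 * mapping into the TLB), then store a value to it.
 */
static inline void touch_then_store(volatile uint32_t *reg, uint32_t val)
{
#if defined(__arm__)
	uint32_t scratch;

	__asm__ volatile(
		"ldr %0, [%1]\n\t"	/* dummy read into a compiler-chosen register */
		"str %2, [%1]\n\t"	/* the store that matters */
		: "=&r" (scratch)	/* early clobber: never shares a register with an input */
		: "r" (reg), "r" (val)
		: "memory");
	(void)scratch;
#else
	/* Portable fallback with the same read-then-write behaviour. */
	(void)*reg;
	*reg = val;
#endif
}
```

With the "=&r" constraint the compiler knows the scratch register is written before the inputs are consumed, so it cannot alias reg or val.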
On Thu, Apr 06, 2017 at 10:38:22AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.60 release.
> There are 26 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat Apr 8 08:35:54 UTC 2017.
> Anything received after that time might be too late.
>
Build results:
total: 149 pass: 149 fail: 0
Qemu test results:
total: 115 pass: 115 fail: 0
Details are available at http://kerneltests.org/builders.
Guenter
On Thu, Apr 6, 2017 at 8:17 PM, Ben Hutchings
<[email protected]> wrote:
> On Thu, 2017-04-06 at 10:38 +0200, Greg Kroah-Hartman wrote:
>> 4.4-stable review patch. If anyone has any objections, please let me know.
>>
>> ------------------
>>
>> From: NeilBrown <[email protected]>
>>
>> commit f5fe1b51905df7cfe4fdfd85c5fb7bc5b71a094f upstream.
>>
>> Commit 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
>> changed current->bio_list so that it did not contain *all* of the
>> queued bios, but only those submitted by the currently running
>> make_request_fn.
>>
>> There are two places which walk the list and requeue selected bios,
>> and others that check if the list is empty. These are no longer
>> correct.
>>
>> So redefine current->bio_list to point to an array of two lists, which
>> contain all queued bios, and adjust various code to test or walk both
>> lists.
>>
>> Signed-off-by: NeilBrown <[email protected]>
>> Fixes: 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
>> Signed-off-by: Jens Axboe <[email protected]>
>> [jwang: backport to 4.4]
>> Signed-off-by: Jack Wang <[email protected]>
>> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>> ---
>> block/bio.c | 12 +++++++++---
>> block/blk-core.c | 31 +++++++++++++++++++------------
>> drivers/md/raid1.c | 3 ++-
>> drivers/md/raid10.c | 3 ++-
>> 4 files changed, 32 insertions(+), 17 deletions(-)
>
> Why did you drop the changes in drivers/md/dm.c?
>
> Ben.
>
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -373,10 +373,14 @@ static void punt_bios_to_rescuer(struct
>> bio_list_init(&punt);
>> bio_list_init(&nopunt);
>>
>> - while ((bio = bio_list_pop(current->bio_list)))
>> + while ((bio = bio_list_pop(&current->bio_list[0])))
>> bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
>> + current->bio_list[0] = nopunt;
>>
>> - *current->bio_list = nopunt;
>> + bio_list_init(&nopunt);
>> + while ((bio = bio_list_pop(&current->bio_list[1])))
>> + bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
>> + current->bio_list[1] = nopunt;
>>
>> spin_lock(&bs->rescue_lock);
>> bio_list_merge(&bs->rescue_list, &punt);
>> @@ -464,7 +468,9 @@ struct bio *bio_alloc_bioset(gfp_t gfp_m
>> * we retry with the original gfp_flags.
>> */
>>
>> - if (current->bio_list && !bio_list_empty(current->bio_list))
>> + if (current->bio_list &&
>> + (!bio_list_empty(&current->bio_list[0]) ||
>> + !bio_list_empty(&current->bio_list[1])))
>> gfp_mask &= ~__GFP_DIRECT_RECLAIM;
>>
>> p = mempool_alloc(bs->bio_pool, gfp_mask);
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -2021,7 +2021,14 @@ end_io:
>> */
>> blk_qc_t generic_make_request(struct bio *bio)
>> {
>> - struct bio_list bio_list_on_stack;
>> + /*
>> + * bio_list_on_stack[0] contains bios submitted by the current
>> + * make_request_fn.
>> + * bio_list_on_stack[1] contains bios that were submitted before
>> + * the current make_request_fn, but that haven't been processed
>> + * yet.
>> + */
>> + struct bio_list bio_list_on_stack[2];
>> blk_qc_t ret = BLK_QC_T_NONE;
>>
>> if (!generic_make_request_checks(bio))
>> @@ -2038,7 +2045,7 @@ blk_qc_t generic_make_request(struct bio
>> * should be added at the tail
>> */
>> if (current->bio_list) {
>> - bio_list_add(current->bio_list, bio);
>> + bio_list_add(&current->bio_list[0], bio);
>> goto out;
>> }
>>
>> @@ -2057,17 +2064,17 @@ blk_qc_t generic_make_request(struct bio
>> * bio_list, and call into ->make_request() again.
>> */
>> BUG_ON(bio->bi_next);
>> - bio_list_init(&bio_list_on_stack);
>> - current->bio_list = &bio_list_on_stack;
>> + bio_list_init(&bio_list_on_stack[0]);
>> + current->bio_list = bio_list_on_stack;
>> do {
>> struct request_queue *q = bdev_get_queue(bio->bi_bdev);
>>
>> if (likely(blk_queue_enter(q, __GFP_DIRECT_RECLAIM) == 0)) {
>> - struct bio_list lower, same, hold;
>> + struct bio_list lower, same;
>>
>> /* Create a fresh bio_list for all subordinate requests */
>> - hold = bio_list_on_stack;
>> - bio_list_init(&bio_list_on_stack);
>> + bio_list_on_stack[1] = bio_list_on_stack[0];
>> + bio_list_init(&bio_list_on_stack[0]);
>>
>> ret = q->make_request_fn(q, bio);
>>
>> @@ -2077,19 +2084,19 @@ blk_qc_t generic_make_request(struct bio
>> */
>> bio_list_init(&lower);
>> bio_list_init(&same);
>> - while ((bio = bio_list_pop(&bio_list_on_stack)) != NULL)
>> + while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
>> if (q == bdev_get_queue(bio->bi_bdev))
>> bio_list_add(&same, bio);
>> else
>> bio_list_add(&lower, bio);
>> /* now assemble so we handle the lowest level first */
>> - bio_list_merge(&bio_list_on_stack, &lower);
>> - bio_list_merge(&bio_list_on_stack, &same);
>> - bio_list_merge(&bio_list_on_stack, &hold);
>> + bio_list_merge(&bio_list_on_stack[0], &lower);
>> + bio_list_merge(&bio_list_on_stack[0], &same);
>> + bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
>> } else {
>> bio_io_error(bio);
>> }
>> - bio = bio_list_pop(current->bio_list);
>> + bio = bio_list_pop(&bio_list_on_stack[0]);
>> } while (bio);
>> current->bio_list = NULL; /* deactivate */
>>
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -877,7 +877,8 @@ static sector_t wait_barrier(struct r1co
>> ((conf->start_next_window <
>> conf->next_resync + RESYNC_SECTORS) &&
>> current->bio_list &&
>> - !bio_list_empty(current->bio_list))),
>> + (!bio_list_empty(&current->bio_list[0]) ||
>> + !bio_list_empty(&current->bio_list[1])))),
>> conf->resync_lock);
>> conf->nr_waiting--;
>> }
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -946,7 +946,8 @@ static void wait_barrier(struct r10conf
>> !conf->barrier ||
>> (conf->nr_pending &&
>> current->bio_list &&
>> - !bio_list_empty(current->bio_list)),
>> + (!bio_list_empty(&current->bio_list[0]) ||
>> + !bio_list_empty(&current->bio_list[1]))),
>> conf->resync_lock);
>> conf->nr_waiting--;
>> }
>>
>>
>>
>
> --
> Ben Hutchings
> Software Developer, Codethink Ltd.
>
>
Hi Ben,
Because the code snip doesn't exist in 4.4.
--
Jack Wang
Linux Kernel Developer
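The commit message quoted above says that every place which tests whether the list is empty has to look at both lists. A minimal userspace sketch of that check follows; struct node, struct node_list and work_pending() are hypothetical names, and the "lists" pointer plays the role of current->bio_list (NULL when no submission is in progress).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct bio and struct bio_list. */
struct node {
	struct node *next;
};

struct node_list {
	struct node *head, *tail;
};

static bool node_list_empty(const struct node_list *l)
{
	return l->head == NULL;
}

/*
 * lists points at an array of two lists, or is NULL when nothing is
 * being submitted. Work is pending if either slot holds an entry.
 */
static bool work_pending(const struct node_list *lists)
{
	return lists &&
	       (!node_list_empty(&lists[0]) || !node_list_empty(&lists[1]));
}
```

A check that looked only at slot [0] would report "nothing pending" while entries were still parked in slot [1], which is the case the raid1, raid10 and bio_alloc_bioset() hunks guard against.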
On Fri, 2017-04-07 at 10:33 +0200, Jinpu Wang wrote:
> On Thu, Apr 6, 2017 at 8:17 PM, Ben Hutchings
> <[email protected]> wrote:
> > On Thu, 2017-04-06 at 10:38 +0200, Greg Kroah-Hartman wrote:
> >> 4.4-stable review patch. If anyone has any objections, please let me know.
> >>
> >> ------------------
> >>
> >> From: NeilBrown <[email protected]>
> >>
> >> commit f5fe1b51905df7cfe4fdfd85c5fb7bc5b71a094f upstream.
[...]
> > Why did you drop the changes in drivers/md/dm.c?
[...]
> Because the code snip doesn't exist in 4.4.
It does, and the upstream patch applies to it without any changes! I'm
objecting to this and the preceding patch until there's a proper
explanation of why device-mapper should be excluded from the backport.
I've attached an alternate version of this patch that has the
device-mapper part restored. While I haven't tested this (don't know
what the test case would be) I think it is rather more likely to be a
correct backport to 4.4.
Ben.
--
Ben Hutchings
Software Developer, Codethink Ltd.
Hi Ben,
On Fri, Apr 7, 2017 at 2:45 PM, Ben Hutchings
<[email protected]> wrote:
> On Fri, 2017-04-07 at 10:33 +0200, Jinpu Wang wrote:
>> On Thu, Apr 6, 2017 at 8:17 PM, Ben Hutchings
>> <[email protected]> wrote:
>> > On Thu, 2017-04-06 at 10:38 +0200, Greg Kroah-Hartman wrote:
>> >> 4.4-stable review patch. If anyone has any objections, please let me know.
>> >>
>> >> ------------------
>> >>
>> >> From: NeilBrown <[email protected]>
>> >>
>> >> commit f5fe1b51905df7cfe4fdfd85c5fb7bc5b71a094f upstream.
> [...]
>> > Why did you drop the changes in drivers/md/dm.c?
> [...]
>> Because the code snip doesn't exist in 4.4.
>
> It does, and the upstream patch applies to it without any changes! I'm
> objecting to this and the preceding patch until there's a proper
> explanation of why device-mapper should be excluded from the backport.
>
> I've attached an alternate version of this patch that has the
> device-mapper part restored. While I haven't tested this (don't know
> what the test case would be) I think it is rather more likely to be a
> correct backport to 4.4.
>
> Ben.
>
> --
> Ben Hutchings
> Software Developer, Codethink Ltd.
>
Thanks, you're right: I just found that commit
cd8ad4d9eb6d9ee04e77b42c6a7a15eabada85ac was included in 4.4.55+.
We should use your backport!
--
Jack Wang
Linux Kernel Developer
On Fri, 2017-04-07 at 15:03 +0200, Jinpu Wang wrote:
[...]
> Thanks, you're right, just found commit
> cd8ad4d9eb6d9ee04e77b42c6a7a15eabada85ac was included in 4.4.55+,
I see, the device-mapper code changed after you prepared your
backport. :-/
> We should use your backport!
Thanks, I'm glad we could agree.
Ben.
--
Ben Hutchings
Software Developer, Codethink Ltd.
On Fri, Apr 07, 2017 at 02:37:56PM +0100, Ben Hutchings wrote:
> On Fri, 2017-04-07 at 15:03 +0200, Jinpu Wang wrote:
> [...]
> > Thanks, you're right, just found commit
> > cd8ad4d9eb6d9ee04e77b42c6a7a15eabada85ac was included in 4.4.55+,
>
> I see, the device-mapper code changed after you prepared your
> backport. :-/
>
> > We should use your backport!
>
> Thanks, I'm glad we could agree.
Thanks for the updated patch, I've now used it instead of the original
one.
greg k-h