2004-06-17 11:33:47

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Hi, all

On Thu, 27 May 2004 23:04:47 +0200, Ingo Molnar wrote:

>
>* Christoph Hellwig <[email protected]> wrote:
>
>> > +/******************************** Disk dump **************************
>> > *********/
>> > +#if defined(CONFIG_DISKDUMP) || defined(CONFIG_DISKDUMP_MODULE)
>> > +#undef add_timer
>> > +#define add_timer diskdump_add_timer
>> > +#undef del_timer_sync
>> > +#define del_timer_sync diskdump_del_timer
>> > +#undef del_timer
>> > +#define del_timer diskdump_del_timer
>> > +#undef mod_timer
>> > +#define mod_timer diskdump_mod_timer
>> > +
>> > +#define tasklet_schedule diskdump_tasklet_schedule
>> > +#endif
>>
>> Yikes. No way in hell we'll place code like this in drivers. This
>> needs to be handled in common code.
>
>yeah, this is arguably the biggest (and i think only) conceptual item
>that needs to be solved before this can be integrated.
>
>The goal of these defines is to provide a separate (and polling based)
>timer mechanism that is completely separate from the normal kernel's
>state.
>
>it would also be easier to enable diskdump in a driver if this was
>handled in add_timer()/del_timer()/mod_timer()/tasklet_schedule().


Instead of redefinition, I add hooks into the timer/tasklet routines.

ex.

int mod_timer(struct timer_list *timer, unsigned long expires)
{
BUG_ON(!timer->function);

+ if (crashdump_mode()) {
+ return diskdump_mod_timer(timer, expires);
+ }

check_timer(timer);

Please see the following patches for details.
How do you think?

Best Regards,
Takao Indoh



diff -Nur linux-2.6.6.org/include/linux/diskdumplib.h linux-2.6.6/include/linux/diskdumplib.h
--- linux-2.6.6.org/include/linux/diskdumplib.h 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.6/include/linux/diskdumplib.h 2004-06-17 18:28:01.000000000 +0900
@@ -0,0 +1,19 @@
+#ifndef _LINUX_DISKDUMPLIB_H
+#define _LINUX_DISKDUMPLIB_H
+
+struct timer_list;
+struct tasklet_struct;
+struct work_struct;
+
+void diskdump_lib_init(void);
+void diskdump_lib_exit(void);
+void diskdump_update(void);
+
+void diskdump_add_timer(struct timer_list *);
+int diskdump_del_timer(struct timer_list *);
+int diskdump_mod_timer(struct timer_list *, unsigned long);
+void diskdump_tasklet_schedule(struct tasklet_struct *);
+int diskdump_schedule_work(struct work_struct *);
+long diskdump_schedule_timeout(signed long timeout);
+
+#endif /* _LINUX_DISKDUMPLIB_H */
diff -Nur linux-2.6.6.org/include/linux/interrupt.h linux-2.6.6/include/linux/interrupt.h
--- linux-2.6.6.org/include/linux/interrupt.h 2004-06-04 21:22:09.000000000 +0900
+++ linux-2.6.6/include/linux/interrupt.h 2004-06-17 18:28:01.000000000 +0900
@@ -7,6 +7,7 @@
#include <linux/linkage.h>
#include <linux/bitops.h>
#include <linux/preempt.h>
+#include <linux/diskdumplib.h>
#include <asm/atomic.h>
#include <asm/hardirq.h>
#include <asm/ptrace.h>
@@ -172,6 +173,11 @@

static inline void tasklet_schedule(struct tasklet_struct *t)
{
+ if (crashdump_mode()) {
+ diskdump_tasklet_schedule(t);
+ return;
+ }
+
if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
__tasklet_schedule(t);
}
diff -Nur linux-2.6.6.org/include/linux/timer.h linux-2.6.6/include/linux/timer.h
--- linux-2.6.6.org/include/linux/timer.h 2004-06-04 21:22:06.000000000 +0900
+++ linux-2.6.6/include/linux/timer.h 2004-06-17 18:28:01.000000000 +0900
@@ -4,6 +4,7 @@
#include <linux/config.h>
#include <linux/list.h>
#include <linux/spinlock.h>
+#include <linux/diskdumplib.h>

struct tvec_t_base_s;

@@ -83,7 +84,10 @@
*/
static inline void add_timer(struct timer_list * timer)
{
- __mod_timer(timer, timer->expires);
+ if (crashdump_mode())
+ diskdump_add_timer(timer);
+ else
+ __mod_timer(timer, timer->expires);
}

#ifdef CONFIG_SMP
diff -Nur linux-2.6.6.org/kernel/Makefile linux-2.6.6/kernel/Makefile
--- linux-2.6.6.org/kernel/Makefile 2004-06-04 21:21:58.000000000 +0900
+++ linux-2.6.6/kernel/Makefile 2004-06-17 18:28:01.000000000 +0900
@@ -7,7 +7,7 @@
sysctl.o capability.o ptrace.o timer.o user.o \
signal.o sys.o kmod.o workqueue.o pid.o \
rcupdate.o intermodule.o extable.o params.o posix-timers.o \
- kthread.o
+ kthread.o diskdumplib.o

obj-$(CONFIG_FUTEX) += futex.o
obj-$(CONFIG_GENERIC_ISA_DMA) += dma.o
diff -Nur linux-2.6.6.org/kernel/diskdumplib.c linux-2.6.6/kernel/diskdumplib.c
--- linux-2.6.6.org/kernel/diskdumplib.c 1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.6/kernel/diskdumplib.c 2004-06-17 18:28:01.000000000 +0900
@@ -0,0 +1,213 @@
+/*
+ * linux/kernel/diskdumplib.c
+ *
+ * Copyright (C) 2004 FUJITSU LIMITED
+ * Written by Nobuhiro Tachino ([email protected])
+ *
+ */
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ */
+#include <linux/mm.h>
+#include <linux/init.h>
+#include <linux/timer.h>
+#include <linux/workqueue.h>
+#include <linux/delay.h>
+#include <linux/file.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/utsname.h>
+#include <linux/smp_lock.h>
+#include <linux/genhd.h>
+#include <linux/slab.h>
+#include <linux/blkdev.h>
+#include <linux/diskdump.h>
+#include <linux/diskdumplib.h>
+#include <asm/diskdump.h>
+
+/*
+ * timer list and tasklet_struct holder
+ */
+unsigned long volatile diskdump_base_jiffies;
+static unsigned long long timestamp_base;
+static unsigned long timestamp_hz;
+
+#define DISKDUMP_NUM_TASKLETS 8
+
+/*
+ * We can't use next field of tasklet because it breaks the original
+ * tasklets chain and we have no way to know which chain the tasklet is
+ * linked.
+ */
+static struct tasklet_struct *diskdump_tasklets[DISKDUMP_NUM_TASKLETS];
+
+static LIST_HEAD(diskdump_timers);
+static LIST_HEAD(diskdump_workq);
+
+
+static int store_tasklet(struct tasklet_struct *tasklet)
+{
+ int i;
+
+ for (i = 0; i < DISKDUMP_NUM_TASKLETS; i++)
+ if (diskdump_tasklets[i] == NULL) {
+ diskdump_tasklets[i] = tasklet;
+ return 0;
+ }
+ return -1;
+}
+
+static struct tasklet_struct *find_tasklet(struct tasklet_struct *tasklet)
+{
+ int i;
+
+ for (i = 0; i < DISKDUMP_NUM_TASKLETS; i++)
+ if (diskdump_tasklets[i] == tasklet)
+ return diskdump_tasklets[i];
+ return NULL;
+}
+
+void diskdump_tasklet_schedule(struct tasklet_struct *tasklet)
+{
+ if (!find_tasklet(tasklet))
+ if (store_tasklet(tasklet))
+ printk(KERN_ERR "diskdumplib: too many tasklet. Ignored\n");
+ set_bit(TASKLET_STATE_SCHED, &tasklet->state);
+}
+
+int diskdump_schedule_work(struct work_struct *work)
+{
+ list_add_tail(&work->entry, &diskdump_workq);
+ return 1;
+}
+
+long diskdump_schedule_timeout(signed long timeout)
+{
+ mdelay(timeout);
+ set_current_state(TASK_RUNNING);
+ return timeout;
+}
+
+void diskdump_add_timer(struct timer_list *timer)
+{
+ timer->base = (void *)1;
+ list_add(&timer->entry, &diskdump_timers);
+}
+
+int diskdump_del_timer(struct timer_list *timer)
+{
+ if (timer->base != NULL) {
+ list_del(&timer->entry);
+ return 1;
+ } else {
+ timer->base = NULL;
+ return 0;
+ }
+}
+
+int diskdump_mod_timer(struct timer_list *timer, unsigned long expires)
+{
+ int ret;
+
+ ret = diskdump_del_timer(timer);
+ timer->expires = expires;
+ diskdump_add_timer(timer);
+
+ return ret;
+}
+
+static void update_jiffies(void)
+{
+ unsigned long long t;
+
+ platform_timestamp(t);
+ while (t > timestamp_base + timestamp_hz) {
+ timestamp_base += timestamp_hz;
+ jiffies++;
+ platform_timestamp(t);
+ }
+}
+
+void diskdump_update(void)
+{
+ struct tasklet_struct *tasklet;
+ struct work_struct *work;
+ struct timer_list *timer;
+ struct list_head *t, *n, head;
+ int i;
+
+ update_jiffies();
+
+ /* run timers */
+ list_for_each_safe(t, n, &diskdump_timers) {
+ timer = list_entry(t, struct timer_list, entry);
+ if (time_before_eq(timer->expires, jiffies)) {
+ list_del(t);
+ timer->function(timer->data);
+ }
+ }
+
+ /* run tasklet */
+ for (i = 0; i < DISKDUMP_NUM_TASKLETS; i++)
+ if ((tasklet = diskdump_tasklets[i]))
+ if (!atomic_read(&tasklet->count))
+ if (test_and_clear_bit(TASKLET_STATE_SCHED, &tasklet->state))
+ tasklet->func(tasklet->data);
+
+ /* run work queue */
+ list_add(&head, &diskdump_workq);
+ list_del_init(&diskdump_workq);
+ n = head.next;
+ while (n != &head) {
+ work = list_entry(t, struct work_struct, entry);
+ n = n->next;
+ if (work->func)
+ work->func(work->wq_data);
+ }
+}
+
+void diskdump_lib_init(void)
+{
+ unsigned long long t;
+
+ /* Save original jiffies value */
+ diskdump_base_jiffies = jiffies;
+
+ platform_timestamp(timestamp_base);
+ udelay(1000000/HZ);
+ platform_timestamp(t);
+ timestamp_hz = (unsigned long)(t - timestamp_base);
+
+ diskdump_update();
+}
+
+void diskdump_lib_exit(void)
+{
+ /* Resotre original jiffies. */
+ jiffies = diskdump_base_jiffies;
+}
+
+EXPORT_SYMBOL(diskdump_lib_init);
+EXPORT_SYMBOL(diskdump_lib_exit);
+EXPORT_SYMBOL(diskdump_update);
+EXPORT_SYMBOL(diskdump_add_timer);
+EXPORT_SYMBOL(diskdump_del_timer);
+EXPORT_SYMBOL(diskdump_mod_timer);
+EXPORT_SYMBOL(diskdump_tasklet_schedule);
+EXPORT_SYMBOL(diskdump_schedule_work);
+EXPORT_SYMBOL(diskdump_schedule_timeout);
+
+MODULE_LICENSE("GPL");
diff -Nur linux-2.6.6.org/kernel/timer.c linux-2.6.6/kernel/timer.c
--- linux-2.6.6.org/kernel/timer.c 2004-06-04 21:21:58.000000000 +0900
+++ linux-2.6.6/kernel/timer.c 2004-06-17 18:28:01.000000000 +0900
@@ -255,6 +255,10 @@
{
BUG_ON(!timer->function);

+ if (crashdump_mode()) {
+ return diskdump_mod_timer(timer, expires);
+ }
+
check_timer(timer);

/*
@@ -286,6 +290,10 @@
unsigned long flags;
tvec_base_t *base;

+ if (crashdump_mode()) {
+ return diskdump_del_timer(timer);
+ }
+
check_timer(timer);

repeat:
@@ -327,6 +335,10 @@
tvec_base_t *base;
int i, ret = 0;

+ if (crashdump_mode()) {
+ return diskdump_del_timer(timer);
+ }
+
check_timer(timer);

del_again:
@@ -1070,6 +1082,10 @@
struct timer_list timer;
unsigned long expire;

+ if (crashdump_mode()) {
+ return diskdump_schedule_timeout(timeout);
+ }
+
switch (timeout)
{
case MAX_SCHEDULE_TIMEOUT:
diff -Nur linux-2.6.6.org/kernel/workqueue.c linux-2.6.6/kernel/workqueue.c
--- linux-2.6.6.org/kernel/workqueue.c 2004-06-04 21:21:58.000000000 +0900
+++ linux-2.6.6/kernel/workqueue.c 2004-06-17 18:30:31.000000000 +0900
@@ -386,7 +386,10 @@

int fastcall schedule_work(struct work_struct *work)
{
- return queue_work(keventd_wq, work);
+ if (crashdump_mode())
+ return diskdump_schedule_work(work);
+ else
+ return queue_work(keventd_wq, work);
}

int fastcall schedule_delayed_work(struct work_struct *work, unsigned long delay)


2004-06-17 12:00:25

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Thu, Jun 17, 2004 at 08:34:37PM +0900, Takao Indoh wrote:
> Instead of redefinition, I add hooks into the timer/tasklet routines.
>
> ex.
>
> int mod_timer(struct timer_list *timer, unsigned long expires)
> {
> BUG_ON(!timer->function);
>
> + if (crashdump_mode()) {
> + return diskdump_mod_timer(timer, expires);
> + }
>
> check_timer(timer);
>
> Please see the following patches for details.
> How do you think?

Looks good in principle, some more comments on the actual patch below:

> diff -Nur linux-2.6.6.org/include/linux/diskdumplib.h linux-2.6.6/include/linux/diskdumplib.h
> --- linux-2.6.6.org/include/linux/diskdumplib.h 1970-01-01 09:00:00.000000000 +0900
> +++ linux-2.6.6/include/linux/diskdumplib.h 2004-06-17 18:28:01.000000000 +0900

Should probably just go to linux/dump.h or dump_priv.h

> diff -Nur linux-2.6.6.org/include/linux/interrupt.h linux-2.6.6/include/linux/interrupt.h
> --- linux-2.6.6.org/include/linux/interrupt.h 2004-06-04 21:22:09.000000000 +0900
> +++ linux-2.6.6/include/linux/interrupt.h 2004-06-17 18:28:01.000000000 +0900
> @@ -7,6 +7,7 @@
> #include <linux/linkage.h>
> #include <linux/bitops.h>
> #include <linux/preempt.h>
> +#include <linux/diskdumplib.h>
> #include <asm/atomic.h>
> #include <asm/hardirq.h>
> #include <asm/ptrace.h>
> @@ -172,6 +173,11 @@
>
> static inline void tasklet_schedule(struct tasklet_struct *t)
> {
> + if (crashdump_mode()) {
> + diskdump_tasklet_schedule(t);
> + return;
> + }
> +
> if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
> __tasklet_schedule(t);
> }

Please sprinclde unlikely's all over here. Also the above could be
shortened to

static inline void tasklet_schedule(struct tasklet_struct *t)
{
if (unlikely(crashdump_mode()))
diskdump_tasklet_schedule(t);
else if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
__tasklet_schedule(t);
}

> +int diskdump_schedule_work(struct work_struct *work)
> +{
> + list_add_tail(&work->entry, &diskdump_workq);
> + return 1;
> +}

Should probably just inlined, I guess the function call is bigger
than all of list_add_tail.

> +void diskdump_add_timer(struct timer_list *timer)
> +{
> + timer->base = (void *)1;
> + list_add(&timer->entry, &diskdump_timers);
> +}

dito.

> + /* run timers */
> + list_for_each_safe(t, n, &diskdump_timers) {
> + timer = list_entry(t, struct timer_list, entry);

list_for_each_entry_safe here please.

> @@ -255,6 +255,10 @@
> {
> BUG_ON(!timer->function);
>
> + if (crashdump_mode()) {
> + return diskdump_mod_timer(timer, expires);
> + }

please remove the superflous braces here.

2004-06-17 12:13:07

by Ingo Molnar

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function


* Takao Indoh <[email protected]> wrote:

> int mod_timer(struct timer_list *timer, unsigned long expires)
> {
> BUG_ON(!timer->function);
>
> + if (crashdump_mode()) {
> + return diskdump_mod_timer(timer, expires);
> + }

looks good. Please use the proper coding style:

+ if (crashdump_mode())
+ return diskdump_mod_timer(timer, expires);

but there's another possible method (suggested by Alan Cox) that
requires no changes to the timer API hotpaths: 'clear' all timer lists
upon a crash [once all CPUs have stopped and irqs are disabled] and just
let the drivers use the normal timer APIs. Drive timer execution via a
polling method.

this basically approximates your polling based implementation but uses
the existing kernel timer data structures and timer mechanism so should
be robust and compatible. It doesnt rely on any previous state (because
all currently pending timers are discarded) so it's as crash-safe as
possible.

what do you think?

Ingo

2004-06-17 12:18:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Btw, now that we got you in the loop, any chance to see a forward-port
of netdump to 2.6? I think diskdump and netdump could share a lot of
infrastructure, and given we already have the net polling hooks adding
netdump shouldn't be that much work anymore.

2004-06-17 12:32:49

by Ingo Molnar

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function


* Christoph Hellwig <[email protected]> wrote:

> Btw, now that we got you in the loop, any chance to see a forward-port
> of netdump to 2.6? I think diskdump and netdump could share a lot of
> infrastructure, and given we already have the net polling hooks adding
> netdump shouldn't be that much work anymore.

i think a forward port of netdump might already exist - Jeff, Dave?

i agree that netdump and diskdump should be merged. (Red Hat is involved
in the diskdump project too so this is an ultimate goal even though the
patches are divergent.) Basically diskdumping is another IO transport -
the format, userspace tools and much of the non-IO kernel mechanism is
shared. Diskdumping is more complex on the driver level and it also
needs to be more careful because it writes to media so it verifies
various assumptions by reading on-disk sectors before writing to the
area.

Ingo

2004-06-17 12:44:44

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Hi,

Thank you for comment. After I fix these codes, I'll post new version of
diskdump.

Best Regards,
Takao Indoh


>> diff -Nur linux-2.6.6.org/include/linux/diskdumplib.h linux-2.6.6/include
>> /linux/diskdumplib.h
>> --- linux-2.6.6.org/include/linux/diskdumplib.h 1970-01-01 09:00:00.
>> 000000000 +0900
>> +++ linux-2.6.6/include/linux/diskdumplib.h 2004-06-17 18:28:01.000000000
>> +0900
>
>Should probably just go to linux/dump.h or dump_priv.h
>
>> diff -Nur linux-2.6.6.org/include/linux/interrupt.h linux-2.6.6/include/
>> linux/interrupt.h
>> --- linux-2.6.6.org/include/linux/interrupt.h 2004-06-04 21:22:09.
>> 000000000 +0900
>> +++ linux-2.6.6/include/linux/interrupt.h 2004-06-17 18:28:01.000000000 +

>> 0900
>> @@ -7,6 +7,7 @@
>> #include <linux/linkage.h>
>> #include <linux/bitops.h>
>> #include <linux/preempt.h>
>> +#include <linux/diskdumplib.h>
>> #include <asm/atomic.h>
>> #include <asm/hardirq.h>
>> #include <asm/ptrace.h>
>> @@ -172,6 +173,11 @@
>>
>> static inline void tasklet_schedule(struct tasklet_struct *t)
>> {
>> + if (crashdump_mode()) {
>> + diskdump_tasklet_schedule(t);
>> + return;
>> + }
>> +
>> if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
>> __tasklet_schedule(t);
>> }
>
>Please sprinclde unlikely's all over here. Also the above could be
>shortened to
>
>static inline void tasklet_schedule(struct tasklet_struct *t)
>{
> if (unlikely(crashdump_mode()))
> diskdump_tasklet_schedule(t);
> else if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
> __tasklet_schedule(t);
>}
>
>> +int diskdump_schedule_work(struct work_struct *work)
>> +{
>> + list_add_tail(&work->entry, &diskdump_workq);
>> + return 1;
>> +}
>
>Should probably just inlined, I guess the function call is bigger
>than all of list_add_tail.
>
>> +void diskdump_add_timer(struct timer_list *timer)
>> +{
>> + timer->base = (void *)1;
>> + list_add(&timer->entry, &diskdump_timers);
>> +}
>
>dito.
>
>> + /* run timers */
>> + list_for_each_safe(t, n, &diskdump_timers) {
>> + timer = list_entry(t, struct timer_list, entry);
>
>list_for_each_entry_safe here please.
>
>> @@ -255,6 +255,10 @@
>> {
>> BUG_ON(!timer->function);
>>
>> + if (crashdump_mode()) {
>> + return diskdump_mod_timer(timer, expires);
>> + }
>
>please remove the superflous braces here.
>

2004-06-17 13:03:39

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Thu, 17 Jun 2004 14:13:56 +0200, Ingo Molnar wrote:

>but there's another possible method (suggested by Alan Cox) that
>requires no changes to the timer API hotpaths: 'clear' all timer lists
>upon a crash [once all CPUs have stopped and irqs are disabled] and just
>let the drivers use the normal timer APIs. Drive timer execution via a
>polling method.
>
>this basically approximates your polling based implementation but uses
>the existing kernel timer data structures and timer mechanism so should
>be robust and compatible. It doesnt rely on any previous state (because
>all currently pending timers are discarded) so it's as crash-safe as
>possible.
>
>what do you think?
>

It sounds good because change of timer/tasklet code is not needed.
But, I wonder whether this method is safe. For example, if kernel
crashes because of problem of timer, clearing lists may be dangerous
before dumping. Is it possible to clear all timer lists safely?

Best Regards,
Takao Indoh

2004-06-17 13:12:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function


* Ingo Molnar <[email protected]> wrote:

> * Takao Indoh <[email protected]> wrote:
>
> > It sounds good because change of timer/tasklet code is not needed.
> > But, I wonder whether this method is safe. For example, if kernel
> > crashes because of problem of timer, clearing lists may be dangerous
> > before dumping. Is it possible to clear all timer lists safely?
>
> yes it can be done safely - just INIT_LIST_HEAD() all the timer list
> heads - like init_timers_cpu() does.

obviously this only involves the dumping CPU - no other CPU will run any
kernel code. On SMP you should also clear the timer spinlock of the
dumping CPU's timer base, if the crash happened within the timer code.

Ingo

2004-06-17 13:12:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function


* Takao Indoh <[email protected]> wrote:

> It sounds good because change of timer/tasklet code is not needed.
> But, I wonder whether this method is safe. For example, if kernel
> crashes because of problem of timer, clearing lists may be dangerous
> before dumping. Is it possible to clear all timer lists safely?

yes it can be done safely - just INIT_LIST_HEAD() all the timer list
heads - like init_timers_cpu() does.

Ingo

2004-06-17 13:17:12

by Ingo Molnar

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function


* Ingo Molnar <[email protected]> wrote:

> > yes it can be done safely - just INIT_LIST_HEAD() all the timer list
> > heads - like init_timers_cpu() does.
>
> obviously this only involves the dumping CPU - no other CPU will run
> any kernel code. On SMP you should also clear the timer spinlock of
> the dumping CPU's timer base, if the crash happened within the timer
> code.

put another way: you can consider 'safe dumping' to involve a simplified
bootup and initialization of the kernel's data structures that you need
during the dump, to create a barrier for any (possibly corrupted) kernel
data state prior the crash.

you need to 'boot up' your dumping data structures, the driver (and
hardware) you are going to use for dumping, and the timer code as well.

(total separation is not possible because e.g. the dumping code's
pagetable entries are a data structure too that could get corrupted - on
the other hand complete separation is not necessary because crash
statistics show that this is not a likely event - it is much more likely
to get failure due to hw errors.)

Ingo

2004-06-17 13:59:22

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Thu, 17 Jun 2004 15:11:40 +0200, Ingo Molnar wrote:

>
>* Ingo Molnar <[email protected]> wrote:
>
>> * Takao Indoh <[email protected]> wrote:
>>
>> > It sounds good because change of timer/tasklet code is not needed.
>> > But, I wonder whether this method is safe. For example, if kernel
>> > crashes because of problem of timer, clearing lists may be dangerous
>> > before dumping. Is it possible to clear all timer lists safely?
>>
>> yes it can be done safely - just INIT_LIST_HEAD() all the timer list
>> heads - like init_timers_cpu() does.
>
>obviously this only involves the dumping CPU - no other CPU will run any
>kernel code. On SMP you should also clear the timer spinlock of the
>dumping CPU's timer base, if the crash happened within the timer code.
>
> Ingo


How about this?

void clear_timers(void)
{
int j;
tvec_base_t *base;

base = &per_cpu(tvec_bases, smp_processor_id());
spin_lock_init(&base->lock);
for (j = 0; j < TVN_SIZE; j++) {
INIT_LIST_HEAD(base->tv5.vec + j);
INIT_LIST_HEAD(base->tv4.vec + j);
INIT_LIST_HEAD(base->tv3.vec + j);
INIT_LIST_HEAD(base->tv2.vec + j);
}
for (j = 0; j < TVR_SIZE; j++)
INIT_LIST_HEAD(base->tv1.vec + j);

base->timer_jiffies = jiffies;
}

And new function which runs timer during dumping is needed,
like __run_timers() does...

Best Regards,
Takao Indoh

2004-06-17 14:45:53

by Nobuhiro Tachino

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Hello,

Takao Indoh wrote:
> On Thu, 17 Jun 2004 15:11:40 +0200, Ingo Molnar wrote:
>
>
>>* Ingo Molnar <[email protected]> wrote:
>>
>>
>>>* Takao Indoh <[email protected]> wrote:
>>>
>>>
>>>>It sounds good because change of timer/tasklet code is not needed.
>>>>But, I wonder whether this method is safe. For example, if kernel
>>>>crashes because of problem of timer, clearing lists may be dangerous
>>>>before dumping. Is it possible to clear all timer lists safely?
>>>
>>>yes it can be done safely - just INIT_LIST_HEAD() all the timer list
>>>heads - like init_timers_cpu() does.
>>
>>obviously this only involves the dumping CPU - no other CPU will run any
>>kernel code. On SMP you should also clear the timer spinlock of the
>>dumping CPU's timer base, if the crash happened within the timer code.
>>
>> Ingo
>
>
>
> How about this?
>
> void clear_timers(void)
> {
> int j;
> tvec_base_t *base;
>
> base = &per_cpu(tvec_bases, smp_processor_id());
> spin_lock_init(&base->lock);
> for (j = 0; j < TVN_SIZE; j++) {
> INIT_LIST_HEAD(base->tv5.vec + j);
> INIT_LIST_HEAD(base->tv4.vec + j);
> INIT_LIST_HEAD(base->tv3.vec + j);
> INIT_LIST_HEAD(base->tv2.vec + j);
> }
> for (j = 0; j < TVR_SIZE; j++)
> INIT_LIST_HEAD(base->tv1.vec + j);
>
> base->timer_jiffies = jiffies;
> }

I think you should save original values to somewhere else, so you can
refer these values from vmcore if you want.

2004-06-17 14:52:28

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Thu, 17 Jun 2004 10:45:19 -0400, Nobuhiro Tachino wrote:

>I think you should save original values to somewhere else, so you can
>refer these values from vmcore if you want.

Yes, I should. It is necessary if kernel crashed because of timer
problem. Thanks!

Best Regards,
Takao Indoh

2004-06-17 15:02:53

by Jeff Moyer

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

==> Regarding Re: [3/4] [PATCH]Diskdump - yet another crash dump function; Ingo Molnar <[email protected]> adds:

mingo> * Christoph Hellwig <[email protected]> wrote:

>> Btw, now that we got you in the loop, any chance to see a forward-port
>> of netdump to 2.6? I think diskdump and netdump could share a lot of
>> infrastructure, and given we already have the net polling hooks adding
>> netdump shouldn't be that much work anymore.

mingo> i think a forward port of netdump might already exist - Jeff, Dave?

Yes, I ported the code forward to 2.6. The netpoll infrastructure needed a
little tweaking to accommodate netdump, but nothing major. Namely, we need
to reset some locks, and I added an element to the netpoll data structure
for the dump function. I also updated the zap_completion_queue function to
touch the nmi watchdog.

mingo> i agree that netdump and diskdump should be merged. (Red Hat is
mingo> involved in the diskdump project too so this is an ultimate goal
mingo> even though the patches are divergent.) Basically diskdumping is
mingo> another IO transport - the format, userspace tools and much of the
mingo> non-IO kernel mechanism is shared. Diskdumping is more complex on
mingo> the driver level and it also needs to be more careful because it
mingo> writes to media so it verifies various assumptions by reading
mingo> on-disk sectors before writing to the area.

I'm not quite sure what infrastructure would be shared between the two.
Page selection, perhaps? Anything else?

-Jeff

2004-06-17 15:46:13

by Nobuhiro Tachino

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Jeff Moyer wrote:
> ==> Regarding Re: [3/4] [PATCH]Diskdump - yet another crash dump function; Ingo Molnar <[email protected]> adds:
>
> mingo> * Christoph Hellwig <[email protected]> wrote:
>
>
>>>Btw, now that we got you in the loop, any chance to see a forward-port
>>>of netdump to 2.6? I think diskdump and netdump could share a lot of
>>>infrastructure, and given we already have the net polling hooks adding
>>>netdump shouldn't be that much work anymore.
>
>
> mingo> i think a forward port of netdump might already exist - Jeff, Dave?
>
> Yes, I ported the code forward to 2.6. The netpoll infrastructure needed a
> little tweaking to accommodate netdump, but nothing major. Namely, we need
> to reset some locks, and I added an element to the netpoll data structure
> for the dump function. I also updated the zap_completion_queue function to
> touch the nmi watchdog.
>
> mingo> i agree that netdump and diskdump should be merged. (Red Hat is
> mingo> involved in the diskdump project too so this is an ultimate goal
> mingo> even though the patches are divergent.) Basically diskdumping is
> mingo> another IO transport - the format, userspace tools and much of the
> mingo> non-IO kernel mechanism is shared. Diskdumping is more complex on
> mingo> the driver level and it also needs to be more careful because it
> mingo> writes to media so it verifies various assumptions by reading
> mingo> on-disk sectors before writing to the area.
>
> I'm not quite sure what infrastructure would be shared between the two.
> Page selection, perhaps? Anything else?

I think freezing other CPUs code can be shared, but I can't find more.

2004-06-18 12:01:29

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Thu, 17 Jun 2004 14:13:56 +0200, Ingo Molnar wrote:

>but there's another possible method (suggested by Alan Cox) that
>requires no changes to the timer API hotpaths: 'clear' all timer lists
>upon a crash [once all CPUs have stopped and irqs are disabled] and just
>let the drivers use the normal timer APIs. Drive timer execution via a
>polling method.
>
>this basically approximates your polling based implementation but uses
>the existing kernel timer data structures and timer mechanism so should
>be robust and compatible. It doesnt rely on any previous state (because
>all currently pending timers are discarded) so it's as crash-safe as
>possible.

This is a test patch for clearing timer/tasklet and executing it
during dumping.
This patch does not work yet. I tried dumping using this patch, but
aic7xxx is unstable...


diff -Nur linux-2.6.6.org/include/linux/interrupt.h linux-2.6.6/include/linux/interrupt.h
--- linux-2.6.6.org/include/linux/interrupt.h 2004-06-04 21:22:09.000000000 +0900
+++ linux-2.6.6/include/linux/interrupt.h 2004-06-18 20:53:59.000000000 +0900
@@ -246,4 +246,8 @@
extern int probe_irq_off(unsigned long); /* returns 0 or negative on failure */
extern unsigned int probe_irq_mask(unsigned long); /* returns mask of ISA interrupts */

+
+extern void diskdump_clear_tasklet(void);
+extern void diskdump_run_tasklet(void);
+
#endif
diff -Nur linux-2.6.6.org/include/linux/timer.h linux-2.6.6/include/linux/timer.h
--- linux-2.6.6.org/include/linux/timer.h 2004-06-04 21:22:06.000000000 +0900
+++ linux-2.6.6/include/linux/timer.h 2004-06-18 20:53:59.000000000 +0900
@@ -96,4 +96,7 @@
extern void run_local_timers(void);
extern void it_real_fn(unsigned long);

+extern void diskdump_clear_timers(void);
+extern void diskdump_run_timers(void);
+
#endif
diff -Nur linux-2.6.6.org/include/linux/workqueue.h linux-2.6.6/include/linux/workqueue.h
--- linux-2.6.6.org/include/linux/workqueue.h 2004-06-04 21:22:09.000000000 +0900
+++ linux-2.6.6/include/linux/workqueue.h 2004-06-18 20:53:59.000000000 +0900
@@ -84,4 +84,7 @@
return ret;
}

+extern void diskdump_clear_workqueue(void);
+extern void diskdump_run_workqueue(void);
+
#endif
diff -Nur linux-2.6.6.org/kernel/softirq.c linux-2.6.6/kernel/softirq.c
--- linux-2.6.6.org/kernel/softirq.c 2004-06-04 21:21:58.000000000 +0900
+++ linux-2.6.6/kernel/softirq.c 2004-06-18 20:53:59.000000000 +0900
@@ -314,6 +314,37 @@

EXPORT_SYMBOL(tasklet_kill);

+struct tasklet_head saved_tasklet;
+void diskdump_clear_tasklet(void)
+{
+ saved_tasklet.list = __get_cpu_var(tasklet_vec).list;
+ __get_cpu_var(tasklet_vec).list = NULL;
+}
+
+EXPORT_SYMBOL(diskdump_clear_tasklet);
+
+void diskdump_run_tasklet(void)
+{
+ struct tasklet_struct *list;
+
+ list = __get_cpu_var(tasklet_vec).list;
+ __get_cpu_var(tasklet_vec).list = NULL;
+
+ while (list) {
+ struct tasklet_struct *t = list;
+ list = list->next;
+
+ if (!atomic_read(&t->count) &&
+ (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state)))
+ t->func(t->data);
+
+ t->next = __get_cpu_var(tasklet_vec).list;
+ __get_cpu_var(tasklet_vec).list = t;
+ }
+}
+
+EXPORT_SYMBOL(diskdump_run_tasklet);
+
void __init softirq_init(void)
{
open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
diff -Nur linux-2.6.6.org/kernel/timer.c linux-2.6.6/kernel/timer.c
--- linux-2.6.6.org/kernel/timer.c 2004-06-04 21:21:58.000000000 +0900
+++ linux-2.6.6/kernel/timer.c 2004-06-18 20:53:59.000000000 +0900
@@ -31,6 +31,7 @@
#include <linux/time.h>
#include <linux/jiffies.h>
#include <linux/cpu.h>
+#include <linux/delay.h>

#include <asm/uaccess.h>
#include <asm/div64.h>
@@ -1070,6 +1071,12 @@
struct timer_list timer;
unsigned long expire;

+ if (unlikely(crashdump_mode())) {
+ mdelay(timeout);
+ set_current_state(TASK_RUNNING);
+ return timeout;
+ }
+
switch (timeout)
{
case MAX_SCHEDULE_TIMEOUT:
@@ -1292,6 +1299,25 @@
base->timer_jiffies = jiffies;
}

+tvec_base_t saved_tvec_base;
+void diskdump_clear_timers(void)
+{
+ tvec_base_t *base = &per_cpu(tvec_bases, smp_processor_id());
+
+ memcpy(&saved_tvec_base, base, sizeof(saved_tvec_base));
+ init_timers_cpu(smp_processor_id());
+}
+
+EXPORT_SYMBOL(diskdump_clear_timers);
+
+void diskdump_run_timers(void)
+{
+ tvec_base_t *base = &__get_cpu_var(tvec_bases);
+ __run_timers(base);
+}
+
+EXPORT_SYMBOL(diskdump_run_timers);
+
#ifdef CONFIG_HOTPLUG_CPU
static int migrate_timer_list(tvec_base_t *new_base, struct list_head *head)
{
diff -Nur linux-2.6.6.org/kernel/workqueue.c linux-2.6.6/kernel/workqueue.c
--- linux-2.6.6.org/kernel/workqueue.c 2004-06-04 21:21:58.000000000 +0900
+++ linux-2.6.6/kernel/workqueue.c 2004-06-18 20:53:59.000000000 +0900
@@ -420,6 +420,36 @@

}

+struct cpu_workqueue_struct saved_cwq;
+void diskdump_clear_workqueue(void)
+{
+ int cpu = smp_processor_id();
+ struct cpu_workqueue_struct *cwq = keventd_wq->cpu_wq + cpu;
+
+ memcpy(&saved_cwq, cwq, sizeof(saved_cwq));
+ spin_lock_init(&cwq->lock);
+ INIT_LIST_HEAD(&cwq->worklist);
+ init_waitqueue_head(&cwq->more_work);
+ init_waitqueue_head(&cwq->work_done);
+}
+
+void diskdump_run_workqueue(void)
+{
+ struct cpu_workqueue_struct *cwq;
+
+ cwq = keventd_wq->cpu_wq + smp_processor_id();
+ while (!list_empty(&cwq->worklist)) {
+ struct work_struct *work = list_entry(cwq->worklist.next,
+ struct work_struct, entry);
+ void (*f) (void *) = work->func;
+ void *data = work->data;
+
+ list_del_init(cwq->worklist.next);
+ clear_bit(0, &work->pending);
+ f(data);
+ }
+}
+
#ifdef CONFIG_HOTPLUG_CPU
/* Take the work from this (downed) CPU. */
static void take_over_work(struct workqueue_struct *wq, unsigned int cpu)
@@ -503,3 +533,6 @@
EXPORT_SYMBOL(schedule_delayed_work);
EXPORT_SYMBOL(flush_scheduled_work);

+EXPORT_SYMBOL(diskdump_clear_workqueue);
+EXPORT_SYMBOL(diskdump_run_workqueue);
+

2004-06-21 05:47:31

by Keith Owens

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Thu, 17 Jun 2004 14:13:56 +0200,
Ingo Molnar <[email protected]> wrote:
>but there's another possible method (suggested by Alan Cox) that
>requires no changes to the timer API hotpaths: 'clear' all timer lists
>upon a crash [once all CPUs have stopped and irqs are disabled] and just
>let the drivers use the normal timer APIs. Drive timer execution via a
>polling method.
>
>this basically approximates your polling based implementation but uses
>the existing kernel timer data structures and timer mechanism so should
>be robust and compatible. It doesnt rely on any previous state (because
>all currently pending timers are discarded) so it's as crash-safe as
>possible.

Don't forget live crash dumping. The system is running and is behaving
strangely so you want to take a dump for investigation, but you do not
want to kill the system afterwards. Live crash dumping is very useful
for problem diagnosis.

It is a little more complex than dumping after an oops because you must
not destroy any kernel data, including timer lists.

2004-06-21 06:24:14

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Hi,

On Mon, 21 Jun 2004 15:46:11 +1000, Keith Owens wrote:

>On Thu, 17 Jun 2004 14:13:56 +0200,
>Ingo Molnar <[email protected]> wrote:
>>but there's another possible method (suggested by Alan Cox) that
>>requires no changes to the timer API hotpaths: 'clear' all timer lists
>>upon a crash [once all CPUs have stopped and irqs are disabled] and just
>>let the drivers use the normal timer APIs. Drive timer execution via a
>>polling method.
>>
>>this basically approximates your polling based implementation but uses
>>the existing kernel timer data structures and timer mechanism so should
>>be robust and compatible. It doesnt rely on any previous state (because
>>all currently pending timers are discarded) so it's as crash-safe as
>>possible.
>
>Don't forget live crash dumping. The system is running and is behaving
>strangely so you want to take a dump for investigation, but you do not
>want to kill the system afterwards. Live crash dumping is very useful
>for problem diagnosis.
>It is a little more complex than dumping after an oops because you must
>not destroy any kernel data, including timer lists.

The method he proposed is used only when die/panic happens. Live dump
is a function crash command provides throuth /dev/mem.

Best Regards,
Takao Indoh

2004-06-21 20:40:57

by Nobuhiro Tachino

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Takao Indoh wrote:
> On Thu, 17 Jun 2004 14:13:56 +0200, Ingo Molnar wrote:
>
>
>>but there's another possible method (suggested by Alan Cox) that
>>requires no changes to the timer API hotpaths: 'clear' all timer lists
>>upon a crash [once all CPUs have stopped and irqs are disabled] and just
>>let the drivers use the normal timer APIs. Drive timer execution via a
>>polling method.
>>
>>this basically approximates your polling based implementation but uses
>>the existing kernel timer data structures and timer mechanism so should
>>be robust and compatible. It doesnt rely on any previous state (because
>>all currently pending timers are discarded) so it's as crash-safe as
>>possible.
>
>
> This is a test patch for clearing timer/tasklet and executing it
> during dumping.
> This patch does not work yet. I tried dumping using this patch, but
> aic7xxx is unstable...

Your new dump_run_timers() calls __run_timers() directly. I think that's
the reason of unstability. __run_timers() calls spin_unlock_irq()
and enables IRQ, but diskdump expects everything runs with IRQ
disabled.

2004-06-22 04:25:35

by Rob Landley

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Monday 21 June 2004 00:46, Keith Owens wrote:

> >this basically approximates your polling based implementation but uses
> >the existing kernel timer data structures and timer mechanism so should
> >be robust and compatible. It doesnt rely on any previous state (because
> >all currently pending timers are discarded) so it's as crash-safe as
> >possible.
>
> Don't forget live crash dumping. The system is running and is behaving
> strangely so you want to take a dump for investigation, but you do not
> want to kill the system afterwards. Live crash dumping is very useful
> for problem diagnosis.
>
> It is a little more complex than dumping after an oops because you must
> not destroy any kernel data, including timer lists.

How much does this differ from the proposed software suspend stuff based on
some of the crash dumping code?

If you've got software suspend and LVM snapshots, the combination of the two
could theoretically be pretty useful in an enterprise environment...

Rob
--
http://www.linucon.org: Linux Expo and Science Fiction Convention
October 8-10, 2004 in Austin Texas. (I'm the con chair.)

2004-06-22 07:55:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function


* Keith Owens <[email protected]> wrote:

> Don't forget live crash dumping. [...]

it's still possible, you'll have to save/restore the timer table,
instead of overwriting it.

Ingo

2004-06-22 14:48:08

by Ingo Molnar

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function


* Nobuhiro Tachino <[email protected]> wrote:

> Your new dump_run_timers() calls __run_timers() directly. I think
> that's the reason of unstability. __run_timers() calls
> spin_unlock_irq() and enables IRQ, but diskdump expects everything
> runs with IRQ disabled.

indeed!

luckily we can solve this in the upstream kernel without too much fuss,
see the patch below. All callers of __run_timers() run with irqs
enabled.

(NOTE: we unconditionally disable interrupts after having run the timer
fn - this solves the problem of a timer fn keeping irqs disabled.)

does this patch stabilize diskdump?

Ingo

--- linux/kernel/timer.c.orig
+++ linux/kernel/timer.c
@@ -423,8 +423,9 @@ static int cascade(tvec_base_t *base, tv
static inline void __run_timers(tvec_base_t *base)
{
struct timer_list *timer;
+ unsigned long flags;

- spin_lock_irq(&base->lock);
+ spin_lock_irqsave(&base->lock, flags);
while (time_after_eq(jiffies, base->timer_jiffies)) {
struct list_head work_list = LIST_HEAD_INIT(work_list);
struct list_head *head = &work_list;
@@ -453,14 +454,14 @@ repeat:
set_running_timer(base, timer);
smp_wmb();
timer->base = NULL;
- spin_unlock_irq(&base->lock);
+ spin_unlock_irqrestore(&base->lock, flags);
fn(data);
spin_lock_irq(&base->lock);
goto repeat;
}
}
set_running_timer(base, NULL);
- spin_unlock_irq(&base->lock);
+ spin_unlock_irqrestore(&base->lock, flags);
}

#ifdef CONFIG_NO_IDLE_HZ

2004-06-23 12:10:44

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

Hi,

On Tue, 22 Jun 2004 12:19:14 +0200, Ingo Molnar wrote:

>indeed!
>
>luckily we can solve this in the upstream kernel without too much fuss,
>see the patch below. All callers of __run_timers() run with irqs
>enabled.
>
>(NOTE: we unconditionally disable interrupts after having run the timer
>fn - this solves the problem of a timer fn keeping irqs disabled.)

My patch posted on 22 Jun had a bug!
http://marc.theaimsgroup.com/?l=linux-kernel&m=108791355225806&w=2
In the __run_timers, I missed spin_unlock_irq before fn(data). Your
patch correct rightly. Thanks!


>does this patch stabilize diskdump?

No. There is one thing I need correct. I need replace proc interface
with sysfs, as Christoph and Arjan commented.
After fix this, I'll release stable version 1.0.

Best Regards,
Takao Indoh

2004-06-23 12:59:42

by Takao Indoh

[permalink] [raw]
Subject: Re: [3/4] [PATCH]Diskdump - yet another crash dump function

On Wed, 23 Jun 2004 21:11:58 +0900, Takao Indoh wrote:

>>does this patch stabilize diskdump?
>
>No. There is one thing I need correct. I need replace proc interface
>with sysfs, as Christoph and Arjan commented.
>After fix this, I'll release stable version 1.0.

Sorry, I misread. Yes, diskdump is stable with your patch. Thanks.

Best Regards,
Takao Indoh