2004-09-20 11:08:26

by Ingo Freund

Subject: three days running fine, then memory allocation errors

Hello,

I hope you guys can help. I cannot use any 2.4 kernel newer than 2.4.23
without hitting the problem described here.

Searching the web for solutions to my problem, I have already found
a thread in a mailing list, but no solution was mentioned, and the
guys who discussed the error didn't answer my direct mail.

The machine is a dual-Xeon database server with no other service
except sshd running. I check the ICP-Vortex GDT controller
every 2 minutes by using
# cat /proc/scsi/gdth/2
but the output of cat stops before it is complete.

This is what I see in the syslog file every time I use the cat
command (the messages begin after 3 days of uptime):
--> /var/log/messages
kernel: __alloc_pages: 0-order allocation failed (gfp=0x21/0)

What do you suggest I do so that I can get the information I need for
longer than three days without a reboot? This is a heavily used database
server in a production environment.

Kernel version (from /proc/version):
Linux version 2.4.27 (root@widbrz01) (gcc version 3.3.1


# cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 2118139904 2074345472 43794432 0 151343104 1742090240
Swap: 6407458816 48291840 6359166976
MemTotal: 2068496 kB
MemFree: 42768 kB
MemShared: 0 kB
Buffers: 147796 kB
Cached: 1694548 kB
SwapCached: 6712 kB
Active: 223620 kB
Inactive: 1709760 kB
HighTotal: 1179628 kB
HighFree: 2080 kB
LowTotal: 888868 kB
LowFree: 40688 kB
SwapTotal: 6257284 kB
SwapFree: 6210124 kB

# cat /proc/sys/kernel/shmmax
1069547520

# cat /proc/sys/kernel/shmall
1073741824

Please let me know if there is any information you need.
Thanks in advance for your answer,
regards
ingo.
--
// ---------------------------------------------------------------------
// e-dict GmbH & Co. KG
// Ingo Freund
// Alter Steinweg 3
// D-20459 Hamburg/Germany E-Mail: [email protected]
// ---------------------------------------------------------------------
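
For readers who want to watch the low/high memory split from the meminfo dump above over time, here is a minimal sketch. It assumes the 2.4-style "Name: <n> kB" lines shown in the post; the function names parse_meminfo and summarize are made up for illustration, and SAMPLE is only an excerpt of the values quoted above.

```python
# Sketch: summarise low/high memory from 2.4-style /proc/meminfo text.
# SAMPLE is an excerpt of the values quoted in the post; the helper names
# are hypothetical and not part of any kernel tool.

SAMPLE = """\
MemTotal: 2068496 kB
MemFree: 42768 kB
HighTotal: 1179628 kB
HighFree: 2080 kB
LowTotal: 888868 kB
LowFree: 40688 kB
"""


def parse_meminfo(text):
    """Return {field: kB} for every 'Name: <n> kB' line; skip other lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Percentage of low and high memory that is still free."""
    return {
        "low_free_pct": 100.0 * fields["LowFree"] / fields["LowTotal"],
        "high_free_pct": 100.0 * fields["HighFree"] / fields["HighTotal"],
    }


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    for key, value in summarize(info).items():
        print(f"{key}: {value:.1f}%")
```

Note that most of the used memory in the dump is page cache (Cached: 1694548 kB), so a small MemFree on its own does not show a leak; the trend across several samples is what matters.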


2004-09-20 14:25:39

by Marcelo Tosatti

Subject: Re: three days running fine, then memory allocation errors


Achim, I believe there is a memory leak (maybe several) in gdth's proc handling
code. Can you please take a look at it?

Ingo, can you give the attached patch a test and show us the result?
You should get "gdth_alloc:x gdth_free:y" in /var/log/messages
at each read of /proc/scsi/gdth/xx.

In normal server operation, just don't "cat /proc/scsi/gdth/.." and your server
should be stable.
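
Once the counters show up in the log, comparing them by hand gets tedious, so here is a rough sketch of how the lines could be checked. It assumes the line format is literally "gdth_alloc:x gdth_free:y" as described above and that both counters are cumulative; the real patch output may differ. The function names and the sample lines are invented for illustration.

```python
import re

# Assumed line format, taken from the description "gdth_alloc:x gdth_free:y";
# the real patch may print it differently.
COUNTER_RE = re.compile(r"gdth_alloc:\s*(\d+)\s+gdth_free:\s*(\d+)")


def outstanding_allocations(lines):
    """Return alloc - free for every line that carries both counters."""
    result = []
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            allocs, frees = int(match.group(1)), int(match.group(2))
            result.append(allocs - frees)
    return result


def looks_like_leak(deltas):
    """True if the outstanding count grew between the first and last read.

    Assumes the counters are cumulative since boot.
    """
    return len(deltas) >= 2 and deltas[-1] > deltas[0]


if __name__ == "__main__":
    # Invented sample lines for illustration only, not real driver output.
    sample = [
        "kernel: gdth_alloc:10 gdth_free:10",
        "kernel: unrelated message",
        "kernel: gdth_alloc:20 gdth_free:18",
    ]
    deltas = outstanding_allocations(sample)
    print(deltas, looks_like_leak(deltas))
```

In practice the lines would come from /var/log/messages, for example by passing an open file object to outstanding_allocations.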



Attachments:
gdth.patch (1.69 kB)

2004-09-20 14:58:30

by Ingo Freund

Subject: RE: three days running fine, then memory allocation errors

Thank you for the answer.
Well, I'll stop my requests to the driver's output immediately.

The problem is that I only get the errors on one machine.
Others (with less memory) don't react this way.
It will take some time to include the patch and report
the output. I have to reboot the machine after installing the
patch and the new kernel build. This can only happen in certain
time windows.
Is it necessary to wait until the error occurs or do you only
want some output?

Bye - Ingo.



2004-09-20 15:12:14

by Marcelo Tosatti

Subject: Re: three days running fine, then memory allocation errors

On Mon, Sep 20, 2004 at 04:58:02PM +0200, Ingo Freund wrote:
> Thank you for the answer.
> Well, I'll stop my requests to the driver's output immediately.
>
> The problem is that I only get the errors on one machine.
> Others (with less memory) don't react this way.

Do the others also have the same gdth controllers? Is the disk configuration
similar (number of disks, etc.)?

> It will take some time to include the patch and report
> the output. I have to reboot the machine after installing the
> patch and the new kernel build. This can only happen in certain
> time windows.

Understood.

> Is it necessary to wait until the error occurs or do you only
> want some output?

Only some output - it will show us whether the /proc handling function
is correctly freeing some of the memory it allocates.

I forgot to CC Achim in the first message, done now.


2004-09-20 15:17:45

by Ingo Freund

Subject: RE: three days running fine, then memory allocation errors

> On Mon, Sep 20, 2004 at 04:58:02PM +0200, Ingo Freund wrote:
> > Thank you for the answer.
> > Well, I'll stop my requests to the driver's output immediately.
> >
> > The problem is that I only get the errors on one machine.
> > Others (with less memory) don't react this way.
>
> Do the others also have the same gdth controllers? Is the disk configuration
> similar (number of disks, etc.)?

No, not really. The others run RAID1 (2 SATA disks); the affected machine runs
RAID1 + 5 (SCSI disks) on several disks, and so on...


2004-09-20 15:22:23

by Marcelo Tosatti

Subject: Re: three days running fine, then memory allocation errors

On Mon, Sep 20, 2004 at 05:17:17PM +0200, Ingo Freund wrote:
> > On Mon, Sep 20, 2004 at 04:58:02PM +0200, Ingo Freund wrote:
> > > Thank you for the answer.
> > > Well, I'll stop my requests to the driver's output immediately.
> > >
> > > The problem is that I only get the errors on one machine.
> > > Others (with less memory) don't react this way.
> >
> > Do the others also have the same gdth controllers? Is the disk configuration
> > similar (number of disks, etc.)?
>
> No, not really. The others run RAID1 (2 SATA disks); the affected machine runs
> RAID1 + 5 (SCSI disks) on several disks, and so on...

OK, so you see the problem is gdth specific... I've seen other users report
the same issue.

2004-09-21 11:37:46

by Ingo Freund

Subject: RE: three days running fine, then memory allocation errors

Marcelo Tosatti [mailto:[email protected]] wrote:
>
> On Mon, Sep 20, 2004 at 04:58:02PM +0200, Ingo Freund wrote:
> >
> > The problem is, that I only get the errors on one machine.
> > Others (with less memory) don't react this way.
>
> > Is it necessary to wait until the error occurs or do you only
> > want some output?
>
> Only some output - it will show us whether the /proc handling function
> is correctly freeing some of the memory it allocates.
>

The patch is applied.
There is no output in either the syslog file or dmesg, and I am not good
enough at C or kernel programming to see why.

This is the output of the cat command on /proc/scsi/gdth/2 when it works.
Another strange thing is the time the command needs to finish.
# time cat /proc/scsi/gdth/2
Driver Parameters:
reserve_mode: 1 reserve_list: --
max_ids: 127 hdr_channel: 0

Disk Array Controller Information:
Number: 0 Name: GDT8543RZ
Driver Ver.: 2.05 Firmware Ver.: 2.28.07-R051
Serial No.: 0x47C24685 Cache RAM size: 262144 KB

Physical Devices:
Chn/ID/LUN: A/08/0 Name: SEAGATE ST373307LC 0007
Capacity [MB]: 70006 To Log. Drive: 2
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: A/09/0 Name: SEAGATE ST373307LC 0007
Capacity [MB]: 70006 To Log. Drive: 9
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: A/10/0 Name: SEAGATE ST373307LC 0007
Capacity [MB]: 70006 To Log. Drive: 8
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: A/11/0 Name: SEAGATE ST373307LC 0007
Capacity [MB]: 70006 To Log. Drive: 7
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: B/12/0 Name: SEAGATE ST373307LC 0007
Capacity [MB]: 70006 To Log. Drive: 4
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: B/13/0 Name: SEAGATE ST373307LC 0007
Capacity [MB]: 70006 To Log. Drive: 5
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: B/14/0 Name: SEAGATE ST373307LC 0007
Capacity [MB]: 70006 To Log. Drive: 6
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: D/00/0 Name: IBM IC35L036UCD210-0S5BS
Capacity [MB]: 35002 To Log. Drive: 0
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: D/01/0 Name: IBM IC35L036UCD210-0S5BS
Capacity [MB]: 35002 To Log. Drive: 0
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: D/02/0 Name: IBM IC35L036UCD210-0S5BS
Capacity [MB]: 35002 To Log. Drive: 10
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: D/03/0 Name: IBM IC35L018UCD210-0S5BS
Capacity [MB]: 17501 To Log. Drive: 1
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: D/04/0 Name: IBM IC35L018UCD210-0S5BS
Capacity [MB]: 17501 To Log. Drive: 1
Retries: 0 Reassigns: 0
Grown Defects: 0

Chn/ID/LUN: D/05/0 Name: IBM DDYS-T18350M SA2A
Capacity [MB]: 17501 To Log. Drive: 3
Retries: 0 Reassigns: 0
Grown Defects: 0

Logical Drives:
Number: 0 Status: ok
Capacity [MB]: 35002 Type: RAID-1
Slave Number: 21 Status: ok
Missing Drv.: 0 Invalid Drv.: 0
To Array Drv.: --

Number: 1 Status: ok
Capacity [MB]: 17501 Type: RAID-1
Slave Number: 26 Status: ok
Missing Drv.: 0 Invalid Drv.: 0
To Array Drv.: --

Number: 2 Status: ok
Capacity [MB]: 70006 Type: Disk
To Array Drv.: 2

Number: 3 Status: ok
Capacity [MB]: 17501 Type: Disk
To Array Drv.: --

Number: 4 Status: ok
Capacity [MB]: 70006 Type: Disk
To Array Drv.: 2

Number: 5 Status: ok
Capacity [MB]: 70006 Type: Disk
To Array Drv.: 2

Number: 6 Status: ok
Capacity [MB]: 70006 Type: Disk
To Array Drv.: 2

Number: 7 Status: ok
Capacity [MB]: 70006 Type: Disk
To Array Drv.: 2

Number: 8 Status: ok
Capacity [MB]: 70006 Type: Disk
To Array Drv.: 2

Number: 9 Status: ok
Capacity [MB]: 70006 Type: Disk
To Array Drv.: 2

Number: 10 Status: ok
Capacity [MB]: 35002 Type: Disk
To Array Drv.: --

Array Drives:
Number: 2 Status: ready
Capacity [MB]: 210019 Type: RAID-5

Host Drives:
Number: 0 Arr/Log. Drive: 0
Capacity [MB]: 35000 Start Sector: 0

Number: 1 Arr/Log. Drive: 1
Capacity [MB]: 17500 Start Sector: 0

Number: 2 Arr/Log. Drive: 2
Capacity [MB]: 210013 Start Sector: 0

Number: 3 Arr/Log. Drive: 3
Capacity [MB]: 17500 Start Sector: 0

Controller Events:

real 0m8.626s
user 0m0.000s
sys 0m0.000s

2004-09-23 10:54:19

by Ingo Freund

Subject: RE: three days running fine, then memory allocation errors

Today the problem with reading the controller's /proc entry came up again,
this time after about two days of uptime:
reboot system boot 2.4.27 Tue Sep 21 10:35 (2+02:14)
Sep 23 11:14:09 server01 kernel: __alloc_pages: 0-order allocation failed (gfp=0x21/0)
and still no output concerning memory usage of the controller.

Is Achim online?
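
As a side note on the gfp=0x21 value in the log line, the sketch below decodes the mask into flag names. The flag values are believed to match the 2.4 include/linux/mm.h definitions, but treat them as an assumption and check them against your own kernel tree; decode_gfp is a made-up helper name.

```python
# Flag values believed to match 2.4's include/linux/mm.h. This table is an
# assumption for illustration; verify it against the kernel source in use.
GFP_FLAGS_2_4 = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}


def decode_gfp(mask):
    """Return the names of the flags set in mask, lowest bit first.

    Bits that are not in the table are reported as one hex string.
    """
    names = [name for bit, name in sorted(GFP_FLAGS_2_4.items()) if mask & bit]
    known = 0
    for bit in GFP_FLAGS_2_4:
        known |= bit
    unknown = mask & ~known
    if unknown:
        names.append(hex(unknown))
    return names


if __name__ == "__main__":
    print(decode_gfp(0x21))
```

If that table is right, 0x21 has __GFP_DMA and __GFP_HIGH set and __GFP_WAIT clear, which would mean an atomic (non-sleeping) request for a page from the DMA zone, the first 16 MB on x86. /proc/meminfo does not show that zone separately, so the LowFree figure alone cannot confirm or rule this out.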


2004-09-23 11:18:50

by Leubner, Achim

Subject: RE: three days running fine, then memory allocation errors

Yes, I'm online and I will try to reproduce the error.
