#293190 - 31/01/2007 04:08
sigkill error in ext3 code, Hijack v467
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
I've had mp3tofid + rsync working for a while now, and it's made life easier. Until tonight, that is. While doing a pretty routine sync, I got a sigkill error on the screen. I brought the empeg upstairs and reproduced the error with a serial console, and here's the debug output: Code:
Adding Swap: 16028k swap-space (priority -1)
Assertion failure in do_get_write_access() at transaction.c line 554: "handle->h_buffer_credits > 0
Unable to handle kernel NULL pointer dereference at virtual address 00000000
memmap = C0E18000, pgd = c0e18000
*pgd = c0b24801, *pmd = c0b24801, *pte = c000000b, *ppte = c000000a
Internal error: Oops: 0
CPU: 0
pc : [<c00514b8>] lr : [<c00198b0>]
sp : c0af3dcc ip : c0af3d88 fp : c0af3e00
r10: 00000000 r9 : 00000001 r8 : c0aeed40
r7 : c0de5500 r6 : c01f2a40 r5 : c0a4dba0 r4 : 00000000
r3 : 00000000 r2 : c0112264 r1 : 00000001 r0 : 00000068
Flags: nZCv IRQs on FIQs on Mode SVC_32 Segment user
Control: C0E1917D Table: C0E1917D DAC: 00000015
Process rsync (pid: 66, stackpage=c0af3000)
Stack:
c0af3da0: c00198b0
c0af3dc0: c00514b8 60000013 ffffffff c00fa860 c0a4dbb8 00000000 c0de5500 c0a4dba0
c0af3de0: c01f2a40 000011ed 000004a9 c0ddbd20 00000000 c0af3e24 c0af3e04 c005184c
c0af3e00: c0051104 c0defadc 000011ec c0defa00 c0a4dba0 000011ed c0af3e64 c0af3e28
c0af3e20: c00587c0 c005181c c0af3e6c c0defa30 c0deb400 c0de5500 c0e04580 c06d6258
c0af3e40: c0af3f48 c0de5500 009436e2 009436e2 00000008 c0a48820 c0af3e8c c0af3e68
c0af3e60: c005b980 c0058424 00000000 c0af3f48 00000000 c06d6258 c0de5500 00000014
c0af3e80: c0af3ecc c0af3e90 c005bf5c c005b940 00000002 c0a4d120 c0fc7aa0 c03bec80
c0af3ea0: 0000040b 00000100 c06d6258 00000008 00000014 c0de5500 c0af3f48 00000001
c0af3ec0: c0af3f10 c0af3ed0 c005c36c c005bdf8 00000001 00000400 00000014 c0af3f48
c0af3ee0: 00000010 00000008 00000000 00000400 c0a3a400 c06d6258 c0de5500 00000a0c
c0af3f00: 00005000 c0af3f80 c0af3f14 c0059c4c c005c10c c0af3f48 c0af3f4c 00000001
c0af3f20: 00000000 c0defa00 00000014 00005000 00000000 00000014 c0678e54 022bbe50
c0af3f40: c0678e40 c0a4dc20 ffffffe4 c0fc7aa0 c03bec80 0000040b c06d62a4 c0678e40
c0af3f60: ffffffea c00596cc 00005a0c 00000000 022b6e50 c0af3fb0 c0af3f84 c0031c94
c0af3f80: c00596d8 c000ab88 c000a9f0 00005a0c c000a268 00000004 c0af2000 c0af3ff4
c0af3fa0: 00000001 00000000 c0af3fb4 c000a0c0 c0031af0 00005a0c 00000007 022b6e50
c0af3fc0: 00005a0c 02041360 00005a0c 022b6e50 00000000 00000007 00005a0c 00000000
c0af3fe0: 00000001 bfffca2c 02043548 bfffca10 02016088 400c3eb4 60000010 00000007
Backtrace:
Function entered at [<c00510f8>] from [<c005184c>]
r10 = 00000000 r9 = C0DDBD20 r8 = 000004A9 r7 = 000011ED r6 = C01F2A40 r5 = C0A4DBA0 r4 = C0DE5500
Function entered at [<c0051810>] from [<c00587c0>]
r7 = 000011ED r6 = C0A4DBA0 r5 = C0DEFA00 r4 = 000011EC
Function entered at [<c0058418>] from [<c005b980>]
r10 = C0A48820 r9 = 00000008 r8 = 009436E2 r7 = 009436E2 r6 = C0DE5500 r5 = C0AF3F48 r4 = C06D6258
Function entered at [<c005b934>] from [<c005bf5c>]
r7 = 00000014 r6 = C0DE5500 r5 = C06D6258 r4 = 00000000
Function entered at [<c005bdec>] from [<c005c36c>]
r10 = 00000001 r9 = C0AF3F48 r8 = C0DE5500 r7 = 00000014 r6 = 00000008 r5 = C06D6258 r4 = 00000100
Function entered at [<c005c100>] from [<c0059c4c>]
r10 = 00005000 r9 = 00000A0C r8 = C0DE5500 r7 = C06D6258 r6 = C0A3A400 r5 = 00000400 r4 = 00000000
Function entered at [<c00596cc>] from [<c0031c94>]
r10 = 022B6E50 r9 = 00000000 r8 = 00005A0C r7 = C00596CC r6 = FFFFFFEA r5 = C0678E40 r4 = C06D62A4
Function entered at [<c0031ae4>] from [<c000a0c0>]
r10 = 00000001 r8 = C0AF3FF4 r7 = C0AF2000 r6 = 00000004 r5 = C000A268 r4 = 00005A0C
Code: ebff2079 e3a03000 (e5c33000) e5973004 e2433001
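The backtrace above only gives raw addresses. As a rough sketch of how they could be turned into function names, the Python below pulls the "Function entered at" pairs out of the oops text and resolves each address to the nearest preceding symbol in a System.map-style listing. The helper names (`backtrace_frames`, `load_system_map`, `resolve`) are made up for illustration, and it assumes a symbol map matching the exact Hijack kernel build is available; without that, the names it reports would be meaningless.

```python
import bisect
import re

# Matches the ARM backtrace lines printed in the oops above.
FRAME_RE = re.compile(
    r"Function entered at \[<([0-9a-f]{8})>\] from \[<([0-9a-f]{8})>\]"
)


def backtrace_frames(oops_text):
    """Return (entry_address, caller_address) pairs from the oops text."""
    return [(int(a, 16), int(b, 16)) for a, b in FRAME_RE.findall(oops_text)]


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted (address, name) list."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Map an address to 'name+0xoffset' using the nearest symbol below it."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below every known symbol
    base, name = symbols[index]
    return f"{name}+0x{address - base:x}"
```

Feeding the pasted console output to `backtrace_frames` and each resulting address to `resolve` would give a readable call chain, provided the map really belongs to the running kernel.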
transaction.c appears to be in the ext3 code. I have no idea what handle->h_buffer_credits is, but it appears to be related to the journaling. My uneducated guess is that the journal file was in a state that made the kernel unhappy. With this in mind, I'm fscking the drive overnight and will retry tomorrow with a hopefully fixed journal, but I wanted to post the error here to see if this is an error condition that can be handled better or something.
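For readers wondering what the failed check means: in ext3's journaling layer, a handle is started with a fixed budget of buffer credits that limits how many buffers it may modify in one transaction. The toy model below illustrates that idea only; the `Handle` class and its methods are invented for this sketch, and the real accounting in the Hijack kernel's transaction.c may differ in detail.

```python
class Handle:
    """Toy stand-in for a journal handle; only the credit budget is modelled."""

    def __init__(self, buffer_credits):
        self.buffer_credits = buffer_credits
        self.journaled_blocks = []

    def get_write_access(self, block):
        # Same condition as the assertion text in the oops: a credit must
        # remain before another buffer can be added to the transaction.
        if not self.buffer_credits > 0:
            raise AssertionError("handle->h_buffer_credits > 0")
        self.buffer_credits -= 1
        self.journaled_blocks.append(block)


def touch_blocks(handle, blocks):
    """Request write access to each block in turn, spending one credit each."""
    for block in blocks:
        handle.get_write_access(block)
```

In this model, an operation that touches more blocks than it reserved credits for trips the check, which is one plausible reading of the failure here: the caller underestimated its needs, rather than the on-disk journal being damaged.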
|
#293191 - 31/01/2007 05:01
Re: sigkill error in ext3 code, Hijack v467
[Re: tonyc]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Well, fsck didn't do the trick. It said "rebuilding journal" or somesuch during the run, so I thought that would fix the problem, but sure enough, rsync died in the same spot.
I'm reverting to ext2 now. Taking ext3 out of the equation should at least let me sync again... But I'm really curious as to what's happening here.
|
#293192 - 31/01/2007 05:05
Re: sigkill error in ext3 code, Hijack v467
[Re: tonyc]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
After reverting drives and kernel to ext2, my rsync completes successfully. So whatever it is, it's in ext3, and fixing the journal doesn't seem to solve the problem.
|
#293193 - 31/01/2007 14:26
Re: sigkill error in ext3 code, Hijack v467
[Re: tonyc]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
You could try removing the journal and recreating it. That would be "tune2fs -O ^has_journal /dev/hda", IIRC.
Also, this is clearly karmic payback for making fun of my recent long ext2 fsck.
_________________________
Bitt Faulk
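The journal removal and re-creation suggested above could be scripted roughly as follows. This is a sketch, not a tested procedure for the empeg: the function name is invented, the device path is a parameter because the right partition depends on the player's disk layout, and it assumes the `tune2fs` and `e2fsck` binaries are available wherever the commands are run. The filesystem has to be unmounted (or mounted read-only) before the journal feature can be cleared.

```python
def journal_rebuild_commands(device):
    """Build the command sequence to drop and re-create an ext3 journal.

    Nothing is executed here; the caller decides whether to run the
    commands or just print them for review.
    """
    return [
        # Clear the has_journal feature, turning the filesystem back into ext2.
        ["tune2fs", "-O", "^has_journal", device],
        # Force a full check while no journal is present.
        ["e2fsck", "-f", device],
        # Create a fresh journal, making the filesystem ext3 again.
        ["tune2fs", "-j", device],
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True)` in order, stopping at the first failure.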
|
#293194 - 31/01/2007 14:26
Re: sigkill error in ext3 code, Hijack v467
[Re: tonyc]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Quote: After reverting drives and kernel to ext2, my rsync completes successfully. So whatever it is, it's in ext3, and fixing the journal doesn't seem to solve the problem.
There's a reason why ext3 is *not* the default in Hijack! The ext3 code that we have there is *very early code*. It mostly works, and quite well, but hundreds of bugs have been fixed upstream since those days.
Cheers
|
#293195 - 31/01/2007 15:05
Re: sigkill error in ext3 code, Hijack v467
[Re: wfaulk]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Yeah, I already removed the journal as part of reverting to ext2. When I have some time I'll go back to ext3 and try again.
And, yeah, karma's a bitch. :/
|
#293196 - 31/01/2007 15:07
Re: sigkill error in ext3 code, Hijack v467
[Re: mlord]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Yeah, I understand that. I just thought it was worth reporting in case there was an easy fix. Is updating to more recent ext3 code practical at all?
|
#293197 - 31/01/2007 20:27
Re: sigkill error in ext3 code, Hijack v467
[Re: tonyc]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Quote: Yeah, I understand that. I just thought it was worth reporting in case there was an easy fix. Is updating to more recent ext3 code practical at all?
I'm not interested, but I do accept patches.
|
#293198 - 31/01/2007 21:30
Re: sigkill error in ext3 code, Hijack v467
[Re: mlord]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Right, and I've sent you patches before. My question is, in your opinion, would bringing ext3 support up to more recent versions of the code be a difficult undertaking in terms of conflict with other files that may have been modified as part of the development of Hijack?
|
#293199 - 31/01/2007 22:11
Re: sigkill error in ext3 code, Hijack v467
[Re: tonyc]
|
carpal tunnel
Registered: 29/08/2000
Posts: 14496
Loc: Canada
|
Quote: Right, and I've sent you patches before. My question is, in your opinion, would bringing ext3 support up to more recent versions of the code be a difficult undertaking in terms of conflict with other files that may have been modified as part of the development of Hijack?
Nope. The ext3 code lives pretty much in its own self-contained directory under fs/.
The original patch to include it from scratch was fairly invasive, but just updating it by a few minor kernel releases should be okay.
There's zero chance of the current ext3 code being usable, though. But perhaps a newer backport in the 2.2.xx kernel series would be easy, and a backport from an early 2.4.xx kernel might also be remotely possible.
Cheers