Re: 6.2-RELEASE-p4/SMP: fault on nofault entry

From: Gert Lynge (none@gert--lynge.org.lh.bsd-dk.dk)
Date: Mon 17 Sep 2007 - 09:06:44 CEST


From: "Gert Lynge" <none@gert--lynge.org.lh.bsd-dk.dk>
To: <none@bsd-dk--bsd-dk.dk.lh.bsd-dk.dk>
Subject: Re: 6.2-RELEASE-p4/SMP: fault on nofault entry
Date: Mon, 17 Sep 2007 09:06:44 +0200

Hi again, list,

I just got a new panic, this time with a bit more information but still no dump file:

---
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 2
Fault virtual address = 0x5a
Fault code = supervisor read, page not present
Instruction pointer = 0x20:0xc07e40a3
Stack pointer = 0x28:0xe6a35b74
Frame pointer = 0x28:0xe6a35c40
Code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
Processor eflags = interrupt enabled, resume, IOPL = 0
Current process = 44 (pagezero)
Trap number = 12
Panic: page fault
cpuid = 2
Uptime 17d22h47m51s
Dumping 2048 MB (2 chunks)
  Chunk 0: 1MB (150 pages)ipfw: xxxxxxxx
ipfw: xxxxxxxxxx
---
(the above was transcribed by hand from the server's console)

This time I did NOT have xcache or the HTTP accept filters enabled, so I think we can rule those out. The server is a Supermicro with ECC RAM and an Intel Core 2 Quad Q6700.

Does anyone have a hint as to what might be wrong, and whether it is software or hardware related? It is a 6.2 box that is fully up to date via "freebsd-update", and I regularly (manually) run portsnap/portupgrade on it. It does not have the very latest apache22 release, though, because I would like to get this problem sorted out before I change too much...

By the way, is it normal for ipfw to keep writing to the console after a panic? (The second ipfw line appeared while I was standing there transcribing the console, at least 10 minutes after the panic.)

Regards,
Gert Lynge

-----Original message-----
From: Gert Lynge [mailto:gert@lynge.org]
Sent: 5 September 2007 09:06
To: 'bsd-dk@bsd-dk.dk'
Subject: 6.2-RELEASE-p4/SMP: fault on nofault entry

Hi list,

Every now and then I get this one:

panic: vm_fault: fault on nofault entry, addr: e92cf000
cpuid = 3
Uptime: 10d0h24m25s
Dumping 2046 MB (2 chunks)
  chunk 0: 1MB (150 pages) ... Ok
  chunk 1: 2046MB (523744 pages)_

(I am not sure the address above is the same every time, since I am not always the one who gets to reboot the server.)

...and this is where the server dies, i.e. it hangs and gets no further. A few times it has also reset by itself, but in those cases I have no idea what was on the console. After a reboot there is nothing in /var/crash (even though I have configured dumpdev). It appears to be load related (a moderately busy Apache/MySQL server), and it can happen anywhere from several times a day to about once a week or once every two weeks. It is at its worst when Apache uses the HTTP accept filters, so since the server is in production I have turned them off for now... There is _NOTHING_ in the log files that smells like an error, and even the IPMI card's log is empty.

Any ideas? Does it smell like hardware or software?

PS: I have a slight suspicion about xcache (a PHP accelerator), so I have just tried turning that off as well... And strictly speaking I have not had a reboot since.

PPS: I run freebsd-update/portsnap/portupgrade regularly (especially because I have this problem), so the kernel and ports should be up to date.

Regards,
Gert Lynge

---
ws# uname -a
FreeBSD x.x.x 6.2-RELEASE-p4 FreeBSD 6.2-RELEASE-p4 #0: Thu Apr 26 17:55:55 UTC 2007 root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/SMP i386
---
ws# cat /etc/rc.conf
defaultrouter="x.x.x.x"
font8x14="cp865-8x14"
font8x16="cp865-8x16"
font8x8="cp865-8x8"
hostname="x.x.x"
ifconfig_em1="inet x.x.x.x netmask x.x.x.x"
saver="daemon"
usbd_enable="YES"
keymap="danish.cp865"
keyrate="fast"
sshd_enable="YES"
firewall_enable="YES"
firewall_type="x"
firewall_logging="YES"
ntpd_enable="YES"
ntpd_sync_on_start="YES"
mysql_enable="YES"
apache22_enable="YES"
#apache22_http_accept_enable="YES"
inetd_enable="YES"
clear_tmp_enable="YES"
#log_in_vain="1"
sendmail_enable="YES"
rsyncd_enable="YES"
syslogd_flags="-a x.x.x.x/x:*"
clamav_freshclam_enable="YES"
local_startup="/usr/local/etc/rc.d"
dumpdev="/dev/ar0s1b"
---
ws# cat /var/run/dmesg.boot
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
  The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE-p4 #0: Thu Apr 26 17:55:55 UTC 2007
  root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/SMP
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Quad CPU Q6700 @ 2.66GHz (2660.01-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x6fb Stepping = 11
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,<b9>,CX16,<b14>,<b15>>
  AMD Features=0x20000000<LM>
  AMD Features2=0x1<LAHF>
  Cores per package: 4
real memory = 2146304000 (2046 MB)
avail memory = 2095165440 (1998 MB)
ACPI APIC Table: <PTLTD APIC >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: <PTLTD RSDT> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
pci9: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 0.0 on pci9
pci10: <ACPI PCI bus> on pcib3
pci9: <base peripheral, interrupt controller> at device 0.1 (no driver attached)
pcib4: <ACPI PCI-PCI bridge> irq 17 at device 28.4 on pci0
pci13: <ACPI PCI bus> on pcib4
em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x4000-0x401f mem 0xe0200000-0xe021ffff irq 16 at device 0.0 on pci13
em0: Ethernet address: 00:30:48:8d:1f:5e
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.5 on pci0
pci14: <ACPI PCI bus> on pcib5
em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x5000-0x501f mem 0xe0300000-0xe031ffff irq 17 at device 0.0 on pci14
em1: Ethernet address: 00:30:48:8d:1f:5f
uhci0: <UHCI (generic) USB controller> port 0x3000-0x301f irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <UHCI (generic) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <UHCI (generic) USB controller> port 0x3020-0x303f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <UHCI (generic) USB controller> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <UHCI (generic) USB controller> port 0x3040-0x305f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <UHCI (generic) USB controller> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <UHCI (generic) USB controller> port 0x3060-0x307f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: <UHCI (generic) USB controller> on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem 0xe0000000-0xe00003ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: <Intel 82801GB/R (ICH7) USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
pcib6: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci15: <ACPI PCI bus> on pcib6
pci15: <display, VGA> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x30a0-0x30af at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <Intel ICH7 SATA300 controller> port 0x30e8-0x30ef,0x30dc-0x30df,0x30e0-0x30e7,0x30d8-0x30db,0x30b0-0x30bf mem 0xe0000400-0xe00007ff irq 19 at device 31.2 on pci0
atapci1: AHCI Version 01.10 controller with 4 ports detected
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1
ata4: <ATA channel 2> on atapci1
ata5: <ATA channel 3> on atapci1
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
pmtimer0 on isa0
ipmi0: <IPMI System Interface> on isa0
ipmi0: KCS mode found at io 0xca8 alignment 0x4 on isa
orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcb000-0xcf7ff,0xcf800-0xd07ff,0xd0800-0xd17ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDROM <CD-224E-N/1.AA> at ata0-master UDMA33
ad4: 76319MB <Seagate ST380811AS 3.AAE> at ata2-master SATA300
ad6: 76319MB <Seagate ST380811AS 3.AAE> at ata3-master SATA300
ipmi0: IPMI device rev. 0, firmware rev. 2.2, version 2.0
ipmi0: Number of channels 4
ipmi0: Attached watchdog
ar0: 76316MB <Intel MatrixRAID RAID1> status: READY
ar0: disk0 READY (master) using ad4 at ata2-master
ar0: disk1 READY (mirror) using ad6 at ata3-master
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/ar0s1a
WARNING: / was not properly dismounted
WARNING: /tmp was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted
ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to deny, logging disabled


