There’s a definite pattern here, and it’s clearly tied to xhci_hcd, usb-storage and the Akasa/Genesys card reader. Here’s what I’ve done since reporting this:
1) Set ”Legacy USB 3.0” in the BIOS from ”Enabled” to ”Disabled”, and ”Intel xHCI Mode” from ”Smart Auto” to ”Enabled”. I tested briefly with the latter set to ”Disabled” and Legacy USB 3.0 set to ”Enabled”, but then the card reader wasn’t detected at all, and I’d prefer a working xHCI anyway.
2) Switched to mainline kernel 3.8.0-030800rc2-generic #201301022235. With the earlier kernels (3.5 and 3.2 from the Precise repo) things seemed similar to my findings with mainline below, but I have too little data from 3.2 and 3.5 to say conclusively that there’s no difference at all. I’ve concentrated my testing on mainline just to keep things simpler.
Here’s the pattern with mainline:
1) Cold boot. Early in the boot, the card reader in USB #4 is asked to reset. This results in a flood of ”xHCI xhci_drop_endpoint called with disabled ep ffff880403c8d500”:
Jan 8 10:32:12 saegusa kernel: [ 721.015249] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 8 10:32:12 saegusa kernel: [ 721.032596] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff880403c8d500
Jan 8 10:32:12 saegusa kernel: [ 721.032599] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff880403c8d540
When these messages are present, the session will crash (panic or freeze) at some point. Using the card reader isn’t necessary (the session eventually panics without it too), but sticking an SD card into the reader is an easy way to trigger it. The reader reports buffer errors on the card and then boom.
Jan 8 10:32:13 saegusa kernel: [ 721.713295] sd 8:0:0:2: [sdf] Unhandled error code
Jan 8 10:32:13 saegusa kernel: [ 721.713299] sd 8:0:0:2: [sdf]
Jan 8 10:32:13 saegusa kernel: [ 721.713300] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Jan 8 10:32:13 saegusa kernel: [ 721.713303] sd 8:0:0:2: [sdf] CDB:
Jan 8 10:32:13 saegusa kernel: [ 721.713304] Read(10): 28 00 00 00 20 00 00 00 08 00
Jan 8 10:32:13 saegusa kernel: [ 721.713314] end_request: I/O error, dev sdf, sector 8192
Jan 8 10:32:13 saegusa kernel: [ 721.713318] Buffer I/O error on device sdf1, logical block 0
Another USB #4-related message that often precedes a panic in syslog is this:
Jan 8 17:38:21 saegusa kernel: [ 148.282256] usb 4-4: Disable of device-initiated U1 failed.
Jan 8 17:38:21 saegusa kernel: [ 148.285747] usb 4-4: Disable of device-initiated U2 failed.
The panics, when visible, are always in Pid: usb-storage, and mostly of the ”ring_doorbell_for_active_rings” type (above), but I did have at least one ”warn_slowpath_common” too (will attach a picture if requested).
2) Reboot after the panic. USB #4 doesn’t get reset, and no ”disabled ep” or ”Disable of device-initiated …” messages appear in syslog. The card reader works perfectly (i.e. an SD card can be inserted and read/written without problems).
3) The panic isn’t necessary to get the card reader working: it’s enough to cold boot once, let the reset be sent in that session, and then reboot. Just don’t stay in that cold boot session, because it’ll panic eventually.
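As a cross-check on the log under 1), the Read(10) CDB and the end_request line should describe the same read. This is a minimal sketch, assuming the standard SCSI READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8). decode_read10 is a hypothetical helper name used only for illustration:

```python
def decode_read10(cdb_hex):
    """Return (lba, blocks) for a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return lba, blocks


# CDB copied from the syslog excerpt above
lba, blocks = decode_read10("28 00 00 00 20 00 00 00 08 00")
print(lba, blocks)  # 8192 8
```

The decoded LBA agrees with ”I/O error, dev sdf, sector 8192”, so the failing request is an 8-block read starting at sector 8192. That sector is reported as logical block 0 of sdf1, so presumably it is where the partition begins.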
I’ll attach a complete syslog from yesterday and today to give context to what I’ve quoted above.
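For anyone going through the syslog, the warning signs quoted above can be counted with a short script. This is only a sketch: the regular expressions are written against the exact messages quoted in this comment, and the names (WARNING_PATTERNS, count_warning_signs, session_at_risk) are made up for illustration. It counts over whatever lines it is given and does not split the log into boot sessions.

```python
import re

# Patterns built from the messages quoted above (usb 4-4 is the card reader here).
WARNING_PATTERNS = {
    "reset": re.compile(r"usb 4-4: reset SuperSpeed USB device number \d+ using xhci_hcd"),
    "disabled_ep": re.compile(r"xhci_drop_endpoint called with disabled ep [0-9a-f]+"),
    "u1_u2": re.compile(r"usb 4-4: Disable of device-initiated U[12] failed"),
}


def count_warning_signs(lines):
    """Count how many of the given log lines match each warning pattern."""
    counts = {name: 0 for name in WARNING_PATTERNS}
    for line in lines:
        for name, pattern in WARNING_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def session_at_risk(counts):
    """True if any 'disabled ep' or U1/U2 message was seen in the lines counted."""
    return counts["disabled_ep"] > 0 or counts["u1_u2"] > 0
```

To use it, pass an open file, e.g. `count_warning_signs(open("/var/log/syslog", errors="replace"))`, and feed the result to `session_at_risk`. The path is the usual Ubuntu location, so adjust it if the log lives elsewhere.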