Dominic Cleal's Blog

Mon Mar 2 11:05:00 GMT 2009

BBS2 for a home NAS, part 4: OpenSolaris + RTL8111/8168B

As mentioned in part 2, I've been experiencing network dropouts when using the Tranquil PC BBS2 heavily with OpenSolaris 2008.11. It's been mostly fine and speedy when my home directory's mounted from it to my desktop, except when doing a demanding operation, like extracting multi-GB archives to and from the NAS over NFS.

The system would simply stop responding to any network traffic: no ARP responses, and so no pings or anything else. Logged into the system itself, I was also unable to use the network, and a mirrored switch port confirmed that no network traffic, not even ARP requests, was leaving the machine.

After about five or six minutes, the network would simply start working again. There were no obvious reasons why, no kernel messages, no trigger that seemed to restore it.

I've managed to reproduce this reliably using iperf to generate traffic back and forth over the interface. On the NAS, run the server portion:

iperf -s
And on another machine (mine's connected over GigE):
iperf -c server -d -t 180
This will run iperf in dual mode, sending and receiving simultaneously, to stress the network for three minutes. Normally, this is enough to trigger the bug after a couple of minutes. I've yet to reproduce it with one-way tests, so perhaps it's the heavy duplex traffic that's the trigger?
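To put a number on how long the NAS stays unreachable during a run like this, something along the following lines could be left running on the client. It's only a rough sketch: the function names are my own, the ping flags assume a Linux client with iputils ping (Solaris ping takes different arguments), and the one-second interval is arbitrary.

```shell
#!/bin/sh
# Hypothetical helpers, not part of iperf or any existing tool.

# Emit one "<epoch-seconds> <up|down>" line per second for the given host.
# Assumes iputils ping on Linux: -c 1 sends one probe, -W 1 waits 1 second.
probe() {
    while :; do
        if ping -c 1 -W 1 "$1" >/dev/null 2>&1; then
            state=up
        else
            state=down
        fi
        echo "$(date +%s) $state"
        sleep 1
    done
}

# Read "<epoch-seconds> <up|down>" lines on stdin and print one line per
# outage, measured from the first "down" sample to the next "up" sample.
summarise_outages() {
    awk '
        $2 == "down" && !in_outage { start = $1; in_outage = 1 }
        $2 == "up" && in_outage {
            printf "outage from %d to %d (%d s)\n", start, $1, $1 - start
            in_outage = 0
        }
        END { if (in_outage) printf "outage from %d still ongoing\n", start }
    '
}
```

Run `probe server > probe.log` in a second terminal alongside the iperf client, stop it with Ctrl-C once the network has recovered, then run `summarise_outages < probe.log` to list each dropout and its length.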

The network chipset is, unfortunately, a Realtek. prtconf -v lists it as a Realtek RTL8102, lspci under Linux lists it as RTL8111/8168B (rev 02), and lastly the motherboard specs list it as an RTL8111C. Under Solaris, this is all handled by the rge driver.

There are two mentions of this elsewhere. Daz writes about the same issue and the various fixes he's tried applying to get the RTL8111 card running properly under OpenSolaris, though he concludes by purchasing an Intel card. Unfortunately the BBS2 has only one PCI slot, which contains the SATA controller, so there's no option to add a replacement card (plus the PCI backplane isn't accessible through the case).

Secondly, there's a bug report which describes the same issue (currently accepted as a problem). Interestingly, it lists the bug as a regression since Solaris 10, so I may try installing that and testing it.

Update for concerned Linux users: the system seems to run fine with Linux, so I wouldn't worry. The only thing I'm told to watch out for is that this chipset is revision 2, so you need to make sure you have a very recent kernel (i.e. 2.6.26 or above). Older kernels apparently may detect the card but not pass any traffic.

Update for concerned OpenSolaris users: an engineer from the kernel team is currently investigating this issue. Will post more when there's news/progress.
