Bug 98767
Summary: | (NET 3C59X) TCP Transmit errors on 3C905TX cards | ||
---|---|---|---|
Product: | [Retired] Red Hat Raw Hide | Reporter: | Dan Egli <dan> |
Component: | kernel | Assignee: | John W. Linville <linville> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 1.0 | CC: | alexl, barryn, davej, dgenn, edwinh, eep2, emmanuel, hps, hugh_caley, iainr, jim, jlcthibo, kevymac, misek, mkanat, nbryant, pavelr, r.pallucchini, tmokros, yaoz |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i586 | ||
OS: | Linux | ||
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2004-10-30 03:39:03 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 100643 | | |
Description
Dan Egli
2003-07-08 17:40:36 UTC
*** Bug 101427 has been marked as a duplicate of this bug. ***

Now I'm confused. I specifically stated that this should NOT be a kernel bug since three different kernels (two stock kernels from distributions, one downloaded from kernel.org and compiled) all do the exact same thing. So given that fact, perhaps someone can explain why this was then re-flagged as a kernel bug?

This problem occurs as well on a clean install of Fedora Core test2.

I've seen the same thing with 2 3c905 cards on an old PII that I use for testing; swapping in a 3c905b, everything seems to be OK, which seems a bit odd. I swap between two HDs, one with RH9 and the other with RH Rawhide: the 3c905 works on RH9 but not on Rawhide.

I, too, am using an AMD: Athlon XP 1700+, 1 GB of RAM.

Oct 4 01:21:37 kernel: 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
Oct 4 01:21:37 kernel: See Documentation/networking/vortex.txt
Oct 4 01:21:37 kernel: 00:09.0: 3Com PCI 3c905 Boomerang 100baseTx at 0xd000. Vers LK1.1.18-ac
Oct 4 01:21:37 kernel: 00:60:08:19:c5:a4, IRQ 10
Oct 4 01:21:37 kernel: product code 4b4b rev 00.0 date 06-13-97
Oct 4 01:21:37 kernel: 64K word-wide RAM 1:1 Rx:Tx split, autoselect/10baseT interface.
Oct 4 01:21:37 kernel: Enabling bus-master transmits and whole-frame receives.
Oct 4 01:21:37 kernel: 00:09.0: scatter/gather enabled. h/w checksums disabled
Oct 4 01:21:37 kernel: eth0: Dropping NETIF_F_SG since no checksum feature.
Oct 4 01:21:37 kernel: eth0: Transmit error, Tx status register d0.
Oct 4 01:21:37 kernel: Flags; bus-master 1, dirty 1(1) current 1(1)
Oct 4 01:21:37 kernel: Transmit list 00000000 vs. f2916240.
Oct 4 01:21:37 kernel: 0: @f2916200 length 8000002a status 8000002a
Oct 4 01:21:37 kernel: 1: @f2916240 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 2: @f2916280 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 3: @f29162c0 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 4: @f2916300 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 5: @f2916340 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 6: @f2916380 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 7: @f29163c0 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 8: @f2916400 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 9: @f2916440 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 10: @f2916480 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 11: @f29164c0 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 12: @f2916500 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 13: @f2916540 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 14: @f2916580 length 00000000 status 00000000
Oct 4 01:21:37 kernel: 15: @f29165c0 length 00000000 status 00000000
Oct 4 01:21:39 network: Bringing up interface eth0: succeeded

Same here with Fedora Core test2 + rawhide. Bug doesn't happen with 2.6.0-test6.1.49 kernel.

Can you please try using mii-tool to force the card not to use autonegotiation and see if that helps at all? I had a similar problem and this remedied it.

Any chance this is related to bug 98832? Does removing kudzu from the boot fix anything?

Dan: All of the kernels you compared are too similar to say for sure that this isn't a kernel bug.

Everyone: Is anyone experiencing this problem *without* rhgb? If you're using rhgb, try booting with "nogui" and see if the problem goes away.
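For anyone comparing logs across machines, the "Transmit error" lines quoted in this report can be tallied with a short script. The following is only a minimal sketch: it assumes the message format seen in this thread (`ethN: Transmit error, Tx status register XX.`), and the function name `tally_tx_errors` and the sample lines are illustrative, not part of any existing tool. It does not try to interpret the status byte.

```python
import re
from collections import Counter

# Matches the 3c59x message format quoted in this report, e.g.
# "eth0: Transmit error, Tx status register d0."
# (Assumption: other kernel versions may word this differently.)
TX_ERROR_RE = re.compile(
    r"(?P<iface>\w+): Transmit error, "
    r"Tx status register (?P<status>[0-9a-fA-F]{1,2})\."
)


def tally_tx_errors(lines):
    """Count Transmit error messages per (interface, status byte)."""
    counts = Counter()
    for line in lines:
        match = TX_ERROR_RE.search(line)
        if match:
            key = (match.group("iface"), int(match.group("status"), 16))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the logs posted in this bug.
    sample = [
        "Oct 4 01:21:37 kernel: eth0: Dropping NETIF_F_SG since no checksum feature.",
        "Oct 4 01:21:37 kernel: eth0: Transmit error, Tx status register d0.",
        "Jun 21 16:24:36 pinzolo kernel: eth0: Transmit error, Tx status register 82.",
    ]
    for (iface, status), n in sorted(tally_tx_errors(sample).items()):
        print(f"{iface}: status 0x{status:02x} seen {n} time(s)")
```

To run it against a real log, pass an open file instead of the sample list, for example `tally_tx_errors(open("/var/log/messages"))`; the path is an assumption and depends on the syslog configuration.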
I have no idea what rhgb is, but the results of booting BREAKME with kudzu not loading and with the nogui kernel parameter are as follows:

kernel command line: linux
kudzu status: on
result: same errors

kernel command line: linux
kudzu status: off
result: seems fine

kernel command line: linux nogui
kudzu status: on
result: same errors

kernel command line: linux nogui
kudzu status: off
result: seems fine

So it appears that, for whatever reason, kudzu is the culprit here.

rhgb = Red Hat Graphical Boot. It's what you get by default if you do not specify "nogui". In my case, disabling kudzu doesn't make a difference, but "nogui" does...

I see it with nogui boot. Also, I don't start the interface on boot, but start it manually later instead. P.S. 2.6.0 kernels do not have this problem, so it probably IS a kernel bug.

Same problem with Fedora 0.95 (test 3) on a 3c905-TX. nogui had no effect, but turning kudzu off fixes the problem.

*** Bug 100470 has been marked as a duplicate of this bug. ***

*** Bug 105684 has been marked as a duplicate of this bug. ***

Donald Becker, the maintainer of the 3c905 (3c59x) driver, has a mailing list for it. The archives are over at: http://www.scyld.com/pipermail/vortex/ It might be good to search the archives and try out some of the various fixes for problems other people have had. The 3c905TX is mentioned pretty frequently. I would have done more research myself, but I don't have the card to test the solutions on. Somebody with the card could also email the list and Donald would probably be able to figure out what's going on. -M

OK, yesterday I reinstalled the PC with one of the 905's installed, using the current version of fedora-test. The initial install went fine until the first reboot. The PC hung at the detecting new hardware message; I hit ctrl-alt-del and then the reset button when it was still sitting there 20 minutes later.
Next time round it managed to get up to running X, but the ethernet card wasn't seeing the network and I was getting error messages similar to those above. I swapped the 905 for the 905b, kudzu detected the change and Fedora came up with a working ethernet card. I then installed all the updates, shut the machine down again and swapped the 905 back in; graphical boot has disappeared, kudzu again detected the change and we don't have a working ethernet card.

The machine also has Red Hat 9, so I rebooted the PC into RH9: same problem, the card can't see the network and I get error messages as above. Shut the machine down (and power off), then boot into RH9: card works fine. Reboot into Fedora: dead card. Try running mii-tool --force=100baseTx-FD: doesn't work. Try 100baseTx-HD: still doesn't work. Just running mii-tool reports that there is "No MII Transceiver present!" Move /etc/rc.d/init.d/kudzu out of the way and reboot: working network card. Run /usr/sbin/kudzu and we still have a working network card; re-enable /etc/rc.d/init.d/kudzu, reboot, and dead network card again.

So it seems that I have a working 905 if kudzu isn't run during startup/shutdown. If there's any other info I can generate to shed light on this, feel free to ask.

I can confirm Iain Rae's finding. I've filed a similar bug#105684 which is marked as duplicate of this one. I just tested like this:

1. Boot to 2.6.0-0.test9.1.67, the network works fine. Then /sbin/chkconfig --level 5 kudzu off
2. Reboot to 2.4.22-1.2115.nptl. The network still works.

What I can tell is that under 2.6, when kudzu is on, there is no problem with the network. But under Fedora Core's 2.4.22, when kudzu is on, the 3C90x is not working. With kudzu turned off under Fedora Core's 2.4.22, the NIC works fine.

alexl -- Bug 98832 gives me a "You are not authorized to access bug #98832". It seems that it is related, since all reporters say that disabling kudzu un-breaks it. If it's being worked on, and its resolution would resolve this, could you tell us?
:-) I added you to the CC list so that you would get this message. Feel free to remove yourself if you want. -M

I can confirm Iain Rae's finding. On my Asus P2B mainboard when Kudzu is turned off for the boot runlevel my 3c905 initializes ok. Shutting down Kudzu post boot does not allow the network card to ifup properly. The only solution is to have Kudzu off at the runlevel boot. I have had the same problem since test 1 and I am using test 3 now. Never had problems with redhat 7 to 9.

I made bug 98832 readable. It doesn't contain much info though.

For consistency, this bug should be in kudzu and notting should be the assignee, since it's clearly a kudzu issue at this point. Also, shouldn't platform be "all" -- it's been reported on an athlon as well as an i586 (and a zillion other machines, apparently). Could somebody with the appropriate permissions make these changes? (Hahaha -- I just tried to commit the bug [having changed the component to kudzu without the permissions, being used to having Bugzilla access elsewhere] and it informed me that I was changing it from BitchX to kudzu... Ah, if only.) -M

This is not a kudzu issue. All it does is call ethtool ioctls, and then the driver freaks out. All other drivers handle this fine...

And in my case (a 3C905C-based Cardbus card) the problem seems to be triggered by rhgb, not kudzu. (I haven't tried recent rhgb releases, i.e. the exact revision shipping with the Fedora Core 1 release, so I don't know if they happen to fix anything somehow. I'll see if I can test this again soon.)

For me, running Fedora Core Test3, kernel 2.4.22-1.2115.nptl, Asus P2B mainboard and 3C905B I get no errors; with or without rhgb, with or without kudzu. For me, running Fedora Core Test3, kernel 2.4.22-1.2115.nptl, Asus P2B mainboard and 3C905 I get errors still; with kudzu on at the boot runlevel. I have applied all the rawhide updates and have tried both the nptl kernels and the same problems exist.
I also tried the Red Hat 9 kernel 2.4.20-9 and the same problem exists. The NIC initializes properly only when kudzu is off at boot time. I downgraded kudzu to ver 0.99 (from RH9), then booted with kudzu on at the runlevel, and my 3c905 initialized without problems. As soon as I go to a version of kudzu after RH9's version (Fedora test1 - yarrow) the NIC does not initialize properly and mii-tool reports missing MII. When the NIC is initialized properly at boot time (kudzu off) mii-tool reports the negotiated link speed and status properly.

I have Fedora Core 1 with a 3c905b (Cyclone) on a PIII 550, Asus P3B-F. eth0 is configured to get its IP address from DHCP, and at startup I get the error message "link don't available, check the cable". To start the NIC I need to run ifconfig eth0 up and then ifup eth0; after that the NIC works... Sometimes I get what I think are network errors: I download some programs from an internal FTP server and sometimes I am not able to get them, and the hub shows a lot of collisions! Next I tried an SMB connection and it works fine! This is my situation at the moment... I want to try recompiling the kernel (I need it for the NTFS driver), and if nothing changes I intend to update to a 2.6 kernel...

I have three of these cards that do this, all 3c905-TX. If you need one to work with, email me; happy to donate.

Anyone got a current status on this one? I'd really like to see it resolved. It's 5 1/2 months old now.

For what it's worth: we are 2.5 hours away from switching to 2004 and the bug is still there. Just installed Fedora (and I was hit by the bug..). Applied all available patches (including a switch from the stock -2115nptl kernel to -2135nptl), but still the only way to have a functional network is disabling kudzu at startup. I have also performed different variants of rmmod 3c59x / mii-tool -r / mii-tool -R / mii-tool -F, but to no avail. The driver seems to enter a never-ending play-dead state after kudzu's magical touch.
I can confirm this behaviour: 3C905 (without any A, B or C) with current Fedora Core 1 (all upgrades installed, kernel 2135). Removing kudzu from the startup fixed the problem completely. This machine has been network-kickstarted, so installing (and moving 700 MBytes over the NIC while installing) did work, but then kudzu screws up the NIC. Maybe this is a driver issue, where loading the driver itself doesn't reset the NIC or MII correctly.

I can confirm: the bug is still present in the latest Fedora Core 2 Test2, just released. The NIC works perfectly from CD-ROM (did you see there is even ssh to continue working during the hour of installation?), but not after reboot. Does a developer read the bugzilla? ;)

IT WORKS! Just installed 2.6.5-1.327smp (http://download.fedora.redhat.com/pub/fedora/linux/core/test/1.92/i386/os/Fedora/RPMS/kernel-smp-2.6.5-1.327.i686.rpm) on my fc2-test2 machine and my 3Com 3c905 started to work with TCP again. The NIC was re-detected as "3Com 3c590/3c595/3c90x/3cx980" instead of "3c905 100BaseTX [Cyclone]".

I can confirm that this problem still occurs with Fedora-Core-1 and the 2.4.22-1.2188.nptl kernel and the 3Com Corporation 3c905B 100BaseTX [Cyclone] card. Many problems with this card, including autonegotiation to 100 Mb FD not working, lockups under NFS load and errors such as the following:

Jun 21 16:24:36 pinzolo kernel: eth0: Transmit error, Tx status register 82.
Jun 21 16:24:36 pinzolo kernel: Probably a duplex mismatch. See Documentation/networking/vortex.txt
Jun 21 16:24:36 pinzolo kernel: Flags; bus-master 1, dirty 996(4) current 997(5)
Jun 21 16:24:36 pinzolo kernel: Transmit list 1ea51300 vs. dea51300.
Jun 21 16:24:36 pinzolo kernel: 0: @dea51200 length 800000be status 000100be
Jun 21 16:24:36 pinzolo kernel: 1: @dea51240 length 800000be status 000100be

Turning off kudzu fixes it.

I was able to reproduce the same problem but with another distro, Gentoo.
I booted a P3 667MHz computer with a 3c905tx-nm from the Gentoo x86 Minimal Installation CD (downloaded 8/6/2005): 2.6.11-gentoo-r3 kernel, with mii and 3c59x as modules loaded by kudzu. I assigned an IP address to eth0 and could not ping another computer on the same Ethernet segment. lsmod showed that both mii and 3c59x were properly loaded. I tested the network cables and the Ethernet switch, and I was sure that the NIC had to work, as I had been using it with another OS. Following the comments in this thread, I disabled kudzu at the Installation CD startup by passing the nodetect option to the kernel. I loaded 3c59x with modprobe, assigned an IP address, and now it can correctly ping other hosts. I think this link might cast some light: http://www.scyld.com/pipermail/vortex/2000-June/000425.html