

Linux Token Ring Support Bringing Down Corporate Nets?
"My company runs Token Ring at the office (puke!) I got drivers from the card manufacturer (Madge), and I'd been happily churning along. Then last week, we started seeing a bunch of errors on the network. These errors would bring everyone on the ring down. After a week of this kinda stuff, they eventually isolated it to me.
Reboot the laptop into Windows and the network card works just fine and they don't see any ring errors. Reboot into Linux, and suddenly they start seeing ring errors. I don't really grok token ring, so I'm not entirely certain that I know exactly what the problem is. But, whenever I brought the token ring online under Linux, they saw ring errors, which eventually (as I understand it) would bring down the entire ring. Switch cards (same model) and it continues to happen. It looked to me (and the network analysts) that the Linux driver was causing the problem.
I tried switching to an IBM token ring card, but there's a bug and I hadn't patched for it. The people with the Fluke would not wait around while I tried to figure this out. I didn't have any other token ring cards that I could try.
In the end, I agreed not to boot into Linux unless I went into the conference room (which is one of the only rooms in the building with ethernet ports). How should I have done this differently so that using Linux would have been a more positive experience for my company?"
Shouldn't this have been caught? (Score:1)
Re:Shouldn't this have been caught? (Score:1)
Linux just isn't tested enough before it's branded "stable". How many people have had VM failures in Linux, or at least really bad swapping performance? I have heard several horror stories.
Re:Shouldn't this have been caught? (Score:1)
I'm pretty sure it isn't user error; and it's happened on several different machines I've tried it on.
I can do a ping -s 1465 (The number is something like that, but that's not exact), and it'll work fine... increase it to ping -s 1466, and no replies come back.
Re:Shouldn't this have been caught? (Score:1)
I'm pretty sure it isn't user error; and it's happened on several different machines I've tried it on.
I can do a ping -s 1465 (The number is something like that, but that's not exact), and it'll work fine... increase it to ping -s 1466, and no replies come back.
You need to read up on MTUs. Ethernet has a *fixed* maximum frame size, due to latency messing with collision detection. That's what you're hitting.
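The arithmetic behind that kind of threshold can be sketched as follows. This is only a minimal sketch, assuming IPv4 without IP options (20-byte header) and the 8-byte ICMP echo header; the function name is made up for illustration. For an MTU of 1500 it gives 1472, which is in the neighbourhood of the "something like 1465" quoted above, not an exact match.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_unfragmented_ping_size(mtu):
    """Largest value for `ping -s` that still fits in a single packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


print(max_unfragmented_ping_size(1500))  # 1472
```

Anything larger has to be fragmented, and if fragments are dropped somewhere along the path the replies never come back.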
Re:Shouldn't this have been caught? (Score:1)
In any case, someone else decided we were going to use Sundance cards in our servers here.
I downed all the servers, and put them in. The netware machines came up fine, but all of a sudden we had no internet connection and no incoming/outgoing email. I put the old cards back in, and everything went fine.
The part that annoys me is that I had specifically asked for tulip chipset cards, because I know they work fine. =)
Re:Shouldn't this have been caught? (Score:1)
Anyway, see your ifconfig:
$
eth0      Link encap:Ethernet  HWaddr 00:50:DA:1F:06:57
          inet addr:10.0.0.42  Bcast:10.0.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:83036 errors:0 dropped:0 overruns:0 frame:0
          TX packets:80763 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:18468908 (17.6 Mb)  TX bytes:6354302 (6.0 Mb)
          Interrupt:18 Base address:0xc800
See that MTU setting? Screw with that a bit to see what I'm talking about.
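To see what changing the MTU does to a large ping, here is a rough sketch of how many IP fragments a given `ping -s` size needs. It assumes IPv4 with a 20-byte header and no options, an 8-byte ICMP header, and the rule that every fragment except the last carries a multiple of 8 bytes; the function name is illustrative only.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def fragments_needed(ping_size, mtu):
    """Number of IP fragments needed for `ping -s ping_size` at this MTU."""
    total = ping_size + ICMP_HEADER_BYTES   # bytes the IP layer must carry
    room = mtu - IP_HEADER_BYTES            # data bytes per packet
    if total <= room:
        return 1
    per_fragment = room // 8 * 8            # non-final fragments: multiple of 8
    remaining = total - room                # left over after a full final fragment
    return 1 + -(-remaining // per_fragment)  # ceiling division


for mtu in (1500, 1400):
    print(mtu, fragments_needed(1472, mtu), fragments_needed(1473, mtu))
```

At MTU 1500 a 1472-byte ping fits in one packet and 1473 needs two; lower the MTU and the boundary moves down with it.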
Re:Shouldn't this have been caught? (Score:2)
Re:Shouldn't this have been caught? (Score:1)
So not being a part of the standard kernel and not having a large user-base seems to be the problem here.
No problems with my Olicom card (Score:2)
I have a Linux box that's been on my company's token ring for nearly three years (I work for IBM), using an Olicom card. Not only do the sysadmins not bug me about it, they use it on a regular basis. It's my team's development server, but the network admins come to me quite frequently with some little thing they'd like to try or some temporary service they need to provide and they find that my box is the simplest and most reliable place to do those things (the other local servers are NT and OS/2; there are plenty of AIX boxes around, but they're not at my location and the local sysadmins don't really have access to them).
Re:No problems with my Olicom card (Score:2)
Perhaps... (Score:3, Funny)
Have them wire your desk for ethernet. (Score:2, Interesting)
If you already have ethernet in a conference room it might not be too hard to just have the port you use added to whatever hub they use for the conference room.
It seems that you are running on a laptop since you can move your computer to the conference room. Another option is to insist that they put up an 802.11b network. You could then wander freely and have wireless ethernet. Even better!
Re:Have them wire your desk for ethernet. (Score:2)
Re:Have them wire your desk for ethernet. (Score:2, Insightful)
Debug before, gateway after. (Score:2)
Now, I'm unsure about Token Ring. I take it that it runs on coax cabling and you have to be in a loop for every computer to work. UGH! I'm unsure of the protocols used with it, but if it can run TCP/IP, you may be able to get away with a gateway computer (which does TokenRing <==> Ethernet.)
Re:Debug before, gateway after. (Score:3, Informative)
I used to work for IBM, and was only too exposed to the hell that is TR. The most common way rings go boom is when unix-type users add machines to a ring at the wrong speed. E.g. Bringing a card up at 4Mbit on a ring running at 16 will usually drop it.
My solution was to build a gateway running a very hacked up Debian. IBM's 300PL worked great, as it came out of the box with both ethernet and token ring cards. I ran 2.2.19 on it and used iproute2 to do various NAT and address forwarding tricks to an ethernet switch.
Worked great until the hard drive melted. Damned Fujitsus.
Lessons learned:
1. Test first. A nice way of doing this is to get one of IBM (or someone else's) TR Hubs, and use it for your testing. That way, if you blow something up, you'll only blow up your testing network. The downside is that these things are expensive as all hell, and hard to come by. Even at IBM they required much bribery to nab for... 'unofficial uses'.
2. Use IBM hardware if at all possible. Someone else said something to the effect of "IBM - Hardware built like a tank". Very true.
3. Ask someone first. Chances are, at least in larger tech-oriented companies, someone else will have tried alternative operating systems before, and have advice (or horror stories) to share.
Re:Debug before, gateway after. (Score:1)
No probs here with madge (Score:1)
Anyway token ring is obsolete and you should try to avoid it...
Re:No probs here with madge (Score:1)
Contact the developers (Score:5, Informative)
Re:Contact the developers (Score:1)
Token Ring sucks, Linux TR REALLY sucks (Score:5, Insightful)
I have seen a TR network where a single machine could develop a problem, and this would cause a group of 8 machines to all lose the net. Any one of those machines could bring them all down, and the only thing that would get them back up was shutting them all off (completely power-down, even rebooting didn't do it) and then bringing them back up one by one. Something as simple as shutting down Windows NT to the "click to reboot" prompt was enough to cause the problem to develop; eventually one of them would lose its mind, and they'd all go.
Throw into that mix, the fact that Linux Token Ring drivers are bastard stepchildren that get 1/1,000th of the use of the Ethernet drivers (if that much) and you end up with real problems.
Bottom line: come in on a weekend and try that other NIC out; maybe its drivers are more mature. But other than that, don't dick with the company network; Token Ring is too damn sensitive.
You might try putting a few NT boxes into the "click to reboot" state, and see if they screw up the company network too. Works best with 3COM TR NICs, which is ironic since they also seem to recover the best to having their cable pulled and replaced while live.
If they see the problem is Token Ring specific, and just exacerbated by a bad Linux driver, perhaps they'll switch to Ethernet. If they trade their TR NICs in to somebody like CablExpress, they might break even or make a small profit on the switchover, and they'll certainly recover the costs in a short period of buying Ethernet NICs instead of new TR ones; they're horribly expensive, and the infrastructure gear (CAUs, LAMs, MAUs, switches, routers, etc.) is even worse.
An even better suggestion might be to find a job in a shop that prefers the more-manageable problems of Ethernet to the problems of Token Ring.
Re:Token Ring sucks, Linux TR REALLY sucks (Score:1)
/Ryan
Re:Token Ring sucks, Linux TR REALLY sucks (Score:3, Informative)
I work for a large bank that is _SURPRISE_ largely an IBM shop. the vast majority of our network is token ring and thanks to relatively clueful network design is very stable. Token ring is not ancient technology, it is a mature technology that has a lot of advantages over ethernet for wide area networking; especially in a Mainframe environment. Source Route Bridging anyone? try that with ethernet.
as far as MAUs are concerned, gimme a break. now that _is_ ancient. hit up ebay and grab a nice synoptics hub and a token ring switch. follow IBM's standards on lobe length, number of stations per ring and cabling type and verify the buffer sizes on your adapters. most of the non-ibm adapter drivers i've seen set them far too low and you end up with gobs and gobs of receiver congestion errors. if you have enough stations broadcasting these errors, bam. you have a beaconing ring.
my experience so far has been best with the IBM PCI token Ring adapter 2 and the IBM auto 16/4 PC cards (the older ones with the hologram-y label, not version 2). The lanstreamers are kinda junky but work well in windows and Novell.
to sum up, use your head, follow standards and understand how token ring actually works. grab an IBM redbook on it while you're at it...
Token Ring lover in WI,
sixty4k
Re:Token Ring sucks, Linux TR REALLY sucks (Score:3, Funny)
Well, that's just the point, isn't it? Source Route bridging was designed by people who Just Didn't Get It - it's doing almost all the work of a router without gaining any of the benefits. It doesn't deserve as much bashing as Microsoft, who also Never Did Get It about networking, because the folks who did SNA and its predecessors were trying to design protocols that would run well on systems with slightly more horsepower than digital watches, back when RAM actually cost money. It's possible to use Transparent Bridging on many kinds of IBM systems, and these days any of the mainframes will run TCP/IP. And almost all the applications can either be handled using tn3270 (since there aren't a lot of Genuine Original Green-Screen 3270 terminals left - almost everyone uses PCs with various boards in them) or if necessary encapsulated in IP using DLSW or other tunnelware.
Re:Token Ring sucks, Linux TR REALLY sucks (Score:2)
Re:Token Ring sucks, Linux TR REALLY sucks (Score:2)
Otherwise it's not worth it.
Two speeds, 16/4. Granted, no collisions, but you'll get better throughput with 100 Mb ethernet than 16 Mb TR. We'll probably see Gb ethernet more common in offices soon, and that will definitely put that argument to bed.
Ever bought a TR card? I'm not talking ebay either, most companies don't let you do that. When I last bought one, they were running at about $250, each. 10 3c905's were $250 at that time. We were using type-1, so we were using MAUs; find one of those... an 8 port MAU would run you close to a grand.
So the logical solution was to pull all the TR and replace it with cat-5. I requested $3000, and set up switched 100bt. This cost us a little more initially, but they haven't had to buy a TR card since, and it is so much simpler to drop a new cable now. Also don't have to worry about the ring falling due to an old card dying on you. That's fun. Take 30 minutes out of everyone's day to shut down and then bring machines back one by one...
Sure, we could have run TR over cat-5, but I should have been fired if I recommended that. It just makes better sense to run ethernet.
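The price gap in those numbers is easy to put into a quick calculation. This sketch uses only the prices quoted above ($250 per TR card, $25 per 3c905, roughly $1000 per 8-port MAU); the seat count of 24 is a made-up example, and Ethernet switch prices are left out because the post does not give them.

```python
import math


def token_ring_hardware_cost(seats, nic_price=250, mau_price=1000, ports_per_mau=8):
    """NICs plus enough 8-port MAUs to give every seat a port."""
    maus = math.ceil(seats / ports_per_mau)
    return seats * nic_price + maus * mau_price


def ethernet_nic_cost(seats, nic_price=25):
    """NICs only; switch cost is not quoted in the post, so it is omitted."""
    return seats * nic_price


seats = 24  # hypothetical office size, not from the post
print(token_ring_hardware_cost(seats), ethernet_nic_cost(seats))
```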
Re:Token Ring sucks, Linux TR REALLY sucks (Score:2)
And... gig ethernet is pretty common in server backends, but I certainly wouldn't say it's very common in office LANs. If your desktop has gig ethernet, you're in a minority.
Re:Token Ring sucks, Linux TR REALLY sucks (Score:2)
Yes, as I said, the IBM cards are less susceptible to the timing issue I discussed. However, in the testing we did, they almost never recover properly if you yank the cable, wait a few seconds, and plug it back in. Under NT and OS/2, you had to reboot in that situation to regain network connectivity.
With the 3COM cards, they usually reconnected.
Plus we could get refurbs from CablExpress' Equal2New program for cheap, and still have a lifetime warranty. But it was still way more expensive than Fast Ethernet, and for 1/4th the throughput.
Realtime (Score:2)
This, of course, is more of a LAN than a WAN point.
oh really ! (Score:1, Flamebait)
that's the sort of FUD that gets a lot of people angry
LOTS of people (and by this I mean end users) have used token ring drivers with complete success
the poster knew this
so what are they looking for ?
help with their card and hardware ?
(which it seems is the problem)
info needed
kernel version
card ID
machine ID
and maybe a trace of the network
to say IT DONT WORK is cave man like in the extreme; even kids as young as 7 know that to fix something that you can't see you need to have it described
regards
john jones
p.s. what kind of editor runs this ?
Re:oh really ! (Score:1)
What the poster is asking is how he should have gone about taking a machine that totally did not fit within the specifications of the network that IT set up, inserting it onto a network that uses a technology he does not understand, and using an operating system whose Token-Ring drivers are sub-par.
Basically, 'What should I have done that would have prevented this from happening?'.
The answer to this, I can't help but think, is to go and talk to IT/the network admin/God, and tell them 'My manager gave me permission to connect this laptop to the network, what do I need to know before I do?' They would know more about the system than his manager, likely, and at the very least, more than he did, and would be able to track down problems easier later on if they couldn't prevent them entirely.
His post is not FUD, nor is it driver help, nor is it networking help, nor is it technical help at all. You would be well advised to read what he is saying, and avoid flooding your kneejerk reactions onto slashdot without prior thought.
p.s. what kind of editor runs this ?
The same kind of editors that have been running this site for years and built it up from nothing to become one of the most popular techie/open-source websites on the internet. If you don't like how they're doing it, don't read, but please, don't complain about it. No one's forcing it upon you.
--Dan
Tolkien Rings (Score:2, Funny)
Yeah, I know, jokes like this are a bad Hobbit...
Token Ring (Score:5, Informative)
We switched the whole network to 100 MBit Ethernet, so we will not look into the issue in the future.
The drivers in the kernel have some problems, particularly for PCMCIA.
Here are some useful links:
Linux Token-Ring page, with updated drivers, but a discouraging news entry from 9/14/2001:
http://www.linuxtr.net/
Linux-Software for Olicom-Drivers (recommended):
http://www.madge.com/connect/downloads/software
Linux-Software for Madge-Adapters on:
http://www.madge.com/Connect/Downloads/Software
Those damn hobbits.. (Score:1, Offtopic)
This sort of thing happens all the time ... (Score:2)
It's not that uncommon at all to find some application that conflicts with some other application and floods the network with crap. Ditto for hardware.
Yes, in this case, Linux did get a bad rep, and it may have been deserved. It's a fairly safe bet that very few people use Linux with Token Ring, so the drivers probably haven't been very well tested.
If you're truly paranoid, do what another poster did and test in a limited environment. Unfortunately, doing this for every new piece of hardware and software added to the network (not just Linux stuff) would take *forever* so you need to trust that things will work at some point.
Proper Network Design (Score:1)
"Your system is sending improper commands to the network and as a result, your network interface has been shut down. Please contact your system administrator for details."
Imagine how much havoc one could wreak if they walked into a business with a small "network tool" that would bring down the network, plugged it into a nearby jack, and left.
When writing CGI programs for the web, the first thing you learn after how to work the #! line is that you should NEVER TRUST THE USER. Networks need to work the same way. If the node says that it is supposed to speak now because it has the token, the router should check that the node really has the token and make adjustments to avoid bringing down the network if it does not.
Short of charging the network lines with a 5,000 volt current (and surge protectors were invented to stop even this), there should be no way that I can bring down the network from a point on the network. (Well, excluding using SNMP to shut down the router, which assumes that I have the password, etc.)
Re:Proper Network Design (Score:1)
Re:Proper Network Design (Score:2)
A router is just another node on a token-ring - it can't do the stuff you're talking about in the general case. If some broken-ring adaptor screws up, the router is typically just as baffled as anyone else. token-ring is over-complicated, over-costly nonsense. With really cheap switched ethernet available, no new network should ever use it, and old networks should migrate, as new NICs and hub costs will eat you alive.
What kind of TR-network... (Score:1)
Obviously the hardware is working fine. You say nothing about what speed you brought the TR NIC up at, but you're probably aware that bringing up a NIC at 4 Mbit/s in a 16 Mbit/s network will beacon the ring and bring the network down.
Anyway, to work out the problem, simply ask the network people to lend you a MAU. Hook up one or more of the other PCs that work fine in the ring and do some labbing with your own little ring until you locate the problem.
From a quick look at the Madge website, it seems like they actually SUPPORT their Linux drivers.
Ran tr0 for over two years (Score:5, Informative)
I ran Token Ring on my personal desktop and a server at work for over two years without any incidents requiring sysadmin intervention.
Here's how I did it:
So, it worked for me, as I said, for a couple of years. But then I moved to a new site with pure Ethernet, and I have to admit that life is much simpler now.
A few hints for Token Ring Troubleshooting (Score:5, Informative)
This is generally bad, because TR is really a cool technology (except that it was always too expensive and proprietary).
But superior technology was never the point.
(See also: Donald Becker's comment on NE2000 clones.)
The network card is not the only possible source of errors. Token Ring is an active network, where a lot of the logic is within the NIC and the cabling (e.g. the M(S)AU = Multi Station Access Unit).
All stations are assembled in a physical double ring, even though the cabling is a star topology.
If you connect your station to a MAU (= TR hub), your plug is connected to the MAU, but you are not yet connected to the ring. If you turn on your computer, the network driver opens a relay in the MAU (signaled via the adapter cable) to switch you into the ring.
If you turn off the computer you get disconnected.
All data on the ring passes through all the NICs in the ring (exception: Early Token Release). The NIC acts as a bridge (it amplifies the signal to the next ring segment).
Since the unamplified distance between two NICs is limited, this can lead to the "Token Ring Sleeps at Night" problem, where the Token Ring refuses to work at night (simply because too many employees turn off their PCs after work).
This can simply be overcome by replacing passive MAUs with active TR switches.
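The "sleeps at night" effect can be illustrated with a toy calculation. The sketch below treats the ring as a closed cable of known length with stations at fixed positions along it; only powered-on stations repeat the signal, so the longest stretch between two neighbouring active stations is what has to stay under the limit. All positions, the ring length and the 150 m limit are invented numbers for illustration, not values from any IBM specification.

```python
def longest_unrepeated_span(ring_length_m, active_positions_m):
    """Longest cable run between consecutive powered-on stations on a closed ring."""
    points = sorted(active_positions_m)
    if not points:
        return None                      # nobody inserted, no ring at all
    if len(points) == 1:
        return ring_length_m             # signal must travel the whole loop
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(ring_length_m - points[-1] + points[0])  # wrap-around segment
    return max(gaps)


RING_LENGTH_M = 400                      # hypothetical ring
MAX_SPAN_M = 150                         # hypothetical limit between repeaters

daytime = [0, 50, 120, 200, 260, 330]    # most PCs switched on
night = [0, 200]                         # only two servers left running

for label, stations in (("day", daytime), ("night", night)):
    span = longest_unrepeated_span(RING_LENGTH_M, stations)
    print(label, span, "ok" if span <= MAX_SPAN_M else "too long")
```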
One should also keep in mind that the cable to the network card is part of the ring after activation of the card. A faulty cable can disturb the ring (even though it should be automatically removed from the ring).
I would try your laptop directly on a TR switch.
This way you can eliminate driver/TR component interaction (a driver which aggressively tries to connect to a ring with a faulty cable).
I personally implemented many Linux servers on Token Ring and never had problems with disturbing ring operation. I used IBM and Olicom adapters and they always worked well.
Huh? (Score:1)
Re:Huh? (Score:2)
Rationale for TR is outdated. (Score:3, Informative)
Ethernet evolved from networking schemes used for packet radio. The original idea was you had a single medium (a long cable) that was shared by a number of hosts. As in radio, they were supposed to listen before talking (Carrier Sense Multiple Access/Collision Detection or CSMA/CD) so they didn't garble each others' messages (collisions).
CSMA/CD networks have two problems: (1) throughput begins to collapse somewhere between 40% and 50% of the nominal speed due to collisions and retransmissions and (2) packet delivery cannot be guaranteed within a fixed time (although at low loads latencies tend to be very low).
However, Ethernet switching technology has taken care of the throughput problem by reducing the number of machines sharing a medium for the purpose of collision detection, to the point where a single workstation on a full duplex switched port can never have a collision. A combination of switches with huge backplane capacity, spanning tree, trunking, VLAN and powerful routers give the administrator great flexibility in delivering network capacity to every port on his network, along with excellent scalability.
The only thing that remains is guaranteed delivery times for packets; although stations needn't worry about collisions, there is still queuing time within the switches to consider. This might affect people attempting to stream broadcast quality video over their network to several workstations, who might choose to go with 100Mbit token ring. In theory QoS is supposed to address this, but I haven't seen it used much. Most streaming media applications are Internet centric, and buffer their data to prevent problems due to the much more random nature of the Internet. It is possible to contrive scenarios where you need QoS or isochronous packet delivery (e.g. high quality video conferencing over a LAN) but these haven't proved to be very important. If they were, then ATM would probably be a better choice than TR.
Of course TR still has to be supported for places that have too much human inertia to switch, but I don't think there is any technology that is superior to Ethernet in its cost effectiveness for the widest range of corporate applications.
QoS and Ethernet (Score:2)
Many Ethernet switches support 802.1p (a priority field within the 802.1Q VLAN header), allowing basic prioritisation. The larger L3-aware switches also support IP Precedence or even DiffServ (e.g. the Catalyst 6500). In the longer term, as policy-based management becomes more widely deployed, it's likely that switches will have 802.1p turned on, largely to support VoIP (since that is actually being deployed on some networks).
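For readers who have not looked at the tag itself: the 802.1Q tag is a 16-bit TPID (0x8100) followed by a 16-bit tag control field holding the 3-bit 802.1p priority, one drop-eligible/CFI bit and the 12-bit VLAN ID. A minimal sketch of packing and unpacking it, with illustrative function names and example values:

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def pack_vlan_tag(priority, dei, vlan_id):
    """Build the 4-byte 802.1Q tag: TPID, then priority(3) | DEI(1) | VLAN ID(12)."""
    if not (0 <= priority <= 7 and dei in (0, 1) and 0 <= vlan_id <= 4095):
        raise ValueError("field out of range")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def unpack_vlan_tag(tag):
    """Return (priority, dei, vlan_id) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return (tci >> 13) & 0x7, (tci >> 12) & 0x1, tci & 0x0FFF


tag = pack_vlan_tag(priority=5, dei=0, vlan_id=100)  # example: voice-class traffic
print(tag.hex(), unpack_vlan_tag(tag))
```

With only three bits there are eight priority levels, which is why this gives basic prioritisation and not a delivery-time guarantee.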
it's your vendor's problem (Score:3, Interesting)
You do have a way out: use the IBM card. It was working a few years ago, and I imagine it's still working today. Yes, you do have to patch the kernel--what's the problem with that?
If that's not to your liking, you can throw money at the problem and buy a TokenRing/Ethernet bridge and use an Ethernet card on the Linux machine. Maybe your managers will see the light and convert more of your network to Ethernet.
In general, TokenRing is dead technology. Many operating systems just don't support it at all anymore. How long should Linux carry the burden of supporting outdated and flaky technologies?
Re:it's your vendor's problem (Score:2)
Companies will view token ring in much the same way (ok, you can say it is broken, but ethernet can be flaky too), especially if their vital server is hooked up via TR. Many old(er) systems only have AUI or 10b2 connections for ethernet, neither of which I would like to rely on (OK, you can get AUI->TP connectors, but that's a kludge).
Just because nobody's doing new installations of an old technology doesn't mean that it's not in use somewhere; didn't somewhere recently retire a PDP-11 or similar?
A little TR background. (Score:5, Informative)
I came from an all-Ethernet environment prior to this job and have had some experience with ARCNet as well. (How's that for you old
Token Ring is a logical ring topology, usually implemented in a physical star or bus topology. Some of our rings have upward of 200 nodes with thousands of feet of cabling connecting them. We have MAUs (Multistation Access Units - a hub) connected to each other with copper and fiber. Most of the cabling that runs to the workstations is Type 1 - 4 conductor, big gauge stuff that comes to large data connectors at the wall. If you haven't seen these, you'd love them, about 1 1/4" square and 2 1/2" long. Then a lobe cable goes to a DB-9 connector on the NIC card.
TR works by passing a token (electrically) to each node in sequence. When a node has data to be transmitted, it hangs the data on the token and sends it on its way. All subsequent cards check to see if the data is for them and then pass it all on if it's not. The intended recipient strips the data and sends the token on its way. In a 4Mb ring, there is one token and on a 16 Mb ring there are two, 180 deg. to each other (timing-wise) on the ring. I don't know how the 100 Mb version does it, but almost nobody uses that.
This has an advantage in that there are no such things as collisions like on Ethernet. This allows for the massive number of nodes per ring and high efficiency in data transfer - perhaps 80 - 90%. For comparison, Ethernet starts having problems due to collisions at 40% or so - depending on the number of nodes.
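A toy simulation makes the "no collisions" point concrete. This is a deliberately simplified sketch: one token, one frame per station per token visit, and no modelling of frame stripping, priorities, early token release or the active monitor. The station numbers and payloads are invented.

```python
from collections import deque


def run_ring(queues, rotations):
    """Pass one token around the ring; only the current holder may transmit.

    queues: one deque of (destination, payload) per station, in ring order.
    Returns a log of (sender, destination, payload, hops) in transmission order.
    """
    station_count = len(queues)
    log = []
    for _ in range(rotations):
        for holder in range(station_count):       # token visits stations in order
            if queues[holder]:                    # holder has something to send
                destination, payload = queues[holder].popleft()
                hops = (destination - holder) % station_count
                log.append((holder, destination, payload, hops))
            # otherwise the token is simply passed to the next station
    return log


queues = [
    deque([(2, "a"), (1, "b")]),   # station 0 has two frames waiting
    deque(),                       # station 1 is idle
    deque([(0, "c")]),             # station 2 has one frame waiting
]
for entry in run_ring(queues, rotations=2):
    print(entry)
```

Because only the token holder ever transmits, two stations never talk at once, and every station gets its turn within one rotation of the token.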
It also has the disadvantage that a single break at any point in the ring breaks the whole ring. (Think Christmas lights in series rather than parallel.) Another disadvantage is exactly the problem the poster reports - timing errors. I don't know if the problem was just timing errors, but the other problem - beaconing - would have brought the whole ring down right away and he said that it was just noise with the potential to bring the ring down.
Indeed, timing is critical. Beacon errors are worse, as the NIC puts out a spurious signal that doesn't allow any node to hear the token as they attempt to pass it around.
Early in my employment, I attempted to put a linux box on the ring, but couldn't get the TR drivers to work with a Madge or old IBM card. About a year in, they got all tight-assed and concerned about security and prohibited all alternate OS's. We're an all M$ house, how's that for irony. Security, what security? At least we're behind a pretty good firewall.
As far as the problem with this particular installation, I agree with other posters who have said that the author of the driver needs to be contacted to report the bug and maybe get a fix. It would be good to set up a separate ring with just the two nodes (and the fluke) to try to ID the problem. But he may also be facing administrative/political issues as well. Those are hard to overcome, especially in a large organization, and even more in a government agency - as I have found.
I'm not karma whoring, I just thought that since this technology (TR) is so ancient and in use by so few places, readers unfamiliar with it might like a little info.
BTW, the aforementioned ARCNet is also a token passing design that runs on a bus or a star and runs at 2 Mb. It can run on UTP or 93 ohm coax (RG-62). It's relatively robust, if slow. A boss of mine went to a Novell Admin class where the instructor hooked a server and workstation together on ARCNet with BNC connectors crimped to a piece of barbed wire. It passed data acceptably.
Hope this all helps a bit.
Re:A little TR background. (Score:1)
On my home server (2way dual p100 mosix cluster) I have...
Arcnet
Token Ring
Localtalk PC (this one wasn't easy to find, or config)
10 Mbps ethernet (to the cable modem)
2 100 Mbps ethernet (to the LAN)
2 gigabit (crossover, dedicated cluster link)
FDDI
100 Mbps Fast Arcnet
Still looking for 64 Mbps token ring hardware (some people describe it as 100 Mbps TR *shrug*), Acorn Econet for PC, cheap 155 Mbps ATM cards, VG AnyLAN, and possibly Corvus Omninet cards. If anyone can help, please let me know.
ARCNet still has its uses... (Score:2)
It may sound arcane, but it's robust, interference resistant, cheap (I guess), and reliable. If you're not passing tons of data, it works just fine.
MadCow.
Ethernet Dies from Collisions at 37% Urban Legend (Score:2)
Oh fergawdsakes, will this urban legend ever die!
It simply isn't true.
Re:Ethernet Dies from Collisions at 37% Urban Lege (Score:2)
Oh fergawdsakes, will this urban legend ever die!
It simply isn't true.
Actually, it is true... Under ethernet, the more nodes you put in a collision domain (this is layer 2, not to be confused with a layer 3 _broadcast_ domain), the more likely a collision, and the more time is spent recovering from collisions. The exact 40% depends on how many nodes are on the segment and how much info they have to send and how often, but 40% is in the ballpark for when problems start to appear.
There is a very simple (and not very expensive nowadays) way to get around this problem: use switches in place of hubs. A switch creates a two node collision domain (the switch and the end node). In fact, when full duplex is enabled (only possible with switches for an obvious reason) the collision detection is turned off.
The result: no collisions and thus you can get even more efficient than token ring (no token to wait for).
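How much is lost to collisions depends heavily on frame size as well as node count. One simplified way to see this is the classic contention model in the style of Metcalfe and Boggs: with Q stations always ready to send, the chance that exactly one of them grabs a slot is A = (1 - 1/Q)^(Q - 1), the mean number of wasted contention slots per frame is W = (1 - A)/A, and efficiency is frame time over frame time plus W slots. The sketch below assumes the 512-bit slot time of 10 Mb/s Ethernet and a permanently saturated segment, which is a worst case and not a measurement.

```python
def shared_ethernet_efficiency(stations, frame_bits, slot_bits=512):
    """Fraction of the wire carrying frames under a saturated contention model."""
    acquire = (1 - 1 / stations) ** (stations - 1)   # P(exactly one station sends)
    wasted_slots = (1 - acquire) / acquire           # mean contention slots per frame
    return frame_bits / (frame_bits + wasted_slots * slot_bits)


for stations in (2, 10, 100):
    small = shared_ethernet_efficiency(stations, frame_bits=64 * 8)     # minimum frames
    large = shared_ethernet_efficiency(stations, frame_bits=1500 * 8)   # near-maximum frames
    print(stations, round(small, 2), round(large, 2))
```

In this model minimum-size frames on a crowded segment land near the 37-40% figure, while large frames stay above 90%, which may be why both sides of this argument can point to numbers that back them up.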
Re:Ethernet Dies from Collisions at 37% Urban Lege (Score:2)
I've never had a 10/T segment with more than 5 computers active get any more than 2-3Mbps in backplane bandwidth.
With a switched hub, yes, you get full bandwidth between each port (assuming the hub has sufficient backplane bandwidth to deal with all the inter-port traffic), but with normal "old-fashioned" ethernet hubs, no, you don't.
Re:Ethernet Dies from Collisions at 37% Urban Lege (Score:2)
The canonical reference is: Measured Capacity of an Ethernet: Myths and Reality [compaq.com].
Note particularly that Van Jacobson was able to obtain measured TCP throughput of 8Mb/s in the 80's.
There is a good review of the area in the O'Reilly book on Ethernet too.
ARGH! (Score:1)
Voodoo debugging (Score:5, Informative)
The promised reliability never materialized. In the early days, the TR connector was the same as that for DB9 serial ports and EGA (pre-VGA) video. L-users would frequently connect the cables incorrectly, taking down the entire LAN. In the later days, 10BaseT Ethernet replaced coax, and became slightly more reliable than Token Ring. These days, we use switched Ethernet, which is infinitely more reliable than Token Ring.
Keeping Token Ring networks running has become like voodoo management. Stories like yours are common. Nobody knows exactly WHY things are going wrong, so they are quick to point the finger at oddball stuff. There is so little support for Token Ring that nobody can figure out how to solve even basic problems. The only solution is to remove the offending products from the network.
Here is some background for what might be going wrong. First of all, your card has its own microprocessor. As a kid in the early 1980s I owned a TI-99/4a home computer/game-console: it is roughly the same CPU in your card. It runs its own embedded OS. This means that under normal conditions, your card will run fine, regardless of the driver: all the intelligence is on the adapter, not in the driver.
I point this out because you never specified exactly the types of errors you are receiving. In theory, all such errors are related to the hardware, and there is nothing the driver can do to cause them. Specifically, I don't know how it can be possible for something to "cause ring errors that eventually bring down the entire net". There are really no progressive failures like this in Token Ring.
If you mentioned the precise ring error and/or the method in which the ring goes down, it might be helpful. Here are some possible ring errors.
A burst-error is caused when an adapter inserts itself into or removes itself from the ring. This might be caused because, for some reason, Linux might be re-initializing the card. For example, you may have DHCP set to renew the lease every minute which may cause this to happen. I have no knowledge of how Linux deals with Token Ring, but if the problem is "Burst Errors", then it is because of some higher-layer interaction like this.
A "receiver congestion" error is caused when the Linux driver doesn't remove packets from the card's buffers fast enough. In theory, it is supposed to indicate that packets are coming in too fast for the machine to handle. In practice, you see this happen when machines "hang" and fail to empty their queues. You might be running some sort of libpcap packet-sniffer on the system, or have the adapter running in promiscuous mode (do an ifconfig to check), that is hitting some sort of pathological condition.
Maybe you are getting "FC errors" which indicate that somebody has the same MAC address as you. This won't happen if you use the standard MAC address built into the card, but it could happen if the Linux driver has a bug setting a locally administered address. Maybe it's setting it to all zeroes, causing a conflict with some other card that has a similar bug.
None of these errors really cause problems. Burst errors will nuke a frame as it passes by (maybe one out of a thousand) -- the hardware auto-retransmits, so it doesn't cause performance problems. Receiver congestion errors only cause problems for YOU and nobody else on the ring. A duplicate address will only cause problems with the other machine that shares your MAC address.
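For the promiscuous-mode check mentioned above, here is a minimal Python sketch of an alternative to eyeballing ifconfig output. It assumes a kernel that exposes interface flags through sysfs at /sys/class/net/<name>/flags (not something the original poster's setup necessarily had), and the interface name "tr0" is only an illustration. IFF_PROMISC is the 0x100 bit from <linux/if.h>.

```python
IFF_PROMISC = 0x100  # promiscuous-mode bit, as defined in <linux/if.h>


def is_promiscuous(flags_text):
    """Return True if a hex flags string (e.g. '0x1103') has IFF_PROMISC set."""
    return bool(int(flags_text.strip(), 16) & IFF_PROMISC)


def read_interface_flags(name):
    """Read the raw flags string for an interface from sysfs.

    Assumption: the kernel provides /sys/class/net/<name>/flags.
    """
    with open(f"/sys/class/net/{name}/flags") as handle:
        return handle.read()


if __name__ == "__main__":
    # "tr0" is a hypothetical Token Ring interface name; substitute your own.
    try:
        print("promiscuous:", is_promiscuous(read_interface_flags("tr0")))
    except OSError as error:
        print("could not read flags:", error)
```

If this reports promiscuous mode while a sniffer is running, that at least confirms the precondition for the receiver-congestion theory; it says nothing about whether the driver actually misbehaves.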
My guess is that your admins are just getting testy over the fact that your Linux box re-inserts itself more often than Windows boxen, causing a higher number of relatively harmless burst-errors. When they diagnose problems with the ring, they notice that your machine causes the highest number of errors, and therefore blame any ring failure on you.
If your machine is truly causing a problem, the only thing I can think of is that your port on the hub gets "stuck" (this happens a lot). The process of re-inserting has a small chance of getting stuck, so if your Linux box re-inserts 100 times more often than Windows, you'd see this.
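To put rough numbers on the "stuck port" argument: if each insertion independently has a small chance p of leaving the hub port stuck, then n insertions give a chance of 1 - (1 - p)^n that it happens at least once. The sketch below works that out in Python. The per-insertion probability and the insertion counts are made-up illustrations, not measurements, and independence between insertions is an assumption.

```python
def chance_of_stuck_port(per_insert_probability, insertions):
    """Probability that at least one of `insertions` attempts gets stuck.

    Assumes every insertion is independent and equally likely to stick.
    """
    return 1.0 - (1.0 - per_insert_probability) ** insertions


if __name__ == "__main__":
    p = 0.001  # hypothetical: one insertion in a thousand sticks
    for n in (1, 100):  # e.g. one insertion a day versus a hundred
        print(n, "insertions:", round(chance_of_stuck_port(p, n), 4))
```

With these illustrative figures, a machine that re-inserts 100 times as often goes from about a 0.1% chance to roughly a 9.5% chance of sticking the port over the same period.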
BTW, Token Ring is a good lesson in Zen. A burst-error is defined as 5 half-bit times without a transition. What this really means is that a station has entered or left the ring. I point this out because if you try to debug this problem yourself, you'll have to hunt down Token Ring references. Go quickly to the definition of burst-errors: if it has the "technical" definition, discard the reference and move on. If it has the "practical" definition, then you'll be in luck.
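For what the "technical" definition amounts to, here is a small Python sketch. It assumes the line has been sampled once per half-bit time into a list of 0/1 levels, and it reads "5 half-bit times without a transition" as five consecutive samples at the same level. That reading, the sampling model and the function name are simplifications for illustration, not how an adapter actually implements it.

```python
def has_burst_error(half_bit_levels, threshold=5):
    """Return True if the signal holds one level for `threshold` half-bit samples.

    `half_bit_levels` is a sequence of 0/1 line levels, one per half-bit time
    (an assumed sampling model).
    """
    run = 0
    previous = None
    for level in half_bit_levels:
        if level == previous:
            run += 1          # still no transition
        else:
            run = 1           # a transition restarts the count
            previous = level
        if run >= threshold:
            return True
    return False


if __name__ == "__main__":
    healthy = [0, 1, 1, 0, 0, 1, 0, 1]   # transitions keep arriving
    disturbed = [1, 0, 0, 0, 0, 0, 1]    # line goes quiet for five samples
    print(has_burst_error(healthy), has_burst_error(disturbed))
```

The practical point from the post still stands: a relay clicking in or out while a station joins or leaves is what typically produces such a quiet stretch.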
Re:Voodoo debugging in the Good Old Days (Score:2, Interesting)
Re:Voodoo debugging (Score:5, Interesting)
Wow! This has *got* to be what the problem was. This problem started showing up right around the time of the big Code Red hubbub. So I installed snort just to watch and see what was going on. Snort, of course, uses libpcap and puts the card into promiscuous mode. Right afterwards is when we started seeing problems on the network.
Holy schnikies! You must have been in the room! That is *exactly* what happened. They discovered these errors and basically said that the errors were the *only* thing that they could see that was wrong with the network. From this they concluded that the problem must have been caused by my running Linux.
About the only thing that does not fit is that since I've stopped running Linux on the network at work, the problem has completely gone away. Not a single recurrence in several weeks' time (I actually submitted this article to /. many weeks ago. Why it took so long to get accepted, I dunno.) They did, as part of their process of troubleshooting, replace all of the TR equipment in the closet. But even after they did that, we were still having problems. So far the only thing that seems to have fixed this problem was me staying out of Linux.
Thanks for your very informative post!
Re:Voodoo debugging (Score:2)
You're correct. That is why the last question of my post was: "How should I have done this differently so that using Linux would have been a more positive experience for my company?" From the start, my assumption was that I, without trying, made Linux look bad. My goal in posting this to /. was to share this experience so that it can be avoided in the future... by me and others who either are making the same mistake right now, or are about to.
Still, Linux was *not* doing what I asked it to do. I certainly did not ask it to generate ring errors when I put the card into promiscuous mode. I should just as soon expect that you're going to tell me that reading an email in outlook that spams all of the addresses is my fault.
While I should take much of the blame for doing something w/out fully understanding the ramifications of it, the underlying code can't get off scot-free for behaving badly when performing a normal task.
Tolkien Rings Suck? (Score:5, Funny)
One ring to bring them all and in the darkness bind them.
A great Truth! (Score:2)
ttyl
Farrell
What to do (Score:2)
Um, how about posting the actual "ring errors" that your lan admins were seeing. Also, did you try contacting Madge, since they supply the card and drivers? I'm still not really sure why this is an Ask Slashdot. While I'm sure it's within the realm of possibility that an errant (or improper configuration of a) driver hosed the network (an ex-admin of ours hosed our network with the linux box after he was let go), there isn't much detail here, there are many mailing lists devoted to this kind of thing, and your hardware vendor does support your card under linux. From their website:
yada yada. The fact that you posted this as an Ask Slashdot (and the complete lack of details) makes me question the veracity of this report. Regardless, if this did happen, it is too bad. People at your work will undoubtedly have a bad impression of linux from this. Such is life.
A TR to Ethernet Router ? (Score:1)
Yes it costs you an extra box, but it could be just some old p166 or something..
Have Fun..
Re:A TR to Ethernet Router ? (Score:2)
Then you need the network admins to either assign you a subnet, or you have to deal with NAT... and you have an extra box.
Linux on TR (Score:2)
We have token ring adapters of all types. I've found the newer IBM PCI TR adapters to work best. They use the "olympic" driver included with the kernel. I have 6 Linux machines at work using these drivers and they've performed flawlessly... except when someone unplugs the cable. In that case, the box needs a reboot, but the rest of the network is fine.
I first tried the IBM PCI LanStreamer but couldn't get it to last more than a few minutes. I'm guessing there's a problem with the buffers that freezes up the interface. I tried one of the newer Olympic cards on a whim, and haven't looked back.
If you have REALLY ancient equipment, the Tropic-based 16/4 TR Adapter/A (long and short version) is known to work on Microchannel machines. I've put together one of these and had it running as a TR-to-ethernet router for a while.
All this stuff can be gotten on eBay or elsewhere for dirt cheap.
I haven't tried the Madge or Olicom cards, but we have plenty of IBM cards, so I've stuck with those.
TokenRing is nasty (Score:3, Insightful)
Theoretically, TokenRing is wise, clever, fault-tolerant and the *only* way to get your LAN running close to ultimate bandwidth.
I *like* TokenRing. In theory.
In practise, it sucks. Specifically, it sucks *all* the bandwidth out of your network.
A healthy TR network *will* use all the bandwidth that it needs. Collisions are impossible.
Then, a node fails (which is what happened to you). Gradually, the overhead of regenerating tokens becomes an issue. Remember, a node will *never* regenerate a token unless it is convinced that its up-ring node is dead or bonkers. However, if the upstream node has gone totally doolally, the one downstream will regenerate.
Regeneration is *good*, except that the upring node is spewing filth onto the ring. So, the regenerated token is lost in garbage. Guess what happens next?
Yes! the next node tries to regenerate.....
Identifying this is easy. Find the node that is still operating. Then, find the nearest axe, and apply it with enthusiasm. Amazingly, the LAN recovers within seconds.
I still bear the scars of a TR network that failed. We split it every way we could. Eventually, we took the server with the borked ISA bus (feeding the NIC) out of the loop.
All I can say is *GET THAT BOX OFF THE LAN* You are pissing people off with it. That is *not* good for Linux.
I would say "rewrite the network drivers", but that's pointless. Say sorry (repeatedly) and advocate Ethernet. I'd rather have collisions than catastrophes...
Ring Speed? (Score:2, Informative)
Been there ... (Score:2)
This looks like something that happened a few years ago where I was working.
If the driver sets the card to 4MBits (The token ring cards have different speeds) the whole ring will pop into 4 Mb/s and the ring will crash as a result.
It was at a
I've seen this before, but not with Linux (Score:2, Flamebait)
I did not view this as a PC problem, but an inherent problem with Token Ring, so I moved that the company should migrate to Fast Ethernet, and lo and behold they did.
That was the end of that particular problem.
Token Ring issue no-one has mentioned (Score:2)
Madge cards do not like working in an environment with IBM cards (okay they have sorted things out a wee bit, but it still isn't perfect) because IBM token ring cards run to lower timing tolerances. This is fine where all other cards on the ring accept this, but Madge cards are designed to much higher specifications and easily start beaconing if the previous/next card round the ring is off on its timing.
Your best bet is to get hold of those IBM drivers and use an IBM card - and go for a test environment for a wee while too:)
Re:Tolken Ring Networks.... (SOUR GRAPES) (Score:2, Interesting)
Fix it... don't say it's not worth having anyway. Or if it's not worth having, strip token ring support out and save a few kB.
Re:Tolken Ring Networks.... (Score:2, Interesting)
Of course, Ethernet costs so little that you can build an Ethernet in a star topology for less than a TokenRing in a ring topology. "Good enough" wins again.
Re:Tolken Ring Networks.... (Score:2, Informative)
However, TR is horribly inefficient when one machine produces a disproportionate amount of traffic (which is the case for pretty much all corporate networks). Unless each machine in your ring produces a steady stream of packets, the old ALOHA collision model still wins.
Re:Tolkien Ring Networks.... (Score:3, Funny)
Re:actually (Score:2, Interesting)
Perhaps the biggest problem in the computing industry in general, and in mixed OS environments in particular, is the fact that standards are often never actually standards. Even without casting blame on any of the products in question, standards are often not as defined as they should be, and any liberties or assumptions made by programmers usually end up in catastrophic incompatibilities. Regardless of where the blame lies (MS, BSD or Linux not following standards), the solution is to vehemently define standards so that there is no question about their implementation.
Re:it is obvious (Score:1)
Re:Imagine if this was Windows... (Score:1)
1. Posts complaining that linux can't do anything right because it is open source, and if it was company controlled source they wouldn't have these problems.
2. Posts arguing that open source programmers do not take the appropriate care.
3. Posts asking why anyone is using Linux since it is obviously inferior to Windows.
Although, I have to admit, in a much smaller number
Re:Imagine if this was Windows... (Score:2, Informative)
Some how, some way, every madge nic on a particular ring would decide at almost the same time that it wanted to be the RPS (ring parameter server) and/or the active controller. Not a very nice thing on a large ring, nor is it easy to troubleshoot.
We eventually figured out the problem when (for the third time) we shut every machine on the ring down, and brought them up one by one. The machine that started having the problem changed every time, but every machine that started the problem had the same driver loaded. We replaced the cards with Olicom, got the current drivers, and never had that problem again.
Notice I didn't say never had A problem again. When Token Ring worked, it was fairly good... when it didn't, almost by design it was a pain in the (insert your choice here).
Anyhow, my 2¢.
Re:Imagine if this was Windows... (Score:2, Offtopic)
1) they destroyed America's free market for software;
2) there is no source code available so people could fix the driver.
I'd be surprised if this wasn't fixed pretty fast. That's the great thing about Linux...one of the things that always pissed me off about dealing with MS was various NT bugs that would just sit for years at a time.
Re:Imagine if this was Windows... (Score:1)
Re:Imagine if this was Windows... (Score:1)
Nah. You are thinking of Apple. MS releases with a lot of bugs unfound, but they are pretty fast in fixing them. Of course, they should find them before release to deserve any credit at all.
Re:Imagine if this was Windows... (Score:2)
Some bugs only affect a tiny minority of people, just like some 'orphan diseases' that may affect only tens of people world wide. Along with bugs that are perceived as trivial, such bugs sometimes get very little attention from a vendor, for commercial reasons. Some vendors are better than others, particularly small vendors for whom any customer is pretty important, or those with good contact with their users (e.g. some shareware vendors). A key benefit of open source is that it tends to bring users into closer contact with developers, and of course users can just become developers (or hire a developer) to fix problems.
Joining the Zealots (Score:2, Flamebait)
And then an informed reader would point out that the driver was provided by the manufacturer, not Microsoft. Thus, Microsoft itself would have little direct involvement in this case.
A more reasonable Open Source advocate might chip in that an open source driver would provide a faster path to hunting down and fixing the problem (Source is available for this driver, though I don't know what the license is - so that point may or may not be tested in this case).
There is mindless zealotry all over the tech industry, media, and public forums. It goes far beyond Slashdot and Linux. Please try to refrain from adding to it.
Re:Imagine if this was Windows... (Score:2)
This is obviously a vendor driver problem, not a Microsoft or Linux problem.
Having the source *does* help, but that, again, is up to the vendor.
How does an obviously offtopic troll post get moderated up to 5? There appears to be a lot of freebasing going on amongst the moderators today...
Re:Imagine if this was Windows... (Score:1)
I'll give you that people go easier on FS than on MS in light of any mistakes made in the past, but maybe it has more to do with track-record than a bandwagon?
Not to mention, it's a third-party driver..
Re:Imagine if this was Windows... (Score:1)
even better... (Score:2, Troll)
Re:even better... (Score:1)
Not that token ring is the greatest network architecture ever designed, but it wasn't half bad for its time either. When ethernet was still stuck in 10Mbps land, token ring provided a managed 16Mbps, with the practical difference being a bit higher than the raw bandwidth alone would suggest. Even on many lower end switches manufactured today, a single machine or small group of machines can easily hog all the bandwidth, starving others... with token ring each node gets its fair share of bandwidth.
Granted, I've thankfully never had to use it in a work environment, but at home on my shits and giggles network, I have both an 8-port Startek MAU, and a 60-port Radring with fiber RI/RO. It's fun to play with something different, or just to see how many screens' worth of ifconfig I can get on a single box. I'd love to get ahold of some of the 64Mbps or 256Mbps TR hardware that I've read about.
Imagine that... (Score:1)
Re:even better... (Score:1)
And the rewiring costs? And personnel costs to redo all the computers? And the downtime to all of this? The HW is "cheap as dirt", but you have to factor everything else in too.
Re:even better... (Score:2)
Linux users have often prided themselves on the fact that their OS had support for old hardware, forced hardware upgrades were an aspect of Windows, not Linux.
Of course, this is off topic for the original post. I sympathize with the original poster - it always sucks when your favorite OS doesn't interoperate well and gets a bad rap. But hey, Apple is still around, so it's not a fatal problem...
Apple was banned in many corporate offices. This guy's boss may never approve another person to use Linux on the corporate network again. In this case it may very well be a fatal problem. It's always a bad thing when your unique system doesn't work right, and you can't get your job done. When you bring down the network, no one can get their job done, and they have to debug the network to isolate your system. That's pretty much a fatal problem. It's real hard to justify Linux as cost effective when they just wasted thousands of dollars because they let the first person hook up a Linux box.
Re:even better... (Score:2)
but... protocols should be designed to withstand broken, even pathologically broken, operating systems. Ergo, ethernet is extremely popular because, among other things, it's fairly fault-tolerant, in that a fault on one node causes minimal disruption on other nodes. Token ring, however, is so stupidly easy to break it's a joke.
Re:even better... (Score:2)
Re:Possible solution (Score:1)
You'll have to ask the vendor. This was a vendor-supplied driver, not a generic driver in the distro (just checked, there is only an MCA madge token ring driver in the official kernel). Added to the fact that people with other token ring cards are running Linux just fine, this seems like this is much like most windows problems are claimed to be, just a bad vendor-supplied driver.
Re:Possible solution (Score:1)
Cisco still requires learning steps in token ring devices for most certifications they provide.
This is what is stopping me from trying for my CCIE at the moment; I'm trying to find some token ring devices to set up in my working lab environment.
Re:Brought down the ENTIRE network? (Score:2, Informative)
You should pull out your 'network protocol design books' and read up on the fundamental differences between token ring and ethernet. On a token ring network, each node plays an active part in passing the token. If one node is misbehaving, it _can_ seriously affect the rest of the network.
Re:Brought down the ENTIRE network? (Score:1)
Re:simple... (Score:1)
Re:Buy Stuff!!! Help the Economy on Monday (Score:1)