Microsoft Linux Repos Suffered 22-Hour Outage (arstechnica.com) 41
"Everything from Visual Studio Code to Microsoft Edge and Teams package links were affected," reports Windows Central. They note Azure's status page (which now shows the issue lasting for more than 22 hours), though however long it lasted, "it's a virtual eternity for those whose entire ecosystem is crippled by such an outage."
According to Ars Technica, starting on Wednesday, "packages.microsoft.com — the repository from which Microsoft serves software installers for Linux distributions including CentOS, Debian, Fedora, OpenSUSE, and more — went down hard..." The outage impacted users trying to install .NET Core, Microsoft Teams, Microsoft SQL Server for Linux (yes, that's a thing) and more — as well as Azure's own devops pipelines.
We first became aware of the problem Wednesday evening when we saw 404 errors in the output of apt update on an Ubuntu workstation with Microsoft Teams installed. The outage is somewhat better-documented at this .NET Core issue report on Github, with many users from all around the world sharing their experiences and theories...
The entire repository cluster that serves all Linux packages for Microsoft was completely down — issuing a range of HTTP 404 (content not found) and 500 (Internal Server Error) messages for any URL — for roughly 18 hours. Microsoft engineer Rahul Bhandari confirmed the outage roughly five hours after it was initially reported, with a cryptic comment about the infrastructure team "running into some space issues."
Eighteen hours after the issue was detailed, Bhandari said that the mirrors were once again available — although with temporarily degraded performance, likely due to cold caches.
Aptly named "cluster" (Score:5, Funny)
The entire repository cluster that serves all Linux packages for Microsoft was completely down...
Now you know why they call it a "cluster".
Re: (Score:2)
It's like Seinfeld and rental car companies. You know how to take the reservations but you don't really know how to hold the reservations.
You can say you have a cluster, but if it's not really clustering, that's not really a cluster is it?
Wear protection. (Score:4, Funny)
Alternate title: When the internet goes into self-defense mode.
Advice to Microsoft (Score:1)
If you adopt a puppy, make sure it likes you first and won't bite your hand.
My guess is the Linux source code tree finally enacted revenge. Unlike MS, it has a long memory and it hasn't forgotten years of abuse.
Is this the third step so soon? (Score:1, Troll)
Embrace
Extend
Extinguish
Maybe someone else on the Internet decided .Net was kind of a joke for Linux or something.
As for MS SQL, I mean, really, unless you're transitioning some MS mess to Linux, there are probably a whole ton of better DB alternatives when you're using Linux. (And for God's sake, I am NOT talking about Oracle solutions. Friends don't let friends use Oracle.)
Re: (Score:2)
I find having SQL Server in Docker containers is actually useful for testing in a development environment, when you are locked to SQL server in production.
Re: (Score:2)
Yes, someone -- perhaps an intern -- got things out of order. They extinguished their embrace, before really extending anything.
Re: (Score:2)
They wrote software for Linux! EEE!
They don't write software for Linux! EEE!
They have coffee in the morning! EEE!
Re: (Score:2, Insightful)
Like what exactly? I readily concede that when the license costs / restrictions are considered, there are a lot of projects and general use cases where PostgreSQL is probably the route to go. I would agree with the statement that MySQL might be the route to take over PostgreSQL if the needs are modest and the maintenance effort and required knowledge are to be minimized.
However if money is no object-
I really can't think of any case within the sphere of traditional transaction RDBMS applications (specifically exclud
Re: (Score:2)
Re: (Score:2)
xp_cmdshell actually works but you get '/bin/sh' as the command interpreter. Some of the other missing stuff, I can't say I have tried, like file operations could probably be trivially replaced with some .Net extensions (that don't use unsafe).
I agree, though, that there are certainly Windows platform elements that are not fully abstracted that client applications might care about. I was not really suggesting that MSSQL on Linux was a drop in replacement for MSSQL on x86/64, back in the day the same was true of t
Re: (Score:2)
I'd be looking for reasons NOT to pick SQL Server, rather than the other way round, given the landscape today.
It's expensive and not open. Those are two very good reasons.
Re: (Score:2)
In my experience:
Postgresql - every once in a while you'll find a new feature that is really nice.
MSSQL - basically acceptable. Best GUI if that's what you want.
MySQL - every once in a while you'll find a new 'feature' that is really awful and causes problems.
Re: (Score:2)
Can you elaborate further on their plans? How does anything get extinguished?
Independent mirrors (Score:5, Informative)
This is why most linux distro sites include independent mirrors around the world operated by third parties. Even if the primary goes down, it just means the mirrors won't receive any new content for a while - users can still install packages.
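A minimal sketch of that fallback idea, not how any particular package manager actually does it. The mirror URLs are made-up placeholders, and `http_probe` and `first_available` are hypothetical helper names: walk a list of mirrors in order and use the first one that answers, treating the 404s and 500s from the story as "down".

```python
from typing import Callable, Iterable, Optional
import urllib.error
import urllib.request


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request to `url` comes back with a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # urlopen raises HTTPError (a URLError) for 404/500 responses;
        # timeouts and DNS failures surface as OSError.
        return False


def first_available(
    mirrors: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first mirror that passes `probe`, or None if all fail."""
    for url in mirrors:
        if probe(url):
            return url
    return None


# Simulated outage with a fake probe (no network needed):
# the primary is down, one third-party mirror is still healthy.
MIRRORS = [
    "https://primary.example.invalid/repo",   # placeholder URL
    "https://mirror-a.example.invalid/repo",  # placeholder URL
]
healthy = {"https://mirror-a.example.invalid/repo"}
chosen = first_available(MIRRORS, probe=lambda url: url in healthy)
```

The probe is passed in as a parameter so the selection logic can be exercised without touching the network; swap in the default `http_probe` to check real endpoints.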
Re: (Score:2)
MS cannot have that. People may get scared otherwise when they realize they have been using 3rd rate MS "technology" all this time when they could have run reliable and secure things instead.
Re: Independent mirrors (Score:2)
We're talking about Windows users here. If you didn't get it from an official Microsoft site, it must be pirated software. And you are evil.
A necessary part of EEE.
Re: (Score:2)
How on earth do Microsoft's LINUX repositories have anything to do with Windows??
Ecosystems (Score:2)
"it's a virtual eternity for those whose entire ecosystem is crippled by such an outage"
Otherwise known as morons who don't have a plan B for their business critical systems and deserve to fail.
Re: (Score:2)
Re: Ecosystems (Score:4, Insightful)
Feel free to set up multiple CDNs and multiple cloud providers over multiple regions. For most it's not worth the cost.
I suggest you will get better results by avoiding anything Microsoft though. AWS and Google both work great and, strangely, people say Oracle cloud works too.
Re: (Score:2)
Single Point of Failure is a design feature widely used by Microsoft and all Microsoft software users. If there is a Single Point of Failure and something fails, then your Microsoft Support Technician will be able to tell where the problem is located.
Re: (Score:2)
"Except that now you don't have a choice"
Yes, you do. You can - shock horror! - have your own backup servers! Who knew?
" Office, in an offline config?"
Err, yes, no problem. What planet are you from?
Re: (Score:2)
Of course, in this case the people affected would be people who had a need to install the software that day.
I think it would be unreasonable to back up the install sources for all software that you *might* need to install one day. Of course, describing a one-day delay in installing the Electron app versions of Teams and VS Code as 'crippling' seems a bit melodramatic (for Teams, the website is better than the 'local' app for Linux anyway, and VSCode you can just use something else in the meanwhile, it may be less o
Anyone surprised? (Score:5, Interesting)
I've used Azure. I've read their documentation. I've talked to their level 1 through 3 support. I've read their incident reports. I've run production there. This isn't a surprise, Azure is a train-wreck through and through. It's the cloud of choice of those who don't get a choice because management believed the promises made by industry leading salesmen.
The only surprise is that anyone is actually surprised.
Re: (Score:2)
_That_ bad? They have outages in the order of a day and businesses are willing to tolerate that?
No worries... (Score:2)
Welcome to MS "quality" levels (Score:2)
Where outages are the norm, insecurity is business as usual and sysadmins and developers are semi-competent at best. Got to make sure folks used to MS feel right at home here, cannot have servers just running forever without problems. People may get scared otherwise.
Re: (Score:2)
LoB
Not mentioned, Github Actions. (Score:4, Informative)
Github actions was pretty severely affected by this.
All of the actions that run on ubuntu or any of the other flavours of linux were unable to start for at least as long as mentioned in the story summary.
I think the outage was actually a bit longer than suggested, as it took a while for the CDN networks that microsoft uses internally to pick up the new versions, and then after that there was a backlog of jobs to process on github actions that made the downloading of images quite slow (installing linux at 40kB/s takes a while).
I'm slightly disappointed that the status page for github actions didn't list any info about this. It left users unable to know if there was anything they could do to make their stuff work again, or if they needed to just wait it out.
Re: (Score:2)
This is a phenomenon that broadly frustrates me: CI environments that reinstall from scratch to vet every single commit. At best it can mean CI tests that take a couple of hours on actions that no commit could possibly mess up; at worst the first pull request to trip over a broken dependency arbitrarily has that developer saddled with a problem that wasn't even vaguely his responsibility.
As an asynchronous test, and a gate for tags, sure, a scope where the project maintainers are the ones that get notif
Re: (Score:2)
They meant their software packages *for* those OSes.
MSSQL server, all sorts of dotnet packages, powershell, azure integration, etc.
I'm not saying ... (Score:2)
Re: (Score:2)
First the power company, then meat packing industry, and now Microsoft Linux repos?
It's like these hacks are escalating, but in reverse order.