Has Linux returned to the 'bad old days'?

I’m posting out of frustration. I thought we had gone past the point where updates broke things. One of our systems runs Debian - configured for the ‘stable’ branch. A simple apt update broke exim4 and parts of Nagios. With the number of servers we have to maintain, I’m not confident about increasing our installations of Debian. We just don’t have time for this.

Is Debian supposed to be used for mission critical systems? (actual question)

I know it’s known as one of the most reliable stable (unchanging) distros, but I’ve never heard it advertised as a mission critical system. One argument might be that packages left unchanged over a long enough period of time could produce unreliable interactions with the new software and protocol changes the system may need to use.

I’ve always heard RHEL and Ubuntu Server recommended to me. My laptop, for example, is running CentOS 8 (at least till December) and it’s been bulletproof. Someone correct me if I’m wrong here.

I’ve heard of people using Debian but not on large scale. There’s no commercial support unless there’s a third party that offers this.

Digital Ocean has Debian available for servers as does Linode so in theory there could be a lot of Debian servers out there.

Let’s not forget that openSUSE Leap 15.3 is now binary compatible with SLE, with a simple upgrade to SLE if commercial support is needed. It’s a good option. But the reason to use Debian would be to avoid commercial entities and support it yourself. Luckily it’s stable enough that it doesn’t need much, except in this case apparently.

No matter what distro you use, you should always research the compatibility of the applications you depend on for the new version of the distro.

If stuff is mission critical, I suggest you run a “clone” of your production system and test out updates using that, before updating the production server.

Debian had a new release recently (10 to 11), so basically everything changed as it gets updated.

I’ve had some time to cool off. Managed to fix almost everything with 2 hours of overtime. In retrospect, this disaster was partially self-inflicted. Because recent updates have been so reliable, I’ve developed the bad habit of running the apt commands without carefully reading through all the packages that are being upgraded and removed. Discovered that mailman was no longer included in stable, so had to add the repository for an older version of Debian. Had to muck around in the exim4 configuration and change a few settings in Nagios. Still not 100% fixed, but it’s good enough.
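
One way to avoid skimming past removals is to have a simulated run list them explicitly. This is a minimal sketch, assuming a Debian system with apt-get: with -s (simulate), apt-get prints one line per planned action and prefixes removals with "Remv". The function name list_removals is made up for illustration.

```shell
#!/bin/sh
# list_removals: read the output of a simulated apt-get run on stdin
# and print only the names of packages that would be removed.
# (Hypothetical helper name; apt-get -s marks removals with "Remv".)
list_removals() {
    awk '$1 == "Remv" { print $2 }'
}

# Typical use (nothing is changed on the system, -s only simulates):
#   apt-get -s dist-upgrade | list_removals
```

If that prints anything you depend on (mailman, in this case), stop and investigate before running the real upgrade.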

From what I understand, Debian has partners or consultant companies that can provide commercial support - not that we could afford it. Still not happy that the ‘stable’ release cycle actually broke stuff. From what I was able to find, the developers knew that things would break but didn’t bother to write anything to fix it. So my conclusion remains that Debian may be fine for a couple of servers, but isn’t suitable for anyone who has to maintain a lot of servers - or can’t tolerate hours of downtime to resolve problems.

Another result of this incident is that I’m unlikely to apply patches to this server on a timely basis in the future. What I was reading about the exim4 changes implied that they’re going to break it again in the next release.

Agreed. And, I’ll also add that you can lock versions of specific packages so that apt will not upgrade them. Then test them in DEV. Once you are ready to upgrade, use apt to install/upgrade to a specific version.
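
A rough sketch of that workflow, using apt-mark hold/unhold and the package=version form of apt-get install. The wrapper function names and the DRY_RUN switch are hypothetical, and the version string is a placeholder to fill in after testing in DEV. Run as root.

```shell
#!/bin/sh
# Hypothetical wrappers around real apt commands; run as root.

# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Keep the named packages at their current version during upgrades.
hold_packages() {
    run apt-mark hold "$@"
}

# Release the hold and install one specific, already-tested version.
# $1 = package name, $2 = version string as shown by `apt-cache policy`.
upgrade_to_version() {
    run apt-mark unhold "$1" && run apt-get install "$1=$2"
}

# Example (the version is a placeholder, not a recommendation):
#   hold_packages exim4 mailman
#   apt-mark showhold
#   upgrade_to_version exim4 <tested-version>
```

Setting DRY_RUN=1 first prints the commands instead of running them, which is handy for checking a script before pointing it at a production box.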

Also, one more thing…consider Ansible/Puppet/Chef, etc. These could have helped you tremendously in this type of situation.

I always run Timeshift before any update just to be on the safe side!

Keep in mind that this might be the fault of a developer or a package maintainer. Don’t fault Debian as a whole for the failure. This could have happened with any distro due to human error.

What this has shown is that you need a process to quickly restore service.

Actually it is. They state so when you read the Debian wiki.
The mission statement and reasons for Debian. Scroll down to enterprise.

https://www.debian.org/intro/why_debian.en.html

Every software and distro can fail at some point. Debian explicitly supports in-place upgrades but you should read the release announcement and errata pages.
When you do a dist-upgrade you normally have to read through the apt change log.

Of course if you need paid support there are RHEL and Ubuntu but regarding Ubuntu, there is no difference in the package base and packages outside of main are not getting the same care as packages in Debian, where everything is in main except non-free.

OldFart,

I apologize in advance if I am mis-reading your posts, but I had a couple of thoughts.

To me your posts read like you actually have “stable” in your sources.list file instead of the actual release name of “bullseye”, which is the current stable release, that came out on 8/14/2021. The problem with that is that if you install from buster install media, which is the previous version, and change the sources.list file from buster to stable, then apt update/upgrade will work just fine until the next release comes out, bullseye in this case. At that point it will start pulling from the bullseye repos instead of the buster repos.

This page has a warning not to do that:

https://wiki.debian.org/SourcesList

Distribution

The ‘distribution’ can be either the release code name / alias (jessie, stretch, buster, sid) or the release class (oldstable, stable, testing, unstable) respectively. If you mean to be tracking a release class then use the class name, if you want to track a Debian point release, use the code name. Avoid using stable in your sources.list as that results in nasty surprises and broken systems when the next release is made; upgrading to a new release should be a deliberate, careful action and editing a file once every two years is not a burden.
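
A quick way to check a server for this is to grep its sources file for release class names in the suite position. This is only a sketch: the function name is made up, and it treats suffixed suites such as stable-updates or stable/updates as matches too.

```shell
#!/bin/sh
# find_release_classes: print (with line numbers) the active deb/deb-src
# entries in a sources.list-style file whose suite is a release class
# (oldstable, stable, testing, unstable) rather than a code name.
# Hypothetical helper; commented-out entries are ignored.
find_release_classes() {
    grep -nE '^[[:space:]]*deb(-src)?[[:space:]]+(\[[^]]*\][[:space:]]+)?[^[:space:]]+[[:space:]]+(oldstable|stable|testing|unstable)([-/][^[:space:]]*)?([[:space:]]|$)' "$1"
}

# Example:
#   find_release_classes /etc/apt/sources.list
```

No output means every active entry names a specific release. Any .list files under /etc/apt/sources.list.d/ would need the same check.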

Unfortunately, Debian’s release upgrade process, to go from buster to bullseye for example, is much more involved than a simple apt update/upgrade pair:

https://wiki.debian.org/DebianUpgrade

The other rule that Debian preaches is to not mix any of the other repos with stable, like testing, sid, oldstable, etc, which I think that you did to get mailman installed. On a plus side though, it looks like mailman is still in bullseye, but it got renamed to mailman3.

Assuming that is what happened, then in my opinion, your best bet would be to blow it away and install bullseye from scratch, leaving the sources.list file as bullseye. That would give you the best chance to properly upgrade down the road to the next release, bookworm, when it comes out, using the link above as a guide.

Just my two cents.

Actually, changing my sources.list from buster to bullseye and then running apt upgrade is pretty much how I upgraded all of my servers.

I did, of course, run apt clean and apt autoremove. I have not experienced any issues with the upgrade.

I don’t have any extra repos enabled at this point, not for my servers.
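
For reference, the sources.list edit described above can be scripted roughly like this. One detail matters for this particular upgrade: the security suite was renamed from buster/updates to bullseye-security in Debian 11, so a plain search-and-replace of the code name isn’t quite enough. The function name is made up, and it only rewrites text from stdin to stdout so the result can be reviewed before it replaces the real file.

```shell
#!/bin/sh
# buster_to_bullseye: rewrite sources.list text from stdin to stdout.
# The security suite rule must run first, because the generic
# buster -> bullseye rule would otherwise produce "bullseye/updates".
buster_to_bullseye() {
    sed -e 's#buster/updates#bullseye-security#g' \
        -e 's#buster#bullseye#g'
}

# Example: write a candidate file and compare before installing it.
#   buster_to_bullseye < /etc/apt/sources.list > /tmp/sources.list.new
#   diff /etc/apt/sources.list /tmp/sources.list.new
```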

Looking at the man page for apt, it looks like the difference between “upgrade” and “full-upgrade” is that “full-upgrade” will remove packages if needed to upgrade a different package, while “upgrade” will never remove a package, and will skip upgrading any package that requires the removal of a package:

upgrade (apt-get(8))
           upgrade is used to install available upgrades of all packages 
currently installed on the system from the sources configured via 
sources.list(5). New packages will be installed if required to satisfy 
dependencies, but existing packages will never be removed. If an 
upgrade for a package requires the removal of an installed package 
the upgrade for this package isn't performed.
full-upgrade (apt-get(8))
           full-upgrade performs the function of upgrade but will remove 
currently installed packages if this is needed to upgrade the system 
as a whole.

I’m pretty sure I’m wrong.

I’ve been working through my arguments, and the cases I can make against Debian are generally the same cases I can make against RHEL and Ubuntu Server.

For example… Static (stable) releases lack newer security features, so even if their existing security features are perfected, the gap between them and what newer software covers grows over time.

Static releases also lose reliability over time to the degree they need to interface with external data or software that’s continuing to change.

Those arguments are the exact same for RHEL and Ubuntu Server.

This is indeed confusing in the “apt terminology” and therefore it was always recommended to use:

apt update
apt dist-upgrade

It is the equivalent of full-upgrade, and on sid you should not even use apt upgrade after the update command.

The manual for apt dist-upgrade:

dist-upgrade in addition to performing the function of upgrade,
   also intelligently handles changing dependencies with new versions
   of packages; apt-get has a "smart" conflict resolution system, and
   it will attempt to upgrade the most important packages at the
   expense of less important ones if necessary. So, dist-upgrade
   command may remove some packages

I think that is one of the reasons why KDE Neon introduced a wrapper around apt, whose name I forgot. I even think that Ubuntu’s update manager does the same, and it should be the default way Synaptic uses apt for its updates.
I could be wrong, but full-upgrade was added later and of course does basically the same:

full-upgrade
   full-upgrade performs the function of upgrade but may also remove
   installed packages if that is required in order to resolve a
   package conflict

Though I am used to the old ways. Both commands do not only serve for updating on a regular basis but are also important for Debian point releases and especially when you do an in-place upgrade to the next Debian release.

That is my understanding too, that dist-upgrade and full-upgrade are equivalent and do the same thing. It’s just that dist-upgrade comes from the apt-get command, while full-upgrade comes from the apt command.

According to the man page Debian added the apt command after the apt-get command in an effort to create a higher level interface that was designed to be run directly by the end users in a terminal, while apt-get is lower level and more script friendly:

DESCRIPTION
       apt provides a high-level commandline interface for the package 
       management system. It is intended as an end user interface and 
       enables some options better suited for interactive usage by default 
       compared to more specialized APT tools like apt-get(8) and 
       apt-cache(8).

       Much like apt itself, its manpage is intended as an end user 
       interface and as such only mentions the most used commands 
       and options partly to not duplicate information in multiple places 
       and partly to avoid overwhelming readers with a cornucopia of options
       and details.

From my reading, I think mailman3 is more than just renaming the project. It looks like a significant amount of work would be needed, and some amount of communications to the users to warn them that things would look and act a little different after the upgrade.

I can’t speak for all the overworked network support people, so I’ll just speak for myself: I don’t have time to deal with upgrades that break things. We are rolling out new applications and servers too quickly to properly prepare for a patching process that isn’t absolutely smooth.

The sad fact of our industry is that Linux cannot afford to be “as good as” or “as reliable as” the alternatives. It must be better. Too many of our applications require Windows servers. Therefore, network admins have no choice but to learn how to deal with the problems of Windows updates. We don’t have time to also learn how to deal with troublesome Linux updates. If we postpone applying patches until we have time to deal with possible failures, then we’ll “never” apply patches. If we never (or rarely) apply patches, then we’ll be at risk. If our security audit shows that outdated Linux servers are a risk, then we’ll probably be forced to migrate to a Windows server or an Internet service.

I truly prefer Linux - but I’m struggling with the realities of an understaffed department with no training budget. Our Linux implementations are probably going to be relegated to skunkworks projects (i.e. trials and low visibility applications that can afford to be broken for a while).