Planning the future of Systems Management software?
For some reason or another, I’ve been doing this “systems management” thing for a while. I started out at AT&T with a job that turned into writing some in-house monitoring tools, then I did some storage management stuff, took a brief (but well-appreciated) break doing digital video software for law enforcement (still kind of “management”, only different), and now I’m doing multi-machine systems management again, with a lot of focus on provisioning and automation. Cobbler and Func are very fun, and I think they are quite useful, but I’m wondering what is next on the horizon for server management tech, not in terms of an evolutionary improvement but in how things can be legitimately improved by fundamental, indeed “paradigm-shifty”, means.
I’m starting to think, why are there so many Linux/Unix systems management applications out there? I’m not entirely sure why, other than people may think it’s good to build more of them. There aren’t this many web browsers, web servers, desktop apps (maybe text editors, as they are easy for little projects), window managers, DAWs, or anything else. I think it’s ultimately because there’s just a huge industry of people whose job it is to do /things/ with servers. That’s fine. Computers are everywhere. Still, it seems we overcomplicate things, and you can’t just plug in a network of servers and have them do what you want and keep working and all talk to each other like Furbies. I want that.
As datacenters are shifting to the model of shipping servers in shipping containers and plugging them in, I think more and more people want that. But is management software solving that problem, or are we just building more of the same solutions? I’m not sure.
Perhaps we are just trained to make more management software in an attempt to automate the automators? If we make our management problems too simple, we then create other tools we have to simplify with our newfound free time, and the cycle does not end. Things that make some things simple make other things hard. We just create more work and complexity, when ideally we’d all like to live on the beach and let robots do our work for us. Ultimately, can you make a datacenter run completely by itself? Unlikely, and we don’t seem to be getting there. While fancy phrases like “Holistic Computing” come up (or whatever they are calling it now), these are illusory. They do not exist. What we do get, however, is lots of management software that attempts to make things simpler. Attempts to automate generate a new class of problems. Solving many of these problems is naturally very important (cue “grid”, “cloud”, and other trends).
Why are there so many management apps out there, and why is this such a big industry? Part of it is that in the past it wasn’t easy to work with the closed ones to improve them, so people wrote their own, and those often turned into other systems management products. Other OSS projects sometimes don’t portray themselves as being open to new contributors, so people create new ones, and sometimes developers just think they can do something better. Bottom line, a lot of people are doing the same thing, and it seems to be a massive amount of overkill, because any standard that attempts to unify and interconnect things (CIM, etc.) is usually done pretty darn badly, so it’s not really an answer and doesn’t eliminate any real work. Some things are just biologically grown, starting as simple automation shell scripts and turning into management software. Other sites have complex needs, so tools that use A, B, C, and D are almost perfect, but “D” should really be “E”, so they build their own thing. So there is a lot of software diversity in this field. (Studying it with biological models would probably be pointless, but you’ve got to wonder about all the reasons projects come into being and the forces that make them evolve.)
Anyhow, a couple of things people don’t realize they are getting into when building systems management apps:
- The Fallacies of Distributed Computing are all there.
- Everything is glue. What isn’t glue is plumbing, which is held together by glue. The glue never stops and there are multiple kinds of glue in everything. Sticky!
- All the components being glued together change every year or so without regard for the other components. So, despite 10 years of software development in “management” technology, the tools we invent today are very similar to what we invented ten years ago; they just manage slightly different things. Maybe the interfaces look different or they manage today’s technology instead of yesterday’s, but it is fundamentally the same: still databases, RPC, interfaces, systems… yet because components change we need to keep rewriting and redoing the above.
- Due to pressures of “complying” with industry trends, lots of software gets more complex over time in ways that it shouldn’t need to. The result is that apps run no faster now than they did ten years ago; they just took 10 times longer to write! Ouch! (Some pushback in Python/Ruby/etc. land, thank you! But in general, we tend to make things more complicated than they should be.)
- Often the management problems we try to solve are problems created by other software; we have workflows producing feedback loops that are hard to eliminate.
It’s a lot of work, and the off-the-shelf stuff may do 90% of it, yet we keep building more and making the glue problem harder to solve. So the moral of this story is, kids, write video games. We need more of those.
No, that’s not the moral of the story. The moral of the story is “I think we need to revisit the general problem of ‘What is Systems Management?’ with the perspective that it never existed.” How do we build architectures that survive over time, that are easy to grow upon and survive the unknown future, and how do we build environments where everyone can collaborate happily on common tools and reduce all of this duplicated work? We’re not there yet. Not even close, and it is not possible to steer the universe to agree on such a plan.
If we built an erector set for systems management “glue”, would someone use it? What sort of technologies in the management space can be built to /enforce/ collaboration and eliminate the continual rewrites of management software churn? How do we transform an army of systems administrators into one unified army of developers building /THEIR/ one perfect systems management LEGO/Erector/Voltron set? What sort of components should go into it? How do the various aspects in people-land get set up to make sure it’s cohesive and not over-complicated?
Hard problem. Fun problem. The trick is in sharing our lessons learned so that other folks can help solve these problems as well, and we can start to have general practices shift in a way that starts to make our life easier. That’s an evolutionary process in itself. Maybe it /is/ solved in ten years. Or, in ten years, will our management apps look basically the same as today?
“Holistic computing” has been a failure that never appeared as reality. I think what we really want to achieve as a long-term goal is “everything is zeroconf” (thinking of zeroconf not as Avahi, but in the sense that I really don’t want to know how to configure 12 different services each with 100 different parameters) mixed in with a fair bit of artificial intelligence and “plug and play on steroids”.
That’s so true, and yet I’m building an administration interface for my server pool…
Steven
August 29, 2008 at 7:26 am
I tell you what I want. The Fedora Home Server spin. Let me explain.
I have DSL at home to get Internet service.
I have a wireless router, but it sucks. I’ve used SmoothWall and it’s great, but it only runs on that one box, and wireless is hard to set up, and it doesn’t do disk storage.
ClarkConnect does file storage and *other* stuff but is a bit more complex. But not as hard as setting up Fedora to work as a home router and server.
I used cobbler to get some PXE booting and network installation working, but I ended up having to set up two boxes, one to run SmoothWall, and another Fedora box to do the PXE stuff.
I see the systems management stuff in Spacewalk, but I really just want one box, not two or three.
And… I’m just a regular user, I can read, and I can assemble a computer, but I’m not a Linux, systems management, or networking expert.
I just want a box that can connect to DSL, act as a firewall and router, serve files, run PXE and stuff. Help plz?
Spacewalk
Marland V. Pittman
August 29, 2008 at 8:05 am
Marland,
It seems you’d like something like an all-in-one SoHo appliance type thing. I agree, that would be interesting — I’m more pondering the future of large datacenterness, though that’s one area that is also lacking, and would be useful in competing with some of the M$ offerings.
michael.dehaan
August 29, 2008 at 8:07 am
Steven, EXACTLY
It would be nice if services found each other, magically started working together, and built their own and…
I also want a pony. Actually I don’t want a pony.
A llama, maybe. If it was robotic. Lower maintenance that way.
Just need to find the steps to start getting us there.
michael.dehaan
August 29, 2008 at 8:09 am
Well, I used to deploy webmin on my servers; this was a first step to unify all my different Linux distributions on a single administration interface for simple tasks. I did that until webmin made one of my servers segfault and the hard disk crashed. Great.
Now I have a set of small shell scripts on each server, to do very specific tasks and I call them via a web interface. I probably reinvented the wheel but at least I know what it’s doing and it fits my needs.
Steven
August 29, 2008 at 9:13 am
Webmin’s failing is that it’s strictly 1:1, and yeah, we all reinvent the wheels, the entire systems management thing is about wheel reinvention, so there’s nothing really wrong with it if your wheel is square.
I just kept thinking about things like Moore’s law, where my flying car was, and why software for managing things is still fundamentally the same despite lots of improvement in hardware and a lot more proliferation of servers. Perhaps there is no point or immediate solution here. Random thoughts
michael.dehaan
August 29, 2008 at 9:20 am
I think what I really wanted to say was… I know you’re working more on the “big/many” system stuff, but that I had a really good experience with cobbler (except that it had no UI) and that I really hope the integration with other components comes along to the point where I can get what I want more easily…
So, thanks for cobbler, maybe I’ll get off of my duff and just make a Fedora Home Server spin myself… I know webmin isn’t the end-all solution for remote management, but I don’t even have square wheels to re-invent. Is there a good GUI-based remote management tool packaged with Fedora?
Marland V. Pittman
August 29, 2008 at 10:20 am
FWIW, cobbler has had a UI for the last year or so.
FuncWeb was intended to be what you seek, but we have found most folks really don’t want the UI for Func. No, there’s not really a good remote management many-to-many type solution for configuring services, mainly because modeling the various servers and all versions of them is a huge challenge. That’s why Webmin is a mess (look at its Apache config, for instance). This is why tools like config management systems work at the level of the config file rather than modeling the application itself.
We believe it’s better to use a config management system for app config sort of things and then use Func for assorted tasks that don’t fit into that model (like restarting machines, running commands, misc scripting, or diagnostics).
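To make the “level of the config file” idea concrete, here is a minimal Python sketch. It is not taken from Cobbler, Func, or any particular config management system; the template text, the host names, and the render_config and needs_update helpers are all hypothetical. The point is only that one template plus per-host values yields each host’s file, with no model of the application behind it.

```python
from string import Template

# Hypothetical template for a small web server config.
# The tool only knows about placeholders, not what the directives mean.
VHOST_TEMPLATE = Template(
    "Listen $port\n"
    "ServerName $server_name\n"
    "DocumentRoot $doc_root\n"
)


def render_config(template, settings):
    """Fill in the template; raises KeyError if a placeholder has no value."""
    return template.substitute(settings)


def needs_update(rendered, current):
    """A file is rewritten only when the rendered text differs from what is on disk."""
    return rendered != current


# Per-host values (made up for illustration).
hosts = {
    "web1.example.com": {
        "port": 80,
        "server_name": "web1.example.com",
        "doc_root": "/var/www/html",
    },
    "web2.example.com": {
        "port": 8080,
        "server_name": "web2.example.com",
        "doc_root": "/srv/www",
    },
}

rendered = {name: render_config(VHOST_TEMPLATE, s) for name, s in hosts.items()}

if __name__ == "__main__":
    for name, text in rendered.items():
        print("# " + name)
        print(text)
```

Pushing the rendered files out and restarting services is left out of the sketch; the ad hoc tasks described above would cover that part.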
Michael DeHaan
August 29, 2008 at 11:13 am
See also: http://people.byte-code.com/fcrippa/wp-content/uploads/2008/06/fcrippa_large-scale-env.pdf
Michael DeHaan
August 29, 2008 at 11:17 am
You really can’t expect hobbyists to write systems management software that would be good enough (to stop new projects re-inventing the wheel, and actually work too) because they have no idea what good systems management software is like. Something like eDirectory or Active Directory simply does NOT exist in the open source world.
rawsausage
August 30, 2008 at 2:50 am
rawsausage: good point.
What we really needed at work was something like eDirectory, but we ended up using OpenLDAP and pam_access on each server to manage permissions. For which we had to write lots of glue to manage it centrally, etc.
I think there are several reasons we keep reinventing things:
- We’re sysadmins, not programmers with a CS degree
- No time to write lots of software, because we’re constantly disturbed by operational issues. So that’s why we have to pick existing tools and glue them together, fast.
- And of course, the existing commercial system management systems don’t actually do anything for you. You can install Tivoli plus its agents, but you still have to write enormous amounts of scripts, find ways to version control them, etc.
Ruben Kerkhof
August 30, 2008 at 3:08 am
sausage, I’m amused to have picked up a troll. Or have I?
Naturally management software is not a hobbyist endeavor, it is written by software developers and systems administrators. The idea that OSS software is exclusively created by hobbyists is FUD, and frankly is insulting and ignorant. This does not diminish anyone who does things in their spare time, but is a typical line out of the Microsoft bible. The idea that AD is “goodness” is clearly due to a lack of familiarity with all of its badness. AD can cause a boatload of problems for administrators of Windows networks, as I have experienced first hand on numerous occasions.
This post is not to stop people from inventing things that solve their problems, it’s more of an open-ended question about where the future lies for datacenter automation and getting us beyond the way we do things now, and to ask why there is so much diversity in the systems management space when all apps, proprietary and not, are essentially doing the same thing they have always done.
michael.dehaan
August 30, 2008 at 7:26 am