Category: Software Architecture

  • Summary: Occamality and Software Architecture

    This is a summary of my posts on the single most important principle of software: Occamality (non-redundancy). This principle applies to everything from the simple concept of what makes a piece of software "good" to developing and evolving software quickly, efficiently and with high quality. It drives good software architecture and all other aspects of development, from requirements through QA. 

    This summary also includes my posts on software architecture, mostly explaining why widely accepted architectures like microservices are terrible.

    To start, it's worth pointing out that software people don't know what makes a piece of software "good."

    https://blackliszt.com/2023/09/how-do-you-know-if-a-given-piece-of-software-is-good.html

    Software people often have strong thoughts about software languages and architecture. However, it is extremely rare for those opinions to be grounded in or related to the goals of software architecture. What are the goals? Here’s my proposal.

    https://blackliszt.com/2022/05/the-goals-of-software-architecture.html

    Here are the key specific things you do to accomplish the goals.

    https://blackliszt.com/2020/02/how-to-build-applications-that-can-be-changed-quickly.html

    Here is a layman's, common-sense explanation of the same idea:

    https://blackliszt.com/2022/10/how-to-improve-software-productivity-and-quality-common-sense-approach.html

Here is an explanation of why Occamal programs are better than non-Occamal ones and why you should care.

    https://blackliszt.com/2023/09/why-should-you-build-occamal-programs.html

    How do you apply Occamality in practice? Here is a short, simple list of the practical things you do.

    https://blackliszt.com/2023/09/understanding-occam-optimality-practically.html

    Occamality isn't another entrant in the myriad of design principles competing for attention in the chaotic world of software — it's an overriding principle, one that stands above and ranks all the contenders.

    https://blackliszt.com/2023/09/occamality-and-other-design-principles.html

    Occamality isn't confined to writing software. It applies to all stages of the development lifecycle, from requirements through QA and support.

    https://blackliszt.com/2023/09/occam-optimality-applies-to-all-stages-of-the-software-life-cycle.html

    The value of reducing redundancy isn't confined to software; it's a general principle.

    https://blackliszt.com/2020/05/lessons-for-better-software-from-washing-machine-design.html

    Saying that you should reduce redundancy in a program sounds simple, but once you get past trivial examples, it's not. Here's an analysis of the increasingly sophisticated kinds of redundancy in programs that should be addressed.

    https://blackliszt.com/2023/10/moving-towards-occamal-software.html

    Reducing redundancy is accomplished by taking a declarative approach to programming instead of a purely imperative one. There are many examples of this.

    https://blackliszt.com/2021/07/software-programming-languages-the-declarative-core-of-functional-languages.html
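To make the contrast concrete, here is a tiny sketch of my own (not from the linked post) showing the same computation written imperatively and then declaratively:

```python
# Imperative: step-by-step instructions, with loop state to maintain.
def total_even_squares_imperative(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total

# Declarative: state WHAT the result is, not HOW to accumulate it.
def total_even_squares_declarative(numbers):
    return sum(n * n for n in numbers if n % 2 == 0)
```

The declarative version states what the result is; the bookkeeping code, with its loop state to get wrong, simply disappears.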

    Databases are an excellent, proven example of applying the principle of Occamality.

    https://blackliszt.com/2023/09/occamality-in-databases.html

    It's not just databases — Occamality is a thread that weaves through much of the history of software.

    https://blackliszt.com/2023/09/occamality-in-software-history.html

    You might think that reducing redundancy is an obviously valuable thing to do. The trouble is, modern software orthodoxy endorses the notion of collections of code that are separated by high walls (components, services, layers, objects, etc.), which typically leads to huge amounts of redundancy.

    https://blackliszt.com/2023/09/occamality-the-problem-with-layers-components-and-objects.html

    There is a simple idea that shows the basic approach to eliminating redundancy in programs: instead of stating how a thing should be accomplished, you concentrate on defining what is to be accomplished.

    https://blackliszt.com/2023/09/achieving-occamality-what-not-how.html

    The optimal way to reduce redundancy includes recognizing that in addition to instructions and data, programs include varying amounts of metadata. Metadata is an easy concept for those who use it, but many programmers don't get past the idea of parameters. Here's a way to understand metadata:

    https://blackliszt.com/2023/09/achieving-occamality-through-definitions.html

    In broader context, here's how metadata fits into a whole program, as the third dimension of software architecture.

    https://blackliszt.com/2020/02/the-three-dimensions-of-software-architecture-goodness.html

For all dimensions, lack of redundancy is the main virtue: the more functionality is expressed in metadata and the less in code, the better.

    A focus on metadata is similar to having a generic direction-generating program that refers to an easily-changed map.

    https://blackliszt.com/2020/06/the-map-for-building-optimal-software.html
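As a sketch of the metaphor (the map data and routine below are invented for illustration, not taken from the post), a generic route-finding routine can consult a "map" that is pure, easily-changed data:

```python
from collections import deque

# The "map": pure data. Change the roads here without touching the code.
ROADS = {
    "home": ["Main St", "Elm St"],
    "Main St": ["work"],
    "Elm St": ["store"],
    "work": [],
    "store": [],
}

def directions(start, end):
    """Generic breadth-first route finder: knows nothing about any particular map."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == end:
            return path
        for nxt in ROADS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

The direction-generating program is written once; all the application knowledge lives in the map.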

    Here’s more theoretical depth on the role of metadata in a software system, with a comparison to theories of the solar system.

    https://blackliszt.com/2022/10/how-to-improve-software-productivity-and-quality-code-and-metadata.html

    Why put as much application knowledge into metadata as possible? It's the easiest thing to change, and above all, it's the best place to eliminate redundancy, which is the enemy of fast, error-free change.

    https://blackliszt.com/2020/03/william-occam-inventor-method-for-building-optimal-software.html

    Here is a more extensive explanation of the history and context of Occam's Razor and its relevance to software.

    https://blackliszt.com/2023/09/occams-razor-the-key-to-optimal-software-development.html

    How do you achieve this ideal architecture for a body of code? Not all at once! You avoid the usual nightmare of useless, ever-changing requirements and do something that makes a customer happier than they were. Then fix it. Here’s the process, to which I’ve given a fancy name.

    https://blackliszt.com/2022/09/better-software-and-happier-customers-with-post-hoc-design.html

    Here’s another statement of the basic idea:

    https://blackliszt.com/2011/06/software-how-to-move-quickly-while-not-breaking-anything.html

    https://blackliszt.com/2020/03/how-to-pay-down-technical-debt.html

    Here is more detail and explanation of how to use increasing amounts of metadata to help build applications quickly, which of course should be a major goal of software architecture.

    https://blackliszt.com/2022/05/how-to-improve-software-productivity-and-quality-schema-enhancements.html

    Here's a short case study from early in my career that demonstrated to me the incredible value of taking an Occamal approach to building an end-user business application.

    https://blackliszt.com/2023/09/achieving-occamality-through-definitions-case-study.html

    Here is a more recent case study of a system based on extensive use of metadata and what happened when technology-fashion-driven executives took over the company.

    https://blackliszt.com/2023/09/case-study-replacing-metadata-with-fashionable-software.html

    One of the most basic aspects of software architecture is the data and where it is stored. The default choice for most architecture is to use a standard DBMS. Given the steady advance of Moore's Law, this is often no longer the best choice.

    https://blackliszt.com/2010/09/databases-and-applications.html

    Given the huge advantage of taking a metadata approach to software, why isn't it widely used? It's because all of software has been obsessed with procedural language as the core focus of programming. While necessary for the first decades of computing, it's now the core reason, never discussed, for the near-universal dysfunction in software development.

    https://blackliszt.com/2024/08/why-is-writing-computer-software-dysfunctional.html

     

    Bad Software Architectures

    Software is infected with architectural religions, none of them with a sound basis in logic or real-world experience. It’s not that you can’t build software that sort of eventually kinda works with them – but it’s like building a car with a steam engine.

    Sadly, some programming languages and programming concepts encourage redundancy.

    https://blackliszt.com/2014/03/how-to-evaluate-programming-languages.html

Starting a couple of decades ago, the idea of “distributed computing” as an architecture became the thing all the cool kids gravitated to.

    https://blackliszt.com/2015/04/the-distributed-computing-zombie-bubble.html

A modern incarnation (with a new name and rhetoric, of course) is micro-services, which is supposed to boost programmer productivity.

    https://blackliszt.com/2021/03/how-micro-services-boost-programmer-productivity.html

Micro-services supposedly not only boost programmer productivity but also provide a “scalable” architecture – in sharp contrast to the evil “monolithic” architecture … a word which is usually pronounced with a sneer.

    https://blackliszt.com/2020/06/why-is-a-monolithic-software-architecture-evil.html

    The trouble is, microservices make about as much sense as blood-letting did in medicine. It's widely accepted as useful, but entirely without evidence.

    https://blackliszt.com/2019/02/what-software-experts-think-about-blood-letting.html

    Programmers seem to like to layer their software, often without thinking about it.

    https://blackliszt.com/2012/06/layers-in-software-fuss-and-trouble-without-benefit.html

Similarly, when they link pieces together, a key decision is whether the coupling is loose or tight.

    https://blackliszt.com/2012/08/coupling-in-software-loose-or-tight.html

    Components and layers have been promoted for a long time.

    https://blackliszt.com/2021/03/micro-services-the-forgotten-history-of-failures.html

    https://blackliszt.com/2021/09/software-components-and-layers-problems-with-data.html

    https://blackliszt.com/2021/08/the-dangerous-drive-towards-the-goal-of-software-components.html

For the best results, it’s good to focus on the goals of software architecture described above, and ensure that everything you do contributes to those goals. Part of how you do this is to avoid the ever-present temptation of following software fashions.

    https://blackliszt.com/2023/07/summary-software-fashions.html

     

  • How to Improve Software Productivity and Quality: The Common Sense Approach

    I talked with a frustrated executive in a computer software company. I was about to visit their central development location for the first time, and he wanted to make sure I asked sufficiently penetrating questions so that I would find out what was “really” going on.

    He explained that while he had written software, it was only for a few years in the distant past, and things had changed a great deal since his day. His current job in product marketing didn’t really require him to get into any details of the development shop, and in fact he preferred to stay out of the details for several reasons: (1) he was completely out of date with current technology and methods; (2) he didn’t want his thinking constrained by what the programmers declared was possible; (3) it was none of his business.

    He had developed a keen interest in what was going on in the software group, however, because he realized that it had a dramatic effect on his ability to successfully market the product. His complaints were personal and based on his own experience, but they were fairly typical, which is why I’m recounting his tale of woe here.

    The Lament

    The layman’s lament was an interesting mish-mash of two basic themes:

    • I’m not getting the results I need. There are certain results that I really need for my business. My competitors seem to be able to get those results, and I can’t. Basically, I want more features in each release, more frequent releases, more control and visibility on new features, fewer bugs in new releases, and the ability to make simple-sounding changes quickly. Our larger competitors seem to be able to move more quickly than we do.
• I think the way the developers work is old-fashioned, and if it were brought up-to-date, I would get the results I need. What they do seems to be “waterfall,” with lots of documentation that doesn’t say a lot. There must be something better, along the lines of what we used to call RAD (rapid application development). They only have manual testing, nothing automated, and they tell me it will be years before they can build automated testing! And shouldn’t they be using object-oriented methods? Wouldn’t that provide more re-use, so that things can be built and changed more quickly? They have three tiers, but when I want to change something, the code always seems to be in the wrong tier and takes forever. They’re talking about re-writing everything from scratch using the latest technology, but I’m afraid it will take a long time and there won’t be anything that benefits me.

    Basically, he was saying that he wants more things, quicker, and better quality. He also advanced some theories for why he’s not getting those things and how they might be achieved, but of course he couldn’t push his theories too hard, because he lacked experience and in-depth knowledge of the newer methods. He even claimed, in classic “the grass is greener” style, that practically everyone accomplishes these things, and he was nearly alone in being deprived of them – not true!

The usual dynamics of a technology group explaining itself to “outsiders” were also at work here – if you just listen to the technology managers, things are pretty good. The methods are modern and the operation is efficient and productive. There are all sorts of improvements that could be made with additional money for people and tools, of course, but for a group that’s been under continual pressure to build new features, support cranky customers and meet accelerated deadlines with fewer resources, they’re doing amazingly well. The non-technology executives tend to feel that this is all a front, and that results really could be better with more modern methods and tools. The technology managers, for their part, feel like they’re flying passenger planes listening to a bunch of desk-bound ignoramuses complain about their inability to deliver the passengers safely and on-time while upgrading the engine and cockpit systems at the same time. These people have no idea what building automated testing (for example) really takes, they’re thinking. The non-technology people don’t really want to talk about automated testing, of course – they’re the ones taking the direct heat from customers who get hurt by bugs in the new release, and aren’t even getting proposals from the technology management for how this noxious problem can be eliminated. Well, if you can’t tell me how to solve the problem (and you should be able to), how about this (automated testing, object-oriented, micro-services, etc.)??

It goes on and on. The business executives put a cap on it, sigh, maybe throw a tantrum or two, but basically try to live with a situation they know could be better than it is. Inexperienced executives refuse to put up with this crap, and bring in new management, consultants, do outsourcing, etc. Their wrath is felt! Sadly, though, the result is typically a dramatic increase in costs, better-looking reporting, but basically the status quo in terms of results, with success being defined downwards to make everything look good. The inexperienced executive is now experienced, and reverts to plan A.

    The technology manager does his version of the same dance. The experienced manager tries to keep things low-key and leaves lots of room for coping with disasters and the unexpected. Inexperienced technology managers refuse to tolerate the tyranny of low expectations; they strive for real excellence, using modern tools and methods. Sadly, though, the result is typically a dramatic increase in costs, better-sounding reports, but basically the status quo in terms of tangible results. The new methods are great, but we’re still recovering from the learning curve; that was tense and risky, I’m lucky I survived, that’s the last time I’m trying something like that again!

    The Hope

The non-technology executive is sure there’s an answer here, and it isn’t just that he’s dumb. He keeps finding reasons to hope that higher productivity with high quality and rapid cycles can be achieved. In my experience, the most frequent (rational) basis for that hope is a loose understanding of the database concept of normalization, and the thought that it should enable widespread changes to be made quickly and easily. Suppose the executive looks at a set of functionally related screens and wants some button or style change to be applied to each screen. It makes sense that there should be one place to go to make that change, because surely all those functionally related screens are based on something in common, a template or pattern of some kind. What if the zip code needs to be expanded from five digits to nine? The executive can understand that you’d have to go to more than one place to make the change, because the zip code is displayed on screens, used in application code and stored in the database, but there should be fewer than a handful of places to change, not scores or hundreds!

But somehow, each project gets bogged down in a morass of detail. When frustration causes the executive to dive into “why can’t you…”, the eyes normally glaze over in the face of massive amounts of endless gobbledy-gook. What bugs some of the more inquisitive executives is how what should be one task ends up being lots and lots of tasks. With computers to do all the grunt work, there’s bound to be a way to make what sounds, feels and seems like one thing (adding a search function to all the screens) actually be one thing – surely there must be! And if everything you can think of is in just one place, surely you should be able to go to that one place and change it! Don’t they do something like that with databases?

    There is a realistic basis for hope

    I’ve spent more of my life on the programmers’ side of the table than the executives’, so I can go on, with passion and enthusiasm, about the ways that technology-ignorant executives reduce the productivity, effectiveness and quality of tech groups, not to mention the morale! The more technical detail they think they know, the worse it seems to be.

    That having been said, the executive’s lament is completely justified, and his hope for better days is actually reasonable (albeit not often realized).

What his hope needs in order to be realized is for there to be exactly one place in the code where every distinct entity is defined, and all information about it is stated. For example, there should be exactly one place where we define what we mean by “city.” This is like having domains and normalization in database design, only extended further.

    The definition of “city” needs to have everything we know about cities in that one place. It needs to include information that we need to store it (for example, its data type and length), to process it (for example, the code that verifies that a new instance of city is valid) and to display it (for example, its label). The information needs to incorporate both data (e.g. display label) and code (e.g. the input edit check) if needed to get the job done. This is like an extended database schema; a variety of high-level software design environments have something similar to this.

It must be possible to create composite entities in this way as well, for example, address. A single composite entity would typically include references to other entities (for example, city), relationships among those other entities and unique properties of the composite entity (for example, that it’s called an “address”). This composite-making ability should be able to be extended to any number of levels. If there are composites that are similar, the similarity should be captured, so that only what makes the entity unique is expressed in the entity itself. A common example of this is home address and business address.
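Here is a minimal sketch of what such single-place definitions might look like. The names and structure below are my own invention for illustration, not a prescribed format: each entity carries its storage, processing and display information together, like an extended schema, and a composite references the entities it is built from.

```python
# One place per entity: storage info (type, length), processing info
# (a validation routine) and display info (a label), all together.
FIELDS = {
    "city": {
        "max_length": 40, "label": "City",
        "validate": lambda v: bool(v.strip()),
    },
    "zip": {
        "max_length": 9, "label": "ZIP Code",
        "validate": lambda v: v.isdigit() and len(v) in (5, 9),
    },
}

# A composite entity references other entities plus its own unique properties.
COMPOSITES = {
    "address": {"label": "Address", "fields": ["city", "zip"]},
}

def validate(entity, values):
    """Check a composite's values against the single definitions."""
    spec = COMPOSITES[entity]
    errors = []
    for name in spec["fields"]:
        field = FIELDS[name]
        v = values.get(name, "")
        if len(v) > field["max_length"] or not field["validate"](v):
            errors.append(f"{field['label']} is invalid")
    return errors
```

Adding a ninth zip-code digit, or renaming a label, is now a change in exactly one place; every screen and routine that references the definition picks it up.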

Sometimes entities need to be related to each other in detailed ways. For example, when validating a city, you might have a list of cities and, for each, the state it’s in, and maybe even the county, which may have its own state-related lists.

    The same principle should apply to entities buried deep in the code. For example, a sort routine probably has no existence in terms of display or storage, but there should usually be just one sort routine. Again, if there are multiple entities that are similar, it is essential that the similarities be placed in one entity and the unique parts in another. Simple parameterization is an approach that does this.
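Simple parameterization can be sketched like this (the example is mine, invented for illustration): one routine holds everything the similar cases have in common, and parameters hold the unique parts.

```python
# One sort routine; the parts that vary (what to sort by, direction)
# are parameters rather than copies of the routine.
def sort_records(records, field, descending=False):
    return sorted(records, key=lambda r: r[field], reverse=descending)

people = [{"name": "Ann", "age": 34}, {"name": "Bo", "age": 28}]
by_age = sort_records(people, "age")               # one routine, many uses
by_name = sort_records(people, "name", descending=True)
```

Without this, each place that needs a slightly different sort tends to grow its own copy, and every bug fix must be made several times.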

Some of these entities will need to cross typical software structure boundaries in order to maintain our prime principle here of having everything in exactly one place. For example, data entities like city and state need to have display labels, but there needs to be one single place where the code to display an entity’s label is defined. Suppose you want a multi-lingual application? This means that the single place where labels are displayed needs to know that all labels are potentially multi-lingual, needs to know what the current language is, and needs to be able to display the current language’s label for the current entity. It also means that wherever we define a label, we need to be able to make entries for each defined language. This may sound a bit complicated at first reading, but it actually makes sense, and has the wonderful effect of making an application completely multi-lingual.
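A sketch of the multi-lingual label idea (the labels and languages below are invented for illustration): the per-language labels live with the entity definitions, and one routine is the single place that knows labels are multi-lingual.

```python
# Wherever an entity is defined, labels carry an entry per defined language.
LABELS = {
    "city":  {"en": "City",  "fr": "Ville"},
    "state": {"en": "State", "fr": "État"},
}

current_language = "fr"  # set once for the whole application

def label_for(entity, language=None):
    # The single place that knows labels are per-language.
    return LABELS[entity][language or current_language]
```

Adding a language means adding entries to the label tables; no display code anywhere in the application changes.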

    In order to keep to the principle of each entity defined once, we need the ability to make relationships between entities. The general concept of inheritance, more general than found in object-oriented languages, is what we need here. It’s like customizing a standard-model car, where you want to leave some things off, add some things and change some things.

    There’s lots more detail we could go into, but for present purposes I just want to illustrate the principle of “each entity defined in one place,” and to illustrate that “entity” means anything that goes into a program at any level. By defining an entity in one place, we can group things, reference things, and abstract their commonality wherever it is found, not just in a simple hierarchy, and not limited to functions or data definitions or anything else.

    While this is a layman’s description, it should be possible to see that IF programs could be constructed in this way, the layman’s hope would be fulfilled. What the layman wants is pretty simple, and actually would be if programs were written in the way he assumes. The layman assumes that there’s one way to get to the database. He assumes that if you have a search function on a screen, it’s no big deal to put a search function on every screen. He assumes that if he wants a new function that has a great deal in common with an existing function, the effort to create the new function is little more than the effort to define the differences. He assumes that look and feel is defined centrally, and is surprised when the eleventh of anything feels, looks or acts differently than the prior ten.

    Because he has these assumptions in his mind, he’s surprised when a change in one place breaks something that he doesn’t think has been changed (the infamous side-effect), because he assumes you haven’t been anywhere near that other place. He really doesn’t understand regression testing, in which you test all the stuff that you didn’t think you touched, to make sure it still works. Are these programmers such careless fools that, like children in a trinket shop, they break things while walking down the aisle to somewhere else, and you have to do a complete inventory of the store when the children leave?

    Programs are definitely not generally written in conformance with the layman’s assumptions; that’s why there’s such a horrible disconnect between the layman and the techies. The techies have a way of building code, generally a way that they’ve received from those who came before them, that can be made to work, albeit with considerable effort. They may try to normalize their database schemas and apply object principles to their code, but in the vast majority of cases, the layman’s assumption of a single, central definition of every “thing,” and the ability to change that thing and have the side-effects ripple silently and effectively through the application, does not exist, is not articulated, not thought about, and is in no way a goal of the software organization. It’s not even something they’ve heard talked about in some book they keep meaning to get to. It’s just not there.

    I assert that it is possible to write programs in a way that realizes the layman’s hope.

I’ve done it myself and I’ve seen others do it. The results are amazing. It’s harder to do than you would ideally like because of a lack of infrastructure already available to support this style of writing, but in spite of this, it’s not hard to write. Moreover, once the initial investment in structure has been made, the ability to make changes quickly and with high quality soon pays back the investment.

The main obstacle for everyone is that there is tremendous inertia, and the techniques that provide a basis for the hope, while reasonable and achievable, are far out of the mainstream of software thinking. I have seen people who have good resumes but are stupid or lazy look at projects that have been constructed according to the “one entity – one definition” principle and simply declare them dead on arrival, complete re-write required. But I have also encountered projects in domain areas where there is no tradition at all of building things in this way, in which the people have invented the principles completely on their own.

    The “principle of non-redundancy” has far-reaching technical consequences and ends up being pretty sophisticated, but at its heart is simple: things are hard to do when you have to go many places or touch many things to get them done. When the redundancy in program representation (ignoring for the moment differences between code, program data and meta-data) is eliminated, making changes or additions to programs is optimally agile. In other words, with program representation of this type, it is as easy and quick as it can possibly be to make changes to the program. In general, this will be far quicker than most programs in their current highly redundant form.

    The layman’s hope that improvements can be made in software productivity, quality and cycle is realistic, and based on creating a technical reality behind the often-discussed concepts of “components” and “building blocks” that is quite different from the usual embodiment.

    I have no idea why this approach to building software, which is little but common sense, isn't taught in schools and widely practiced. For those who know and practice it, the approach of "Occamality" (define everything in exactly one place) gives HUGE competitive advantages.

  • Better Software and Happier Customers with Post-hoc Design

    What can you possibly mean by "post-hoc design?" Yes, I know it means "after-the-fact design," using normal English. It's nonsense! First you design something. Then you build it. Period.

    Got your attention, have I? I agree that "post-hoc design" sounds like nonsense. I never heard of it or considered it for decades. But then I did. Before long I saw that great programmers used it to create effective high-quality, loved-by-customers software very quickly.

    The usual way to build software: design then build

    The way to build good software is obviously to think about it first. Who does anything important without having a plan? Start by getting requirements from the best possible source, as detailed as possible. Then consider scale and volume. Then start with architecture and drill down to design.

    When experienced people do architecture and design, they know that requirements often "evolve." So it's important to generalize the design anticipating the changes and likely future requirements. Then you make plans and can start building. Test and verify as you drive towards alpha then beta testing. You know the drill. Anything but this general approach is pure amateur-hour.

    I did this over and over. Things kept screwing up. The main issue was requirements "evolution," which is something I knew would happen! Some of the changes seemed like they were from left field, and meant that part of my generalized architecture not only failed to anticipate them, but actually made it harder to meet them! Things that I anticipated might happen which I wove into the design never happened. Not only had I wasted the time designing and building, the weren't-needed parts of the design often made it hard for me to build the new things that came along that I had failed to anticipate.

    I assumed that the problem was that I didn't spend enough time doing the architecture and design thinking, and I hadn't been smart enough about it. Next time I would work harder and smarter and things would go more smoothly. Never happened. How about requirements? Same thing. The people defining the requirements did the best they could, but were also surprised when things came along, and embarrassed when things they were sure would be important weren't.

After a long time — decades! — I finally figured out that the problem was in principle unsolvable. You can't plan for the future in software. Because you can't perfectly predict the future! What you are sure will happen doesn't, and what you never thought about happens. Time spent on anything but doing and learning as you go along is wasted time.

    The winning way to build software: Build then Design

    Build first. Then and only then do the design for the software you've already built. Sounds totally stupid. That's part of why I throw in some Latin to make it sound exotic: "Post-hoc design," i.e., after-the-fact design.

    When you design before you build, you can't possibly know what you're doing. You spend a bunch of time doing things that turn out to be wrong, and making the build harder and longer than it needs to be. When you build in small increments with customer/user input and feedback at each step, keeping the code as simple as possible, you keep everything short and direct. You might even build a whole solution for a customer this way — purposely NOT thinking about what other customers might need, but driving with lots of hard-coding to exactly what THIS customer needs. Result: the customer watches their solution grow, each step (hopefully) doing something useful, guides it as needed, and gets exactly what they need in the shortest possible time. What's bad about a happy customer?

    Of course, if you've got the typical crew of Design-first-then-build programmers, they're going to complain about the demeaning, unprofessional approach they're being forced to take. They might cram in O-O classes and inheritance as a sop to their pride; if they do, they should be caught and chastised! They will grumble about the enormous mountain of "technical debt" being created. Shut up and code! Exactly and only what's needed to make this customer happy!

    When the code is shown to another customer, they might love some things, not need some other things and point out some crucial things they need aren't there. Response: the nearly-mutinous programmers grab a copy of the code and start hacking at it, neutering what isn't needed, changing here and adding there. They are NOT permitted to "enhance" the original code, but hack a copy of it to meet the new customer's need. At this point, some of the programmers might discover that they like the feeling of making a customer happy more quickly than ever before.

    After doing this a couple of times (exactly when is a matter of judgment), it will be time to do the "design" on the software that's already been built. Cynics might call this "paying off tech debt," except it's not. You change the code so that it exactly and only meets the requirements of the design you would have made to build these, and only these, bodies of code. You take the several separate bodies of code (remember, you did evil copy-and-modify) and create from them a single body of code that can do what any of the versions can do.

    When you do this, it's essential that you NOT anticipate future variations — which will lead to the usual problems of design-first. The pattern for accomplishing this is the elimination of redundancy, i.e., Occamality. When you see copy/modify versions of code, you replace them with a single body of code with the variations handled in the simplest way possible — for example, putting the variations into a metadata table.
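
    The metadata-table move can be sketched in a few lines of Python. This is an illustrative sketch, not code from any real project; the customers, the discount/shipping rules and the function names are all invented:

```python
# Before post-hoc design: two copy/modify variants, one per customer,
# each hard-coded as fast as possible to make that customer happy.
def invoice_total_acme(items):
    # Acme gets a flat 10% volume discount.
    return sum(price * qty for price, qty in items) * 0.90

def invoice_total_bigco(items):
    # BigCo pays full price plus a fixed shipping fee.
    return sum(price * qty for price, qty in items) + 25.0

# After a post-hoc design pass: one body of code, with the per-customer
# variation moved into a metadata table instead of duplicated logic.
CUSTOMER_RULES = {
    "acme":  {"discount": 0.10, "shipping": 0.0},
    "bigco": {"discount": 0.00, "shipping": 25.0},
}

def invoice_total(customer, items):
    rules = CUSTOMER_RULES[customer]
    subtotal = sum(price * qty for price, qty in items)
    return subtotal * (1 - rules["discount"]) + rules["shipping"]
```

    Satisfying the next customer is now often a one-line addition to the table rather than yet another copy of the code.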

    This isn't something that's done just once. You throw in a post-hoc design cycle whenever it makes sense, usually when you have an unwieldy number of similar copies.

    As time goes on, an ever-growing fraction of a new user's needs can be met by simple parameter and table settings of the main code line, and an ever-shrinking fraction met by new code.

    Post-Hoc Design

    Ignoring the pretentious name, post-hoc design is the simplest and most efficient way to build software that makes customers happy while minimizing the overall programming effort. The difference is a great reduction in wasted time designing and building, and in the time to customer satisfaction. Instead of a long requirements gathering and up-front design trying valiantly to get it right for once, resulting in lots of useless code that makes it harder to build what it turns out is actually needed, you hard-code direct to working solutions, and then periodically perform code unification whose purpose is to further shorten the time to satisfaction of new customers. To the extent that a "design" is a structure for code that enables a single body of code to be easily configured to meet diverse needs, doing the design post-hoc assures zero waste and error.

    What is the purpose of architecture and design anyway? It is to create a single body of code (with associated parameters and control tables) that meets the needs of many customers with zero changes to the code itself. The usual method is outside-in: gaze into the future. Post-hoc design is inside-out: study what you built to make a few customers happy, and reduce the number of redundant source code copies to zero while reducing the lines of code to a minimum. The goal of post-hoc design is to minimize the time and effort to satisfy the next customer, and that's achieved by making the code Occamal, i.e., eliminating redundancies of all kinds. After all, what makes code hard to change? Finding all the places where something is defined. If everything is defined in exactly one place, once you've found it, change is easy.

    Post-hoc design is a process that should continue through the whole life of a body of code. It prioritizes satisfaction of the customer in front of your face. It breaks the usual model of do one thing to build code and another to modify it. In the early days of what would normally be called a code "build," the code works, but only does a subset of what it is likely to end up doing. When customers see subsets of this kind, it's amazing how it impacts their view of their requirements! "I love that. I could start using it today if only this and that were added!" It's called "grow the baby," an amazing way to achieve both speed and quality.

    New name for an old idea

    All I'm doing with "Post-hoc design" is putting a name and some system around a practice that, while scorned by academia and banned by professional managers, has a long history of producing best-in-class results. I'm far from the first person who has noticed the key elements of post-hoc design.

    Linus Torvalds (key author of Linux, the world's leading operating system) is clearly down on the whole idea of up-front design:

    Don’t ever make the mistake [of thinking] that you can design something better than what you get from ruthless massively parallel trial-and-error with a feedback cycle. That’s giving your intelligence much too much credit.

    Gall's Law is a clear statement of the incremental approach:

    A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.

    The great computer scientist Donald Knuth, author of the multi-volume Art of Computer Programming, was a master of shifting between assembler language programming and abstract algorithms and back, the key activities of the speed-to-solution and post-hoc abstraction phases of the method I've described here.

    People who discover the power and beauty of high-level, abstract ideas often make the mistake of believing that concrete ideas at lower levels are worthless and might as well be forgotten. On the contrary, the best computer scientists are thoroughly grounded in basic concepts of how computers actually work. The essence of computer science is an ability to understand many levels of abstraction simultaneously.

    Thanks to Daniel Lemire for alerting me to these quotes.

    Conclusion

    Post-hoc design is based on the idea that software is only "built" once, and after that is always being changed. So why not apply the optimal process for changing software from day one? And then alternate between driving as fast as possible to the next milestone and periodic clean-ups that make fast driving to the next goal optimal? Post-hoc design is a cornerstone of the process of creating happy customers and optimal code. It also happens to conform to the goals of software architecture. Post-hoc design is like first fighting a battle, and then, once the battle is over and you've won, cleaning and repairing everything, incorporating what you learned from the battle just past so that everything is ready for the next battle. Post-hoc design is the way to win.


  • The Goals of Software Architecture

    What goals should software architecture strive to meet? You would think that this subject would have been intensely debated in industry and academia and the issue resolved decades ago. Sadly, such is not the case. Not only can't we build good software that works in a timely and cost-effective way, we don't even have agreement or even discussion about the goals for software architecture!

    Given the on-going nightmare of software building and the crisis in software that is still going strong after more than 50 years, you would think that solving the issue would be top-of-mind. As far as I can tell, not only is it not top-of-mind, it’s not even bottom-of-mind. Arguably, it’s out-of-mind.

    What is Software Architecture?

    A software architecture comprises the tools, languages, libraries, frameworks and overall design approach to building a body of software. While the mainstream approach is that the best architecture depends on the functional requirements of the software, wouldn’t it be nice if there were a set of architectural goals that were largely independent of the requirements for the software? Certainly such an independence would be desirable, because it would shorten and de-risk the path to success. Read on and judge for yourself whether there is a set of goals that the vast majority of software efforts could reasonably share.

    The Goals

    Here’s a crack at common-sense goals that all software architectures should strive to achieve and/or enable. The earlier items on the list should be very familiar. The later items may not be goals of every software effort; the greater in scope the software effort, the more their importance is likely to increase.

    • Fast to build
      • This is nearly universal. Given a choice, who wants to spend more time and money getting a software job done?
    • View and test as you build
      • Do you want to be surprised at the end by functionality that isn't right or deep flaws that would have been easy to fix during the process?
    • Easy to change course while building
      • No set of initial requirements is perfect. Things change, and you learn as you see early results. There should be near-zero cost of making changes as you go.
    • Minimal effort for fully automated regression testing
      • What you've built should work. When you add and change, you shouldn't break what you've already built. There should be near-zero cost for comprehensive, on-going regression testing.
    • Seconds to deploy and re-deploy
      • Whether your software is in progress or "done," deploying a new version should be near-immediate.
    • Gradual, controlled roll-out
      • When you "release" your software, who exactly sees the new version? It is usually important to control who sees new versions when.
    • Minimal translation required from requirements to implementation
      • The shortest path with the least translation from what is wanted to the details of building it yields speed and accuracy, and minimizes mis-translations.
    • Likelihood of slowness, crashes or downtime near zero
      • 'Nuff said.
    • Easily deployed to all functions in an organization
      • Everything that is common among functions and departments is shared
      • Only the differences between functions and departments need to be built
    • Minimal effort to support varying interfaces and roles
      • Incorporate different languages, interfaces, modes of interaction and user roles into every aspect of the system’s operation in a central way
    • Easily increase sophisticated work handling
      • Seamless incorporation of history, evolving personalization, segmentation and contextualization in all functions and each stage of every workflow
    • Easily incorporate sophisticated analytics
      • Seamless ability to integrate on and off-line Analytics, ML, and AI into workflows
    • Changes the same as building
      • Since software spends most of its life being changed, all of the above for changes

    Let’s have a show of hands. Anyone who thinks these are bad or irrelevant goals for software, please raise your hand. Anyone?

    I'm well aware that the later goals may not be among the early deliverables of a given project. However, it's important to acknowledge such goals and their rising importance over time so that the methods to achieve earlier goals don't increase the difficulty of meeting the later ones.

    Typical Responses to the Goals

    I have asked scores of top software people and managers about one or more of these goals. I detail the range of typical responses to a couple of them in my book on Software Quality.

    After the blank stare, the response I've most often gotten is a strong statement about the software architecture and/or project management methods they support. These include:

    • We strictly adhere to Object-oriented principles and use language X that minimizes programmer errors
    • We practice TDD (test-driven development)
    • We practice X, Y or Z variant of Agile with squads for speed
    • We have a micro-services architecture with enterprise queuing and strictly enforced contracts between services
    • Our quality team is building a comprehensive set of regression tests and a rich sandbox environment.
    • We practice continuous release and deployment. We practice dev ops.
    • We have a data science team that is testing advanced methods for our application

    I never get any discussion of the goals or their inter-relationships. Just a leap to the answer. I also rarely get "this is what I used to think/do, but experience has led me to that." I don't hear concerns or limitations of the strongly asserted approaches. After all, the people I ask are experts!

    What's wrong with these responses?

    In each case, the expert asserts that his/her selection of architectural element is the best way to meet the relevant goals. Yet for the typical answers listed above, the results rarely stand out from the crowd.

    The key thing that's wrong is the complete lack of principles and demonstration that the approaches actually come closer to meeting the goals than anything else.

    The Appropriate Response to the Goals

    First and foremost, how about concentrating on the goals themselves! Are they the right goals? Do any of them work against the others?

    That's a major first step. No one is likely to get excited, though. Most people think goals like the ones listed above don't merit discussion. They're just common sense, after all.

    Things start to get contentious when you ask for ways to measure progress towards each goal. If you're going to the North Pole or climbing Mt. Everest, shouldn't you know where it is, how far away you are, and whether your efforts are bringing you closer?

    Are the goals equally important? Is their relative importance constant, or does the importance change?

    Wouldn't it be wonderful if someone, somewhere took on the job of evaluating existing practices and … wait for it … measured the extent they achieved the goals. Yes, you might not know what "perfect" is, but surely relative achievement can be measured.

    For example, people are endlessly inventing new software languages and making strong claims about their virtues. Suppose similar claims were made about new bats in baseball. Do you think it might be possible that the batter's skill makes more of a difference than the bat? Wouldn't it be important to know? Apparently, this is one of the many highly important — indeed, essential — questions in software that never gets asked, let alone answered.

    Along the same lines, wouldn't it be wonderful if someone took on the job of examining outliers? Projects that worked out not just in the typical dismal way, but failed spectacularly? On the other end of the spectrum, wouldn't amazingly fast jobs be interesting? This would be done for start-from-scratch projects, but is equally important for changes to existing software.

    A whole slew of PhDs should be given out for pioneering work on identifying and refining the exact methods that make progress towards the goals. It's likely that minor changes to the methods used to meet the earlier goals well would make a huge difference in meeting later goals such as seamlessly incorporating the results of analytics.

    Strong Candidates for Optimal Architecture

    After decades of programming and then more of examining software in the field, I have a list of candidates for optimal architecture. My list isn't secret — it's in books and all over this blog. Here are a couple of places to start:

    Speed-optimized software

    Occamality

    Champion Challenger QA

    Microservices

    The Dimensions

    Abstraction progression

    The Secrets

    The books

    Conclusion

    I've seen software fashions change over the years, with things getting hot, fading away, and sometimes coming back with a new name. When a fashion gets hot, all tech leaders who want to be seen as modern embrace it. No real analysis. No examination of the principles involved. Just claims. At the same time, universities hand out degrees in Computer Science taught by professors who are largely unscientific. In some ways they'd be better off in Art History — except they rarely have taste and don't like studying history either.

    I look forward to the day when someone writes what I hope will be an amusing history of the evolution of Computer Pseudo-Science.

  • Software Components and Layers: Problems with Data

    Components and layers are supposed to make software programs easier to create and maintain. They are supposed to make programmers more productive and reduce errors. It’s all lies and propaganda! In fact, the vast majority of software architectures based on components (ranging from tiny objects to large components) and/or layers (UI, server, storage and more) lead to massive duplication and added value-destroying work. The fact that these insane methods continue to be taught in nearly all academic environments and required by nearly all mainstream software development groups is cornerstone evidence that Computer Science not only isn’t a science, it resembles a deranged religious cult.

    Components and Layers

    It has been natural for a long time to describe the different “chunks” of software that work together to deliver results for users in dimensional terms.

    First there’s the vertical dimension, the layers. Software that is “closer” to end users (real people) is thought of as being on “top,” while software closer to long-term data storage and farther from end users is thought of as being on the “bottom” of a software system. In software that has a user interface, the different depths are thought of as “layers,” with the user interface the top layer, the server code the middle layer and the storage code the bottom layer. Sometimes a system like this is called a “three-tier” system, evolved from a two-tier “client/server” system.

    Second there’s the horizontal dimension, the components. These are bodies of code that are organized by some principle to break the code up into pieces for various reasons. I've given a detailed description of components. A component may have multiple layers in it or it may operate as a single layer.

    Layers are often written in different languages and using different tools. JavaScript is prominent in user interface layers, while some stored-procedure language is often used in the lowest data access layer. The server-side layer may be written in any of dozens of languages, including Java and Python. Sometimes there are multiple layers in server-resident software, with a scripting language like PHP or JavaScript used for server code on the “top,” communicating with the client software.

    Components are self-contained bodies of software that communicate with other components by some kind of formal means. Formal components always have data that can only be accessed by the code in the component. The smallest kind of component is an object, as in object-oriented languages. The data members of the object can only be accessed by routines attached to the object, known as methods.  Methods can normally be called by methods of other objects. Larger components are often called microservices or services. These are usually designed so that they could run on separate machines, with calling methods that can span machines, like old RPC’s or more modern RESTful API’s. Sometimes components are designed so that instead of being directly called, they take units of work from a queue or service bus, and send results out in the same way. When a component makes a call, it calls a specific component. When a component puts work onto a queue or bus, it has no knowledge or control over which component takes the work from the queue or bus.
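
    A minimal Python sketch of the queue/bus style of component coupling described above. The component names and message shape are invented for illustration; the point is that the producing component has no knowledge of which component will take the work:

```python
import queue

# A shared work bus. Components put units of work on it and take units
# of work off it; neither side directly calls the other.
work_bus = queue.Queue()

def order_component_submit(order):
    # The producing component just posts work; it does not know (or
    # control) which component will pick it up.
    work_bus.put({"type": "order", "payload": order})

def billing_component_poll():
    # Any consumer wired to the bus can take the next unit of work.
    item = work_bus.get()
    return f"billed {item['payload']}"
```

    Contrast this with a direct call or a RESTful API, where the caller names a specific component; with a bus, the wiring is anonymous, which is exactly what makes the flow of work harder to follow.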

    Layers and components are often combined. For example, microservice components often have multiple layers, frequently including a storage layer. With a thorough object-oriented approach, there could be multiple objects inside each layer that makes up a service component.

    Why are there layers and components? Software certainly doesn’t have to use these mechanisms to be effective. In fact, major systems and tools have been built and widely deployed that mostly ignore the concepts of layers and components! See this for historic examples.

    The core arguments for layers and components typically revolve around limiting the amount of code (components) and the variety of technologies (layers) a programmer has to know about to make the job easier, along with minimizing errors and maximizing teamwork and productivity. I have analyzed these claims here and here.

    Data Definitions in Components and Layers

    All software consists of both procedures (instructions, actions) and data. Each of those needs to be defined in a program. When data is defined for a procedure to act upon, the data definitions are nearly always specific to and contained within the component and/or layer they’re part of. For the general concept and variations on how instructions and data relate to each other, see this.

    Thinking about data that’s used in a component or layer, some of the data will be truly only for use by that component or layer. But nearly any application you can imagine has data that is used by many components and layers. This necessity is painful to object/component purists who go to great lengths to avoid it. But when a piece of data like “application date” is needed, it will nearly always have to be used in multiple layers: user interface, server and database. To be used it must be defined. So it will be defined in each layer, typically a minimum of three times!

    When data is defined in software, it always has a name used by procedures to read or change it. It nearly always also has a data type, like character or integer. The way most languages define data that’s it! But there’s more.

    • When the “application date” data is shown to the user in the UI layer, it typically also needs a label, some formatting information (month, day, year), error checking (was 32 entered into the day part?) and an error message to be displayed.
    • When application date is used in the server layer by some component, some kind of error checking is often needed to protect against disaster if a mistaken value is sent by another component.
    • When a field like social security number is used in the storage layer, sanity checks are often applied to make sure, for example, that the SSN entered matches the one previously stored for the particular customer already stored in the database.
    • There need to be error codes produced if data is presented that is wrong. When the user makes an error, you can’t use a code, you have to use a readable message, which the user layer might need to look up based on the error code it gets from another component or layer.
    • Each language has its own standards for defining data and the attributes associated with it. Someone has to make sure that all the definitions in varying languages match up, and that when changes are made they are made correctly everywhere.
    • If the data is sent from one component to another, more definitions have to be made: the procedure that gets the data, puts it into a message and sends it, possibly on a queue; the procedure that gets the message, maybe from a queue and sends the data to the routine that will actually do the processing.
    • When data is sent between components, various forms of error checking and error return processing must also be implemented for the data to protect against the problems caused by bad data being passed between components that, for example, might have been implemented and maintained by separate groups. Sometimes this is formalized into "contracts" between data-interchanging components/layers.
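
    To make the multiplication concrete, here is an illustrative sketch, written entirely in Python for brevity (in a real system the three definitions would typically be in three different languages). Every field name, format and error string below is invented; the point is that one piece of data, "application date," is defined three times, and all three definitions must be kept in agreement by hand:

```python
from datetime import datetime

# UI layer: name, label, format and user-readable error message.
UI_FIELD = {
    "name": "application_date",
    "label": "Application Date",
    "format": "%m/%d/%Y",
    "error": "Please enter a valid date (MM/DD/YYYY).",
}

# Server layer: the same field defined again, with its own checking,
# returning an error code rather than a user-readable message.
def server_check_application_date(value):
    try:
        datetime.strptime(value, "%m/%d/%Y")  # must match the UI format
        return None
    except ValueError:
        return "ERR_BAD_APPLICATION_DATE"

# Storage layer: defined a third time, as a schema fragment.
SCHEMA_COLUMN = "application_date TEXT NOT NULL"
```

    A change to the date format now has to be found and made in three places, in three different notations: exactly the kind of redundancy Occamality forbids.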

    So what did we gain by breaking everything up into components and layers? A profusion of redundant data definitions containing different information, expressed in different ways! The profusion is multiplied by the need for components and layers to pass data among themselves for processing. The profusion can’t be avoided, and can only be reduced by introducing further complexity and overhead into the component and layer definitions.

    See this for another take on this subject.

    I’ve heard the argument that unifying data definitions makes things harder for the specialists that often dominate software organizations. The database specialists are guardians of the database, and make sure everything about it is handled in the right way. The user interface specialists keep the database specialists away from their protected domain, because if they meddled the users wouldn’t enjoy the high quality interfaces they’ve come to expect. There is no doubt you want people to know their stuff. But none of this is really that hard – channeling programmers into narrow specialties is one of the many things that leads to dysfunction. Programmers produce the best results by thoroughly understanding their partners and consumers, which can only be done by spending time working in different roles – for example spending time in sales, customer service and different sub-departments of the programming group.

    Data Access in Components and Layers

    Now we’ve got data defined in many components and layers. In a truly simple system, data would be defined exactly once, be read into memory and be accessed in the single location in which it resides by whatever code needs it for any reason. If the code needs to interact with the user, perform calculations or store it, the code would simply reference the piece of data in its globally accessible named memory location and have at it. Fast and simple.

    If this concept sounds familiar, you may have heard of it in the world of relational DBMS’s. It’s the bedrock concept of having a “normalized” schema definition, in which each unique piece of data is stored in exactly one place. A database that isn’t normalized is asking for trouble, just like the way that customer name and address are frequently stored in different places in various pieces of enterprise software that evolved over time or were jammed together by corporate mergers.
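
    The normalization point can be shown with Python's built-in sqlite3 module (the schema and data here are invented for illustration). The customer's name is stored in exactly one row; orders reference it by id instead of copying it:

```python
import sqlite3

# A normalized schema: the customer's name lives in exactly one place.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         amount REAL);
    INSERT INTO customers VALUES (1, 'Acme Ltd');
    INSERT INTO orders VALUES (10, 1, 99.0), (11, 1, 42.0);
""")

# Renaming the customer is a single-row change; every order sees it.
db.execute("UPDATE customers SET name = 'Acme Inc' WHERE id = 1")
rows = db.execute("""
    SELECT customers.name, orders.amount
    FROM orders JOIN customers ON orders.customer_id = customers.id
    ORDER BY orders.id
""").fetchall()
```

    If the name were copied into every order row, the rename would require finding and updating every copy — the storage-level version of the redundancy problem.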

    Components and layers performing operations on global data is neither fast nor simple. Suppose for example that you’ve got a new financial transaction that you want to process against a customer’s account. In an object system, the customer account and financial transaction would be different objects. That’s OK, except that the customer account master probably has a field that is the sum of all the transactions that have taken place in the recent time period.

    In a sensible system, adding the transaction amount to a customer transaction total in the customer record would probably be a single statement that referenced each piece of data (the transaction amount and the transactions total) directly by name. Simple, fast and understandable.

    In a component/object system that’s properly separated, transaction processing might be handled in one component and account master maintenance in another. In that case, highly expensive and non-obvious remote component references would have to be made, or a copy of the transaction placed on an external queue. In an object system, a method of the transaction object would have to be called and then a method of the account master object.
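
    The contrast can be sketched as follows. This is an illustrative Python sketch with invented names, not a claim about any particular object system:

```python
# Direct approach: both pieces of data are plainly addressable by name,
# so posting a transaction is a single, obvious statement.
account = {"customer": "C-1", "transactions_total": 100.0}
txn_amount = 25.0
account["transactions_total"] += txn_amount

# Componentized approach: the same update must thread through the
# method boundaries of two separate objects.
class Transaction:
    def __init__(self, amount):
        self._amount = amount        # private to this object
    def amount(self):
        return self._amount

class AccountMaster:
    def __init__(self, total):
        self._total = total          # private to this object
    def post(self, txn):
        self._total += txn.amount()  # one object's method calling another's
    def total(self):
        return self._total
```

    Both versions end with the same total, but the second spreads one line of intent across two classes and three method calls; with genuinely remote components, each of those calls could be a network round trip or a message on a queue.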

    It’s a good thing we’ve got smart, educated Computer Scientists arranging things so that programmers do things the right way and don’t make mistakes, isn’t it?

    Conclusion

    With the minor exception of temporary local variables, nearly every layer or component you break a body of software into leads to a multiplication of redundant, partly overlapping data definitions expressed in different languages — each of which has to be 100% in agreement to avoid error. The communication by call or message between layers and components to send and receive data increases the multiplication of data definitions and references. Adding the discipline of error checking is a further multiplication.

    Not only does each layer and component multiply the work and the chances of error, each simple-seeming change to a data definition results in a nightmare of redundancy. You have to find each and every place the data is defined, used or error-checked and make the right change in the right language. Components and layers are the enemy of Occamality. Software spends the VAST majority of its life being changed! Increasingly scattered data definitions make the thing that is done 99% of the time to software vastly harder and riskier to do.

  • The Dangerous Drive Towards the Goal of Software Components

    Starting fairly early in the years of building software programs, some programs grew large. People found the size to be unwieldy. They looked for ways to organize the software and break it up into pieces to make it more manageable. This effort applied to the lines of code in the program and to the definitions of the data referred to by the code, and to how they were related. How do you handle blocks of data that many routines work with? How about ones whose use is very limited? The background of this issue is explained here.

    These questions led to a variety of efforts that continue to this day to break a big body of software code and data definitions into pieces. The obvious approach, which continues to be used today, was simply to use files in directories. But for many people, this simple, practical approach wasn't enough. They wanted to make the walls between subsets of the code and data higher, broader and more rigid, creating what were generally called components. The various efforts to create components vary greatly. They usually create a terminology of their own, terms like “services” and “objects.” Such specialized, formulaic approaches to components are claimed to make software easier to organize, write, modify and debug. In this post I’ll take a stab at explaining the general concept of software components.

    Components and kitchens

    When you’re young and in your first apartment, you’ve got a kitchen that starts out empty. You’re not flush with cash, so you buy the minimum amount of cooking tools (pots, knives, etc.) and food that you need to get by. You may learn to make dishes that need additional tools, and you may move into a house with a larger kitchen that starts getting filled with the tools and ingredients you need to make your growing repertoire of dishes. You may have been haphazard about your things at the start, but as your collection grows you probably start to organize things. You may put the flour, sugar and rice on the same shelf, and probably put all the pots together. You may put ingredients that are frequently used together in the same place. You probably store all the spices and herbs together. It makes sense – if everything were scattered, you’d have a tough time remembering where each item was. The same logic applies to a home work bench and to clothes.

    The same logic applies for the same reason to software! It’s a natural tendency to want to organize things in a way that makes sense – for remembering where they are, and for convenience of access. This natural urge was recognized in the early days of software programming. Given that there aren’t shelves or drawers in that invisible world, most people settled on the term “component” as the word for a related body of software. The idea was always that instead of one giant disorganized block of code, the thing would be divided into a set of components, each of which had software and data that was related.

    A software program consists of a set of procedures (the actions) and a set of data definitions (what the procedures act on). Breaking up a large amount of code into related blocks was helped by the early emergence of the subroutine – a block of code that is called on to do something and then returns with a result; kind of like a mixer in a kitchen. The problems that emerged were how to break a large number of subroutines into components (each of which had multiple subroutines), and how to relate the various data definitions to the routine components. This problem resembles the one in the kitchen of organizing tools and ingredients. Do you keep all the tools separate from the ingredients, or do you store ingredients with the tools that are used to process them?

    In the world of cooking, this has a simple answer. If all you do is make pancakes, you might store the flour and milk together with the mixing bowls and frying pans you use with them. But no one ever just makes pancakes – duh! And even if you did, you’d better put the milk and butter in the fridge! Ditto with spices. Nearly everyone has a spice drawer or shelf, and spices used for everything from baking to making curries are stored there.  Similarly, you store all the pans together, all the knives, etc.

    In software it’s a little tougher, but not a lot. One perennially tough question: do you group routines (and data) together by subject/business area or by technology area? The same choice applies to organizing programmers in a department. It makes sense for everyone who deals primarily with user interfaces to work together so they can create uniform results with minimal code. Same thing with back-end or database processing. But over time, the programmers become less responsive to the business and end users; the problem is often solved by having everyone who contributes to a business or kind of user work as a team. Responsiveness skyrockets! Before long, things begin to deteriorate on the tech side, with redundant, inconsistent data put into the database, UI elements that vary between subject areas, confusing users, etc. There’s pressure to go back to tech-centric organization. And so it goes, round and round. Answer: there is no perfect way to organize a group of programmers! Just as painfully, there is no perfect way to organize a group of procedures and their relationship to the data they work on!!

    The drive towards components

    There are two major dimensions of making software into components. One dimension is putting routines into groups that are separate from each other. The second dimension is controlling which routines can access which data definitions.

    Separating routines into groups can be done in a light-weight way based on convenience, similar to having different workspaces in a kitchen. In most applications there are routines that are pretty isolated from the rest and others that are more widely accessed. Letting programmers access any routine at any time, with all the source code available to everyone, makes things efficient.

    Component-makers decided that programmers couldn't be trusted with such broad powers. They invented restrictions to keep routines strictly separate. While there were earlier versions of this idea, a couple decades ago "services" were created to hold strictly separate groups of routines. An evolved version of that concept is "micro-services." See this for an analysis.

    The second major dimension of components is controlling and limiting the relationship between routines and data. The first step was taken very early. It was sensible and remains in near-universal use today: local variables. These are data definitions declared inside a routine for the private use of that routine during its operation. They are like items on a temporary worksheet, discarded when the results are created.
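    In code terms, local variables behave exactly like that temporary worksheet. A minimal Python sketch (the routine and variable names are invented for illustration):

    ```python
    def season_mix(salt_g, pepper_g):
        # Local variables: created when the routine starts, used during
        # its work, and discarded when it returns -- a temporary worksheet.
        total = salt_g + pepper_g
        ratio = salt_g / total
        return ratio

    print(season_mix(3, 1))  # 0.75
    # 'total' and 'ratio' no longer exist out here;
    # referencing them would raise a NameError.
    ```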

    Later steps concerning "global" variables were less innocent. The idea was to strictly separate which routines can access which data definitions. Early implementations of this were light-weight and easily changed. Later versions built the separation into the architecture and code details. For example, each micro-service is ideally supposed to have its own private database and schema, inaccessible to the other services. This impacts how the code is designed and written, increasing the total amount of code, overhead and elapsed time.

    Languages that are "object oriented" are an extreme version of the component idea. In O-O languages, each data definition ("class") can be accessed only by an associated set of routines ("methods"). This bizarre reversal of the natural relationship between code and data results in a wide variety of problems.

    These ideas of components can be combined, making the overhead and restrictions even worse. People who build micro-services, for example, are likely to use O-O languages and methods to do the work. Obstacles on top of problems.

    Components in the kitchen

    All this software terminology can sound abstract, but the meaning and consequences can be understood using the kitchen comparison.

    What components amount to is having the kitchen be broken up into separate little kitchenettes, each surrounded by windowless walls. There are two openings in each kitchenette's walls, one for inputs and one for outputs. No chef can see or hear any other chef. The only interaction permitted is by sending and receiving packages. Each package has an address of some kind, along with things the sending chef wants the receiving chef to work on. For example, the sending chef may mix together some dry ingredients and send them to another chef who adds liquid and then mixes or kneads the dough as requested. The mixing chef might then put the dough into another package and return it to the sending chef, who might put together another package and send it to the baking chef who controls the oven.

    If the components are built as services, the little kitchenettes are built as free-standing buildings. In one version of components (direct RESTful calls), there is a messenger who stands waiting at each chef's output window. When the messenger receives a package, the messenger reads the address, jumps into his delivery van with the package and drives off to the local depot and drops off the package; another driver grabs the package, loads it into his van and delivers it to the receiver's input window. If the kitchenettes are all in the same location the vans and depot are still used — the idea is that a genius administrator can change the location of the kitchenettes at any time and things will remain the same.

    Another version of components is based around an Enterprise Service Bus (ESB). This is similar to the vans and the central office depot except that the packages are all queued up at the central location, which has lots of little storage areas. Instead of a package going right to a recipient, it's sent to one of these little central storage areas. Then, when the chef in a kitchenette is ready for more work he sends a request to the central office, asking for the next package from a given little storage area. Then a worker grabs the oldest package and gives it to a driver, who puts it in his van and delivers the package to the input window of the requesting chef.

    If this sounds bizarre and like lots of extra work, it's because … it's bizarre and requires lots of extra work.

    The ideal way to organize software

    The ideal way to break software into components is actually pretty similar to the way good kitchens are organized. Generally speaking, you start by having the same kinds of things together. You probably store all the pots and pans together, sorted in a way that makes sense depending on how you use them. You probably pick a place that’s near the stove where they’ll probably be used – maybe even hanging on hooks from the ceiling. You probably have a main store of widely used ingredients like salt, but you may periodically put containers of it near where it is most often used. An important principle is that most chefs work at or near their stations – but (it goes without saying) can move anywhere to get anything they need. There aren't walls stopping you, and you don’t bother someone else to get something for you when you can more easily do it yourself.

    Exactly the same principle applies whether you are creating an appetizer, an entree, a side dish or a dessert — you do the same gathering and assembly from the ingredients in the kitchen, but deliver the results on different plates at different times.

    In software this means that while routines may be stored in files in different directories for convenience, and that usually your work is confined to a single directory, you go wherever you need to in order to do your job. Same thing with data definitions; you can break them up if it seems to make things more organized, but any routine can access any data definition it needs to. When you’re done, you make a build of the system and try it out. That's what champion/challenger QA is for.

    Are you deploying on a system that has separate UI, server and storage? No problem! That's what build scripts (which you would have anyway) are for! This approach makes it easier to migrate from the decades-old DBMS-centric systems design and move towards a document-centered database with auto-replication to a DBMS for reporting, which typically improves both performance and simplicity by many whole-number factors.

    Are the chefs in your kitchen having trouble handling all the work in a timely way? In the software world you might think of making a "scalable architecture" with separate little kitchenettes, delivery vans, etc. — which any sensible person knows adds trouble and work and makes every request take longer from receipt to ultimate delivery. In the world of kitchens (and sensible software people) you might add another mixer or two, install a couple more ovens and hire another chef or two, everyone communicating and working in parallel, and churning out more excellent food quickly, with low overhead.

    If you think this sounds simple, you’re right. If you think it sounds simplistic, perhaps you should think about this: any of the artificial, elaborate, rigid prescriptions for organizing software necessarily involves lots of overhead for people and computers in designing, writing, deploying, testing and running software. Each restriction is like telling a chef that under no circumstances is he allowed to access this shelf of ingredients or use that tool when the need arises.

    Arguments to the contrary are never backed with facts or real-world experiments – just theory and religious fervor. Instead, you should consider the effective methods of optimizing a body of software, which are centered on Occamality (elimination of redundancy) and increasing abstraction (increasingly using metadata instead of imperative code). Both of these things will directly address the evils that elaborate component methods are supposed to cure but don't.

  • The Relationship between Data and Instructions in Software

    The relationship between data and instructions is one of those bedrock concepts in software that is somehow never explicitly stated or discussed. While every computer program has instructions (organized as routines or subroutines) and data, the details of how the data is identified, named and accessed vary among programming languages and software architectures. Those differences of detail have major consequences. That’s why understanding the underlying principles is so important.

    Programmers argue passionately about the supposed virtues or defects of various software architectural approaches and languages, but do so largely without reference to the underlying concepts. Only by understanding the basic concepts of data/instruction relationships can you understand the consequences of the differences.

    The professional kitchen

    One way to understand the relation between instructions (actions) and data is to compare it to something we can all visualize. An appropriate comparison for a program is a professional chef’s kitchen with working cooks. But in this kitchen, the cooks are a bit odd — only one of them is active at a time. When someone asks them to do something, they get active, each performing their own specialty. A cook may ask another cook for help, giving the cook things or directing them to places, and getting the results back. The cooks are like subroutines in that they get called on to do things, often with specific instructions like “medium rare.” The cook processes the “call” by moving around the kitchen to fetch ingredients and tools, brings them to a work area to process the ingredients (data) with the tools, and then delivers the results for further work or to the server who put in the order. The ingredients (data) can be in long-term storage, in working storage available to multiple chefs, or in a workspace undergoing prep or cooking. In addition, food (data) is passed to chefs and returned by them.

    The action starts when a server gives an order to the kitchen for processing.

    • This is like calling a program or a subroutine. Subroutines take data as calling parameters, which are like the items from the menu written on the order to the kitchen.

    The person who receives the order breaks the work into pieces, giving the pieces to different specialists, each of whom does his work and returns the results. Unlike in a real kitchen, only one cook is active at a time. One order might go to the meat chef and another to whoever handles vegetables.

    • This is like the first subroutine calling other subroutines, giving each one the specifics of the data it is supposed to process. The meat subroutine would be told the kind and cut of meat, the finish, etc.

    In a professional kitchen there is lots of pre-processing done before any order is taken. Chefs go to storage areas and bring back ingredients to their work areas. They may prepare sauces or dough so that everything is mixed in and prepped so that it can be finished in a short amount of time. They put the results of their work in nearby shelves or buckets for easy access later in the shift.

    • This is like getting data from storage, processing it and putting the results in what is called static or working storage, which is accessible by many different subroutines.

    There is a storage area and refrigerator that stores meat and another that stores vegetables. The vegetable area might have shelves and bins. The cook goes to the storage area and brings the required ingredients back to the cook’s work space. Depending on the recipe, the cook may also fetch some of the partly prepared things like sauces, often prepared by others, to include.

    • This is like getting data from long-term storage and from working storage and bringing it to automatic or local variables just for this piece of work.

    The storage area could be nearby. It could be a closet with shelves containing big boxes that have jars and containers in them. A cook is in charge of keeping the pantry full. They go off and get needed ingredients and put them in the appropriate storage area as needed. They could also deliver them as requested right to a chef.

    • This is like having long-term storage and access to it completely integrated with the language, or having it be a separate service that needs to be called in a special way.

    The chef does the work on the ingredients to prepare the result.

    • This is like performing manipulations on the data that is in local variables until the desired result has been produced. In the course of this, a chef may need to reach out and grab some ingredient from a nearby shelf.

    The chef may need extra space for a large project. He grabs some empty shelves from the storage area and uses them to store things that are in progress, like dough that needs time to rise. Later a chef might call out “grab me the next piece of dough” or “I need the dough on the right end of the third shelf.”

    • This is like taking empty space and using it. Pointers are sometimes used to reference the data, or object ID’s in O-O systems.

    The cook delivers the result for plating and delivery.

    • This is like producing a return variable. It may also involve writing data to long-term storage or working storage.
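    The whole mapping above can be sketched in a few lines of Python (the kitchen, routine names and data are all hypothetical, just to make the correspondence concrete):

    ```python
    # "Working storage": prepped before any order arrives, shared by routines.
    PREPPED_SAUCES = {"bechamel": "ready", "demi-glace": "ready"}

    def cook_meat(cut, finish):
        # Parameters ('cut', 'finish') are the specifics written on the order.
        sauce = PREPPED_SAUCES["demi-glace"]  # fetch from working storage
        # 'plate' is a local variable, the chef's temporary workspace.
        plate = f"{cut}, {finish}, with demi-glace ({sauce})"
        return plate  # the return value is the finished dish

    def handle_order(order):
        # The first subroutine breaks the order up and calls specialists.
        results = []
        for item in order:
            results.append(cook_meat(item["cut"], item["finish"]))
        return results

    print(handle_order([{"cut": "ribeye", "finish": "medium rare"}]))
    ```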

    I’m not a cooking professional, but I gather that the work in professional kitchens and how they’re organized has evolved towards producing the best results in the least amount of elapsed time and total effort. As much prep work as possible is done before orders are received, to minimize the time and work needed to deliver orders quickly and well. The chefs have organized work and storage spaces to handle original ingredients (meat, spices, flour, etc.) and partly done results (for example, a restaurant can’t wait the 45 minutes it might take to cook brown rice from scratch).

    In the next section, this is all described again somewhat more technically. If you’re interested in technology or have a programming background, by all means read it. The main points of this post and the ones that follow can be understood without it.

    Instructions and data in a computer program

    The essence of a computer program is instructions that the computer executes. Most of the instructions reference data in some way – getting data, manipulating it, storing results. See this for more.

    In math, from algebra on up, variables simply appear in equations. In computer software, every variable that appears in a statement must be defined as part of the program. For example, a simple statement like

              X = Y+1

    means “read the value stored in the location whose name is Y, add the number 1 to it, and store the result in the location whose name is X.” Given this meaning, X and Y need to be defined. How and where does this happen? There are several main options:

    • Parameters. These form part of the definition of a subroutine. When calling a subroutine, you include the variables you want the subroutine to process. These are each named.
    • Return value. In many languages, a called routine can return a value, which is defined as part of the subroutine.
    • Automatic or local variables. These are normally defined at the start of a subroutine definition. They are created when the subroutine starts, used by statements of the subroutine and discarded when the subroutine exits.
    • Static or working storage variables. These are normally defined separately (outside of) subroutines. They are assigned storage at the start of the whole program (which may have many subroutines), and discarded at the end.
    • Allocated variables. Memory for these is allocated by a subroutine call in the course of executing a program. Many instances of such allocated variables may be created, each distinguished by an ID or memory pointer.
    • File, database or persisting variables. These are variables that exist independent of any program. They are typically stored in a file system or DBMS. Some software languages support these definitions being included as part of a program, while others do not. See this for more.
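    Several of these kinds of variables can be seen in one short Python sketch (Python has no true static declarations, so a module-level variable stands in for working storage; the names are illustrative):

    ```python
    # Static/working-storage variable: defined outside any routine,
    # it exists for the life of the whole program.
    tax_rate = 0.1

    def increment(y):          # 'y' is a parameter
        z = y + 1              # 'z' is an automatic/local variable
        return z               # the return value

    x = increment(41)          # X = Y+1, with Y passed as a parameter
    print(x)                   # 42

    # Allocated variable: created at runtime, reached through a reference.
    record = {"day": 1, "month": 7}
    print(record["month"])     # 7

    # File/persisting variables outlive the program (sketch only):
    # with open("totals.txt", "w") as f:
    #     f.write(str(x))
    ```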

    There are a couple concepts that apply to many of the places and ways variables can be defined.

    • Grouping. Groups of variables can be in an ordered list, sometimes with a nesting hierarchy. This is like the classic Hollerith card: you would have a high-level definition for the whole card and then a list of the variables that would appear on the card.
      • There might be subgroups; for example, start-date could be the name of a group consisting of the variables day, month, year.
      • Referring to such a variable might look like “year IN start-date IN cust-record” in COBOL, while in other languages it might be cust-record.start-date.year.
    • Multiples. Any variable or group can be declared to be an array, for example the variable DAY could be made an array of 365, so there’s one value per day of a year.
    • Types or templates. Many languages let you define a template or type for an individual variable or group. When you define a new variable with a new name like Y, you could say it’s a variable of type X, which then uses the attributes of X to define Y.
    • Definition scope. Parameters, return values and local variables are always tied to the subroutine of which they are a part. They are “invisible” outside the subroutine. The other variables, depending on the language, may be made “visible” to some or all of a program’s subroutines. Exactly how widely visible data definitions are is the subject of huge dispute, and is at the core of things like components, services and layers.
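    Grouping, multiples and types can all be sketched in Python (using dataclasses as the template mechanism; the record layout is invented for illustration, with underscores in place of the hyphens above):

    ```python
    from dataclasses import dataclass

    @dataclass
    class StartDate:            # a type/template for a group of variables
        day: int
        month: int
        year: int

    @dataclass
    class CustRecord:           # groups can nest, like fields on a Hollerith card
        name: str
        start_date: StartDate   # a subgroup

    rec = CustRecord("Acme", StartDate(day=1, month=7, year=1999))
    print(rec.start_date.year)  # qualified reference, outer group to inner field

    daily_sales = [0.0] * 365   # "multiples": one value per day of the year
    print(len(daily_sales))     # 365
    ```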

    When you look at a statement like X = Y+1, exactly how and where X and Y are defined isn’t mentioned. X could be a parameter, a local variable or defined outside of the subroutine in which the statement appears. Part of the job of the programmer is to name and organize the data definitions in a clean and sensible way.

    The variety of data-instruction organization and relationships

    Most of the possibilities for defining variables listed above were provided for by the early languages FORTRAN, COBOL and C, each of which remains in widespread use. Not long after these languages were established, variations were introduced. Software languages and architectures were created that selected and arranged the way instructions related to data definition. Programmers and academics decided that some ways of referencing and organizing data were error-prone and introduced restrictions that were intended to reduce the number of errors that programmers made when creating programs. In software architecture, the idea arose that all of a program's subroutines should be organized into separate groups, usually called "components" or "services," each with its own collection of data definitions. The different components call on each other or send messages to ask for help and get results, but can only directly operate on data that is defined as part of the component.

    The most extreme variation of instruction/data relationship is a complete reversal of point of view. The view I've described here is "procedural," which means everything is centered around the actor, the chef who does things. The reversal of that point of view is "object-oriented," called O-O, which organizes everything around the data, the acted upon, the ingredients and workspaces in a kitchen. Instead of following the chef around as he gets and operates on ingredients (data), we look at the data, called objects, each of which has little actors assigned to it, mini-chefs, that can send messages for help, but can only work on their own little part of the world. It's hard to imagine!

    The basic idea is simple: instead of having a master chef or ones with broad specialties like desserts, there are a host of mini-chefs called "methods," each of which can only work on a specific small group of ingredients. A master chef has to know so much — he might make a mistake! By having a mini-chef who is 100% dedicated to dough, and never letting anyone else create the dough, we can protect against bad chefs (programmers) and make sure the dough is always perfect! Hooray! Or at least that's the theory…
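    The reversal of viewpoint can be sketched in a few lines of Python (the dough example is hypothetical):

    ```python
    # Procedural view: the chef (a routine) acts on passive ingredients (data).
    def knead(dough):
        dough["kneaded"] = True
        return dough

    print(knead({"kneaded": False})["kneaded"])   # True

    # O-O view: the dough carries its own mini-chef (a "method"), and by
    # convention only Dough's own methods touch its internal state.
    class Dough:
        def __init__(self):
            self._kneaded = False   # "private" state, guarded by the class
        def knead(self):
            self._kneaded = True

    d = Dough()
    d.knead()                       # the data is asked to act on itself
    print(d._kneaded)               # True
    ```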

    Conclusion

    Computers take data in, process it and write data out. Inside the computer there are instructions and data. Software languages have evolved to make it easier for programmers to define the data that is read and created and to make it easier to write the lines of code that refer to the data and manipulate it. As bodies of software have grown, people have created ways to organize the data that a computer works on, for example putting definitions for the in-process data of a subroutine inside the subroutine itself or collecting a group of related subroutines into a self-contained group or component with data that only it can work on.

    Understanding the basic concepts of instruction/data relationships and how those relationships can be organized and controlled is the key to understanding the plethora of approaches to language and architecture that have been created, and making informed decisions about which language and architecture is best for a given problem. The overall trend is clear: Programming self-declared elites decide that this or that restriction should be placed on which variables can be accessed by which instructions in which way, with the goal of reducing the errors made by normal programming riff-raff. Nearly all such restrictions make things worse!

  • How Micro-Services Boost Programmer Productivity

    There's a simple way to understand the impact of micro-services on programmer productivity: they make it worse. Much worse. How can that be?? Aren't monolithic architectures awful nightmares making applications unable to scale and causing a drain on programmer productivity? No. No. NO! Does this mean that every body of monolithic code is wonderful, supporting endless scaling and optimal programmer productivity? Of course not. Most of them have problems on many dimensions. But the always-wrenching transition to micro-services makes things worse in nearly all cases. Including reducing programmer productivity.

    Micro-services for Productivity

    Micro-services are often one of the first buzzwords appearing on the resumes of fashion-forward CTO's. They are one of today's leading software fashions; see this for more on software fashions. I have treated in depth the false claims of micro-services to make applications "scalable."

    I recently discussed the issue with a couple CTO's using micro-services who admitted that their applications will never need to get up to hundreds of transactions a second, a tiny fraction of the capacity of modern cloud DBMS's. In each case, they have fallen back on the assertion that micro-services are great for programmer productivity, enabling their teams to move in fast, independent groups with minimal cross-team disruption.

    The logic behind this assertion has a couple major aspects. The basic assertion that small teams, each concentrating on a single subject area, are more productive than large, amorphous teams is obviously correct. This is an old idea.  It has nothing to do with micro-services. It takes a fair amount of effort for members of a team to develop low-friction ways of working together, and it also takes time to understand a set of requirements and a body of existing code. Why not leverage that investment, keeping the team intact and working on the same or similar subject areas? Of course you should! No-brainer!

    Here's the fulcrum point: given that it's good to have a small team "owning" a given set of functionality and code, what's the best way to accomplish this? The assertion of those supporting micro-services is that the best way is to break the code into separate, completely distinct pieces, and to make each piece a separate, independent executable, deployed in its own container, and interacting with other services using some kind of service bus, queuing mechanism or web API interface shared by the other services. The theory is that you can deploy as many copies of the executables as you want and change each one independently of the others, resulting in great scalability and team independence. In most cases, each micro-service even has its own data store for maximum independence.

    I covered the bogus argument about scalability here. You can deploy all the copies of a monolith that you want to. Having separate DBMS's introduces an insane amount of extra work, since there's no way (with rare exceptions) each service database would be truly independent of the others, and sending around the data controlled by the other services roughly quadruples the work. The original service has to get the data and not only store it locally, but send it to at least one other service, which then has to receive it and store it. That's 4X the work to build in the first place and 4X again every time you need to make changes. And sending things from one service to another is thousands of times slower than simply making a local call.
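    A toy Python simulation of that arithmetic (the service names and stores are invented; a real deployment would add network transport, making the overhead far worse). The monolith stores a fact once; the services must store it, serialize it, transmit it, parse it and store it again:

    ```python
    import json

    # Monolith: one store, one write.
    monolith_db = {}

    def monolith_save(key, value):
        monolith_db[key] = value              # one step, done

    # Micro-services: each service has its own private store.
    orders_db, billing_db = {}, {}

    def orders_save(key, value):
        orders_db[key] = value                # step 1: store locally
        message = json.dumps({key: value})    # step 2: serialize and "send"
        billing_receive(message)

    def billing_receive(message):
        data = json.loads(message)            # step 3: receive and parse
        billing_db.update(data)               # step 4: store it again

    monolith_save("order-1", 100)
    orders_save("order-1", 100)
    print(len(monolith_db), len(orders_db), len(billing_db))  # 1 1 1
    ```

    Four steps (and two copies of the same fact to keep consistent) versus one, before any network latency is counted.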

    Now we're down to productivity. Surely having the group concentrate on its own body of code is a plus, isn't it? YES! It is! But how does having a separation of concerns in a large body of code somehow require that the code devoted to a particular subject be built and deployed as its own executable???

    Let's look at a "monolithic" body of code. Getting into detail, it amounts to a hierarchy of directories containing files. Some of the files will have shared routines (classes or whatever) and others won't be shared. For most groups, going to micro-services means taking those directories and converting them into separate, independently built code bases, each deployed by the team that owns it. Anyone who's tried this and/or knows a large body of code well knows that there's a range of independence, with some routines being clearly separable, some being clearly shared and others some of each.

    A sensible person would look at this set of directories as a large group of routines organized into files and directories as it was built. They would see that as changes were made and things evolved, the code and the organization got messy. There were bits of logic that were scattered all over the place that should be put in one place. There were things that were variations on a theme that should be coded once, with the variations handled in parameters or metadata. There were good routines that ended up in the wrong place. The sensible person realizes that not only is this messy, but it makes things harder to find and change, and makes it more likely that something will break when you change it.

    The sensible person who sees redundancy and sloppy organization wants to fix it. One long-used way to organize code to avoid the problem is to create what are called "components," which are sort of junior versions of micro-services. Here is a detailed description of the issues with components, and how sensible people respond to the hell-bent drive towards components. The metaphor of a kitchen with multiple cooks is an apt one.

    Then there's the generic approach of technical debt. While you can go crazy about this, the phrase "paying down technical debt" is a reasonable one. For my take on this tricky subject, see this. Here's a simple way to understand the process and value, and here's a more far-reaching explanation of the general principle of Occamality.

    The sensible person now has things organized well, with most of the redundancy squeezed out. Why is this important? Simple. What do you mostly do to code? Change it. When you look for the thing that needs changing, where would you like it to be? In ONE PLACE. Not many places in different variations. Concerns about basically the same thing should be in a single set of code. You can build it, deploy it and test it easily.

    What's the additional value of taking related code and putting it in its own directory tree, with its own build and deployment? None! First, it's extra work to do it. Second, there are always relationships between the "separate" bodies of code — that's why, when they're separate services, you've got enterprise service buses, cross-service calls, etc. Extra work! And HUGE amounts of performance overhead. Even worse, if you start out with a micro-services approach instead of converting to one, your separate teams will certainly confront similar problems and code solutions to them independently, creating exactly the kind of similar-but-different code cancer that sensible people try to avoid and/or eliminate!

    Separate teams with separate executables also have extra trouble testing. No extra testing trouble, you say? Because you do test-driven development and have nice automated tests for everything? I'm sorry to have to be the one to tell you, but if you really do all this obsolete stuff, your productivity is worse by at least a factor of two than if you used modern comparison-based testing. Not to mention that your quality as delivered is worse. See this and this.

    Bottom line: separation of concerns in the code is a good thing. Among other things, it enables small groups to mostly work on just part of the code without being an expert in everything. Each group will largely be working on separate stuff, except when there are overlaps. All of this has nothing to do with separate deployment of the code blocks as micro-services.  Adding micro-services to the code clean-up is a LOT of extra work that furthermore requires MORE code changes, an added burden to testing and ZERO productivity benefit.

    Conclusion

    The claim that small teams working closely together on a sensible subset of a larger code base is a productivity-enhancing way to organize things is true. Old news. The claim that code related to a subject should be in one place also makes sense, kind of the way that pots and pans are kept in the kitchen where they're used instead of in bedroom closets. Genius! Going to the extreme of making believe that a single program should be broken into separate little independent programs that communicate with each other and that the teams and programs share nothing but burdensome communications methods is a productivity killer. It's as though instead of being rooms in a house, places for a family to cook, eat, sleep and relax each had its own building, requiring travel outside to get from one to the other. Anyone want to go back to the days of outhouses? That's what micro-services are, applied to all the rooms of a house.

  • Why is a Monolithic Software Architecture Evil?

    Why is a monolithic software architecture evil? Simple. There is no need to explain “why,” because monolithic is not evil. Or even plain old bad. In fact it’s probably better than all the alternatives in most cases. Here’s the story.

    The Cool, Modern Programmers Explain

    The new, modern, with-it software people come in and look at your existing code base. While admitting that it works, they declare it DOA. They say DOA, implying “dead on arrival.” But since the software apparently works, it can’t be “dead,” except in the eyes of the cool kids, as in “you’re dead to me.” So it must be “disgusting,” “decrepit,” “disreputable,” or something even worse.

    Why is it DOA (whatever that means)? Simple: … get ready … it’s monolithic!! Horrors! Or even better: quelle horreur!!

    Suppose you don’t immediately grimace, say the equivalent of OMG, and otherwise express horror at the thought of a code base that’s … monolithic!! … running your business. Suppose instead you maintain your composure and ask in even, measured tones: “Why is that bad?” Depending on the maturity level of the tech team involved, the response could range from “OK, boomer,” to a moderate “are you serious, haven’t you been reading,” all the way up to a big sigh, followed by “OK, let me explain. First of all, if an application is monolithic, it’s so ancient it might as well be written in COBOL or something people who are mostly dead now wrote while they were sort of alive. But whatever the language, monolithic applications don’t scale! You want your business to be able to grow, right? Well that means the application has to be able to scale, and monolithic applications can’t scale. What you need instead is a micro-services architecture, which is the proven model for scalability. With micro-services, you can run as many copies of each service on as many servers as you need, supporting endless scaling. Even better, each micro-service is its own set of code. That means you can have separate teams work on each micro-service. That means each team feels like they own the code, which makes them more productive. They’re not constantly stepping on the other teams’ toes, running into them, making changes that break other teams’ work and having their own code broken by who-knows-who else? With monolithic, nobody owns anything and it’s a big free-for-all, which just gets worse as you add teams. So you see, not only can’t the software scale when it’s a monolith, the team can’t scale either! The more people you add, the worse it gets! That’s why everything has to stop and we have to implement a micro-service architecture. There’s not a moment to lose!”

    After that, what can a self-respecting manager do except bow to the wisdom and energy of the new generation of tech experts, and let them have at it? All it means is re-writing all the code, so how bad can it be?

    One of the many signs that “computer science” does little to even pretend to be a science is the fact that this kind of twaddle is allowed to continue polluting the software ecosphere. You would think that some exalted professor somewhere would dissect this and reveal it for the errant nonsense it is. But no.

    Some Common Sense

    In the absence of a complete take-down, here are a few thoughts to help people with common sense resist the crowd of lemmings rushing towards the cliff of micro-services.

Here's a post from the Amazon Prime Video tech team about a quality-checking service they had written using the classic microservices architecture that … couldn't scale!! The architecture that solves scaling can't scale? How is that possible? Even worse is how they solved the problem. They converted the code, re-using most of it, from microservices to … wait, try to guess … yes, it's your worst nightmare: they converted it to a monolith. The result? "Moving our service to a monolith reduced our infrastructure cost by over 90%. It also increased our scaling capabilities."

    Here's the logic of it. Let’s acknowledge that modern processor technology has simply unbelievable power and throughput. Handling millions of events per second is the norm. The only barrier to extreme throughput and transaction handling is almost always the limits of secondary systems such as storage.

    Without getting into too many details, modern DBMS technology running on fairly normal storage can easily handle thousands of transactions per second. This isn’t anything special – look up the numbers for RDS on Amazon’s AWS for example. Tens of thousands of transactions per second with dynamic scaling and fault tolerance are easily within the capacity of the AWS Aurora RDBMS;  with the key-value DynamoDB database, well over 100,000 operations per second are supported.

    Keeping it simple, suppose you need to handle a very large stream of transactions – say for example 30 million per hour. That’s a lot, right? Simple arithmetic tells you that’s less than ten thousand transactions per second, which itself is well within the capacity of common, non-fancy database technology. What applications come even close to needing that kind of capacity?
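The arithmetic is worth a quick sanity check (a trivial sketch in Python):

```python
# 30 million transactions per hour, converted to transactions per second.
tx_per_hour = 30_000_000
seconds_per_hour = 3600
tx_per_second = tx_per_hour / seconds_per_hour
print(round(tx_per_second))  # 8333 -- comfortably under ten thousand
```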

The database isn't the problem, you might think, it's the application! OK, there is a proven, widely used solution: run multiple instances of your code. As many as you need to handle the capacity and then some — you know, kinda like microservices! It's more than kinda. Each transaction that comes in gets sent to one of what could be many copies of the code. The transaction is processed to completion, making calls to a shared database along the way, and then the instance waits for another transaction to come in. Since all the code required to process the transaction resides in the same code instance, all the time and computational overhead of using the queuing system for moving stuff around among the crowd of services is eliminated. Both elapsed time and compute resources are likely to be much better, often by a factor of 2 or more.
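Here's a minimal sketch of that pattern, with threads standing in for the code instances and a lock-protected dict standing in for the shared database (all the names here are hypothetical, chosen only to show the shape):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the shared database: one store, guarded by a lock.
balances = {}
db_lock = threading.Lock()

def process(tx):
    """Process one transaction to completion inside a single code instance."""
    account, amount = tx
    with db_lock:  # the call to the shared database
        balances[account] = balances.get(account, 0) + amount

# Incoming transactions are handed to whichever instance is free;
# no inter-service queuing system is involved.
transactions = [("A", 10), ("B", 5), ("A", -3)] * 1000
with ThreadPoolExecutor(max_workers=8) as pool:
    pool.map(process, transactions)

print(balances["A"], balances["B"])  # 7000 5000
```

In a real system the dict would be a DBMS and the instances would be processes or machines, but the shape is the same: each instance runs the whole transaction, start to finish.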

OK, what if something extreme happens? What if you need more, and somehow it's your code that's the barrier? Here the micro-services groupies have it right – to expand throughput, the right approach is sometimes to spin up another copy of the code on another machine. And another and another if needed. I talk about how to scale with a shared-nothing architecture here. Why is this only possible if the code has been re-written into tiny little slivers of the whole, micro-services?

The micro-service adherent might puke at the thought of making copies of the whole HUGE body of code. Do the numbers. Do you have a million lines of code? Probably not, but suppose each line of code takes 100 bytes, which would be a lot. That’s 100MB of code. I’m writing this on a laptop that’s a couple of years old. It has 8GB of RAM in it. That’s 80 times as large as the space required for the million lines of code, which is probably WAY more code than your system has. Oh, you have ten million lines? The RAM is still 8 times larger. No problem. And best of all, no need to rewrite your code to take advantage of running it on as many processors as you care to allocate to it.
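The back-of-envelope arithmetic checks out:

```python
# A million lines at a (generous) 100 bytes per line, vs. an 8GB laptop.
lines_of_code = 1_000_000
bytes_per_line = 100
code_size = lines_of_code * bytes_per_line   # 100 MB of code
ram = 8 * 1000**3                            # 8 GB of RAM
print(ram // code_size)  # 80 -- the RAM is 80x the code
```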

    I can see the stubborn micro-services cultist shaking his head and pointing out that micro-services isn’t only about splitting up the code into separate little services, but making each service have its own database. Hah! With each service having its own database, everything is separate, and there are truly no limits to growth!

    The cultist is clearly pulling for a “mere” tens of thousands of transactions a second not being nearly enough. Think of examples. One might be supporting the entire voting population of California voting using an online system at nearly the same time. There are fewer than 20 million registered voters in that state. Fewer than 60% vote, usually much less. Suppose for sake of argument that voter turnout was 100% and that they all voted within a single hour, a preposterous assumption. A monolithic voting application running on a single machine with a single database would be able to handle the entire load with capacity to spare. Of course in practice you’d have active-active versions deployed in multiple data centers to assure nothing bad happened if something failed, but you’d have that no matter what.

    Suppose somehow you needed even more scaling than that. Do you need micro-services then?

    First of all, there are simple, proven solutions to scaling that don’t involve the trauma of re-writing your application to micro-services.

    The simplest one is a technique that is applicable in the vast majority of cases called database sharding. This is where you make multiple copies of not just your code but also the database, with each database having a unique subset of the data. The exact way to shard varies depending on the structure of the data, but for example could be by the state of the mailing address of the customer, or by the last digit of the account, or something similarly simple. In addition, most sharding systems also have a central copy of the database for system-wide variables and totals, which usually requires a couple simple code changes.
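A sketch of the routing logic, sharding by the last digit of the account number (the shard stores here are stand-in dicts, not real databases):

```python
# Ten shards, each holding a unique subset of the data, keyed by the
# last digit of the account number.
NUM_SHARDS = 10
shards = [dict() for _ in range(NUM_SHARDS)]  # stand-ins for ten databases

def shard_for(account_number: int) -> dict:
    """Route by the last digit of the account number."""
    return shards[account_number % NUM_SHARDS]

def save(account_number: int, record: dict) -> None:
    shard_for(account_number)[account_number] = record

save(1234567, {"name": "Alice"})  # lands in shard 7
save(2024, {"name": "Bob"})       # lands in shard 4
```

The application code above the routing function doesn't change at all; every copy of the code runs against the full schema, just a slice of the rows.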

Sharding is keeping the entire database schema in each copy of the code, but arranging things so that each copy has a subset of all the data. Micro-services, in contrast, usually involve creating a separate database schema for each micro-service, and attempting to arrange things so that the code in each service has ALL the tables it needs in its subset of the overall schema, and ONLY the tables it needs. In practice, this is impossible to achieve. The result is that micro-services end up calling each other to get the fields they don’t store locally and to update them as well. This results in a maze of inter-service calling, with the attendant errors and killing of elapsed time. If all the code and the entire schema were in one place, none of this would be needed.

    I am far from the only person who has noticed issues like this. There was even a list of problems in Wikipedia last time I looked.

The need to make your application scalable and then actually scale it doesn’t arise often, but when it does you should definitely be ready for it. The answer to the question of how best to architect a software application to be scalable from day one is simple: assure that it’s monolithic! Architect your application so it’s not database centric – this has been a reasonable approach for at least a decade, think it might be worth a look-see? If you do have a RDBMS, design your database schema to enable sharding should it be needed in the future. Make sure each software team “owns” a portion of the code; if you work towards eliminating redundancy and have a meta-data-centric attitude, you’ll have few issues with team conflict and overlap.

Do yourself and your team and your customers and your investors a BIG favor: stubbornly resist the siren call to join the fashion-forward micro-services crowd. Everything will be better. And finally, when you use the term “monolithic,” use it with pride. It is indeed something to guard, preserve and be pleased with.

  • The Three Dimensions of Software Architecture Goodness

    In my work on Wartime Software, I describe the methods used by small groups of smart, motivated programmers to compete with large, established groups using standard software techniques – and win. I haven’t invented those methods; I’m simply collecting, organizing and describing what such groups of software ninjas actually do.

    Similarly, after observing the internal structure of many bodies of software over a long time, patterns emerge about the internal structures that yield competitive advantage, and tend to take market share. These internal structures are a kind of continuum rather than an either/or: a group that is farther along the continuum of goodness I’ve observed has a substantial competitive advantage over any group whose software is more primitive in the continuum. This pattern is so strong that you can see it play out with many companies competing over a long period of time.

    The patterns are fundamentally simple, but rarely discussed among programmers or in the literature – and not taught (to my knowledge) in Computer Science departments. Using the patterns, it’s possible to rank any piece of software along three independent but conceptually related dimensions.

    The dimensions are:

    • The code. The not-good end of this axis is code that has a great deal of redundancy, and the good end is code that has no redundancy.
    • The data. The not-good end of the axis is data that is defined and/or stored in multiple places, and the good end is data that is uniquely defined and stored.
    • The meta-data. The not-good end of the axis is a software system with no meta-data, and the good end is one in which most application functionality is defined and controlled by meta-data, and can be changed by editing it with no code changes. I often think of this as the vertical dimension.

    These dimensions are closely related conceptually, but in practice aren’t necessarily linked. Nonetheless, in practice, movement towards the good end of any dimension encourages and helps movement towards the good end of the others.

    Why should anyone care about dimensions of software architecture goodness?

    It’s an abstract idea. It’s new to many people. It’s a tough addition for managers whose heads already are chock full of barely understandable buzzwords and acronyms. Why pile on more?

Simple: investment in moving software towards the goodness end of the dimensions is HIGHLY correlated with shorter times, lower costs and lower risks of meeting the evolving needs of existing and new customers in a highly competitive market. The correlation is clearly demonstrated historically, and incremental progress yields immediate benefits in sales and business satisfaction. There is no single thing a software group can do that has greater impact on fast, low-cost and trouble-free delivery of existing and new product features to customers.

    I know these are bold claims, particularly in an environment that swirls with buzzwords for software. Everything will be better once we start programming in Golang, like Google! Our teams will work better and our software will be scalable once we re-organize using micro-services! We’ve got to automate testing and move to test-driven development! We have to add Kanban to our Agile environment, and hire a certified SCRUM master! These subjects and more have passionate adherents and are widely implemented. But none of them yield the promised results, which is part of why there’s such a merry-go-round of methods becoming fashionable, only to fade quietly into obscurity or ho-hum, nothing’s changed but it’s just what we do.

    Details of the Dimensions

    Data

    The data dimension is the most widely understood and practiced of the three. I hope everyone who can spell “relational database” is familiar with the concept of a normalized schema – the whole point of which is to assure that each piece of data is defined and stored in exactly one place. But the concept of a normalized schema applies strictly within the bounds of the DBMS itself – there is no widely used term or concept for a truly central definition of data, including inside the code.

Is having fewer, less redundant, more centralized definitions a good idea? You bet. This is largely why the Rails framework for Ruby rapidly grew to widespread use. It had nothing to do with the design of Ruby or its object-orientation – it was all about the Rails framework and its central DRY principle; DRY = “don’t repeat yourself.” The idea was/is simple: data definitions in Rails are created in a central place – and then applied by the framework both to the DBMS schema and to the data definitions in the code that accesses the data. To change a data definition, you change it in the central Rails definition, and the framework applies the change in both the database and the Ruby code.
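A toy sketch of the DRY idea (this is not actual Rails code, just the shape of it, in Python): one central definition drives both the database schema and the code-side accessors, so a change made once propagates to both.

```python
# One central definition of the data.
FIELDS = {"id": "INTEGER", "name": "TEXT", "email": "TEXT"}

def ddl(table: str) -> str:
    """Generate the database schema from the central definition."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in FIELDS.items())
    return f"CREATE TABLE {table} ({cols})"

class Record:
    """Code-side accessors derived from the same central definition."""
    def __init__(self, **values):
        for name in FIELDS:
            setattr(self, name, values.get(name))

print(ddl("users"))  # CREATE TABLE users (id INTEGER, name TEXT, email TEXT)
row = Record(id=1, name="Ada")
# Adding a field to FIELDS changes the schema and the accessors at once.
```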

    It’s important to note that the data dimension in my definition encompasses data wherever it appears – program files, schema files, anywhere.

    Code

    The code redundancy dimension is much less understood and recognized for its importance. But the value of reducing code redundancy is easily understood: when you need to make a change to a program, how many places do you have to go to make the change? The answer any appropriately lazy programmer would like is “exactly one place.” The programmer finds the place, makes the change, and all is good. As you add code to a system to make it do new things, often you end up repeating or creating, perhaps inadvertently, variations on existing themes in the code. When you’re done, what’s the best thing you can do to make the next change as easy as possible? Eliminate the redundancy. That’s it! Objects, layers, components, services and the rest don’t matter. Redundancy matters.
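The point fits in a few lines. In the redundant version, a change to the discount rule must be made twice (and one copy is inevitably missed); in the non-redundant version there is exactly one place to change:

```python
# Redundant: the same rule lives in two places.
def online_price(price):
    return price * 0.9 if price > 100 else price

def in_store_price(price):
    return price * 0.9 if price > 100 else price

# Occamal: one rule, one place to change it.
def discounted(price):
    return price * 0.9 if price > 100 else price

online = discounted
in_store = discounted
```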

    Meta-data

The third dimension is meta-data. Having no redundancy in the meta-data dimension is important, and is often achieved by measures such as multiple inheritance hierarchies and links. But the most important thing is this: to the maximum extent possible, application-specific behaviors of all kinds should be defined in meta-data rather than lines of code. Goodness in this dimension is measured by growing meta-data. Over time, decreasing code redundancy is achieved by increasing the size, extent and power of the meta-data.

    This evolution directly powers business benefits for the simple reason that meta-data can be changed and augmented quickly without creating a new version of the code and without fear of introducing bugs into the code. In the end, you end up with something like Excel: the source code never needs to be touched, while supporting endless variations of spreadsheets, with no programming skills required. When you make a mistake in a spreadsheet, you may not get the result you want, but Excel doesn’t crash.
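A small sketch of the idea (the rule names and the engine are hypothetical): the generic engine never changes, while all the application-specific behavior lives in meta-data that can be edited without touching code.

```python
# Meta-data: field-specific behavior expressed as data, not code.
FIELD_RULES = {
    "age":   {"type": int, "min": 0, "max": 130},
    "email": {"type": str, "required": True},
}

def validate(record: dict) -> list:
    """Generic engine: everything field-specific comes from FIELD_RULES."""
    errors = []
    for field, rules in FIELD_RULES.items():
        value = record.get(field)
        if value is None:
            if rules.get("required"):
                errors.append(f"{field} is required")
            continue
        if not isinstance(value, rules["type"]):
            errors.append(f"{field} has the wrong type")
        elif rules["type"] is int and not (rules["min"] <= value <= rules["max"]):
            errors.append(f"{field} is out of range")
    return errors

print(validate({"age": 200, "email": "a@b.com"}))  # ['age is out of range']
# Tightening the age limit is a meta-data edit, not a code change.
```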

    Conclusion

    Wartime Software is a collection of techniques that have evolved to build software quickly, and to win in a software-fueled business.  People who practice some subset of this collection of techniques often find themselves moving towards the goodness end of the software architecture dimensions as they rapidly cycle their code, since every small step along any of the three dimensions helps them move more quickly and satisfy emerging customer needs more rapidly than the competition. It’s one of the most powerful tools in the Wartime Software arsenal.

  • The Progression of Abstraction in Software Applications

    This post describes a little-known concept for understanding and creating software architecture that small groups use to defeat large, powerful incumbents and nimble competitors. It is one of a small number of powerful, repeating patterns that help us understand and predict the evolution of software. Understanding these patterns can help entrepreneurs direct their efforts; if they do it well, they greatly enhance their chances of success. Understanding the patterns can also help investors choose to invest in groups that are walking a path to success.

    Evolution of Applications Towards Abstraction on a Platform

    One of these patterns is the stages that applications naturally evolve through on a technology platform. Each step or stage brings a big increase in the power of the software, decreasing the effort and increasing the speed and effectiveness of being deployed to meet customer needs.

    A category of software applications may well get “stuck” at a particular stage for a long time, sometimes even decades. During this time, the software may appear to move forward, and of course the marketing people and managers will always put things in the best possible light. But it’s always vulnerable to being supplanted by a next-stage version of the functionality.

    While there aren’t clear lines of delineation between the stages, it’s nonetheless useful to understand them roughly as:

    • Prototype. A hard-coded body of code.
    • Custom Application. Does a job reliably, but most changes require changing source code.
    • Basic Product. The code now has parameters, maybe user exits and API’s. Real-life implementations tend to require extensive professional services, and the cost of upgrading to new versions tends to be high.
    • Parameterized Product. The level of parameterization is high, with interface layers to many things outside the core code, so that many implementations can be done without changing source code. There may be some meta-data or editable rules.
    • Workbench Product. A large portion of the product’s functionality has migrated from code to editable meta-data, so that extensive UI, workflow, interface and functionality changes can be accomplished via some form of workbench, which could be just a text editor. The key is that the details of application functionality are expressed as editable data, meta-data, instead of code. Nonetheless, all fundamental capabilities are expressed in highly abstract code.

    As a body of code goes through this sequence of abstraction, it is increasingly able to meet the needs of new customers and changing needs of existing customers, with decreasing amounts of effort, risk and changes to source code. At the same time, the more abstract a program, the more functionality is expressed as data that is not part of the software itself, a.k.a. meta-data, and the more the software implements generic capabilities, as directed by the meta-data.

    The pattern applies both to individual bodies of code and to collections of them. It applies to code built internally for an organization and to code that is sold as a product in any way.

    I defined the stages above as a convenience; in reality, the categories are rarely hard-and-fast. A body of code could be given a big transformation and leap along the spectrum, or it could take a long series of small steps. One body of code could remain stuck with little change in abstraction, while other bodies of code doing similar things could be ahead, or progress rapidly towards abstraction.

    The Driver of Abstraction Evolution

In biological nature, competitive pressures and external change appear to drive evolutionary changes. Similarly, when we look at categories of software, if there is little pressure to make the software change, it doesn’t change – why take the trouble and expense to change software that meets your needs or the needs of your customers?

    In reality, someone always seems to want changes to an application. A prospective user would gladly use the software if it did this, that or the other thing. A current user complains about something – it’s too slow, too hard to use, too complicated, or it just doesn’t work for X, Y or Z. How often does a piece of software not have a “roadmap?” If it doesn’t, it’s probably slated for retirement before long. Brand-new software is rare. The vast majority of software effort goes into making changes to an existing piece of software.

    How much time and effort is needed to change a particular body of software? That is the key touch-point between techies and business people. This is the point at which the level of abstraction of the application comes into play. Regardless of the level of abstraction of the application, the “change” required either has been anticipated and provided for or it has not.

    • If the change has been anticipated, the code can already do the kind of thing the user wants – but not the particular thing. Doing the particular thing requires that a parameter be defined, a configuration file changed, a template created or altered, workflow or rule changes made, or something similar that is NOT part of the program’s source code. This means that the requirement can be met quickly, with little chance of error.
    • If the change has not been anticipated, then source code has to be changed in some way to make the change. The changes required may be simple and localized, or complex and extensive – but the source code is changed and a new version of the program is created.
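The "anticipated" path can be sketched like this, using a hypothetical retry limit that was made a parameter when the need was foreseen; meeting the new requirement is then a config edit, not a new version of the program.

```python
import configparser

# Hypothetical configuration; in real life this text lives in a file
# that can be edited without touching source code.
config = configparser.ConfigParser()
config.read_string("""
[network]
retry_limit = 3
""")

def fetch_with_retries(fetch):
    """Behavior driven by configuration, not by a hard-coded constant."""
    limit = config.getint("network", "retry_limit")
    for _ in range(limit):
        try:
            return fetch()
        except IOError:
            pass
    return None
```

Raising the retry limit to 5 is an edit to the config text; the program itself is untouched, so the change ships quickly and with little chance of error.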

    This is what the level of abstraction of a software application is all about: the more abstracted the application, the more ways it can be changed without altering the source code and making a new version of the program.

    This is the fundamental driver of applications towards increasing abstraction: as changes have to be made to the application, at some point the technical people may decide to make the change easier to make in the future, and create the appropriate abstraction for the kind of change. This may happen repeatedly. As more complex changes are required, the program may just get more complicated and more expensive to make further changes, or more sophisticated abstractions may be introduced.

    While historically it appears that outside forces drive applications towards increasing abstraction, smart programmers can understand the techniques of abstraction and build an application that is appropriately abstract from the beginning. Similarly, the abstracting methods can be applied by smart programmers to existing bodies of code to transform them, just because it's a way to build better code and meet ever-evolving business requirements.

    Conclusion

The hierarchy of abstraction in software is one of the most important concepts for understanding a specific piece of software, or a group of related software. Over time, software tends to become more abstract because of competitive and business pressures, or because of smart programmers working to make things better. The more abstract a piece of software is, the more likely it is that it can respond to business and user demands without modifications to the source code itself, i.e., quickly and with low risk.

    The hierarchy of abstraction is certainly a valuable way of understanding the history of software. But it is more valuable as a framework for understanding a given piece of software, and the way to evolve that software to becoming increasingly valuable. It is most valuable to software developers as a framework for understanding software, and helping them to direct their efforts to get the greatest possible business impact with the smallest amount of time and effort.

  • The Distributed Computing Zombie Bubble

    Distributed computing is a trend whose time has come … and gone. Well, not completely. If my computers have to ask your computers a question, that's best done using something like "distributed computing." But to be used by a single software group to serve their organization's needs? Fuhgeddabouddit.

    The early days of distributed computing

    In earlier days, there were lots of computing problems that were too large to be solved in a reasonable period of time on a single computer. If it was important to cut the time to finish the job, you had to use more than one computer, sometimes lots of them. This was frequently the case during the first internet bubble period, for example, when the concept of “distributed computing” really got traction. The idea was simple: in order to serve lots and lots of people with your application, a single computer couldn’t possibly get the job done without making everyone wait too long. So you wrote your application so that it could use lots of computers to serve your users; you wrote a “distributed” application.

    It’s always been harder to write distributed applications than non-distributed ones, and of course there’s lots of overhead in moving data from one computer to another. But if you can’t serve your users with a single-computer application, you bite the bullet and go distributed.

    Distributed computing today

    The most common form of distributed computing lives on today, more often called "multi-tiered architecture." This is when you have, for example, computers that are web servers, front-ending computers that are application servers, front-ending computers that run a database. That's a simple, three-tier architecture. The idea is that, except for the database tier, it's easy to add computers to handle more users, and by doing much of the computing on something other than the database server, you make it handle a higher load than it otherwise would be able to.

    There's a more elaborate form of distributed computing that also has a strong fan base, sometimes centered around a service bus. Other people call it SOA (a service-oriented architecture). These are slightly different flavors of distributed computing, often found together in the same application.

    Like most ways of thinking about software, the people who love distributed computing learned to love it and think it's right. Period. Just plain better, more advanced, more scalable, more all good things than the stuff done by those amateurs who run around being amateurish.

    The impact of computer speed evolution

    As I've mentioned a few times, computers evolve more quickly than anything else in human experience. Do you think that the computers of today can handle more than computers could at the time distributed computing took its present form? Is it just possible that, for most applications, a simpler approach than distributed computing in any of its forms would get the job done?

    Multi-core processors

We all know about Moore's Law, I hope. But people don't think so much about the impact of multi-core processors. Simply speaking, "cores" put more than one computer on the chip. Physically, you still have a single chip. But inside the chip, there are really multiple computers, one per core, each running completely independently of the others. And the way they've built the cores, you actually get two threads per core — each thread can be considered an execution of a program. So, in a sense, you’ve got “distributed computing” inside the chip!
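A sketch of what that means in practice, assuming nothing beyond the standard library: one job split across every core on the chip, with no network hop in sight (ProcessPoolExecutor starts one worker per core by default).

```python
import os
from concurrent.futures import ProcessPoolExecutor

def partial_sum(bounds):
    """Each core independently sums its own slice of the range."""
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    cores = os.cpu_count() or 1
    n = 1_000_000
    step = max(n // cores, 1)
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    # "Distributed computing" inside the chip: one worker per core.
    with ProcessPoolExecutor() as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)  # 499999500000, i.e. sum(range(1_000_000))
```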

    Let's take a quick look at one of those chips. Here's one of the latest from Intel.

    This is one awesome chip! It's got

    • 15 cores, supporting
    • 30 threads, and can support
    • 1.5TB of RAM
    • 85GB/s memory speed, plus
    • over 32MB of on-chip cache

This is incredible. In the past, you might have 3 computers on each of 3 tiers, each with a robust 16GB of RAM (who would ever need more??), for a total of 9 computers with about 150GB of RAM. Connected by dirt-slow (by comparison) ethernet. Here, you've got 2-4 times the number of threads, 10X the amount of total RAM, all in a single chip, no bopping around on the ethernet slow lanes required. Who needs distributed computing when you've got one of these babies?!

    Conclusion

    Clearly, all the folks who regularly attend services at the Church of Distributed Computing didn't get the memo. This is not new news — except to the SOA and enterprise bus enthusiasts! There's no way mere facts are going to cause them to stray from their life-enhancing faith!

    But for the rest of us, it's clear. Use those cores. Use those threads. Make sure there's lots of RAM. And enjoy the numerous, multi-dimensional benefits of the simpler life.

  • Layers in Software: Fuss and Trouble without Benefit

    Most everyone in software seems to accept that layers are a good thing. In general, they're not. They take time and effort to build, they are a source of bugs, and make change more difficult and error-prone.

    What are layers in software?

    It's possible to get all sophisticated with this, but let's keep it simple. Imagine that your application stores data, presents some of it to users, the users change and/or add data, and the application then stores it again. Everyone thinks, OK, we've got the UI, the application and the storage. That's three layers to create, and for data to pass through and be processed. This is the classic "three-tier architecture," usually implemented with three tiers of physical machines as well.

    Everyone knows you use different tools and techniques to build each layer. You'll use something web-oriented involving HTML and javascript for the UI, some application language for the business logic, and probably a DBMS for the storage. Each has been adapted to its special requirements, and there are specialists in each layer. Everyone agrees that this kind of layering is good: each specialist can do his/her thing, and changes in each layer can be made independent of the others. We end up with solid, secure storage, a great UI and business logic that isn't dependent on the details of either.

    More layers!

    If layers are good, more layers must be better, right? It's definitely that way with cakes, after all. We know layer cakes are in general wonderful things. In some places, having 12 layers or more is what's done.

    It's not unusual for application environments to have six layers or even more. Among the additional layers can be: stored procedures in the database; a rules engine; a workflow engine; a couple layers in the application itself; an object-relational mapper; web services interfaces; layers for availability and recovery; etc.

    It's hard to find anyone who says this isn't a good thing. Imagine a speaker and a group of software developers. He says "Motherhood!" Everyone smiles and nods. He says "apple pie!" Everyone smiles and licks their lips. He says "layer cake!" Everyone can picture it, perhaps remembering blowing out the candles on just such a cake, opening wide as a kid and biting into a nice big piece of birthday layer cake. He says "Software should be properly layered!" Everyone gets a look that ranges from professional to sage and nods in agreement at such a statement of the obvious.

    Layers are good, aren't they?

    Layer Cake, yes; Software Layers, uh-uh

    Take another look at the pictures above; you'll notice that cake alternates with icing, whether there are 3 layers or 15. There's a way to make the icing and a way to make the cake, but usually one person makes both and assures that a wonderful, integral layer cake is the result.

    It's a whole different story in software. Even though the data flowing down from the top (UI) to bottom (storage) may be the same (date, name, amount, etc.), each layer has its own concerns and pays attention to different aspects of that data. Here's the real rub: when a change is made to the data, far from being isolated, each component that touches the data has to be changed in different but exactly coordinated ways. The data is even organized differently in each layer; that's why ORMs exist, for example.

    One of the fundamental justifications for thinking layers are good is separation of concerns: you can change each component independently of the others (the same fraudulent justification that lies at the heart of object-orientation, BTW). But this is just wrong (except in trivial cases)! Any time you want to add, remove or non-trivially change a field, all layers are affected. Each specialist has to go to each place the data element is touched and make exactly the right change.

    But it gets worse. Because each layer has its own way of representing data, there are converters that change the data received from "above" to this layer's preferred format, and then when the data is passed "down" it is converted again. If you are further saddled with web services or some other way to standardize interfaces, you have yet another conversion, to and from the interface's preferred data representation. Each one of these conversions takes work to build and maintain, takes work to change whenever a data element is changed, and can have bugs.

    Think it can't get worse? It can and does! Each group in charge of a layer feels the need to maintain the integrity of "their" layer. Those "foreign" layers — they're so bad — they do crappy work — we better protect ourselves against the bad stuff they send us! So we'd better check each piece of data we get and make sure it's OK, and return an error if it's not. Makes sense, right? Except now you have error checking and error codes to give on each piece of data you receive, and when you send data, you have to check for errors and codes from the next layer. Multiplied by each layer. So now when you make a change, just think of all the places that are affected! And where things can go wrong!

    Here's the bottom line: every layer you add to software is another parcel of work, bugs and maintenance. With no value added! Take a simple case, like moving to zip plus 4. Even in a minimal 3-layer application, 3 specialists have to go make exactly the right changes to each place the field is received, in-converted, error-checked, represented locally, processed, out-converted and sent, with code to handle errors from the sending.
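    The zip-plus-4 ripple can be sketched in a few lines. This is a deliberately toy example (all names are hypothetical) of one field touched in three layers; notice how many separate spots must change, in lockstep, to widen that one field.

    ```python
    # One "zip" field, three layers, three representations. Widening zip5 to
    # zip+4 ("12345-6789") means every marked line below must change together.

    # -- UI layer: its own name and its own validation
    def ui_parse(form):
        zip_code = form["zip"]
        if len(zip_code) != 5 or not zip_code.isdigit():  # <- must change
            raise ValueError("bad zip")
        return {"zipCode": zip_code}                      # UI naming style

    # -- Application layer: in-converts from "above", re-validates defensively
    def app_process(ui_record):
        zip_code = ui_record["zipCode"]                   # <- conversion point
        if not zip_code.isdigit():                        # <- redundant check,
            raise ValueError("bad zip from UI")           #    must also change
        return {"zip_code": zip_code}                     # app naming style

    # -- Storage layer: out-converts again to the schema's column
    def store(record):
        # column declared CHAR(5) in the schema      <- must change too
        return f"INSERT INTO addr (postal5) VALUES ('{record['zip_code']}')"

    sql = store(app_process(ui_parse({"zip": "02139"})))
    ```

    Three names for one field, two conversions, two validations, one column width: seven coordinated edits for a single data change.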

    In Software, the Fewer Layers the Better

    I'm hardly the first person to notice this. Why is the Ruby on Rails framework so widely considered to be highly productive? Because it exemplifies the DRY principle, specifically because it eliminates the redundancy and conversion between the application and storage layers. What Rails is all about is defining a field, giving it a name, and then using it for both storage and application purposes! Giving one field a column name in a DBMS schema and a different name in a class attribute definition adds no value. What a concept! (Although far from a new one. Several earlier generations of software had success for similar reasons, for example, PowerBuilder with its DataWindow.)
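    The DRY idea can be shown in miniature. This is not actual Rails, just a toy Python sketch of the principle: define each field exactly once, then derive both the storage schema and the application attributes from that single definition, so a new field is one edit, not three.

    ```python
    # One source of truth for every field: name and storage type, defined once.
    FIELDS = {"name": "TEXT", "zip_code": "TEXT", "amount": "REAL"}

    def ddl(table):
        """Derive the storage schema from the single field definition."""
        cols = ", ".join(f"{name} {sqltype}" for name, sqltype in FIELDS.items())
        return f"CREATE TABLE {table} ({cols})"

    class Record:
        """Derive the application attributes from the same definition."""
        def __init__(self, **values):
            for name in FIELDS:              # same names everywhere: no mapping
                setattr(self, name, values.get(name))

    schema = ddl("orders")
    r = Record(name="Ada", zip_code="02139", amount=9.5)
    ```

    Adding a field means touching the FIELDS dict and nothing else; the column and the attribute appear automatically, under the same name.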

    It's simple: In cakes, more layers is good. In software, more layers is not good.

  • Single Point of Failure: Logical vs. Physical

    People who want to build a highly available computer system tend to focus on eliminating single points of failure. This is a good thing. But we tend to focus only on the physical layer. We don't even notice the single points of failure at the logical layer. Logical single points of failure are just as likely to result in catastrophe as physical ones, and it's high time we started paying attention to them!

    Why? Keeping your system up and running is the most important job of an organization's techies.

    Physical redundancy

    We eliminate single points of failure by having more than one of every component, and a structure that enables the system to keep running with a failed component, while allowing it to be repaired or replaced. For example, here is a typical redundant system diagram (credit Wikipedia).

    [Diagram: redundant system design]
    And here are instructions for replacing a hot-swap drive in a redundant design (credit IBM):

    [Image: hot-swap drive replacement]

    Logical vs. Physical

    We are familiar with the concept of logical and physical in computing. All of computing is built on layers and layers of logical structures. Frequently, we call something a "physical" layer which is actually just the next layer down the stack of logical layers; we call it "physical" only because it is "closer" to the actual physical layer than the layer we call "logical." A good example is in databases, where it is common to have a "physical" database design and a logical (higher level) one; of course, calling it a "physical" design is a joke: there's nothing physical about it.

    Keep eliminating physical single points of failure

    I am not arguing that physical redundancy isn't important or that we should stop eliminating physical single points of failure. When I see people running important computer systems that have single points of failure, I tend to wonder how often the people in charge were dropped on their heads onto concrete sidewalks as children, and how they manage to feed themselves.

    The principle is simple: if you have just one of a thing, and that thing breaks, you're screwed. For example, if you have your database on one physical machine and that machine breaks, no more database.

    What is a logical single point of failure?

    I think people don't pay attention to logical single points of failure because it just isn't something anyone talks about. It's not part of the discourse. Let's change that!

    A prime example of a logical single point of failure is a program. You create physical redundancy by running the program on several machines. Great. That covers machine (physical) failures. What about program (logical) failures? After all, program failures (i.e., bugs) hurt us far more often than machine failures. But somehow, we don't think of a program as a logical single point of failure. We think that the program has bugs, that our QA and testing weren't good enough, and we should re-double our QA efforts. And somehow, miraculously, for the first time in history, create a 100% bug free program. Ha, ha, ha, ha, ha, ha, ha.

    Suppose you have version 3.2 of your program running in your data center. If that program is running on more than one machine, you have eliminated the physical single point of failure. If version 3.2 is the only version of the program that's running in the data center, then version 3.2 is a logical single point of failure. The only way to eliminate it is to have another version of the program also running in the data center!

    Eliminate logical single points of failure, too!

    Smart people already have a data center discipline that eliminates logical single points of failure.

    Suppose there is a new version of Linux you think should be deployed. Do you stop the data center, upgrade all the machines, and start things up again? While that may be the most "efficient" thing to do, from a redundancy point of view, it's completely insane. Smart people just don't do it. They put the new version of Linux on just one of their machines, and see how it goes. If it runs well, they will deploy it to another machine, and eventually all the machines will have it. It's called a "rolling upgrade."
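    The rolling-upgrade discipline is simple enough to sketch. Here is a minimal Python illustration; the deploy and healthy callables are hypothetical stand-ins for whatever real orchestration and health-check machinery a data center uses.

    ```python
    def rolling_upgrade(machines, new_version, deploy, healthy):
        """Upgrade machines one at a time; stop and roll back on first failure."""
        upgraded = []
        for m in machines:
            deploy(m, new_version)
            if not healthy(m):
                deploy(m, "previous")   # roll back the one bad box and stop:
                return upgraded         # the old version is still a running SPOF-breaker
            upgraded.append(m)          # only now move on to the next machine
        return upgraded

    # Toy usage: "deploy" just records versions in a dict, health always passes.
    versions = {}
    machines = ["web1", "web2", "web3"]
    upgraded = rolling_upgrade(machines, "3.3",
                               deploy=lambda m, v: versions.__setitem__(m, v),
                               healthy=lambda m: True)
    ```

    The key property: at every moment before the last step, at least one machine is still running the old, proven version, so the new version is never a logical single point of failure.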

    Same thing with the application. Great web sites change their applications frequently, but using the rolling upgrade discipline, and if they're really smart, with live parallel testing as well.

    Getting into the details of application design, the best people go beyond this method to create another logical layer, so that the things that change most often are stored as "data," in a way that makes changes highly unlikely to bring down the site. A simple example is a content management system, which is nothing more than a way of segregating the parts of a site that change often from those that don't, and keeping the frequently changed parts (the content) in a non-dangerous format.

    Conclusion

    There is little that is more important than keeping your system available to your customers. Eliminating single points of failure is a cornerstone activity in this effort. Many of us are well aware of physical single points of failure, and eliminate them nicely. It's time for more of us to include logical single points of failure in our purview, and to eliminate them with the same vigor and thoroughness that we apply to the physical kind.
