Category: Occam's Razor, Occamality

  • Why is Writing Computer Software Mostly Dysfunctional?

While not much discussed or broadly recognized, the vast majority of efforts to build computer software are disasters. They take too long, cost too much, and result in varying degrees of crap. Lots of solutions have been promoted and tried. None of them have worked.

There are exceptions, of course. The exceptions prove that it really is possible to create good software quickly and efficiently. It is highly unlikely that the methods taught in academia are good but merely get screwed up when applied; instead, it's likely that nearly everyone has it wrong, just like doctors did when they bled patients to cure them and refused to sanitize before operating on patients.

A major cause of the dysfunction can be found in the approach to building software that was taken for good reason in the early days of computing. This approach was necessary for the first decades of computing. As computers grew more powerful, the necessity of doing things the same way began to fade away and finally disappeared. Many of the early cumbersome practices were discarded, but the key focus has remained the core of software development to this day.

What is this universally accepted, unquestioned aspect of building software that does so much harm? Simple: it's obsessing over imperative software language, relegating data definitions and attributes to a necessary annoyance, confined to a tiny island in an obscure corner of the vast sea of procedural language.

Does this sound awful? No. The people who do it don't think they're obsessing – they're just working, writing their code! Similarly, getting water from the local community well didn't sound awful in the 1800s – until people finally found out about water contaminated with diseases like cholera. It took decades for the need for sanitation to be taken seriously, even when the result of doing it poorly was death! We can hope that procedural language obsession will in the future be recognized as the main source of disease in software.

    Early Computing

The roots of the language obsession are in the earliest days of computing. It was one thing to build a machine, and quite another to get it to do what you wanted it to do. The start was plugs and switches. Then the stored program computer was given step-by-step instructions in binary machine language. In the 1950s, first FORTRAN and then COBOL were invented to make the process of creating the precise instructions easier, while still enabling the computer to operate at maximum speed. Those were indeed big advances.

    In the 1960’s it still took a great deal of careful work to get results in a timely manner from computers. While languages like FORTRAN made writing easier, the fact that a compiler translated them to maximum speed machine language made their use acceptable.

    The Apollo space capsule had a custom-built guidance system that was essential to its operation. Here is Margaret Hamilton next to a stack of code she and her team wrote for the Apollo Mission computers.


    The Apollo guidance computer was a fast machine for its day, but the programmers had to get all the power out of it that they could to guide the capsule in real time. This is an extreme example, but 20 years into the computer revolution, everyone focused on using compiled procedural languages to get performance, and assembler language when necessary.

It was already evident that getting programs written quickly and well was incredibly hard. In fact, a big conference – the 1968 NATO Software Engineering Conference – was held to address what was called the "crisis" in software. Nothing got fixed. Meanwhile, efforts continued then and continue to this day to invent new programming languages that would miraculously make the problem go away. Nothing has changed for the better.

    Partial steps towards declarative

    From the early days to the present, there have been isolated efforts to go beyond simple definitions of data for procedural commands to operate on. Generally, the idea is that procedural commands spell out HOW to accomplish a task, while data definitions and attributes define WHAT the task is. It's like having a map (what is there) and directions (how to get from place to place on the map). See this for an explanation.

The invention of the SQL database was a small but important early step in this direction. SQL is all declarative. It is centered around a schema (a set of data definitions organized into tables and columns). The SELECT statement states what you want from the database, but not how to get it. WHAT, not HOW!
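To make WHAT-not-HOW concrete, here is a minimal sketch using Python's built-in sqlite3 module; the table and numbers are invented for illustration. The SELECT declares the result we want and leaves the scanning, grouping and sorting strategy entirely to the database engine.

```python
import sqlite3

# Invented example data: a trivial sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 50.0)])

# WHAT we want: total sales per region. Not a word about HOW to
# scan, group, or sort the rows -- the engine decides all of that.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('East', 150.0), ('West', 250.0)]
```

Adding an index or changing the query plan – the HOW – requires no change to the statement at all.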

    You would think this would have led to a revolution in language (HOW) obsession. It didn't. In fact, because the language obsession stayed in charge, in some ways things got worse.

A few years after the DBMS revolution, people started putting big collections of historic data into what were called data warehouses. The idea was to make reporting easier without impacting production databases. Before long, OLAP (OnLine Analytical Processing) was invented to complement existing OLTP (OnLine Transaction Processing). While there were many differences, the core of OLAP was having a schema definition in the form of a star (star schema), with a central table of transactions and tables related to it containing the attributes (dimensions) of the transactions, typically organized in hierarchies. So there would be a time dimension (days, weeks, months), a location dimension (office, region, state) and others as relevant (department, product, etc.), with measures like sales and profits carried in the central table. After constructing such a thing, it was easy to get things like the change in sales from month to month in the hardware department without writing code.
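A toy version of such a star schema can be sketched with Python's sqlite3; the dimension and fact tables below are invented, but the shape is the classic one: a central fact table of transactions joined to dimension tables of attributes.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE dim_dept (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (date_id INTEGER, dept_id INTEGER, amount REAL);
""")
db.executemany("INSERT INTO dim_date VALUES (?, ?)", [(1, "Jan"), (2, "Feb")])
db.executemany("INSERT INTO dim_dept VALUES (?, ?)",
               [(10, "Hardware"), (11, "Garden")])
db.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
               [(1, 10, 100.0), (2, 10, 130.0), (1, 11, 40.0)])

# Month-over-month sales for the hardware department: declared, not
# coded. Adding a new dimension means adding tables, not rewriting loops.
rows = db.execute("""
    SELECT d.month, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d ON f.date_id = d.date_id
    JOIN dim_dept p ON f.dept_id = p.dept_id
    WHERE p.name = 'Hardware'
    GROUP BY d.month ORDER BY MIN(d.date_id)
""").fetchall()
print(rows)  # [('Jan', 100.0), ('Feb', 130.0)]
```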

    OLAP was and is powerful. It assigned attributes to data in hierarchies, with unchanging programs that made it easy to navigate around. You could add attributes, dimensions, etc. without changing code! What an idea! But the idea was strictly confined to the isolated, distant island of OLAP and had no impact on software as a whole. The procedural language obsession continued without pause.

    Declarative front and center

    Procedural code is necessary. Code is what makes a machine run. However, the time for near-exclusive obsession about procedural code has long since passed. Limitations of computer speed and storage space were a legitimate reason to obsess about using the speed you had optimally. Think about cars. Engineers worked for decades to get the maximum speed of cars from about 10MPH to finally breaking 100. Now it's in the hundreds. In a much shorter period of time, computers have increased in speed by millions of times. Computer speed is rarely an issue.

How can we spend a little of the mountains of excess, unused computer speed to help make creating computer software less dysfunctional? Maybe instead of concentrating on procedural languages, there's another way to get computer software to work quickly and well?

There is a proven path. It's obsessing about WHAT is to be done. Obsessing about the data and everything we know about the data – its attributes. This means applying the fruitful but limited approach we took with OLAP, and extending it as far as possible. In other words, instead of creating a set of directions from every possible starting point to every possible destination, we create a map in the form of metadata and a tiny, rarely-changing direction-generating program that takes as input starting and ending points and generates directions. You know, like those direction-generating programs that are so handy in cars? That's how they work! See: https://blackliszt.com/2020/06/the-map-for-building-optimal-software.html
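The map-and-directions idea fits in a few lines of Python. The road network below is invented; the direction generator is a plain breadth-first search that never changes no matter how the map grows.

```python
from collections import deque

# WHAT: the map, as pure metadata. Add a road here and no code changes.
ROADS = {
    "home":   ["school", "store"],
    "store":  ["office"],
    "school": ["office"],
    "office": [],
}

def directions(start, goal):
    """Generic, rarely-changing route finder over the ROADS metadata."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in ROADS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no route exists

print(directions("home", "office"))  # ['home', 'school', 'office']
```

One small program plus one map answers every possible routing question, instead of a separate set of directions per start/end pair.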

When we do this, we'll advance from the endless complexities of the solar system as described by Ptolemy to the simple, clear and accurate one described by Newton. What Newton did for understanding the movements of the planets, metadata obsession will do for software. See this for more: https://blackliszt.com/2022/10/how-to-improve-software-productivity-and-quality-code-and-metadata.html

    Software development is stuck in the endlessly complex epicycles of Ptolemy; we need to get to Newton.


  • Moving towards Occamal software

    Someday, there will be tools that actively help you build occamal software. I imagine that the tools will resemble a modern IDE, but will have assists, wizards, visual representations and other methods of helping you see actual and potential commonalities. In addition, there will be common components, both part of the development environment and part of the execution environment, which will make building occamal software easy and natural. Until such tools and components exist, the work and imagination of the developer will have to fill the gap. Remembering that we live in the real world, our goal is not to build software that is perfectly occamal; software that is nearly occamal would be wonderful, and software that is much more occamal than today’s software would be a big improvement.

    A good way to think about occamality is the elimination of redundancy. It may be useful to put redundancy into categories that vary by blatancy. So we can identify:

    • Simple redundancy

    Simple redundancy is really, really blatant redundancy. There’s not much excuse for it, although some programming environments are shockingly encouraging of it. A great deal of the Y2K problem was due, sadly enough, to simple redundancy. Most of the relevant programs were written before databases were in common use, and there was in any case no native support for the “date” data type in the programming environments used; in the vast majority of cases, no one bothered to take the simple steps required to create the then-equivalent of a date data type. During the seventies, instead of eliminating simple redundancy, people spent their time engaged in fierce ideological arguments about whether “structured” programming represented an advance in program quality and programmer productivity or whether it was just a bunch of hoo-hah. The extent and depth of the Y2K problem told us the answer.

    Simple redundancy is normally cured by replacing the redundant instances with references to a definition of whatever it is that they have in common.
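A trivial, hypothetical Python illustration of the cure: the date format that would otherwise be repeated at every use site is stated once, and every use refers to the single definition.

```python
from datetime import date

# Single definition: change the format here and every caller follows.
DATE_FORMAT = "%Y-%m-%d"

def format_invoice_date(d: date) -> str:
    # Reference to the shared definition, not a repeated literal.
    return d.strftime(DATE_FORMAT)

def format_report_date(d: date) -> str:
    # Same reference: no second copy of the format string to get stale.
    return d.strftime(DATE_FORMAT)

print(format_invoice_date(date(1999, 12, 31)))  # 1999-12-31
```

Had the Y2K-era programs done the then-equivalent of this with two-digit years, the fix would have been one definition instead of millions of scattered instances.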

    • Complex redundancy

    Complex redundancy takes some real effort and energy to eliminate, though often not a great deal. Complex redundancy involves redundancy over multiple programming environments or some other issue that raises it above the simple. For example, you may display dates on screens and store dates in databases. Is there a truly single place where you define what you know about “date” in general and each date in particular, for both environments, including temporary variables and parameters? If not, you have a case of complex redundancy. Dates are relatively easy these days, because most systems provide some sort of native support for them. So take something more application-specific, which is unlikely to be predefined, like account number, part number, order number or something like that and ask the same questions.

    Complex redundancy is normally cured by shared definitions, but those definitions need to be created and maintained outside of the individual programming environments, and generated into them as needed.

    • Redundancy due to incidental differences

    There are many cases where there are true differences between potentially redundant items, but they are distinctions that don’t really “make a difference.” This kind of incidental difference frequently comes out of “over-determination,” for example, specifying that a certain field should be at this exact X and Y position, when what you really mean is “under” some other field.

    • Redundancy due to over-refinement

    Sometimes differences that are potentially “incidental” spring from definite user requirements. Typical sources of such requirements are highly experienced and sophisticated users, experts in a field, who know how it’s best to do things. They know, for example, that at step 4 of a certain process, they have always wanted the screen to be different in this or that way, to remove or add these buttons or fields, or do something else that makes that screen different from the others. It’s something they feel strongly about. It expresses their knowledge, experience and judgment. Rejecting the requirement can be taken by such people as ignoring their experience, denying their knowledge and deprecating their judgment. In other words, it’s bad. It may be a great idea to incorporate the suggestion – or it may be yet another distinction that definitely has a cost from the beginning to the end of the software project, and in the end makes no real difference. From the point of view of Occamality, the bar has to be set very high for such things.

    • Redundancy due to external software

Another prime place for complex redundancy to show its ugly face is in interfaces. Many interfaces require you to do the same thing over and over and over again; they give you no choice. If there is no way to centralize this and eliminate the repetition (see next section), you're stuck in a highly non-Occamal situation, and there may be little you can do about it.

    • Redundancy imposed by the programming environment

Some programming methods, tools, languages and environments naturally lend themselves to redundancy more than others. A good, simple example is the "Hungarian" convention of naming variables, in which the variable name includes its type; for example, instead of just naming a variable IDENTIFIER, you would name it IDENTIFIER-INT to indicate that it's an integer. The convention came out of a real problem – someone would apply an operation to a variable that was inappropriate for its type; wouldn't it be a nice idea if the name of the variable itself reminded you of the type so you wouldn't make such mistakes? Yes, of course, but then if you need to change the variable's type, either its name misleads you or you have to find every instance of it and change it. While well-intended, it's non-Occamal.

    Once you begin to think about simple things like variable naming, you begin to wonder why in most programming environments, the name of a variable has any significance at all? Why can’t you go to the single place where the variable is defined, and change anything at all about it – including its name?

    The exercise is pretty simple. Anyplace you see repetition of any kind, you have to ask yourself, when I make a change, do I have to find all the places and make the change? The answer is probably yes, and it’s not good.

    • Redundancy to which abstraction can be applied

    If you’re just looking at code, redundancies may not spring out at you – in fact, at the level of the code, there may not be redundancies. But that doesn’t mean your code is Occamal! Sometimes there are common ideas that are expressed with great diversity in the code – or so you realize once you grasp the relevant abstraction. In such cases, it is typical that the people who understood the original problem and the people who wrote the code thought they were dealing with many independent things, and the abstraction that unified them simply never occurred to them.

A good example of this is a collections system I worked on. This was a huge body of code that was used to automate the work of hundreds or thousands of people in call centers whose job was to call people who owed money to an institution, for example a credit card company, and get them to pay. The code had a wide variety of concepts implemented in it that were specific to the collections industry. I came into the situation with a broad background in multiple workflow-type applications, and quickly recognized that collections was 95% call-center-oriented workflow, with a little customization and a few specific features for collections.

    It wasn’t obvious when looking at the code, but if you started with the basic workflow constructs (workstep, queue, conditional routing, etc.) and took it from there, you ended up with a much more compact and easily extensible application. The original application was simply a “collections” application with some parameters; changing anything required finding the relevant places in the code and making the changes. The new application implemented a core set of workflow abstractions, and hardly ever needed to change. It also had a set of easily editable tables that expressed the current state of the workflow, which enabled almost anything to be changed. Finally, it had a small collection of application fragments that were truly specific to collections, things like the “promise to pay” function.
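A heavily simplified sketch, in Python, of the shape the rebuilt system took; every name here is invented. The generic engine walks workflow metadata, the routing lives in an editable table, and only the truly collections-specific steps (like "promise to pay") are code.

```python
# Collections-specific fragments: the small part that is really code.
def call_debtor(case):
    case["called"] = True
    return "answered" if case.get("will_answer") else "no_answer"

def promise_to_pay(case):
    case["promised"] = True
    return "done"

# WHAT: the workflow itself, as an editable table of worksteps and
# conditional routes. Changing the routing changes data, not code.
WORKFLOW = {
    "call":    {"action": call_debtor,
                "routes": {"answered": "promise", "no_answer": "call"}},
    "promise": {"action": promise_to_pay,
                "routes": {"done": None}},
}

def run(case, step="call", max_steps=10):
    """Generic engine: walks the metadata; never changes per application."""
    while step is not None and max_steps > 0:
        outcome = WORKFLOW[step]["action"](case)
        step = WORKFLOW[step]["routes"][outcome]
        max_steps -= 1
    return case

case = run({"will_answer": True})
print(case)  # {'will_answer': True, 'called': True, 'promised': True}
```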

  • Occamality: the problem with layers, components and objects

Modern software orthodoxy endorses the notion of collections of code that are separated by high walls. Object-oriented thinking codifies the supposed virtue of data hiding, and keeping all the code that works on a particular block of data (a class) behind a wall. Everyone seems to like the idea of components; people will talk in terms of assembling applications out of component building blocks (when has this ever happened except in seminar rooms?). Finally, layers or tiers are supposed to introduce discipline to an application. The idea is you have the user interface (top) tier; then you have the application or business logic layer; and finally you have the database layer. Frequently, these are designed and built by separate groups, each a specialist in its domain – do you really want those superficial, flashy, image-obsessed UI people messing around with your transaction data – do you??? Of course not. But you also don't want the groups to "hold each other up," hence the idea of having the layers be strictly separated, so the UI people can do their thing without being held up by the others.

This is the theory. Lots of people practice it, or at least say they do. If you've never experienced anything else, it makes sense. Just the other day I met a software company chief scientist, a PhD in something or other, who explained to me that they were converting some scripts from Perl to Java because Java was a "real" language, and the scripting languages – all of them! – were not. I didn't get a real explanation of why this was so; he took it as self-evident to all educated and reasonable people that it was, and by questioning it I really tried his patience, since I was apparently uneducated and unreasonable…

    Components, layers, microservices and objects can be good things in limited circumstances, but it’s really, really important to see that if you accept that the prime measure of goodness of a piece of software is occamality for the reasons already discussed, in general, components, objects and layers reduce occamality to the extent they are applied.

    The core problem with these things is that they result in multiple definitions of what amounts to the same data. The theory actually encourages this. Take a simple data element such as “account number.” In any program involving accounts, this piece of data will be all over the place. If you’ve got layers, it will be in all of them, because surely it will appear on screens, business logic and the database. If you’ve got objects, it will be in the interfaces and internal implementation of many of them – think, for example, of “account master” and “transaction.” If you think you’re being clever and have built yourself a message passing architecture, it will be in the message definitions – lots of them.

    Naturally, you assume and hope that the definition of account number doesn’t change, and in most cases it won’t, in which case the wild redundancy won’t hurt you much. But problems come most often from things you didn’t think about – what happens if there’s a merger, and suddenly there are a whole set of accounts you’d like to bring into your software, and the new account number definitions are totally incompatible. What if they’re bigger and there’s no way to cram them into the existing scheme? You desperately cast around for translation schemes, but in the end you accept the inevitable: you’re hosed.

    Now think about this: what if there was a central place where everything you know about account number was defined? Suppose the central place had everything that all the layers needed, for example label for the UI and foreign key relationships for the database. You could go to this single place and take care of most of the account number issues. I assume the program isn’t perfectly Occamal, and there are a few other places you have to go for everything to work. So what? You’re way better off than you otherwise would be.
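Here is a hypothetical sketch of such a central place in Python. The field names and attributes are invented; the point is that each layer derives what it needs from one definition, so most account-number changes are made once.

```python
# One central definition of everything we know about "account number".
ACCOUNT_NUMBER = {
    "name": "account_number",
    "label": "Account Number",   # consumed by the UI layer
    "sql_type": "VARCHAR(12)",   # consumed by the database layer
    "max_len": 12,               # consumed by business-logic validation
}

def ddl_column(meta):
    """Database layer: derive the column DDL from the shared definition."""
    return f"{meta['name']} {meta['sql_type']}"

def ui_label(meta):
    """UI layer: derive the on-screen label from the same definition."""
    return meta["label"] + ":"

def validate(meta, value):
    """Business layer: the length rule, stated exactly once."""
    return len(value) <= meta["max_len"]

print(ddl_column(ACCOUNT_NUMBER))  # account_number VARCHAR(12)
print(ui_label(ACCOUNT_NUMBER))    # Account Number:
```

When the merger arrives and account numbers grow to 20 characters, you change one dictionary entry instead of hunting through screens, objects, and message definitions.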

The painful, anti-orthodox but blindingly obvious conclusion is this:

    • Each “layer” you put into your software increases its complexity and time to construct, and increases the time and risk to make changes to it.
    • The more you use classes and other forms of object-orientation in your code, the more redundancy you introduce, and thus the harder you make your code to construct and change.
      • You may think – before you’ve had lots of experience – that the data hiding of classes increases the “componentization” of your code, and thus makes it easier to change without trouble. Sadly, your thinking on this subject needs revision. Parameters and calls weave your objects together and make them do useful things, and the greater their interaction, the greater the redundancy.

    There will always be people who love components, objects and the rest, just like there will always be people who insist on having half a dozen gin-on-the-rocks before supper to whet their appetites. Words don't work. Logic is irrelevant. They should stay in university or wallow around in some giant software bureaucracy where they'll have plenty of company.

  • Achieving Occamality through definitions: case study

    Quite a few years ago I had the problem of creating a product that would help printers create estimates for potential printing jobs. We had one of the early micro-computers at our disposal, and the only programming tool that was available for it was a macro assembler. We had to get the product out in a ridiculously short period of time, and were very limited in the amount of memory we could use.

We got together and realized that we had a fairly simple problem. The ultimate goal was to create printed estimates. Each estimate was calculated based on a combination of data that was unique to the job (the size of the paper, the number of colors, the number of copies, the type of paper, etc.) and data that was common to most jobs, but could be changed at will (the amount to charge for different kinds of paper, press setup and per-copy charges, etc.). We also had to save estimates and create new versions with changed parameters.

    Here’s what we did. We broke the whole problem down into seven overlapping problem domains. We created a set of macros for each domain. The parameters of each macro contained attributes relevant to the domain. When assembled, each macro would deposit coded data in memory – no instructions. For example, we had a set of macros for input, another for printed estimates, and another for the core variables relevant to estimating. Macros could refer to other macros, so we could eliminate redundancy and keep the memory requirements as small as possible.

Each parameter in a macro had to correspond to or do something, so we definitely wrote code, but the code we wrote was highly abstract and was roughly proportional to the extent of the macro parameter definitions. When we added a new macro or a parameter to an existing macro, we would add or modify a couple of lines of assembler code.

    The actual functionality of the program was created by a small amount of code that rarely needed to be touched and was pretty easy to write and debug, and a relatively larger amount of macro calls that deposited meta-data in memory. The instructions walked through the meta-data and created the behavior the user saw. We spent time at the beginning understanding the nature of the problem, defining macros and writing code. As the project went on, we spent less time with code and definitions and more time with writing and modifying macro calls. As time went on, the users got more and more stuff, and we were able to change what they didn’t like quickly and without introducing further errors or side effects.
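A modern analogue of this structure can be sketched in Python (the original was macro assembler, and all rates and rule names below are invented): the pricing rules are metadata that a small, rarely-changing engine walks. Amounts are in integer cents to keep the arithmetic exact.

```python
# Data common to most jobs, changeable at will (all values invented, cents).
RATES = {"paper_per_sheet": 5, "setup": 4000, "per_copy": 2}

# The "macros": each estimate line is a record deposited as data, not
# a hand-written routine. Adding a line item means adding a record.
ESTIMATE_RULES = [
    ("paper", lambda job, r: job["sheets"] * r["paper_per_sheet"]),
    ("setup", lambda job, r: r["setup"]),
    ("run",   lambda job, r: job["copies"] * r["per_copy"]),
]

def estimate(job, rates=RATES):
    """The rarely-changing engine: walk the rule metadata and total it."""
    lines = {name: rule(job, rates) for name, rule in ESTIMATE_RULES}
    lines["total"] = sum(lines.values())
    return lines

print(estimate({"sheets": 1000, "copies": 500}))
# {'paper': 5000, 'setup': 4000, 'run': 1000, 'total': 10000}
```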

    We lost a good deal of time because we picked a computer that was too early in its lifecycle. It had hardware errors, and the macro assembler that was so important to us was buggy. The operating system was primitive, and we even had to write our own file system. In assembler. With a flakey machine and a buggy macro pre-processor.

The main competitor at the time had a system they sold for roughly $25,000. Ours was faster, easier to use, had far more functions, and could be sold at great margins for $10,000. The project was taken on by me and three totally awesome geeks. We delivered the product on the date that was fixed at the start of the project, a date that was based on nothing except when it would be nice to have the product. It proved to be nearly bug-free when delivered to customers; one customer found one bug after a couple of months, and I was able to fix it in an hour. Time from start to finish: ten weeks.

  • Achieving Occamality: What not How

    I don’t think I can improve on Chris Date’s formulation of the issue. His basic point is that most computer programs solve problems by taking an imperative approach, telling the computer how to accomplish a given task. He argues strongly in favor of a declarative approach, telling the computer what needs to be accomplished, and having a core set of application-independent functions that accomplish the goal in an optimal way.

Date’s favorite domain is databases. His approach works for databases via a program that knows about schemas (tables and columns), data (rows) and SQL statements (INSERT, SELECT, etc.). If you can fit your problem of representing (the schema) and manipulating (the SQL) your data in the way databases expect – and an amazingly wide variety of problems can be represented in this way – then you get to define, load, manipulate and access your data without writing any code at all. Since the one body of DBMS code ends up solving a huge number of problems, this approach is highly Occamal.

There are other well-known examples. Spreadsheets are a good one. If you’ve got a spreadsheet-type problem, you can get your needs met quickly and well. You can even write formulas and small programs to perform custom calculations.

    While databases and spreadsheets are widely understood “horizontal” applications, the approach is applicable to nearly any domain. In fact we often take an approach of this kind when writing applications anyway! If you look at a body of code, there will probably be some code that actually performs an application-specific function that users, for example, would think is meaningful to them. Then there’s all the “other” code that you had to write to enable you to deliver the application value. What’s unusual is for programmers to go “all the way,” and make a clean separation between domain-related abstract functions and application-specific declarations.

The notion of "what" not "how" is a far-reaching one. For example, most of criminal law is "what" you're not supposed to do — don't kill anyone! The law doesn't spell out "how" you're supposed to avoid killing — just don't do it! By contrast, most regulations are written as "how" type rules. For example, instead of just saying "the medical device has to work correctly," typical "how" type regulations spell out in gruesome detail how you must accomplish this; and better ways are not allowed! See this for more detail.

  • Achieving Occamality through definitions

    This is a deep subject, and challenges the way many programmers think. It is rooted in the most fundamental assumptions about the way computers work and the way we program them. In my experience, the ideas first strike people as being simple, obvious, and uninteresting. They seem irrelevant to anything important, and even seem like far-out crank talk.

    One way to approach this is to imagine that you stop a computer and examine its memory, byte by byte. At one level, it’s all data. By definition, what’s stored in a computer’s memory is data. But in practical terms, every single byte falls into one of two categories: (1) “plain” data, and (2) instructions. Instructions are placed or loaded into a computer’s memory like any other data, but unlike plain data, the computer can be “pointed” to a starting address and told to start executing the data that is there, and good things will result if the data conform to the rules for instructions.

    Achieving Occamality through definitions essentially amounts to dividing the computer memory into three categories: in addition to instructions and “plain” data, you add “meta-data.” If you wrote a program with just data and instructions, you would have a certain amount of space devoted to each. If you write a program using meta-data, the data stays pretty much the same, but the space devoted to instructions typically shrinks by a great deal, and you have a good deal of space devoted to meta-data.
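A tiny, hypothetical illustration of the shift: the same discount policy written first as instructions, then as meta-data plus one generic lookup. The behavior is identical, but in the second form the rules have moved out of the instruction space and into data.

```python
# Instruction-heavy version: the rules live entirely in code.
def discount_coded(customer_type):
    if customer_type == "retail":
        return 0
    elif customer_type == "wholesale":
        return 10
    elif customer_type == "partner":
        return 20
    return 0

# Meta-data version: the rules become editable data, and the
# instructions shrink to one generic lookup that never changes.
DISCOUNTS = {"retail": 0, "wholesale": 10, "partner": 20}

def discount_metadata(customer_type):
    return DISCOUNTS.get(customer_type, 0)

# Same behavior; adding a tier now means editing data, not code.
assert discount_coded("partner") == discount_metadata("partner") == 20
print(discount_metadata("wholesale"))  # 10
```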

    It is important to note that the meta-data should not be in the form of tokens, and while the instructions in some sense “interpret” the meta-data, the meta-data should not primarily be a directive, imperative program.

In practical reality, what you do is create a descriptive, declarative language that is as close to the problem domain as possible. There should be a minimum of translation between the “natural” way of thinking about the problem and the way the problem is expressed in the language. A good example is a navigation system as described here – the meta-data describes the map in natural terms. Done right, there is probably a one-to-one relation between elements on a visual map and elements in the map description language.

    Of course, even a highly declarative approach has a directive aspect to it. My intention here is to emphasize what is usually ignored, not to present an either/or. For example, while a program that gives directions mostly consists of the map meta-data, the direction-generator itself can usually be built in a partly imperative and partly declarative fashion, and needs some plain old parameters, things like whether to avoid toll roads or interstate highways.

    This is not a left-field approach to programming. In fact, every method of programming computers, with the arguable exception of “raw” assembler language, starts with a “model” of the kind of program you are going to write, and a choice of how to think about that program.

One of the earliest approaches to rising above raw assembler was the language FORTRAN, short for “formula translation,” derived from the fact that its creators were scientists and engineers who wanted to put their mathematical formulas into the machine for solution.

The business programmers who wanted to use the machine for common business record-keeping functions didn’t find FORTRAN particularly helpful. So COBOL, an acronym for Common Business-Oriented Language, got invented.

While it’s reasonable to think about FORTRAN, COBOL and the various languages that succeeded them in terms of language, and this has often been done, it is also reasonable to think: When I sit down with language X, what kind of problem does it help (or hurt) me to think about? How much “translation” is required between the natural terms of thinking about the problem and the language? Do the very terms of the language talk about things I think about? If not, you probably have an opportunity to create a definition domain in which to express your problem, and write a much shorter program than you would otherwise need to write to get the job done.

    I’m happy to say that the approach I advocate here is very close to what many people call “model-based” programming. It is ironic that this is so hard to describe, instead of just being mainstream common sense. When you think about a body of code that solves a problem, it is normally possible to code exactly once each abstract function that the code performs many times, and then reduce the things it does uniquely to a set of meta-data. Most programmers are pretty comfortable taking this approach when defining the schema for a database-style problem. The model-based approach is really little more than taking the concept of a database schema and greatly generalizing it.
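    As a small illustration of generalizing the schema idea beyond the database (the field names and rules here are invented), the per-field validation code that would otherwise be written many times over is coded exactly once, and everything unique about each field is reduced to meta-data:

```python
# Hypothetical field meta-data, generalizing a database schema: each
# "thought" about a field (type, length, required) is stated exactly once.
FIELDS = {
    "name":    {"type": str, "max_len": 40,  "required": True},
    "zip":     {"type": str, "max_len": 10,  "required": True},
    "comment": {"type": str, "max_len": 200, "required": False},
}

def validate(record):
    """One generic routine replaces per-field validation code."""
    errors = []
    for field, spec in FIELDS.items():
        value = record.get(field)
        if value is None:
            if spec["required"]:
                errors.append(f"{field}: missing")
            continue
        if not isinstance(value, spec["type"]):
            errors.append(f"{field}: wrong type")
        elif len(value) > spec["max_len"]:
            errors.append(f"{field}: too long")
    return errors

assert validate({"name": "Ada Lovelace", "zip": "12345"}) == []
assert validate({"zip": "12345"}) == ["name: missing"]
```

    Adding a field, or tightening a rule, is a change to the meta-data alone.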

  • Case Study: Replacing metadata with fashionable software

    A VC firm I worked for made a (sadly, in retrospect) passive, non-controlling investment in an emerging PLM company that was all over model-based, declarative development. They didn’t use silly, made-up names like “Occamal” to describe what they did, but they were fully self-conscious of the power of meta-data, and used it fully. The CTO chose Microsoft as the target technology stack, and ran into the fact that most programmers who specialize in this stack found the meta-data approach to be a foreign one. He overcame the problem by getting his programming done by a group of mathematically-oriented programmers in Kiev, encountering a phenomenon I have encountered many times, namely, that people who think mathematically find it natural to think in the abstractions demanded by Occamality.

    This company, AA, had both the problems and the achievements many companies have at their early stage. It had accomplished amazing things in a remarkably short period of time with a small team. It had created a distinctive product that could be customized with an order of magnitude less effort than competing products. It had some really happy customers.

    AA also had some unhappy customers. It had made some promises it could not keep, and its code had some really embarrassing failures in the field. It had gotten speed and innovation pretty well down, but quality and reliability remained works-in-progress.

    AA accepted a major investment from a couple of brand-name venture capital firms. Each of the firms had name-brand partners on the investment. They surveyed the customers, and discovered the quality problems. So, naturally, you would think they focused on fixing the quality problems, making sure to keep the overall approach that was the key to the company’s value. Makes sense, yes? What else would experienced software-industry investors do? Well, they could do what they actually did – which was hire a big-name CEO, who brought in his own VP Engineering. The new VP glanced at the product, decided it was no good because it wasn’t written in a proper language, viz. java, and put the existing product “on ice” while his huge new team re-wrote everything in industrial-strength java – which of course, as everyone knows, is a language so magical that all you need to do is write in it and what results is fast, bug-free, infinitely scalable programs.

    I came in at this point and raised a strong protest. I defended the model-driven vision and most of its execution. I pleaded for spending the new money to fix what was broken – the quality and release process – and for continuing to nurture the goose that was already laying gold nuggets and clearly gearing up to laying golden eggs. It was like I was from Mars, and not the advanced-civilization Mars of science fiction, but a hick Mars in which the most visionary thinkers could hope there was a Stone Age in their future. They tolerated my presence with curled lip and up-turned nose, and went back to whatever they thought they were doing before I intruded on them.

    A year and a half later, all the money was spent and the re-write of the application was “nearly” ready to ship (a status it maintained for an amazingly long period of time). Some of the programmers were found to have played with visual basic at home, which accounted for the fact that the magic that normally makes java bug-free had been compromised by exposure to impurities. Worst of all, customers were telling the CEO that they wouldn’t upgrade to the new version even if it were perfect, because it lacked the easy adaptability of the original, model-based application. The CEO did what any sensible, experienced executive would have done under the circumstances – blamed the founder for his flawed product and market vision, flaws so profound that even a total re-write was unable to cure them, and sold himself into another job, where they were glad to land such a seasoned executive with such practical, down-to-earth judgment.

    The investors, left to pick up the pieces, had no choice but to return control to the founder/CTO, who, amazingly, had stuck around while the new team was spewing all the money into the garbage dump while dumping on his vision and execution. He hired a VP Engineering, a Russian with many years of hands-on US-based software experience, who fully appreciated the vision, and was able to get the newly re-hired, largely Slavic software team on board with making quality an equal partner with speed and innovation. The quality problems got fixed, customers were re-enchanted with the product, and AA enjoyed a second chance at success. (A few inessential facts have been changed to protect the guilty.)

    Occamality is the best possible software architecture, but it ranges from unknown to miles beyond left field. Successful, experienced software professionals are highly likely to dismiss it out of hand in favor of whatever the cool thing is at the moment. To be successful with Occamality, not only do the engineers need to understand it but the executive management team has to understand that the approach is the keystone of the company's technical advantage over the competition.

  • Understanding Occam-optimality practically

    Here are some basic thoughts to help understand Occam’s razor for software as it is applied in practice:

    • Build only the software you need to build in order to be successful.
    • With rare exceptions, shorter programs are better than longer programs. The goal of programming should be to produce the minimum number of lines of code needed to accomplish the job.
    • Always keep the “parts count” of your program in mind, and make it your goal to keep the number of “unique parts” to a minimum, with each “thought” expressed exactly once, with everything that “uses” that thought generated from the primary expression of the thought. See this for a practical example.
    • The best days are ones that reduce the total number of lines of code while increasing the functionality of the program. A day spent doing nothing but writing new code may very well be a day that reduces the Occamality of your program; every added line of code adds to the later burden of testing, documenting, learning, using, maintaining and ultimately changing the program, thus making it worse than it needed to be in nearly every respect.
    • In situations where there are multiple paths to a software goal, pick the shortest one that involves the least amount of work. Don’t plan for future change; if you write the shortest possible program, future change will be optimally prepared.
    • Ignore common practices if they produce redundancy, and thus violate the razor. For example, multi-tier design frequently results in huge redundancy of data definitions. A multi-tier execution environment is fine, so long as the tiers are generated from and share common definitions, so that the program at the source code level is Occam-optimal.
    • Frequently, software is complex because the requirements are needlessly extensive and complex. Reducing and/or standardizing the requirements is a key part of Occamal software.
    • Apply Occamal principles to the whole process involving the software, from requirements through to training and support. Cases that seem marginal when looking at the software in isolation often become clear when taken in full context.
    • Copy, use, ignore or emulate everything that is not uniquely required to be successful.
    • Avoid over-determination, i.e., putting in things that sound good or say more than you mean to say. For example, there is frequently no mechanism to deliver the results of error checks in a database to the people or programs that need them, and they are therefore useless. Again, some people use screen design programs that specify design elements in far more detail than they really mean, therefore saying too much.
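    Several of the points above – minimum lines, minimum unique parts, each thought expressed exactly once – can be seen in a toy Python example (the report names and fields are invented): three near-identical functions collapse into one primary expression plus meta-data.

```python
# Before: three near-duplicate "parts", each re-stating the same
# loop-and-sum thought with a trivial variation.
def total_sales(rows):    return sum(r["sales"] for r in rows)
def total_returns(rows):  return sum(r["returns"] for r in rows)
def total_shipping(rows): return sum(r["shipping"] for r in rows)

# After: the thought is expressed once; the variations become data.
REPORT_COLUMNS = ["sales", "returns", "shipping"]

def totals(rows):
    return {col: sum(r[col] for r in rows) for col in REPORT_COLUMNS}

rows = [{"sales": 10, "returns": 1, "shipping": 2},
        {"sales": 20, "returns": 0, "shipping": 3}]
assert totals(rows) == {"sales": 30, "returns": 1, "shipping": 5}
assert totals(rows)["sales"] == total_sales(rows)
```

    A fourth report column is now one entry in a list, not a fourth function to write, test and document.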

    A great deal could be written about this. For an explanation of the "post-hoc design" approach to Occamality, see this. It's a way of speeding the way to a goal.

  • Understanding Occam-optimality technically

    I am not aware of a set of terms I can use to make a technically clear definition of “Occam optimality” for software, so I’ll just re-use some existing ones. I suspect there’s an exact, more mathematical way to do this. I’m hoping someone will pull it off. My goal here is to express the most basic relevant concepts in rough terms.

    By “program” I mean the entire source text that is required to build the executable program. If the program uses a DBMS, the schema definition is part of the program. If there are “include” files or resource files, they are part of the program. If the program is a multi-tier one with an ASP GUI, a component-based application and a database layer with stored procedures, it’s all part of the program. If there is information that is part of the program that is not immediately available as text, then for these purposes we convert it to text. While we normally think of these things as being in totally separate categories, for these purposes, we will construct a single source text that has everything required to build the program.

    Here’s a tricky, vague but essential definition. Without this definition, an occamal program would be just a version of the original compressed by mechanical means, kind of like a compiler optimizer. The “intention” of a program is its true goals, without technical specification or constraint. The “expression” of a program is a “program” as defined above, i.e., completely determined by its text. A single program intention can have a very large number of possible expressions. The purpose of the normal program design process is, at the very early stages, to clarify the intention, but most of the process is designed to narrow the field of all possible expressions down to a small range. The final determination of program expression is, of course, made by the coders.

    The effort of creating Occamality comes during the process of turning “intention” into “expression.” Of all the evaluation criteria that have been used to drive and evaluate this process, and there have been many, the occamal criterion says: express only what has been intended (don’t over-determine), and express what has been intended in the least possible number of “semantic units” possible, with no redundancy among them.

    I use the awkward phrase “semantic unit” to indicate a programmer’s thought, and to avoid getting caught up in one language vs. another, the number of characters involved in the keywords and syntax, comments, the length of labels and variable names, etc. A semantic unit is any combination of code, data and meta-data that is required to express a thought. Semantic units can be primary, composite, and have any number of connections and relationships among them. Good definition and use of connections and relationships are how redundancy is eliminated, along with careful attention to avoid “distinctions without a difference” of substance.

    Semantic units can have “primary” information in them and/or “reference” information. “Primary” information in a semantic unit is an actual definition or program statement, for example a data type or a conditional expression. “Reference” information is a reference to one or more other semantic units. For example, you would define that a date is a month, day, and year, with these data types, lengths, labels, compute rules, etc. This would all be “primary” information about date. When you defined “birth date,” its most important characteristic would be the “reference” information that it is a type of date. A semantic unit could easily contain a mixture of primary and reference information. For example, “birth date” would have as primary information its unique label and the fact that it is a separate piece of information; everything else about “birth date” would be a reference to “date.” The reference should be a single reference to the “date” semantic unit.

    References, as defined in this broad sense, are the key to creating occamal programs. Every reference is something that could have been given a separate expression, which might have been literally identical, but more likely would have had the same intention but differ in trivial detail as expressed. In the example of the Y2K problem, each reference to a central date definition replaces a separate definition of date. Without references, all the separate instances of date would have to be found and changed; with references to a single central definition, all that needs to change is the central definition, and the references cause the change to be rippled to all uses of date.
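    A minimal sketch of these ideas in Python (the representation is invented for illustration): primary definitions live once in a central table, each field carries only its unique label plus a single reference to its type, and a resolver ripples the central definition to every use – so the Y2K fix becomes a one-line change.

```python
# Primary information lives exactly once in TYPES; fields carry only
# their own label plus a "reference" to the type they share.
TYPES = {
    "date": {"parts": ["month", "day", "year"],
             "year_digits": 4},   # was 2 before the Y2K fix
}

ELEMENTS = {
    # Primary information: the unique label.
    # Reference information: the single reference to "date".
    "birth_date": {"type": "date"},
    "hire_date":  {"type": "date"},
}

def element_spec(name):
    """Resolve a field's reference into its full definition."""
    spec = dict(ELEMENTS[name])
    spec.update(TYPES[spec.pop("type")])
    return spec

# Changing TYPES["date"] ripples through the references to every use.
assert element_spec("birth_date")["year_digits"] == 4
assert element_spec("hire_date")["parts"] == ["month", "day", "year"]
```
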

    A “program” is Occam-optimal when it has the minimum possible number of “semantic units” in its expression, and when there is no redundancy among the semantic units, while still fairly expressing the original intention of the program.

    Obviously, this only scratches the surface of the technical side of Occam-optimality. I hope it’s enough to convey the general idea.

  • Occam-optimality applies to all stages of the software life cycle

    It’s natural that Occam-optimality is mostly focused on software. But it applies to everything from requirements to architecture and design, QA and testing, documentation, support and the endless round of changes software tends to have.

    Requirements occamality

    Are Occamal principles just being applied by the programmers who code up the application? If they are, that’s good, but there are steps in the software cycle before the programmers start their work, and if those steps are conducted without Occamality in mind, even if the programmers write a perfectly Occamal program, there will end up being many more lines of code than there could have been. This is easiest to see in the requirements process. Suppose one group writes up requirements for an on-line credit card application and another group writes up requirements for an on-line application for a term loan. The requirements are likely to be similar because the same kind of data needs to be collected in both cases. But what if one group takes a very matter-of-fact approach, while the other emphasizes fancy graphics, visuals and user interaction? Once those requirements come to the programming group, unless that group is amazingly perceptive and aggressive, the two sets of requirements will be treated as wholly unrelated projects. In other words, business as usual.

    Imagine, on the other hand, that the people thinking about requirements are familiar with Occamality and want to apply it. Rather than starting from scratch, they may look for places where the data they need is already being collected – maybe this new project is an opportunity to unify some existing code, and bring it into the Occamal world. In any case, the two groups will decide that they’re really one group with a couple of minor differences between the two applications, and they’ll also realize that the styles of user interaction will change over time, so they’ll take some effort to separate that out from the actual data gathering. The net effect is likely to be a set of requirements that are “pre-occamized,” that should not take much transforming for the software group to implement in an Occamal way.

    If software requirements aren’t Occam-optimal, the software may be occamal in some abstract way, given the requirements, but the requirements themselves may demand a solution that feels like there are all sorts of special widgets and do-hickeys that could just as well be dispensed with. In other words, Occamality is best achieved if it is a conscious goal from the very start of the requirements process to the end.

    Here are some reasons why it’s important to think Occamally in the requirements:

    • Redundant requirements or ones with trivial variations normally result in needless redundancy in all the following steps.
    • A set of requirements that are similar in principle but different in detail are frequently over-defined and over-refined, creating work that delivers no value.
    • People who create requirements frequently know a great deal about their subject and have thought about it long and hard – and the more they know and think, the more extensive their requirements are likely to be, which is often a problem.

    Software requirements people frequently resemble the washing machine designers I have described who optimize what they do over a very narrow domain. The result is screens and functions that are highly customized to the job for which they’re intended. Every bit of customization adds that much more to what has to be programmed, tested, documented, taught, learned, used and maintained. This is typically a very large cost for an assumed benefit that often isn’t even there, because people find uniform programs easier to learn and use.

    The consequences of applying Occamality as broadly as possible in the requirements will roll down the line of testing, documentation, roll-out, training and support, saving time and money each step of the way.

    Software scope occamality

    If you look at 100 lines of code, you may be able to find some redundancy, but probably not too much. If you look at 1,000 lines, 10,000 lines or 100,000 lines, there are likely to be whole gobs of redundancy.

    When you cross the line from one officially-designated “program” to another, you greatly increase the odds of finding redundancy. Within a single program or project, there is only so much you’re likely to find. But particularly within a single business, the same things come up over and over, and get solved over and over again, each time in slightly different ways.

    Things begin to get interesting when you cross traditional software boundaries, like from user interface to application programming to network to database. In each case, there are frequently specialists who do things in their own way. If, as is likely, they have caught the “object orientation” bug, they each attempt to create their own little worlds of nicely designed classes/objects. But when you look at the project as a whole, and you think of something like “account number,” think of the number of times and places and ways that concept has been defined. Even within one of those domains, like the application, an account number data element will be defined as part of each class, and will appear every time account number crosses to or from the database, and again every time account number crosses to or from the user interface. You’ll find that “account number” is all over the place! If you have any question about this, just try to change the data type, label and error checking for it, and see how many places you have to touch to make it work.
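    Here is a sketch of the alternative, with an invented single definition of “account number” in Python: every tier derives its own artifact – the database DDL, the UI label, the validation rule – from the one semantic unit, so changing the data type or checking rule touches exactly one place.

```python
import re

# Hypothetical single definition of "account number"; every tier
# derives its artifact from this one semantic unit instead of
# re-defining the concept in its own little world.
ACCOUNT_NUMBER = {
    "label": "Account Number",
    "sql_type": "CHAR(10)",
    "pattern": r"^\d{10}$",
}

def ddl_column(name, spec):
    """Database tier: generate the column definition."""
    return f"{name} {spec['sql_type']} NOT NULL"

def ui_label(spec):
    """User-interface tier: generate the on-screen label."""
    return spec["label"] + ":"

def is_valid(spec, value):
    """Application tier: generate the error check."""
    return re.match(spec["pattern"], value) is not None

assert ddl_column("account_number", ACCOUNT_NUMBER) == \
    "account_number CHAR(10) NOT NULL"
assert ui_label(ACCOUNT_NUMBER) == "Account Number:"
assert is_valid(ACCOUNT_NUMBER, "0123456789")
assert not is_valid(ACCOUNT_NUMBER, "12345")
```

    Widening account numbers to twelve digits means editing `ACCOUNT_NUMBER` alone, instead of hunting through every tier.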

    Software architecture Occamality

    Occamal thinking is crucial in the architecture and design of the software:

    • Achieving Occamality frequently involves introducing a strong but appropriate amount of abstraction into the program design. Simply translating requirements with minimal abstraction into objects, schema designs, etc. typically leads to huge amounts of redundancy.
    • If you just focus on the text of the code, you may find little redundancy. But you may also be able to achieve substantial code reductions by taking a whole different approach.

    Software QA Occamality

    The usual approach to QA is highly redundant and sub-optimal. When QA automation is performed in the normal way, it is highly redundant, and leads to a system that doesn’t have better quality but is even harder to change. Typical script-based automation ends up having the effect of copying variables already present in the program into a different environment (the QA one), and creating logic that reflects and is derived from the base program. This approach to quality automation clearly reduces the quality of the overall program, and certainly reduces the occamality of the overall software effort. The ideal approach is highly automated, fully supports occamal software, and indeed is occamal itself. See this for more details.
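    One way such an ideal can look, sketched in Python with invented field rules: instead of QA scripts that re-state every field, boundary test cases are generated from the program's own meta-data, so a new field automatically brings its tests with it.

```python
# The program's own field meta-data (invented for illustration); both
# the production check and the tests are driven from this one place.
RULES = {
    "zip":   {"max_len": 10},
    "phone": {"max_len": 15},
}

def check_length(field, value):
    """The production check, driven by the shared meta-data."""
    return len(value) <= RULES[field]["max_len"]

def generated_tests():
    """Derive a boundary pass case and a fail case per field,
    instead of hand-scripting each one."""
    for field, spec in RULES.items():
        yield field, "x" * spec["max_len"], True           # at the limit
        yield field, "x" * (spec["max_len"] + 1), False    # one past it

for field, value, expected in generated_tests():
    assert check_length(field, value) == expected
```
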

    Later stage Occamality

    Post-coding software steps benefit indirectly from smaller code and less redundancy, but in addition, Occamal thinking can benefit them directly.

    • Occamal thinking applies to documentation and help text. The arguments about redundancy making documentation more difficult and error-prone to change apply fully.
    • Let’s not forget the users! Chances are excellent that additional code is the direct result of input from some users, but most users will suffer as a result. Every concept, field and command you eliminate is one more thing users don’t need to worry their little heads about.

    It is important to focus on Occamality in the code, but extending it to encompass the entire life cycle of software is what delivers the big benefits. Post-hoc design as described here is a practical approach to accomplish Occamality in waves, without going too crazy trying to do everything perfectly.

  • Occamality and other design principles

    People use and defend the use of all sorts of methods and tools for building software. The number and range of methods and tools is overwhelming. Is Occam optimality yet another entrant to be added to the already-too-long list? No. I hope that the notion of Occam optimality will be understood to be independent of all those methods and principles, and will provide a way to both improve each of them and, in some cases, select among them.

    If the notion of Occam-optimality is correct, it stands above and cuts across all methods of building software. It is a principle that enables you to take two efforts to build a program in the same environment and tools, and say objectively which is better.

    It is certainly true that some tools and methods make it easier to achieve Occam-optimality than others. But it is important to understand that Occam-optimality is not primarily about judging tools and methods – it is a notion of optimality that can be applied given any selection of tools and methods. Programs can be written in assembler language (assuming there’s a good macro pre-processor) that are Occam-optimal, and in fact it is easy to understand the concept in that environment, since everything is laid bare.

    If most current ways of thinking about programs are like algebra, then the analysis and understanding of Occam-optimality is like calculus. Just as calculus doesn’t replace or criticize algebra but is built on it and extends it, so is Occamal analysis built on existing methods and tools of program construction. Just as calculus provides tools that didn’t exist before, like methods for calculating the area under a curve or the zero or minima of a curve, so does occamal analysis enable us to understand what “optimal” really means for most programs, and gives us guidelines and a way of thinking to enable us to construct programs that approach occamality.

    Unlike the myriad algebra-like methods of program design and construction, an occamal program that meets the business intentions is also very likely to be optimal in terms of all the other goals that business people have: it will be as efficient as possible to define, build, test, document, deploy, learn and support. Most important, it will be possible to modify and enhance it more quickly, safely and inexpensively than any other kind of program. To the extent that the program fails to meet important expectations on any other grounds, e.g. performance, availability or attractiveness, a program that is occamal lends itself to alteration to meet those expectations better than a program which is sub-occamal.

    Once this criterion is understood and accepted, our tools and methods will evolve rapidly to achieve the maximum possible productivity in software design, construction, support and evolution.

    Focusing on the occamality of programs is a way of focusing on what we most care about in software. It doesn’t eliminate our other concerns – we still care that programs perform reasonably, that databases avoid corrupting data, etc. But an amazing number of those other concerns will be taken care of by-the-way if our programs are occamal, and if they’re not, we know that occamal programs are almost by definition the easiest to change to address any need or fill any gap.

    I worked with a company having horrible problems with the administrative workstation of their storage product. It took a long time to load and many seconds to respond to reasonable-seeming requests. Its user interface was inconsistent and it took a very long time to update to reflect changes and new features in the underlying product of which it was a part. Looking under the covers, there turned out to be more than 200,000 lines of java code, and it was really hard to say what useful function large chunks of it were performing. Java was selected because the people thought it was modern, object-oriented, and would let them build re-useable components and other good things. Then they just started writing code, and here we were.

    There were lots of arguments, pro and con, about whether java was any good, all of which were beside the point – wrong discussion. After a great deal of struggle, the whole thing was gutted and replaced with less than 20,000 lines of code and definitions. The people doing the work did not concentrate on performance, changeability, or other architectural concerns – their main goal was to make it small. When they were done, and they did it quickly, they had a complete replacement, a consistent user interface, and it magically became really fast and easy to change!

  • Occamality in Software History

    Occamal concepts have naturally emerged in software efforts from nearly the beginning of software, and limited formulations of Occamal concepts have been promoted and valued in software. The purpose of this section is to identify those early expressions for what they are; they show the widespread, ever-springing nature of the desire to come to some principle that will yield optimal programs.

    Software tools

    One of the earliest pure-software efforts, the work to build the early language FORTRAN, was clearly a step in an Occamal direction. By creating a machine-independent language in the first place, programmers could concentrate on expressing their programs in a single language, FORTRAN, instead of the multiple machine languages that came to exist. Since FORTRAN is closer to the problem domain of most scientific programming, it usually takes fewer statements to express a program in FORTRAN than in assembly language; therefore, writing in FORTRAN eliminates redundancy and is thus better in Occamal terms. The relationship between the statements of the language and the machine language is centralized in one place, the compiler. The compiler’s run-time library was just as important in this respect – it provided a set of commonly used functions that only needed to be written once, and then used whenever needed. In the early days of FORTRAN, for example, machines typically did not provide native support for floating point operations. The programmer could write his floating-point formulas and calculations without thinking about the machine, knowing that if the machine his program eventually ran on had no support for floating point, the compiler would generate the right calls and the run-time library would do the work.

    All computer languages take programs in an Occamal direction. Just as FORTRAN abstracted and centralized the floating point operations commonly needed in scientific computing, so did COBOL abstract and centralize processing records of data and the BCD arithmetic needed for financial calculations.

    Similarly, the concept of subroutine and shared subroutine libraries arose very early. One of the important functions performed by early software societies (IBM’s SHARE, the ACM and IEEE) was to collect, refine and standardize libraries of functions widely used by groups of people. Early software companies owed their existence to the time and money it took to build major functions, and built a business on selling many copies of software that was expensive to build, and less expensive for the user to buy than build for himself.

    The use of macros is clearly driven by Occamal goals. The idea and motivation of a macro is simple: when there are repeating examples of a string in the text of your program that you suspect you may want to change in the future, you define the text in a macro and use the macro instead of the text itself. When the need for a change arises, you go to the single macro definition, change it, and you’re good to go, regardless of how extensively the macro is used.
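    The same define-once idea can be shown in Python terms (Python has no textual macros, so a shared constant stands in for the macro definition; the names are invented):

```python
# The "macro definition": the repeated fragment is stated exactly once.
COMPANY = "Acme Corp"

def invoice_header(number):
    return f"{COMPANY} -- Invoice #{number}"

def shipping_label(order):
    return f"Ship from {COMPANY}, order {order}"

# Renaming the company means editing COMPANY alone; every use follows,
# regardless of how extensively the "macro" is used.
assert invoice_header(7) == "Acme Corp -- Invoice #7"
assert shipping_label(42) == "Ship from Acme Corp, order 42"
```
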

    The concepts of inheritance, templates and component re-use are clearly Occamal in nature. The idea is you write the common parts of a function in a master class or template, and then sub-class or apply the template to handle special cases or variations. You have the master class or template which gives you the single place to go to make changes, while benefiting from only needing to spell out the variation in each particular application. Using these things tends to reduce the number of lines of code and reduce the redundancy, and is therefore Occamal.
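    A small Python sketch of the inheritance case (class and field names are invented): the master class expresses the common rendering logic once, and the subclass spells out only its variation.

```python
# The master class: the common behaviour lives exactly once.
class Report:
    title = "Report"

    def render(self, rows):
        body = "\n".join(self.format_row(r) for r in rows)
        return f"{self.title}\n{body}"

    def format_row(self, row):
        return str(row)            # the single default

# The subclass: only the variation is spelled out.
class SalesReport(Report):
    title = "Sales"

    def format_row(self, row):     # the only unique part
        return f"{row['region']}: {row['amount']}"

assert SalesReport().render([{"region": "East", "amount": 10}]) == \
    "Sales\nEast: 10"
```

    A change to the common rendering logic is made once, in the master class, and every report follows.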

    Software applications

    Occamality can be clearly seen in the evolution of software products.

    Over time, software efforts in a particular field naturally tend to the Occamal, model-based ideal, because such bodies of code are the easiest to maintain and enhance, while providing maximum flexibility to their users. Early attempts at building Customer Relationship Management (CRM) systems, for example, tended to be bodies of code written in some supposedly easy-to-change 4GL. Every CRM system needs customization. When one of these early CRM systems was installed, the source code would be opened up, programmers would hack away, and sometimes an application vaguely appropriate to the customer’s needs would emerge stumbling from the dust. The awfulness of this approach quickly became evident as customers found that upgrading to the vendor’s latest release meant re-entering the coding war zone and having to fight a battle on two fronts, i.e., their customizations of the old release and the vendor’s “improvements” to the old release in the form of the “upgrade.”

    The vendors who survived this nightmare were the ones who, through natural selection and the survival of the fittest, minimized the damage to their customers through the installation and customization process, not unlike the way parasites evolve to avoid killing their hosts. Invariably, this meant increasing degrees of Occamality (not “perfect” Occamality, mind you, just “increasing degrees” of Occamality), nearly always expressed as a more model and template-based approach to application definition.

    Engineering product lifecycle management systems, usually called PLM systems, are an example of a set of software that has long since evolved to be more Occamal than not, simply because engineering products and processes are so highly individualized that one of the main functions of modern PLM products is to make it easy to express product and process definitions without requiring modification of the source code of the product.

Here is further explanation of the evolution towards increasing Occamality in software applications.

    These examples make it clear that Occamality is a thread that weaves through the software industry and underlies many of its developments, although the people who created and promoted the developments generally did not think of them in these terms.

     

  • Occamality in Databases

    The accepted practice of database schema design is a good application of Occamal principles. In fact, in a broad sense, Occam optimality takes the concepts accepted in schema design and applies them to programs and the software lifecycle as a whole.

Someone at a bank would write a system for automating checking accounts. Naturally, the account information would include the name and address of the person who owns the account. Someone else would write a system for automating savings accounts. Naturally, the account information would be essentially the same; it may be the same because someone knew all about the checking system and copied it, or it might be the same because of “convergent evolution.” The account holder moves, and somehow tells the bank. The person at the bank has to look everywhere the person’s address might be stored and change it there. They might miss a place, or they may enter something incorrectly. This places a burden on the bank, makes extra work, is a source of errors, and may lead to customer dissatisfaction. It’s far better to have a single place where a person is identified at the bank, and where the things we know about that person (name, address, phone number, etc.) are stored. In the database world, this is the concept of “normalization,” which essentially means: store each piece of unique data exactly once.

    The database world also has the concept of “reference” in a couple of ways. If the person actually sets up a checking account, the checking account system needs the account information for the customer. This is done by identifying each account by a unique identifier, known as a “primary key.” The primary key is associated with the master copy of the account-holder’s information. Then, in each place where the customer uses a bank service, for example a checking account, a “foreign key” to the account information is placed. This is actually a copy of the primary key. So if you are customer 123, your primary key is 123, and the number 123 is also made the foreign key for your savings account, your checking account, etc. Calling it a “foreign key” says that it’s a reference to the one place where the information is stored, and not the master copy itself.
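The bank example above can be sketched concretely. Here is a minimal illustration using Python's built-in sqlite3 module; the table and column names are hypothetical, chosen only for this example.

```python
# Normalization plus primary/foreign keys: the account holder's data is
# stored exactly once, and each service refers to it by key.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# The master copy of the account holder's information, keyed by a primary key.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")

# Each bank service carries only a foreign key -- a reference, not a copy of the data.
cur.execute("CREATE TABLE checking (acct INTEGER PRIMARY KEY, "
            "customer_id INTEGER REFERENCES customers(id), balance REAL)")
cur.execute("CREATE TABLE savings  (acct INTEGER PRIMARY KEY, "
            "customer_id INTEGER REFERENCES customers(id), balance REAL)")

cur.execute("INSERT INTO customers VALUES (123, 'Pat Smith', '1 Elm St')")
cur.execute("INSERT INTO checking VALUES (1, 123, 500.0)")
cur.execute("INSERT INTO savings  VALUES (1, 123, 900.0)")

# The customer moves: one UPDATE in one place, and every service sees it.
cur.execute("UPDATE customers SET address = '2 Oak Ave' WHERE id = 123")

row = cur.execute(
    "SELECT c.address FROM checking ch JOIN customers c ON ch.customer_id = c.id"
).fetchone()
print(row[0])  # 2 Oak Ave
```

Because the address lives in exactly one row, there is no second place to miss when it changes.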

    The database also uses references to eliminate redundancy in data types. When you need multiple fields that have to be distinct but actually represent the same kind of data, you define the type information once in something called a “domain,” and then each use of the domain is actually a reference to the master definition. If you change the domain, all the uses of the domain automatically change.

    Databases represent Occamal principles in another important respect: statements in the data manipulation language (today, that means statements in SQL, for example SELECT data FROM tables WHERE conditions) are limited to what needs to be done, not how it is to be done. For example, if there is a JOIN between two tables, which should be used first? There is an excellent answer to that question, for example if one table is much smaller than the other, use the smaller table, etc. By abstracting how the join is implemented from the fact that you want a join, you get to define the how of joins in exactly one place, while requesting joins in many places. In earlier data manipulation languages, you could get the job done, but you would explicitly go first to the table you thought best to start from, and then go to the next. The trouble is, this puts the knowledge of how to navigate tables to get information in many places! This is bad because everyone has to learn it and apply it correctly, and if you want to change the method because you’ve figured out a better way to do it, you have to go through all the DML in the program and make individual decisions about how and what to change. Separating what we want from how to get it was definitely an advance, because now SQL could benefit from ever-improving execution routines without having to be changed! The point here is that exactly those benefits are the ones we would expect, because the change also made all programs that use SQL closer to being Occam-optimal!

It is true that many database implementations are Occam-suboptimal when considered in isolation. Because DBMS schemas are typically completely isolated from the programs that use them, programs that use databases (taking the database schema and stored procedures to be part of the program) are typically far from Occam-optimal. This is recognized in the Rails framework of the Ruby language, which unifies the data definitions of the database and the program; this is the reason why programming is so much more efficient when using Rails. Apart from Rails, it is wonderful to have modern databases as an example of an island of software in which Occamal principles are accepted and valued, and the benefits widely enjoyed. By understanding the examples of good database practice in terms of Occam optimality, we can get an idea of how the principle can be extended to the rest of software.

     

  • Why should you build Occamal Programs?

It takes a good deal of background and analysis to understand just how the concept of Occamality applies to the practical details of building programs. But before we get to that, perhaps it would be good to review the benefits we can expect. The core benefit of paring down a program to its bare minimum information content is pretty simple:

    To the extent that any semantic concept is repeated anywhere, in any form, in the specification of the program, it is a redundancy that, if discarded, would reduce the cost of implementing the concept, the cost of one or more of the downstream components of the program, and the cost of subsequent modifications.

In other words, an Occamal program is quicker to build than any other program that does the same thing, and is overwhelmingly the easiest to change, even though you did not think you were building “flexibility” into the program!

Let’s review briefly the lifecycle of software. At the beginning of the process for a whole new program or a change to an existing one, there is the need to create or change embodied in a set of requirements. From there, with variations depending on the programming shops’ methodology, we have roughly specifications (business, functional, high level, low level, etc.), design (various levels of focus and detail), code (perhaps with prototypes, code walk-throughs, etc.), test (various levels, sometimes performed by different groups of people using different tools), document (internal and external), train (the people who operate, administer, install, support and use), learn and use the software. This sequence may be linear or it may have iterative cycles. Once the program is in operation, there is an extended support cycle, with support, maintenance and bug fixing. There is typically a demand for new or altered features; these go through a version of a similar chain from requirements through building and use. It is generally recognized that (1) most efforts to build new programs crash and burn prior to roll-out, and (2) most of the money that is spent on a program is spent during its extended “tail,” namely the support and modify cycle. The difficulty of building brand-new programs increases the extent to which existing programs are modified, sometimes through multiple technology cycles.

Today, because Occamality is not a familiar concept, it may take longer in practice to build programs that are Occamal. Even when it is understood, redundancy frequently does not cost much to build into a program. In fact, it may be quicker to build a program with extensive redundancy (think copy and modify) than with less. But every superficial gain from redundancy is paid for, over and over again. Once we’re past the learning curve, building Occamal programs should cost about the same as non-Occamal ones (ideally, they would cost less to build), and the majority of the cost of software ownership should drop dramatically. The savings would mostly be due to the lower cost and lower risk of testing, documenting, learning, using, maintaining, and changing programs.

    What exactly is it that makes programs hard to change? Once you understand basically what you have to do, it comes down to finding all the places in the program that will be affected by your change. The side-effects are always the toughest.

    What if the change you wanted to make could be accomplished by changing exactly one place in the program? What if, to take a trivial example, you wanted to change the classic

    Print (“Hello, world”);

to print “bubba” instead of “world”? This would be easy, because you could just go to the single program line and make the change.

    Now let’s take a well-known horrible example – the Y2K problem. This was the problem of modifying programs so they would continue to work when the year changed from 1999 to 2000. This was a problem because many programs did not use 4 digits to represent the year. For various reasons, usually involving an attempt to save space or time, year was represented as a two digit number in many programs, for example 99, with the leading two digits, 19, understood. If those programs had a single place where the number of digits in a year was represented, solving the Y2K problem would have been trivial – you would just go to each program, find the place where year was defined, change it, re-compile, check for problems, and you’re done. A quick change, low risk, no big deal. So Y2K was a problem not because of anything difficult or mysterious about dates – in fact, it could have been about any similar program change. Y2K was a problem simply and solely because the knowledge of the number of digits in the year of dates was expressed in many ways in many places, and it took a long time to find and fix them all with a substantial risk that some places were missed.

    I completely admit, and I remember because I participated in the madness, that there were bizarre variations that made Y2K particularly gruesome. One example is “overloading,” which means that sometimes a programmer would use the value of “99” in the year field to mean “no date was given.” Y2K was a problem before the year 2000 because some programmers used “9999” (which could mean September 9, 1999) to mean “no more records in this sequence.” But all such cases just multiplied the basic fact: the number of digits in a year was expressed in many ways in many places, and in some of those places, other values or indicators were also stored.

So we want to have exactly one place in which the number of digits in a date is defined. Notice that doesn’t mean we are limited to a single date. The key concept here is distinguishing between definition and use. Let’s add another level to make it clear. We want to define date in one place. Then we define, say, a few dates: birth date, marriage date, date of loan start, date of loan end. Each of these would be defined as types of dates. “Date” would have how many digits are in a year, and “birth date” would be a kind of date. We would define with “date” those things that are true of all dates, like number of digits in the year. We would define in “birth date” only those things that were true of “birth date” in particular, for example, the label to use when displaying it.
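The definition/use distinction can be sketched in a few lines of Python. The names (DATE_SPEC, BIRTH_DATE, etc.) are hypothetical, invented for this illustration.

```python
# Everything true of ALL dates lives here, exactly once -- including the
# fateful number of digits in a year.
DATE_SPEC = {"year_digits": 4, "format": "{y:0{yd}d}-{m:02d}-{d:02d}"}

# Each particular kind of date adds only what is true of it alone.
BIRTH_DATE = {**DATE_SPEC, "label": "Date of birth"}
LOAN_START = {**DATE_SPEC, "label": "Loan start date"}

def render(spec, y, m, d):
    # Every use refers back to the single definition.
    return spec["format"].format(y=y, yd=spec["year_digits"], m=m, d=d)

print(render(BIRTH_DATE, 1999, 12, 31))  # 1999-12-31
```

A Y2K-style change, such as widening the year, would be one edit to DATE_SPEC, and every kind of date would inherit it.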

    This is an elementary concept as I have illustrated it so far. What is different is the extent of the application of the concept that I propose. While it is (I hope) common practice for dates to be defined in this way within a program, it is certainly not common practice for all dates used anywhere in a program for any purpose to be defined in this way, as we saw big time in Y2K. For example, we have one way of defining dates in the database, another one for local storage in application programs, and yet another one for display and reports. While it is reasonable for there to be aspects of dates that some parts of programs care about that others don’t, for example the display label, it is not reasonable that all aspects that are common be defined redundantly – they should be defined exactly once.

More importantly, this concept applies to all aspects of the software lifecycle, not just the building phase. It even applies to documentation, training and using programs, because it means that there is less to document, train and learn, and when changes are made, the user’s common-sense notion that “something changed” actually applies – for example, if zip code is changed from 5 numbers to 5 numbers optionally followed by 4 numbers, that change is universal. The change, made in a single place, ripples through screens, applications, temporary variable definitions and databases.
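The zip code example can be sketched as follows; the names are hypothetical, and in a real system the screens, import jobs, and reports would all call the one validator rather than each embedding their own pattern.

```python
# The single master definition: 5 digits, optionally followed by 4.
import re

ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")

def valid_zip(s):
    # Used by every screen, import job, and report -- change ZIP_PATTERN
    # once and the change ripples everywhere.
    return bool(ZIP_PATTERN.fullmatch(s))

print(valid_zip("02139"))       # True
print(valid_zip("02139-4307"))  # True
print(valid_zip("2139"))        # False
```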

    Occamality directly addresses a lament frequently heard from programmers, about not being given enough time to do a job “right,” to design it for future flexibility. One of the many reasons good managers refuse to support a “good, flexible design” is that you always pay the price of the additional effort, and you rarely actually experience the benefit. That is simply because the designer tries to anticipate the nature of the potential future change, and make provisions for it. But of course, what usually happens is the anticipated change doesn’t happen and something else does, and the fact that the program is longer than it could have been because of the additional code that provides for un-needed flexibility makes the unanticipated change harder to make than it had to be.

    Now that we understand what makes change hard – the fact that what we want to change is expressed in many ways in many places, all of which have to be found, understood, and correctly changed – we can design programs that are both as quick as possible to build and as quick to change in the future, regardless of the new requirement; we know that if we need to change something, there’s always one place to go to make the change.

    The answer to why you should build Occamal programs is simple:

    Occamal programs cost less to specify, design and build (once you are past the learning and acceptance curves), cost much less in the later stages of the software lifecycle where most of the costs are incurred, and enable changes and other forms of support and maintenance to be made with the greatest speed and the lowest cost and risk. And this is true regardless of language, operating system, application environment, and anything else about your software environment and tools.

     

  • How do you know if a given piece of software is good?

    Software has a real problem. Let me explain.

    While a huge gulf separates the novices from the experts in every field, I like to think that the widespread simple knowledge in most fields is like writing as taught to elementary school students. Fifth graders use the same alphabet that I use; while my vocabulary is more extensive and my use of grammar more elaborate, in both cases what I do extends and builds upon what the kids do. Fifth graders don’t need to unlearn the “bad” letters when they get to high school. Books for children can be well-written, and understood and enjoyed by both children and adults. Books for adults are supersets of children’s books in terms of vocabulary, sentence structure and experience. And while professional writers may not make extensive use of notes written on 3 by 5 cards the way English teachers used to make kids do in high school, the principles of organization are the same.

In software, however, the doctrines and received methods taught and practiced in the grades and, worse, among mainstream professionals, are simply inadequate to support doing a really good job; the best people don’t extend them, they ignore them and in various ways violate their principles. The typical methods for performing the quality assurance and testing functions, for example, are counter-productive. They don’t need correction – they need complete replacement. They focus on the wrong issues, they have the wrong concepts, and the most widely used methods and tools simply don’t get the job done! The same is true of the relationship between programs and data, and of the distinctions between by-reference and by-value, implicit and explicit. Other books in this series go into detail on a couple of these subjects.

    Things are so bad in software that the most experienced and accomplished people don’t even agree on what makes a program “good.” Given a set of programs that don’t crash and meet a set of requirements, how do you rank them in order of goodness? Is goodness proportional to how deep the inheritance trees are? How well documented they are? To what extent the evil “go to” mechanism is used? What is the test coverage, or the number of unit tests? How few machine resources they consume? Whether they’ve been written in this language or that?

    I propose that there is, with a few common-sense qualifications, a measure of software goodness we can all agree to. I suggest this isn’t a new idea, but one that many people have sensed and acted on. The principle even makes common sense! The only thing it lacks is articulation and discussion. Here is a non-technical explanation of it. Here is one of my attempts to state the principle in a historic and more technical context. Here is a deeper explanation of the context and background.

    This is not an abstract, nice-to-have concern. Unless you know what “goodness” is in a program, how can you measure whether you’re going in the right direction? Think about target shooting: without a target, how can you decide how close you are, and therefore how accurate a shooter you are? How can you compare two different shots?

    Having an explicit agreement of what makes a program “good” is indispensable to making good software, and making it effectively and efficiently.

  • Occam’s Razor: the key to optimal software development

    William of Occam (ca. 1285 to 1349) was an English logician and Franciscan friar. He is credited with formulating a principle that has been applied to various aspects of computer systems. In those areas of computing to which it has been applied, it reigns supreme – it supplies optimal solutions to the relevant problems.

    There are large areas of computing to which Occam’s razor has not been applied. Worse, it is not even one of the candidates under consideration. As a result, those aspects of computing are fractured, inefficient, unpredictable, and driven by fashion and politics.

    Everyone involved knows that the whole process of specifying, designing, building, testing and supporting software is hopelessly inefficient, unpredictable and error-prone. The leading methodology that concentrates on the process of building software, “project management,” is theoretically bankrupt and in any case has an empirical track record of failure. In terms of the content of good software, there are fierce battles among competing approaches, none of which is anything but a collection of unsupported and unfounded assertions, and which in practice don’t contribute to building good software.

Occam’s razor leads to the principles on which good software may be built, and supplies a single simple, widely applicable theme that, when applied, cuts away (as a razor should) all the inefficiency and generally what we don’t like about software. Once the principle is understood and widely applied, it should be the undisputed standard for how software is built, just as it is in the other areas of computing to which it has been applied.

    Here is a shorter attempt to explain the Razor and its application to software. Here is an approach to the same problem from a common-sense layman's point of view.

    The razor itself

    “Occam’s razor” is a famous principle of thinking from the European Middle Ages. Occam’s razor is applied in situations where there is more than one reasonable explanation for a phenomenon, and basically says that you should pick the simplest explanation consistent with the phenomena you observe. From Wikipedia:

Leonardo da Vinci (1452–1519) lived after Occam's time and had a variant of Occam's razor. His variant short-circuits the need for sophistication by equating it to simplicity.

    Simplicity is the ultimate sophistication.

    Occam's Razor is now usually stated as follows:

    Of two equivalent theories or explanations, all other things being equal, the simpler one is to be preferred.

    As this is ambiguous, Isaac Newton's version may be better:

    We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.

    In the spirit of Occam's Razor itself, the rule is sometimes stated as:

    The simplest explanation is usually the best.

    Other applications of Occam’s razor

While it plays little overt role in intellectual debates, in fact Occam’s razor and concepts derived from it are central to scientific thinking. So while you may not have ever heard of it, or may only vaguely remember having heard it in the past, you shouldn’t think that it’s an obscure little Medieval tidbit, with no relevance to the present, that this guy is just pulling out for some weird reason. If you go through the Wikipedia reference, you find that:

    Occam's Razor has become a basic perspective for those who follow the scientific method. … without the principle of Occam's Razor science does not exist. The primary activity of science, formulating theories and selecting the most promising theory based on analysis of collected evidence, is not possible without some method of selecting between theories which do fit the evidence. This is because, for every set of data, there are an infinite number of theories which are consistent with those data (this is known as the Underdetermination Problem).

    You can find Occam’s razor in detail in astronomy, physics, biology, and medicine. It even appears explicitly in statistics.

    There are various papers in scholarly journals deriving versions of Occam's Razor from probability theory and applying it in statistical inference, and also of various criteria for penalizing complexity in statistical inference. Recent papers have suggested a connection between Occam's Razor and Kolmogorov complexity.

    It should be evident that Occam’s razor is an important underpinning of the whole scientific enterprise.

    Applications of Occam’s razor in computing

    We already understand Occam’s razor clearly in information theory, in which we measure the information content of a transmission. Information theory has provided the theoretical foundation of all communications and data transmission since its invention by Claude Shannon in 1948.

    Occam’s razor is particularly applied in minimum message length (MML) theory, which focuses on the least number of bits required to encode a given amount of information. 

    MML has been in use since 1968. MML coding schemes have been developed for several distributions, and many kinds of machine learners including: unsupervised classification, decision trees and graphs, DNA sequences, Bayesian networks, Neural networks (one-layer only so far), image compression, image and function segmentation, etc.

While the terminology is not normally used, and I am not aware that Occam’s razor was explicitly used to formulate it, it is clear that its principles are expressed in modern relational database theory, and in the practice of schema design. I suggest that this is one of the reasons for the success of the DBMS approach to data storage and access.

    Shannon’s information theory

    Shannon’s application of the concept to information theory provides a good springboard to seeing how it applies to software design. So let’s understand information theory a little better. I don’t think I can do better than Wikipedia. Here is a snapshot of the article on the history of information theory:

Claude E. Shannon (1916–2001) founded information theory with his classic paper "A Mathematical Theory of Communication," published in the Bell System Technical Journal in July and October of 1948. At the beginning of his paper, Shannon asserted that "The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point." His theory for the first time considered communication as a rigorously stated mathematical problem in statistics and gave communications engineers a way to determine the capacity of a communication channel in terms of the common currency of bits. This problem is called the channel coding problem. The transmission part of the theory is not concerned with the meaning (semantics) of the message conveyed.

    A second set of ideas in information theory relates to data compression. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data. There are two formulations for the compression problem — in lossless data compression the data must be reconstructed exactly, whereas lossy data compression examines how many bits are needed to reconstruct the data to within a specified fidelity level. This fidelity level is measured by a function called a distortion function. In information theory this is called rate distortion theory. Both lossless and lossy source codes produce bits at the output which can be used as the inputs to the channel codes mentioned above.

    This division of information theory into compression and transmission is justified by the information transmission theorems, or source-channel separation theorems that justify the use of bits as the universal currency for information in many contexts. …

    Communications channels have a fixed capacity for sending information per unit of time. To use that capacity as efficiently as possible, you distinguish between the message that is presented for sending and the actual information content of that message. What this amounts to is saying that the “information content” of a message is the smallest number of bits required to exactly reproduce the message after transmission. In plain language, the “information content” of a message is the message compressed as much as possible, with all repeats and redundancy removed, except for redundancy purposely introduced for error identification and correction.
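A quick way to see "information content as the fully compressed message" is to compress a highly redundant message. The sketch below uses Python's standard zlib module; compressed length is only a rough proxy for information content, but the effect is dramatic.

```python
# A redundant message carries far less information than its length suggests:
# compression strips the repeats and leaves something close to the content.
import zlib

redundant = b"Hello, world. " * 1000   # 14,000 bytes, highly repetitive
packed = zlib.compress(redundant)

# The compressed form is a tiny fraction of the original length.
print(len(redundant), len(packed))
```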

Like any application of Occam’s razor, the focus on information content comes out of knowing what you care about; you include everything required to get what you care about and throw out everything else. What I care about in communications is accuracy and efficiency. Efficiency means that I should transmit as much information as possible through a communications channel of given capacity. This can be achieved by using the smallest possible number of bits to encode the message.

    The application of Information theory to software

How is all this relevant to building software? Aren’t we supposed to use structured design, use cases, components, or whatever to design software? Most of those techniques focus on the process of producing a design, or on a supposed ideal structure for a program. If we apply the same thinking that we see in information theory to software design, we will focus instead on the results of the process, and once we are satisfied we know how to judge whether a computer program is optimal or not, we will be able to find ways to build it.

    Software specification and design has clearly been an island of technical art, separated from science and technology as a whole, except for those portions that have already come under the sway of “Occamal” thinking such as information and MML theory. Like any relatively isolated island of theory and practice, a wide variety of techniques and practices arise to fill the gap left by a completely baseless, ad-hoc approach. There are innumerable approaches to software design and building and no clear way to decide among them.

    When designing software, we care most about things that are a very close analogy to what we care about in communications. In communications, we have a channel of given capacity, and want to squeeze the most information through it we can; therefore, we eliminate all redundancy from the original message, and transform it to contain only its pure information content. Anything extra would take transmission capacity and add nothing. In software design, we normally have capacity to build a representation of the program (source code) that is as long as we would like in terms of the capacity of the computer to execute the code. However, every line of code adds a burden to one or more stages of the whole chain of specify, design, code, test, document, train, learn, use, maintain, modify, and repeat. This chain of effort, the entire lifecycle of the program and everyone who touches it, is like the communications channel in information theory. The idea is that most program specifications are highly redundant, like messages in their original form. We want to define the “information content” of programs just like we define the information content of messages, so that every redundant or repeating group is eliminated, and what is left is everything required to make the program operate and absolutely nothing else. We focus on information content in communications because we want to make the best use of fixed communications capacity; we focus on information content in software because we want to make the best use of all the resources that are involved with the software in any way.

    We don't just want to make programs short for general reasons. We also want to make them easy to change, with as little effort and error as possible. By eliminating all redundancy from a program, we make it so that there's exactly one place to go to make a change.

    Getting back to Occam (though I admit Shannon and fancy formulas are more impressive and intimidating), if Occam’s razor is:

    Entia non sunt multiplicanda praeter necessitatem.

    No more things should be presumed to exist than are absolutely necessary.

    Occam’s razor applied to software would be:

    No more software entities should be created than are absolutely necessary.

    Why? Just as sending bits beyond the information content of the message has a cost but doesn’t improve the message that is received, so do additional entities in the specification or expression of the program add to the cost to build and the cost to change without increasing the value of the program.

    This may not sound dramatic or exciting. But as someone who has lived with the process of building software for a long time, and has struggled extensively with the question of what makes software “good,” it’s very satisfying to have an objective criterion that enables you to judge the “goodness” of a program, and to say when a program is “optimally” good. It gets exciting when you realize that this abstract notion of optimality has consequences that are extremely practical and down-to-earth. In particular, it shows you a path to building programs more quickly than using any other method; doing less to build the programs than you thought you had to do; and having the resulting programs be as easy and safe to change as it is possible for programs to be.

  • Summary: Occamality and Software Architecture

    This is a summary of my posts on the single most important principle of software: Occamality (non-redundancy). This principle applies to everything from the simple concept of what makes a piece of software "good" to developing and evolving software quickly, efficiently and with high quality. It drives good software architecture and all other aspects of development, from requirements through QA. 

    This summary also includes my posts on software architecture, mostly explaining why widely accepted architectures like microservices are terrible.

    To start, it's worth pointing out that software people don't know what makes a piece of software "good."

    https://blackliszt.com/2023/09/how-do-you-know-if-a-given-piece-of-software-is-good.html

    Software people often have strong thoughts about software languages and architecture. However, it is extremely rare for those opinions to be grounded in or related to the goals of software architecture. What are the goals? Here’s my proposal.

    https://blackliszt.com/2022/05/the-goals-of-software-architecture.html

    Here are the key specific things you do to accomplish the goals.

    https://blackliszt.com/2020/02/how-to-build-applications-that-can-be-changed-quickly.html

    Here is a layman's, common-sense explanation of the same idea:

    https://blackliszt.com/2022/10/how-to-improve-software-productivity-and-quality-common-sense-approach.html

    Here is a specific explanation of why Occamal programs are better than non-Occamal ones and why you should care.

    https://blackliszt.com/2023/09/why-should-you-build-occamal-programs.html

    How do you apply Occamality in practice? Here is a short, simple list of the practical things you do.

    https://blackliszt.com/2023/09/understanding-occam-optimality-practically.html

    Occamality isn't another entrant in the myriad of design principles competing for attention in the chaotic world of software — it's an overriding principle, one that stands above and ranks all the contenders.

    https://blackliszt.com/2023/09/occamality-and-other-design-principles.html

    Occamality isn't confined to writing software. It applies to all stages of the development lifecycle, from requirements through QA and support.

    https://blackliszt.com/2023/09/occam-optimality-applies-to-all-stages-of-the-software-life-cycle.html

    The value of reducing redundancy isn't confined to software; it's a general principle.

    https://blackliszt.com/2020/05/lessons-for-better-software-from-washing-machine-design.html

    Saying that you should reduce redundancy in a program sounds simple, but once you get past trivial examples, it's not. Here's an analysis of the increasingly sophisticated kinds of redundancy in programs that should be addressed.

    https://blackliszt.com/2023/10/moving-towards-occamal-software.html

    Reducing redundancy is accomplished by taking a declarative approach to programming instead of a purely imperative one. There are many examples of this.

    https://blackliszt.com/2021/07/software-programming-languages-the-declarative-core-of-functional-languages.html

    Databases are an excellent, proven example of applying the principle of Occamality.

    https://blackliszt.com/2023/09/occamality-in-databases.html

    It's not just databases — Occamality is a thread that weaves through much of the history of software.

    https://blackliszt.com/2023/09/occamality-in-software-history.html

    You might think that reducing redundancy is an obviously valuable thing to do. The trouble is, modern software orthodoxy endorses the notion of collections of code that are separated by high walls (components, services, layers, objects, etc.), which typically leads to huge amounts of redundancy.

    https://blackliszt.com/2023/09/occamality-the-problem-with-layers-components-and-objects.html

    There is a simple idea that shows the basic approach to eliminating redundancy in programs: instead of stating how a thing should be accomplished, you concentrate on defining what is to be accomplished.

    https://blackliszt.com/2023/09/achieving-occamality-what-not-how.html

    The optimal way to reduce redundancy includes recognizing that in addition to instructions and data, programs include varying amounts of metadata. Metadata is an easy concept for those who use it, but many programmers don't get past the idea of parameters. Here's a way to understand metadata:

    https://blackliszt.com/2023/09/achieving-occamality-through-definitions.html

    In a broader context, here's how metadata fits into a whole program, as the third dimension of software architecture.

    https://blackliszt.com/2020/02/the-three-dimensions-of-software-architecture-goodness.html

    For all dimensions, lack of redundancy is the main virtue. As a rule, the more functionality is expressed in metadata and the less in code, the better.

    A focus on metadata is similar to having a generic direction-generating program that refers to an easily-changed map.

    https://blackliszt.com/2020/06/the-map-for-building-optimal-software.html

    Here’s more theoretical depth on the role of metadata in a software system, with a comparison to theories of the solar system.

    https://blackliszt.com/2022/10/how-to-improve-software-productivity-and-quality-code-and-metadata.html

    Why put as much application knowledge into metadata as possible? It's the easiest thing to change, and above all, it's the best place to eliminate redundancy, which is the enemy of fast, error-free change.

    https://blackliszt.com/2020/03/william-occam-inventor-method-for-building-optimal-software.html

    Here is a more extensive explanation of the history and context of Occam's Razor and its relevance to software.

    https://blackliszt.com/2023/09/occams-razor-the-key-to-optimal-software-development.html

    How do you achieve this ideal architecture for a body of code? Not all at once! You avoid the usual nightmare of useless, ever-changing requirements and do something that makes a customer happier than they were. Then fix it. Here’s the process, to which I’ve given a fancy name.

    https://blackliszt.com/2022/09/better-software-and-happier-customers-with-post-hoc-design.html

    Here’s another statement of the basic idea:

    https://blackliszt.com/2011/06/software-how-to-move-quickly-while-not-breaking-anything.html

    https://blackliszt.com/2020/03/how-to-pay-down-technical-debt.html

    Here is more detail and explanation of how to use increasing amounts of metadata to help build applications quickly, which of course should be a major goal of software architecture.

    https://blackliszt.com/2022/05/how-to-improve-software-productivity-and-quality-schema-enhancements.html

    Here's a short case study from early in my career that demonstrated to me the incredible value of taking an Occamal approach to building an end-user business application.

    https://blackliszt.com/2023/09/achieving-occamality-through-definitions-case-study.html

    Here is a more recent case study of a system based on extensive use of metadata and what happened when technology-fashion-driven executives took over the company.

    https://blackliszt.com/2023/09/case-study-replacing-metadata-with-fashionable-software.html

    One of the most basic aspects of software architecture is the data and where it is stored. The default choice in most architectures is to use a standard DBMS. Given the steady advance of Moore's Law, this is often no longer the best choice.

    https://blackliszt.com/2010/09/databases-and-applications.html

    Given the huge advantage of taking a metadata approach to software, why isn't it widely used? It's because all of software has been obsessed with procedural language as the core focus of programming. While necessary for the first decades of computing, it's now the core reason, never discussed, for the near-universal dysfunction in software development.

    https://blackliszt.com/2024/08/why-is-writing-computer-software-dysfunctional.html

     

    Bad Software Architectures

    Software is infected with architectural religions, none of them with a sound basis in logic or real-world experience. It’s not that you can’t build software that sort of eventually kinda works with them – but it’s like building a car with a steam engine.

    Sadly, some programming languages and programming concepts encourage redundancy.

    https://blackliszt.com/2014/03/how-to-evaluate-programming-languages.html

    Starting a couple of decades ago, the idea of “distributed computing” as an architecture became the thing all the cool kids gravitated to.

    https://blackliszt.com/2015/04/the-distributed-computing-zombie-bubble.html

    A modern incarnation (with a new name and rhetoric of course) is micro-services, which is supposed to boost programmer productivity.

    https://blackliszt.com/2021/03/how-micro-services-boost-programmer-productivity.html

    Not only are micro-services supposed to boost programmer productivity, they are also supposedly a “scalable” architecture – in sharp contrast to the evil “monolithic” architecture … a word which is usually pronounced with a sneer.

    https://blackliszt.com/2020/06/why-is-a-monolithic-software-architecture-evil.html

    The trouble is, microservices make about as much sense as blood-letting did in medicine. It's widely accepted as useful, but entirely without evidence.

    https://blackliszt.com/2019/02/what-software-experts-think-about-blood-letting.html

    Programmers seem to like to layer their software, often without thinking about it.

    https://blackliszt.com/2012/06/layers-in-software-fuss-and-trouble-without-benefit.html

    Similarly when they link together pieces, a key decision is whether the coupling is loose or tight.

    https://blackliszt.com/2012/08/coupling-in-software-loose-or-tight.html

    Components and layers have been promoted for a long time.

    https://blackliszt.com/2021/03/micro-services-the-forgotten-history-of-failures.html

    https://blackliszt.com/2021/09/software-components-and-layers-problems-with-data.html

    https://blackliszt.com/2021/08/the-dangerous-drive-towards-the-goal-of-software-components.html

    For the best results, it’s good to focus on the goals of software architecture described above, and ensure that everything you do contributes to those goals. Part of how you do this is to avoid the always-present temptation of following software fashions.

    https://blackliszt.com/2023/07/summary-software-fashions.html

     

  • How to Improve Software Productivity and Quality: The Common Sense Approach

    I talked with a frustrated executive in a computer software company. I was about to visit their central development location for the first time, and he wanted to make sure I asked sufficiently penetrating questions so that I would find out what was “really” going on.

    He explained that while he had written software, it was only for a few years in the distant past, and things had changed a great deal since his day. His current job in product marketing didn’t really require him to get into any details of the development shop, and in fact he preferred to stay out of the details for several reasons: (1) he was completely out of date with current technology and methods; (2) he didn’t want his thinking constrained by what the programmers declared was possible; (3) it was none of his business.

    He had developed a keen interest in what was going on in the software group, however, because he realized that it had a dramatic effect on his ability to successfully market the product. His complaints were personal and based on his own experience, but they were fairly typical, which is why I’m recounting his tale of woe here.

    The Lament

    The layman’s lament was an interesting mish-mash of two basic themes:

    • I’m not getting the results I need. There are certain results that I really need for my business. My competitors seem to be able to get those results, and I can’t. Basically, I want more features in each release, more frequent releases, more control and visibility on new features, fewer bugs in new releases, and the ability to make simple-sounding changes quickly. Our larger competitors seem to be able to move more quickly than we do.
    • I think the way the developers work is old-fashioned, and if it were brought up-to-date, I would get the results I need. What they do seems to be “waterfall,” with lots of documentation that doesn’t say a lot. There must be something better, along the lines of what we used to call RAD (rapid application development). They only have manual testing, nothing automated, and they tell me it will be years before they can build automated testing! And shouldn’t they be using object-oriented methods? Wouldn’t that provide more re-use, so that things can be built and changed more quickly? They have three tiers, but when I want to change something, the code always seems to be in the wrong tier and takes forever. They’re talking about re-writing everything from scratch using the latest technology, but I’m afraid it will take a long time and there won’t be anything that benefits me.

    Basically, he was saying that he wants more things, quicker, and better quality. He also advanced some theories for why he’s not getting those things and how they might be achieved, but of course he couldn’t push his theories too hard, because he lacked experience and in-depth knowledge of the newer methods. He even claimed, in classic “the grass is greener” style, that practically everyone accomplishes these things, and he was nearly alone in being deprived of them – not true!

    The usual dynamics of a technology group explaining itself to “outsiders” was also at work here – if you just listen to the technology managers, things are pretty good. The methods are modern and the operation is efficient and productive. There are all sorts of improvements that could be made with additional money for people and tools, of course, but for a group that’s been under continual pressure to build new features, support cranky customers and meet accelerated deadlines with fewer resources, they’re doing amazingly well. The non-technology executives tend to feel that this is all a front, and that results really could be better with more modern methods and tools. The technology managers, for their part, feel like they’re flying passenger planes listening to a bunch of desk-bound ignoramuses complain about their inability to deliver the passengers safely and on-time while upgrading the engine and cockpit systems at the same time. These people have no idea what building automated testing (for example) really takes, they’re thinking. The non-technology people don’t really want to talk about automated testing, of course – they’re the ones taking the direct heat from customers who get hurt by bugs in the new release, and aren’t even getting proposals from the technology management of how this noxious problem can be eliminated. Well, if you can’t tell me how to solve the problem (and you should be able to), how about this (automated testing, object-oriented, micro-services, etc.)??

    It goes on and on. The business executives put a cap on it, sigh, maybe throw a tantrum or two, but basically try to live with a situation they know could be better than it is. Inexperienced executives refuse to put up with this crap, and bring in new management and consultants, do outsourcing, etc. Their wrath is felt! Sadly, though, the result is typically a dramatic increase in costs, better-looking reporting, but basically the status quo in terms of results, with success being defined downwards to make everything look good. The inexperienced executive is now experienced, and reverts to plan A.

    The technology manager does his version of the same dance. The experienced manager tries to keep things low-key and leaves lots of room for coping with disasters and the unexpected. Inexperienced technology managers refuse to tolerate the tyranny of low expectations; they strive for real excellence, using modern tools and methods. Sadly, though, the result is typically a dramatic increase in costs, better-sounding reports, but basically the status quo in terms of tangible results. The new methods are great, but we’re still recovering from the learning curve; that was tense and risky, I’m lucky I survived, that’s the last time I’m trying something like that again!

    The Hope

    The non-technology executive is sure there’s an answer here, and it isn’t just that he’s dumb. He keeps finding reason to hope that higher productivity with high quality and rapid cycles can be achieved. In my experience, the most frequent (rational) basis for that hope is a loose understanding of the database concept of normalization, and the thought that it should enable wide-spread changes to be made quickly and easily. Suppose the executive looks at a set of functionally related screens and wants some button or style change to be applied to each screen. It makes sense that there should be one place to go to make that change, because surely all those functionally related screens are based on something in common, a template or pattern of some kind. What if the zip code needs to be expanded from five digits to nine? The executive can understand that you’d have to go to more than one place to make the change, because the zip code is displayed on screens, used in application code and stored in the database, but there should be less than a handful of places to change, not scores or hundreds!

    But somehow, each project gets bogged down in a morass of detail. When frustration causes the executive to dive into “why can’t you…”, the eyes normally glaze over in the face of massive amounts of endless gobbledy-gook. What bugs some of the more inquisitive executives is how what should be one task ends up being lots and lots of tasks. With computers to do all the grunt work, there’s bound to be a way to make what sounds, feels and seems like one thing (adding a search function to all the screens) actually be one thing – surely there must be! And if everything you can think of is in just one place, surely you should be able to go to that one place and change it! Don’t they do something like that with databases?

    There is a realistic basis for hope

    I’ve spent more of my life on the programmers’ side of the table than the executives’, so I can go on, with passion and enthusiasm, about the ways that technology-ignorant executives reduce the productivity, effectiveness and quality of tech groups, not to mention the morale! The more technical detail they think they know, the worse it seems to be.

    That having been said, the executive’s lament is completely justified, and his hope for better days is actually reasonable (albeit not often realized).

    For his hope to be realized, there needs to be exactly one place in the code where each distinct entity is defined and all information about it is stated. For example, there should be exactly one place where we define what we mean by “city.” This is like having domains and normalization in database design, only extended further.

    The definition of “city” needs to have everything we know about cities in that one place. It needs to include information that we need to store it (for example, its data type and length), to process it (for example, the code that verifies that a new instance of city is valid) and to display it (for example, its label). The information needs to incorporate both data (e.g. display label) and code (e.g. the input edit check) if needed to get the job done. This is like an extended database schema; a variety of high-level software design environments have something similar to this.
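As a rough Python sketch of what such a single definition might look like (the names CITY, VALID_CITIES and is_valid are hypothetical illustrations, not any real system's API):

```python
# Everything known about "city" lives in one definition, covering
# storage (type, length), processing (validation), and display (label).
VALID_CITIES = {"Boston", "Chicago", "Denver"}  # illustrative list

CITY = {
    "name": "city",
    "type": str,        # storage: data type
    "max_length": 40,   # storage: length
    "label": "City",    # display: label
    # processing: the input edit check, kept with the definition
    "validate": lambda v: isinstance(v, str)
                and len(v) <= 40
                and v in VALID_CITIES,
}

def is_valid(entity, value):
    """Generic check, driven entirely by the entity definition."""
    return entity["validate"](value)
```

Note that the definition mixes data (the label) and code (the edit check), exactly as described above.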

    It must be possible to create composite entities in this way as well, for example address. A single composite entity would typically include references to other entities (for example, city), relationships among those other entities and unique properties of the composite entity (for example, that it’s called an “address”). This composite-making ability should extend to any number of levels. If there are composites that are similar, the similarity should be captured, so that only what makes the entity unique is expressed in the entity itself. A common example of this is home address and business address.
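A minimal sketch of a composite entity along these lines, again with purely illustrative names:

```python
# The composite entity "address" references the other entity
# definitions instead of copying them, and adds only what is
# unique to itself (here, its own label).
CITY = {"name": "city", "label": "City"}
STATE = {"name": "state", "label": "State"}
ZIP_CODE = {"name": "zip", "label": "Zip Code"}

ADDRESS = {
    "name": "address",
    "label": "Address",                 # unique property of the composite
    "fields": [CITY, STATE, ZIP_CODE],  # references, not copies
}

def field_labels(composite):
    """Generic code that works for any composite entity."""
    return [f["label"] for f in composite["fields"]]
```

Because the fields are references, a change to the one definition of CITY is immediately seen by every composite that uses it.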

    Sometimes entities need to be related to each other in detailed ways. For example, when checking for city, you might have a list of cities, and for each the state it’s in, and maybe even the county, which may have its own state-related lists.

    The same principle should apply to entities buried deep in the code. For example, a sort routine probably has no existence in terms of display or storage, but there should usually be just one sort routine. Again, if there are multiple entities that are similar, it is essential that the similarities be placed in one entity and the unique parts in another. Simple parameterization is an approach that does this.
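Here's what simple parameterization looks like as a Python sketch: one generic routine holds the similarities, and each use supplies only what makes it unique (the record fields here are made up for illustration):

```python
# One generic sort routine; the difference between "sort customers
# by name" and "sort orders by total, largest first" is expressed
# as parameters, not as two copies of the routine.
def sort_records(records, key_field, descending=False):
    return sorted(records, key=lambda r: r[key_field], reverse=descending)

customers = [{"name": "Zoe"}, {"name": "Ann"}]
orders = [{"total": 10}, {"total": 250}]
```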

    Some of these entities will need to cross typical software structure boundaries in order to maintain our prime principle here of having everything in exactly one place. For example, data entities like city and state need to have display labels, but there needs to be one single place where the code to display an entity’s label is defined. Suppose you want a multi-lingual application? This means that the single place where labels are displayed needs to know that all labels are potentially multi-lingual, needs to know what the current language is, and needs to be able to display the current language’s label for the current entity. It also means that wherever we define a label, we need to be able to make entries for each defined language. This may sound a bit complicated at first reading, but it actually makes sense, and has the wonderful effect of making an application completely multi-lingual.
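A minimal Python sketch of the idea, with hypothetical entities and a made-up display_label function serving as the one place where label display is defined:

```python
# Each entity defines its labels per language; exactly one function
# knows how to pick the right one for the current language.
CURRENT_LANGUAGE = "fr"  # set once for the whole application

CITY = {"labels": {"en": "City", "fr": "Ville"}}
STATE = {"labels": {"en": "State", "fr": "État"}}

def display_label(entity, language=None):
    """The single place where label display is defined.
    Falls back to English if the language has no entry."""
    lang = language or CURRENT_LANGUAGE
    return entity["labels"].get(lang, entity["labels"]["en"])
```

Adding a language then means adding entries to the label tables; no display code anywhere needs to change.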

    In order to keep to the principle of each entity defined once, we need the ability to make relationships between entities. The general concept of inheritance, more general than found in object-oriented languages, is what we need here. It’s like customizing a standard-model car, where you want to leave some things off, add some things and change some things.
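A rough sketch of this generalized inheritance in Python (the derive helper and the entities are illustrative assumptions, not a prescription):

```python
# A derived entity states only what it removes, adds, or changes
# relative to its base, like customizing a standard-model car.
def derive(base, remove=(), **overrides):
    entity = {k: v for k, v in base.items() if k not in remove}
    entity.update(overrides)
    return entity

ADDRESS = {
    "label": "Address",
    "fields": ["street", "city", "state", "zip"],
}

BUSINESS_ADDRESS = derive(
    ADDRESS,
    label="Business Address",                # changed
    fields=["company"] + ADDRESS["fields"],  # added
)
```

Everything not mentioned in the derivation (here, nothing beyond the label and fields) is inherited from the one base definition.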

    There’s lots more detail we could go into, but for present purposes I just want to illustrate the principle of “each entity defined in one place,” and to illustrate that “entity” means anything that goes into a program at any level. By defining an entity in one place, we can group things, reference things, and abstract their commonality wherever it is found, not just in a simple hierarchy, and not limited to functions or data definitions or anything else.

    While this is a layman’s description, it should be possible to see that IF programs could be constructed in this way, the layman’s hope would be fulfilled. What the layman wants is pretty simple, and actually would be simple to deliver if programs were written in the way he assumes. The layman assumes that there’s one way to get to the database. He assumes that if you have a search function on a screen, it’s no big deal to put a search function on every screen. He assumes that if he wants a new function that has a great deal in common with an existing function, the effort to create the new function is little more than the effort to define the differences. He assumes that look and feel is defined centrally, and is surprised when the eleventh of anything feels, looks or acts differently than the prior ten.

    Because he has these assumptions in his mind, he’s surprised when a change in one place breaks something that he doesn’t think has been changed (the infamous side-effect), because he assumes you haven’t been anywhere near that other place. He really doesn’t understand regression testing, in which you test all the stuff that you didn’t think you touched, to make sure it still works. Are these programmers such careless fools that, like children in a trinket shop, they break things while walking down the aisle to somewhere else, and you have to do a complete inventory of the store when the children leave?

    Programs are definitely not generally written in conformance with the layman’s assumptions; that’s why there’s such a horrible disconnect between the layman and the techies. The techies have a way of building code, generally a way that they’ve received from those who came before them, that can be made to work, albeit with considerable effort. They may try to normalize their database schemas and apply object principles to their code, but in the vast majority of cases, the layman’s assumption of a single, central definition of every “thing,” and the ability to change that thing and have the side-effects ripple silently and effectively through the application, does not exist, is not articulated, not thought about, and is in no way a goal of the software organization. It’s not even something they’ve heard talked about in some book they keep meaning to get to. It’s just not there.

    I assert that it is possible to write programs in a way that realizes the layman’s hope.

    I’ve done it myself and I’ve seen others do it. The results are amazing. It’s harder to do than you would ideally like because of a lack of infrastructure already available to support this style of writing, but in spite of this, it’s not hard to write. Moreover, once the initial investment in structure has been made, the ability to make changes quickly and with high quality soon pays back the investment.

    The main obstacle for everyone is that there is tremendous inertia, and the techniques that provide a basis for the hope, while reasonable and achievable, are far out of the mainstream of software thinking. I have seen people who have good resumes but are stupid or lazy look at projects that have been constructed according to the “one entity – one definition” principle and simply declare them dead on arrival, complete re-write required. But I have also encountered projects in domains with no tradition at all of building things in this way, where the people have invented the principles completely on their own.

    The “principle of non-redundancy” has far-reaching technical consequences and ends up being pretty sophisticated, but at its heart is simple: things are hard to do when you have to go many places or touch many things to get them done. When the redundancy in program representation (ignoring for the moment differences between code, program data and meta-data) is eliminated, making changes or additions to programs is optimally agile. In other words, with program representation of this type, it is as easy and quick as it can possibly be to make changes to the program. In general, this will be far quicker than most programs in their current highly redundant form.

    The layman’s hope that improvements can be made in software productivity, quality and cycle time is realistic, and is based on creating a technical reality behind the often-discussed concepts of “components” and “building blocks” that is quite different from the usual embodiment.

    I have no idea why this approach to building software, which is little but common sense, isn't taught in schools and widely practiced. For those who know and practice it, the approach of "Occamality" (define everything in exactly one place) gives HUGE competitive advantages.

  • How to Improve Software Productivity and Quality: Code and Metadata

    In the long-fought war to improve software programming productivity, there have been offensives on many fronts, but precious little genuine progress made. We give our programmers the fanciest, most high-tech equipment imaginable – and it is orders of magnitude faster and more powerful than the equipment available to earlier generations – but this new equipment has made only marginal difference. While relieving programmers of the burden of punch cards helps, the latest generation of programmers are not getting the job done much better than their comparatively low-tech predecessors.

    Form and Content

    As usual, most efforts to improve the situation focus on one of two general approaches: form or content.

    People who focus on form tend to think that the process of getting software written is what’s important. They talk about how one “methodology” is better than another. They think about how people are selected, trained, organized and motivated. As you might imagine, the spectrum of methodologies is a broad one. On one end of the spectrum is the linear approach, which starts from the generation of business requirements for software, and ends with testing, installation and ongoing maintenance. On the other end of the spectrum is the circular, interactive approach, in which a small working program is built right away, and gradually enhanced by programmers who interact closely with the eventual end-users of the program. There are any number of methodologies between these two extremes, each of which claims an ideal combination of predictable linearity with creative interactivity.

    People who focus on content tend to think that the language in which programs are written and the structure and organization of programs are what’s important. Naturally, they like design and programming tools that give specific support to their preferred language and/or architecture. There tends to be a broad consensus of support at the leading edge around just a couple of languages and structures, while the majority of programmers struggle with enhancing  and maintaining programs originally written according to some earlier generation’s candidate for “best language” or “best architecture.” At the same time, many of those programmers make valiant attempts to renovate older programs so that they more closely resemble the latest design precepts, frequently creating messes. Regardless of the generation, programmers quickly get the idea that making changes to programs is their most time-consuming activity (apart, of course, from never-ending meetings), and so they focus on ways to organize programs to minimize the cost of change. This leads to a desire to build “components” that can be “assembled” into working programs, and naturally to standards for program components to “talk” with each other.

    Lots of effort, not much progress

    The net effect of all these well-intentioned efforts, on both the form and content sides, has been a little bit of progress and a great deal of churn. Having some agreed-on methodology leads to better predictability and generally better results than having none; it seems that having a beat to which everyone marches in unison, even if it’s not the best beat, leads to better results than having a great beat that many people don’t know or simply refuse to march to. What is the best methodology in any one case depends a great deal on both how much is already known about the problem to be solved and how smart and broadly skilled the participants in the project are. The more qualified the people and the less known about the target, the more appropriate it is to be on the “interactive” end of the spectrum; think highly qualified and trained special forces going after a well-defended target in enemy territory – creativity, teamwork and extraordinary skills are what you need. The more ordinary the participants and the better understood the objectives, the more appropriate it is to be somewhere towards the “linear” end of the spectrum; think of a large army pressing an offensive against a broad front – with so many people, they can’t all be extraordinary, and you want coordinated, linear planning, because too much local initiative will lead to chaos.

    As to content, while it is clear that the latest programming languages encourage common-sense concepts like modular organization and exposing clear interfaces, good programmers did that and more decades ago in whatever language they were writing, including assembler, and bad programmers can always find a way to make messes. And even if you use the best tools and interface methods, incredible churn is created by frequent shifts in what is considered to be the best architecture, practically obsoleting prior generations. Probably the single biggest movement over the last several decades has been the gradual effort to take important things about programs that originally could be discovered only by inspecting the source code and make those things available for discovery and use by people and programs, without access to the source code. The first major wave of this movement led to exposure of much of a program’s persistent data, in the form of DBMS schemas; the other major wave of this movement (in many embodiments, from SAA to SOAP to microservices) exposes a program’s interfaces, in the form of callable functions and their parameters. This has been done in the belief that it will enable us to make important changes to some programs without access to or impact on others, and thus approach the dream of programming by assembling fixed components, like building blocks or legos, into completed programs or buildings. The belief in this dream has been affected very little by decades of evidence that legos don’t build things that adults want to live in.

    I believe that while considerations of form are incredibly important, and when done inappropriately can drag down or even sink any programming project, there is little theoretical headway to be made by improvements in methodology. I think most of the relevant ideas for good methodology have already been explored from multiple angles, with the exception of how to match methodology to people and project requirements – no one methodology is the best for every project and every collection of people. But even when you’ve picked the best methodology, the NBA players will always beat the high school team, as long as the NBA players execute well on the motivational and teamwork aspects that any good methodology incorporates. With methodology we’re more in need of good execution, and of somehow assembling a talented and experienced team, than we are of fresh new ideas.

    Ptolemy, Copernicus, Kepler, Newton

    Content, however, is another story altogether. I think our current best languages and architectures will look positively medieval from the perspective of a better approach to content. I think there is a possible revolution here, one that can bring about dramatic improvements in productivity, but which requires an entirely new mind-set, as different as Copernicus and Ptolemy, as different as Einstein and Newton.

    Having said that, let me also say that the “new” ideas are by no means completely new. Many existing products and projects have exploited important parts of these ideas. What is mostly new here is not any particular programming technique, but an overriding vision and approach that ties various isolated programming techniques into a unified, consistent whole. Kepler’s equations described the motion of planets with accuracy equal to Newton’s – in fact, you can derive one set of equations mathematically from the other; but Newton provided the vision (gravity) and the tools (calculus) that transformed some practical techniques (Kepler’s equations) into the basis of a new view of the world. That’s why physics up to the end of the nineteenth century was quite justifiably characterized as “Newtonian” and not “Keplerian.”

    Ptolemy and Newton looked at the same set of objects; Ptolemy picked the closest one, the earth, to serve as the center of thinking, while still incorporating the rest. His main goal was to describe the motions of what you could see, focused on the matter. Copernicus noted that things got simpler and more accurate if you picked one farther away, the Sun, to serve as the center of things. Kepler made things better by noticing the curves made by the planets. Newton then took the crucial step: instead of focusing on matter, he focused on energy (gravity in this case), and wrote an equation describing how gravity works, which creates changes in the location and velocity of matter.

    In programming, we have clear equivalents of matter and energy: matter is data (whether or not it is persistent), and energy is instructions (lines of code, regardless of the language they are written in). In COBOL, for example, this division is made explicit in the data division and the procedure division. In modern languages the two are more intermixed, but it remains clear whether any particular line describes data or describes an action to take (an if statement, assignment statement, etc.).

    Now, in spite of what you may have been taught in school, Ptolemy’s method works. In fact, it would be possible (though not particularly desirable) to update his approach with modern observations and methods, and have it produce results that are nearly identical to those possible today; only when you get to relativity does Ptolemy’s method break down altogether, and remember that the effects of relativity are so small that it takes twentieth-century instruments to detect them. But no one bothers to do this, because the energy-centered approach developed by Newton and refined by his successors is so much simpler, cleaner and more efficient.

    Similarly, there is no doubt that the instructions-centric approach to programming works. But what comes out of it is complicated, ugly and inefficient. The Newtonian breakthrough in programming is replacing writing and organizing instructions (and oh, by the way, there’s some data too) as the center of what we do with defining and describing data (and oh, by the way, there are some instructions too). The instruction-centered approach yields large numbers of instructions with a small amount of associated data definitions; the data-centered approach yields large numbers of data definitions and descriptions operated on by a significant body of unchanging, standard instructions and small numbers of instructions specifically written for a particular program. In the instruction-centered approach, we naturally worry about how to organize collections of instructions so that we can write fewer of them, and arrive at concepts like objects, components and class inheritance (a subclass inherits and can override its parent’s methods (instructions)). In the data-centered approach, we naturally worry about how to organize collections of data definitions so that we can have the minimal set of them, and arrive at concepts like meta-data, standard operations (e.g. pre-written meta-data-driven instructions enabling create, query, select, update and delete of a given collection of data) and data inheritance (a child data definition, individual or group, inherits and can override its parent’s definitional attributes). In short, we separate out everything we observe into a small, unchanging core (like Newton's gravity) that produces ever-changing results in a diverse landscape.

    We shift our perspective in this way not because it enables us to accomplish something that can’t be accomplished by the current perspective, but because it proves to be cleaner, simpler, more efficient, easier to change, etc.
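    To make the data-centered concepts concrete, here is a minimal Python sketch (not from the original text; the field names and record type are invented for illustration) of metadata-driven standard operations: one metadata table describes a record type, and a small body of generic, unchanging code interprets it for create and update.

```python
# A metadata table describing a record type. In the data-centered
# approach, adding or changing a field is a data change, not a code change.
CUSTOMER_META = {
    "name":   {"type": str, "required": True},
    "email":  {"type": str, "required": True},
    "credit": {"type": int, "required": False, "default": 0},
}

def create(meta, **values):
    """Generic, metadata-driven 'create': validates types, checks
    required fields, and fills in defaults, for ANY record type."""
    record = {}
    for field, spec in meta.items():
        if field in values:
            value = values[field]
            if not isinstance(value, spec["type"]):
                raise TypeError(f"{field} must be {spec['type'].__name__}")
            record[field] = value
        elif spec["required"]:
            raise ValueError(f"missing required field: {field}")
        else:
            record[field] = spec.get("default")
    return record

def update(meta, record, **changes):
    """Generic update: only fields the metadata knows about are allowed."""
    for field, value in changes.items():
        if field not in meta:
            raise KeyError(f"unknown field: {field}")
        record[field] = value
    return record

alice = create(CUSTOMER_META, name="Alice", email="a@example.com")
update(CUSTOMER_META, alice, credit=100)
```

    The same two functions serve every record type ever defined; the programs differ only in their metadata tables.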

    Instructions, data and maps

    This approach can only be fully appreciated by understanding its details, but the metaphor of driving directions and maps should make the core idea clear.

    Suppose your job is to drive between two locations, and the source and destination location are always changing. There are two general approaches for giving you directions:

    1. Turn-by-turn directions (the instruction-driven, action-oriented approach)
    2. A map, with source and destination marked (the data-driven, matter-oriented approach)

    The advantage of directions is that they make things easy for the driver (the driver is like the computer in this case). You pick up one step, drive it, then pick up the next, drive it, and so on until the end. You don’t have to think in advance. All you have to do is follow directions, which tell you explicitly what to do, in action-oriented terms (turn here, etc.).

    The problems with directions come in a couple of circumstances:

    • You have to have a huge number of directions to cover all possible starting points and destinations, and there is a great deal of overlap between sets of directions
    • If there is a problem not anticipated by the directions, such as an accident or road construction, you have to guess and get lucky to get around the problem and get back on track.

    A map, on the other hand, gives you most of the information you need to generate your own directions between any two given points. The map provides the information; you generate the actions from that information. With a map and some parameters, like whether to use toll roads, a generic direction-generating program can produce directions on the fly. A program like Waze can even regenerate directions in real time as traffic information is updated.

    The Waze program itself isn't updated very often; what changes is mostly the map and the road conditions. The same is true of a program written using this approach: the core capabilities are written in instructions, while the details of the input, storage, processing and output are all described in a "map" of the data and what is to be done with it.
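    The map-versus-directions contrast above can be sketched in a few lines of Python (an illustration of the idea, not anything from the original; the road map is invented). The map is pure data; one generic routine generates directions between any two points, and closing a road is a data change, not a code change.

```python
from collections import deque

# The "map": pure data describing which locations connect to which.
ROAD_MAP = {
    "home":    ["main_st"],
    "main_st": ["home", "oak_ave", "highway"],
    "oak_ave": ["main_st", "office"],
    "highway": ["main_st", "office"],
    "office":  ["oak_ave", "highway"],
}

def directions(road_map, start, goal):
    """Generic breadth-first route finder: the small, unchanging core
    of instructions that turns map data into turn-by-turn directions."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in road_map[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route exists

route = directions(ROAD_MAP, "home", "office")
```

    Pre-written turn-by-turn directions would need a separate list for every (start, goal) pair; the map plus one routine covers them all, and reflects a road closure the moment the data changes.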

    Conclusion

    It's pretty simple: programming today largely follows the method of Ptolemy, resulting in an explosion of software epicycles to get anything done. Attempts to keep things easily changeable sound promising but never work out in practice.

    The way forward is to focus instead on what there is and what is to be done with it (data and metadata), with a small amount of Waze-like code to take any user or data stream from its starting point to its destination, with easy changes as needed.

  • Better Software and Happier Customers with Post-hoc Design

    What can you possibly mean by "post-hoc design?" Yes, I know it means "after-the-fact design," using normal English. It's nonsense! First you design something. Then you build it. Period.

    Got your attention, have I? I agree that "post-hoc design" sounds like nonsense. I never heard of it or considered it for decades. But then I did. Before long I saw that great programmers used it to create effective high-quality, loved-by-customers software very quickly.

    The usual way to build software: design then build

    The way to build good software is obviously to think about it first. Who does anything important without having a plan? Start by getting requirements from the best possible source, as detailed as possible. Then consider scale and volume. Then start with architecture and drill down to design.

    When experienced people do architecture and design, they know that requirements often "evolve." So it's important to generalize the design anticipating the changes and likely future requirements. Then you make plans and can start building. Test and verify as you drive towards alpha then beta testing. You know the drill. Anything but this general approach is pure amateur-hour.

    I did this over and over. Things kept screwing up. The main issue was requirements "evolution," which is something I knew would happen! Some of the changes seemed like they were from left field, and meant that part of my generalized architecture not only failed to anticipate them, but actually made it harder to meet them! Things that I anticipated might happen, which I wove into the design, never happened. Not only had I wasted the time designing and building them; the weren't-needed parts of the design often made it hard for me to build the new things that came along that I had failed to anticipate.

    I assumed that the problem was that I didn't spend enough time doing the architecture and design thinking, and I hadn't been smart enough about it. Next time I would work harder and smarter and things would go more smoothly. Never happened. How about requirements? Same thing. The people defining the requirements did the best they could, but were also surprised when things came along, and embarrassed when things they were sure would be important weren't.

    After a long time — decades! — I finally figured out that the problem was in principle unsolvable. You can't plan for the future in software, because you can't perfectly predict the future! What you are sure will happen doesn't, and what you never thought about happens. Time spent on anything but doing and learning as you go along is wasted time.

    The winning way to build software: Build then Design

    Build first. Then and only then do the design for the software you've already built. Sounds totally stupid. That's part of why I throw in some Latin to make it sound exotic: "Post-hoc design," i.e., after-the-fact design.

    When you design before you build, you can't possibly know what you're doing. You spend a bunch of time doing things that turn out to be wrong, and making the build harder and longer than it needs to be. When you build in small increments with customer/user input and feedback at each step, keeping the code as simple as possible, you keep everything short and direct. You might even build a whole solution for a customer this way — purposely NOT thinking about what other customers might need, but driving with lots of hard-coding to exactly what THIS customer needs. Result: the customer watches their solution grow, each step (hopefully) doing something useful, guides it as needed, and gets exactly what they need in the shortest possible time. What's bad about a happy customer?

    Of course, if you've got the typical crew of Design-first-then-build programmers, they're going to complain about the demeaning, unprofessional approach they're being forced to take. They might cram in O-O classes and inheritance as a sop to their pride; if they do, they should be caught and chastised! They will grumble about the enormous mountain of "technical debt" being created. Shut up and code! Exactly and only what's needed to make this customer happy!

    When the code is shown to another customer, they might love some things, not need some other things and point out some crucial things they need aren't there. Response: the nearly-mutinous programmers grab a copy of the code and start hacking at it, neutering what isn't needed, changing here and adding there. They are NOT permitted to "enhance" the original code, but hack a copy of it to meet the new customer's need. At this point, some of the programmers might discover that they like the feeling of making a customer happy more quickly than ever before.

    After doing this a couple of times (exactly when is a matter of judgment), it will be time to do the "design" on the software that's already been built. Cynics might call this "paying off tech debt," except it's not. You change the code so that it exactly and only meets the requirements of the design you would have made to build these and only these bodies of code. You take the several separate bodies of code (remember, you did evil copy-and-modify) and create from them a single body of code that can do what any of the versions can do.

    When you do this, it's essential that you NOT anticipate future variations — which will lead to the usual problems of design-first. The pattern for accomplishing this is the elimination of redundancy, i.e., Occamality. When you see copy/modify versions of code, you replace them with a single body of code with the variations handled in the simplest way possible — for example, putting the variations into a metadata table.
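    A minimal Python sketch of such a post-hoc design pass (my own illustration; the customers, fields, and formats are invented): two copy-and-modify variants of an invoice formatter, one per customer, are replaced by a single body of code plus a metadata table holding the variations.

```python
# Before: copy-and-modify versions, one per customer (shown condensed).
def invoice_acme(total):
    return f"ACME INVOICE\nTotal: ${total:.2f}\nNet 30"

def invoice_globex(total):
    return f"Globex Bill\nTotal: {total:.2f} EUR\nNet 60"

# After the post-hoc design pass: the variation lives in data,
# and the shared logic is written exactly once.
INVOICE_STYLES = {
    "acme":   {"title": "ACME INVOICE", "amount": "${amount:.2f}",    "terms": "Net 30"},
    "globex": {"title": "Globex Bill",  "amount": "{amount:.2f} EUR", "terms": "Net 60"},
}

def invoice(customer, total):
    """One body of code; per-customer differences come from the table."""
    style = INVOICE_STYLES[customer]
    amount = style["amount"].format(amount=total)
    return f"{style['title']}\nTotal: {amount}\n{style['terms']}"

# The unified code must reproduce exactly what the copies did:
assert invoice("acme", 99.5) == invoice_acme(99.5)
assert invoice("globex", 99.5) == invoice_globex(99.5)
```

    The next customer who wants a different title, currency, or payment terms becomes one new row in the table, with zero new lines of code.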

    This isn't something that's done just once. You throw in a post-hoc design cycle whenever it makes sense, usually when you have an unwieldy number of similar copies.

    As time goes on, an ever-growing fraction of a new user's needs can be met by simple parameter and table settings of the main code line, and an ever-shrinking fraction met by new code.

    Post-Hoc Design

    Ignoring the pretentious name, post-hoc design is the simplest and most efficient way to build software that makes customers happy while minimizing the overall programming effort. The difference is a great reduction in wasted time designing and building, and in the time to customer satisfaction. Instead of a long requirements gathering and up-front design trying valiantly to get it right for once, resulting in lots of useless code that makes it harder to build what it turns out is actually needed, you hard-code direct to working solutions, and then periodically perform code unification whose purpose is to further shorten the time to satisfaction of new customers. To the extent that a "design" is a structure for code that enables a single body of code to be easily configured to meet diverse needs, doing the design post-hoc assures zero waste and error.

    What is the purpose of architecture and design anyway? It is to create a single body of code (with associated parameters and control tables) that meets the needs of many customers with zero changes to the code itself. The usual method is outside-in: gaze into the future. Post-hoc design is inside-out: study what you built to make a few customers happy, and reduce the number of duplicate source code copies to zero while reducing the lines of code to a minimum. The goal of post-hoc design is to minimize the time and effort to satisfy the next customer, and that's achieved by making the code Occamal, i.e., eliminating redundancies of all kinds. After all, what makes code hard to change? Finding all the places where something is defined. If everything is defined in exactly one place, once you've found it, change is easy.

    Post-hoc design is a process that should continue through the whole life of a body of code. It prioritizes satisfaction of the customer in front of your face. It breaks the usual model of doing one thing to build code and another to modify it. In the early days of what would normally be called a code "build," the code works, but only does a subset of what it is likely to end up doing. When customers see subsets of this kind, it's amazing how it impacts their view of their requirements! "I love that. I could start using it today if only this and that were added!" It's called "grow the baby," an amazing way to achieve both speed and quality.

    New name for an old idea

    All I'm doing with "post-hoc design" is putting a name and some system around a practice that, while scorned by academia and banned by professional managers, has a long history of producing best-in-class results. I'm far from the first person who has noticed the key elements of post-hoc design.

    Linus Torvalds (key author of Linux, the world's leading operating system) is clearly down on the whole idea of up-front design:

    Don’t ever make the mistake [of thinking] that you can design something better than what you get from ruthless massively parallel trial-and-error with a feedback cycle. That’s giving your intelligence much too much credit.

    Gall's Law is a clear statement of the incremental approach:

    A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.

    The great computer scientist Donald Knuth, author of the multi-volume Art of Computer Programming, was a master of shifting between assembler language programming and abstract algorithms and back, the key activities of the speed-to-solution and post-hoc abstraction phases of the method I've described here.

    People who discover the power and beauty of high-level, abstract ideas often make the mistake of believing that concrete ideas at lower levels are worthless and might as well be forgotten. On the contrary, the best computer scientists are thoroughly grounded in basic concepts of how computers actually work. The essence of computer science is an ability to understand many levels of abstraction simultaneously.

    Thanks to Daniel Lemire for alerting me to these quotes.

    Conclusion

    Post-hoc design is based on the idea that software is only "built" once, and after that always changed. So why not apply the optimal process of changing software from day one, and then alternate between as-fast-as-possible driving to the next milestone and periodic clean-ups that make fast driving to the next goal optimal? Post-hoc design is a cornerstone of the process of creating happy customers and optimal code. It also happens to conform to the goals of software architecture. Post-hoc design is like first fighting a battle, and then, once the battle is over and you've won, cleaning and repairing everything, incorporating what you learned from the battle just past so that everything is ready for the next battle. Post-hoc design is the way to win.

     
