Category: Software Programming Languages

  • How to Learn to be a Computer Programmer

Lots of people will tell you how to become a computer programmer. The most widespread advice is to go to college and major in Computer Science. In the last few years, “coding academies” have emerged as quicker, more affordable paths to that goal. But there is a faster, better way to acquire the skill of computer programming that is rarely discussed. While this post is about computer programming, the educational concepts discussed are broadly applicable.

    How Skills are Acquired

Stepping back from programming, let’s think about how skills are acquired. Physical skills like playing various sports are a good place to start. How much classroom time is required? How about textbooks? The basic way you learn sports or other physical skills is watching someone play them. With any sport involving a ball, you pick one up and try to throw it. Then catch it when thrown to you. Someone may give advice, but basically you try a lot, gradually learn from your mistakes and get better. It’s important to note that you see the results of your effort. You see if the ball went where you intended it to go, for example. This can continue for years.

    How about a more intellectual skill? We all learn how to talk and listen with understanding. The way we learn is similar to learning a sport – by watching, listening and then emulating. The world is incredibly complex and varied, with words associated with a huge number of things and actions. You start with a few words, and spend years adding many thousands more to them. Yes, parents give lots of feedback – not unlike the feedback you get when you see whether the ball you threw went where you wanted it to, whether the receiver caught it. Did you catch what I was trying to say there? Yup, like that.

    The language of software

By contrast to any human language, a computer language is amazingly simple – partly because the “world” in which it “lives” is incredibly narrow and abstract. The nouns are all data of a couple of different types, basically numbers and letters. You give each place that can hold data a name. The verbs are just a handful of simple actions that grab the data that’s in a named place on your “desk,” do something with it, and put the result back or into another named place.
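A tiny sketch can make this concrete. This uses Python rather than the FORTRAN of the era, purely for readability; the names are made up for illustration:

```python
# Named places that hold data -- the "nouns."
price = 10       # a number
name = "Alice"   # some letters

# The "verbs" grab data from a named place, do something with it,
# and put the result into another named place.
total = price * 3            # grab price, multiply, store the result
greeting = "Hello, " + name  # combine letters and store them

print(total)     # → 30
print(greeting)  # → Hello, Alice
```

That is essentially the whole grammar: named places, a handful of actions, results put back in named places.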

    A great deal of the power of software comes from what’s called the subroutine library, which is basically an attentive gang of little robots waiting to do your bidding. If you want to write something on a screen, you “call” the relevant robot, give it your instructions, and send it off. There are robot librarians and file clerks that are excellent with card catalogs and endless file cabinets, fetching what you ask for and putting away what you’re done with. There are robot butlers at each “door” (keyboard, mouse, etc.) that politely inform you when something has arrived and give you the package. Another butler will also send data where you want it to go. While central to programming is learning how to “talk,” learning about the team of robots at your disposal and what each can do for you is also important.

Yes, there are lots of different software languages. But they’re really like local dialects. They all conform to the basic description I just gave. The gang of robots available in a given language typically varies depending on what the language is mostly used for. There are business-oriented languages that pay special attention to things like financial data and collections of accounts, and others that have robots that are really good at fancy math. But the way of thinking about them and writing programs is remarkably similar.

    This should make you wonder exactly why you need years of courses “taught” by fancy professors, when programming is like doing things in an amazingly simple world of data using a small set of instructions that are easy to visualize and understand. The answer is that you can do it on your own, given access to a computer, some widely available tools and a bunch of organized exercises that start super-simple and get gradually more elaborate. That’s how I learned!

    How I learned to program in ancient times

    I went to high school at the newly-opened Morris Knolls HS in Rockaway, NJ. It took students from a couple of towns and had courses that were explicitly college and non-college. There were vocational tracks such as HVAC, auto repair and secretarial. The towns weren’t poor but definitely weren’t elite, with few professionals.

My introduction to programming and computers was a course offered in my junior year of high school, the academic year 1966-7. One of the high school teachers wanted to move from teaching to software and had no training or other way to get there. He somehow persuaded the administration to let him teach a course on programming. His breakthrough was arranging for the class to have computer time at a nearby company, Reaction Motors.

The teacher found a FORTRAN textbook to serve as the basis of the class. While he was the “teacher,” it was clear that his teaching amounted to trying to stay a couple of chapters ahead of the class in the book. But it didn’t matter. The important thing was having a book that laid things out, along with exercises and answers. And above all, the Saturday time slot when we had access to the computer.

    Programming involved writing the program on paper, and then keying the program onto cards using one of the punch card machines that were in a room adjacent to the main machine room. We got to operate the room-sized computer, something which in the normal corporate environment was an absolute no-no, as I later learned. It was terrific fun.

    I got lucky the summer after my senior year, and got a job at a local company, EMSI, Esso Mathematics and Systems Inc. The company applied computers and math to a wide variety of issues running the oil company. I started by fixing bugs and making changes to their math optimization FORTRAN program that helped them run their oil refineries better. Did I “know” how to do this? In each case I dove in and figured it out.

Then they needed some test programs written in a new language, PL/I. No one there knew the language, so I dove into the manual and wrote the programs. Then they needed some accounting programs to help run their facilities in Asia. They wanted them written in COBOL, which made sense for financial programs, so I learned the language as I went along. I quickly fell into the pattern of getting some data from my boss, writing code that did what he wanted with it, showing him the output, and cycling for more. Each cycle required less self-education in COBOL, so I got faster. I tell more of this story here.

What I did wasn’t that unusual at the time. Most of the early programmers basically figured things out as they went along. They needed to get stuff done. They studied programs written by other people and learned how to write their own. Compared to learning to read and write German, I found FORTRAN amazingly simple; in fact, it was a relief compared to the endless complications and words of any human language, not to mention idioms.

    How you can learn in modern times

    You won’t have to deal with card and card punch machines. You won’t have to go to a place with a room-sized computer. You have the internet to give you access to all the information about any language you want, and tools you can access to enable you to write code in that language, try to run it and see how it goes. To start you can use formal exercises with programmed solutions you can study. You can learn and advance at your own pace.

    If you want, you can poke around for a local or remote job. It’s good to start with some narrow domain of problems and tools that you’re interested in and have gained skill in. The first job doesn't even have to be programming — it could be testing, for example. Just get in somewhere and start producing. Once you prove value, you’ll move up from there, because people who can produce results aren’t that easy to find, and a proven producer is a keeper. The important thing is to keep learning and move up the skills hierarchy.

    Finally, count your blessings that the ridiculous world of job certification hasn’t yet latched its evil claws into programming very much. Yes, there are ignorant HR people who insist applicants have a degree in Computer Science, but the best programmers with degrees are good in spite of their mis-education in college.

  • Summary: Software Programming Languages

    This is a summary with links to all my posts on software programming languages.

Computers need to be “told” what to do. Each computer has a native language it understands, a “machine language.” Each machine language has a human-readable version called “assembler language.” Starting in the 1950’s, “high level” languages for telling the computer what to do (programming it) were developed.
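One way to see the difference in levels is to express the same computation both ways. The “assembler” below is illustrative pseudo-assembly, not any particular machine’s real instruction set; the high-level version is sketched in Python:

```python
# Assembler-style: one machine step per line (illustrative pseudo-assembly):
#   LOAD  R1, price      ; fetch price from memory into a register
#   LOAD  R2, quantity   ; fetch quantity
#   MUL   R1, R2         ; multiply the registers
#   STORE R1, total      ; put the result back in memory

# High-level language: the whole computation in one readable line.
price, quantity = 19, 3
total = price * quantity
print(total)  # → 57
```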


    Here are the basics of that amazing progression.

    https://blackliszt.com/2020/09/the-giant-advances-in-software-programming-languages.html

    Software programming language core concepts

    How do you learn a programming language so you can be a programmer? No one wants to tell you that the best way to learn is on your own!

    https://blackliszt.com/2025/01/how-to-learn-to-be-a-computer-programmer.html

    One of the core features of all programming languages is the subroutine. Everyone who programs uses subroutines, but their power can be underappreciated.

    https://blackliszt.com/2021/03/fundamental-concepts-of-computing-subroutine-calls.html

    Among the earliest features added to languages were structures, blocks and macros. Each of these contributed to more concise and understandable programs.

    https://blackliszt.com/2021/09/software-programming-language-evolution-structures-blocks-and-macros.html

    Once you get past the basics of language, you find that widely used languages have rich collections of subroutine libraries and sometimes what are called frameworks, which contribute a large fraction of any program’s solution.

    https://blackliszt.com/2021/10/software-programming-language-evolution-libraries-and-frameworks.html

    The way programs flow is a key aspect.

    https://blackliszt.com/2022/06/flowcharts-and-workflow-in-software.html

    Programs consist of a combination of instructions and definitions of the data on which the instructions operate.  Exactly how those relate to each other has been the subject of major disputes – which continue to this day.

    https://blackliszt.com/2015/06/innovations-that-arent-data-definitions-inside-or-outside-the-program.html

Some programming "advances" basically try to restrict the possible relationships between code and data. Bad idea.

    https://blackliszt.com/2021/08/the-relationship-between-data-and-instructions-in-software.html

    Software programming language evolution

    When talking about software languages, the term high level language was introduced early. Here’s what it means.

    https://blackliszt.com/2020/10/what-is-high-about-a-high-level-language-in-software.html

Decades ago a couple of languages that were major steps ahead of the standard languages of the time in terms of productivity were invented and enjoyed an explosion of use.

    https://blackliszt.com/2020/11/software-programming-language-evolution-beyond-3gls.html

    Not long after these amazing, productivity-enhancing languages had their rapid growth, the modern DBMS came onto the scene, with its own language for data definition, query and update. The impact of the DBMS revolution was huge.

    https://blackliszt.com/2021/01/software-programming-language-evolution-impact-of-the-dbms-revolution.html

One of the major reactions to the DBMS revolution was the introduction and rise to prominence of what were called 4-GL’s, all of which incorporated the DBMS as an integral part.

    https://blackliszt.com/2021/02/software-programming-language-evolution-4gls-and-more.html

    There has been an explosion of programming language invention.

    https://blackliszt.com/2014/04/continents-and-islands-in-the-world-of-computers.html

    It just goes on and on and on and on, with no end in sight.

    https://blackliszt.com/2022/04/software-programming-language-cancer-must-be-stopped.html

    Software programming language innovations

    Experts tout the major advances in programming languages made since the early days, but they don’t amount to much.

    https://blackliszt.com/2020/09/software-programming-languages-50-years-of-progress.html

    A comparison to the virtues of various human languages makes this clear.

    https://blackliszt.com/2020/12/the-bronte-sisters-and-software.html

    Attempts have been made by experts to fix problems that have been widely noted for many decades.

    https://blackliszt.com/2021/09/software-programming-language-evolution-the-structured-programming-goto-witch-hunt.html

    One of the serious detours into unproductive swamps taken by programming languages has been what’s called object-orientation. It had a promising start and would have been fine had it stopped there.

    https://blackliszt.com/2021/11/the-promising-origins-of-object-oriented-programming.html

    Lots of people have things to say about O-O languages. There are the academics and other authorities and then there are the people with real experience who aren’t groupies.

    https://blackliszt.com/2022/01/object-oriented-software-languages-the-experts-speak.html

    Here are some facts and logic to explain why O-O should be referred to exclusively in the past tense.

    https://blackliszt.com/2022/03/why-object-orientation-in-software-is-bad.html

    O-O is so bad that people who actually know stuff take pleasure from mocking it:

    https://blackliszt.com/2022/05/making-fun-of-object-orientation-in-software-languages.html

Another category of hyped but harmful programming languages has been the “functional” ones. These arose early in programming evolution and keep popping up.

    https://blackliszt.com/2021/05/software-programming-language-evolution-functional-languages.html

    The attraction of functional language has a strong basis. It’s true that increasing the declarative aspect of a total solution brings strong benefits. But embedding it in the language neuters it.

    https://blackliszt.com/2021/07/software-programming-languages-the-declarative-core-of-functional-languages.html

    Here are some detailed examples of how supposed advances in programming languages have worked out in practice, including failures of Java and 4-GL’s and a huge success with a brand-new system written in that ancient bit of history called COBOL.

    https://blackliszt.com/2021/01/software-programming-language-evolution-credit-card-software-examples-1.html

    https://blackliszt.com/2021/02/software-programming-language-evolution-credit-card-software-examples-2.html

    https://blackliszt.com/2022/08/software-programming-language-evolution-credit-card-software-examples-3.html

After all those failed advances, there's the amazing case of HTML: a wildly successful language without all the bells and whistles, built on the winning principle of "just get the job done."

    https://blackliszt.com/2010/05/html-assembler-language-easy-to-use-tools.html

    Software programming language evaluation

    Promoters of new languages tout how they prevent programmers from making errors (ha!) and result in cleaner programs that somehow promote productivity – which is never measured or tested.

    The fact is, there is a standard by which programming languages should be judged – along with the support systems and metadata that are an integral part of the solution. It’s simple: how many places do you have to go to make a change? The right answer is one. Here’s the idea.

    https://blackliszt.com/2014/03/how-to-evaluate-programming-languages.html

     

  • Software Programming Language Evolution: Credit Card Software Examples 3

    Credit card systems are among the earlier major enterprise software systems written. The early systems were written in assembler language or COBOL. If programming languages really did get more powerful and advanced, you would think that a wave of re-writes would have transformed the industry as card systems written in creaky old languages were streamlined and turbo-charged by being written in more modern languages. Generations of industry executives and technical experts and leaders have thought exactly this.

    The earlier posts in this series have described two major such efforts that ended in face-plants, and other efforts that illustrate the ongoing power and productivity of supposedly decrepit approaches. In this post, I'll describe a couple not-famous examples of advances that actually took place.

Clarity Payment Solutions and TSYS: the first version in Java

    In the late 1950's a group inside a local bank, Columbus Bank and Trust in Georgia, started building a system for processing credit cards. The division went public and eventually became known as TSYS, Total Systems, which is now one of the world's major card processing companies.

    During the early 2000's a special kind of card with limited functionality called a pre-paid debit card started to be used. Unlike a credit card, which you use to make charges and then later pay it off or use revolving credit (called "pay later"), a prepaid debit card is just what it sounds like: you first pay in some money and then can make charges using the card until the amount you put in runs out (called "pay before"). This kind of card is vastly easier to implement in software than a credit card, but is still complicated because of all the bank and card interfaces. See this for more.

    Meanwhile a small company called Clarity Payment Solutions had created a working prepaid card system. The technical founder of the company had bought into the rhetoric around the Java enterprise Object-oriented language, and had constructed the code using it. What everyone believed was that basing a program on objects would make it tremendously more flexible and easier to change than using traditional languages. Objects were thought of as being like Lego blocks with super powers, enabling you to pick out the ones you like and piece them together easily. A feature called inheritance promised the ability to make minor changes without much effort and no side effects.

    The executives at TSYS needed to get into the rapidly growing market for prepaid debit cards. They put some effort into having their staff modify their internal systems to meet the need but weren't getting anywhere fast. When they encountered Clarity Payments they were pretty happy — it's what we need! It's already working in the market! And best of all, it's written in Java, so we'll be able to make changes easily without all the trouble of systems written in prehistoric languages like the one we're stuck with! They bought the company.

    The technical leader of Clarity was sobered by his experience of writing the software for prepaid debit. It was a lot harder than he thought it would be, and the ease of making changes because of the object orientation of Java proved to be little but hollow rhetoric. He had proven to himself that it was all b.s. through hard personal experience, learning what others have learned. He now had years of practical experience building a production system to make clear to him what the real obstacles were. He was glad to sell off the company and not have to struggle with the code any more. TSYS was welcome to it!

    TxVia and Google: the second version in Java with help

    The technical founder of Clarity set about building the software infrastructure he would have liked to have had when building the Clarity code in the first place. Java by itself didn't solve the problems. It needed lots of help, and he was going to build the system that would give it help. This is a response that some smart programmers have when they get their noses rubbed in the broken promises of some new programming fad. It led this fellow to build a system that took a significant step up the hierarchy of abstraction, as I describe here. Java would remain the core language, but given a huge practical boost by having the ability to make changes built in at critical points using a kind of workbench approach with cleverly chosen "user exits" to enable safe customization. Eventually he turned the power of his new system to building prepaid debit card software, whose requirements he understood so well. Reconnecting with his old business partners, they went into the same market again and started getting some real business.

Meanwhile a team at Google had been working on the same problem. They wanted a Google implementation of prepaid debit card functionality for the new Google Wallet. The leaders took a look at prepaid debit card functionality and immediately felt it was no big deal. It's nothing but putting money into an account and checking withdrawals to make sure there was enough money. Adding and subtracting and a few interfaces. No biggie. But just to be safe they assembled a crack team of nearly 100 Google-level geniuses and put them on the job. They used languages and tools that were generations ahead of standard Java.

    A year later they still had nothing working. It turned out to be harder than they had thought, even with the astounding power and flexibility of Google software resources. When one of the leaders heard about TxVia, he insisted on giving them a look. A group of Googlers came to the TxVia offices and threw down the specs of what they were trying to build on the table. They sneered, we heard you guys were real smart; our managers tell us you've gotten a lot further than we have. Sure. If you're so smart, prove it by making a system like what's in this document work.

Shortly after, the TxVia team came back with — a system that met Google's requirements. The same requirements the Google team had spent a year failing to meet. Skipping over the emotions of everyone involved, which pretty much ran the gamut from embarrassed to ashamed to in-denial to exultant to you-can-imagine, Google bought the company. It became a key part of Google Wallet.

    This was a clear demonstration that Java or other modern languages are NOT the determining factor in programmer productivity and software effectiveness.

    The Paysys Corecard system and Apple

    I told the story of my time as CTO at Paysys in the late 1990's here, including the sale of its millions of lines of COBOL code to First Data. I gave details about how Paysys became a powerful player in the market for credit card software by increasing their breadth of automation here.

    While I was CTO, after studying the COBOL code in detail with lots of help from the people who had written it, I came up with a way to re-create the system's functionality. The idea was to implement a core set of concepts written in a small amount of abstract code and then build extensive metadata as needed to support the product’s existing functionality and more. I wasn’t fully aware of it at the time, but I used the method described here and in the linked posts. It was written in C++ (mostly the C subset) and ran of a network of servers. One of the big national consulting groups ran tests that verified it could handle tens of millions of cards with linear growth. In addition a team at First Data in Omaha ran the code and modified the metadata to make it match the functionality of their existing system written in assembler language. The trouble they had modifying the assembler language to handle a variety of requirements already met by the Paysys COBOL code was the main reason they were buying Paysys. They decided they would really like to have the new code as well.

    While his team urged the CEO to include the new code in the purchase, he decided he didn’t need it, and kept it out of the deal. The COBOL code solved immediate problems like supporting cards in Japan, and who cared what a bunch of nameless programmers babbled about the speed of making future changes?

    Thus it happened that when First Data bought the Paysys VisionPLUS COBOL code in the year 2001, the new metadata-based system was left out of the deal and stayed with the remainder company, now called Corecard.

Years went by. Some leading people began to notice Corecard because they could make it do unanticipated things much more quickly and easily than with normal procedural systems. Then Apple decided to get into the credit card issuing business. Not the cheap and easy pre-paid kind, but the full-featured, tough credit card kind. They took their requirements to the usual suspects, who gave them the usual lengthy go-live times with the usual astronomical custom programming fees. Somehow they talked with Goldman Sachs, which had connected with Corecard through a small, adventurous group. They could get the job done, quickly and efficiently, when no one else in the industry could come close. A deal got done and the Apple card came out quickly, doing everything Apple wanted. And scaled quickly.

Ordinary parameter-driven customization could not have accomplished this. The TxVia workbench approach didn't have even a fraction of the functionality needed. Only a software system that went far beyond the capabilities of procedural languages of any generation could have met the challenge. In the end, languages of ANY generation can only do so much. They're like prop planes. If you want to go REALLY fast you need a rocket engine, and that's what meta-data-based systems are.

    One of the sobering lessons here is the very basic human one: No one will want a rocket engine unless they're trying to build a rocket. If you build a rocket engine, however powerful it may be, people will look at it, scratch their heads, express mild amazement, but walk away — they don't need it. And won't until they decide they want to build a rocket.

    Conclusion

    These real-life examples demonstrate the limits of normal procedural languages, no matter how modern and fancy. They demonstrate how taking even small steps up the ladder of abstraction can yield amazing gains, as they did for Google, at least after they bought TxVia. And finally they demonstrate that there's a whole quantum leap further you can go to meet software requirements beyond what procedural languages alone can handle — but no one will want to buy them until they have a problem that nothing else can solve.

  • Flowcharts and Workflow in Software

    The concept of workflow has been around in software from the beginning. It is the core of a great deal of what software does, including business process automation. Workflow is implicitly implemented in most bodies of software, usually in a hard-coded, ad-hoc way that makes it laborious and error-prone to implement, understand, modify and optimize. Expressing it instead as editable declarative metadata that is executed by a small body of generic, application-independent code yields a huge increase in productivity and responsiveness. It also enables painless integration of ML and AI. There are organizations that have done exactly this; they benefit from massive competitive advantage as a result.
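To make the claim concrete, here is a minimal, hypothetical sketch (not any particular product) of a workflow expressed as editable declarative metadata, executed by a small body of generic, application-independent code. All names below are invented for illustration:

```python
# The workflow itself is plain, editable data -- no application logic here.
WORKFLOW = {
    "start":   {"action": "receive",          "next": "check"},
    "check":   {"action": "validate",         "if_ok": "approve", "if_bad": "reject"},
    "approve": {"action": "archive",          "next": None},
    "reject":  {"action": "return_to_sender", "next": None},
}

def run(workflow, document, actions):
    """Generic engine: walks the metadata, knowing nothing about the application."""
    step, trail = "start", []
    while step is not None:
        node = workflow[step]
        result = actions[node["action"]](document)
        trail.append(step)
        # Either an unconditional next step, or a branch on the action's result.
        step = node["next"] if "next" in node else (
            node["if_ok"] if result else node["if_bad"])
    return trail

# Application-specific actions are plugged in from outside the engine.
actions = {
    "receive":          lambda d: True,
    "validate":         lambda d: d.get("amount", 0) > 0,
    "archive":          lambda d: True,
    "return_to_sender": lambda d: True,
}

print(run(WORKFLOW, {"amount": 50}, actions))  # → ['start', 'check', 'approve']
print(run(WORKFLOW, {"amount": 0}, actions))   # → ['start', 'check', 'reject']
```

Changing the process means editing the metadata dictionary, not rewriting the engine — which is the source of the productivity gain described above.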

    Let’s start with some basics about flowcharts and workflow.

    Flowcharts

    Flowcharts pre-date computers. The concept is simple enough, as shown by this example from Wikipedia:

[Flowchart example from Wikipedia: steps to fix a lamp]

    The very earliest computer programs were designed using flowcharts, illustrated for example in a document written by John von Neumann in 1947. The symbols and methods became standardized. By the 1960’s software designers used templates like this from IBM

[IBM flowcharting template]

    to produce clean flowcharts in standardized ways.

    Flowcharts and Workflow

Flowcharts as a way to express workflows have been around for at least a century. Workflows are all about repeatable processes, for example in a manufacturing plant. People would systematize a process in terms of workflow in order to understand and analyze it. They would create variations to test whether the process could be improved. The starting motivation would often be consistency and quality. Then it would often shift to process optimization – reducing the time and cost and improving the quality of the results. Some of the early work in Operations Research was done to optimize processes.

    Workflow is a natural way to express and help understand nearly any repeatable process, from manufacturing products to taking and delivering orders in a restaurant. What else is a repeatable process? A computer program is by definition a repeatable process. Bingo! Writing the program may take considerable time and effort, just like designing and building a manufacturing plant. But once written, a computer program is a repeatable process. That’s why it made sense for the very earliest computer people like John von Neumann to create flowcharts to define the process they wanted the computer to perform repeatedly.

    What’s in a Flowchart?

    There are different representations, but the basic work steps are common sense:

    • Get data from somewhere (another program, storage, a user)
    • Do something to the data
    • Test the data, and branch to different steps depending on the results of the test
    • Put the data somewhere (another program, storage, a user)
    • Lots of these work steps are connected in a flow of control

    This sounds like a regular computer software program, right? It is! When charted at the right level of detail, the translation from a flowchart to a body of code is largely mechanical. But humans perform this largely mechanical task, and get all wrapped up in the fine details of writing the code – just like pre-industrial craftsmen did.
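Here is how mechanical that translation can be: a toy Python sketch (names and numbers invented for illustration) that follows the flowchart work steps above one-for-one:

```python
def process(values):           # Get data from somewhere (here, the caller)
    total = sum(values)        # Do something to the data
    if total > 100:            # Test the data, and branch on the result
        label = "large"
    else:
        label = "small"
    return (label, total)      # Put the data somewhere (return it to the caller)

print(process([40, 70]))   # → ('large', 110)
print(process([1, 2, 3]))  # → ('small', 6)
```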

    Hey, that's not just a metaphor — it is literally true! The vast, vast majority of software programming is done in a way that appears from the outside to be highly structured, but in fact is designing and crafting yet another fine wood/upholstery chair (each one unique!) or, for advanced programmers, goblets and plates made out of silver for rich customers.

    Workflow

In the software world, workflow in general has been a subject of varying interest from the beginning. It can be applied to any level of detail. It has led to all sorts of names and even what amount to fashion trends. There is business process management. Business process engineering. And re-engineering. And business process automation. A specialized version of workflow is simulation software, which led early programmers to invent what came to be called "object-oriented programming." To see more about this ongoing disaster that proved to be no better for simulating systems than it has been for software in general, see this.

    When document image processing became practical in the 1980’s, the related term workflow emerged to describe the business process an organization took to process a document from its arrival through various departments and finally to resolution and archiving of the document. The company that popularized this kind of software, Filenet, was bought by IBM. I personally wrote the workflow software for a small vendor of document image processing software at that time.

    Workflow in practice

    There has been lots of noise about what amounts to workflow over the years, with books, movements and trends. A management professor in the 1980's talked about how business processes could be automated and improved using business process re-engineering. He said that each process should be re-thought from scratch — otherwise you would just be "paving the cow paths," instead of creating an optimal process. As usual, lots of talk and little action. Here's the story of my personal involvement in such a project in which the people in charge insisted they were doing great things, while in fact they were spending lots of money helping the cows move a bit faster than they had been.

    The Potential of Workflow

    The potential of workflow can be understood in terms of maps and driving from one place to another. I've explained the general idea here.

    Most software design starts with the equivalent of figuring out a map that shows where you are and where you want to get to. Then the craftsmanship begins. You end up with a hard-coded set of voluminous, low-level "directions" for driving two blocks, getting in the left lane, turning left, etc.

    When the hard-coded directions fail to work well and the complaints are loud enough, the code is "enhanced," i.e., made even more complex, voluminous and hard to figure out by adding conditions and alternative directions.

    Making the leap to an online, real-time navigation system is way beyond the vast majority of software organizations. You know, one that takes account of changes, construction, feedback from other drivers on similar routes about congestion, whether your vehicle has a fee payment device installed, whether your vehicle is a truck, and so on. Enhancements are regularly made to the metadata map and to the ML/AI direction algorithms, which are independent of the map details.
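    The difference can be sketched in a few lines of code (the map and place names here are invented for illustration): hard-coded directions bake the route into the program, while the metadata-driven version keeps the map as data and lets one small, unchanging algorithm (a plain breadth-first search in this sketch) find routes over whatever map it's given.

```python
from collections import deque

# Hard-coded "directions": any change to the roads means editing code.
def directions_home_to_store():
    return ["drive two blocks", "get in the left lane", "turn left", "arrive"]

# Metadata-driven alternative: the map lives in data; the routing
# algorithm below never changes when the map does.
ROAD_MAP = {  # hypothetical map metadata
    "home":   ["corner"],
    "corner": ["home", "bridge", "store"],
    "bridge": ["corner", "store"],
    "store":  [],
}

def route(road_map, start, goal):
    """Find a shortest route through whatever map we are given."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in road_map.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no route exists in this map

print(route(ROAD_MAP, "home", "store"))  # ['home', 'corner', 'store']
```

    Changing the roads means editing `ROAD_MAP`, not the routing code — which is the whole point of moving knowledge into metadata.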

    When software stays at the level of craftsmanship, you're looking at a nightmare of spaghetti code. Your cow paths aren't just paved — they have foundations with top-grade rebar, concrete and curbs crafted of marble.

    Conclusion

    Metadata-driven workflow is the next step beyond schema enhancement for building automated systems to perform almost any job. It's a proven approach that many organizations have deployed — literally for decades. But all the leaders of computing, including Computer Science departments at leading universities, remain obsessed with subjects that are irrelevant to the realities of building software that works; instead they stay focused on the wonders of craftsman-level low-level software languages. It's a self-contained universe where prestige is clearly defined and has nothing to do with the eternal truths of how optimal software is built.

     

  • Software Programming Language Cancer Must be Stopped!

    Human bodies can get the horrible disease of cancer. Software programming languages frequently suffer from an analogous disease, software cancer, with similarly horrible results.

    There are many kinds of cancer, impacting different parts of the body and acting in different ways. They all grow without limit and eventually kill the host. Worse, most cancers can metastasize, i.e., navigate to a different part of the body and start growing there, spreading the destruction and speeding the drive towards death.

    Software cancer impacts software languages in similar ways. Once a software programming language has been created and used, enthusiasts decide that the language should have additional features, causing the language to grow and increase in complexity. The language grows and grows, like a cancer. Then some fan of the language, inspired by it in some strange way, thinks a brand-new language must be created, derived from the original but different. Thus the original language evolves into a new language, which then itself tends to have cancerous growth.

    Like cancer in humans, programming language cancer leaves a trail of death and destruction in the software landscape. We must find a way to stop this cancer and deny its self-promoting lies that it’s “improving” the language it is destroying.

    Programming language origins and growth

    All computers have a native machine language that controls how they work. The language is in all cases extremely tedious for humans to use. Solutions for the tedium were invented in the early days of computing, which enabled programmers using the new languages to think more rapidly and naturally about the data they read, manipulated and put somewhere.
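    A toy contrast makes the tedium concrete (pretend memory in Python, not real machine code): at the machine level, data lives at numbered addresses and every step is spelled out; a higher-level language lets you name the places that hold data.

```python
# "Machine level": data lives at numbered addresses; every step is explicit.
memory = [0] * 8                      # a tiny pretend memory
memory[0] = 3                         # put 3 into address 0
memory[1] = 4                         # put 4 into address 1
memory[2] = memory[0] + memory[1]     # add addresses 0 and 1, store in address 2

# A higher-level language lets you name the places instead:
price = 3
tax = 4
total = price + tax                   # the same computation, readably

print(memory[2], total)  # 7 7
```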

    Each of the new languages was small and primitive when it was “born.” As the youthful language tried getting somewhere, it struggled to first crawl, then stand with help and finally to walk and run. Growth in the early years was natural and led to good results. Once each new language reached maturity, however, cancer in its various forms began to set in, causing the language to grow in weight and correspondingly to lose strength, agility and overall health.

    I have described the giant early advances in languages, which reached maturity with the invention of high-level languages. After early maturity, a couple of small but valuable additions were made to languages to enhance clarity of intention.

    The ability to create what amounts to “habits” (frequently used routines) was an important part of the language maturation process. The more valuable of these routines were added to libraries so that any new program that needed them could use them with very little effort. A couple of valuable languages were created that went beyond 3-GLs, languages that were both popular and highly productive. It’s a peculiarity of programming language evolution that these languages didn’t become the next-generation mainstream.

    That should have been pretty much it! You don’t need a new language to solve a new problem! Or an old problem.

    Languages exhibit cancerous growth

    In the early days of languages, it made sense that they didn’t emerge as full-grown, fully-capable “adults.” But after a few growth spurts, languages reached maturity and were fully capable of taking on any task – as shown, for example, by the huge amounts of COBOL performing mission-critical jobs in finance, government and elsewhere, and by the fact that the vast, vast majority of web servers run on Linux, written in plain old C. The official language definitions in each case have undergone cancerous growth, ignored by nearly everyone sensible. For example, newer versions of COBOL incorporate destructive object-oriented features. Of course it’s the fanatics who get themselves onto language standardization committees and collaborate with each other to get useless but distracting jots and tittles added that endlessly complicate the language, making it harder to read, write and maintain.

    Languages metastasize

    There is plain old ordinary cancer, in which language cultists get passionate about important “improvements” that need to be made to a language. Then there are the megalomaniac language would-be-gurus who decide that some existing language is too flawed to improve and needs full-scale re-creation. Those are the august new-language creators, who make up some excuse to create a “new” language, which invariably takes off from some existing language. This has led to hundreds of “major” languages and literally thousands of others that have been invented and shepherded into existence by their ever-so-proud creators. Most such language "inventors" like to ignore the origins of their language, emphasizing its creativity and newness.

    Someone might say they’ve “invented” a language, but the reality is that the invention is always some variation on something that exists. In some cases the variation is made explicit, as it was with the verbose and stultifying variation of C called C++, which hog-tied the clean C language with a variety of productivity-killing object-oriented features. And then went on to grow obese with endless additions.

    Purpose-driven programming language cancers

    There is no unifying theme among the cancers. But high on the list is the urge to somehow improve programmer productivity and reduce error by inventing a language with features that will supposedly accomplish those goals. Chief among these are the object-oriented languages, which have themselves metastasized into endless competing forms. Did you know that using a good OO language like Java results in fewer bugs? Hey, I've got this bridge to sell, real cheap! Functional languages keep striving to keep up with the OO crew in creating the most confining, crippling languages possible. It's a close race!

    The genealogy of programming languages

    Everyone who studies programming languages sees that there are relationships between any new language and its predecessors. When you look at the tree of language evolution, it’s tempting to compare it to the tree of biological evolution, with more advanced species evolving from earlier, less advanced ones. Humanoids can indeed do much more than their biological ancestors.

    That’s what the “parents” of the new languages would have you believe. Pfahh!

    I have described the explosive growth of programming languages and some of the pointless variations. But somehow programmers felt motivated to invent language after language, to no good end. Just as bad, programmers decided that existing languages needed endless new things added to them, often copying things from other languages in a crazed effort to “keep up,” I guess.

    Various well-intentioned efforts were made to prove the wonderfulness of the newly invented languages by using them to re-write existing systems. These efforts have largely failed, demonstrating the pointlessness of the new languages. There was a notable success, though: a major effort that re-wrote a production credit card system from assembler language into supposedly bad, old COBOL!

    How to stop language cancer

    Unless we want to continue the on-going cancerous growth and metastasizing of software languages, we need to … cure the cancer! Just STOP! That's easy to say when a tiny minority of crazed programmers around the globe, without enough useful work to keep them from causing trouble, keeps driving the cancer. There is a solution, though.

    The first and most important part of the solution is Science. You know, that thing whose many results, along with effective engineering, created the devices on which we use software languages. Software is very much a pre-scientific discipline. There isn't even a way to examine evidence to decide whether one language is better than another. What is called "Computer Science" isn't scientific or even practical, as a comparison to medical science makes clear.

    The second path to a solution is to focus on status in software. Today, software people gain status in peculiar ways; usually the person with the greatest distance between their work and the real people who use software has the highest status. A language "inventor" is about as far as you can get from real people using the results of software efforts. The sooner people contributing to software cancer are seen as frivolous time-wasters, the better off everyone will be.

    What's the alternative to language cancer?

    The most important alternative is to cure it, as expressed above. The most productivity-enhancing effort is to focus instead on libraries and frameworks, which are the proven-in-practice way to huge programmer productivity gains. The "hard" stuff you would otherwise have to program is often available, ready to go, in libraries and frameworks. They are amazing.
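    One tiny example of the leverage: parsing JSON text into data structures by hand would be pages of fiddly code, but a standard library (Python's here, used purely as an illustration) does it in one call.

```python
import json

# The "hard" work, a full JSON parser, is already written, tested,
# and waiting in the standard library; using it takes one line.
config_text = '{"retries": 3, "hosts": ["a.example", "b.example"]}'
config = json.loads(config_text)

print(config["retries"])   # 3
print(config["hosts"][0])  # a.example
```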

    Finally, focusing on the details of language is staying fixed at the lowest level of program abstraction, like continuing to try to make arithmetic better when you would be worlds better off moving up to algebra.

    Conclusion

    Software language cancer is real. It's ongoing. The drivers of software language cancer continue to fuel it by honoring those who contribute to it instead of giving them the scorn they so richly deserve. Software would be vastly better off without this horrid disease.

  • Why Object-Orientation in Software is Bad

    What?? Object-oriented programming (OOP) is practically the standard in software! It’s taught everywhere and dominates thinking on the subject. Most languages are O-O these days, and OO features have even been added to COBOL! How can such a dominant, mainstream thing be bad?

    The sad truth is that the badness of OOP isn’t some fringe conspiracy theory. An amazing line-up of astute, brilliant people agree that it’s bad. A huge collection of tools and techniques have been developed and taught to help people overcome its difficulties, which nonetheless persist. Its claims of virtue are laughable – anyone with experience knows the benefits simply aren’t there.

    Object-oriented languages and novels

    Object-orientation is one of those abstruse concepts that makes no sense to outsiders and is a challenge for people learning to program to understand and apply. To make the OOP monstrosity clear, let’s apply OOP thinking to writing a novel.

    There are lots of ways of writing novels, each of them suitable for different purposes. There are novels dominated by the omniscient voice of the author. There are others that are highly action-based. Others have loads of dialog. Of course most novels mix these methods as appropriate.

    Some novels feature short chapters, each of which describes events from a particular character’s point of view. There aren’t many novels like this, but when you want to strongly convey the contrast between the characters’ experiences, it’s a reasonable technique to use, at least for a few chapters.

    What if this were the ONLY way you were allowed to write a novel??!! What if a wide variety of work-arounds were developed to enable a writer to write – exclusively! — with this sometimes-effective but horribly constricting set of rules?

    What if … the Word Processors (like Microsoft Word) from major vendors were modified so that they literally wouldn't allow you to write in any other way, instead of giving you the freedom to construct your chapters any way you wanted, with single-person-point-of-view as one of many options. What if each single small deviation from that discipline that you tried to include were literally not allowed by the Word Processor itself!? All this because the powerful authorities of novel creation had decided that single-person chapters were the only good way to write novels, and that novelists couldn't be trusted with tools that would allow them to "make mistakes," i.e., deviate from the standard.

    There would be a revolution. Alternative publishing houses would spring up to publish the great novels that didn’t conform to the object-novel constraints. The unconstrained books would sell like crazy, the OO-only publishing houses would try to get legislation passed outlawing the unconstrained style of writing, and after some hub-bub, things would go back to normal. Authors would exercise their creative powers to express stories in the most effective ways, using one or several techniques as made sense. The language itself would not be limited or limiting in any way.

    Sadly, the world of software works in a very different way. No one sees the byzantine mess under the “hood” of the software you use. No one knows that it could have been built in a tiny fraction of the time and money that was spent. Industry insiders just accept the systematized dysfunction as the way things are.

    This is objects – a special programming technique with narrow sensible application that has been exalted as the only way to do good programming, and whose rules are enforced by specialized languages only capable of working in that constrained way.

    Is there nothing good about OOP?

    It isn’t that OOP is useless. The concept makes sense for certain software problems – just as completely different, non-OOP concepts make sense for other software problems! A good programmer has broad knowledge and flexible concepts about data, instructions and how they can be arranged. You fit the solution to the problem and evolve it as your understanding of the problem grows, rather than starting with a one-size-fits-all template and jamming it on. You would almost never have good reason to write a whole program in OO mode – only the parts of it for which the paradigm made sense.

    For example, it makes sense to store all the login and security information about a body of software in a single place and to have a dedicated set of procedures that are the only ones to access and make changes. This is pure object-orientation – only the object’s methods access the data. But writing the whole program in this way? You're doing nothing but conforming to an ideology that makes work and helps nothing.
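    Here's a minimal sketch of that idea in plain procedural code (the names are invented for illustration): the security data lives in one place, and only a small set of dedicated functions ever touches it. That's the encapsulation idea, with no object-oriented language required.

```python
# The single, private store of login data; by convention, only the
# functions below read or change it.
_accounts = {}

def register(user, password_hash):
    """Create a new account record in the one private store."""
    if user in _accounts:
        raise ValueError("user already exists")
    _accounts[user] = {"hash": password_hash, "locked": False}

def check_login(user, password_hash):
    """Return True only for an existing, unlocked account with a matching hash."""
    record = _accounts.get(user)
    return bool(record) and not record["locked"] and record["hash"] == password_hash

def lock(user):
    """Disable logins for an account without deleting it."""
    _accounts[user]["locked"] = True

register("ann", "hash-1")
print(check_login("ann", "hash-1"))  # True
lock("ann")
print(check_login("ann", "hash-1"))  # False
```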

    However. When you embody OOP in a language as the exclusive way of relating data and code, you’re screwed.

    In this post I describe the sensible origins of object-orientation for describing physical simulations, for example ships in a harbor. Having a whole language to do it was overkill – I describe in the post how hard-coding the simulation in language statements made it hard to extend and modify, instead of moving the model description into easily editable metadata — and then into a provably best optimization model.

    That is the core problem with object-oriented languages – they are a hard-coded solution to part of a programming problem, rather than one which creates the most efficient and effective relationships between instructions and data and then increasingly moves up the mountain of abstraction, each step making the metadata model more powerful and easier to change. Object-oriented concepts are highly valuable in most metadata models, with things like inheritance (even multiple inheritance, children able to override an inherited value, etc.) playing a valuable role. Keeping all the knowledge you have about a thing in one place and using inheritance to eliminate all redundancy from the expression of that knowledge is incredibly valuable, and has none of the makes-things-harder side effects you suffer when the object-orientation is hard-coded in a language. In the case of simulation, for example, the ultimate solution is optimization – getting to optimization from object-oriented simulation is a loooong path, and the OOP hard-coding will most likely prevent you from even making progress, much less getting there.
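    A tiny sketch of what inheritance-in-metadata can look like (record names invented for illustration): each record may name a parent, lookups walk the chain, and a child's value overrides the inherited one, so each fact is stated exactly once.

```python
# Inheritance living in metadata rather than in a language: records
# name their parent, and lookups inherit values up the chain.
RECORDS = {
    "vehicle": {"parent": None,      "wheels": 4, "powered": True},
    "truck":   {"parent": "vehicle", "wheels": 6},                  # overrides wheels
    "bicycle": {"parent": "vehicle", "wheels": 2, "powered": False},
}

def lookup(records, name, field):
    """Return a field's value, inheriting from ancestors when absent."""
    while name is not None:
        record = records[name]
        if field in record:
            return record[field]
        name = record["parent"]
    raise KeyError(field)

print(lookup(RECORDS, "truck", "wheels"))    # 6 (overridden)
print(lookup(RECORDS, "truck", "powered"))   # True (inherited from vehicle)
print(lookup(RECORDS, "bicycle", "powered")) # False (overridden)
```

    The inheritance behavior here is a property of the data, editable at any time, rather than something frozen into compiled class definitions.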

    Conclusion

    Any reasonable programmer should be familiar with the concepts of encapsulation, inheritance and the other features of object-oriented languages. Any reasonable programmer can use those concepts and implement them to the extent that it makes sense, using any powerful procedural language, whether in the program itself or (usually better) the associated metadata. But to enforce that all programs be written exclusively according to those concepts by embedding the concepts in the programming language itself is insanity. It's as bad as requiring that people wear ice skates, all the time and every day, because ice skates help you move well on ice when you know how to use them. If everything were ice, maybe. But when you try to run a marathon or even climb a hill with ice skates on, maybe you can do it, but everyone knows that trading the skates for running shoes or hiking boots would be better. Except in the esoteric world of software, where Experts with blinders on declare that ice skates are the universal best solution.

  • Object-Oriented Software Languages: The Experts Speak

    On the subject of Object-Oriented Programming (OOP), there are capital-E Experts, most of academia and the mainstream institutions, and there are small-e experts, which include people with amazing credentials and accomplishments. They give remarkably contrasting views on the subject of OOP. Follow the links for an overview, analysis and humor on the subject.

    The Exalted Experts on OOP

    Here is the start of the description of Brown's intro course to Computer Science, making it clear that "object-oriented design and programming" is the foundational programming method, and Java the best representation language:

    Brown intro

    Here's their description of OOP, making it clear that there are other ways to program, specifically the nearly-useless functional style, never used in serious production systems.

    Brown OOP

    See below to see what Dr. Alan Kay has to say about Java.

    Here is what the major recruiting agency Robert Half has to say on the subject:

    Object-oriented programming is such a fundamental part of software development that it’s hard to remember a time when people used any other approach. However, when object-oriented programming, or OOP, first appeared in the 1980s, it was a radical leap forward from the traditional top-down method.

    These days, most major software development is performed using OOP. Thanks to the widespread use of languages like Java and C++, you can’t develop software for mobile unless you understand the object-oriented approach. The same goes for web development, given the popularity of OOP languages like Python, PHP and Ruby.

    It's clear: OOP IS modern programming. Except maybe some people who like functional languages.

    The mere experts on OOP

    We get some wonderful little-e expert witness from here.

    “Implementation inheritance causes the same intertwining and brittleness that have been observed when goto statements are overused. As a result, OO systems often suffer from complexity and lack of reuse.” – John Ousterhout, Scripting, IEEE Computer, March 1998

    “Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function.” – John Carmack

    OO is the “structured programming” snake oil of the 90's. Useful at times, but hardly the “end all” programming paradigm some like to make out of it.

    And, at least in its most popular forms, it can be extremely harmful and dramatically increase complexity.

    Inheritance is more trouble than it’s worth. Under the doubtful disguise of the holy “code reuse” an insane amount of gratuitous complexity is added to our environment, which makes necessary industrial quantities of syntactical sugar to make the ensuing mess minimally manageable.

    More little-e expert commentary from here.

    Alan Kay (1997)
    The Computer Revolution hasn’t happened yet
    “I invented the term object-oriented, and I can tell you I did not have C++ in mind.” and “Java and C++ make you think that the new ideas are like the old ones. Java is the most distressing thing to happen to computing since MS-DOS.” (proof)

    Paul Graham (2003)
    The Hundred-Year Language
    “Object-oriented programming offers a sustainable way to write spaghetti code.”

    Richard Mansfield (2005)
    Has OOP Failed?
    “With OOP-inflected programming languages, computer software becomes more verbose, less readable, less descriptive, and harder to modify and maintain.”

    Eric Raymond (2005)
    The Art of UNIX Programming
    “The OO design concept initially proved valuable in the design of graphics systems, graphical user interfaces, and certain kinds of simulation. To the surprise and gradual disillusionment of many, it has proven difficult to demonstrate significant benefits of OO outside those areas.”

    Jeff Atwood (2007)
    Your Code: OOP or POO?
    “OO seems to bring at least as many problems to the table as it solves.”

    Linus Torvalds (2007)
    this email
    “C++ is a horrible language. … C++ leads to really, really bad design choices. … In other words, the only way to do good, efficient, and system-level and portable C++ ends up to limit yourself to all the things that are basically available in C. And limiting your project to C means that people don’t screw that up, and also means that you get a lot of programmers that do actually understand low-level issues and don’t screw things up with any idiotic “object model” crap.”

    Oscar Nierstrasz (2010)
    Ten Things I Hate About Object-Oriented Programming
    “OOP is about taming complexity through modeling, but we have not mastered this yet, possibly because we have difficulty distinguishing real and accidental complexity.”

    Rich Hickey (2010)
    SE Radio, Episode 158
    “I think that large object-oriented programs struggle with increasing complexity as you build this large object graph of mutable objects. You know, trying to understand and keep in your mind what will happen when you call a method and what will the side effects be.”

    Eric Allman (2011)
    Programming Isn’t Fun Any More
    “I used to be enamored of object-oriented programming. I’m now finding myself leaning toward believing that it is a plot designed to destroy joy. The methodology looks clean and elegant at first, but when you actually get into real programs they rapidly turn into horrid messes.”

    Joe Armstrong (2011)
    Why OO Sucks
    “Objects bind functions and data structures together in indivisible units. I think this is a fundamental error since functions and data structures belong in totally different worlds.”

    Rob Pike (2012)
    here
    “Object-oriented programming, whose essence is nothing more than programming using data with associated behaviors, is a powerful idea. It truly is. But it’s not always the best idea. … Sometimes data is just data and functions are just functions.”

    John Barker (2013)
    All evidence points to OOP being bullshit
    “What OOP introduces are abstractions that attempt to improve code sharing and security. In many ways, it is still essentially procedural code.”

    Lawrence Krubner (2014)
    Object Oriented Programming is an expensive disaster which must end
    “We now know that OOP is an experiment that failed. It is time to move on. It is time that we, as a community, admit that this idea has failed us, and we must give up on it.”

    Asaf Shelly (2015)
    Flaws of Object Oriented Modeling
    “Reading an object oriented code you can’t see the big picture and it is often impossible to review all the small functions that call the one function that you modified.”

    Here is Wiki's take on issues with OOP. It goes into detail.

    Here is Linus Torvalds' take on object-oriented C++. Linus is merely the creator and leader of the open-source software that fuels the vast majority of the web.

    More details:

    Essay by Joe Armstrong. "After its introduction OOP became very popular (I will explain why later) and criticising OOP was rather like “swearing in church”. OOness became something that every respectable language just had to have."

    A talk given at an OOP conference by an OOP supporter who lists 10 things he hates.

    A Stanford guy telling his evolution to OOP and then out of it. Lots of detail.

    A professional who gradually realized there were issues with objects.

    I have therefore been moving away from the object-oriented development principles that have made up the bulk of my 17 year career to date. More and more I am beginning to feel that objects have been a diversion away from building concise, well structured and reusable software.

    As I pondered on this topic, I realised that this isn’t a sudden switch in my thinking. The benefits of objects have been gradually declining over a long period of time.

    A detailed explanation of how the noun-centricity of OO languages perverts everything. Here is an extended quote from the start of this brilliant essay:

    All Java people love "use cases", so let's begin with a use case: namely, taking out the garbage. As in, "Johnny, take out that garbage! It's overflowing!"

    If you're a normal, everyday, garden-variety, English-speaking person, and you're asked to describe the act of taking out the garbage, you probably think about it roughly along these lines:

    get the garbage bag from under the sink
    carry it out to the garage
    dump it in the garbage can
    walk back inside
    wash your hands
    plop back down on the couch
    resume playing your video game (or whatever you were doing)


    Even if you don't think in English, you probably still thought of a similar set of actions, except in your favorite language. Regardless of the language you chose, or the exact steps you took, taking out the garbage is a series of actions that terminates in the garbage being outside, and you being back inside, because of the actions you took.

    Our thoughts are filled with brave, fierce, passionate actions: we live, we breathe, we walk, we talk, we laugh, we cry, we hope, we fear, we eat, we drink, we stop, we go, we take out the garbage. Above all else, we are free to do and to act. If we were all just rocks sitting in the sun, life might still be OK, but we wouldn't be free. Our freedom comes precisely from our ability to do things.

    Of course our thoughts are also filled with nouns. We eat nouns, and buy nouns from the store, and we sit on nouns, and sleep on them. Nouns can fall on your head, creating a big noun on your noun. Nouns are things, and where would we be without things? But they're just things, that's all: the means to an end, or the ends themselves, or precious possessions, or names for the objects we observe around us. There's a building. Here's a rock. Any child can point out the nouns. It's the changes happening to those nouns that make them interesting.

    Change requires action. Action is what gives life its spice. Action even gives spices their spice! After all, they're not spicy until you eat them. Nouns may be everywhere, but life's constant change, and constant interest, is all in the verbs.

    And of course in addition to verbs and nouns, we also have our adjectives, our prepositions, our pronouns, our articles, the inevitable conjunctions, the yummy expletives, and all the other lovely parts of speech that let us think and say interesting things. I think we can all agree that the parts of speech each play a role, and all of them are important. It would be a shame to lose any of them.

    Wouldn't it be strange if we suddenly decided that we could no longer use verbs?

    Let me tell you a story about a place that did exactly that…

    The Kingdom of Nouns

    In the Kingdom of Javaland, where King Java rules with a silicon fist, people aren't allowed to think the way you and I do. In Javaland, you see, nouns are very important, by order of the King himself. Nouns are the most important citizens in the Kingdom. They parade around looking distinguished in their showy finery, which is provided by the Adjectives, who are quite relieved at their lot in life. The Adjectives are nowhere near as high-class as the Nouns, but they consider themselves quite lucky that they weren't born Verbs.

    Conclusion

    No big surprise: the experts beat the Experts hands-down. But you'd never know it if you went through a typical Computer Science "education," absorbed the idea that object-orientation is the "dominant" paradigm of computing and read the job requirements that talk about how serious the hiring group is about its object-hood. Programmers who are serious about what they do and try to understand it soon see the lack of clothing on King Object and move on.

  • The promising origins of object-oriented programming

    The creation of the original system that evolved into the object-oriented programming (OOP) paradigm in design and languages was smart. It was a creative, effective way to think about what was at the time a hard problem, simulating systems. It’s important to appreciate good thinking, even if the later evolution of that good thinking and the huge increase in the scope of application of OOP has been problematic.

    The evolution of good software ideas

    Lots of good ideas pop up in software, though most of them amount to little but variations on a small number of themes.

    One amazingly brilliant idea was Bitcoin. It solved a really hard problem in creative ways, shown in part by its incredible growth and widespread acceptance. See this for my appreciation of the virtues of Bitcoin, which remains high in spite of the various things that have come after it.

    The early development of software languages was also smart, though not as creative IMHO as Bitcoin. The creation of assembler language made programming practical, and the creation of the early 3-GL’s led to a huge productivity boost. A couple of later language developments led to further boosts in productivity, though the use of those systems has gradually faded away for various reasons.

    The origin of Object-Oriented Programming

    Wikipedia has a reasonable description of the origins of OOP:

    In 1962, Kristen Nygaard initiated a project for a simulation language at the Norwegian Computing Center, based on his previous use of the Monte Carlo simulation and his work to conceptualise real-world systems. Ole-Johan Dahl formally joined the project and the Simula programming language was designed to run on the Universal Automatic Computer (UNIVAC) 1107. Simula introduced important concepts that are today an essential part of object-oriented programming, such as class and object, inheritance, and dynamic binding.

    Simula was originally built as a pre-processor for Algol, but its creators built a true compiler for it in 1966, and the language kept evolving.

    They became preoccupied with putting into practice Tony Hoare's record class concept, which had been implemented in the free-form, English-like general-purpose simulation language SIMSCRIPT. They settled for a generalised process concept with record class properties, and a second layer of prefixes. Through prefixing a process could reference its predecessor and have additional properties. Simula thus introduced the class and subclass hierarchy, and the possibility of generating objects from these classes.  

    Nygaard was well-rewarded for the invention of OOP, for which he is given most of the credit.

    What was new about Nygaard’s OOP? Mostly, like Simscript, it provided a natural way to think about simulating real-world events and translating the simulation into software. As Wikipedia says:

    The object-oriented Simula programming language was used mainly by researchers involved with physical modelling, such as models to study and improve the movement of ships and their content through cargo ports

    In some (not all) simulation environments, thinking in terms of a set of separate actors that interact with each other is natural, and an object system fits it well. Making software that eases the path to expressing the solution to a problem is a clear winner. Simula was an advance in software for that class of applications, and deserves the credit it gets.

    What should have happened next

    Simula was an effective way to hard-code a computer simulation of a physical system. The natural next step an experienced programmer would take would be to extract out the description of the system being modeled and express it in easily editable metadata. This makes creating the simulation and making changes to it as easy as making a change to an Excel spreadsheet and clicking recalc. Here's the general idea of moving up the tree of abstraction. Here's an explanation and illustration of the power of moving concepts out of procedural code and into declarative metadata.

    This is what good programmers do. They see all the hard-coded variations of general concepts in the early hard-coded simulation programs. They might start out by creating subroutines or classes to express the common things, but that still leaves the simulation hard-coded. So smart programmers take the parameters out of the code, put them into editable files and make sure they're declarative. Metadata! There are always exceptions where you need a calculation or an if-then-else — no problem, you just enhance the metadata so it can include "rules" and you're set.
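    Here's a minimal sketch of the idea in C. The key=value metadata format, and the parameter names `berths` and `ships_per_day`, are made up for illustration; in real use the metadata would live in an editable file rather than a string in the program.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical metadata describing the simulation. Editing these
       values changes the simulation without touching any code. */
    static const char *metadata =
        "berths=4\n"
        "ships_per_day=12\n";

    /* Look up one integer parameter by key in the key=value metadata. */
    static int param(const char *key) {
        const char *p = strstr(metadata, key);
        assert(p != NULL);                /* every parameter must be declared */
        return atoi(p + strlen(key) + 1); /* skip past "key=" */
    }

    int main(void) {
        /* The simulation logic stays generic; the specifics come from data. */
        int berths = param("berths");
        int ships  = param("ships_per_day");
        printf("load per berth = %d\n", ships / berths);
        return 0;
    }
    ```

    The point of the design is that the C code above never mentions ships or berths in its logic; it only interprets whatever the metadata declares.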

    The Ultimate Solution

    As you take your simulation problem and migrate it from hard-coded simulation to descriptive, declarative metadata, the notion of "objects" with inheritance begins to be a value-adding practicality … in the metadata. As you do, you understand the system and its constraints better and better — for example, ships in cargo ports.

    Then you can take the crucial next step, which is worlds away from object-orientation but actually solves the underlying problem in a way simulation never can — you build a constraint-based optimization model and solve it to make your cargo flows the best they could theoretically be!

    People thinking simulation and programmers thinking objects will NEVER get there. They're locked in a paradigm that won't let them! This is the ultimate reason we should never be locked into OOP, and particularly locked into it by having the object-orientation concepts in the language itself, instead of employed when and as needed in a fully unconstrained language for instructions and data.

    The later evolution of OOP

    In spite of the power of moving from hard-coding to metadata, the language-obsessed elite of the computing world focused on the language itself and decided that its object orientation could be taken further. Simula’s object orientation ended up inspiring other languages and being one of the thought patterns that dominate software thinking. It led directly to Smalltalk, which was highly influential:

    Smalltalk is also one of the most influential programming languages. Virtually all of the object-oriented languages that came after—Flavors, CLOS, Objective-C, Java, Python, Ruby, and many others—were influenced by Smalltalk. Smalltalk was also one of the most popular languages for agile software development methods, rapid application development (RAD) or prototyping, and software design patterns. The highly productive environment provided by Smalltalk platforms made them ideal for rapid, iterative development.

    Lots of the promotional ideas associated with OOP grew along with Smalltalk, including the idea that objects were like Lego blocks, easily assembled into new programs with little work. It didn’t work out for Smalltalk, in spite of all the promotion and backing from influential players. The companies and the language itself dwindled and died.

    It was another story altogether for the next generation of O-O languages: Java became the language most associated with the internet as it grew in the 1990’s. I will tell more of the history later. For now I’ll just say that OOP is the paradigm most often taught in academia and generally in the industry.

    Conclusion

    OOP was invented by smart people who had a new problem to solve. Simula made things easier for modeling physical systems and was widely used for that purpose. As it was expanded and applied beyond the small category of problems for which it was invented, serious problems began to emerge that have never been solved. The problems are inherent in the very idea of OOP. The problems are deep and broad; the Pyramid of Doom is one of seemingly endless examples.

    When OOP principles and languages were applied to databases and GUI’s, they failed utterly, to the extent that even the weight of elite experts couldn’t force their use in place of simple, effective approaches like the RDBMS for databases and javascript for GUI’s. OOP has evolved into a fascinating case study of the issues with Computer Science and the practice of software development: complete acceptance in the mainstream, alongside widespread quiet dissent over its fraudulent claims of virtue.

  • Software Programming Language Evolution: Libraries and Frameworks

    We've talked about the major advances in programming languages and the enhancements that brought those languages to a peak of productivity — a peak which has not been improved on since. Nonetheless there has been an ongoing stream of new programming languages invented, each of which is claimed to be "better" — with no discussion about what constitutes "goodness" in a language! This is high on the list of causes of the never-ending chaos and confusion that pervades the world of software. While loads of people focus on language, the most important programming tools, the ones that make HUGE contributions to the productivity of the programmers using them, go largely unremarked, in spite of the fact that they are part of every programmer's day-to-day programming experience. What are these essential, always-used but largely background things? Libraries and frameworks.

    Libraries

    A library is a collection of subroutines, often grouped by subject area, which perform commonly used functions for programs. Every widely used language has a library that is key to its practical use.

    For example, the C language has libraries that contain hundreds of functions for

    • input and output, including formatting
    • string manipulation and character testing
    • math functions
    • standard functions, dozens of them
    • date and time

    There are libraries of functions for controlling displays and input devices and many other things. After all these years and the "fatal" flaw of not being object-oriented (gasp!!), C is currently the #2 language in popularity. While its libraries don't play as important a part in its popularity as they do for certain other languages, they are still important.
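    A tiny sketch of the kind of work those C library functions do — everything below comes straight from the standard headers (string manipulation, math, formatted output):

    ```c
    #include <math.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        /* string manipulation: build up and measure a string */
        char buf[32];
        strcpy(buf, "hello");
        strcat(buf, ", world");
        printf("%s (%zu chars)\n", buf, strlen(buf));

        /* math functions */
        printf("sqrt(2) = %.3f\n", sqrt(2.0));

        /* formatted output: zero-padded field width */
        printf("%05d\n", 42);
        return 0;
    }
    ```

    None of this is exotic, and that's the point: every real C program leans on these routines constantly, which is why the library matters as much as the language.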

    Python became widely used among people doing analytics not particularly because of the virtues of the language itself, but because it grew one of the richest modern libraries of calculation routines available. It ended up with great support for statistics and for the things you need to do with large sets of numbers, including array and matrix manipulation. When doing analytics, the library functions do nearly all the heavy lifting. All the program has to do is deploy the rich set of functions against the problem in the right way!

    A 2021 survey of language popularity confirms the popularity of Python, and the importance of its library in giving it its popularity. Here's a summary:

    [Image: summary of the 2021 language-popularity survey]

    Have you ever heard of the language R? It's an open-source language which became usable in the year 2000. Chances are, if you work in statistics, data analytics, operations research or other areas involving serious data manipulation, you know all about it and probably use it. The language itself has valuable features for math programming that most languages don't have, for things like vectors and matrices. More importantly, it has an amazingly rich equivalent of libraries, called packages. R packages are reusable collections of functions and data definitions that perform valuable calculations. When you're trying to do something you're pretty sure someone else has tackled in some way, the first thing you do is look for an appropriate package. Packages are available for most common data science tasks; they do nearly all the "heavy lifting" of any job to which they apply. R is a prime example of a language having virtues, but with the bulk of the value and productivity coming from the available libraries.

    One of the sad turns taken in programming language evolution was driven by the ascendancy of the modern RDBMS. Ironically, the DBMS emerged at about the same time as do-everything language environments dominated the new minicomputer landscape. With a single environment like MUMPS or PICK you could get a whole job done — user interface, core program, data storage and access, everything! These environments enabled massive productivity gains. See this for more. But the DBMS blasted in, took over, and HAD to be used. So 3-GL's that were less productive than the new environments became even less so by finding cumbersome ways to use DBMS's. The way they did it I explain here. This was a first major step down the degenerate, productivity-killing road of layers. How did 3-GL's pull off the integration? With libraries, of course! For example, Java's JDBC library became a requirement for burdensome, error-prone access to standard databases from within Java, while those who didn't mind the overhead and "impedance mismatch" could use one of the ORM's (Object-Relational Mapping systems) that emerged.

    Not all libraries are tightly associated with a language! One such project is Redis (redis.io), an open-source project started by an Italian developer to build a powerful in-memory key/value datastore and cache. It has taken off and has grown many more powerful features, including queuing. While unknown among non-programmers, it is incredibly popular, widely used, and available on all the major cloud providers.

    Other libraries are directed towards a particular aspect of programming. With the emergence of the web UI, a succession of libraries to ease its creation has emerged.

    A UI library that has become dominant is React, created and then open-sourced by Facebook. It is specifically for the Javascript language and greatly enhances the productivity of building and the resulting performance of web UI's. The benefits of javascript or any other language are trivial compared to the productivity gain produced by using React. Unlike most libraries, React comes close to being a framework; it resembles an R package, in that it's a comprehensive solution to the problem of building web UI's. Interestingly, part of the programmer productivity gain is due to the fact that it has a major declarative aspect to its design, like AngularJS (see below).

    Bottom line: while the details of software language features have some impact on productivity, the availability and richness of libraries enhances productivity many times over. It's no contest.

    Frameworks

    Frameworks take the idea of libraries but add an important reversal. Libraries are sets of routines, any of which may be called by a program written in a supported language; the program is completely in charge. A framework, by contrast, provides an environment in which your code operates: the framework is in charge, and it calls your code at the points it defines. You select exactly one framework to work in, and when you play by its rules, you normally enjoy large productivity benefits.

    A library is a rich set of resources covering many issues, like a book library. A framework is a selected set of resources enabling rapid work on a given category of issues, like a kitchen for cooking.
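    The reversal can be sketched in a few lines of C using function pointers. The "framework" below is a toy of my own invention, not any real product: with a library your code calls the routine; with a framework, the framework owns the loop and calls your handler back.

    ```c
    #include <stdio.h>

    /* Library style: your code is in charge and calls the routine. */
    static int lib_double(int x) { return 2 * x; }

    /* Framework style: the framework owns the main loop and calls
       YOUR code back at the points it defines. */
    typedef void (*handler_t)(int event);

    static void framework_run(handler_t on_event) {
        for (int event = 1; event <= 3; event++)
            on_event(event);        /* the framework calls you */
    }

    static void my_handler(int event) {
        printf("handled event %d -> %d\n", event, lib_double(event));
    }

    int main(void) {
        framework_run(my_handler);  /* you hand control to the framework */
        return 0;
    }
    ```

    This inversion of control is why you can use many libraries at once but only one framework: two main loops can't both be in charge.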

    One impressive framework is the RAILS framework for the language Ruby. While the Ruby language had become fairly successful, Ruby on Rails (as the combination was called) established it as a player, because many groups discovered that it was many times faster than other tools at building a web application with a UI and a database. The reason is simple: normally, to build such an application, you have one expert creating the UI, probably using javascript and some library; another building the server-side business logic; and another designing the database — with a great deal of work and trouble relating the database to the server program (see for example JDBC and ORM's above). With RAILS, you defined your database, which automatically defined the names you use inside Ruby to access the data. Similar story with the UI.

    The philosophy of RAILS centered on the DRY principle — Don't Repeat Yourself.

    [Image: the DRY principle]

    Unlike the endless repetition of the usual layered environment, RAILS took a major step towards the principle of Occamality. Was RAILS original, a real breakthrough? In its object-oriented environment, yes. In programming in general, no. RAILS is a typical example of the appalling lack of knowledge of history in software. Most of the benefits of RAILS were delivered in a more integrated way many years earlier by the Powerbuilder environment. I discuss the context here.

    What comprehensive frameworks like this REALLY do is attempt to re-create the comprehensive, everything-in-one-place development environments that emerged to enhance programmer productivity beyond what was achievable in 3-GL's, mostly by incorporating data storage and user interface functions, as I explain here. Their emergence and widespread adoption is clear evidence of the power of eliminating the insanity of layers in software.

    There are also frameworks that are much narrower in scope, addressing only part of the programming puzzle. While they leave much of an overall programming job alone, they can give a major boost to the narrow area on which they focus. One that caught on and became widely used, focused exclusively on building web UI's, is AngularJS (a newer version of which is simply called Angular). This framework is highly declarative, focused more on describing the elements of the UI than on the actions required to implement it.

    [Image: example of AngularJS's declarative style]

    This led to HUGE programmer productivity gains. Why would anyone build a web UI from scratch? Nonetheless, the similarities between the ReactJS library and the AngularJS framework are strong, and both of them powered huge programmer productivity gains that had very little to do with the virtues (or lack thereof) of the associated language, javascript.

    Languages vs Libraries and Frameworks

    New languages continue to flow out of the creative minds of groups of programmers who obviously don't have enough to do to keep themselves fully occupied. Each new language tends to be lauded, if only by its creators, with extreme claims of virtue on many dimensions. None of these claims is ever subjected to verification and testing, much less of the rigorous kind. In any case, there is simply no contest between the gains delivered by a rich library or framework and those delivered by a new programming language.

    I'll just give a glaring example. One of the main virtues that object-oriented languages are supposed to have is code reuse. The concept that is bandied about is that a good class system is like a set of Lego blocks, enabling new programs to be easily assembled. Mostly it doesn't happen. For re-use, libraries and frameworks are the gold standard. Think of a normal, old-fashioned book library. The whole reason they exist is that people reuse books! It's the same for software libraries — code gets into them because it's re-used often! That's how they got to be called "libraries." Duh.


  • Software Programming Language Evolution: the Structured Programming GOTO Witch Hunt

    In prior posts I’ve given an overview of the advances in programming languages, described in detail the major advances and defined just what is meant by “high” in the phrase high-level language. I've described the advances in structuring and conditional branching that brought 3-GL’s to a peak of productivity.

    The structuring and branching caught the attention of academics. Watch out! What happened next was that a theorem was proved, a movement was declared and named, and a certain indispensable part of any programming language, the GO TO statement, was declared to be something only bad programmers use, fit to be banned. Here's the story of the nefarious GOTO.

    Structures in Programming Languages

    I've described how structures were part of the first 3-GL's and how they were soon elaborated to express the intention of programmers more clearly, making code even more productive to write. The very first FORTRAN compiler, delivered in 1957, included primitive versions of conditional branching and loops, two of the foundations of programming structure. It was so powerful that early users figured it decreased the number of statements needed to achieve a result by a factor of more than 10.

    These are the people who actually WRITE PROGRAMS! They want to make it easier and jumped on anything that gave a dramatic improvement.

    “Significantly, the increasing popularity of FORTRAN spurred competing computer manufacturers to provide FORTRAN compilers for their machines, so that by 1963 over 40 FORTRAN compilers existed. For these reasons, FORTRAN is considered to be the first widely used cross-platform programming language.”

    Before long, the structuring capabilities of the original IF (conditional branching) and DO (controlled looping) statements were enhanced and augmented to something close to their current form. I describe this here. The result was a peak of programmer productivity that has not been substantially increased since — and has often been degraded.

    The Bohm-Jacopini Theorem

    Completely independent of the amazing advances in languages and programming productivity that were taking place, math-oriented non-programmers were hard at work deciding how software should be written. Here is the story in brief:

    The structured program theorem, also called the Böhm–Jacopini theorem,[1][2] is a result in programming language theory. It states that a class of control-flow graphs (historically called flowcharts in this context) can compute any computable function if it combines subprograms in only three specific ways (control structures). These are

    1. Executing one subprogram, and then another subprogram (sequence)
    2. Executing one of two subprograms according to the value of a boolean expression (selection)
    3. Repeatedly executing a subprogram as long as a boolean expression is true (iteration)

    The structured chart subject to these constraints may however use additional variables in the form of bits (stored in an extra integer variable in the original proof) in order to keep track of information that the original program represents by the program location. The construction was based on Böhm's programming language P′′.

    The theorem forms the basis of structured programming, a programming paradigm which eschews goto commands and exclusively uses subroutines, sequences, selection and iteration.
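    The theorem's three constructs are easy to see in a few lines of C. The function below is a toy example of my own (the task and names are not from the proof) that computes n² by summing the first n odd numbers, using only sequence, selection and iteration, with no goto anywhere:

    ```c
    #include <stdio.h>

    /* n*n via repeated addition, built from the theorem's three
       control structures only: sequence, selection, iteration. */
    static int square_by_addition(int n) {
        int total = 0;                 /* sequence: one step, then another */
        int i = 0;
        while (i < n) {                /* iteration: repeat while true */
            total = total + 2 * i + 1; /* add the i-th odd number */
            i = i + 1;
        }
        if (total < 0)                 /* selection: branch on a boolean */
            total = 0;                 /* (defensive; can't happen here) */
        return total;
    }

    int main(void) {
        printf("%d %d %d\n", square_by_addition(0), square_by_addition(3),
               square_by_addition(5));
        return 0;
    }
    ```

    That the three constructs suffice is exactly what the theorem says; whether code written under that restriction is always clearer is the debate that follows.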

    This theorem got all the academic types involved with computers riled up. The key to good software has been discovered! The fact that math theorems are incomprehensible to the vast majority of people, and the fact that perfectly good computer programs can be written by people who aren't math types didn't concern any of these self-anointed geniuses.

    The important thing to note about the theorem is that it was NOT created in order to make programming easier or more productive. It just "proved" that it was "possible" to write a program under the absurd and perverse constraints of the theorem to compute any computable function. Assuming you were willing to use a weird set of bits to store location information in ways that would make any such program unreadable by any normal person. Way to go, guys — let's go back to the days of writing in all-binary machine language!

    The Crisis in Software and its solution

    Not long after this, the academic group of Computer Science “experts” formed. They had a conference. They looked at the state of software and declared it to be abysmal. The whole conference was about the "crisis" in software. See this for details.

    One of the most prominent of those Computer Scientists was Edsger W. Dijkstra. He looked at the powerful constructs for conditional branching, loops and blocks that had been added to 3-GL's and invented the term "structured programming" to describe them. He related those statements to the wonderful but useless math proof about the minimal requirements for programming a solution to any "computable function." The proof "proved" that such programs could be written without the equivalent of a GOTO statement. BTW, I do not dispute this. He wrote the influential "Go To Statement Considered Harmful" open letter in 1968.

    Among the solutions to the software crisis they proclaimed was strict adherence to the dogma of what Dijkstra called “structured programming,” which prominently declared that the GOTO statement had no place in good programming and should be eliminated.

    Does the fact that it is POSSIBLE to program a solution to any computable function without using GOTO mean that you SHOULD write without using GOTO's? When children go to school, it's POSSIBLE for them to crawl the whole way, without using "walking" at all. Everyone accepts that this is possible. When you're on your feet, all sorts of bad things can happen — you can trip and fall! Most important, you can get the job done without walking … and therefore you SHOULD eliminate walking for kids getting to school. QED.

    This is academia for you – a prime example of how Computer Science works hard to make sure that programs are hard to write, understand and deliver, all in the name of achieving the opposite.

    The debate about structured programming

    There was no debate about the utility of the conditional branching, controlled looping and block structures that rapidly became part of any productive software language. They were there and programmers used them, then and now. The debate was about "structured programming," which by its academic definition outlawed the use of the GOTO statement. That wasn't all. It also outlawed having more than one exit from a routine, breaks from loops, and other productive, transparent and generally useful constructs.

    I remember clearly, as a programmer in the 1980's, a non-technical manager type coming to me and quizzing me about whether I was following the rigors of structured programming, which was then talked about as the only way to write good code. I don't remember my answer, but since I knew the manager would never go to the trouble of actually — gasp! — reading code, my answer probably didn't matter.

    The most important thing to know about the leader of the wonderful movement to purify programming is his lack of interest in actually writing code:

    [Image: Dijkstra quote]

    Fortunately, there are sane people in the world, including the incomparable Donald Knuth (an academic Computer Scientist who's actually great!) and a number of others.

    An alternative viewpoint is presented in Donald Knuth's Structured Programming with go to Statements, which analyzes many common programming tasks and finds that in some of them GOTO is the optimal language construct to use.[9] In The C Programming Language, Brian Kernighan and Dennis Ritchie warn that goto is "infinitely abusable", but also suggest that it could be used for end-of-function error handlers and for multi-level breaks from loops.[10] These two patterns can be found in numerous subsequent books on C by other authors;[11][12][13][14] a 2007 introductory textbook notes that the error handling pattern is a way to work around the "lack of built-in exception handling within the C language".[11] Other programmers, including Linux Kernel designer and coder Linus Torvalds or software engineer and book author Steve McConnell, also object to Dijkstra's point of view, stating that GOTOs can be a useful language feature, improving program speed, size and code clarity, but only when used in a sensible way by a comparably sensible programmer.[15][16] According to computer science professor John Regehr, in 2013, there were about 100,000 instances of goto in the Linux kernel code.[17]
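    As a sketch of the error-handler pattern Kernighan and Ritchie endorse, here is a minimal C example; the function and its simulated failure flag are hypothetical. Every failure path funnels through one cleanup block, which is often clearer than nesting or duplicating the cleanup code:

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /* goto as an end-of-function error handler: one exit, one cleanup. */
    static int process(int fail_early) {
        char *a = NULL, *b = NULL;
        int rc = -1;                    /* assume failure until proven */

        a = malloc(16);
        if (a == NULL) goto cleanup;
        if (fail_early) goto cleanup;   /* simulated mid-function failure */
        b = malloc(16);
        if (b == NULL) goto cleanup;
        rc = 0;                         /* success */

    cleanup:                            /* all paths release resources here */
        free(b);                        /* free(NULL) is a safe no-op */
        free(a);
        return rc;
    }

    int main(void) {
        printf("fail=%d ok=%d\n", process(1), process(0));
        return 0;
    }
    ```

    This is the same pattern Regehr counted roughly 100,000 times in the Linux kernel: not spaghetti, but a disciplined, readable use of GOTO.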

    Any programmer can make mistakes, and any statement type can be involved in the mistake. For example, I think nearly everyone accepts that cars are a good thing. But over 30,000 people a year DIE in car accidents! So where's the movement to eliminate cars because of this awful outcome? Outlawing the GOTO because it's sometimes used improperly makes exactly as much sense. Like every other statement type, it can be misused.

  • Software Programming Language Evolution: Structures, Blocks and Macros

    In prior posts I’ve given an overview of the advances in programming languages, described in detail the major advances and defined just what is meant by “high” in the phrase high-level language. In this post I’ll dive into the additional capabilities added to 3-GL’s that brought them to a peak of productivity.

    History

    Let’s remember what high-level languages are all about: productivity! They are about how much work it takes to write the code and how easily the code can be read.

    The first major advance, from machine language to assembler, was largely about eliminating the grim scut-work of taking the statements you wanted to write and making the statements “understandable” to the machine by expressing them in binary. Ugh.

    The second major advance, to 3-GL’s like FORTRAN and COBOL, was about eliminating the work of translating from your intention to the assembler statements required to express that intention. A single line of 3-GL code can easily translate into 10 or 20 lines of assembler code. And the 3-GL line of code often comes remarkably close to what you actually want to “say” to the computer, both writing it and reading it.

    FORTRAN achieved this goal to an amazing extent.

    “with the first FORTRAN compiler delivered in April 1957.[9]:75 This was the first optimizing compiler, because customers were reluctant to use a high-level programming language unless its compiler could generate code with performance approaching that of hand-coded assembly language.[16]

    “While the community was skeptical that this new method could possibly outperform hand-coding, it reduced the number of programming statements necessary to operate a machine by a factor of 20, and quickly gained acceptance.”

    The reduction in the amount of work was the crucial achievement, but just as important was the fact that each set of FORTRAN statements was understandable, in that it came remarkably close to expressing the programmer’s intent — what the programmer wanted to achieve. No scratching your head when reading the code, thinking to yourself, “I wonder what he’s trying to say here??” This meant code that was easier to write, had fewer errors, and was easier to read.

    Enhancing conditional branching and loops

    The very first iteration of FORTRAN was an amazing achievement, but it’s no surprise that it wasn’t perfect. At the individual statement level it was nearly perfect. When reading long groups of statements, though, there were situations where the code was correct, but the intention of the programmer was not clearly expressed in the code itself – it had to be inferred.

    The first FORTRAN had a couple of the intention-expressing statements: a primitive IF statement and DO loop. Programmers soon realized that more could be done. The next major version was FORTRAN 66, which cleaned up and refined the early attempts at structuring. Along with the appropriate use of in-line comments, it was nearly as clear and intention-expressing as any practical programmer could want.

    The final milestone in the march to intention-expressing languages was C.

    It’s an amazing language. While FORTRAN was devised by people who wanted to do math/science calculations and COBOL by people who wanted to do business data processing, C was devised to enable computer people to write anything – above all “systems” software, like operating systems, compilers and other tools. In fact, C was used to re-write the first Unix operating system so that it could run on any machine without re-writing. C remains the language that is used to implement the vast majority of systems software to this day.

    I bring it up in this context because C added important intention-expressing elements to its language that have remained foundational to this day. It enhanced conditional statements, creating the full if-else form — the equivalent of IF-THEN-ELSE-ENDIF — that has been common ever since. This meant you could say IF <condition is true> THEN <a statement>, and the statement would only be executed if the condition was true. You could tack on ELSE <another statement>. This isn’t dramatic, but the only other way to express this common thought is with GOTO statements – which certainly can be understood, but takes some figuring. In addition, C added the ability to use a delimited block of statements wherever a single statement could be used. When there are a moderate number of statements in a block, the code is easy to read. When there would be a large number, a good programmer creates a subroutine instead.
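    A minimal sketch of how C actually spells this (the temperature scenario is mine): the condition in parentheses, braces turning several statements into one block, and an else clause.

    ```c
    #include <stdio.h>

    int main(void) {
        int temperature = 72;

        /* C's conditional: if (condition) statement else statement.
           Braces make any group of statements a single block, usable
           anywhere one statement is allowed. */
        if (temperature > 80) {
            printf("hot\n");
            printf("open a window\n");
        } else {
            printf("comfortable\n");
        }
        return 0;
    }
    ```

    Reading it aloud comes close to the programmer's intent, which is exactly the point.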

    C added a couple more handy intention-expressing statements. A prime example is the SWITCH CASE BREAK statement. This is used when you have a number of conditions and something specific to do for each. The SWITCH defines the condition, and CASE <value> BREAK pairs define what to do for each possible <value> of the SWITCH.
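    Here is a small C sketch of the switch/case/break pattern; the status codes and labels are an example of my own choosing. The switch defines the condition, and each case ... break pair defines what to do for one value:

    ```c
    #include <stdio.h>

    int main(void) {
        int codes[] = {200, 404, 503};
        for (int i = 0; i < 3; i++) {
            const char *label;
            switch (codes[i] / 100) {   /* the SWITCH defines the condition */
            case 2:                     /* one CASE ... BREAK per value */
                label = "success";
                break;
            case 4:
                label = "client error";
                break;
            case 5:
                label = "server error";
                break;
            default:                    /* catch-all for other values */
                label = "other";
                break;
            }
            printf("%d: %s\n", codes[i], label);
        }
        return 0;
    }
    ```

    Without break, execution falls through to the next case, a frequent source of bugs, which is why the BREAK is part of the idiom.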

    Lots of following languages have added more and more statements to handle special cases, but the cost of a more complex language is rarely balanced with the benefit in ease and readability.

    The great advance that's been ignored: Macros

C did something about all those special cases that goes far beyond the power of adding new statements to a language, something that vastly increases not just the power and readability of the language but, more important, the speed and accuracy of making changes: the macro pre-processor.

    I was already very familiar with macros when I first encountered the C language, because they form a key part of a good assembler language – to the extent that an assembler that has such a facility is usually called a macro-assembler. Macros enable you to define blocks of text including argument substitution. They resemble subroutine calls, but they are translated at compile time to code which is then compiled along with the hand-written code. A macro can do something simple like create a symbol for a constant that is widely used in a program, but which may have to be changed. When a change is needed, you just change the macro definition and – poof! – everywhere it’s used it has the new value. It can also do complex things that result in multiple lines of action statements and/or data definitions. It is the most powerful and extensible tool for expressing intention and enabling rapid, low-risk change in the programmer’s toolbox. While the C macro facility isn’t quite as powerful as the best macro-assemblers, it’s a zillion times better than not having one at all, like all the proud but pathetic modern languages that wouldn’t know a macro if it bit them in the shin.
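A short sketch of the two macro uses described above: a named constant that can be changed in one place, and an argument-substituting macro that expands to code at compile time. The names (MAX_ORDERS, SQUARE) are made up for illustration.

```c
/* Hypothetical sketch of C preprocessor macros. */
#define MAX_ORDERS 100             /* a widely-used constant: change the
                                      definition and every use updates */
#define SQUARE(x)  ((x) * (x))     /* textual substitution with an argument */

int order_capacity(void) {
    return MAX_ORDERS;             /* expands to 100 before compilation */
}

int area_of_square(int side) {
    return SQUARE(side);           /* expands to ((side) * (side)) */
}
```

The expansion happens before compilation, so the compiled code is exactly what you would have written by hand, without the risk of missing one of the places the constant appears.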

    The next 50 years of software language advances

The refrain of people who want to stay up to date with computers is, of course, "What's new?" Everyone knows that computers evolve more quickly than anything else in human existence, by a long shot. The first cell phones appeared not so long ago, and they were "just" cell phones. We all know that now they're astounding miniature computers complete with screens, finger and voice input, cameras and incredible storage and just about any app you can think of. So of course we want to know what's new.

    The trouble is that all that blindingly-fast progress is in the hardware. Software is just along for the ride! The software languages and statements that you write are 99% the same as in the long-ago times before smart phones. Of course there are different drivers and things you have to call, just as in any hardware environment. But the substance of the software and the lines of code you use to write it are nearly the same. Here is a review of the last 50 years of "advances" in software. Bottom line: it hasn't advanced!

Oh, an insider might argue: what about the huge advances of the object-oriented revolution? What about Google's new powerhouse, Go? Insiders may ooh and ahh, just like people at a fashion catwalk. The comeback is simple: what about programmer productivity? Claims to this effect are sometimes made, but more often it's that the wonderful new language protects against stupid programmers making errors or something. There has not been so much as a single effort to measure increases in programmer productivity or quality. There is NO experimental data!! See this for more. Calling what programmers do "Computer Science" is a bad joke. It's anything but a science.

    What this means is simple: everyone knows the answer — claims about improvement would not withstand objective experiments — and therefore the whole subject is shut down. If you're looking for a decades-old example of "cancel culture," this is it. Don't ask, don't tell.

    Conclusion

    3-GL's brought software programming to an astounding level of productivity. Using them you could write code quickly. The code you wrote came pretty close to expressing what you wanted the computer to do with minimal wasted effort. Given suitable compilers, the code could run on any machine. Using a 3-GL was at least 10X more productive than what came before for many applications.

The very first languages incorporated features like conditional branching and controlled looping that were good first steps. The next few years led early programmers to realize that a few more elaborations of conditional branching and controlled looping would handle the vast majority of practical cases. With those extra language features, code became highly expressive. All subsequent languages have incorporated these advances in some form. Sadly, the even more productive feature of macros has been abandoned, but as we'll see in future posts, their power can be harnessed to an even greater extent in the world of declarative metadata.

  • The Relationship between Data and Instructions in Software

    The relationship between data and instructions is one of those bedrock concepts in software that is somehow never explicitly stated or discussed. While every computer program has instructions (organized as routines or subroutines) and data, the details of how the data is identified, named and accessed vary among programming languages and software architectures. Those differences of detail have major consequences. That’s why understanding the underlying principles is so important.

    Programmers argue passionately about the supposed virtues or defects of various  software architectural approaches and languages, but do so largely without reference to the underlying concepts. Only by understanding the basic concepts of data/instruction relationships can you understand the consequences of the differences.

    The professional kitchen

One way to understand the relation between instructions (actions) and data is to compare it to something we can all visualize. An appropriate comparison with a program is a professional chef’s kitchen with working cooks. But in this kitchen, the cooks are a bit odd — only one of them is active at a time. When someone asks a cook to do something, that cook gets active, performing his own specialty. A cook may ask another cook for help, giving that cook things or directing him to places, and getting the results back. The cooks are like subroutines in that they get called on to do things, often with specific instructions like “medium rare.” The cook processes the “call” by moving around the kitchen to fetch ingredients and tools, bringing them to a work area to process the ingredients (data) with the tools, and then delivering the results for further work or to the server who put in the order. The ingredients (data) can be in long-term storage, in working storage available to multiple cooks, or in a workspace undergoing prep or cooking. In addition, food (data) is passed to cooks and returned by them.

    The action starts when a server gives an order to the kitchen for processing.

    • This is like calling a program or a subroutine. Subroutines take data as calling parameters, which are like the items from the menu written on the order to the kitchen.

The person who receives the order breaks the work into pieces, giving the pieces to different specialists, each of whom does his work and returns the results. Unlike in a real kitchen, only one cook is active at a time here. One order might go to the meat cook and another to whoever handles vegetables.

    • This is like the first subroutine calling other subroutines, giving each one the specifics of the data it is supposed to process. The meat subroutine would be told the kind and cut of meat, the finish, etc.

    In a professional kitchen there is lots of pre-processing done before any order is taken. Chefs go to storage areas and bring back ingredients to their work areas. They may prepare sauces or dough so that everything is mixed in and prepped so that it can be finished in a short amount of time. They put the results of their work in nearby shelves or buckets for easy access later in the shift.

    • This is like getting data from storage, processing it and putting the results in what is called static or working storage, which is accessible by many different subroutines.

    There is a storage area and refrigerator that stores meat and another that stores vegetables. The vegetable area might have shelves and bins. The cook goes to the storage area and brings the required ingredients back to the cook’s work space. Depending on the recipe, the cook may also fetch some of the partly prepared things like sauces, often prepared by others, to include.

    • This is like getting data from long-term storage and from working storage and bringing it to automatic or local variables just for this piece of work.

The storage area could be nearby. It could be a closet with shelves containing big boxes that have jars and containers in them. A cook is in charge of keeping the pantry full. They go off and get needed ingredients and put them in the appropriate storage area as needed. They could also deliver them as requested right to a chef.

    • This is like having long-term storage and access to it completely integrated with the language, or having it be a separate service that needs to be called in a special way.

    The chef does the work on the ingredients to prepare the result.

    • This is like performing manipulations on the data that is in local variables until the desired result has been produced. In the course of this, a chef may need to reach out and grab some ingredient from a nearby shelf.

The chef may need extra space for a large project. He grabs some empty shelves from the storage area and uses them to store things that are in progress, like dough that needs time to rise. Later a chef might call out “grab me the next piece of dough” or “I need the dough on the right end of the third shelf.”

    • This is like taking empty space and using it. Pointers are sometimes used to reference the data, or object ID’s in O-O systems.

    The cook delivers the result for plating and delivery.

    • This is like producing a return variable. It may also involve writing data to long-term storage or working storage.

I’m not a cooking professional, but I gather that the work in professional kitchens and how they’re organized has evolved towards producing the best results in the least amount of elapsed time and total effort. As much prep work as possible is done before orders are received, to minimize the time and work it takes to deliver each order quickly and well. The chefs have organized work and storage spaces to handle original ingredients (meat, spices, flour, etc.) and partly done results (for example, a restaurant can’t wait the 45 minutes it might take to cook brown rice from scratch).

    In the next section, this is all described again somewhat more technically. If you’re interested in technology or have a programming background, by all means read it. The main points of this post and the ones that follow can be understood without it.

    Instructions and data in a computer program

    The essence of a computer program is instructions that the computer executes. Most of the instructions reference data in some way – getting data, manipulating it, storing results. See this for more.

    In math, from algebra on up, variables simply appear in equations. In computer software, every variable that appears in a statement must be defined as part of the program. For example, a simple statement like

              X = Y+1

means “read the value stored in the location whose name is Y, add the number 1 to it, and store the result in the location whose name is X.” Given this meaning, X and Y need to be defined. How and where does this happen? There are several main options:

    • Parameters. These form part of the definition of a subroutine. When calling a subroutine, you include the variables you want the subroutine to process. These are each named.
    • Return value. In many languages, a called routine can return a value, which is defined as part of the subroutine.
    • Automatic or local variables. These are normally defined at the start of a subroutine definition. They are created when the subroutine starts, used by statements of the subroutine and discarded when the subroutine exits.
    • Static or working storage variables. These are normally defined separately (outside of) subroutines. They are assigned storage at the start of the whole program (which may have many subroutines), and discarded at the end.
    • Allocated variables. Memory for these is allocated by a subroutine call in the course of executing a program. Many instances of such allocated variables may be created, each distinguished by an ID or memory pointer.
    • File, database or persisting variables. These are variables that exist independent of any program. They are typically stored in a file system or DBMS. Some software languages support these definitions being included as part of a program, while others do not. See this for more.
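Several of the kinds of variables listed above can be sketched in a few lines of C. The function names here are hypothetical, chosen just to show each kind in use.

```c
#include <stdlib.h>

/* Hypothetical sketch of C's variable kinds: a static (working-storage)
   variable, a parameter, a local variable, a return value, and an
   allocated variable. */

static int call_count = 0;      /* static / working storage: created at
                                   program start, lives for the whole run */

int add_one(int y) {            /* y: a parameter, named in the definition */
    int x;                      /* x: automatic / local, created on entry,
                                   discarded on exit */
    x = y + 1;
    call_count++;
    return x;                   /* the return value */
}

int *make_counter(void) {
    /* An allocated variable: memory obtained at run time, identified
       by a pointer rather than by a name in the source. */
    int *p = malloc(sizeof *p);
    if (p) *p = 0;
    return p;
}

int calls_so_far(void) { return call_count; }
```

File and database variables are the one kind C itself doesn't define; they live outside the program and are reached through library calls.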

    There are a couple concepts that apply to many of the places and ways variables can be defined.

    • Grouping. Groups of variables can be in an ordered list, sometimes with a nesting hierarchy. This is like the classic Hollerith card: you would have a high-level definition for the whole card and then a list of the variables that would appear on the card.
      • There might be subgroups; for example, start-date could be the name of a group consisting of the variables day, month, year.
      • Referring to such a variable might look like “year IN start-date IN cust-record” in COBOL, while in other languages it might be cust-record.start-date.year.
    • Multiples. Any variable or group can be declared to be an array, for example the variable DAY could be made an array of 365, so there’s one value per day of a year.
    • Types or templates. Many languages let you define a template or type for an individual variable or group. When you define a new variable with a new name like Y, you could say it’s a variable of type X, which then uses the attributes of X to define Y.
    • Definition scope. Parameters, return values and local variables are always tied to the subroutine of which they are a part. They are “invisible” outside the subroutine. The other variables, depending on the language, may be made “visible” to some or all of a program’s subroutines. Exactly how widely visible data definitions are is the subject of huge dispute, and is at the core of things like components, services and layers.
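A hypothetical C sketch of grouping, multiples and types, loosely following the cust-record example above:

```c
/* Hypothetical sketch: a nested group (struct), a type name (typedef),
   and a multiple (array), echoing the cust-record example. */
typedef struct {
    int day, month, year;       /* the subgroup: start-date */
} Date;

typedef struct {
    int  id;
    Date start_date;            /* a group nested inside the record */
} CustRecord;

int start_year(const CustRecord *r) {
    return r->start_date.year;  /* the dotted path: record, group, field */
}

/* Multiples: one value per day of a (non-leap) year. */
int day_totals[365];

int first_total(void) { return day_totals[0]; }
```

The typedef plays the role of a template: every CustRecord you declare gets the same layout from the single type definition.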

    When you look at a statement like X = Y+1, exactly how and where X and Y are defined isn’t mentioned. X could be a parameter, a local variable or defined outside of the subroutine in which the statement appears. Part of the job of the programmer is to name and organize the data definitions in a clean and sensible way.

    The variety of data-instruction organization and relationships

    Most of the possibilities for defining variables listed above were provided for by the early languages FORTRAN, COBOL and C, each of which remains in widespread use. Not long after these languages were established, variations were introduced. Software languages and architectures were created that selected and arranged the way instructions related to data definition. Programmers and academics decided that some ways of referencing and organizing data were error-prone and introduced restrictions that were intended to reduce the number of errors that programmers made when creating programs. In software architecture, the idea arose that all of a program's subroutines should be organized into separate groups, usually called "components" or "services," each with its own collection of data definitions. The different components call on each other or send messages to ask for help and get results, but can only directly operate on data that is defined as part of the component.

    The most extreme variation of instruction/data relationship is a complete reversal of point of view. The view I've described here is "procedural," which means everything is centered around the actor, the chef who does things. The reversal of that point of view is "object-oriented," called O-O, which organizes everything around the data, the acted upon, the ingredients and workspaces in a kitchen. Instead of following the chef around as he gets and operates on ingredients (data), we look at the data, called objects, each of which has little actors assigned to it, mini-chefs, that can send messages for help, but can only work on their own little part of the world. It's hard to imagine!

The basic idea is simple: instead of having a master chef or ones with broad specialties like desserts, there are a host of mini-chefs called "methods," each of which can only work on a specific small group of ingredients. A master chef has to know so much — he might make a mistake! By having a mini-chef who is 100% dedicated to dough, and never letting anyone else create the dough, we can protect against bad chefs (programmers) and make sure the dough is always perfect! Hooray! Or at least that's the theory…

    Conclusion

    Computers take data in, process it and write data out. Inside the computer there are instructions and data. Software languages have evolved to make it easier for programmers to define the data that is read and created and to make it easier to write the lines of code that refer to the data and manipulate it. As bodies of software have grown, people have created ways to organize the data that a computer works on, for example putting definitions for the in-process data of a subroutine inside the subroutine itself or collecting a group of related subroutines into a self-contained group or component with data that only it can work on.

    Understanding the basic concepts of instruction/data relationships and how those relationships can be organized and controlled is the key to understanding the plethora of approaches to language and architecture that have been created, and making informed decisions about which language and architecture is best for a given problem. The overall trend is clear: Programming self-declared elites decide that this or that restriction should be placed on which variables can be accessed by which instructions in which way, with the goal of reducing the errors made by normal programming riff-raff. Nearly all such restrictions make things worse!

  • Software programming languages: the Declarative Core of Functional Languages

What is most interesting about functional languages is that they strive to be declarative, instead of taking the imperative orientation of conventional programming languages.

    In a prior post I described the long-standing impulse towards creating functional languages in software. Functional languages have been near the leading edge of software fashion since the beginning, while perpetually failing to enter the mainstream. There is nonetheless an insight at the core of functional languages which is highly valuable and probably has played a role in their continuing attractiveness to leading thinkers. When that insight is applied in the right situations in a good way it leads to tremendous practical and business value, and in fact defines a path of advancement for bodies of software written in traditional ways.

    This is one of those highly abstract, abstruse issues that seems far removed from practical values. While the subject is indeed abstract and abstruse, the practical implications are far-reaching and result in huge software and business value when intelligently applied.

    Declarative and Imperative

A computer program is imperative. It consists of a series of machine language instructions that are executed, one after the other, by the computer's CPU (central processing unit). Each instruction performs some little action. It may move a piece of data, perform a calculation, compare two pieces of data, jump to another instruction, etc. You can easily imagine yourself doing it. Pick up what's in front of you, take a step forward, if the number over there is greater than zero, jump to this other location, etc. It's tedious! But every program you write in any language ends up being implemented by imperative instructions of this kind. It's how computers work. Period.
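A tiny C function makes the imperative flavor concrete: each statement is one small step executed in order, with the compares and jumps hidden inside the while loop. The function name is made up for illustration.

```c
/* A minimal imperative sketch: sum an array one small step at a time,
   much as the CPU does it (fetch, add, store, compare, jump). */
int sum_array(const int *a, int n) {
    int total = 0;              /* put zero in a named place */
    int i = 0;
    while (i < n) {             /* compare; jump past the loop when done */
        total = total + a[i];   /* fetch a value, add it, store the result */
        i = i + 1;              /* take a step forward */
    }
    return total;
}
```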

    Computers operate on data.  Data is passive. Data can be created by a program, after which it's put someplace. The locations where data is held/stored are themselves passive; those locations are declared (named and described) as part of creating the imperative part of a computer program.

    Data is WHAT you've got; instructions are HOW you act. WHAT is declarative; HOW is imperative. WHAT is like a map; HOW is like an ordered set of directions for getting from point A to point B on a map.

    The push towards declarative

    From early in the use of computers, some people saw the incredible detail involved in spelling out the set of exacting instructions required to get the computer to do something. A single bit in the wrong position can cause a computer program to fail, or worse, arrive at the wrong results. This detailed directions approach to programming is wired into how computers work. Is there a better way?

There is in fact no avoiding the imperative nature of the computer's CPU. As high-level languages began to be invented that freed programmers from the tedious, error-prone detail of programming in machine language, some people began to wonder whether a low-level program, of necessity imperative, could be written that enabled programs in some higher-level language to be declarative in nature.

    Some people who were involved with early computers, nearly all with a strong background in math, proceeded to create the declarative class of programming languages, the most distinctive members of which are functional programming languages as I described in an earlier post.

    The ongoing attempt to create functional languages and use them to solve the same problems for which imperative languages are used has proven to be a fruitless effort. But there are specific problem areas for which a declarative approach is well-suited and yields terrific, practical results — so long as the declarative approach is implemented by a program written in an imperative language, creating a workbench style of system for the declarations. The current fashion of "low code" and "no code" environments is an attempt to move in that direction. But I'd like to note that there's nothing new in those movements; they're just new names for things that have been done for decades.

    The Declarative approach wins: SQL

DBMS's are ubiquitous. By far the dominant language for DBMS's is SQL. Data in a relational DBMS is described by a declarative schema, written in DDL (data definition language). Data is operated on by a few different kinds of statements such as SELECT and INSERT.

    SELECT certainly sounds like a normal imperative keyword in any language, like COBOL's COMPUTE statement. But it's not. It's declarative. A SELECT statement defines WHAT data you want to select from a particular database, but says nothing about HOW to get it.

This is one of the cornerstones of value in a relational DBMS. A SELECT statement can be complicated, referencing columns in multiple rows of various tables joined in various ways. The process of getting the selected data from the database can be tricky. Without query optimization, a key feature of a modern DBMS, a query could take thousands of times longer than it does with optimization. Furthermore, table definitions can be altered and augmented, and so long as the data referenced in the query still exists, the SELECT statement will continue to do its job.

    If all you're doing is grabbing a row from a table (like a record from an indexed file), SQL is nothing but a bunch of overhead, and you'd be better off with simple 3-GL Read statements. But the second things get complicated, your program will require many fewer lines of complex code while being easier to write and maintain if you have SQL at your disposal. A win for the declarative approach to data access, which is why, decades after it was created, SQL is in the mainstream.

    The Declarative approach wins: Excel

    I don't know many programmers who use Excel. Too bad for them; it's a really useful tool for many purposes, as its widespread continuing use makes clear.

    Excel is a normal executable program written in an imperative language, but it implements a declarative approach to working with data. Studying Excel is a good way to understand and appreciate the paradigm.

An Excel worksheet is a two-dimensional matrix of values (cells). A cell can be empty or have a value of any kind (text, number, currency, etc.). What's important in this context is that you can put a formula into a cell that defines its value. The formula can reference other cells, individually or by range, absolutely or relatively. A simple formula could be the sum of the values in the cells above the cell containing the formula; it could also be arbitrarily complex, with conditionals (if-then-else). Going beyond formulas, you can turn ranges of cells into a wide variety of graphs.

If you're not familiar with them, you should look at Pivot tables, and when you've absorbed them, move on to the optimization libraries that are built in, with even better ones available from third parties. Pivot tables enable you to define complex summaries and extractions of ranges of cells. For example, I have an Excel worksheet in which I list each day of the year and the place where I am that day. A simple Pivot table gives me the total of days spent in each location for the year, something that simple formulas could not compute.

    The key thing here is that even though there are computations, tests and complex operations taking place, it's all done declaratively. There is no ordering or flow of control. If your spreadsheet is simple, Excel updates sums (for example) when you enter or change a value in a cell. For more complex ones, you just click re-calc, and Excel figures out all the formula dependencies and evaluates them in the right order. This makes Excel a quicker way to get results than programming in any imperative language, assuming the problem you have fits the Excel paradigm.
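As a rough illustration of the idea (not Excel's actual algorithm), here is a toy C sketch in which each "cell" is either a constant or a sum over other cells. Evaluation follows the dependency graph, so the order in which cells are declared doesn't matter, just as in a spreadsheet. All names are hypothetical, and the sketch assumes the dependencies contain no cycles.

```c
/* Toy sketch of declarative recalculation: a cell is a constant or the
   sum of the cells it references; eval() resolves dependencies first,
   recursively, regardless of declaration order. No cycle detection. */
#define MAX_DEPS 4

typedef struct Cell {
    int is_formula;                 /* 0: constant, 1: sum formula */
    int value;                      /* used when is_formula == 0 */
    struct Cell *deps[MAX_DEPS];    /* cells the formula references */
    int ndeps;
} Cell;

int eval(const Cell *c) {
    if (!c->is_formula)
        return c->value;
    int total = 0;
    for (int i = 0; i < c->ndeps; i++)
        total += eval(c->deps[i]);  /* dependencies evaluated first */
    return total;
}
```

A real spreadsheet engine also detects circular references and caches results between recalcs, which this sketch omits.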

    The Declarative approach wins: React.js

    One of the most widely-used frameworks for building UI's is React.js. Last time I looked, the header page of react.js said right out that it's declarative, and that this makes building UI's easy. I have found a number of places with nice explanations of how it works and why it's good.

    The Declarative approach wins: Compilers

    The best computer-related course I took in college by far was a graduate course on the theory and construction of compilers. The approach I was taught all those decades ago remains the main method for compiler construction. First you have a lexical parser to turn the text of the program into a string of tokens. Then you have a grammar parser to turn the tokens into a structured graph of semantic objects. Then you have a code generator to turn the objects into an executable program.

The key insight is that each stage in this process is driven by a rich collection of meta-data. The meta-data contains all the details of the input language, its grammar and the output target. A good compiler is really a compiler-compiler: imperative code that reads and acts on the lexical definitions, the grammar and the code generation tables. The beauty is that you write the compiler-compiler just once. If you want it to work on a new language, you give it the grammar of the language without changing any of the imperative code in the compiler-compiler! If you want to generate code for a new computer for a language you've already got, all you do is update the code generation tables! Once you've got such a compiler, you can even write the compiler in its own language, “boot-strapping” it.
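The table-driven idea can be sketched in miniature in C: the scanner code below stays fixed, while the operator table plays the role of the meta-data that defines the "language." All names here are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Miniature sketch of meta-data-driven lexing: to support new operators
   you edit the table, never the scanning code. */
typedef struct { char ch; const char *name; } OpDef;

static const OpDef ops[] = {    /* the "language definition" (meta-data) */
    { '+', "PLUS" }, { '-', "MINUS" }, { '*', "TIMES" }, { '=', "ASSIGN" },
};

/* Return the token name for c: "NUMBER" for a digit, the table entry
   for a known operator, or NULL for anything else. */
const char *token_name(char c) {
    if (isdigit((unsigned char)c))
        return "NUMBER";
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++)
        if (ops[i].ch == c)
            return ops[i].name;
    return NULL;
}
```

A real lexer table also carries patterns for multi-character tokens, and the grammar and code-generation stages have their own tables, but the division of labor is the same: fixed imperative code, changeable declarative data.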

While they didn't exist at the time I took the course, tools to perform these functions, LEX and YACC (Yet Another Compiler Compiler), were built at Bell Labs by one of the groups of pioneers who created Unix and the C language. I didn't have those tools but used the concepts to build the FORTRAN compiler I wrote in 1973 at my first post-college job. Since the only tool I had was assembler language, taking the meta-data, compiler-compiler approach saved me huge amounts of time and effort compared to hard-coding the compiler without meta-data.

    This is meta-data at its best.

    The Declarative spectrum

    Excel and SQL are 100% all-in on the declarative approach. But it turns out that it doesn't have to be all one way or the other. If you're attacking an appropriate problem domain, you can start with just programming it imperatively, and then as you understand the domain, introduce an increasing amount of declaration into your program, by defining and using increasing amounts of meta-data. This is exactly what I have described as climbing up the tree of abstraction. Each piece of declarative meta-data you introduce reduces the size of the imperative program, and moves information that is likely to be changed or customized into bug-free, easily-changed meta-data.

    Conclusion

    Functional languages as a total alternative to imperative languages will perpetually be attractive to some programmers, particularly those of a math theory bent. Except in highly limited situations, it won't happen. No one is going to write an operating system in a functional language.

    Nonetheless, the declarative approach with its emphasis on declaring facts, attributes and relationships is the powerful core of the functional approach, and can bring simplicity and power to otherwise impossibly complicated imperative programs.

  • Software programming language evolution: Functional Languages

    Once computers were invented and started being used, people discovered that writing programs for them was brutally hard. It didn’t take long before some smart folks figured out ways to make it easier, taking two giant steps forward in ease and productivity in quick succession.

    After that wonderful start, people kept on inventing new languages, but somehow never made things better – in fact the new languages often made programming harder and less productive, a situation that the experts and exalted Professors studiously ignored. They never even bothered to spell out what makes one language better than another, which you'd think would be at the heart of promoting a new language.

In this post I’ll discuss a major category of new languages that keep getting created, receive passionate kudos, are rarely used but refuse to die: functional languages. In a subsequent post I'll describe the truly good idea that lurks inside the madness of the functional language world.

    The core idea: time and timeless, music and math

    I learned math as a kid. In high school I took the most accelerated and advanced courses available, finishing with a course in calculus, which I aced while mostly sitting in the back of the classroom teaching myself differential equations from my father’s grad school text book. I liked it. At the same time I got into music of multiple kinds but with a strong preference for classical. The strong connection between music and math has often been noted.

    During my junior year of high school I took a course in computer programming, with FORTRAN as the language. Our “teacher,” scrambling to keep a chapter ahead of us in the text book, had arranged machine time for the class on Saturdays at a computer at a nearby rocket firm, Reaction Motors. After lots more programming, by my first year of college, it was clear that math took a distinct second place to both music and software in my heart.

    In this post I explain where software comes from and its relation to music. The key concept is that, while you can read a page of music or software or math, math usually exists in a world “outside” of time – at most, time is a dimension like space; think for example of a proof in geometry. Music and software, by contrast, can only be understood as flowing through time. When you look at a page of music or software you naturally start at the beginning and hear/see it flow through time in your mind, just like reading a page of text.

    What this means is that there is a fundamental disconnect between math and computing. I explore that disconnect here. The math types are always trying to turn software into math, for example plodding for decades in a fruitless pursuit of ways to “prove” the correctness of programs.

    This is the best way to understand both the failure of the math types to develop a practical math-like way to write software and their refusal to admit defeat. The effort, and the category of languages it produced, is usually called “functional.”

    The history of functional languages

    Wikipedia's introduction to functional languages is accurate:

    Lambda calculus

    Turing had nothing to do with making software programming practical. That amazing job fell to others, who created the two giant advances in software language development. But since the math types were hanging around computers in the early days, particularly because doing math calculation was one of the earliest tasks to which computers were applied, finding a way to make programming more math-like, essentially eliminating the factor of time from software was a top priority.

    I had already done serious programming, including working on a giant FORTRAN project to optimize oil refinery operation, when a friend introduced me to functional programming for the first time. The language was APL. It came a bit closer than many other functional languages to being able to handle practical math because of its focus on the matrix, but it was about as easy to read and understand as advanced math. I could and can imagine that it would be appropriate for representing certain math transformations in a compact way, but why anyone who wasn't being tortured would agree to use it for general programming remains beyond me — assuming of course that what you want is to … radical idea here, be warned! … get stuff done. As opposed to demonstrate and revel in your ideological and mathematical purity and exalted status in the pantheon of Computer Science greatness.
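    For readers who have never seen it: APL's power came from dense symbols that operate on whole vectors and matrices at once. Here is a rough sketch of that flavor in plain Python; the names iota, reshape and reduce_add are my own stand-ins for APL's ⍳, ⍴ and +/ operators, purely for illustration.

```python
# A taste of APL's array-at-once style, sketched in Python.
# In APL, a single symbol does what each function below does.

def iota(n):
    """APL's iota (⍳n): the vector 1 2 ... n."""
    return list(range(1, n + 1))

def reshape(rows, cols, data):
    """APL's rho (⍴): fill a rows x cols matrix from data, cycling as needed."""
    flat = [data[i % len(data)] for i in range(rows * cols)]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def reduce_add(vector):
    """APL's +/ : sum-reduce a vector."""
    total = 0
    for x in vector:
        total += x
    return total

matrix = reshape(2, 3, iota(6))                 # [[1, 2, 3], [4, 5, 6]]
row_sums = [reduce_add(row) for row in matrix]  # [6, 15]
```

    In APL the last two lines are a handful of characters: wonderfully compact for certain math transformations, dense to the point of unreadability for general programming.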

    None of this is any concern to the cultists, who march on inventing an endless stream of "new" functional languages. Among the names you might see if you look into this are OCaml, Scheme, Haskell, Clojure, Scala. There are even a few that are object-oriented on top of being functional, like Scala and OCaml. Articles continue to pop their heads up out of the underground insisting that functional languages really are making inroads in commercial applications, are better for getting a job, are more effective for creating your ground-breaking start-up, etc. I predict no end to the flow.

    The fact that functional languages are mostly of interest to a cult of fanatical core believers does little to limit their widespread, on-going pernicious impact on software development, which is fueled by the near-universal math orientation of academic and corporate Computer Science. Functional concepts keep getting introduced by language fanatics into otherwise-sane normal languages.

    Conclusion

    Computer software is rarely driven by purely practical considerations, much less by objective knowledge and definition of what constitutes "good" in a language. But when an approach like functional programming is pushed that leads to results that are often 10X worse on multiple dimensions than "normal" programming, the problem is so large that people who aren't already committed members of the cult notice the difference and refuse to put up with the nonsense.

    In spite of all this, there is an amazingly valuable insight at the core of the functional language movement that leads to highly productive, real-world-valuable results when embodied in the right way. I will spell this insight out and illustrate its use in a subsequent post. The heart of the insight is taking a declarative approach instead of an imperative one, when and to the extent that doing so is appropriate.
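    To make the imperative/declarative distinction concrete, here is a minimal sketch in Python; the interest-calculation example and names are invented for illustration. The imperative version spells out each step flowing through time, while the declarative version simply states the result that is wanted.

```python
# Imperative: a step-by-step recipe flowing through time.
def total_interest_imperative(balances, rate):
    total = 0.0
    for balance in balances:
        if balance > 0:
            total = total + balance * rate
    return total

# Declarative: state WHAT is wanted; the language works out the steps.
def total_interest_declarative(balances, rate):
    return sum(balance * rate for balance in balances if balance > 0)

balances = [100.0, -50.0, 200.0]
# Both produce the same answer; only the style of expression differs.
```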


  • Fundamental Concepts of Computing: Subroutine Calls

    Subroutine calls are glossed over as yet another thing to learn when software is taught or described. The fact is that subroutine calls are one of the most amazing, powerful aspects of computer programming. They are the first step up the mountain of abstraction, the path towards creating software that solves problems with optimal efficiency, effectiveness and ease.

    What's a subroutine? In common-sense terms, it's like a habit. I hope you have the habit of brushing your teeth. Calling a subroutine is like deciding you should brush your teeth. The subroutine itself is like the steps you take to brush your teeth: open your medicine chest, get out your toothbrush and toothpaste (probably always in the same order), etc. When you're done (you've rinsed your mouth and put stuff away for the next time) the subroutine is done and it "returns" to whatever called it. You might have to think, "what was I doing when I decided it was time to brush my teeth?" but a computer subroutine always returns to exactly where it was before making the subroutine call.

    In computer subroutines this calling behavior can get more elaborate. Subroutines can call (start) other subroutines, which can call yet others, without end. This is called "nested" calling.

    The teeth cleaning subroutine ends with (we hope) clean teeth. Computer subroutines can end with (return) lots more. Suppose you're researching political revolutions and are making lots of notes. Then you decide it's important to study the 1917 Russian revolution. You put your general revolution notes to the side and dive into Russia. You start taking notes and then realize how important the Mensheviks and Bolsheviks are. You put the notes you've taken so far on 1917 on top of the general notes and dive into the next subject. When you've gone as deep down the rabbit hole as you can stand, you take the top notes from your stack on the side, and fill in the notes of what you've just learned. Eventually you pick up the next notes from the stack and so on until you're back to your revolution notes. It's also a bit similar to clicking on a web page, and then on that one and then again, and finally hitting the back arrow until you're back on the page where you started. Each new subject you dive into is like calling a subroutine, from which you can make further calls, eventually returning from each to the place you left off, only now with additional knowledge and information.

    Just like in real life, when you call a subroutine you pass it information (parameters). The information you pass tells the subroutine exactly what you wish done. When the subroutine is done it can pass information back to the caller. You might pass dirty plates to the dishwasher subroutine, for example, which passes clean plates back when it returns. Or more elaborately you could pass a bunch of ingredients to the bake-a-cake routine and it would return a cake.
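    In a programming language the cake example looks something like this. It's a Python sketch; the function name and the ingredient checks are invented for illustration.

```python
# A subroutine (a "function" in Python) being passed information
# and passing information back. Names invented for illustration.

def bake_a_cake(flour_cups, eggs, sugar_cups):
    """The caller passes ingredients (parameters); a cake comes back (return value)."""
    if flour_cups < 2 or eggs < 1:
        return "no cake: missing ingredients"
    return f"cake made from {flour_cups} cups flour, {eggs} eggs and {sugar_cups} cups sugar"

# The call site: pass the ingredients in, get the cake back.
cake = bake_a_cake(3, 2, 2)
```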

    Some Technobabble

    Computer software people aren't as clear as you'd like them to be with the names they use. The term I'm using here, "subroutine," is out of fashion these days, though still in use. In different software languages or programming communities a subroutine could be called a routine, subprogram, function, method or procedure.

    Let's start with the basics of programming languages. Here's an overview of the bedrock on which everything is built, machine language, and the higher-level languages that have been built on that foundation. When you look at the memory of a computer what you see is data. It's all ones and zeros. But some of that data can be understood by the machine to be instructions. An instruction may include a reference to a memory location, an address. The instruction could ask that the data at the given address replace the value in a register, be added to it, or be used in other supported operations. There is always a "jump" instruction, which instructs the machine not to execute the next sequential instruction as it normally would, but to execute instead the instruction at the address given in the jump instruction. There are also test instructions that cause a jump only if a specified condition is met.
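    Here's a toy model of that fetch-and-execute behavior, sketched in Python. The instruction set is invented purely for illustration, but it shows the key idea: execution normally proceeds to the next sequential instruction, and the jump and test instructions override that.

```python
# A toy machine: memory holds data, but some data is understood as
# instructions. The instruction names are invented for illustration.

def run(program, memory):
    pc = 0    # program counter: address of the next instruction
    acc = 0   # a single register (the "accumulator")
    while pc < len(program):
        op, arg = program[pc]
        pc += 1                       # normally: next sequential instruction
        if op == "LOAD":              # replace the register with memory[arg]
            acc = memory[arg]
        elif op == "ADD":             # add memory[arg] to the register
            acc += memory[arg]
        elif op == "STORE":           # put the register back into memory
            memory[arg] = acc
        elif op == "JUMP":            # unconditional jump to address arg
            pc = arg
        elif op == "JUMP_IF_ZERO":    # test instruction: jump only if acc == 0
            if acc == 0:
                pc = arg
        elif op == "HALT":
            break
    return memory

# Sum memory[0] + memory[1] into memory[2], then halt.
mem = run([("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)], [3, 4, 0])
# mem is now [3, 4, 7]
```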

    The Subroutine call

    Amongst the rich jumble of instruction types is one that stands out with amazing power: the subroutine call. A subroutine call, sometimes implemented directly in machine language and sometimes in multiple machine language instructions, does a couple things at once.

    • It saves the address of the location after the subroutine call somewhere, usually on a stack.
    • It puts some parameters someplace, often on a stack.
    • It transfers control to the starting address of the subroutine.

    The subroutine then:

    • Executes its instructions, getting parameters from wherever they were passed to it.
    • Returns to the address after the call, as stored on the stack.
    • Optionally gives a "return value" to the calling code.
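    Putting the two lists together, here's a toy Python model of those mechanics. Real machines do this with registers and machine instructions; everything here, including the return address value, is invented for illustration.

```python
# A sketch of what a subroutine call does under the hood, using an
# explicit stack. This is a toy model, not how Python itself works.

stack = []

def call(subroutine, return_address, *params):
    stack.append(return_address)   # 1. save the return address on the stack
    stack.append(params)           # 2. put the parameters on the stack
    return subroutine()            # 3. transfer control to the subroutine

def double_it():
    params = stack.pop()           # get the parameters from where they were passed
    return_address = stack.pop()   # find the saved return address
    return return_address, params[0] * 2   # return, with an optional return value

return_to, value = call(double_it, 0x4A, 21)
# control "returns" to address 0x4A with the return value 42
```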

    Call by value vs. call by reference!

    It's a bit esoteric, but there's an important distinction between two basic ways parameters can be passed to a subroutine: pass by value and pass by reference. Some languages can handle both kinds, while others can only handle one. This sounds abstract but it's important and easy to understand. Passing by value is like when you put the ingredients for a cake into a box, pass the box through an opening to another room, where a baker turns the ingredients into a cake and passes the cake (the value) back through the opening. Passing by reference is like when there's a larger kitchen and one of the people working there is a baker. You ask the baker to bake a cake, and the baker goes into the different storage places, gets the ingredients she needs and turns them into a cake on the spot.  In passing by value the subroutine only gets what you give it. In passing by reference, you tell the subroutine (by address pointers) what it should work with. In the peculiar world of object-oriented thinking, the equivalent of calling by reference is sacrilegious. But that's a subject for another time.
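    Here are the two kitchens sketched in Python. A caveat: Python itself passes object references, so the "by value" case is modeled by copying; the example is purely illustrative.

```python
# Pass by value vs. pass by reference, modeled in Python.

def bake_by_value(ingredients):
    ingredients = list(ingredients)   # the boxed-up copy passed through the opening
    ingredients.clear()               # using up the copy touches nothing outside
    return "cake"                     # the cake (the value) is passed back

def bake_by_reference(pantry):
    pantry.remove("flour")            # the baker reaches into the caller's storage
    pantry.remove("eggs")
    pantry.append("cake")             # and leaves the cake on the spot

kitchen = ["flour", "eggs", "sugar"]
cake = bake_by_value(kitchen)         # kitchen is untouched afterwards
bake_by_reference(kitchen)            # kitchen itself has been changed
```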

    Subroutine Power

    Subroutines are the first step towards reducing the size and complexity of applications, increasing their level of abstraction, and making them easier to change and enhance with minimal trouble and error. Subroutines are high on the list of fundamental concepts of computing.

  • Software Programming Language Evolution: Credit Card Software Examples 2

    The vast majority of production software applications undergo a process of continuous modification to meet the evolving needs of the business. Sometimes organizations get fed up with the time and cost of modifying an application and decide to replace it entirely. The replacement is most often a brand-new application.

    When this happens, nearly everyone agrees that a different programming language should be used for building the new version. After all, everyone knows that programming languages have advanced tremendously since the about-to-be-replaced application was written, so it just makes sense to pick a modern language and get the job done. It occurs to exactly no one that advances in science and even writing novels are regularly achieved by using the same languages that are already in wide use. See this for details.

    In a prior post I described a couple of such huge efforts in the credit card industry to take advantage of the latest software technology, specifically 4-GL's and Java. The results didn't make headlines, but news of both multi-year, multi-tens-of-million-dollar disasters spread through industry insiders, of which I was a member at the time.

    At around the same time two major card processing companies faced the same challenges and made very different decisions about how to go forward. They did NOT choose the latest-and-greatest programming languages, but went with choices most people in technology thought were obsolete. The result? In sharp contrast with the cool kids who went with powerful modern languages, both projects succeeded. Here are the stories.

    Total Systems Services (TSYS) and moving to COBOL

    TSYS began in 1959 as a group inside a little bank in Columbus, Georgia, writing software to process credit cards. In 1974 it began processing cards for other banks, went public in the 1980’s and by the 1990’s was a major force in card processing services. Partly due to its early origins and some highly efficient early programmers, its processing software was largely written in IBM assembler code – in the 1990’s!

    Executives at the bank decided to modernize the software. I have no inside information on the reasons for their decision. It could have been as simple as a desire to have their software not tied to a particular machine. In any case they authorized a major, multi-year project to rewrite what had become a huge body of production software. Talk about risky! One of the amazing things is that they decided to close their ears to the near-universal acclaim being given to modern software languages and methods and take the relatively low-risk path of re-writing the entire body of assembler language into … COBOL – the very language that was derided and that others were going to great lengths to get out of!!

    TSYS put a big team on the job, took a couple years to get it done, and around the end of the 1990’s moved all their production from their old body of IBM assembler code to the new COBOL system with no disruptions in service. An impressive achievement, to put it mildly. The fact that they knew the requirements because of their existing working system written in assembler language played a big role. But the two failed efforts I described in the prior post had the same advantage! The point here is that the 3-GL COBOL is HUGELY more productive than 2-GL assembler language, so the rewrite made sense, in sharp contrast to the failed efforts.

    Paysys and COBOL

    When I joined Paysys in the mid-1990’s it had two major products: CardPac (processing for credit cards issued by banks) and Vision21 (processing for credit cards issued by retailers, supporting multiple, extensive credit plans on a single account). The company created a first unification of the two products into what became the industry’s leading system for processing cards, VisionPLUS. The project was completed and put into production while I was there. There were over 5 million lines of COBOL code in the final product.

    The COBOL code was unique in being able to handle an unprecedented range of requirements, including a large number of installations outside the US for Citibank, Amex and GE Capital, and in Japan. It handled over 150 million cards across multiple installations at that time.

    The head of First Data decided to buy the company at the end of the year 2000, mostly because his existing code base, written in assembler language, couldn’t be made to meet international requirements. The COBOL code First Data bought is now the core of their processing, handling over 600 million cards, far more than any other body of software. Migrating from assembler language to a proven body of COBOL code was a big winner. Twenty years later the code continues to be a winner as its growth by hundreds of millions of accounts demonstrates.

    Conclusion

    Why should anyone care about this ancient history? Easy: to this day, status in software is conferred by being involved with cool new languages that are oh-so-much-better than prior ones. For example if you're involved in super-cool blockchain, no one with a brain in their head would even suggest using an existing language to implement what are called smart contracts. You need something new and "safe." Sure. The fact is, there have been no significant advances in programming languages in the last fifty years. Nothing but rhetoric and baseless claims.

  • Software Programming Language Evolution: 4GL’s and more

    Not long after third-generation computer languages (3-GL’s) got established, ever-creative software types started inventing the next generation. In a prior post, I’ve covered two amazing programming environments that were truly an advance. They were both widely used in multiple variations, and programs written using them continue to perform important functions today – for example powering more hospitals than any competing system. But they were pretty much stand-alone unicorns; the academic community ignored them entirely, and nearly all the leading figures, experts and trend-setters in software looked elsewhere.

    Experts “in the know” directed their attention to what came to be called fourth-generation languages (4-GL’s) and object-oriented (O-O) 3-GL’s. These were supposed to be the future of software. Let’s see what happened with 4-GL's.

    The background of 4-GL’s

    The earlier posts in this series give background that is helpful to understand the following discussion.

    In prior posts I’ve given an overview of the advances in programming languages, described in detail the major advances and defined just what is meant by “high” in the phrase high-level language. I’ve described two true advances beyond 3-GL’s. And then there were 4-GL’s, supposedly a whole generation beyond the 3-GL’s. Let’s take a look at them.

    The best way to understand 4-GL’s is to look at the context in which they were invented. First, the academic types were busy at work creating languages that essentially ignored how data got into and out of the program. The first of these was Algol, followed by others. The academic community got all excited by this class of languages, but they were ignored by the large community of programmers who had to get things done with computers. That was in the background. In the foreground, modern DBMS’s were invented and commercialized.

    4-GL's!

    Seemingly everywhere, new languages sprang to life, created inside, around and on top of DBMS's. You can almost hear the inventors: it's a revolution, a once-in-a-lifetime opportunity to become a major milestone in software history! My name could be right up there with von Neumann and Turing!

    All the major DBMS vendors created their own languages, usually with snappy names like Informix 4GL and Oracle's PL/SQL. How could they fail to respond to this massive opportunity for market expansion?

    Brand-new vendors popped up to take advantage of the hunger for a DBMS along with the new hardware configuration of client-server computing, in which an application ran on a group of Microsoft Windows PC's, all connected with and sharing a DBMS running on a server. One startup that powered to great commercial success was a company called PowerSoft, which created a product called Powerbuilder. The Powerbuilder development environment enabled you to work directly with a DBMS schema and create a program that would interact with a user and the data. The central feature of the system was an interactive component called a DataWindow, which enabled you to visually select data from the database and create a UI for it supporting standard CRUD (create, read, update and delete) functions without writing code. This was a real time-saver.

    The 3-GL's Respond

    Vendors of 3-GL's couldn't ignore the tumult raging outside their comfy offices. Before long, support was added to most COBOL systems to embed SQL statements right in the code. Sounds simple, right? It was anything but. COBOL programs had data definitions that the majority of their lines of code depended on. The way to handle the mis-match between SQL tables and COBOL record definitions wasn't uniform, but in many cases a single COBOL Read statement was replaced with embedded SQL plus additional new COBOL code to map between the DBMS results and the data structures already in the COBOL. Ditto when data was being updated and written. Then there's the little detail that DBMS performance was dramatically worse than simple COBOL ISAM performance, since DBMS's were encumbered with huge amounts of functionality not needed by COBOL programs but which couldn't be circumvented or turned off.
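    To see the kind of mapping code involved, here's a sketch in Python standing in for the COBOL case, using the built-in sqlite3 module; the table, field and record names are invented. What used to be a single native read becomes a query plus code to map each column back into the record layout the rest of the program expects.

```python
# What "one Read statement" turned into: a query plus mapping code
# between the DBMS result and the program's existing record layout.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (acct_no TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('00042', 1500)")

# Before: the record definition the rest of the program already uses.
record = {"ACCT-NO": "", "BALANCE": 0}

# After: the single "read" becomes embedded SQL...
row = conn.execute(
    "SELECT acct_no, balance FROM accounts WHERE acct_no = ?", ("00042",)
).fetchone()
# ...plus mapping code from each DBMS column back into the record.
record["ACCT-NO"] = row[0]
record["BALANCE"] = row[1]
```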

    Net result: the 3-GL's were worse off, by quite a bit.

    What Happened?

    Naturally the programming landscape is dominated by 4-GL's today, right? Or maybe their successors? How could it be otherwise? Just as each new generation of languages represented a massive advance in productivity from the earlier one and became the widely-accepted standard, why wouldn't this happen again?

    It didn't happen. 4-GL's are largely of historic interest today, mostly confined to legacy code that no one can be bothered to re-write. Even the systems that genuinely provided a productivity advantage like Powerbuilder faded into stasis, rarely used to build new programs.

    There is a great deal to be said about this fact. One of the factors is certainly the rise to dominance of object-oriented orthodoxy, which in spite of supposedly being centered on data definitions (classes) is nonetheless highly code-centric and has NO productivity gain over non-O-O languages. Where have you read that before? Nowhere? That's also where you'll find all the studies showing in great detail how object orientation achieves its productivity gains: nowhere. What can I say? Computer Non-Science reigns supreme.

    Conclusion

    I won't be writing a follow-up blog post on 5-GL's. Yes, they existed and were the hot thing at the time. I remember vividly all the hand-wringing in the US over the massive effort in Japan, where the government funded research into fifth-generation languages. The US would be left in the dust by Japan in software, just as they were beating us in car design and manufacturing! When was the last time you heard about that? Ever?

    Computers are objective things. Software either works or it doesn't. Unlike perfume, clothes or novels, it's not a matter of taste or personal preference; it's more like math. So what is it with the mis-match between enthusiasm and reality in software? It would be nice to understand it, but what's most important is to understand that much of what goes on in software is NOT based on objective right-or-wrong things like math but on fashion trends and the equivalent of Instagram influencers. Don't know anything about computer history? If you want to be accepted by the experts and elite, that's a good thing. If you want to get things done, quickly and well, ignore it at your peril.


  • Software Programming Language Evolution: Impact of the DBMS Revolution

    The invention and widespread acceptance of the modern database management system (DBMS) has had a dramatic impact on the evolution and use of programming languages. It's part of the landscape today. People just accept it and no one seems to talk about the years of disruption, huge costs and dramatic changes it has caused.

    The DBMS Blasts on to the Scene

    In the 1980’s the modern relational database management system, DBMS, blasted onto the scene. Started by an IBM research scientist, E.F. Codd, and popularized by his collaborator Chris Date, the Structured Query Language, SQL, changed the landscape of programming. Completely apart from normal procedural programming, you had a system in which data could be defined using a Data Definition Language, DDL, and then created, updated and read using SQL. The data definitions were stored in a nice, clean format called a schema. Best of all, the new system gave hope to all the people who wanted access to data but couldn’t get through the logjam of getting custom report programs written in a language the analysts didn’t want to be bothered to learn. SQL hid most of the ugly details because of its declarative approach.

    SQL was about more than just giving access to data. There was a command to insert data into the DBMS (INSERT), to make changes to data (UPDATE), and even to send data to the great bit-bucket in the sky (DELETE). The system even came with transaction processing controls, so that you could perform a deduction from one user's account and an addition to another user's account and ensure that either they both happened or neither did. Best of all, the system did comprehensive logging, making a permanent record of who made what changes to which data and when. Complete and self-contained!
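    Here's that transaction guarantee sketched with Python's built-in sqlite3 module; the account names and amounts are invented. The deduction and the matching addition either both happen or neither does.

```python
# The all-or-nothing transaction guarantee, using sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, amount, fail_midway=False):
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("UPDATE accounts SET balance = balance - ? "
                     "WHERE name = 'alice'", (amount,))
        if fail_midway:
            raise RuntimeError("crash between deduction and addition")
        conn.execute("UPDATE accounts SET balance = balance + ? "
                     "WHERE name = 'bob'", (amount,))

try:
    transfer(conn, 80, fail_midway=True)   # the deduction runs, then a crash...
except RuntimeError:
    pass                                   # ...and the DBMS rolls everything back

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
# balances is {'alice': 100, 'bob': 50}: neither change survived
```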

    This impressive functionality led to a problem. With users demonstrating outside the offices of the programming department, things were getting rowdy. The chants would go something like this:

    Leader: What do we want?

    Shouting crowd: OUR DATA!

    Leader: When do we want it?

    Shouting crowd: NOW!

    Everyone wanted a DBMS. They wanted access to their data without having to go through the agony of coming on bended knee to the programming department to get reports written at some point in the distant future.

    The response of languages: what could have happened

    It wouldn't have been difficult for languages to give the DBMS demonstrators what they wanted with little disruption. One possibility was changing a language so that it produced a stream of data changes for the DBMS in addition to its existing data changes. A second possibility was changing a language so that its existing data statements would be applied directly to a DBMS. Either of these alternatives would have supplied a non-disruptive entry of DBMS technology into the computing world. But that's not what happened.

    I personally implemented one of these non-disruptive approaches to DBMS integration in the mid-1980's and it worked. Here's the story:

    I was hired by EnMasse Computer, one of several upstart companies trying to build computers based on the powerful new class of microprocessors that were then emerging. EnMasse focused on the commercial market which was at the time dominated by minicomputers made by companies like DEC, Data General and Prime. Having a DBMS was considered essential by new buyers in this market, but most existing applications were written in languages without DBMS support. One of my jobs was to figure out how to address the need. I was told to focus on COBOL.

    This was a big problem because the way data was represented in COBOL didn't map well into relational database structures. What I did was get a copy of the source code of our chosen DBMS, Informix, and modify it so it could directly accept COBOL data structures and data types. I then went into the runtime system and modified it to send native COBOL read, write, update and delete commands directly to the data store, bypassing all the heavy-weight DBMS overhead. This was tested and proven with existing COBOL programs. It worked and the COBOL programs ran with undiminished speed. The net result was that unmodified COBOL used a DBMS for all its data, enabling business users full access to all the data without programming.

    I did all the work to make this happen personally, with some help from an assistant.

    I thought this was an obvious solution to the problem that everyone would take. It turns out that EnMasse failed as a business and that no one else took the simple approach that was best for everyone.

    The response of languages: what did happen

    What actually happened in real life was a huge investment with widespread disruption. Instead of burying the conversion at the systems level, massive efforts were made to modify — practically re-write — programs written in COBOL and other languages so that instead of using their native I/O commands they used SQL commands, with the added trouble of mapping and converting all the incompatible data structures. More effort went into modifying a single program for this purpose than I put into making the changes at the systems level that made the issue go away. What's worse is that, because of the massive overhead imposed by DBMS's for data manipulation using their commands instead of native methods, performance was degraded by large factors.

    While the ever-increasing speed of computers mitigated the impact of the performance penalty, in many cases it was still too much. In the late 1990's, after massive increases in computer power and speed, program creators were using the stored procedure languages supplied by database vendors to enable business logic to run entirely inside the DBMS instead of bouncing back and forth between the DBMS and the application. While this addressed the performance issue of using a DBMS for storage, it meant the logic of a business application was now written in two entirely different languages running on two different machines, usually with different ways of representing the data. Nightmare.

    The rise of OLAP and ETL

    One of the many ironies of the developments I've described is that people eventually noticed that the way data was organized for sensible computer program use was VERY different from the best ways to organize it for reporting and analysis. The terms that emerged were OLTP, On-line Transaction Processing, and OLAP, On-Line Analytical Processing. In OLTP, it's best to have data organized in what's called normalized form, in which each piece of data is stored exactly once in one place. This makes it so that a program doesn't have to do lots of work when, for example, a person changes their phone number; the program just goes to the one and only place phone number is stored and makes the change. OLAP is a different story because there's no need to update data that's already been created — just add new data.
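    A tiny sketch of the difference, using plain Python structures (all names invented): in the normalized OLTP shape a phone number lives in exactly one place, while the denormalized OLAP-style rows each carry their own copy, which is fine precisely because they are never updated.

```python
# Normalized (OLTP): the phone number is stored exactly once.
customers = {"c1": {"name": "Pat", "phone": "555-1111"}}
orders = [
    {"order_id": 1, "customer": "c1"},   # each order refers to the customer...
    {"order_id": 2, "customer": "c1"},   # ...rather than copying the phone
]

# A phone change touches one record, and every order "sees" it.
customers["c1"]["phone"] = "555-2222"
phones_seen = {customers[o["customer"]]["phone"] for o in orders}

# Denormalized (OLAP-style): each row carries its own copy. Safe for
# analysis because old rows are never updated, only new rows added.
fact_rows = [
    {"order_id": 1, "customer_phone": "555-1111"},
    {"order_id": 2, "customer_phone": "555-1111"},
]
```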

    There were also practical details, like the fact that data was stored and manipulated by multiple programs, many of which had overlapping data contents — for example a bank that has a program to handle checking accounts and a separate program for CD's, even though a single customer could have both. This led to the rise of a special use of DBMS technology called a Data Warehouse, which was supposed to hold a copy of all a system's data. A technology called ETL, Extract Transform and Load, emerged to grab the data from wherever it was first created, convert it as needed and store it in a centralized place for analysis and reporting.
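    Here's a minimal ETL sketch in Python; the two source "systems" and their formats are invented. Extract rows from each system, transform them to one common shape, and load them into the warehouse.

```python
# Extract: two separate systems with overlapping, incompatible data.
checking = [("C-1", "Pat Smith", 120000)]               # balance in cents
cds = [{"id": "D-9", "holder": "Pat Smith", "usd": 5000.0}]

# Transform: convert each source format to one common shape.
def transform_checking(row):
    acct, holder, cents = row
    return {"account": acct, "holder": holder, "balance_usd": cents / 100}

def transform_cd(rec):
    return {"account": rec["id"], "holder": rec["holder"], "balance_usd": rec["usd"]}

# Load: store everything in the centralized warehouse copy.
warehouse = []
warehouse.extend(transform_checking(r) for r in checking)
warehouse.extend(transform_cd(r) for r in cds)
```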

    Given that you really don't want people sending SQL statements to active transaction systems, where they could easily drag down performance, and given all the factors above, it turns out that the push to make normal programs run on top of DBMS systems was a monstrous waste of time. One that continues to this day!

    Conclusion

    Nearly all programmers today assume that production programs should be written using a DBMS. While alternatives like noSQL and key-value stores have emerged, they don't have widespread use. Since the data structures used by programs are often very different than those used by the DBMS, a variety of work-arounds have been devised, such as ORM's (Object-Relational Mappers), each of which has its own performance problems and labor-intensive issues. The invention and near-universal use of relational DBMS in software programming is a rarely recognized disaster with ongoing consequences.


  • Software Programming Language Evolution: Credit Card Software Examples 1

    In prior posts I've discussed the nature of programming languages and their evolution. I have given an overview of the so-called advances in programming languages made in the last 50 years. Most recently I described a couple of major advances beyond the 3-GL's. The purpose of this post is to give a couple real-life examples of how amazing new 4-GL’s and O-O languages have worked out in practice.

    I was CTO of a major credit card software company in the late 1990’s. Because of that I had a front-row seat in what turned out to be a rare clinical trial of the power and productivity of the two major new categories of programming languages that were supposed to transform the practice of programming. Of course no one, in academia or elsewhere, has written about this real-world clinical trial or any of the similar ones that have played out over the years.

    Bank One and 4-GL's

    Bank One, based in Columbus, Ohio, was a major force among banks in the 1990’s. They were growing and projected a strong image of innovation. During the 1990’s the notion that applications should be based on a DBMS was becoming standard doctrine, and the companies that valued productivity over Computer Science (and internet) purity were united behind one form or another of 4-GL as the tool of choice to get things done. Together with Andersen Consulting, one of the giant consulting companies at the time, Bank One proceeded on a huge project to re-write all their credit card processing code into a 4-GL.

    After spending well north of $50 million (I heard nearly $100 million) and taking over 3 years, the project was quietly shelved, though industry insiders all heard the basic story. No one had an explanation. 4-GLs are amazing, so much better than ancient things like COBOL – and card processing is just simple arithmetic, right, with a bit of calculating interest charges thrown in? How hard could it be? Harder than a 4-GL wielded by a crack team from one of the country’s top tech consulting firms could pull off with years of time and a giant budget, I guess.
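    As an aside, even the "simple arithmetic" of interest charges hides policy choices: which balance to average, a 360- vs. 365-day year, rounding rules, grace periods. A minimal sketch of one common scheme, the average-daily-balance method (the function and the illustrative figures are my own, not anything from Bank One's system):

    ```python
    from decimal import Decimal, ROUND_HALF_UP

    def monthly_interest(daily_balances, apr):
        """Interest for one billing cycle via the average-daily-balance method.

        daily_balances: one Decimal per day of the cycle.
        apr: annual percentage rate as a Decimal, e.g. Decimal("0.1999").
        """
        avg = sum(daily_balances) / len(daily_balances)
        # Daily periodic rate = APR / 365 (some issuers use 360), applied
        # to the average balance for each day of the cycle.
        interest = avg * (apr / 365) * len(daily_balances)
        # Money math wants explicit rounding, not float behavior.
        return interest.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

    # A 30-day cycle: $1,000 carried for 10 days, $400 for the remaining 20.
    balances = [Decimal("1000")] * 10 + [Decimal("400")] * 20
    charge = monthly_interest(balances, Decimal("0.1999"))  # Decimal("9.86")
    ```

    Ten lines of arithmetic, and already two judgment calls (day-count basis, rounding mode) where a rewrite can silently disagree with the system it replaces.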

    On top of everything else, they had a clear and unambiguous definition of what the 4-GL program needed to do in the form of the existing system. They had test cases and test data. This already eliminates a huge amount of work and uncertainty in building new software. Compared with most software projects, the work was simple: just do what the old program did, using existing data as the test case. This fact isolated the influences on the outcome so that the power and productivity of the 4-GL was the most important factor. Fail.

    Word of this should have gotten out. There should have been headlines in industry publications. The burgeoning 4-GL industry should have been shattered. Computer Science professors who actually cared about real things should have swarmed all over and figured out what the inherent limitations of 4-GLs were, whether they could be fixed, or whether the whole idea was nothing but puffery and arm-waving. None of this happened. Do you need to know anything else to conclude that Computer Science is based on less rationality than Anna Wintour and Vogue?

    Capital One and Java

    Capital One was the card division of a full-service bank that was spun out in 1994, becoming an unusual bank whose only business was to issue credit cards. In just a couple years the internet boom started, and with it enthusiasm for the most prominent object-oriented language for the enterprise, Java. Capital One management was driving change in the card world and presumably felt they needed a modern technology underpinning to do it fully. So they authorized a massive project to re-write their entire existing card software from COBOL to Java. I remember reading at the time that they expected incredible flexibility and the power to evolve their business rapidly from the unprecedented power of Java.

    The project took a couple years and was funded to the tune of many tens of millions of dollars; the amounts were never made public. As time went on, we heard less about it. Then there was a small ceremony and the project was declared a success, a testimony to the forward-looking executive management and pioneering tech team at the company. Then silence. I poked around with industry friends and discovered that the code had indeed been put into production – but just in Canada, which was a new market for the company at the time, handling a tiny number of cards. Why? It didn’t have anywhere close to the features and processing power that the existing COBOL system had to handle the large US card base. It just couldn't do it, and company management decided to stop throwing good money after bad.

    Conclusion

    Executives and tech teams at major corporations bought into the fantasy that the latest 4-GLs and O-O languages would transform the process of writing software. They put huge amounts of money behind the best available teams to reap the benefit for their business. And they failed.

    These projects and their horrible outcomes should have made headlines in industry publications and been seared in the minds of academics. Software experts should have changed their tune as a result, or found what went wrong and fixed it. None of this happened. It tells you all you need to know about the power and productivity gains delivered by 4-GLs and object-oriented languages. Nothing has changed in the roughly twenty years since these events took place except for further evidence for the same conclusion piling up and the never-ending ability of industry participants and gurus to ignore the evidence.

     

  • The Bronte Sisters and Software

    Who would have thought that the amazing, pioneering and tragic Bronte sisters could demonstrate important things about software programming languages? Not me, until I started thinking about it. I realized that their achievement has a close parallel to what great programmers do: they don’t invent a new language, they use an existing language to express new things, thoughts that were in their heads but which hadn’t before been published.

    The Sisters

    [Portrait: The Brontë Sisters, by Patrick Branwell Brontë]

    I hope you’ve at least heard of these ladies, and even better read a couple of their wonderful novels, among which are Charlotte’s Jane Eyre, Emily’s Wuthering Heights and Anne’s The Tenant of Wildfell Hall.

    Their novels were very successful. Originally published with a man’s name listed as author, their success continued after their real identities were revealed, which demonstrates that their success was due solely to the quality and originality of their work. There have been movies and numerous references to them and their work in other media.

    If you haven’t already spent some time enjoying their work, I hope you will.

    In all the talk about the Brontes, no one bothers to mention the perfectly obvious fact that they used the English language of their day to write their novels. English didn’t hold them back. Nor did English “help” them. Their originality was completely in the way they used the English language.

    The Brontes and Software Languages

    The obvious response to the above is … duhhh, the reason no one mentions their use of unaltered, un-enhanced English is that nearly all novelists do the same.

    Now let’s turn to programming. Most programmers, like novelists, just use the language they've been given to get the job done. Most programmers, like most who attempt to write a novel, do pedestrian work.

    Unlike novelists, there is a subset of programmers who obsess about which is the “best” language as measured by various scales. Programmers who consider themselves a cut above the rest fiercely criticize this or that tiny detail of whatever language is in their cross-hairs this morning. If their ire runs at peak level for a while, they may even invent a new language. Why? Their amazing new language will “prevent” programmers from making this or that kind of error – like sure, when has that ever happened – or somehow raise whoever uses it to new levels of power, productivity and quality. Not. Never happened. Baseless assertions and propaganda.

    Was something important “added” or “corrected” in the English language that enabled the Brontes to do what they did? Nope.

    This leads to a thought that is blasphemous to the self-appointed elite of software: with some exceptions, the software language you use is almost irrelevant; what's important is what you write in the language you're using. Just like with novels.

    Languages and science

    Hold on there just a sec! Novels are fiction, meant to entertain. Completely different subject. Software is like math — it's pure and exact, devoid of messy things like the emotions and nuances of human interaction that novels are full of.

    True enough. First I would say: try having a discussion about the differences between programming languages with one of the software elite who obsess about the subject. See how much "cool, calm and collected" you get; every time I've tried having a rational discussion on this subject over the years, voices have gone up notch after notch and passion has been slopping all over the place.

    Perhaps we can be enlightened by a question that's been raised over the years: what is the best language for science and/or math? There are even books on the subject!

    Let's take a quick look at a bit of evidence:


    Skipping over loads of details, what you quickly find is that, not long after Galileo broke with tradition and wrote in his normal speaking language (Italian) instead of Latin, scientists tended to write in whatever language they used in everyday life. Chemistry was dominated by the German language in the 1800's not because German was somehow better for chemistry (which didn't stop some people from arguing that it was), but because most of the productive chemists happened to be most comfortable writing in German, mostly because they spoke German. A few years ago a couple of Norwegian scientists were awarded the Nobel Prize. They probably spoke Norwegian in the lab, but if they wanted to be read, they had to write in a widely-read language: English. Not because English was "better" for science, but just more widespread at that point.

    In all these cases, the language happened to be used for expressing the thoughts, facts and concepts — which were independent of the language used!

    Just like it is in … software programming!

    Conclusion

    With a few important exceptions, the language you use to write a program is like the language you use to write a scientific paper or a novel. The language used is not the most important thing. By far. The most important thing is what you have to say in whatever language you end up using.
