Category: Software Quality

  • Summary: Software Quality Assurance

    This is a summary with links to my posts on the optimal method of achieving software quality, followed by a sample of the endless quality problems that come from not using those methods, including at the supposedly smart big tech companies.

    Software quality results from the whole process of specifying and building software, not just the formal quality and testing disciplines, as noted here:

    https://blackliszt.com/2011/07/software-quality-theory-and-reality.html

    There are groups of software people who use proven, decades-old methods to build high quality software quickly and well, methods that violate the regulations and professional methods that dominate the field. Here is an introduction to the winning method of software QA.

    https://blackliszt.com/2021/07/the-revolutionary-championchallenger-method-of-software-qa.html

    The concept of logical redundancy in addition to physical redundancy is a key aspect of the champion/challenger approach to quality.

    https://blackliszt.com/2011/04/single-point-of-failure-logical-vs-physical.html

    The typical methods of QA lead to endless work, added to already-long development.

    https://blackliszt.com/2015/12/speed-optimized-software-qa-or-cancer.html

    Good QA recognizes that one method should be used for getting software to do what you want and a different method for keeping it right, a.k.a. “regression testing.”

    https://blackliszt.com/2012/04/a-simple-framework-for-software-quality-assurance.html

    A remarkable thing about traditional QA is that it’s all built and run in the lab, while good QA is focused on the only environment users care about, essentially field-testing.

    https://blackliszt.com/2012/04/field-tested-software.html

    Software Quality Problems

    Software has a quality problem. It's big and widespread. It affects nearly all software development efforts. It's not getting better. It's not just the QA process. Here are some highlights of the horrors and why it's so bad.

    https://blackliszt.com/2011/06/why-computer-software-is-so-bad.html

    Is anybody worried about this? It doesn't appear that way, even though software quality is right up there with "government efficiency" and "northern hospitality."

    https://blackliszt.com/2011/06/software-quality-horror-failure-tragedy-and-ineptitude.html

    Even simple-sounding things like making a web site work reasonably well for the people who use it are apparently too much to ask of many software departments.

    https://blackliszt.com/2011/01/what-do-consumers-want-web-site.html

    Given this, you would think that expensive and time-consuming methods that don’t work would have long since been discarded. But the bad methods are taught in academic Computer Science and broadly supported throughout the industry. The people in charge of quality say they’re doing the best they can – and they are, given the terrible methods they use.

    Big companies with all their resources can’t build – or even buy – quality software.

    https://blackliszt.com/2015/09/software-quality-at-big-companies-united-hp-and-google.html

    https://blackliszt.com/2021/02/why-cant-big-companies-build-or-even-buy-sofware-that-works.html

    https://blackliszt.com/2021/03/why-cant-big-companies-build-software-that-works.html

    It's not just "core" software; quality problems infect software wherever it's used, including customer service and surveys.

    https://blackliszt.com/2016/07/gartner-group-big-company-customer-service.html

    https://blackliszt.com/2021/05/anthem-needs-my-feedback-reveals-deep-problems.htm

    Big software companies aren't any better.

    https://blackliszt.com/2015/08/large-organization-software-fails-the-case-of-microsoft-windows.html

    You might think that those cool internet companies would get it right.

    https://blackliszt.com/2012/01/internet-software-quality-horror-shows.html

    Twitter is a good example of cool-company software failure.

    https://blackliszt.com/twitter/

    Like Google and FB, Twitter masks errors in its search and sequencing algorithms, presenting whatever answers its error-prone software comes up with as facts. All sorts of major organizations have endorsed Twitter results as factual when those results are riddled with errors.

    https://blackliszt.com/2013/05/the-bogus-basis-of-trending-on-twitter.html

    Facebook is as bad as the rest. Here are the details on FB’s mobile app, including what its users have to say.

    https://blackliszt.com/2014/11/facebooks-software-quality.html

    And generally about quality at FB:

    https://blackliszt.com/2017/03/software-giants-image-and-reality-facebook.html

    When you look at a financially successful company based on software, it’s natural to think that they must know how to build great software. Facebook grew to its strong position in the market in spite of its poor software quality. Here’s an explanation of FB’s path to success, which shows all FB had to do was not be drastically worse than their competitors in terms of quality.

    https://blackliszt.com/2014/12/fb.html

    Here’s an example of a government software QA disaster with serious human consequences.

    https://blackliszt.com/2011/07/software-quality-horror-tales-electronic-diversity-visas.html

    Naturally, since government is so great at software, sometimes it intervenes to "make things better."

    https://blackliszt.com/2015/04/the-government-wants-to-help-ubers-software-quality.html

    Bad software QA goes way beyond screwing things up; it can kill people.

    https://blackliszt.com/2012/03/why-software-quality-stinks.html

    There is absolutely no excuse for this, when proven methods are available.

    Here’s an overview of the book on the subject.

    https://blackliszt.com/2012/10/software-quality-assurance-book.html

     

  • Software NEVER needs to be “Maintained”

    We maintain our cars, homes and devices. Heating and cooling systems need regular maintenance. So do our bodies! If we don’t care for our bodies properly, they break down! Software, by sharp contrast, never needs to be maintained. NEVER! Using the word “maintenance” to describe applying a “maintenance update” to software is beyond misleading. More accurate would be to say “a new version of the software that was crippled by a horrible design error that our perpetually broken quality processes failed to catch.” That’s not “maintenance.” It’s an urgent “factory recall” to fix a design error that infects every car (copy of the software) that was built using the flawed design.

    Software is different than almost everything

    Software is unlike nearly everything in our experience. It is literally invisible. Even “experts” have trouble understanding a given body of code, much less the vast continent of code it interacts with. Naturally, we apply real-world metaphors to give us a chance of understanding it. While sometimes helpful, the metaphors often prove to be seriously misleading, giving nearly everyone a deeply inaccurate view of the underlying invisible reality. The notion of “software maintenance” is a classic example. The flaw is similar to the words “software factory.”

    Maintaining anything physical centers around either preventing or repairing things that break due to simple wear-and-tear or an adverse event. We change the oil in a car because it degrades with use. We change the filters in heating and cooling units because they get clogged up with the gunk from the air that passes through them. We sharpen knives that have dulled as a result of use. We maintain our homes and yards. It’s the physical world and things happen.

    In the invisible, non-physical world of software, by contrast, a body of software is the same after years of use as it was the moment it was created. Nothing gets worn down. Nothing gets clogged. An inspection after years of heavy use would show that every single bit, every one and zero, was the same as it was when it was created. Of course there are memory crashes, hacker changes, etc. It’s not that software is impervious to being changed; it’s just that software is unchanged as a result of being used – unlike everything in the normal physical world, which, one way or another, is changed by its environment – everything from clothes getting wrinkled or dirty from wear to seats being worn down by being sat upon.
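The point that use alone changes nothing can be checked directly: fingerprint a program's bits the day it ships, run it (or read it) as much as you like, and fingerprint it again. A minimal sketch in Python; the file and its contents are stand-ins for a real shipped binary:

```python
import hashlib
import os
import tempfile

def fingerprint(path):
    """SHA-256 of a file's exact bits -- every one and zero."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# "Ship" a program, then "use" it heavily by reading it over and over.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"\x7fELF...pretend these bytes are a compiled program")

before = fingerprint(path)
for _ in range(10_000):        # years of heavy use, compressed
    with open(path, "rb") as f:
        f.read()
after = fingerprint(path)

assert before == after         # not a single bit has changed
os.remove(path)
```

Contrast that with any physical object: no oil change, filter swap, or sharpening will ever be needed, because nothing degrades.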

    The meaning of software maintenance

    When a car is proven to have a design flaw, auto manufacturers are reluctant to ship everyone a new car in which the original design flaw has been corrected. Instead, they issue a recall notice to each affected owner, urging them to bring their car to the nearest dealership for a repair that corrects the design flaw. It’s inconvenient for the owner, but far less expensive for the manufacturer. With software, by contrast, all the software vendor has to do is make a corrected version of the software available for download and installation, the software equivalent of shipping everyone a new car! It’s no more expensive to “ship” hundreds of megabytes of “brand-new” code than it is a tiny bit. Such are the wonders of software.

    Software factory recalls are part of everyday life. Software creators are maddeningly unable to create error-free software that is also cyber-secure. See this.

    We’ve all become accustomed to the Three Stooges model of building software.

    111

    There are highly paid hordes of cosseted employees enjoying free lunches and lounging on bean bags on luxurious campuses, “hard at work” creating leading-edge software whose only consistent feature is that it’s late, expensive, and chock-full of bugs and security flaws.

    While the Three Stooges and their loyal armies of followers are busily at work creating standards, regulations and academic departments devoted to churning out well-indoctrinated new members of the Stooge brigades, rebels are quietly at work creating software that is needed to meet the needs of under-served customers, using tools and methods that … gasp! … actually work. What an idea!

    The good news is that the rebels are often richly rewarded for their apostasy by customers who eagerly use the results of their work. It’s a good thing for the customers that the totalitarian masters of the Three Stooges software status quo are no better at enforcing their standards than they are at building software that, you know, works.

  • The Revolutionary Champion/Challenger Method of Software QA

    Having issues with software quality? Have you tried test-driven development? Do you have test scripts based on test requirements? Do you have a rich sandbox with good test data? What’s your code coverage? How often is a fully tested version released into production with problems that were missed? How often does the new feature work adequately in production but with destructive side effects that integration testing missed? How long and expensive is the pipeline from programmer’s release to working production? Are you proud of being on the forefront of development methods with Agile, Scrum and the rest but still avoid releasing what comes out of each Sprint to production because you can’t risk another disaster?

    If any of these apply to you, you may want to consider a decades-old method of software QA, widely proven in production, that enables frequent, error-free releases to production with almost no overhead or traditional QA work. It’s not taught, there are no certifications, and it’s ignored by mainstream software experts. But it works.

    I used to call it “comparison-based QA.” The term is appropriate because the core of the method is comparing results of the production version of the code with a test version. A CTO with a data science background suggested a better term, which I will use henceforth: champion/challenger software QA. It’s like when you’ve got a good model, the champion, and you want to see if a new model, the challenger, yields better results. Or it’s like A/B testing of a consumer UI, when you want to see if a proposed variation works better. In both cases, you’ve done something new and you want to do two things: (1) see if the new thing works like it should, and (2) make sure nothing that used to work is broken. Sounds like feature testing and regression testing, doesn’t it? Yup!

    Nobody writes test requirements or feature tests for champion/challenger. You just feed the new thing the same data you fed the old thing and compare the results. You expect differences. If something is better and nothing is worse, you could have a winner – you test with a wider range of data, and if the results hold up, the challenger becomes the new champion and you move on.
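The core loop is nothing more than running both versions over the same real-world inputs and diffing the outputs. Here is a minimal sketch in Python; the champion and challenger functions and the records are illustrative stand-ins, not anything from a real system:

```python
def compare(champion, challenger, records):
    """Run both versions on the same inputs and collect
    every record where the outputs differ."""
    diffs = []
    for rec in records:
        old, new = champion(rec), challenger(rec)
        if old != new:
            diffs.append((rec, old, new))
    return diffs

# Toy example: the champion rounds a score; the challenger
# adds a floor at zero -- the one change we intended to make.
champion   = lambda r: round(r["score"])
challenger = lambda r: max(0, round(r["score"]))

records = [{"score": 3.6}, {"score": -1.2}, {"score": 0.4}]
diffs = compare(champion, challenger, records)

# Inspect every difference: if each one is an expected improvement
# and there are no surprises, the challenger is a candidate champion.
for rec, old, new in diffs:
    print(rec, old, new)
```

Here the only difference is the negative score, exactly the intended change, so feature testing and regression testing both fall out of a single diff. An unexpected entry in `diffs` would be a regression caught before release.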

    Here are the main beauties of champion/challenger:

    • The tests are nothing but inputs and saved outputs of real-world processing. Nothing artificial, no carefully crafted "test data."
    • You create the code for comparison of output data once. The code/model could get endlessly complex, but the output comparison identifies differences whether it has 10KB or 10GB to compare. No test scripts!
      • You can do the comparison with a UI with some extra work, but again one-and-done.
    • The comparison pulls out all the differences. You look to see if each difference is something you expected: if everything new you expected is there, and crucially if there are any unexpected differences. If there are, you’ve got a regression failure to fix.
    • You can do the comparison in batch with large samples of old data, giving you great coverage.
    • You can do the comparison live, in a production environment, using the challenger code/model output for comparison and sending only the champion’s output to the user.
      • You can test the changes you've made on the challenger code alone, looking at the challenger code's output, to see if you're happy with the change you made. While you're doing that, the challenger code will simultaneously be running production data, which the comparison with the champion will make sure is still OK.
      • After you’ve done enough live parallel production testing, you can be confident that the challenger will do everything the champion did – with real-world data in the production environment — so you merely turn the switch so that challenger output becomes live and the old champion is retired.
      • You can do this in stages to crank the risk way down, so that the challenger becomes champion for just 10% of the inputs and then gradually turn up the fraction. This guarantees problem-free production releases.
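The staged cutover in the last bullet can be as simple as hashing each input's ID into a bucket and comparing it against a dial-up percentage, so the same input always routes the same way. A minimal sketch; the names and the specific routing scheme are illustrative assumptions, not a prescription:

```python
import hashlib

def serves_challenger(input_id, percent):
    """Deterministically route `percent`% of inputs to the challenger.
    A given input always lands in the same bucket, so the dial can
    be turned up gradually: 10, 25, 50, 100."""
    bucket = int(hashlib.sha256(input_id.encode()).hexdigest(), 16) % 100
    return bucket < percent

ids = [f"req-{n}" for n in range(1000)]
share = sum(serves_challenger(i, 10) for i in ids) / len(ids)
print(f"{share:.0%} of traffic on the challenger")  # prints the current share
```

Turning the dial from 10 to 100 over a few cycles, with the output comparison still running, is what makes the final switch a non-event.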

    I've written about the details of how to get this done in a book.

    I've written a post explaining the core concepts of the two main aspects of QA and how they're different and best served by different methods.

    I've written about the path that led me to writing this and related books and summaries of the books.

    I've written about how this method provides crucial fuel enabling startups to surpass tech groups hundreds of times their size along with specific strategies those groups use here and here.

    The method is a "secret" — because the vast majority of industry elites choose to ignore it. Want to succeed against industry giants? Use secret methods like champion/challenger to enable rapid release of quickly evolving code with zero errors in production and minimal QA effort. You'll run circles around the lumbering, stumbling giants and then leave them in your dust.

  • Why Can’t Big Companies Build or even Buy Software that Works?

    Many large companies depend on software. They often have staffs of thousands of people using the best methods and project management techniques, supported by professional HR departments and run by serious, experienced software management professionals. They can afford to pay up so that they get the best people. Why is it, then, that after all these years, they still can't build software that works?

    Some of these giants recognize that they can't build software. So they buy it instead! Surely with careful judging and the ability to pay for the best, they can at least slap their logo on top-grade software, right? Sadly, the facts lead us to respond … not right.

    What company doesn't want to be part of the digital revolution and have an app? If you're a major health insurance company, why wouldn't you replace old-fashioned insurance cards with something always up-to-date that comes on an app?

    As an Anthem customer, I can see that they've gotten with the program. I got this email from them:

    Capture

    An app, huh? Why is it called Sydney? First, let's keep it simple. They say I can now download a digital version of my ID card, so let's try that first.

    I clicked on the link, which brought me to the main Anthem login. I logged in. What I expected was normal website behavior: a deep link to the right page, reached after having to log in. This "exotic" technique, standard practice for over a decade with websites that care about their users, was beyond the professionals at Anthem. After logging in, I got to my profile. Where's my digital card?? I guess it's one of their intelligence and mental status tests, where they count the clicks and the time it takes for you to get where you're going.

    Hoping to succeed, I scrolled down in the Profile section and hit gold. I saw this:

    Capture

    That wasn't too hard! Mobile ID cards! Let's see.

    Capture

    Nothing about seeing it, printing it or emailing it. Just an option to turn off getting a physical card in the mail, and a casual mention (with no link, of course) of "our Engage mobile app." What happened to Sydney??

    I thought I had gotten through the usual Anthem obstacle course in record time. Nope. Dead End. There are a lot of people these days screaming about how bad disinformation is and how it needs to be stopped. Hey, guys, over here….!

    Back to the home page. Look at all the menus. Check all the drop-down lists. Under "My Plans" there's something called "ID Cards." Bingo! An image of our cards, front and back, with options to print, email, etc. as promised!

    Nothing about an app, Engage, Sydney or anything else.

    Alright, Anthem, I've had enough of your website. Let me go to the Play Store and check out Sydney. Here's what they say it is:

    Capture

    Sounds pretty good, right? What can it do? Let's see:

    Capture

    Seems like it can do HUGE amounts of stuff! Let's keep going.

    Capture

    OK, I've got it. Maybe "Engage" is something Anthem's own army of programmers built. Maybe it was crap and management decided to buy some best-of-breed software. Makes sense. Perhaps some of the hundreds of programmers no longer working on Engage can be assigned to update the website and make it kinda sorta accurate and usable, you think?

    No doubt Anthem management exercised great care to assure that CareMarket did a great job and was giving them a proven app that customers loved so that when it went out named Sydney, Anthem's reputation would go up. Let's see the reviews:

    Capture

    Over 2,600 reviews. That line by the "1" rating is pretty darn long. Looks longer than 2 to 5 added up. I guess Anthem had trouble threatening enough of their employees into giving 5 star reviews to get the job done, right?

    Let's sample a couple of reviews. Here's the top one when I looked:

    Capture

    "This is the worst app I've ever encountered." Error messages. Failed searches. There's a response from the vendor:

    Capture

    Hey guys, she already gave you "a brief description." Do you test your software? Give it to normal people to try before inflicting it on your innocent, unsuspecting customers? Skimming down, I see that pretty much the same response is given to each tale of woe. Pathetic.

    Here's a sample of other reviews:

    Capture

    Capture

    Capture

    Capture

    Capture

    Get the general drift…?

    This app has been downloaded 500,000 times!! The pain and frustration Anthem is causing is hard to fathom. Why is anyone at Anthem involved with Sydney still employed there? Silly question. Did anyone lose their job after the giant hack at Anthem and the catastrophically bad response to it that I've described?

    Maybe they should hire people from the big tech companies to do stuff like this. Those people really know how to build great software! Uhhhh, not so much. Here is specifically about Facebook's app. For more see this and this and this.

    This big-company software effort is bad beyond belief. I can't comprehend how it is that they pay people big bucks and come out with stuff like this. From what I can tell, though, governments are in close competition for the "prize" of doing the worst job of building and managing software. It's like there's a competition. See this and this.

    The whole world is up in arms about the pandemic. Big powerful people and organizations are taking it seriously and making changes with the intention of fixing the problem. When it comes to the software pandemic, however, everyone just whistles and waltzes along like there's no problem. Everyone just expects and accepts awfulness, acting like it's just how life is.

    It doesn't have to be this way.

  • Surprising Bug at Amazon About a Bug Repeller

    I have found ample reason to mock the golden-glowed tech reputations of most of the tech giants, in addition to the supposed tech prowess of organizations such as the NSA. There is good reason to believe that old-style libraries are more secure.

    I recently stumbled upon a rather glaring bug or result of hacking at Amazon — and found that Amazon provides no way I could find to report the problem.

    The problem was glaring and amusing — a whole set of over 3,000 reviews of a book attached to a pest repelling product. As I'll describe in a future post, this isn't just a bug concerning bug repellers — it's the tip of an iceberg about the foundations of AI/ML.

    Here's how I found the bug bug.

    Stumbling on the bug at Amazon

    I was looking for a product to scare away some pests that have found their way into my house. I gave a close look at the following product:

    Pest 1

    I immediately noticed and was impressed by the large number of reviews, and how favorable they trended. What a popular product! I've got to look at this one. I scrolled down:

    Pest 2

    Things are still looking good. More details:

    Pest 3

    Uh-oh! Right after we see that it has 3,051 reviews with an average rating of 4.8, which is super-good, we see that this pest repeller is #2,695 … in Books!! Something is seriously wrong. Scrolling down, we see some really weird answers to questions:

    Pest 4

    A couple of the people answering clearly tried to tell Amazon there was a problem.

    This next one blew me away — over 3,000 reviews for a pest repeller??!!

    Pest 5

    Now we know for sure something's seriously wrong — a pest repeller is not Children's Literature!

    Here is the nail in the coffin, the pictures of the product and a list of the phrases that frequently appear:

     

    Pest 6

    And here's one of the glowing reviews. It brought tears to the reviewer's eyes, and not because of the strong odor generated by the pest repeller:

    Pest 7

     

    What's going on, Amazon? I don't usually get to mock you. It's usually the other tech giants that receive my sarcasm, places like Google, Facebook, Twitter, Apple and Microsoft. Amazon has done a reasonable job maintaining quality and growing into new areas. This is such a peculiar bug — it's as though some disgruntled employee were messing with things to show his displeasure with the world.

    As we'll see in a following post, this is not an isolated bug, and it has implications that go far beyond Amazon — implications concerning the glorious future promised by all the AI/ML enthusiasts.

  • Computer Security Breach Response Excellence

    Here's what the experts do for computer security:

    • Hire security experts to implement best-in-class security.
    • Follow all the regulations.
    • Pass all the audits.
    • Spend lots of money.

    Then, of course, you get breached, because in spite of doing the above, you have no idea what you're doing…

    Here's how you respond:

    • Get more experts to find what happened.
    • Establish a carefully-thought-out strategy to recover from the breach and minimize damage to your reputation.
    • Alert the public and your users about the event and your concerned, respectful response.

    Then, of course, you change your website and put lots of money into attractive graphics, while making it hard for users to log in or reset their passwords.

    The share-your-expertise website Quora is surely in the running for best-in-class when it comes to computer security; they have followed the above plan with true excellence.

    The Quora Story

    I got this email from Quora, of which I'm an occasional user, on December 3, 2018:

    Capture

     

    Dear David B. Black,

    We are writing to let you know that we recently discovered that some user data was compromised as a result of unauthorized access to our systems by a malicious third party. We are very sorry for any concern or inconvenience this may cause. We are working rapidly to investigate the situation further and take the appropriate steps to prevent such incidents in the future.

    What Happened

    On Friday we discovered that some user data was compromised by a third party who gained unauthorized access to our systems. We're still investigating the precise causes and in addition to the work being conducted by our internal security teams, we have retained a leading digital forensics and security firm to assist us. We have also notified law enforcement officials.

    While the investigation is still ongoing, we have already taken steps to contain the incident, and our efforts to protect our users and prevent this type of incident from happening in the future are our top priority as a company.

    What information was involved

    The following information of yours may have been compromised:

    • Account and user information, e.g. name, email, IP, user ID, encrypted password, user account settings, personalization data
    • Public actions and content including drafts, e.g. questions, answers, comments, blog posts, upvotes
    • Data imported from linked networks when authorized by you, e.g. contacts, demographic information, interests, access tokens (now invalidated)
    • Non-public actions, e.g. answer requests, downvotes, thanks

    Questions and answers that were written anonymously are not affected by this breach as we do not store the identities of people who post anonymous content.

    What we are doing

    While our investigation continues, we're taking additional steps to improve our security:

    • We’re in the process of notifying users whose data has been compromised.
    • Out of an abundance of caution, we are logging out all Quora users who may have been affected, and, if they use a password as their authentication method, we are invalidating their passwords.
    • We believe we’ve identified the root cause and taken steps to address the issue, although our investigation is ongoing and we’ll continue to make security improvements.

    We will continue to work both internally and with our outside experts to gain a full understanding of what happened and take any further action as needed.

    What you can do

    We’ve included more detailed information about more specific questions you may have in our help center, which you can find here.

    While the passwords were encrypted (hashed with a salt that varies for each user), it is generally a best practice not to reuse the same password across multiple services, and we recommend that people change their passwords if they are doing so.

    Conclusion

    It is our responsibility to make sure things like this don’t happen, and we failed to meet that responsibility. We recognize that in order to maintain user trust, we need to work very hard to make sure this does not happen again. There’s little hope of sharing and growing the world’s knowledge if those doing so cannot feel safe and secure, and cannot trust that their information will remain private. We are continuing to work very hard to remedy the situation, and we hope over time to prove that we are worthy of your trust.

    The Quora Team

     

    What a bunch of careful, responsible people, those folks at Quora are! So appropriate for a share-your-expertise site!

    After this notice, I kept getting the occasional teaser email from Quora, tempting me to click and answer a question or see an answer someone else gave. For example I got this one a couple weeks before the breach:

    11

    I know, it's not click-bait for the general public, but definitely a good one for me.

    Yesterday I got the first teaser I'd gotten since the breach email reproduced above. Here's the lead:

    12

    Not a killer issue, but I clicked out of mild curiosity about the answer, and also to see whether Quora was up and running normally. What I got was a lesson in how to respond to a security breach by driving your customers off. It's true, after all, that if there aren't any users, there won't be any meaningful security breaches — problem solved!!

    Here's the landing page — a new thing in itself, because clicking on an email used to be enough to identify you.

    11

    The cute graphics are all new. I put in my password and got the box in red above, telling me I had to reset the password by responding to the email they sent. OK.

    I got a typical password reset email:

    12

    I clicked on the link. I got to see even more wonderful new graphics! These guys are really trying! Then I put in my old password, because I wanted to; it's my password, I should be able to pick any one I want, unless they tell me there are rules.

    11

    Can't use my old password, huh? If you're so sensitive and caring, you could just possibly have warned me about that up front. Oh well. Here's a new one:

    12

    I put it in. It's new. They match. I click on the Reset Password button. Nothing. I change the password and click again. Nothing. Again. Nothing again.

    They just don't want me, it's clear. If I were a normal user, it would have been game over. But I'm not, so I went back to the password reset email and clicked again. This time I put in a brand-new password. Then, clicking worked — it got me to the login page, where I had to enter my email and new password yet again.

    Quora has a big, fat, ugly, super-obvious BUG in their "we're taking responsibility for this breach and hoping to win back the trust of our users" new entry door to their site, not bothering to perform super-elementary QA on one of the main pathways of the new code. Not some obscure condition. Software QA 101.

    So just who are these geniuses at Quora? Are they the super-smart, rich, cool kids that have such a track record of excellence at other tech sites? Like Facebook and Twitter and the rest? It takes a bit of looking, but the simple answer is: yes. Super-smart. Beyond cool. Rich. And still can't get the most elementary details right!

    Business as usual in software. Whether it's government, a big corporation, or a cool young hip tech company, the story is the same: getting stuff to actually, you know, old-fashioned WORK is beneath, beyond, above, or whatever, for whoever's involved. Not to mention making software that protects customer data.

     

  • Speed-optimized Software QA; or Cancer

    QA is QA, right? Either you’re committed to quality or you’re not. If you are, you use the widely accepted tools and techniques and produce a high-quality product, or you accept the fact that you’ll churn out crap. What else is there to say?

    Here’s what: if you accept this view and apply it in the usual software development environment, your QA processes metastasize, invade all parts of the process, and cause endless pain and suffering – not to mention crap software. Eliminating the cancer can only be done by leaving the cancer-causing environment. That means finally saying good-bye to peace-time software and going to war.

    Software Quality Assurance

    Software QA is a BIG subject. There are lots of methods within a broad set of accepted practices. Experts assure us that “quality” goes way beyond testing; it infiltrates every aspect of the SDLC.

    But testing by itself, which is just part of quality, is a HUGE subject. You can get certified. Here are some of the subjects involved in testing, according to a leading certification organization.

    What is fundamental test process in software testing- 2015-10-13 14-21-17

    Here’s what’s involved just in test planning:

    What is fundamental test process in software testing- 2015-10-13 14-22-42

    The SQA cancer

    Most organizations aren’t aware of it, but the typical, expectations-based, peace-time software development organization is highly prone to QA and testing cancer.

    Everybody is supposed to do good work. You’re supposed to take the flawless results of the previous group’s work, do what you do to it, and pass it on to the next group – error-free. Testing_0001

    When it’s obvious you’ve been given crap, it’s easy to reject what you’ve been given and throw it back at the slugs who gave the crap to you. But all too often, the problems don’t emerge until you’re pretty far down the line of doing your own work. Then it’s upsetting and embarrassing. The schedule is at risk, and who’s at fault?

    Once this happens a couple times, on the sending and receiving ends, most people respond by creating as many quality measures as they can, to be applied to the work they’re given. Then, having been blamed for producing bad work a couple times, they run all sorts of tests on the work they’ve produced before passing it on. Testing_0002
    Why wouldn’t you? You don’t want to receive bad work, and you certainly don’t want to be blamed for giving bad work to someone else. But you can see how the QA/testing steps are reproducing.

    Inside coding itself, it can be even worse. In addition to inspecting the code that's calling them and that they're calling, the programmers can "enhance" their code with additional code whose sole purpose is to check whether the inputs they're given — at run-time — are correct, and then again to test the outputs they're about to give. Testing_0003
    As though that isn't enough, lots of quality-minded programmers are way into writing code that tests each individual piece of code; this can be before you write the code (i.e., test-driven development, which is currently fashionable, something programmers say they want to do to show everyone else how good they are) or after you write the code (i.e., plain old unit testing).
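    To make the pattern concrete, here's a minimal sketch (in Python, with invented names and numbers) of the kind of belt-and-suspenders code this paragraph describes: a function that re-checks its inputs at run-time, does its work, and then re-checks its own outputs before returning them — duplicating checks the caller and callee are supposed to have done already.

    ```python
    # Hypothetical example of defensive run-time checking, not from any
    # real codebase: every function distrusts its callers and itself.

    def compute_discount(price: float, rate: float) -> float:
        # Input checks duplicating what callers were already supposed to verify
        assert price >= 0, "price must be non-negative"
        assert 0 <= rate <= 1, "rate must be a fraction between 0 and 1"

        result = price * (1 - rate)

        # Output checks duplicating what the caller will check again anyway
        assert 0 <= result <= price, "discounted price out of range"
        return result

    print(compute_discount(100.0, 0.2))  # 80.0
    ```

    Multiply this pattern across every function boundary in a system and you can see how the checking code starts to rival the working code in volume.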

    What's worse is, I've just skimmed the surface!

    There is no escape

    You think that's bad? Try to do something about it. Whenever anything goes wrong, you're likely to hear one of the following:

    • I wasn't allowed to buy adequate test tools
    • We weren't adequately trained on the tools
    • I didn't have enough time to get enough coverage
    • Lots of old tests had bugs, I had to disable them to meet the deadline
    • We changed so much old code, I didn't have time to update all the tests for it
    • My regression testing has poor coverage, I need more time/money
    • We have too much emphasis on testing — quality is something we need to build in

    Then you'll find a group that isn't testing inputs or outputs. More time and money needed.

    Get mad and remove any of these steps? You've just given the group a "get out of jail free" card if/when something goes wrong.

    Cancer-free testing and quality

    Testing cancer is one of the main reasons why peacetime software is a lengthy, organized expensive process … that produces disastrous results. Then, when you try to fix the problem so it doesn't happen next time — the cancer spreads. Current quality and testing methods are a cancer on software development, and there is only one known cure.

    The cure is simple. It's documented. It's proven in practice, at many places over many years. But it's radical, and requires a shift to wartime software thinking. It requires shifting to global process optimization, and optimizing for speed. It takes less time and produces far superior results. Here are some of the key points:

    • Forget "quality," whatever that is. Concentrate on testing.
    • Move from correctness-based testing everywhere to change-based testing at just a couple key points.
    • Move all testing from the lab to production.
    • Move all testing from periodic to continuous.
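    As a concrete illustration, here's a minimal Python sketch (names and numbers invented) of what change-based testing in production can look like, in the champion/challenger style: the proven champion serves every request, the challenger runs silently on the same live inputs, and only differences get logged for review.

    ```python
    # Champion/challenger sketch: users always get the champion's answer;
    # the challenger is exercised by real traffic, and any divergence or
    # crash is recorded without ever affecting users.

    def champion(order_total: float) -> float:      # current production logic
        return round(order_total * 1.07, 2)          # e.g., 7% tax

    def challenger(order_total: float) -> float:    # candidate replacement
        return round(order_total * 107) / 100        # refactored version

    def serve(order_total: float, log: list) -> float:
        result = champion(order_total)               # users always get this
        try:
            candidate = challenger(order_total)
            if candidate != result:                  # a difference is a finding
                log.append((order_total, result, candidate))
        except Exception as exc:                     # challenger bugs never hurt users
            log.append((order_total, result, repr(exc)))
        return result

    differences = []
    for total in [10.00, 19.99, 0.0]:
        serve(total, differences)
    print(differences)  # an empty list means the challenger matches on this traffic
    ```

    Note that the "test" here is a diff against known-good behavior on live inputs, not a lab script checking expected answers — which is exactly the shift from correctness-based to change-based testing.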

    There are a number of posts in this blog that spell this out. There are even books! The advance of software QA cancer is inexorable and unstoppable in environments that are friendly to it. There is only one cure: Wartime software and its associated methods.

  • Software Quality at Big Companies: United, HP and Google

    I would love to avoid the issue of software quality — but it keeps finding me and biting me, as I'm innocently going about my business. I guess you can understand that there are issues at a giant company whose main business is flying airplanes. It gets more annoying when the company says it makes computers. It's even worse when it's an incredibly well-regarded software company. Here are just a couple personal examples. They seem small. But they're indicative of a pattern in practice.

    United Airlines

    I fly on United airlines a fair amount. I needed information about one of their flights, and not even on a day when a computer systems failure brought everything there to a halt. Just a regular day after they released some new software, software that no customer was pounding their fist on a table demanding — just regular old new software they felt compelled to release. Software that didn't work.

    United Airlines 2015-09-08 15-10-11

    They point out that the new version isn't available — but neglect to point out that the old one isn't available either! Sad. Pathetic. They put the effort into assuring that their error message would include an attractive picture of one of their planes flying — perhaps they could instead have put a bit more effort into keeping their software flying?

    HP

    This once-great company has been drifting for years. I'm amazed they still have as many customers as they do. Clearly some executive in some cushy suite is putting pressure on the marketing people to generate more leads. So I've been getting spam from HP like never before — yes, HP is "spamming" me.

    Word has clearly also come down to keep the pressure on those recalcitrant would-be customers like me. So, like a nice, obedient spam target, I click the opt-out button at the bottom of the e-mail.  HP spam
    I have great expectations, because, after all, "HP respects your privacy." I go to the relevant page.

    So I go to the form, and make sure the "unsubscribe all" box is checked before clicking the button. HP- Unsubscribe 2015-09-23 14-23-46
    Then, I get a re-assuring page saying it's all set, no more spam. HP- Unsubscribe 2015-09-23 14-22-43

    Everything is OK then, right, because HP respects me and everything about me; they say so.

    Except: I've gone through this exact process or one similar to it ten times in the last month, and nothing changes! HP apparently is eager for me to receive their information, and they respect me as much as ever. Their software is broken and no one cares. Is this huge? No, of course not. But it's the small things that tell you what's really going on.

    Google

    For reasons that escape me, the general impression is that Google is great and everyone who works there is a genius. I get business plans telling me that everything is great with their software because they've hired a team from … Google! Case closed!

    Except it's not, from big things to small. Here's a small personal example. I went onto Google+ (one of the many projects/services that is rarely on the short list of great Google achievements) to get my posts. Here's what I got: Google+ 2015-08-11 11-19-51

    I tried and tried. No luck that day.

    Can you imagine something being down for a day? The recent American Airlines system outage that I had the pleasure to personally experience while caught in a system-wide ground halt lasted a couple hours. In that context, it's a good thing that Google+ is nothing but a free service for helping people waste time.

    Conclusion

    Software quality is a huge, on-going, unsolved (at most organizations) problem. There are ways to solve it. The overwhelming majority of practicing professionals and computer science academics prefer to ignore it. Meanwhile, the rest of us get the message loud and clear: we don't matter to them, and words to the contrary are nothing but propaganda.

     

  • Large Organization Software Fails: the case of Microsoft Windows

    Large organizations have trouble building software. This has been true since the dawn of software history, and shows no signs of changing. The decades-long, rolling disaster of Microsoft Windows is a great example of this. I've been hit personally with this. Recent experiences with Windows 8 have renewed my appreciation of the breadth and depth of the on-going awfulness of Windows.

    Windows Screen Saver

    I got a new computer. It had Windows 8. I was setting up my new machine and I wanted to do something simple. I had remembered that in some earlier version of Windows, you could get the screen saver to display the file name of the photo it was showing. This was useful if you wanted to get your hands on the photo that just flashed by. It's a pretty small feature, but one that anyone who stores photos on their PC could find useful.

    So I drilled in to the screen saver. Screen settings

    I went into the settings, and didn't see the control I was hoping would be there.

    Settings 2

    So I clicked on Help, something I rarely do, but what the heck, that's what it's there for. Here's what I got: The content is missing!

    Settings 3

    It's a little thing. It's not like my computer crashed. In the world of books, it's like a footnote was missing — hey, that's an idea, let's compare the new edition of Windows to the new edition of a book!

    Software and Books

    Most of us know how to judge books. If a book is poorly produced, like the pages tear easily and the type is hard to read, most of us will toss it aside — it may have great content, but it's not worth reading. If we get past the first impression, we'll dive in and start reading. The next potential barrier is how well the book has been edited. If the book is full of spelling, usage and grammatical errors, many of us will think poorly of the author, the editor and the publishing house — the author shouldn't have made the mistakes in the first place, the editor should have caught and corrected them, and the publishing house shouldn't have put sloppy trash in print. Then and only then do we get to the style and substance of the book.

    I read a lot of books from many publishers in many genres — fiction, history, science, etc.  — and I'm happy to report that I rarely encounter a published book that has editing errors.

    And by the time a particularly timeless book gets to later editions? There are never errors.

    In that context, how is Windows 8?

    I've got the latest version of Windows, 8.1, running on a new machine. It's hardly a first edition. Microsoft pours out updates, and I'm up to date. Here's a snapshot: Updates

    Note the scroll bar — there were hundreds more updates that had been applied.

    The lovely option that lets you see the file name along with the picture was in an earlier version of Windows. Making a new edition of software isn't that much different than making a new edition of a book — basically, unless you add or change something, it stays the same. In this case, someone had to make a conscious decision to drop an isolated, harmless feature that gave value to many customers.

    Why would someone do that? It's more trouble to drop a feature than just let it ride along on the next edition, so someone had to actively remove it. There is no conceivable objection to the feature. While not everyone would want it, since it's an opt-in feature, it harms no one. It's like someone deciding to drop a short appendix from a book — not everyone will want it, but those who do value it. In the paper publishing world, dropping it might save a page or two. But in the electronic world? There's no conceivable reason.

    I don't claim for a second that displaying the file name on the screen saver is important. I simply claim that the decision to drop it exemplifies the pervasive anti-customer attitude of the Microsoft organization, which unfortunately is typical of large software-building organizations in general.

    It's the missing Help file, though, that really set me off. Again, it's a trivial error, like dropping a footnote. But why would you do it? How could it possibly slip through what should be a totally automated editing/QA process? It may somehow be complicated in the labyrinthine world of Windows development, but it's a fixable thing. You have a program that assures that for each instance of Help there's a corresponding piece of content, and for each piece of content there's a way to reach it. Either there is no such program or it's broken. In the overall scheme of things (Windows remains horrifically slow, it freezes and crashes, etc.) it's a small thing, but surely by the edition of Windows 8 I am suffering with, it would have been found and fixed?
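    The automated check described here is genuinely simple. As an illustration, here's a Python sketch (with made-up link and page names) of a cross-check that flags Help links with no content and content pages no link reaches:

    ```python
    # Hypothetical data: the Help links shipped in the UI, the help-content
    # pages actually present, and the lookup table tying links to pages.
    ui_help_links = {"screensaver", "display", "power"}
    help_pages    = {"display", "power", "orphaned-topic"}

    link_targets = {
        "screensaver": "screensaver-settings-help",  # page never shipped
        "display": "display",
        "power": "power",
    }

    # Links whose target page doesn't exist (the Windows 8 bug in the story)
    missing_content = {link for link in ui_help_links
                       if link_targets.get(link) not in help_pages}

    # Pages no link ever reaches (dead weight in the build)
    unreachable_pages = help_pages - set(link_targets.values())

    print("links with missing content:", missing_content)
    print("unreachable help pages:", unreachable_pages)
    ```

    Running a check like this in the build pipeline would catch a missing Help page before any customer ever clicked the link.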

    Conclusion

    Software is all about productivity, attention to detail and automation. Unless you've got a de facto monopoly, software is also about meeting customer needs. Large organizations in general (for example government, big corporations) and Microsoft in particular don't get that, in spite of the billions they spend on development and (supposedly) quality. I would love to be able to say it's getting better, but most of the evidence is on the other side. Which is why, among other things, good software will continue to be produced mostly by organizations that are small and willing to do things the "wrong" way.

  • The Government wants to Help Uber’s Software Quality

    It's reported that New York City's Taxi and Limousine Commission (TLC) wants to pre-approve new software releases by ride companies like Lyft and Uber. Since the TLC is well-known to be heavily staffed with software experts, what can be bad about this idea? Other than just about everything, that is?

    The proposal

    Here's what they're saying:

    Uber

    Uber and Lyft have to buy smartphones and give them to the TLC because the Commission runs such a tight budget that there's no way it could afford the required thousands of dollars. Oh, wait … the planned 2015 revenue of the TLC is projected to be $545.6 million, with expenses of $61,045,000. That leaves just $480 million or so, which is undoubtedly already committed to something or other, which is probably terribly important.

    Let's assume it happens. How is it going to work? Uber gives a release to the TLC, which takes exactly how long to test it how rigorously by what means? By the time it gets around to organizing to test one release, another will have arrived. So the pressure will immediately come to have fewer, larger releases. Then will come the time when the TLC approves a release and there's a bug. There will be commissions, reviews, and a big operation will be set up to implement industry best-practices, government-style. Things will get even slower and longer, and government tentacles will start weaving their way into Uber's software development organization. In the end, New York will end up getting a small number of releases, way after the rest of the world has them, buggier than everyone else, and the costs will be passed on to the drivers and riders.

    Why?

    Why

    Right. Sure.

    The Reality

    Governments can't build software that works in any reasonable time. See this.

    No matter how hard they try, software testing in the lab just doesn't work. See this.

    They will press to have fewer releases, when more frequent releases are the key to good software quality. See this.

    Finally, most important of all, we don't need to be protected, thank you very much. If it doesn't work, people will stop using it, and the company will either fix its problems or go out of business. That's the way the greatest wealth-creating and poverty-eliminating system ever invented works.

  • Facebook’s Software Quality: the Implications

    I have pointed out Facebook's lack of desire or ability (who cares which?) to deliver software that actually works. I've pointed out that they're hardly alone in this respect. It's important to accept this observation as true, so that you can change behaviors that may have been unconsciously predicated on the supposition that Facebook delivers great software, effectively and efficiently. They don't. So don't hire their people and expect great things to happen, and don't mindlessly emulate their methods or use their tools!

    The Unspoken Assumption

    Facebook is a wildly successful company, worth over $200 billion. I'd like my company to be worth even 1% of Facebook. So I better find out what Facebook did, and learn from it. Facebook is a software company, so their engineers must be smart and effective. I better get some of them in so they can teach us the "Facebook way." And their tools — wow. If Facebook uses something, what an endorsement that is. My guys had better have a real good reason to use something else; I look at what FB's worth and what we're worth — don't we want to be like them? If a tool or method is good enough for FB, it should be plenty good enough for us.

    The role played by software in FB's success

    Here's the logic:

    FB is wildly successful.

    FB is built on software.

    Therefore, FB software must be wildly excellent.

    We already know by examining the quality of FB software that it's crappy. So we have reason to suspect that the virtues of FB software may NOT be a driver of FB's success. Consider this thought: What if FB is wildly successful IN SPITE OF its crappy software? If that's true, the LAST thing you'd want to do would be to infect your reasonably healthy engineers with disease vectors from FB.

    Explaining FB's Success

    There are lots of reasons software companies can become very successful other than having great software. In fact, by the time a company gets large, bureaucracy and mediocrity normally take over, and any great qualities in the software are normally eliminated. The most common reason a software company gets and stays successful is the network effect, the self-validating notion that "everyone" is using the software, therefore I should too.

    The network effect becomes even more powerful when there's a marketplace. eBay is a great example. If you're a seller, you want to sell in the place that has the most buyers. If you're a buyer, you want the greatest choice of things to buy. Similarly, if FB is where all your friends are, you'd better sign up — which makes the network effect even stronger.

    FB, by chance or plan, leveraged the network effect for growth brilliantly. Harvard already had a physical book with everyone's pictures in it, called the Facebook by students. The basic education and promotion problem was solved out of the gate: Harvard students knew what a "facebook" was; they all had a physical one, and used it, if only because their own information was there. For example, here's me in the 1968 edition: FB 1968_0002
    However straight-laced those Harvard freshmen looked, a fair number of them were hackers and troublemakers. Here's the very last page of the 1968 FB. Look at the last guy listed. FB 1968

    There's a similar entry, with a different photo, at the start of the book.

    Zuckerberg was solidly in the long-standing Harvard hacker tradition. He had already illicitly grabbed student photos for a prior application, which both got him in trouble and made him famous on campus. So when he launched "thefacebook," of course all the Harvard students would check it out. He did this in January. It was used by about half of all Harvard undergrads within a month.

    His next smart move was to open it just to students at a couple more elite schools, and then Ivy League schools. Once established there, he expanded. He did NOT open the doors and let anyone join — he moved from one natural community to the next, letting the network effect do its magic before moving on. Finally, alumni were allowed to join, but only if they had a .edu address proving their affiliation. That's when I joined. Only after a whole generation of students had made it the standard did FB allow their parents to join.

    The quality of the software had nothing to do with this. If people had to pay for it, FB would have flopped. Feature after feature came pouring out of the self-declared brilliant minds of the top people at FB, many of them flops, mixed in with scary experiments with privacy. But it was "good enough" most of the time, it's free, it's where your friends are, what can you do?

    The conclusion is clear: FB grew to be a huge success IN SPITE OF having rotten software quality and development methods that are just horrible.

    The FB environment and yours

    Facebook software development methods and tools are NOT something a small, fast-moving, high-quality software shop should want to emulate. Their quality methods in particular are not only trashed by their users, but also by a fair number of ex-employees. The same thing goes for the computing and server environment.

    If you find a talented ex-FB-er, by all means hire him or her — but only after verifying that they're sick of how things are done at FB and want to work at a high-quality place.

    Above all, don't emulate the actions of FB's leadership. It's the network-effect flywheel that continues to bring eyeballs to their applications, NOT their great software.

    And think about this: if they're so brilliant and such great developers, why have they done about 50 acquisitions in their short life, a couple of which are important to their growth?

  • Facebook’s Software Quality: the Facts

    Facebook is an incredibly successful company, one of the most valuable on the planet. It is natural to assume that a main reason for this is that they've got a boatload of great programmers who produce code that users love. This assumption is wrong. In fact, the widespread adoption of Facebook masks deep, long-term quality issues that are not getting better.

    Facebook Success

    Facebook recently passed $200 billion in market value. Amazing! It has billions of users world-wide and has no serious competition. No one can question FB's success in user count and market capitalization.

    FB 200B

    Facebook Mobile App

    Mobile device use is going through the roof. We are in the middle of a massive, rapid migration from workstations and laptops to tablets and smartphones. This trend impacts FB just like everyone else. At the recent Money2020 conference, a top FB executive laid out the numbers, which are stunning; in short, FB mobile use nearly equals normal web use. If anything is important at FB, it's got to be getting the mobile app right.

    FB mobile

    Facebook Mobile App Quality

    So how is FB doing, this premier, ultra-successful company with no lack of resources to do an excellent job? They've got to be doing way better than the rest of the industry, right?

    Let's start by looking at user reviews:

    FB 3

    Not too bad, 4 stars out of 5, right? But out of more than 22 million reviews, more than a quarter of reviewers, over 6,000,000 people, gave 1, 2 or 3 stars! Let's look at a few of those reviews. (I didn't scan for exceptionally bad reviews; I just picked off ones that were near the top of the Play store.)

    Here are a couple reviews. Cindy gave 1 star because the app doesn't work at all, and Johnny gave 2 because he suddenly can't avoid being buried in notifications.

    FB 1

    Here are a couple more reviews. The third reviewer gave 3 stars even though the app is basically dysfunctional.

    FB 2

    These are educational:

    FB 4

    The 3 on the left describe things that worked on a prior release that no longer work, which is the cardinal sin of quality testing. Look at Bratty's review awarding 4 stars, even though he/she can't use the app at all. Makes you wonder if anything but 5 stars is good for FB. Jeremy's review sums it up: "you're still not listening to your users." If only 5 stars represents satisfied users, the ratings mean that about half of FB app users have a serious bone to pick. Which is quite a statement.

    FB App Quality in Context

    Compare the performance of the FB app to the performance of your car. Getting a new release of the app is similar to getting your car back from the repair shop, only with little trouble on your part and no expense. Most cars run pretty well — they start in the morning, run through the day, and rarely break down. When you get your car back from the repair shop, it's even better, even less likely to break down.

    Not true for FB. Even though it's "in the repair shop" pretty frequently, the FB "mechanics" all too often find a way to break things that used to work, and fail to fix things that didn't work when it went into the "shop." FB programmers and managers think they're way smarter than auto mechanics, but if the car people performed even a little bit like the FB crew, they'd be out of business. The reality is that, with all their oh-so-highly-educated-and-smart mountains of cool (mostly) dudes, the FB crowd can't come close to delivering the quality that nearly every corner-garage mechanic delivers every day.

    FB quality stinks, and it stinks for their fastest-growing, flagship product. In saying so, I'm simply summarizing the expressed experiences of literally millions of their users. There are ways to achieve high quality software. FB does not lack the resources. The fact that they don't deliver quality and aren't even embarrassed about it tells us that they just don't care.

  • Lessons for Software from the History of Scurvy

    Software is infected by horrible diseases. These awful diseases cause painfully long gestation periods requiring armies of support people, after which deformed, barely-alive products struggle to be useful, live crippled existences, and are finally forgotten. Software that functions reasonably well is surprisingly rare, and even then typically requires extensive support staffs to remain functional.

    Similarly, sailors suffered from the dread disease of scurvy until quite recently in human history. The history of scurvy sheds surprising light on the diseases which plague software. I hope applying the lessons of scurvy will lead to a world of disease-free, healthy software sooner than would otherwise happen.

    Scurvy

    Scurvy is caused by a lack of vitamin C. It's a rotten disease. First you get depressed and weak. Then you pant while walking and your bones hurt. Next your skin goes bad,

    378px-A_case_of_Scurvy_journal_of_Henry_Walsh_Mahon
    your gums rot and your teeth fall out.

    Scorbutic_gums
    You get fevers and convulsions. And then you die. Yuck.

    The Impact of scurvy

    Scurvy has been known since the Egyptians and Greeks. Between 1500 and 1800, it's been estimated that it killed 2 million sailors. For example, in 1520, Magellan lost 208 out of a crew of 230, mainly to scurvy. During the Seven Years' War, the Royal Navy reported that it conscripted 184,899 sailors, of whom 133,708 died, mostly due to scurvy. Even though most British sailors were scurvy-free by the 19th century, expeditions to the Antarctic in the early 20th century were still plagued by scurvy.

    The Long path to Scurvy prevention and cure

    The cure for scurvy was discovered repeatedly. In 1614 a book was published by the Surgeon General of the East India Company with a cure. Another was published in 1734 with a cure. Some admirals kept their sailors healthy by providing them daily doses of fresh citrus. In 1747 the Scottish naval surgeon James Lind proved (in the first-ever clinical trial!) that scurvy could be prevented and cured by eating citrus fruit.

    JamesLind

    Finally, during the Napoleonic Wars, the British Navy implemented the use of fresh lemons and solved the problem. In 1867, the Scot Lachlan Rose invented a method to preserve lime juice without alcohol, and daily doses of the new product were soon standard for sailors, which is how "limey" became synonymous with "sailor."

    B_scurvy

    Competing Theories and Establishment Resistance

    The effective cures that had been known and used by some people for centuries did not exist in a vacuum. There were competing theories. Cures included urine mouthwashes, sulphuric acid and bloodletting. As recently as 100 years ago, the prevailing theory was that scurvy was caused by "tainted" meat. How could this be?

    We've seen this movie before. Over and over again. I told the story of Lister and the discovery of antiseptic surgery — and the massive resistance to the new method by the leading authorities at the time.

    Software Diseases

    This brings us back to software. However esoteric and difficult it may be, software is a human endeavor: people create, change and use software and the devices it powers. Like any human endeavor, some of what happens is because of the subject matter, but a great deal is due to human nature. People are, after all, people, regardless of what they do. Patients were killed for lack of antiseptic surgery — and the surgical establishment fought it tooth and nail. Millions of sailors were killed by scurvy, when a cure had been known, practiced and proved for centuries. Why would we expect any other reaction to cures for software diseases, when the "only" consequence of the diseases are explosive growth in the time, cost and risk to build and maintain software, which is nonetheless crappy and late?

    Is there a general outcry about this dismal software situation? No! Why would anyone expect there would be? Everyone thinks it's just the way software is, just like they thought scurvy in sailors and deaths after surgery were part of life. Government software screws up,

    Healthcare-gov-wait
    software from major corporations is awful,

    Hertz fail

    software from cool new social media companies is inexcusably bad. Examples of bad software can be listed at endless, boring, tedious, like-forever length.

    Toward Healthy Software Development

    If I had spent my life in the normal way (for a software guy), I wouldn't be on this kick. But I didn't and I am on this most-software-sucks kick. Early on, I had enough exposure to large-group software practices to convince me that I wanted none of it. I'd rather actually get stuff done, thank you very much. Now, looking at many young software ventures over a period of a couple decades, the patterns have emerged clearly.

    I have described the main sources of the problems. I have described the key features of disease-free software development. I have explained the main sources of the resistance to a cure, for example in this post. And I have no illusion that things will change any time soon.

    It will sure be nice when the pockets of healthy software excellence that I see start proliferating more quickly, and when an anti-establishment consensus consolidates and gains visibility. In the meantime, there is good news: groups that use healthy, disease-free software methods will have a massive competitive advantage over the rest. It's like ninjas vs. a collection of retired security guards. It's just not fair!

  • The Bogus Basis of “Trending on Twitter”

    People write and talk about what's "trending on Twitter" as though the trend meant something. It doesn't. It's based on deeply flawed Twitter search software that gives random, widely varying results. I know the weatherman is often wrong, but what if he said it was going to be sunny and in the 70s tomorrow, and as often as not there was a blizzard — would you keep listening? It's the same with Twitter, only worse.

    Trending on Twitter is everywhere

    It's amazing how widespread this useless stuff is. New York Times editors are in on the game.

    Times editors
    It's even now got a prominent place on Wall Street!

    Bloomberg
    You can not only follow what's trending in general, but you can narrow it down to different locations.

    200 locations
    When a Twitter account is hacked, bad things happen.

    Hacked
    And sure enough, the markets react.

    Market plunge
    We seem to care not only about what the Boston bomber says on Twitter:

    Boston
    But we also pay attention to the useless Twitter trends about it:

    Innocent
    We've really got to stop this. It's not as though we've got reliable data here. It's just not. Twitter has been a technical joke for years, and there are no signs of improvement.

    Trending on Twitter is meaningless garbage

    I don't have the access needed to perform a universal test. But I did perform a test, and anyone else can reproduce my results. I did searches for the same term over a couple of weeks and saved the results. Sometimes the results were correct, but most of the time, items that were there before disappeared, only to pop up again on a subsequent search. Sometimes just a couple of things were missing, and sometimes the gap was massive. Here is the evidence.

    Then I took the search that appeared to have the most gaps, and performed the identical search about a week later. As I documented, one search had just 5 items and the other had 32, when they should have been identical. About 85% of the search results had been dropped by Twitter!

    "Trending on Twitter" is based on comparing results of a search performed on one day to the same search performed on other days. If the number of results goes up or down, you've got a trend. Or so you think. But what if the results are really as bad as I have documented? I found that "blackliszt" went up or down by a factor of 6, like 600%! Wow!

    Conclusion

    Twitter software has always been bad. Management has learned to disguise the awfulness by suppressing the appearance of the "fail whale," but they clearly haven't actually, you know, made the software better. Anyone who takes its results as actually meaning something is depending on bogus data.


  • Twitter Software Quality: An Oxymoron

    Twitter software quality stinks, as I've demonstrated. On revisiting and updating the facts, I've decided that "Twitter Software Quality" should be promoted to the status of oxymoron, joining the august company of terms such as "southern efficiency," "northern hospitality," and "government worker."

    A Brief History of Random Awfulness

    I took samples of searches for "blackliszt" on these dates: Apr 18, 19, 20, 22, 24, 25, May 1, 8. A total of 8 samples.

    All searches were done as "All" to tell Twitter I wanted, you know, all the results, not just the ones Twitter felt like disclosing at the moment.

    I only grabbed the first page from each search. I've shown the results in another post. Of the 8 searches, the one on May 1 is the most extreme. Here's a copy of the May 1 search for "blackliszt:"

    May 1 search
    You can see there are 5 tweets in the list of results, from Apr 11 to Oct 13. I decided to try to find out how many tweets there actually were between Oct 13 2012 and May 1, 2013, the date of the search pictured above.

    I did this research on May 8. At least on May 8, Twitter was willing to admit that there were a total of 32 tweets in the same date range, although one of them (Feb 27) appears twice. Here they are:

    May 8 top
    May 8 top 2
    May 8 top 3
    May 8 top 4
    May 8 top 5
    May 8 top 6
    A Twitter search for "blackliszt" performed on May 1 resulted in a list of 5 tweets going back to Oct 13. The same search for "blackliszt" performed on May 8 (above) resulted in a list of 32 tweets that should have been returned by the May 1 search. Maybe there are more! Given that one is double-counted (Feb 27), who the &*() knows?? What I do know is that on May 1, Twitter decided to discard 27 out of 32 potential results of a search. Roughly 85% of the tweets were gone!

    Summary

    I already knew that Twitter software quality was bad. It turns out that it's worse than I ever imagined. It's "Twitter-quality"-is-an-oxymoron bad.

    You know all those "trending on Twitter" items you're seeing now that seem so modern and cool? They all assume that getting more or fewer results from a search means something. We now know that the results can easily go up by a factor of six, or drop by the same factor, just because of Twitter "quality." It's obvious that "trending on twitter" deserves to be the punchline of a joke, not something that anyone pays attention to.

  • Twitter Software Quality Stinks

    There are big problems with software quality. The problems range from social apps to corporate systems to academia, take in "mission critical" software, and cover everywhere in between. The social app companies in particular seem to find it embarrassing. But instead of actually, you know, fixing the problems, they seem to have decided to mask them! Twitter is a great example of this disease.

    Two ways of Responding when you don't know the Answer

    Suppose you're a kid and someone is demanding answers from you. Either you know the answer or you don't. If you know the answer, it's simple:  just give the answer!

    Q: When did Columbus sail the ocean blue?

    A: 1492

    If you don't know the answer, there are two ways to respond: the right way and the wrong way. The right way to respond is simple: Just say you don't know!

    Q: When did Columbus sail the ocean blue?

    A: I don't know.

    The wrong way to respond is a little more complicated. You have to guess at an answer, state it as though you knew the answer, and hope no one cares or that the person asking doesn't know either so you can get away with it.

    Q: When did Columbus sail the ocean blue?

    A: 1542.

    When the question you're asked has several answers, you can be wrong in a different way. For example:

    Q: Name the ships in Columbus' voyage to the New World.

    A: The Nina and the Santa Maria.

    Q: Is that all of them?

    A: Yes.

    Twitter's Response when it doesn't know the answer

    I never thought it would happen, but now I have fond feelings for Twitter's Fail Whale, which I haven't seen recently. You would think that the fail whale not showing up as often would be a good sign. It's not. It's a sign that Twitter has decided that it's better to lie than to admit it doesn't know the answer to the question you're asking. Instead of forthrightly saying "I don't know," Twitter now brazenly gives the wrong answer. Even worse, it gives a different wrong answer from one day to the next!

    Twitter's Bogus Search results

    Here are some screen shots of the results of the identical query, for "blackliszt," over a couple of weeks. I always selected "All results" to remove any excuse that Twitter was selecting the "top" results to help me out.

    Let's go through time. Here's the result from the first day, Apr 18:

    BLApr18

    I tried again the following day, Apr 19, and was quite surprised with the result: the Rebelmouse tweet simply disappeared, pulling an older one into the results!

    BLApr19
    On Apr 20 I added a tweet and did the search again. My new tweet was there, and RebelMouse came back!

    BLApr20
    On Apr 22 I tried yet again and got another brand-new variation: this time Cadencia's tweet disappeared!

    BLApr22

    The results were unchanged on Apr 24 and 25. I gave Twitter a couple days to lose some data, and had my patience rewarded when I searched again on May 1. The first result was Rebelmouse; the most recent posts, my post on ballet, Cadencia and Rob Majteles, were all gone! Here's May 1:

    BLMay01
    Finally, look at this simple list of my tweets taken Apr 23, not a search:

    DBBApr23
    Note that I had tweets on Apr 10 and Mar 25, both of which included "blackliszt," neither of which appeared in any of the search results!!

    Sadly, I can't even claim that the folks at Twitter have it out for me. It's just the way things work there … uhhh, I mean, the way things don't work there…

    Conclusion

    Social Media software quality stinks. It's worth every cent you paid for it. Oh, you didn't pay anything for it, you say? Well, that's my point. When a program like Twitter gives you an interface, lets you do a search, gives you a result that's even worse than my "Nina and Santa Maria" answer, brazenly implies that it's the right answer and everyone just ignores the issue, something is wrong. 

    Q to Twitter exec: Why does your software randomly leave out results from searches? Why should anyone look at "trending tweets" or anything else when the data is randomly bogus?

    A: I've never been asked that question before. The answer is simple: I do it because I can, because I don't care, because no one else seems to and because I'm worth a great deal of money and you're not. Next question please.

    Thanks to MaryAnn Bekkedahl for inspiring me to write this up.

  • Software Quality Assurance Book

    I've written quite a bit about software quality over the years. In addition to quite a number of posts on this blog, I've written a short book about it. Currently, I just distribute it in PDF form to work-related people, but I'm thinking about releasing it on Kindle as an e-book.

    Background

    Anyone involved in software who's, like, alive, gets real involved with software quality. Many years ago, I discovered it was useful to follow up meetings I had with software groups with an e-mail summarizing the ideas. As common themes emerged, I found myself cutting and pasting from a small library of e-mails. Then the collection turned into a document, since the ideas were so inter-related.

    I started giving the document to groups before meeting with them. I got feedback during and after meetings, everything from mistakes I'd made to important issues I had ignored. So the document grew as it went through at least 15 revisions.

    The document/paper/book is pretty long and comprehensive, and I haven't been discovering new things to add to it recently. So it must be "done." I've even taken the time to throw together a crappy-looking cover:


    BBSB cover SQA

    Mainstream Thinking

    There are literally hundreds of books on software quality. There are tools. There are certifications. There's a huge body of work out there. Why did I put this book together? Does the world really need another book on software quality? What more could possibly be said?

    First of all, let's notice that in spite of all the books, methods, quality software and certifications, software quality still stinks. It stinks in big, process-laden corporations. It stinks in cool young web start-ups. It stinks all over this land!

    So what's the problem? Do people simply ignore best practice? Do they not understand it? Do they try to apply it but screw up?

    The answer is pretty simple: mainstream software quality methods are no good. They cost a lot, take a lot of time, slow down development and modification, and don't improve quality much to speak of. What's more, most people in the industry who aren't completely asleep at the wheel know it — which is the origin of the typical complaint of quality groups, that they're understaffed, underfunded, and never given enough time to do their job the "right" way. This complaint is generally justified! And it's likely to stay that way, because whenever those groups get what they want, cost and time go up and quality stays roughly the same.

    So that's why I wrote what I wrote — I wrote what you couldn't read elsewhere, about ideas and methods that were ignored by the mainstream. Who knows why? I've stopped caring.

    Validating the ideas

    I'm only comfortable talking about stuff I know personally. The origin of the book was a large software project, comprising over 7 million lines of code. It processed credit card transactions. I was CTO, and Y2K was rapidly approaching. It was too late to do things the "right" way. We couldn't afford it anyway. Doing nothing was not an option.

    So I dredged up some methods I had used in systems software testing that I realized no one knew about in applications. Because there was no other option, everyone rallied to this one. We got the job done and passed Y2K with flying colors.

    Later, as I became more involved with Oak companies, I noticed that the short cycle times of web development forced small groups of desperate programmers to re-invent a subset of the ideas I was beginning to systematize. When things were really bad in companies not already using the methods, I could sometimes get them to try the methods, and the ones that really shifted found success. By "success" here I mean simply that they got higher-quality software with less time and effort and shorter cycle times, with less "tax" on development.

    For a few years, I thought it was important to keep this magic bullet secret. Hah! Glaciers will melt before most software development groups try anything that challenges the way they've done things for years.

    The Down Side

    There's a down side to pretty much everything. Down side to publishing the book? Can't think of one. Down side to using the methods? Definitely. Here are two big, fat problems that emerge when using the new methods, quoting from the book:

    With no big, formless, unproductive but “necessary” QA group, there is
    no place to put weird new hires in hopes that they’ll get bored and leave.
    There’s also no place to send people who are just too stupid or lazy or
    socially skilled to make it as programmers, but you don’t have the heart to
    fire them.

    There are no big, fire-breathing, invective-filled meetings populated
    exclusively with overhead jobs (managers and marketing) who argue about
    “pulling things in” and “risks” and what happened last time and “competitive
    pressures” and elaborate project management charts in 4 point type that someone
    made up last night but everyone makes believe actually have a relation to
    reality other than “not.” Meetings like this raise everyone’s heart rate way
    more than hours in the gym and supply anecdotes providing amusement and smarmy
    edification for weeks. They would be missed.

    Conclusion

    Will I push the "publish" button? Probably. I'm thinking about it. Update: I've thought about it. The button has been pushed. The book is here.


  • Does your Software Work Well? Look Good?

    It doesn't matter how good your software looks if it crashes. If it's broken, please take it home and don't come back. But if it does work, then the next thing you should care intensely about is how good it looks. Software winners tend to combine great design with code that, you know, works.

    I'm a fanatic about software QA. It is one of the most unappreciated aspects of software development. But it's not the only thing that matters. Just as in math, beauty is something that pervades all aspects of great software, internally and externally. The impact of great design should never be minimized.

    One of my favorite Oak companies is OneMedical. They are re-inventing the doctor's office. They use great software behind the scenes, much better and more effective by far than most doctors use. They use modern methods for patient/doctor communications; you can e-mail with your doctor — what a concept!

    On top of all that goodness, they have great design. The pictures on their website give you the basic idea.

    I just went in for an annual checkup. My doctor, Malcolm Thaler, was great. Here's a picture I took while in the "waiting room" (given how it looks, I kind of hate to call it that).
    2012 08 21 OneMedical view of office

    Here's a picture I took through the window of Dr. Thaler's office.
    2012 08 21 OneMedical view from office

    Even if you don't have access to the software behind the scenes like I do (I mean in general — no, I can't and wouldn't want access to patient records), the great design of their offices gives you the visceral impression that good things are happening at OneMedical. As indeed they are.

    The lesson is a general one. Once you've gotten good, be sure you look good, so that people will think the right thing at first glance.

  • A Simple Framework for Software Quality Assurance

    Software Quality Assurance (QA) has grown wildly complex and specialized, while at the same time becoming increasingly ineffective. It's time to question a couple of key assumptions and to establish a simple, clean, unifying framework to get the results we want from SQA: "get it right" and "keep it right."

    Software QA Today

    There's a huge amount going on with software QA. For example, the American Society for Quality has a Certification for Software Quality Engineer. The test has about 160 subject areas. Here is a sample of just four of them:

    ASQ
    On one of the many websites devoted to software QA, I found links to over 100 recommended books.

    Books

    Assumptions

    If you accept the assumptions about software development that underlie most modern QA, it doesn't seem completely insane. However, I challenge several of the core assumptions.

    It is a historical fact that much of modern software development theory, including QA, is based on the notion that building software is somehow like a manufacturing process. There are a couple edge cases in which this assumption is useful; but for the general case, it's just wrong and should be discarded.

    It is similarly clear that most modern software development theory, including QA, is based on the notion that the root fact of software, the fact from which everything else grows, is a set of requirements. There are important cases in which this assumption is useful (a large number of them are algorithms, think Knuth); but for most software, it causes more trouble than anything else, and should be forgotten like a bad dream.

    Finally, much software is developed with the notion of predictability as central. This is a choice organizations make. If they want predictability, let them have it; it's a valid (if mostly brain-dead) choice. But for most of the software that makes a difference, speed is more important than predictability.

    Instead of these assumptions, it is more fruitful in most cases to assume the following:

    • Software is primarily a design process
    • The requirements of the software evolve as you build it
    • Speed is more important than predictability.

    Core Software QA Function: Keep it Right

    "Keep it Right" is one of the two major categories of software QA. It has its own distinct methods. It can and should be largely automated.

    The core, transformative observation here is that your software need not (yet) be correct. When you alter the software, the main thing you want is to avoid changing its behavior (except as intended). The killer in all regression testing software is that you have to go function by function and write scripts to make sure the "right" thing happened. So your work is proportional to the size and complexity of your program.

    Once you understand the role of "Keep it Right," you see that all you need to do is compare what the software did before with what the changed version does. There are always ways to accomplish this. The key observation here is that writing comparisons (think diff) is something you do just once, for the point at which the differences in the software are being compared (for example, the UI or the DBMS). So your work is independent of the size and complexity of your program.

    The key method of "keep it right" is to replace "correctness testing" with "comparison testing." This can be completely automated; it reduces programmer overhead; and it does what users do, viz., ask the question "what's different?"
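    The comparison-testing idea can be sketched in a few lines. This is a minimal illustration, not anyone's actual tooling; the two versions and the JSON-serialized diff boundary are assumptions chosen for the example. The one reusable piece is the comparison at the output boundary, and it never needs to know what the "right" answer is.

```python
import json

def run_version(software, recorded_inputs):
    """Run one version of the software over recorded inputs, capturing its
    observable output at some boundary (e.g. the UI or the DBMS)."""
    return [software(x) for x in recorded_inputs]

def compare(before_outputs, after_outputs):
    """The single, reusable 'diff' step: report what changed, not what is correct."""
    diffs = []
    for i, (old, new) in enumerate(zip(before_outputs, after_outputs)):
        if json.dumps(old, sort_keys=True) != json.dumps(new, sort_keys=True):
            diffs.append((i, old, new))
    return diffs

# Two hypothetical versions of the same feature:
def version_a(order):
    return {"total": order["qty"] * order["price"]}

def version_b(order):
    # A refactor that accidentally changes behavior when qty == 0.
    return {"total": order["qty"] * order["price"] or None}

inputs = [{"qty": 2, "price": 5}, {"qty": 0, "price": 9}]
changes = compare(run_version(version_a, inputs), run_version(version_b, inputs))
print(changes)  # only the unintended difference surfaces: the qty == 0 case
```

    Note that `compare` is written once for the boundary, so the effort stays constant no matter how large the program behind `version_a` and `version_b` grows.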

    Core Software QA Function: Get it right

    "Get it Right" is one of the two major categories of software QA. It has its own distinct methods. It cannot be automated to any significant extent.

    Scripts or automation of any kind are irrelevant to "get it right." Instead of having product people write requirements that someone down the line eventually turns into test plans and test cases, the people who would normally write requirements watch the software's behavior as it grows in capability and guide its development interactively. They see whether this iteration of the software does what they expected, and they also watch their own reactions as they use it, learning from those reactions. Outside people may also play a part in this process.

    Meanwhile, of course, "keep it right" is running all the time, assuring that only forward steps are taken, and that unintended side-effects are caught so they can be corrected. Once a new round of changes has been "gotten right," it becomes part of the base set that is "kept right" using the comparison-based test-for-change method.
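    That promotion step, where behavior that has been "gotten right" joins the base set that is "kept right," can be sketched as a baseline file. This is a hypothetical illustration, assuming outputs that serialize to JSON; the names and storage scheme are invented for the example.

```python
import json
from pathlib import Path

BASELINE = Path("baseline.json")  # hypothetical store for the "kept right" outputs

def check_against_baseline(current_outputs):
    """Compare current behavior to the stored baseline; None means nothing
    has been promoted to the base set yet."""
    if not BASELINE.exists():
        return None
    return json.loads(BASELINE.read_text()) == current_outputs

def promote(current_outputs):
    """Once a reviewed change is intentional ('gotten right'), its behavior
    becomes the baseline that every future run must match."""
    BASELINE.write_text(json.dumps(current_outputs, sort_keys=True))

outputs = [{"total": 10}, {"total": 0}]
promote(outputs)                        # the change was approved
print(check_against_baseline(outputs))  # True: behavior preserved
```

    From then on, any run whose outputs differ from the baseline is flagged as a change, intended or not, until someone deliberately promotes the new behavior.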

    Conclusion

    Keep it Right and Get it Right alter the organization, methods, work and results not just of QA, but of the entire software-producing organization. They change who does what and how they do it. When done reasonably well, they align quality efforts with customer expectations, which nearly always are based on the hopeful assumption that at least you will avoid breaking what used to work. At least as important, your product managers and programmers can "goal seek" their way rapidly to an effective solution with minimal overhead.
