• Storage For the Cloud

    The massive movement to Cloud architectures puts new demands
    on systems vendors that most of them are unprepared to meet, while at the same
    time devaluing special features that many vendors used to differentiate their
    products. Nowhere has this trend been more evident than in storage.

    For years, storage has had its own silo in the data center,
    SAN and/or NAS, with its own storage managers and administrators. They became
    dependent on various storage-centric features of the different vendors.


    The Cloud has disrupted this comfortable island of
    automation.

    The Cloud is all about reliable, low-cost self-service, with
    tremendous automation and integration. Service, capacity and performance need
    to be available on-demand, with no human intervention. Everything needs to be
    able to grow and shrink as application needs change, with a sharp eye to
    capacity utilization, since it’s easier than ever to switch Cloud vendors when
    one stumbles or is simply no longer competitive. The same observations are true
    of “private clouds.”

    Virtualization is a key part of achieving Cloud goals, and
    virtualization changes the rules of the systems game. Functions that were
    traditionally part of storage are now performed as an integral part of
    operating systems and/or virtualization software, to make them more agile.

    Many companies have observed that traditional,
    controller-centric, feature-rich SAN and NAS solutions are not appropriate for
    the Cloud environment. They are simply using inexpensive JBOD’s for storage and
    depending on massive replication by the file system to provide reliability,
    typically making a minimum of 3 whole copies of the data, before backups, in
    order to assure availability. If the alternative is an old-technology NAS or
    SAN, this is a smart idea, which is why its use is growing so quickly.

    X-IO has a whole different approach to storage. It’s not
    NAS. It’s not SAN. It’s not cheap JBOD’s with a make-lots-of-copies filesystem.
    It’s an intelligent storage node that not only uses, but enhances, the drives from one of the major OEM
    suppliers, Seagate. X-IO makes those drives better by a large margin, and it skips the features that are no longer
    needed in the Cloud environment. X-IO gives you more of what you do need for Cloud, and none of what you don’t.

    The X-IO approach to storage assumes you’re smart about
    building your data center. You’ll take a building-block approach, with lots of
    well-configured servers, network and storage blocks, with a layer of software
    on top of it all to orchestrate it. You want each building block to be great at
    what it does – do a lot, cost a little, and play its role in the overall system.

    In the end, storage comes down to a small set of storage
    components used by everyone. Rather than ignore the details of the drives and
    wrap them in fancy, useless (in the Cloud) packaging like everyone else, X-IO adds value, real value, to the
    drives themselves. This value persists as Seagate develops and releases new
    drives – the 2 to 5X X-IO advantage over every other storage solution will ride the
    waves of new drives into the future.

    X-IO spent over 10 years developing unique IP
    (the first 5 as a Seagate division). Over that time it invented and hardened
    algorithms and code, and incorporated the experience of having thousands of
    units in the field. The results are clear, and differentiate
    the X-IO storage brick approach from everyone else’s. Given a set of drives, X-IO
    will make them:

    • Perform at least twice as fast as anyone else’s, often 3-4X faster when near capacity
    • Deliver at least twice the throughput
    • Fail at less than 1% of the rate of anyone else’s
    • Not require replacement during their 5-year warranty
    • Take much less space, often 30-50% less
    • Require much less power, often 50% less
    • Require less cooling

    Finally, X-IO can incorporate SSD drives as required to
    achieve even better performance, though this is needed much less often than
    with other vendors.

    In service operations, Cloud is measured on cost and SLA’s.
    X-IO storage is all about cost and SLA’s. X-IO is the winning choice of
    storage for Cloud.

  • Software: Comparing Waterfall and Agile

    Lots of people talk about the evils of waterfall-style development. They aspire to move to something they think is better. Agile is high on most short lists for the something better. How different are waterfall and agile? Answer: not much.

    Waterfall

    The Waterfall model is an ordered, systematic method for determining what a computer system needs to do (the requirements) and then getting it done and into production. Like this:

    [Image: Waterfall model diagram]
    The method is well-named. It really does look like a waterfall, like that big one famous for honeymoon visits on the Niagara River:

    [Image: Niagara Falls]
    Above is a picture of Niagara Falls I took a little while ago, and it is good for understanding software waterfalls. See the big river of water flowing from the upper right? See how everything is clear as it starts to fall? Then you see all the mist, making it very hard to see anything clearly at the end. Kind of like most software projects… This one gives you a good sense of the transition from clarity to mist:

    [Image: Niagara Falls, clarity dissolving into mist]
    Of course everyone hopes for the good outcome, for the rainbow emerging out of the mist:

    [Image: rainbow in the mist at Niagara Falls]
    But, I'm sad to say, the experience of Ms. Annie Edson Taylor comes closer to the common experience of waterfall software development:

    [Image: Niagara Falls]
    While there is a vast array of software development philosophies, waterfall appears to be the standard against which most of them are compared. Ms. Taylor's concluding remark says it all: "nobody ought ever do that again."

    Agile

    Naturally, people look for better ways, and find lots and lots of ways that are thought to be better. It is incredible the number of software development philosophies there are. They go on and on! At least in my experience, Agile is the one I most often hear as a replacement for waterfall.

    Like with all these things, people have a lot to say about Agile. There are books and books and conferences and training and certification, endlessly. Here is a summary diagram, given at roughly the same level of detail as the waterfall diagram above:

    [Image: generic diagram of an agile methodology]
    Lots of strong claims are made for Agile. It's faster, leads to better results, etc. Stuff that everyone says they want. But what are the real differences?

    Comparing waterfall and agile

    Take a close look at the two diagrams. Both of them start from requirements and go through design, development, test, integration and delivery. Here's the difference: with waterfall, you determine all the requirements up front and then drive through to delivery. The requirements are fixed, and you determine the time from there. In Agile, you determine a bunch of starting requirements, deliver them in a fixed time period (for example 2 to 6 weeks), and then get another set of requirements, and keep cycling until the project is done.

    Waterfall: first fix the requirements, figure out the time.

    Agile: fix the time periods, and then repeat until you're done.

    Putting all the rhetoric aside, the difference between the two methods is simple: one determines the time from fixed requirements, and the other takes fixed time periods and fits requirements into them as appropriate. In other words, Agile is little more than a series of time-fixed waterfalls!

    [Image: a waterfall in Belize]
    Remember, it's all just Process!

    It's easy to get caught up in all this and forget that the most important thing isn't what makes Waterfall and Agile different — it's how they're the same. Not exactly the same, but the same kind of thing: process!

    You can build 100,000 lines of really crappy code using Agile. You can build 10,000 lines of great code that accomplishes the same thing using Waterfall. Or the other way round.

    In Simple Terms

    In simple terms, Waterfall is:

    Do once: {Define. Design. Do. Check. Deliver.}

    and Agile is:

    Do until done: {Define. Design. Do. Check. Deliver.}
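
    To put the contrast in runnable form, here's a minimal sketch in Python (my illustration; the phase functions are throwaway stand-ins, not anyone's official method). The only real difference between the two shapes is the control flow:

        # Each function stands in for an entire phase; only the control flow matters.
        def design(reqs):  return f"design for {reqs}"
        def build(design): return f"code implementing {design}"
        def check(code):   return True            # pretend the tests pass
        def deliver(code): print("shipped:", code)

        def waterfall(all_reqs):
            # Do once: requirements fixed up front, time is whatever it takes.
            code = build(design(all_reqs))
            if check(code):
                deliver(code)

        def agile(backlog, sprints):
            # Do until done: time boxes fixed, requirements arrive in batches.
            for s in range(sprints):
                lo = s * len(backlog) // sprints
                hi = (s + 1) * len(backlog) // sprints
                code = build(design(backlog[lo:hi]))
                if check(code):
                    deliver(code)

        waterfall(["req-%d" % i for i in range(6)])
        agile(["req-%d" % i for i in range(6)], sprints=3)

    Same five steps either way; Agile just wraps them in a loop.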

    Conclusion

    There are many good things about Agile. It's more iterative and can allow for more feedback loops than pure Waterfall. But its difference from Waterfall is easily exaggerated, which helps explain why the results in practice are so often disappointing. In the end, switching the precedence of the two key variables (requirements and time) can't make that much difference when the fundamentals of software and its Postulates are not addressed.

  • In storage, there is X-IO, and then there are all the others…

    In normal times, when there is no major technology
    disruption in the market, there are two categories of storage companies.

    Most storage revenue goes to the big names everyone knows
    (EMC, NetApp, etc.). These companies have comprehensive storage solutions and
    services to meet nearly any need. Their products are solid and cover most
    mainstream requirements. They don’t innovate much and aren’t the most cost-effective,
    but they work.

    A good deal of attention in the storage industry goes to the
    hot new companies, which are all about the latest technologies (e.g. SSD) or
    features. They usually don’t do the old things as well as the established
    companies. But by focusing on the hot new thing, they often do that one thing
    pretty well, and so appeal to the usually tiny part of the market that feels
    the corresponding pain. If they get market momentum, they are usually bought by
    an established vendor.

    This is the way it works. The established companies take most
    of the revenue and do little innovation. There is always a flurry of new
    companies trying to innovate, sometimes getting traction, and getting absorbed
    by the established vendors.

    Then there are technology disruptions. That’s when the rules
    change. Suddenly the comprehensive product lines of the established vendors
    don’t meet the needs of the emerging landscape very well (in spite of the
    furious efforts of their marketing groups to claim they do), and most of the
    new vendors don’t get the new situation and continue to do little but exploit
    new devices or add features onto the existing pile.

    Today’s Technologically Disrupted World

    That’s the situation we’re in today, with the combined
    technology disruptions of data centers employing virtualization, moving to the
    Cloud, and attempting to exploit Big Data. In addition, there is a new storage
    technology, flash (SSD), which vendors are scrambling to exploit. The situation
    is confusing for buyers and chaotic for vendors, since most vendors try to act
    as though nothing fundamental has changed. But it has!

    The Cloud is all about reliable, low-cost self-service, with
    tremendous automation and integration. Service, capacity and performance need
    to be available on-demand, with no human intervention. Everything needs to be
    able to grow and shrink as application needs change, with a sharp eye to
    capacity utilization, SLA's and costs, since it’s easier than ever to switch Cloud vendors when
    one stumbles or is simply no longer competitive.

    Virtualization is a key part of achieving Cloud goals, and
    virtualization changes the rules of the systems game. Functions that were
    traditionally part of storage are now performed as an integral part of
    operating systems and/or virtualization software, to make them more agile. This
    also drives the movement to software-defined networking and storage.

    Big Data is the same only more so, with its emphasis on
    linearly scalable arrays of compute nodes and storage nodes.

    In response to this massive technology disruption, many
    companies realize that brand-name vendors no longer make much sense, and are using inexpensive JBOD’s for storage and depending on
    massive replication by the file system to provide reliability, typically making
    a minimum of 3 whole copies of the data, before backups, in order to assure
    availability. If the alternative is an old-technology NAS or SAN, this is a
    smart idea, which is why its use is growing so quickly.

    X-IO

    And then there is X-IO. While X-IO is a storage company,
    it’s different from all the others. It was built for a vision of computing that
    we now call “Cloud.”

    When X-IO was started about 10 years ago as the Advanced
    Storage Architecture division of Seagate, its goal was to build highly compact,
    efficient and reliable storage building blocks using Seagate HDD’s. While the
    rest of the storage world was ignoring the details of the devices on which it
    was built, piling on features and management systems that have become obsolete
    in the Cloud world, the ASA group was inventing the technology of the storage
    “brick,” now amounting to over 50 patents and a great deal of field-hardened
    code that delivers more of what Cloud needs than any existing system, by far.

    All storage vendors, whether established or emerging, use
    the same drives from the same couple of leading vendors, mostly Seagate or
    Western Digital (WD). All of them except X-IO package them in roughly the same
    way and throw some features on top of them to “differentiate” themselves from
    the other guys who use the same disks. It’s just as though all cars had one of
    two different kinds of nearly-identical engines in them – each of the car
    vendors would try to distract you from the engine, and try to get you to
    appreciate how wonderful their steering wheels or cup holders were. That’s even
    true of NAS and SAN, which seem so different, but really have the same engines
    (disks) in them – it’s like one has front-wheel drive and the other rear-wheel
    drive, but under most conditions, their speed, fuel efficiency, acceleration
    and service frequency are identical.

    The only storage vendor that is different is
    X-IO, and X-IO’s difference just happens to be on all the dimensions that
    matter most for the new world of Cloud, virtualization and Big Data.

    X-IO’s Difference

    First of all, X-IO doesn’t build feature-encrusted storage,
    like a “trophy car.” It’s basic storage, a storage building block or brick,
    ideal for plugging into nodes in a Cloud server farm under virtualization
    control, or a Hadoop cluster.

    Second, and most important, is X-IO’s heritage as part
    of Seagate. While X-IO uses the same Seagate drives that other vendors use, all
    the other vendors just plug the drives in and proceed to concentrate on everything but the drive. X-IO’s technology, in sharp
    contrast, is all about making that drive perform at its very best. You wouldn’t
    think there would be much that could be done. But there is! X-IO reduces the
    error rate of the drives so much (more than 100X) that they can be sealed in
    containers, which makes them take much less space, consume less power and
    generate less heat than the same drives in any other system. Then the X-IO
    software actually gets more than twice the I/O’s per second (IOPS) per drive
    that any other vendor gets.

    Let’s think about a car rally. Most of the cars will vary
    greatly in size, shape, color and gizmos. The X-IO car will be the plain one.
    Imagine them in a distance race. Most of the cars will overheat or have to stop
    for gas pretty often. Only the X-IO car will never overheat, and it will get vastly
    better mileage than the others. Many of the other cars will break down along
    the way. X-IO’s won’t. Here’s the amazing thing: the X-IO car will cross the
    finish line in half the time of its nearest competitor.

    Now let’s think about sending an important package. Using
    normal cars, you’d better send 3 identical packages by different routes to make
    sure one gets there. With X-IO, you only need one car, and it will get there
    faster than any other car, using less fuel.

    In the world of Cloud, this translates into not having to buy expensive SSD drives to
    get performance, though X-IO has them available if you need to go even faster
    than X-IO normally goes. It translates into not having to over-provision to get
    performance. It translates into not having to store 3 or more copies of
    your data to assure it’s still there tomorrow. It translates into buying half or a third of the number of racks (or rows!) you
    would normally have to buy in order to make a given amount of data available at
    a given performance level. It translates into dramatically lower operating
    costs for those racks, which at Cloud scale and Cloud competitive pricing can
    be the difference between growing profitably and losing to the competition.

    No other storage vendor offers these benefits. No one but X-IO.

    Conclusion

    The “cloud” as we know it today didn’t exist when the ASA
    division of Seagate started inventing the deep technology that has now matured
    in X-IO. But its simple mantra of getting more value out of devices was a
    unique quest. No vendor has equaled it, and no one is even close. As new drives
    are released, the X-IO advantage will persist as a multiplier on whatever
    Seagate ships. All the other vendors will plug Seagate drives into their
    systems and try to distract you, drawing your attention to “anything but” the
    actual characteristics of the storage – its performance, space and power use,
    reliability. These things are old news in the old world of storage, but they’re
    the only things that matter in the new world of Cloud. Which is why there are
    all the storage vendors – and then there’s X-IO.

  • Software Postulate: the Measure of Success

    There is a Postulate of software development that, like all postulates, has a huge impact on much of what goes on in software. This postulate concerns the measure of success in software. There are many ways to formulate it, but at heart, it's simple: success is measured by meeting expectations that have been set. Just as getting to modern physics required changing the parallel postulate in geometry, getting to modern software requires changing the "meet expectations" postulate in software development.

    Expectations in Software

    Two typical CIO's are talking with each other. They get together to share experiences because, while they don't compete with each other, their groups manage technologies of similar size and complexity.

    CIO A is real happy today. "I guess they finally listened last time. They had been late once too often. I dished out a pretty blistering speech about how awful it was, and I added on a couple of threats about what was going to happen to careers if it happened again. We just had our project review meeting, and for the first time in memory, most items were green, with just a smattering of yellows and a couple reds. I breathed such a sigh of relief."

    CIO B isn't so happy. "That's where I was a couple months ago. Why can't these guys just keep it up? What is it with programmers? We were mostly green, but now green projects are a fading memory. Mostly we're in the yellow and red. Yuck."

    What are these guys talking about? Project Management. They're talking about whether the expectations set by their staff have been met (green) or not (yellow and red).

    The green, yellow and red are at the end of a road that starts with requirements, moves on to the crucial, notoriously difficult art of estimation, and then proceeds to implementation. Green says that the estimates are being met, and yellow and red say, well, maybe not.

    Introducing Absolute Measurement in Software

    The CIO's get over their griping and start to compare notes on some recent projects. As it turns out, they're both building a Data Warehouse. They're in the same industry, and the projects are similar in nearly every way, at least from the outside. Common sense tells them that the internals of their projects should be pretty similar. So they compare notes.

    CIO A (the happy one): "My project sounds about the same as yours. It's such a relief that we're on track. We've got a lean team of just 10 working on it (at one point I thought it might take 20 people), and we're just 6 months from the end of the 18 month project."

    CIO B (the unhappy one): "What? I've got 2 people working on what sounds like the same project as yours. I'm 4 months into the project, and instead of finishing in 2 more months, they're telling me it's going to stretch out 2 more weeks, a 25% overrun of the remaining time, which is why I'm so annoyed."

    CIO A: "Are you kidding me? You've got just 20% of the staff with a target of 1/3 of my timeline, and you're mad? Your whole project is a rounding error compared to mine."

    CIO B: "I guess my guys are doing OK after all. I just wish they could set expectations better."

    On this rare occasion, the software managers confronted the reality of absolute measurement in software — but only by chance, and only by comparing two projects to each other. They're not really even approaching absolute measurement — if their two projects had been run equally incompetently, there would have been no surprise!

    What is the Measure of Success?

    What this dialog reveals is the near-universality of measuring success by comparing results to expectations. The CIO who was mad was spending very little money and getting results in a fraction of the time of his compatriot, while the CIO who was happy was doling out the money like confetti and taking the slow boat — but since his team had started by giving him even worse estimates, he thought he was doing great.

    Neither CIO was measuring the success of the projects in any absolute way. They started with estimates. If the work was coming in better than the estimate, it was judged successful; if the work was delivered worse than the estimate, it was not successful.

    This is the way it works in software. It's an unspoken assumption, a postulate that underlies nearly everything that is done in software. People don't propose alternatives, just as no one proposed an alternative to the parallel postulate in geometry for more than a thousand years. It's considered the one and only way to do things.

    There are other measures of success

    Groups that do an outstanding job of producing software often achieve it with a different measure of success. Just as you can optimize your work for setting expectations and meeting them, you can optimize your work to achieve maximum velocity. Estimates are less important than maximum speed. For example, there's the well-known answer to the question of how to avoid being eaten by a hungry tiger. It has nothing to do with expectations. It's simple: run faster than the other guys.

    Conclusion

    This is a point of deep theoretical interest, and also great practical application. It's related to building bridges in war and peace. If you're under no time or budget pressure, then maybe the meet-expectations assumption is the way you should measure your software efforts. But if you are under competitive pressure, then you might want to think about organizing your software efforts according to a different measure of success: the velocity method.

  • Software Development Process in Simple Terms

    Software development is complicated to understand, and even more complicated to do. What's worse, developers disagree among themselves about nearly everything. Nonetheless, it's worth understanding at least the basics of what they do, confining ourselves here just to software process, ignoring (for now) the far more important software substance.

    Software Terminology

    Most of the talk you hear about software is about process, things like requirements, design, how and when testing should be involved, etc. There is a sea of specialized language about every aspect of software process, much of it coming from conflicting methodologies.

    All of software process can be boiled down to a small number of basic, understandable things. The main steps are nearly always:

    • defining what you're going to do
    • how you're going to do it
    • doing it
    • checking it
    • delivering it

    What are you going to do?

    Whether it's called "requirements" or "user stories," pretty much every software process starts here.

    How are you going to do it?

    This one amounts to the design phase. Are you going to use a DBMS? Existing libraries? Are you going to apply design patterns? Usually groups have strong preferences for these things, so the usual decisions are endorsed and people move on.


    Do it

    Finally! People actually do stuff!

    Check it

    If there were a software equivalent of the Garden of Eden, in which software happened without bugs (sin), I am unaware of it. So everyone assumes that someone (probably the other guy) screwed up, and we need to fix it.

    Deliver it

    Finally the software needs to get from where it's built to where it's used. The methods and destinations vary, but that's what happens in this final step.

    This is all process

    What I've done here, as critics would say, is "over-simplify." Given the incredible number of different software philosophies, this is understandable. Even within a philosophy, differences that seem minor to outsiders are of crucial importance to those who care about that kind of thing.

    This is all just process! We're just talking about formalities. For example, the process I've described also applies to building a physical structure. The same steps apply whether you're building a simple house
    [Image: a simple house]
    or the Taj Mahal.
    [Image: the Taj Mahal]
    If essentially the same process can result in a world-wide tourist destination or a starter home, is process really the most important thing?  In other words, substance is vastly more
    important than process.

    Nonetheless, Process still matters

    If substance is so important, should process be ignored? Of course not — having a sound process is essential. The five steps I defined above need to happen, and depending on the process, appropriate sub-steps as well. For example, unless and until you hire programmers directly from the programmer equivalent of Eden, checking is a non-negotiable requirement. That's exactly why it's important to understand software process in these extremely simple terms. It's got to happen.

    But then you spend most of your time and effort on building your starter homes or your Taj Mahal. In other words, you concentrate on the substance.

    Conclusion

    Software development is plagued with warring methodologies and a surfeit of terminology. It's worth remembering that, in the end, it all boils down to a set of simple, understandable steps that are universal.

  • Process and Substance in Software Development

    High among the concerns of software management are questions of organization and process. While these are reasonable concerns to have, I generally find that paying attention to substance is more productive. If you think of your organization as being like a software factory (a line of thought I generally discourage), this means you should pay more attention to the widgets that come out than the organization of the shop floor.

    Process

    It is easy to be totally consumed by process, organization and people. Everyone wants to know who's their boss. When there are disputes, who has the deciding vote? Many people want to know their "next step" in the organization, the path to greater responsibility, power and pay. Such concerns tend to be greater in the minds of the people on the upper part of the ladder, not to mention the top, since they usually had to work at getting where they are.

    Process and organizational structure are tightly tied. Is QA a separate group with its own head? Or are there QA people as part of each small group of developers? If QA is distributed, what is the reporting structure? This is complicated by the myriad of process fashions that sweep through the industry — there are literally dozens of them in play at any given time, things like Agile and Extreme, with Lean coming up fast.

    Substance

    Substance is embodied in the code that is produced. Given a set of general requirements, the substance of what is produced can differ wildly. Suppose you're extending your application to mobile. Do you use HTML5? How do you bridge to the details of the local device? Do you write in Objective-C (the native language for Apple devices)? How much do you store locally, and how do you communicate with the servers? What about all the Android devices?

    And I'm just talking about the simplest questions here. Real substance is contained in the details of how the code is written in the chosen environment. For example, the code can be pretty "straight," it can have loads of parameters, it can be layered to varying extents, it can be driven to varying extents by meta-data, etc. These choices have a huge impact on the outcome.

    Process vs. Substance

    Dilbert illustrates the point nicely, as he often does. In the cartoon below, the pointy-haired boss focuses, as you would expect, on process. He is concerned about dates and whether Wally has met the expectations that have been set, completely ignorant of the substance.

    Wally, crafty as ever, claims to have created a disastrous substance. The pointy-haired boss, unable to determine whether Wally's claims about substance are true, and unwilling to risk that they may be true, gives in.


    [Image: Dilbert cartoon, March 10, 2013]
    Conclusion

    Don't be Wally — but also don't be the pointy-haired boss. Pay attention to substance. Make it your business to understand it. Your attention will provide an example to your group, telling them what's important to you. Be like a chef who cares that the diners love the food that comes out of the kitchen, and shows it by — what an idea — paying attention to the food itself.

  • Postulates of Software Development

    A great deal of what we do in software is a direct consequence of a couple of fundamental assumptions we make: postulates of software development. Only by questioning and changing those assumptions can we bring about fundamental change in the way we build software.

    Postulates or axioms are rarely discussed or thought about. We just accept them, like breathing air or walking on the ground. Changing a postulate or assumption normally results in a cascade of consequences that changes a great deal.

    Geometry: The Parallel Postulate

    We can understand postulates in software by seeing how they work in geometry. In Euclidean geometry, there are four fundamental postulates, and a pivotal fifth one, the parallel postulate. This is the one that says, basically, that if a line crossing two other lines makes angles with them summing to less than 180 degrees, those two lines will eventually cross; otherwise, they are parallel, and never meet.


    [Image: parallel lines]

    What's important about this postulate (and the others) is that all the rest of Euclidean geometry is derived from them. Given the postulates, all the theorems are implied.

    For example, the famous Pythagorean Theorem is one of the many theorems whose truth grows out of the small seeds of the postulates.


    [Image: right triangle with legs a, b and hypotenuse c]

    In the diagram above, the theorem states that a² + b² = c².

    Non-Euclidean Geometries

    What if parallel lines can meet? Think it's impossible? Well, think about lines on a globe.


    [Image: triangles on a sphere]

    Lines that are parallel end up meeting — and this is business as usual in Elliptic Geometry. What's worse, the Pythagorean Theorem does not hold in non-Euclidean geometries in general, and spherical geometry in particular.
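
    To see concretely how the theorem fails, here's a minimal check (my addition, not from the original post). On a unit sphere, a right triangle with legs a, b and hypotenuse c, all measured as angles, obeys the spherical Pythagorean theorem:

        cos(c) = cos(a) · cos(b)

    For small triangles, cos(x) ≈ 1 − x²/2, and substituting that in gives c² ≈ a² + b², the familiar theorem. Euclid's version survives only as the small-triangle limit; at global scale it simply isn't true.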

    This isn't just textbook stuff. For example, Einstein's General Theory of Relativity is based on non-Euclidean geometry. In fact, questioning the Parallel Postulate and devising ways of thinking about and describing non-Euclidean spaces was essential to the development of modern physics. So long as geometry was Euclidean and only Euclidean, progress was impossible.

    The Postulates of Software

    So what are the software equivalents of the Euclidean Postulates? There are few questions that are more important, because only when the foundation is questioned and changed is rational, constructive, internally consistent change possible. Only with new postulates can we derive a whole new set of theorems to define software practice. Only then is fundamental change and improvement possible.

  • Human and Inhuman Analytics

    While people talk about analytics in general, there are really two distinct varieties: human analytics and inhuman analytics. First, there is analytics for and by humans, i.e., numbers, tables and graphs designed by humans for human consumption and consideration. Second, there is algorithmic analytics, originally designed by humans but then set off to make observations, decisions and perhaps actions on its own. I dub this "inhuman analytics," because that's what it is. It is incredibly important to understand the differences between these two things, related in name but little else.

    Human Analytics

    When most people think about analytics, they're usually thinking about things like Data Warehouse (DW), Online Analytic Processing (OLAP), Business Intelligence (BI), and related subjects.

    This is a subject that is broad and deep, with many products and vendors that have evolved over time. But there is a simple unifying theme: these are tools intended to provide information to people, often in the form of graphics, so that those people can understand what's going on and take any action that may be appropriate.

    Oracle, for example, has a wide variety of such tools:

    [Image: Oracle BI tools]
    Microsoft also has a variety of such tools.

    [Image: Microsoft BI tools]
    Note that both companies illustrate their approach using screens and people. That's what this type of analytics is all about.

    There are a wide variety of BI tools from many vendors, in addition to open source.

    Inhuman Analytics

    Inhuman analytics, a term that no one else uses so far as I am aware, is a whole different thing. This is also a subject that is broad and deep and undergoing constant innovation. It includes such diverse subjects as machine learning (ML), advanced statistics, operations research (OR) and related subjects.

    In general, inhuman analytics are far more specialized than human analytics. They are nearly impossible for anyone but a specialist to understand. There is often lots of math involved. They are not primarily about presenting information so that it makes sense to human beings — they are about figuring stuff out that most humans wouldn't be able to figure out at all, or figure it out with a precision that exceeds human capability.

    Because of this, there aren't great pictures to illustrate inhuman analytics. But here's an illustration of the ML process from one company's ML toolkit:

    [Image: machine learning process diagram]
    Inhuman analytics are behind a large number of modern innovations, though they rarely get credit for it, since the way they work is essentially like magic to most people. This is a vibrant subject with a rich history. I suspect I will come back to this in some future post.

    Conclusion

    Human analytics has many uses and is a good thing. The visual tools it emphasizes enable knowledgeable and motivated people to explore and understand a data set, and to track it over time. Sometimes you can even discover new things, particularly in the early stages of understanding and optimization.

    However, inhuman analytics are the serious, heavy-duty tools that help derive value from data. They can and regularly do figure things out and solve problems that are beyond human capability, even with the aid of human analytics.

    Human analytics has its place. But it's no substitute for inhuman analytics for serious value creation.

  • Fundamental Concepts of Computing: Closed Loop

    Like other fundamental concepts of computing, this is pretty simple: closed loop is better than open loop. But I concede that it's a touch more complex than other basic concepts, like counting.

    Closed and Open Loop

    A closed loop system is one that operates with feedback. By contrast, an open loop system operates without feedback.

    This of course is a universal concept, not specific to computing. It's how living things operate in general, for example; it's the central characteristic that makes them successful and enables them to improve.

    The concept applies to mechanical things as well. When steam engines, for example, were run on the open loop principle (i.e., with no feedback), they were difficult to manage and liable to unpleasant things like blowing up. Then James Watt made his steam engine into a closed loop system by applying a centrifugal governor to it to regulate the engine's speed.


    [Image: Boulton and Watt centrifugal governor]

    Moving from open to closed loop

    The classic open loop process is simple, like driving a car with your eyes closed. You have a goal, which is to drive on the road, not crashing into anything, not going too fast or too slow, until you reach your destination. With your eyes closed, i.e., without the main feedback loop in operation, it's kinda hard. Most people wouldn't try.

    The closed loop process is just slightly more complicated, like driving a car with your eyes open. Everything is like driving with your eyes closed, except that you adjust the speed and direction based on the feedback you get from your eyes.

    I think it's fair to say that most people prefer closed loop driving. They find it safer and easier on the nerves.

    Closed loop applied to the process of building software

    The usual way of building software involves working out a plan and then building the software according to the plan. This is like figuring out your route on a map, getting into the car, and driving to your destination with your eyes closed. The good news is, you know for sure when something goes wrong.

    Classic waterfall is like checking at the end of a closed-eye drive whether you reached your destination. Since this is obviously insane, all waterfall processes add elaborate checks along the way to see whether you've driven off the road, etc. Never helps much.

    Agile is like checking on a regular basis whether you've crashed and setting a new goal based on the latest disaster. Agile is just like waterfall except that the crashes are more spread out.

    What is writing software using a closed loop process? Simple. It's like growing a baby.

    Closed loop applied to system design

    Just as feedback can work in the time dimension during the software building process, it can work in the conceptual dimension during the software design process. You may break a system down into layers, from UI down to storage, and design each in isolation. This is typical, and not a great idea.

    It's far better to take repeated conceptual passes through the system design, as though you were driving through them in time, and apply the feedback of what each layer has to do to the other layers, optimizing the whole rather than the individual parts. This simple exercise can yield astounding results.

    Closed loop applied to the software that you build

    Most programmers seem to think that part of their job is creating work for systems administrators; in other words, they create software that requires care and feeding to keep running. This is strange, because software is all about automation. Why would you create software that is, in effect, open loop? Like creating a steam engine without a governor?

    Just as you should apply the concept of closed loop to the process of building software, so should the software you end up building have feedback loops incorporated into it, so that to the greatest extent possible it is self-managing. Without such built-in feedback loops, we're doing the equivalent of building a steam engine with no governor. Making our software liable to blow up, like this steam engine did:


    [Image: boiler explosion, 1850]
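
    To make built-in feedback concrete, here is a minimal sketch (my illustration; the numbers and names are invented, not from any real system): a service that watches its own backlog and corrects its capacity, governor-style, instead of waiting for an administrator to notice trouble.

        import random

        # A self-managing (closed loop) service in miniature: it measures
        # its own backlog and corrects capacity, like a governor.
        TARGET_PER_WORKER = 100   # backlog one worker comfortably handles
        workers = 4
        backlog = 1200            # pretend initial measurement of queued work

        for tick in range(10):
            # Feedback: compare measured backlog to current capacity.
            error = backlog - workers * TARGET_PER_WORKER
            # Proportional correction: big error, big adjustment.
            workers = max(1, workers + round(error / (2 * TARGET_PER_WORKER)))
            # Simulate the world: new work arrives, workers drain the queue.
            backlog = max(0, backlog + random.randint(50, 150) - workers * 60)
            print(f"tick {tick}: workers={workers}, backlog={backlog}")

    Remove the two feedback lines and you have the open loop version: a fixed worker count, eyes closed, waiting to blow up.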

    Conclusion

    Sensible people keep their eyes open when driving a car. But a shocking number of otherwise sensible people do the equivalent of keeping their eyes closed when programming or designing systems, and they build software that is, in effect, blind. What's this about? Open your eyes, already!!

  • The Big Data Technology Fashion

    Where there are people, there are fashions. Why should technology be immune? The current fashion of "big data" is a classic exemplar of the species.

    The Books

    Books are a good place to observe the common themes of technology fashions. You'll see patterns that resemble the ones I previously pointed out for project management.

    I think it's fair to say it's not a legitimate technology trend if it's not covered in an "X for Dummies" book.


    [Image: "Big Data For Dummies" book cover]

    Similarly, it's got to be big. Be Revolutionary. Transform lots of stuff.


    [Image: big data "revolution" book cover]

    It's got to be a big, scary thing that needs taming.


    [Image: "taming big data" book cover]

    For any fashion trend, it's important to make sure that other things are hitched to its wagon.


    [Image: big data analytics book cover]

    Let's not forget that, if it's worth paying attention to, there's got to be a way to make money from it.


    [Image: book cover on making money with big data]

    It's never too soon to start adding layers of process and paranoia to it, to assure that costs skyrocket and that hardly anything ever gets done; in other words, governance.


    [Image: big data governance book cover]

    Finally, anything and everything has to have a human side.


    [Image: book cover on the human side of big data]

    I swear, I sometimes think there's a central planning committee for technology fashions. They plan when the next new label on something old and not all that interesting is going to come out, grab their standard set of titles, and pass them out to people to write the books.

    But then, I guess it can't really be that organized, because there are usually so very many books, each of them covering the same small set of themes over and over and over, with slightly different language. The themes always seem to include:

    • X is revolutionary; it will change lots of important stuff
    • X is big and scary, and you need help to tame it or bring it under control
    • There are lots of ways to screw up doing X, so you need to pay lots of money for Y to get it right
    • You're a Dummy, but I'll help you understand what you need to know about X anyway
    • X has a human side

    The Conferences

    Things aren't that different with conferences. They take the themes established in the books and embellish them a bit.

    There are conferences for people who work in particular sectors.

    [Image: public-sector big data conference ad]

    You can't pass up an opportunity to learn from the very best.

    [Image: conference ad promising you'll learn from the best]

    Who can resist going to a conference which cuts through all the crap and helps you do stuff?

    [Image: how-to big data conference ad]

    Anyway, you get the idea — there are lots of conferences. The themes are predictable, even without the aid of big data or predictive analytics. Because they apply to any technology fashion trend.

    Conclusion

    Technology fashions — they are forever in fashion!

  • “Big Data:” Some Little Observations

    "Big Data" is everywhere. If only because of this, it is important, like the way Paris Hilton


    [Image: Paris Hilton]
    is famous for being famous.

    What's included in "Big Data?"

    If your concern is storing, serving or transmitting it, you don't care what kind of data it is — data is data, a pile of bits.

    But not all data is created equal. The easiest way to understand this is to break all the bits into relevant buckets. By far, the largest bucket is for image data (including both still and moving pictures, videos). While the ratios vary, it's not unusual for there to be 100 bits of image data for each bit of other data.

    While there's not a commonly accepted terminology, all the rest of the data can be understood as "coded" data. This again falls into two categories. The larger portion is "unstructured" data, things like documents, blogs, e-mails and most web pages (except for the images and videos on them). The smaller portion is "structured" data, which includes all databases, forms and anything else that can show up in a report.

    When people talk about "big data," they could be talking about any of the above, but mostly people talk about it because they want to extract actionable information from it, and the source of most actionable information is structured data. So in the vast majority of cases, when people talk about "big data," they're talking about structured data.

    Did Data used to be Small and Now it's Big?

    Think about a bank statement. There's a little information about you at the top, but most of the statement is probably taken up by the transactions — money moving into and out of the account. In general terms, this is the action log, the transaction history. This pattern of having an account master and detail records is a common one.

    Now think about a web site. The site itself is like the bank statement, and the record of people visiting and interacting with it is like the transaction history, generally known as a web log.

    People generate far more transaction records when interacting with the web than in other activities; for example, you probably click on hundreds of pages for each bank transaction you make. So the amount of data can be pretty big.

    The simple answer is: before the web, transaction data wasn't very big, and with the web, there's a lot more of it than there was before. Of course big data isn't just about the web; but the web has certainly gotten people to pay attention.

    So where did "Big Data" come from?

    It would be interesting to do a cultural history, but I suspect that the current interest in "big data" stems from the following factors:

    • Companies that pay attention to web logs get information about visitor behavior that can be used to make more money.
    • Internet advertising companies have done exactly this for years, and are getting really good at it.
    • Shockingly, most people don't analyze their data to improve their behaviors.
    • A closed loop system in which the results of your actions are used to enhance future actions is the clear winning strategy.
    • This requires (gulp) collecting and analyzing the relevant data, which is far larger than most people are used to dealing with.

    Thus the term "big data," which currently applies to just about any body of transaction data.

    What's "Big" about "Big Data?"

    Let's start by applying one of the fundamental concepts of computing to the question: counting. One of the first disk drives I got to use was a twelve inch removable pack developed by IBM:


    [Image: IBM 2315 disk cartridge]

    Its capacity was about 1MB. While that may sound small by today's standards, let's put it in perspective. Each byte is the equivalent of a character that you can type. Using a generous measure of 30 wpm and 5 cpw, that's 9,000 characters in an hour of continuous typing with no breaks, so the disk above had a capacity of more than 100 hours of continuous typing. That's one reason I thought the disk's capacity was huge — it easily held the source code for the FORTRAN compiler I wrote at the time, which was about a year's worth of work!

    Now let's get modern. Drives have gotten smaller while holding more and more. Here's a good visualization of the progression:


    [Image: six hard drive form factors]

    We're now at the point where truly small drives (1 to 2.5 inches) hold massive amounts of data; 1TB or more is common.

    How much is that? Remember, it would take more than 100 hours of continuous typing to fill up the large disk pictured earlier. How much space would 1TB take up on those older disks? That's about 1 million of them; if you packed them tightly, they would fill a room about 100 feet long, 100 feet wide and 10 feet high. And I would have to type for 100 million continuous hours to fill them up. Now, that's big data.

    Now that we've got a sense of how big a TB is, let's get real.

    On a good day, this blog might have 100 page views, each generating a server log record. Such records vary in length, but let's say they average 100 bytes in length each, or 10K bytes a day. Not much.

    Let's say I caught up to the Washington Post, a site which is in the top 100 in the US. It gets about 1 million page views a day. That would be a mighty 100 million bytes a day of raw server log data. 10 days would add up to a GB of data, which means that ten thousand days, about 30 years' worth of data, would fit on one of those physically little drives pictured above that holds just 1TB of data.
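
    The arithmetic is easy to check. Here's a quick back-of-the-envelope sketch (my addition; the page-view and record-size figures are the same rough assumptions used above):

        # Back-of-the-envelope check of the numbers above.
        CHARS_PER_HOUR = 30 * 5 * 60        # 30 wpm * 5 chars/word * 60 min = 9,000
        OLD_PACK = 1_000_000                # the ~1MB IBM removable pack
        TB = 10**12

        print(OLD_PACK / CHARS_PER_HOUR)    # ~111 hours of typing fills the old pack
        print(TB / OLD_PACK)                # 1,000,000 old packs per terabyte
        print(TB / CHARS_PER_HOUR)          # ~1.1e8 hours of typing per terabyte

        # A top-100 site, roughly:
        log_bytes_per_day = 1_000_000 * 100     # 1M page views * 100-byte records
        days = TB / log_bytes_per_day           # days of logs that fit in 1TB
        print(days, days / 365)                 # 10,000 days, about 27 years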

    The Washington Post is a major site; top 100. Their web transaction logs are the biggest data for analysis they've got. And here's what 30 years' worth of their data will fit on:

    [Image: Hitachi Travelstar Z5K500 drive]

    That's what they call "big data." This is why I instinctively drop into cynical mode when the subject of "big data" comes up. It just isn't usually very big!

    How much data do you need?

    It depends on context. If you're a website like Facebook offering a free service holding users' data, the answer is simple: you keep as much of the users' data as you feel like. You can (and if you're Facebook, regularly do) throw out data any time you feel like it, or just drop it on the floor and lose it because your programmers weren't up to dealing with it.

    If you're a money-making business that depends on data, you could probably run your business better if you

    1. Kept all the data
    2. Analyzed it
    3. Came up with useful observations, and
    4. Changed your behaviors accordingly.

    But most businesses don't do this very well, if at all. And they are feeling increasingly guilty about it. Thus the marketing drum-beat for selling everything that can possibly be labelled "big data."

    Sarcasm aside, the fact is that most businesses don't need much data in order to perform wonderfully useful analyses. The reasons are simple:

    • The things that matter the most are things you're not doing yet. The data you've got is historic. It's like if you're a comedian and the audience doesn't laugh much; no amount of big data analysis of audience reaction will help you come up with better laughs.
    • The impact of big potential changes will be seen in lots of your data. Go back to Statistics 101. How much data do you need to see that the coin you're flipping isn't a fair one? Only enough to prove that the 2 out of 3 times it comes up heads isn't a fluke (see the sketch after this list).
    • In the end, how many changes can you realistically make? Hundreds? How about rank ordering them, finding the most important ones first, then moving on from there? You'll quickly get to diminishing returns.
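
    On that coin-flipping point, here's a minimal sketch (my addition, using an arbitrary significance threshold) of just how little data is needed:

        from math import comb

        # If a coin actually comes up heads 2/3 of the time, how many flips
        # until that result would be wildly implausible for a fair coin?

        def p_at_least(k, n, p=0.5):
            # Probability of k or more heads in n flips of a p-coin.
            return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

        for n in (30, 60, 90, 120):
            k = (2 * n) // 3   # observed heads at a 2/3 rate
            print(f"{n} flips: chance a fair coin does this = {p_at_least(k, n):.4f}")
        # Well under a hundred flips settles it; no "big data" required.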

    Finally, more important than anything else, is getting into an experimental, data-driven, closed-loop system. This is always the key to success. It's how organizations become successful, get more successful, recover from trouble, and stay on a winning path.

    Conclusion

    For better or worse, "big data" is likely to be with us for a while, at least as a technology fashion trend. Like all such fashion trends, it's a useful occasion for checking whether we're putting our transaction data to its best use, keeping us on the track we're on and getting us onto better ones.

  • Computer Storage and Batteries in the 21st Century

    Computer storage is a key weapon in the arsenal of Cloud service providers. It's the difference between a mediocre service and a great one. Batteries play a similar strategic role in electric cars. A bulky, old-style battery consigns an electric car to trailing the pack. Comparing these two domains can help us understand both of them.

    Batteries

    I hope most people know that cars have batteries like this one:


    [Image: DieHard car battery]
    Batteries are an essential but minor part of normal gasoline-powered cars. But in hybrids and all-electric cars, their characteristics determine the overall success of the car.

    When you drive an all-electric car, you can experience the importance of the battery.

    • How fast does the car accelerate? In part, this depends on how fast the electricity flows from the battery.
    • How long can you drive it? In part, the more charge the battery holds, the longer you can drive. You can also drive farther if you can use all the electricity in the battery.
    • How long do you have to wait to drive again while re-charging?
    • How many years will the battery last? How often do you need to service it?
    • The weight and size of the battery are also key factors. Everything else being equal, a battery that weighs twice as much will make acceleration and drive time worse, and a battery that takes twice as much space will similarly degrade operation.
    • Finally, cost. Let's not forget about how much you have to pay.

    When you walk into a dealership and ask about electric cars, you may think purchase cost is the main thing that matters. But as you get educated, you learn about these other factors that are just as important.

    Boston Power Batteries

    Oak invests in the maker of the best battery for electric cars, Boston Power. Boston Power didn't invent the underlying chemistry being exploited, lithium ion. But they have scores of patents for making that chemistry safe in car-sized applications: dense, light, and fast and effective at taking and giving electricity.

    Each one of these factors is important. You can experience them personally in a car. The safety issue isn't a minor factor, since lithium ion batteries, when not built with Boston Power safety technology, can catch fire and explode; there have been massive recalls as a result of this. Here's an illustration:

    [Video: laptop battery catching fire]

    If a little notebook computer battery can do that, imagine what could happen with a car-sized battery!


    [Image: Boston Power battery and Ford]
    The key thing is that Boston Power's batteries are best-in-class at all the things that matter: energy density, long life, fast charge, safety and environment.

    Computer Storage

    I hope most people know that computers have storage like this one:

    [Image: Seagate hard disk drive]

    Storage is an essential part of computers. But just as things change when batteries power whole cars, what is the best storage changes when computing moves into the Cloud.

    It's not as easy to personally experience the impact of storage as it is to experience the impact of a battery while test-driving an all-electric car. But the change in scale is every bit as dramatic. While your department's computers might fit in a closet or small room, Cloud data centers go on for acre after acre.

    [Image: Google data center]
    It doesn't make much difference if your department's system takes one rack or two — but if a given storage system requires two acres to do its job when a Cloud-sensitive one can be better while taking just one acre, that makes a big difference.

    When you operate on a Cloud scale, factors that don't matter much at a smaller scale become hugely important. The important factors are remarkably similar to those of a battery:

    • How quickly can you store and retrieve data? If it's too slow, you'll have to buy more to get the speed you need.
    • Can you fill it completely with data and still have it perform?
    • How many years can you use it? How often is service required, and how costly is the service?
    • Size and power consumption are key factors. They may not seem large at a small scale, but at acre scale, they are huge.

    When you first learn about storage, the only question you ask is how much it costs to buy a given amount. As you get educated, you find these other factors are just as important.

    X-IO Storage

    Oak invests in the maker of the best storage for large-scale data centers, X-IO Storage. Just as Boston Power didn't invent the chemistry, X-IO doesn't make the basic storage devices. Just as Boston Power has made the chemistry practical for car-scale application, X-IO has scores of patents for making large numbers of storage devices (spinning disks and SSD's) safe and practical for acre-scale applications: dense, low-power, long-lived, low-maintenance, fast and effective at taking and giving data.

    For example, most storage systems treat their disks as throw-away items: devices that often fail and must be replaced frequently. Typical annual failure rates run to several percent of the installed drives, resulting in substantial labor, replacement and error costs. The Google video below illustrates the consequences of this well; start at 2:42.

    [Video: Google data center]

    The Managed Reliability aspect of the X-IO technology reduces storage device failure rates by over 100 times. This is such a huge advance that disks can be sealed in their enclosures, which leads to other benefits.
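
    To see what a 100x reduction means at scale, here's a back-of-the-envelope calculation in Python; the drive count and the 5% baseline annual failure rate are illustrative assumptions of mine, not figures from X-IO or anyone else:

        # Annual drive replacements at data-center scale.
        # Both inputs are illustrative assumptions, not vendor figures.
        drives = 10_000                     # drives in one acre-scale installation (assumed)
        baseline_afr = 0.05                 # assumed conventional annual failure rate: 5%
        improved_afr = baseline_afr / 100   # the claimed 100x reduction

        print(f"Conventional:        ~{drives * baseline_afr:.0f} replacements per year")
        print(f"Managed Reliability: ~{drives * improved_afr:.0f} replacements per year")
        # Conventional:        ~500 replacements per year
        # Managed Reliability: ~5 replacements per year

    Five hundred service events a year versus five is the difference between a standing maintenance operation and an occasional chore, which is what makes sealing the disks in their enclosures feasible.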

    The key thing is that X-IO storage devices are best-in-class at all the things that matter in storage: storage density, long life, reliably high performance, low power and environmental impact.

    Conclusion

    Whether it's batteries that make electric cars practical or storage that makes acre-scale data centers affordable, Oak invests in companies that develop fundamental, industry-changing technologies over many years, and sees those companies through to success.

  • CTO + CFO = CFBCO

    The CTO and the CFO aren't natural best friends in any organization. They are typically separated by a huge gulf of perspective; neither understands or appreciates what the other thinks or does. The best thing for any organization is when the two of them can truly take the other's perspective, and change what they do as a result.

    CTO

    What's the Chief Technology Officer (CTO) about? He or she had better be the best technical person in your organization. The one who understands the details, the big picture and everything in between. The one who actually understands all those computer acronyms, can sling them with the best, and can pick the right technologies and harness them for the good of your organization. At best, the CTO can rally the tech nerd employees to the cause and also convince the suits that everything is good.

    CFO

    What's the Chief Financial Officer (CFO) about? He or she had better be the best financial person in your organization. The one who understands every line of every statement and report, what's behind it, what led to it and where it's going. The one who understands how all that mass of detail relates to company tactics and strategy, and plays a key role in making them align. At best, the CFO can handle the big picture and the details, issues and people inside and outside the company, all to advance the company towards its goals.

    CTO vs. CFO

    When not at their absolute best, the CFO and CTO can have a chilly relationship.

    The last thing the CTO wants is for some nosy bean-counter to mess with his stuff. Even simple questions are suspect: why is he asking? what's he looking to cut? Go away! Most CFO's seem like clueless idiots to even average CTO's. All they can possibly do is waste time and, through crass stupidity, make things even harder than they already are.

    CFO's often see the whole tech group and the CTO in particular as being a bunch of whiney, spoiled, self-absorbed brats. They're anti-social, talk among themselves in their private language, get huffy or all-too-patient in response to even the simplest of common-sense questions, and seem perversely intent on avoiding anything that increases revenues or profits. They're perpetually late, reluctant to commit to anything, always complaining about lack of support, but still manage to have an attitude about pretty much everything.

    CTO Fundamentals

    Most of the thoughts I've had about computing that are worth anything took me years, often decades, longer than they should have to penetrate my thick skull. But one of the earliest realizations I had remains both true and rarely discussed: to the extent that computers are applied well, they cut jobs, and therefore (usually) costs.

    This isn't the pretty way to put it. Most people prefer to think about how computers enhance people's efforts. And they do — meaning you can get the same job done with fewer people. Or you can deliver more with the same people — to deliver more without computers, you would have had to hire more people. Any way you cut it, the more widely and effectively computers are used, the fewer people you need to get a given job done.

    Put all the gobbledygook aside, and what computers come down to is simple: cut costs, do more with less, get it done faster, etc.

    Now what does this sound like? Could it, perhaps, sound like the kind of thing CFO's are supposed to worry about? Hmmmm….

    The CFBCO

    Maybe there's some middle ground here. At the heart of the matter, there is no skill set in an organization better suited to helping a CFO meet his goals than the CTO's. And when a CTO who is really good in nerd terms wakes up and realizes what his job is really about, there is no person better suited to be a company-maker than a bottom-line-oriented CTO.

    Put in the most basic terms, a good CTO can make things happen:

    • Faster
    • Better, and
    • Cheaper.

    Faster, better, cheaper. That's computing in a nutshell, and when computing is applied to an organization to greatest effect, that's what happens to the organization. It does what it does faster, it does it better, and it does it cheaper. FBC. Wouldn't it be nice if we could have a Chief Faster-Better-Cheaper Officer?

    Conclusion

    In most organizations, there is a spectrum of leadership. At one end of the spectrum is the nerdy CTO. At the other end is the what's-your-handicap CFO. If you're very lucky and very deserving, maybe you have someone at the center of that spectrum who combines the best of both extremes in a single individual. The CFBCO. The CFBCO combines ultimate nerd-power with dollars-driven vision and insight, and makes the relevant numbers better in a way that even the dullest and most distracted board of directors can understand and appreciate.


  • Software Quality Assurance Book

    I've written quite a bit about software quality over the years. In addition to quite a number of posts on this blog, I've written a short book about it. Currently, I just distribute it in PDF form to work-related people, but I'm thinking about releasing it on Kindle as an e-book.

    Background

    Anyone involved in software who's, like, alive, gets real involved with software quality. Many years ago, I discovered it was useful to follow up meetings I had with software groups with an e-mail summarizing the ideas. As common themes emerged, I found myself with a small library of e-mails that I cut and pasted from. Then the collection turned into a document, since the ideas were so inter-related.

    I started giving the document to groups before meeting with them. I got feedback during and after meetings, everything from mistakes I'd made to important issues I had ignored. So the document grew as it went through at least 15 revisions.

    The document/paper/book is pretty long and comprehensive, and I haven't been discovering new things to add to it recently. So it must be "done." I've even taken the time to throw together a crappy-looking cover:


    [Image: book cover]

    Mainstream Thinking

    There are literally hundreds of books on software quality. There are tools. There are certifications. There's a huge body of work out there. Why did I put this book together? Does the world really need another book on software quality? What more can there possibly be to be said?

    First of all, let's notice that in spite of all the books, methods, quality software and certifications, software quality still stinks. It stinks in big, process-laden corporations. It stinks in cool young web start-ups. It stinks all over this land!

    So what's the problem? Do people simply ignore best practice? Do they not understand it? Do they try to apply it but screw up?

    The answer is pretty simple: mainstream software quality methods are no good. They cost a lot, take a lot of time, slow down development and modification, and don't improve quality much to speak of. What's more, most people in the industry who aren't completely asleep at the wheel know it, which is the origin of the typical complaint of quality groups: that they're understaffed, underfunded, and never given enough time to do their job the "right" way. This complaint is generally justified! And it's likely to stay that way, because whenever those groups get what they want, cost and time go up and quality stays roughly the same.

    So that's why I wrote what I wrote: what you couldn't read elsewhere, about ideas and methods that were ignored by the mainstream. Who knows why? I've stopped caring.

    Validating the ideas

    I'm only comfortable talking about stuff I know personally. The origin of the book was a large software project, comprising over 7 million lines of code. It processed credit card transactions. I was CTO, and Y2K was rapidly approaching. It was too late to do things the "right" way. We couldn't afford it anyway. Doing nothing was not an option.

    So I dredged up some methods I had used in systems software testing that I realized no one knew about in applications. Because there was no other option, everyone rallied to this one. We got the job done and passed Y2K with flying colors.

    Later, as I became more involved with Oak companies, I noticed that the short cycle times of web development forced small groups of desperate programmers to re-invent a subset of the ideas I was beginning to systematize. When things were really bad in companies not already using the methods, I could sometimes get them to try them, and the ones that really shifted to the new methods found success. By "success" here I mean simply that they got higher quality software with less time and effort and shorter cycle times, with less "tax" on development.

    For a few years, I thought it was important to keep this magic bullet secret. Hah! Glaciers will melt before most software development groups try anything that challenges the way they've done things for years.

    The Down Side

    There's a down side to pretty much everything. Down side to publishing the book? Can't think of one. Down side to using the methods? Definitely. Here are two big, fat problems that emerge when using the new methods, quoting from the book:

    With no big, formless, unproductive but “necessary” QA group, there is
    no place to put weird new hires in hopes that they’ll get bored and leave.
    There’s also no place to send people who are just too stupid or lazy or
    socially skilled to make it as programmers, but you don’t have the heart to
    fire them.

    There are no big, fire-breathing, invective-filled meetings populated
    exclusively with overhead jobs (managers and marketing) who argue about
    “pulling things in” and “risks” and what happened last time and “competitive
    pressures” and elaborate project management charts in 4 point type that someone
    made up last night but everyone makes believe actually have a relation to
    reality other than “not.” Meetings like this raise everyone’s heart rate way
    more than hours in the gym and supply anecdotes providing amusement and smarmy
    edification for weeks. They would be missed.

    Conclusion

    Will I push the "publish" button? Probably. I'm thinking about it. Update: I've thought about it. The button has been pushed. The book is here.


  • The Disease of Software Project Management

    There are a lot of books on the market about project management in general and software project management in particular. More than 6,000 of them.

    They all appear to think that software project management is a good thing — at least the brand they preach.

    I've threatened to publish a book saying that it ain't so. Giving details, arguments and examples. Sounds radical — but it's not. Most sensible, productive software people know that software project management's effectiveness is best compared to the fineness of the emperor's new clothes:


    [Image: illustration from "The Emperor's New Clothes"]

    In publishing this book, I'm not doing any more than the little boy in the story, who cried out "But he's not wearing anything at all!" In other words, I'm just saying what everyone who isn't blind already knows.

    The book is now available on Amazon for Kindle. I even made a nerdy cover for it:


    [Image: book cover]

    My hope with this book is to assure the people who know there's something deeply wrong with project management orthodoxy that they're sane people, but living in an asylum which the inmates have taken over. I hope the book will arm them with the concepts they need to make a break for it, so they can experience the fresh air and freedom they deserve.


  • Software Project Management Book

    I've written a fair amount about software project management in this blog. I've also written a short book about it. Like the software quality book, so far I've only distributed it privately. But also like that book, I'm thinking of publishing it as a Kindle book.

    Tid-bits on the blog

    It's hard to be seriously involved with software and avoid run-in's (not to mention complete co-option) with project management. You can hardly start to think about writing some code without someone popping out with "how long do you think it will take," the question of estimates. If you resist or act uncomfortable, you're put on the spot. Everyone, you see, wants their software group to be as predictable as though it were a software factory. The people who talk this way clearly don't understand that dates are evil, but there are so many of them, it's like you live in a land of zombies.

    Background

    While many programmers resist it, they most often accept project management as a necessary evil, as something that they can't avoid. As they age, sadly, most programmers accept this perverse thought as though it were a natural accoutrement of adulthood: wild young programmers may resist the bridle, but mature ones accept that it's part of life.

    I too resisted it, and I too came to appreciate some of the rhetoric of software project management. But then reality intervened.

    A bit more than 20 years ago I ran a small software group doing pioneering work in document imaging and workflow. A new management team took over, and were appalled that we just wrote code. I was guilty of about the worst thing a manager could be accused of (in their eyes): running an out-of-control, seat-of-the-pants operation in which people just did stuff, without the comfort and support of project management.

    Things changed. Expensive project management software got bought.

    [Image: project management software task list]

    Expensive consultants came in and lots of formerly productive people sat in excruciatingly long training classes. For days! Then we settled into a regimen in which lots of reports and dense charts were generated regularly, and we threw around terms like "critical path."

    [Image: project management dashboard]

    Well, we "got under control." And stopped writing much code. And fell behind the market.

    [Image: project status report]

    As we became more predictable, we became more inflexible. Timelines stretched out so far that sales people lost heart. It was sad.

    After that baptism by torture, which was followed by many more, I really began to think about what was going on when I got involved with Oak. I had a chance to see lots of companies producing software with varying doses of project management involved.

    I noticed that the Indian Outsourcing companies were pushing project management big-time, and winning business with it. It must be a good idea, right? When you dove into the details, they did not win by being faster and more flexible. They were completely rigid and slower. But predictable and marginally less expensive. Here's the bottom line: they won business by costing less. They cost less because they paid their programmers only about one tenth of what the equivalent programmer costs in the U.S. But their methods had so much overhead that they staffed every project so heavily that the final bill to the customer ended up being only about 30% less than doing it in-house. So, oddly enough, the Outsourcers with their devotion to project management proved the point of how bad it is.
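
    Here's a sketch of the arithmetic those numbers imply; the dollar figures and team size are invented for illustration, and the outsourcer's margin is ignored:

        # Illustrative arithmetic only; every input below is an assumption.
        inhouse_cost = 100_000                  # assumed fully-loaded cost of one U.S. programmer
        outsourced_cost = inhouse_cost / 10     # "about one tenth"

        inhouse_team = 10
        inhouse_bill = inhouse_team * inhouse_cost           # $1,000,000

        outsourced_bill = 0.70 * inhouse_bill                # "only about 30% less"
        implied_headcount = outsourced_bill / outsourced_cost

        print(f"Implied outsourced headcount: {implied_headcount:.0f}")
        # => 70 programmers to do the work of 10: roughly 7x the staffing,
        # which is where the process overhead shows up.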

    On the positive side, I saw entrepreneurial companies doing more work with less, having more flexibility, less overhead, and shorter cycle times. Had they found clever new ways to implement project management? No. They just found better ways to develop better software with fewer bugs, more quickly. That's all!

    Systematic Thought about Project Management

    These experiences led me to try to understand what project management was really all about — why everyone kept trying to apply it to software, why it never works (except if you don't care about time or money), and what the alternatives are.

    It was a long journey, and I was surprised that I ended up with a short book. As I state in the book:

    “Project management” is as effective at guiding
    software projects to success as hopping and grunting is at helping pool balls
    to drop in the intended pockets – it may be entertaining to watch, but it has
    no constructive impact on the outcome. More important, to the extent that we
    focus on our hopping and grunting technique, we fail to pay attention to what
    really matters – hitting the ball correctly with the cue. Similarly, in
    software projects, the more things get off track, the more we seem to focus on
    project management hopping and grunting activities, so much so that the shaking
    floor actually makes things worse.

    Project Management needs to be taken down a few notches

    Part of the problem is that it just doesn't work. Another part is that everyone with experience knows it doesn't work. The crowning part of the problem is that even people who know it doesn't work, and who put it aside when they really have to get something done, continue to kowtow to it. This is illustrated by a story I personally experienced that I tell in the book.

     I recently spent some time with the seasoned,
    non-technical leader of one of our portfolio companies, and some of his lead
    technical people. We discussed one of their most successful products. The CEO
    described how he got involved with a couple customers who had a problem that no
    one could solve, how he promised them a solution and got his programming team
    to throw something together that sort of worked. They then scrambled, fixing
    problems and coming out with a flurry of new releases, always listening to the
    customer and evolving their code until things settled down, the customer’s
    needs were met and the company had a new product line.


    “Of
    course,” said the CEO, glancing over at his technical people, “that was the
    wrong way to do things. Later, we settled down and got back to proper project
    management.” Of course – the CEO had to intervene and make sure something
    important actually got done. Later, “project management,” i.e., doing very
    little but trying hard to do that little on time and on budget, could be
    allowed to return.

    Conclusion

    Again, I'm thinking of pulling the trigger on the Project Management book. But first I need to finish formatting it.

    Update:

    Trigger pulled. Book available.

  • There are Lots and Lots of Books on Software Project Management

    There are an amazing number of books on software project management, each promising to tell you how to "manage" your way to software success. Amazon lists over 6,000 of them! There seems to be no end of books that claim to do it "better," or claim mastery over some sub-specialty.

    Here are some books listed on Amazon. There are books (of course) for dummies:

    [Image: "for Dummies" book cover]

    Books from major publishers and computer societies: 

    [Image: Wiley book cover]

    Here's one emphasizing numbers, from a major publisher and a big specialist consultancy:

    [Image: book cover]
    This one appears to be a hit — it's got over 30 reviews on Amazon, most favorable. It's from a major publisher, about a particular flavor of software project management:

    [Image: agile project management book cover]

    This project management thing is serious stuff — my next example is in its tenth edition(!), and is one of the books you need to read to be certified as a Project Management Professional:

    [Image: book cover]
    Oh, and about that exam — you don't want to flunk it, do you? You better pick this up to help you succeed:

    [Image: exam prep book cover]

    And now, we've gotten all the way to item 30 in a list of 6,225 items, just skimming the highlights of about one half of one percent of the list!

    What can possibly have gone unsaid about the deep subject of software project management?

    How about that it's a pernicious disease, and does more harm than good?


  • Fundamental Concepts of Computing: Counting

    Most fundamentals of computing are simple and so is this one: If some aspect of your software counts, count it!

    Computer Fundamentals

    People who are accomplished software engineers tend to be pretty smart and hard-working. They show mastery of difficult concepts and technologies that are beyond the grasp of most people. It's natural that people like this, when presented with a problem, would tend to dive right in to the tough stuff, both solving the problem and showing how smart and accomplished they are.

    But the fact is, computing, like many other fields, benefits from regular re-visiting of the fundamentals, the software equivalent of "blocking and tackling," or even more basic, physical fitness. There is no better way to achieve great results in software than to revisit and re-apply the fundamentals on a regular basis.

    Fundamental of Computing: Count it

    Even though he was not a specialist in computing, I can't think of anyone who "gets" this concept better than Count von Count.

    [Image: Count von Count]

    This guy really knows his counting. From Wikipedia:

    The Count has a love of counting; he will count anything and everything, regardless of size, amount, or how much annoyance he is causing the other Muppets or human cast.

    The Count also knows that the earth revolves around the Pun, not the other way round:

    The Count mentions 2:30 at any chance he can get and often makes jokes about it. This number may represent an inside joke ("Tooth Hurty"). During the afternoon, his segments of the show always come on at exactly 2:30 p.m. or during the "fashionably late" segment, which airs at 2:31.

    Counting How Big and How Often

    There are many opportunities to find out how big something is. Whenever I hear about a bunch of data, I immediately go to the fundamentals and find out how big it is. I often find out that the number isn't as big as people are acting as though it might be. At this point, I won't do anything more than mention "Big Data" for fear of ruining my mood.

    The way you do this is pretty simple: find out how big the average thing is, how many of them there are, and multiply. Not tough! That leads you to think about how you're going to store it. When you do, you find that the typical software methods people use are the same as a decade ago, when capacities were tiny compared to today. It's worth re-thinking storage methods!
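
    To make that concrete, here's a minimal sketch of the multiplication; every figure in it is made up for illustration, not data from any real system:

        # Count it: how big is the data, really? All figures are illustrative.
        average_record_bytes = 2_000       # assumed average size of one record
        record_count = 50_000_000          # assumed number of records

        total_bytes = average_record_bytes * record_count
        print(f"Total: {total_bytes / 1e9:.0f} GB")   # => Total: 100 GB
        # 100 GB fits in RAM on a single modern server, so the "big" data
        # may not call for an exotic distributed architecture at all.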

    There's nothing wrong, for example, with keeping a complete working set in an in-memory secure data store, everything including logs and history in a disk-based DBMS or flat files, and a historical set of data in a data warehouse organized for retrieval and analysis.
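
    Here's a minimal sketch of that kind of split; the plain dict for the in-memory tier, sqlite standing in for the disk-based DBMS, and the promotion policy are all illustrative assumptions of mine, and the warehouse tier is left out:

        import sqlite3

        # Hypothetical tiers: hot working set in memory, everything durable
        # in a disk-based DBMS (sqlite stands in here for illustration).
        db = sqlite3.connect("records.db")
        db.execute("CREATE TABLE IF NOT EXISTS records (key TEXT PRIMARY KEY, value TEXT)")

        working_set = {}   # in-memory tier: the complete hot working set

        def read(key):
            if key in working_set:            # hot path: no disk I/O
                return working_set[key]
            row = db.execute("SELECT value FROM records WHERE key = ?", (key,)).fetchone()
            if row:
                working_set[key] = row[0]     # promote into the working set
                return row[0]
            return None

        def write(key, value):
            working_set[key] = value          # memory serves the hot reads
            db.execute("REPLACE INTO records (key, value) VALUES (?, ?)", (key, value))
            db.commit()                       # disk keeps the durable copy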

    The important thing is that counting how much leads you to the fundamental architectural decisions that determine so much about how much work it takes to create and maintain a piece of software.

    Counting How Many and How Long

    Counting how many and how long are things I think the Count would also approve of. For example, consider the fundamentals of web site design.

    Even before the user has a chance to get an impression of whether they find the site to be attractive, there's the issue of how long they have to wait for it to come up. How long the user has to wait is overwhelmingly the most important factor in web site design. If a great-looking design is just too slow, the users will go elsewhere.
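
    A crude way to start counting how long, sketched in Python; the URL is a placeholder, and a real measurement would also cover assets, rendering and caching:

        import time
        import urllib.request

        # Crude page-load timing: fetch one page and count the milliseconds.
        url = "https://example.com/"          # placeholder URL

        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            body = response.read()
        elapsed = time.perf_counter() - start

        print(f"Fetched {len(body):,} bytes in {elapsed * 1000:.0f} ms")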

    Second, count the keystrokes and clicks it takes to accomplish a given task. Sound trivial? Yes, of course, just like most of the fundamentals of computing. However, "trivial" factors like whether what stands between a consumer and his goal is 5 clicks or 3 clicks turns out to make a huge difference. Every "extra" action you require the user to take is an opportunity for that user to decide that his goal is one bridge too far, and he's got better things to do with his time.

    Conclusion

    Count von Count may be irritating, but he's nearly always right: count it!

  • Oak Investment Partners in the 2012 WSJ top 50 VC Companies

    Oak Investment Partners backs 4 of the 50 companies in the 2012 WSJ list of top VC-backed companies. This isn't the first time Oak has been well-represented in that list, or in other important lists. But it feels great every time.

    Venture Capital and VC-backed Companies

    There are a very large number of companies backed by VC's, and a similarly large number that aspire to that backing. For this list, 5,900 companies were considered, so the list is what the WSJ considers the top 1% of all such companies. An elite list!

    As to VC firms, there are also quite a few. The NVCA gives a couple definitions; depending on the one you prefer, there are between 460 and 791 venture firms in the US. This means that most venture firms probably have no companies they back on the WSJ list.

    And our companies aren't just any old companies. Last year, our company Castlight Health occupied the number one spot. We've got the number one spot again by backing Genband.

    The Companies

    As I've done in the past, here's a quick summary of the companies:

    #1 Genband. This is a rapidly growing, complex company that provides products and services deep in the innards of networks. The simplest way to understand them is as experts in the long evolution of fixed networking and communications systems to IP-based ones, for example VoIP.

    #25 SmartDrive. SmartDrive has been on the list before. They're pretty much the same thing as they were, except they've clawed their way higher in the list this year, as they richly deserve. They still help drivers of commercial vehicles drive more safely and use less fuel. The market has rewarded them: their service is now installed on more than 10,000 commercial vehicles.


    [Image: SmartDrive in-vehicle device]

    That's 10,000 vehicles that are safer, more fuel efficient and more cost effective than they were before, something which benefits everyone.

    #27 Movik. Movik is deep inside the mobile networks. Most people don't think about what happens when they talk on their mobile phones while walking or driving, and they don't need to, because of the astounding web of complex systems that make it all happen. But we all know the mobile networks aren't flawless, in spite of the billions of dollars spent to upgrade and maintain them. This is where Movik steps in. With their deep insiders' knowledge, they have constructed a kind of real-time "big data" application with analytics and automated responses. They get a flow of information from the various internal systems and decide, for example, that a person walking and talking is connected to a local cell tower that is becoming overloaded, and there's a nearby one that he's walking towards that has excess capacity — and gets him switched. It's cool stuff, and creates a win for customers and the carriers.

    #46 Keep Holdings. I'm having a lot of fun working with Scott Kurnit and his ace team, based here in NYC, as they rapidly evolve their way from good ideas and implementations to great ones. Starting with AdKeeper, they've now added a service to enable consumers to get back control over their in-boxes from commercial messages, seeing offers when and how they want to.

    [Image: service logo]

    They're also rolling out a "social commerce service" that plays in the intersection of e-commerce, social networking and consumer curation of products.

    [Image: Keep logo]
