Category: Computer storage

  • Replication: Good Idea! Storage replication? Nah!

    Everyone knows losing your data is a bummer. If you're in charge of your organization's data, you know that losing data is the shortest path to "don't let the door hit you on the way out."

    All the ways to assure your data is still available when you wake up tomorrow share a common theme: "make a copy." This is such a popular theme that it has turned into a theme-and-variations: "make a copy; make another copy; copy the copy; etc."

    This sounds simple, but we all know that in computing, stuff is supposed to be complicated. Sure enough, this simple "just copy it" theme has gotten mired in hotly competing ways to get it done. And of course, there are politics — whose responsibility is it to assure against loss?

    So let me boil it down: there are two basic ways to do the copy:

    1. The guys in charge of the data, the storage guys, should copy the data from the original bunch of storage to a second bunch of storage.
    2. The guys who write the data, the applications or systems guys, should get their applications or systems to talk to each other and write the data twice.

    The only reason this is hard is that politics and history are involved. If you had fresh, educated people starting from scratch, it would be no contest: way number 2 wins, almost every time. It's faster, cheaper and easier than way number 1. But since when can we wave a magic wand and eliminate politics and history? The reality is, storage guys own the data, they want to protect it, and so they (usually) really, really, REALLY want to be in charge.

    Here's why they shouldn't be.

    You've got two sites, number 1 and 2. Each one of them has a database and a bunch of storage. Transactions come into site 1 and get written to storage.

    Here's a simple transaction that might be written to the database.

    It's a SQL statement that says the DBMS should write the transaction into the transaction table. The transaction contains the usual fields: the unique ID for the transaction, the account number it's applied to, the amount of the transaction, and so on. This is usually a simple string, a line or two long.

    When the database processes the transaction, it gets complicated, of course.

    When the Insert statement goes to the DBMS, the DBMS has to write the transaction itself, but it also has to write at least a couple of the fields to index tables, kind of like card catalogs in old-style libraries that let you find where things are. Indices typically use well-known structures called B-trees, which may require a couple of writes to maintain a multi-level index, for the same reason you put related files into sub-folders so you have some chance of finding them later. There will certainly be an index for the transaction's unique ID and one for the account number. Finally, there's a log to enable the DBMS to figure out what it did in case bad things happen.

    All this happens when the Insert transaction comes in. One simple request to the DBMS turns into many writes and updates to the storage, usually involving reading in big blocks of data, modifying a small part of the block, and writing the whole thing out again.
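    To make the write amplification concrete, here's a minimal sketch using Python's built-in sqlite3 module and a hypothetical transactions table (the table name, columns and values are invented for illustration). One logical Insert causes several physical writes: the row itself, an entry in each B-tree index, and the journal.

```python
import sqlite3

# Hypothetical schema for illustration: one transaction table plus a
# secondary index, as described in the text above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transactions (
        txn_id     TEXT PRIMARY KEY,   -- unique ID for the transaction
        account_no TEXT NOT NULL,      -- account it's applied to
        amount     REAL NOT NULL       -- amount of the transaction
    )
""")
# Secondary B-tree index on the account number.
conn.execute("CREATE INDEX idx_account ON transactions(account_no)")

# The "simple string, a line or two long" that the application sends:
stmt = "INSERT INTO transactions VALUES ('T-1001', 'ACCT-42', 19.95)"
conn.execute(stmt)
conn.commit()  # the DBMS also writes its journal/log here

# The DBMS quietly maintained the indexes for us; lookups by account work:
row = conn.execute(
    "SELECT txn_id, amount FROM transactions WHERE account_no = 'ACCT-42'"
).fetchone()
print(row)  # ('T-1001', 19.95)
```

    One `execute` call from the application; the row page, two index B-trees, and the journal all get written underneath.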

    Now we come to the crux of the matter: how do we get the data over to site 2? Does the DBMS at site 1 talk with his buddy at site 2 to get it done, or are the relevant storage blocks in site 1 copied over to site 2?

    In the diagram, I show the DBMS doing the job in green and the storage doing the job in red.

    You'll notice that the DBMS only has to send a tiny amount of data over to site 2, essentially the insert statement. Once it's there, DBMS #2 updates all the storage, something it's really good at doing.

    To replicate the data once it's been stored (in red), HUGE amounts of data need to be sent over the network to site #2. It's not unusual for the ratio to be hundreds or thousands to one.

    Sending data between sites is a relatively slow and expensive operation. That's why, if you want replication that's fast, reliable and inexpensive, you want the application to do the job, not the storage.
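    The ratio claim above can be sketched with back-of-envelope arithmetic. All the numbers here are illustrative assumptions, not measurements:

```python
# Back-of-envelope comparison of the two replication strategies.
# All numbers are illustrative assumptions.

stmt_bytes = 200             # the INSERT statement itself, a line or two of text
block_bytes = 8 * 1024       # a typical DBMS page size
touched_blocks = 6           # row page + index pages (multi-level) + log pages

app_replication = stmt_bytes                        # ship the statement, replay it
storage_replication = touched_blocks * block_bytes  # ship every dirty block

ratio = storage_replication / app_replication
print(f"storage ships {storage_replication} bytes vs {app_replication}: {ratio:.0f}x more")
```

    Even with these modest assumptions the storage path ships a couple of hundred times more data; with bigger pages or more index levels, the ratio climbs into the thousands.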

    The storage replication people don't like to talk about the things that go wrong, but of course they do. What happens if some of the blocks make it over but others don't? Or they arrive out of order? Or syncing with the database doesn't happen? Any number of other bad outcomes are possible.

    Other applications

    I'm using a database application to illustrate the principle, but similar dynamics play out with other applications. All major databases can replicate (Oracle, MySQL, SQL Server, MongoDB, etc.), the major file systems can replicate (for example, Microsoft has DFS Replication), and all the hypervisors can replicate.

    The hypervisors are amazing. The first thing the storage guys will come back with is how many different applications you have to fiddle with to protect their data. The answer of substance is that the incremental effort for each application is truly trivial, well under 1%. The quick answer is that hypervisors (VMware, Hyper-V, etc.) are universal, and their replication is superior to storage replication. This is exactly why, as organizations move their data centers to the cloud, they are abandoning expensive, inefficient storage vendor-lock-in features like replication in favor of doing it in the hypervisor.

    Conclusion

    You have to protect and preserve your data. Non-negotiable. The storage guys used to have a monopoly on it. But their high-priced, inefficient copy methods are rapidly giving way to more effective, modern ways that save money and are nearly standard in the SLA-centric world of cloud computing.

     

  • Obstacles to Scaling: Centralization

    Want to build a scalable application? Use a scalable architecture. What's a scalable architecture? Simple. A scalable architecture is "shared nothing," an architecture in which nothing is centralized. This seems to be harder to achieve the "deeper" you go into the stack; many software architects still seem to like centralized databases and storage. It's sad: centralized databases and/or storage are the most frequent cause of problems, both technical and financial, in the systems I see.

    Scalability

    Scaling is a simple concept. As your business grows, you should be able to grow your systems to match, with no trouble. Linear scalability is the goal: 11 servers should be able to do 10% more work than 10 servers. Adding a server gives you a whole server's worth of additional capacity. With anything less, you don't have linear scalability.
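    One way to sketch the difference between linear and bottlenecked scaling is Gunther's Universal Scalability Law; the contention coefficient below is an illustrative assumption, not a measurement:

```python
# Linear scaling vs. scaling with a shared bottleneck, sketched with
# Gunther's Universal Scalability Law. Coefficients are illustrative.
def capacity(n, contention=0.0, crosstalk=0.0):
    """Relative throughput of n servers (n=1 gives 1.0)."""
    return n / (1 + contention * (n - 1) + crosstalk * n * (n - 1))

# Shared nothing: the 11th server adds a full server's worth of work.
assert capacity(11) / capacity(10) == 1.1

# A centralized database adding 5% contention: each new server adds less.
for n in (10, 20, 40):
    print(n, round(capacity(n, contention=0.05), 1))
```

    With zero contention the curve is a straight line; with even 5% contention, 40 servers deliver well under half their nominal capacity.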

    This is what we normally enjoy with web servers, due to the joys of web architecture and load balancers.

    Sadly, this is often not what we normally enjoy with databases, because of mindless clinging to obsolete practices and concepts.

    Databases

    Databases are a wonderful example of a tool that was invented to solve a hard problem and has created a lot of value — but has turned into a self-contained island of specialization that tends to cause more problems than it solves.

    Databases are a classic example of a software layer

    Most people in software seem to think that having layers is a good thing. Software layers are, with few exceptions, a thing that is very, very bad! The existence and necessity of the layer tends to be accepted by everyone. It's so complicated that it requires specialists. The specialists are special because they know all about the layer and what it can do. They compete with other specialists to make it do more and more. Their judgments are rarely questioned. Sadly, they are wrong all too often both on matters of strategy and detailed tactics. All these characteristics of software layers apply to the database.

    Database pathology is a classic result of the speed of computer evolution

    Databases were invented by smart people who had a hard problem to solve. But the fact that they have persisted as a standard part of the programmer's toolkit, essentially unchanged, is a classic side-effect of the fact that computer speed evolves much more quickly than the minds and practices of the programmers who use them. This concept is explained and illustrated here.

    How to fix the problem

    There are a couple of approaches, depending on how radical you are.

    • Fix the scalability problem by moving beyond databases

    If you have the chance, you should do yourself and everyone else a favor and move to the modern age. As I show in detail here, the fierce speed of computer evolution has solved most of the problems that databases were designed to solve. The problem no longer exists! Get over it and move on!

    • Fix the scalability problem by moving to shared nothing

    If you're not willing to risk being burned at a stake for the heresy of claiming that a problem involving a bunch of data can be solved nicely without a database, there are almost always things you can do to fix the typical centralized database pathologies.

    The desire to have all the data in a single central DBMS is strong among database specialists. This desire is what fuels the incredible amount of money that goes to high-end solutions like Oracle RAC. The desire is completely understandable. It's not unlike a bunch of guys getting together: bragging rights go to the one with the coolest car or truck.

    However understandable, this desire is misguided, counter-productive and remarkably ignorant of fundamental DBMS concepts, like the difference between logical and physical embodiments of a schema. There is no question that there needs to be a single, central logical DBMS. But physical? Go back to database school, man! All you need to do is apply a simple concept like sharding, which in some variation is applicable to every commercial schema I've ever seen, and you've gone most of the way to the goal of a shared-nothing architecture, which gives you limitless linear scaling. Game over!
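    As a minimal sketch of the idea, here's hash-based sharding in Python; the shard names and key format are hypothetical:

```python
import hashlib

# A minimal sketch of hash sharding: each account's rows live on exactly
# one shard, so shards share nothing and you scale by adding shards.
# Node names and key format are hypothetical.
SHARDS = ["db-node-0", "db-node-1", "db-node-2", "db-node-3"]

def shard_for(account_no: str) -> str:
    """Route a key to a shard. Stable hash, so every client agrees."""
    h = int(hashlib.sha256(account_no.encode()).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

# Every transaction for one account always lands on the same node:
assert shard_for("ACCT-42") == shard_for("ACCT-42")
print(shard_for("ACCT-42"))
```

    Note that simple modulo routing forces data movement when the shard count changes; consistent hashing is the usual refinement, but the shared-nothing principle is the same.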

    Analysis

    Computers evolve far more quickly than software, which itself evolves far more quickly than the vast majority of programmers. There is nothing in human experience that evolves so quickly. This fact explains a great deal of what goes on in computing.

    I've found that the more layers a given computer technology is "away from" the user, the more slowly it tends to change, i.e., the farther in the past its "best practices" tend to be rooted. In these terms, databases are pretty deeply buried from normal users, metaphorically many archaeological layers below the surface. They are "older" in evolutionary terms than more modern things like browsers. Similarly, storage is buried pretty deep. That's why most of the people who devote their professional careers to them are mired in old concepts. If you think about it, you realize that DBMS and storage thinking strongly resembles thinking about those ancient beasts that used to rule the earth: mainframes!

    Conclusion

    Most software needs to be scalable. "Shared nothing" is the key architectural feature you need to achieve the gold standard of scalability, linear scalability. Shared nothing is common practice among layers of systems that are "close to" users, but relatively rare among the deeper layers, like database and storage. But by dragging the database function to within a decade or so of the present, and by applying concepts that are undisputed in the field, you can achieve linear scalability even for the database function, and usually save a pile of money and trouble to boot!

     

  • Storage For Big Data

    In Big Data, computers and storage are organized in new ways
    in order to achieve the scale required. The major storage companies just assert, without justification, that their old products are just fine. They're not.

    Big Data is way bigger than the biggest
    computers. In Hadoop, you solve the problem with an array of servers that
    can be as big as you like. Hadoop organizes them for linear scaling. While most
    storage vendors continue to plug their old centralized storage architectures
    and claim they’re good for Big Data, the only solution that’s actually scalable
    is an array of storage nodes, directly connected to the compute/storage nodes.
    Hadoop organizes the computing to use such an array of compute and storage
    nodes optimally, and it can grow without limit, for example to thousands of
    nodes.
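    The data-locality idea behind such an array can be sketched in a few lines; the node and block names are invented for illustration:

```python
# A toy sketch of Hadoop-style data locality: blocks live on specific
# nodes, and work is scheduled where the data already sits, so adding a
# node adds both compute and storage. Names are hypothetical.
blocks = {
    "part-0001": ["node-1", "node-4", "node-7"],  # block -> replica locations
    "part-0002": ["node-2", "node-5", "node-8"],
}

def schedule(block: str, busy: set) -> str:
    """Prefer a free node that already holds the block (no network copy)."""
    for node in blocks[block]:
        if node not in busy:
            return node
    return blocks[block][0]  # fall back to a remote read

print(schedule("part-0001", busy={"node-1"}))  # -> node-4
```

    Because the scheduler only ever consults the block's own replica list, adding more nodes (and more blocks) adds capacity without any central bottleneck.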

    Hadoop has its own file system and database. The NAS systems
    pushed by legacy vendors just add expense and slow things down. The old
    centralized controller SAN systems are expensive and not scalable. Some vendors
    promote themselves as good for Big Data because they use lots of SSD – but
    that’s way too expensive for Big Data. Others promote hybrid systems, made
    affordable by tricks like compression, which add complexity and slow
    things down.

    Exactly one vendor has a storage system that is best for Big
    Data: X-IO. X-IO has exactly the kind of storage nodes that Hadoop wants. Its
    independent storage nodes are linearly scalable, without limit. Its software
    makes spinning disks deliver at least twice the performance compared to any
    other system. It can optionally incorporate SSD’s for even better performance,
    without using the distracting tricks used by others – you just get better
    blended performance, without effort. Because of the inherent reliability of the X-IO ISE units, you don't need as many copies of the data.

    If it's Big, if it's Cloud, if it's virtual, X-IO is the place to go for storage.

  • Storage Vendors in the Cloud

    When computer vendors encounter a major technology disruption, they respond the same way: with fervent claims that their products are really well suited for the new environment, when of course they are not. The response of storage vendors to the new ground rules of the Cloud provides a timely illustration of this near-universal phenomenon.

    Our Product is Definitely in Fashion

    Computers
    are complicated. Many people have trouble just keeping the buzzwords in
    mind, much less understanding what, if anything, is behind them — much
    less actually understanding things. It's particularly tough when a wave
    of fashion sweeps the industry, as it so often seems to. Then everyone
    but everyone immediately claims to be at the forefront of whatever that
    fashion is.

    This
    was true years ago when the good thing to be in databases was
    "relational," and suddenly every database vendor revealed that their
    precious products were, in fact, "relational." At first I laughed. What
    idiots these marketing people were — why, anyone can tell that C's
    product wasn't relational when it was built, isn't now, and probably
    never will be. What a joke!

    It turns out the joke was on me. Whatever the buzz-fashion-word of the moment, industry-standard practice is to claim it. And for most people to accept the claim!

    This is a big deal for the established vendors. There is a lot
    of money riding on maintaining market share as the new trend takes
    hold. When "relational" becomes the hot thing, and your marketing people
    are any good at all, then by golly, our database is relational — because I say it is!

    The Cloud — the Buzz-Fashion-Word of the Moment

    Now
    the Cloud is hot. Surprise, surprise — everyone's product claims to be
    "cloud-ready," "Cloud-optimized" or whatever it is they think you want
    to hear.

    Everyone's product is just great for the Cloud. The major vendors:

    EMC
    NetApp

    and everyone else.

    Inside the Marketing Department

    Something like the following dialog probably happens inside each major vendor.

    Bright New Kid: "I'm having real trouble producing that marketing piece about our products for the Cloud. I've read a lot about Cloud, and we just don't fit. I don't know what to do!"

    Seasoned Veteran: "You're making it too hard. We make storage, right? Our storage is great, right? Cloud needs storage, just like everything else, right? So our storage is ideal for the Cloud. That's it!"

    Bright New Kid: "I'm not so sure –"

    Seasoned Veteran: "You're over-thinking it, kid. Our storage is great, so it's great for Cloud. Just get over yourself and write it."

    What's Different about the Cloud?

    There
    is no cloud industry association to certify the criteria for being
    cloud-appropriate. This is just as well, because the cloud is just another name for something we already do — run data centers.

    But the reality is that things are different in the cloud.


    The
    bottom line is simple — it's the bottom line! Literally! Meaning, the
    cloud is all about making things faster to implement and change; better
    performing and more responsive; and less expensive. I make no secret of my preference here. But the point and my analysis would be the same even if I had no horse in the race. It's not about feature X or service Y, all of which are irrelevant or migrating up the stack in Cloud applications. It's about the bottom line, not just purchase price, but TCO.

    The vast majority of data centers have been run essentially without competition. The people who pay the bills haven't been able to choose. It's the in-house data center or nothing.

    With the Cloud, suddenly there's competition. Buyers compare on price and quality — and can even switch if the promises prove to be hollow ones! So things are different in the Cloud. The arm-waving is replaced by the simple measures of capacity, performance, energy and space utilization, management costs, and maintenance.

     

  • Storage For the Cloud

    The massive movement to Cloud architectures puts new demands
    on systems vendors that most of them are unprepared to meet, while at the same
    time devaluing special features that many vendors used to differentiate their
    products. Nowhere has this trend been more evident than in storage.

    For years, storage has had its own silo in the data center,
    SAN and/or NAS, with its own storage managers and administrators. They became
    dependent on various storage-centric features of the different vendors.


    The Cloud has disrupted this comfortable island of
    automation.

    The Cloud is all about reliable, low-cost self-service, with
    tremendous automation and integration. Service, capacity and performance need
    to be available on-demand, with no human intervention. Everything needs to be
    able to grow and shrink as application needs change, with a sharp eye to
    capacity utilization, since it’s easier than ever to switch Cloud vendors when
    one stumbles or is simply no longer competitive. The same observations are true
    of “private clouds.”

    Virtualization is a key part of achieving Cloud goals, and
    virtualization changes the rules of the systems game. Functions that were
    traditionally part of storage are now performed as an integral part of
    operating systems and/or virtualization software, to make them more agile.

    Many companies have observed that traditional,
    controller-centric, feature-rich SAN and NAS solutions are not appropriate for
    the Cloud environment. They are simply using inexpensive JBOD’s for storage and
    depending on massive replication by the file system to provide reliability,
    typically making a minimum of 3 whole copies of the data, before backups, in
    order to assure availability. If the alternative is an old-technology NAS or
    SAN, this is a smart idea, which is why its use is growing so quickly.
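    The capacity cost of that make-lots-of-copies approach is easy to sketch; the figures below are illustrative assumptions, not vendor measurements:

```python
# Rough capacity arithmetic for the make-lots-of-copies approach.
# All figures are illustrative assumptions.
usable_tb = 100                 # data you actually need to store
replicas = 3                    # whole copies kept by the file system
raw_tb = usable_tb * replicas   # raw JBOD capacity you must buy

print(f"{usable_tb} TB of data needs {raw_tb} TB of raw capacity")
print(f"capacity efficiency: {usable_tb / raw_tb:.0%}")
```

    At three whole copies, two thirds of every disk you buy is overhead before backups are even counted; that is the trade the JBOD approach makes for reliability.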

    X-IO has a whole different approach to storage. It’s not
    NAS. It’s not SAN. It’s not cheap JBOD’s with a make-lots-of-copies filesystem.
    It’s an intelligent storage node that not only uses, but enhances the drives from one of the major OEM
    suppliers, Seagate. X-IO makes them better by a large margin, and it doesn’t do all the things that are no longer
    needed in the Cloud environment. X-IO gives you more of what you do need for Cloud, and none of what you don’t need.

    The X-IO approach to storage assumes you’re smart about
    building your data center. You’ll take a building-block approach, with lots of
    well-configured servers, network and storage blocks, with a layer of software
    on top of it all to orchestrate it. You want each building block to be great at
    what it does – do a lot, cost a little, and play its role in the overall system.

    In the end, storage comes down to a small set of storage
    components used by everyone. Rather than ignore the details of the drives and
    wrap them in fancy, useless (in the Cloud) packaging like everyone else, X-IO adds value, real value, to the
    drives themselves. This value persists as Seagate develops and releases new
    drives – the 2 to 5X X-IO advantage over every other storage solution will ride the
    waves of new drives into the future.

    X-IO spent over 10 years in deep development of unique IP
    (the first 5 as a Seagate division). Over that time it invented and hardened
    algorithms and code and incorporated the experience from having thousands of
    units in the field over many years. The results are clear, and differentiate
    the X-IO storage brick approach from everyone else. Given a set of drives, X-IO
    will make them:

    • Perform at least twice as fast,
      often 3-4X anyone else when near capacity
    • Deliver at least twice the throughput
    • Fail at less than 1% of the rate of
      anyone else
    • Not require replacement during their
      5 year warranty
    • Take much less space, often 30-50%
      less
    • Require much less power, often 50%
      less
    • Require less cooling

    Finally, X-IO can incorporate SSD drives as required to
    achieve even better performance, though this is needed much less often than
    with other vendors.

    In service operations, Cloud is measured on cost and SLA’s.
    X-IO storage is all about cost and SLA’s. X-IO is the winning choice of
    storage for Cloud.

  • In storage, there is X-IO, and then there are all the others…

    In normal times, when there is no major technology
    disruption in the market, there are two categories of storage companies.

    Most storage revenue goes to the big names everyone knows
    (EMC, NetApp, etc.). These companies have comprehensive storage solutions and
    services to meet nearly any need. Their products are solid and meet most
    mainstream needs. They don’t innovate much and aren’t the most cost-effective,
    but they work.

    A good deal of attention in the storage industry goes to the
    hot new companies, which are all about the latest technologies (e.g. SSD) or
    features. They usually don’t do the old things as well as the established
    companies. But by focusing on the hot new thing, they often do that one thing
    pretty well, and so appeal to the usually tiny part of the market that feels
    the corresponding pain. If they get market momentum, they are usually bought by
    an established vendor.

    This is the way it works. The established companies take most
    of the revenue and do little innovation. There is always a flurry of new
    companies trying to innovate, sometimes getting traction, and getting absorbed
    by the established vendors.

    Then there are technology disruptions. That’s when the rules
    change. Suddenly the comprehensive product lines of the established vendors
    don’t meet the needs of the emerging landscape very well (in spite of the
    furious efforts of their marketing groups to claim they do), and most of the
    new vendors don’t get the new situation and continue to do little but exploit
    new devices or add features onto the existing pile.

    Today’s Technically Disrupted World

    That’s the situation we’re in today, with the combined
    technology disruptions of data centers employing virtualization, moving to the
    Cloud, and attempting to exploit Big Data. In addition, there is a new storage
    technology, flash (SSD), which vendors are scrambling to exploit. The situation
    is confusing for buyers and chaotic for vendors, since most vendors try to act
    as though nothing fundamental has changed. But it has!

    The Cloud is all about reliable, low-cost self-service, with
    tremendous automation and integration. Service, capacity and performance need
    to be available on-demand, with no human intervention. Everything needs to be
    able to grow and shrink as application needs change, with a sharp eye to
    capacity utilization, SLA's and costs, since it’s easier than ever to switch Cloud vendors when
    one stumbles or is simply no longer competitive.

    Virtualization is a key part of achieving Cloud goals, and
    virtualization changes the rules of the systems game. Functions that were
    traditionally part of storage are now performed as an integral part of
    operating systems and/or virtualization software, to make them more agile. This
    also drives the movement to software-defined networking and storage.

    Big Data is the same only more so, with its emphasis on
    linearly scalable arrays of compute nodes and storage nodes.

    In response to this massive technology disruption, many
    companies realize that brand-name vendors no longer make much sense, and are using inexpensive JBOD’s for storage and depending on
    massive replication by the file system to provide reliability, typically making
    a minimum of 3 whole copies of the data, before backups, in order to assure
    availability. If the alternative is an old-technology NAS or SAN, this is a
    smart idea, which is why its use is growing so quickly.

    X-IO

    And then there is X-IO. While X-IO is a storage company,
    it’s different from all the others. It was built for a vision of computing that
    we now call “Cloud.”

    When X-IO was started about 10 years ago as the Advanced
    Storage Architecture division of Seagate, its goal was to build highly compact,
    efficient and reliable storage building blocks using Seagate HDD’s. While the
    rest of the storage world was ignoring the details of the devices on which it
    was built, piling on features and management systems that have become obsolete
    in the Cloud world, the ASA group was inventing the technology of the storage
    “brick,” now amounting to over 50 patents and a great deal of field-hardened
    code that delivers more of what Cloud needs than any existing system, by far.

    All storage vendors, whether established or emerging, use
    the same drives from the same couple of leading vendors, mostly Seagate or
    Western Digital (WD). All of them except X-IO package them in roughly the same
    way and throw some features on top of them to “differentiate” themselves from
    the other guys who use the same disks. It’s just as though all cars had one of
    two different kinds of nearly-identical engines in them – each of the car
    vendors would try to distract you from the engine, and try to get you to
    appreciate how wonderful their steering wheels or cup holders were. That’s even
    true of NAS and SAN, which seem so different, but really have the same engines
    (disks) in them – it’s like one has front-wheel drive and the other rear-wheel
    drive, but under most conditions, their speed, fuel efficiency, acceleration
    and service frequency are identical.

    The only storage vendor that is different is
    X-IO, and X-IO’s difference just happens to be on all the dimensions that
    matter most for the new world of Cloud, virtualization and Big Data.

    X-IO’s Difference

    First of all, X-IO doesn’t build feature-encrusted storage,
    like a “trophy car.” It’s basic storage, a storage building block or brick,
    ideal for plugging into nodes in a Cloud server farm under virtualization
    control, or a Hadoop cluster.

    Second, and most important, comes from its heritage as part
    of Seagate. While X-IO uses the same Seagate drives that other vendors use, all
    the other vendors just plug the drives in and proceed to concentrate on everything but the drive. X-IO’s technology, in sharp
    contrast, is all about making that drive perform at its very best. You wouldn’t
    think there would be much that could be done. But there is! X-IO reduces the
    error rate of the drives so much (more than 100X) that they can be sealed in
    containers, which makes them take much less space, consume less power and
    generate less heat than the same drives in any other system. Then the X-IO
    software actually gets more than twice the I/Os per second (IOPS) from
    each drive compared with any other vendor.

    Let’s think about a car rally. Most of the cars will vary
    greatly in size, shape, color and gizmos. The X-IO car will be the plain one.
    Imagine them in a distance race. Most of the cars will overheat or have to stop
    for gas pretty often. Only the X-IO car will never overheat, and it will get
    vastly better mileage than the others. Many of the other cars will break down along
    the way. X-IO won’t. Here’s the amazing thing: the X-IO car will cross the
    finish line in half the time of its nearest competitor.

    Now let’s think about sending an important package. Using
    normal cars, you’d better send 3 identical packages by different routes to make
    sure it gets there. With X-IO, you only need one car, and it will get there
    faster than any other car, using less fuel.

    In the world of Cloud, this translates into not having to buy expensive SSD drives to
    get performance, though X-IO has them available if you need to go even faster
    than X-IO normally goes. It translates into not having to over-provision to get
    performance. It translates into not having to store 3 or more copies of
    your data to assure it’s still there tomorrow. It translates into buying half or a third of the number of racks (or rows!) you
    would normally have to buy in order to make a given amount of data available at
    a given performance level. It translates into dramatically lower operating
    costs for those racks, which at Cloud scale and Cloud competitive pricing can
    be the difference between growing profitably and losing to the competition.

    No other storage vendor offers these benefits. No one but X-IO.

    Conclusion

    The “cloud” as we know it today didn’t exist when the ASA
    division of Seagate started inventing the deep technology that has now matured
    in X-IO. But its simple mantra of getting more value out of devices was a
    unique quest. No vendor has equaled it, and no one is even close. As new drives
    are released, the X-IO advantage will persist as a multiplier on whatever
    Seagate ships. All the other vendors will plug Seagate drives into their
    systems and try to distract you, drawing your attention to “anything but” the
    actual characteristics of the storage – its performance, space and power use,
    reliability. These things are old news in the old world of storage, but they’re
    the only thing that matters in the new world of Cloud. Which is why there are
    all the storage vendors – and then there’s X-IO.

  • Computer Storage and Batteries in the 21st Century

    Computer storage is a key weapon in the arsenal of Cloud service providers. It's the difference between a mediocre service and a great one. Batteries play a similar strategic role in electric cars. A bulky, old-style battery consigns an electric car to trailing the pack. Comparing these two domains can help us understand both of them.

    Batteries

    I hope most people know the ordinary car battery, like the familiar DieHard.
    Batteries are an essential but minor part of normal gasoline-powered cars. But in hybrids and all-electric cars, their characteristics determine the overall success of the car.

    When you drive an all-electric car, you can experience the importance of the battery.

    • How fast does the car accelerate? In part, this depends on how fast the electricity flows from the battery.
    • How long can you drive it? In part, the more charge the battery holds, the longer you can drive. You can also drive farther if you can use all the electricity in the battery.
    • How long do you have to wait to drive again while re-charging?
    • How many years will the battery last? How often do you need to service it?
    • The weight and size of the battery are also key factors. Everything else being equal, a battery that weighs twice as much will make acceleration and drive time worse, and a battery that takes twice as much space will similarly degrade operation.
    • Finally, cost. Let's not forget about how much you have to pay.

    When you walk into a dealership and ask about electric cars, you may think purchase cost is the main thing that matters. But as you get educated, you learn about these other factors that are just as important.

    Boston Power Batteries

    Oak invests in the maker of the best battery for electric cars, Boston Power. Boston Power didn't invent the underlying chemistry it exploits, lithium ion. But the company holds scores of patents for making that chemistry safe at car scale, as well as dense, light, and fast and effective at taking and giving electricity.

    Each one of these factors is important. You can experience them personally in a car. The safety issue isn't a minor factor, since lithium ion batteries, when not built with Boston Power safety technology, can catch fire and explode; there have been massive recalls as a result of this. Here's an illustration:

     

    If a little notebook computer battery can do that, imagine what could happen with a car-sized battery!


    Boston-power-ford
    The key thing is that Boston Power's batteries are best-in-class at all the things that matter: energy density, long life, fast charge, safety and environment.

    Computer Storage

    I hope most people know that computers have storage like this one:

    Seagate-hard-disk-drive

    Storage is an essential part of computers. But just as things change when batteries power whole cars, what is the best storage changes when computing moves into the Cloud.

    It's not as easy to personally experience the impact of storage as it is to experience the impact of a battery while test-driving an all-electric car. But the change in scale is every bit as dramatic. While your department's computers might fit in a closet or small room, Cloud data centers go on for acre after acre.

    Google data center
    It doesn't make much difference if your department's system takes one rack or two — but if a given storage system requires two acres to do its job when a Cloud-sensitive one can be better while taking just one acre, that makes a big difference.

    When you operate on a Cloud scale, factors that don't matter much at a smaller scale become hugely important. The important factors are remarkably similar to those of a battery:

    • How quickly can you store and retrieve data? If it's too slow, you'll have to buy more to get the speed you need.
    • Can you fill it completely with data and still have it perform?
    • How many years can you use it? How often is service required, and how costly is the service?
    • Size and power consumption are key factors. Space and power may not seem like large factors, but on a per-acre scale, they are huge.

    When you first learn about storage, the only question you ask is how much it costs to buy a given amount. As you get educated, you find these other factors are just as important.

    X-IO Storage

    Oak invests in the maker of the best storage for large-scale data centers, X-IO Storage. Just as Boston Power didn't invent the chemistry, X-IO doesn't make the basic storage devices. Just as Boston Power has made the chemistry practical for car-scale application, X-IO has scores of patents for making large numbers of storage devices (spinning disks and SSD's) safe and practical for acre-scale applications: dense, low-power, long-lived, low-maintenance, fast and effective at taking and giving data.

    For example, most storage systems treat their disks as throw-away items: devices that often fail and must be replaced frequently. Typical rates are amazingly high, resulting in substantial labor, replacement and error costs. The Google video below illustrates the consequences of this well; start at 2:42.

     

    The Managed Reliability aspect of the X-IO technology reduces storage device failure rates by over 100 times. This is such a huge advance that disks can be sealed in their enclosures, which leads to other benefits.

    The key thing is that X-IO storage devices are best-in-class at all the things that matter in storage: storage density, long life, reliably high performance, low power and environment.

    Conclusion

    Whether it's batteries that make electric cars practical or storage that makes acre-scale data centers affordable, Oak invests in companies that develop fundamental, industry-changing technologies over many years, and sees those companies through to success.

  • Three Most Important Factors in Storage: Performance, Performance and Performance

    We all know about the importance of location in real estate. What's the equivalent in storage? Performance. It's the one thing that you can't fix. When you look at storage, it should be what you look at first, second, and last.

    Location and Performance

    The three most important things in real estate are location, location and location. Real estate agents may talk about the attractive paint job, the great landscaping and the new roof. But if you don't like them, you can fix them. The one thing you can't fix? The house's location. That's why it gets the top three slots.

    What about storage? You'd never know it from storage vendors, but the three most important things in storage are performance, performance and performance. You can fix most everything else with server-based software. You need replication? Your database can do it all by itself. You think thin provisioning is great? It's cheaper and better to get it with a VM. But performance? You want that go-cart to do 100 mph … uphill??? Fuhhgeddahbouddit, buddy. However fast you're going is as fast as you're going to go.

    The Storage Performance Problem

    We know there's a performance problem because of the fundamentals of spinning disks. We know there's a problem because vendors are coming out with expensive solutions that emphasize performance, and companies are going public based on storage performance; oddly enough, in the case of Fusion IO, they don't even deliver real storage, just a board that goes into a server! But people are so desperate for performance, they try it anyway. One company has even come out with an affordable solution that just screams performance.

    The biggest thing that convinces me there's a problem comes from the leader in server consolidation and virtualization, VMware. I went through their best practices in configuring virtual storage, which tells you all you need to know. They have four best practices. All four best practices amount to the same thing: make sure you get enough performance from your storage! Their best practices are explicit: you should buy storage not based on capacity, but on performance.

    Here they are:

    • Configure and size storage resources for optimal I/O performance first, then for storage capacity. –> Don't buy TB, buy iops (i/o's per second).
    • Aggregate application I/O requirements for the environment and size them accordingly. –> When you buy iops, make sure you look at all your applications.
    • Base your storage choices on your I/O workload. –> In case you didn't get it yet, pick storage based on iops!
    • Remember that pooling storage resources increases utilization and simplifies management, but can lead to contention. –> Remember that using a classic SAN can make storage performance worse, so don't be fooled.

    According to VMware, there are four most important factors in storage: performance, performance, performance, and performance!
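    VMware's advice can be made concrete with a little arithmetic: size for IOPS first, then check capacity. The sketch below is illustrative only; the numbers (180 random IOPS for a 15K-RPM drive, a 9,000-IOPS workload) are assumptions for the sake of the example, not vendor figures.

```python
import math

def drives_needed(required_iops: float, required_tb: float,
                  iops_per_drive: float, tb_per_drive: float) -> int:
    """Number of drives needed to satisfy BOTH the performance (IOPS)
    and the capacity (TB) requirements."""
    for_performance = math.ceil(required_iops / iops_per_drive)
    for_capacity = math.ceil(required_tb / tb_per_drive)
    return max(for_performance, for_capacity)

# Assumed workload: 9,000 aggregate IOPS across all applications, 10 TB of data,
# on drives that each deliver ~180 random IOPS and hold 1 TB.
n = drives_needed(required_iops=9000, required_tb=10,
                  iops_per_drive=180, tb_per_drive=1)
print(n)  # 50 — performance, not capacity, sets the drive count
```

    Ten drives would hold the data, but fifty are needed to deliver the IOPS. That's exactly VMware's point: buy iops, not TB.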

    Does anything but performance matter?

    Of course it does. Do you want to lose your data when a server fails? You'd better not buy server-based storage. Do you want your performance to drop to a crawl when there's a disk fault? You'd better ask how frequently that happens, how badly RAID re-builds impact performance, and for how long (hours of severely degraded performance taking place weekly is not unusual in a large system). But performance still takes the top 3 slots. It's just like location: if there are two equally-well-located houses, you avoid the shack with the outhouse and buy the comfortable, modern house. With storage, if you have two systems with enough performance to meet your current and future needs, you pick the one that isn't a board stuck in a server, and the one that has enough affordable capacity.

    Conclusion

    Performance is more important, by far, than any of those silly features the SAN vendors love to rattle on about. But in a post-SAN world, performance is front and center. The bigger disks get, the worse performance gets. The more you virtualize and consolidate your servers, the more performance you need. In a word, you need SSD's, because they're fast storage. But they're expensive. So an appropriate blend of SSD's and spinning disks would be great, fast but affordable, if they really were in a seamless pool of storage. That's the Xio Hybrid ISE in a nutshell. In a performance-starved world, it's food for the hungry — food you can actually afford to buy.

  • Storage: The KISS Principle in a Post-SAN World

    The dominant model in storage today is the SAN (Storage Area Network), a.k.a. "storage mainframe." While "SAN" makes you think of a storage version of a LAN (Local Area Network), it is far from it. In fact, SAN's are monolithic, mono-vendor, administratively heavy-weight, burdensome beasts. They are laden with "must-have" features that sound good, but which are mostly crippled versions of functions performed more effectively, at lower cost, by server software.

    Most storage vendors make it clear what they mean by the KISS principle: "Keep it SAN Storage." Why? more revenues, more profits, more high-margin maintenance — in general, more for the vendor and less for the buyer. It's time for buyers to revolt. It's time to enter a post-SAN world of simple effectiveness. It's time for storage to be fast, scalable and affordable. It's time for a return to the original meaning of KISS: "Keep it Simple, Stupid." In other words, it's time for the Xio ISE storage blade.

    The Controller is the Problem

    Storage buyers buy, well, … storage. Duhh. If they didn't need storage, they wouldn't be talking with storage vendors. And it's true that storage vendors deliver storage. But what do they sell? Anything but storage. It sounds strange, but it's not — since every storage vendor sells storage, how can you tell one storage vendor from another? Only by talking about something else. Today it is standard practice for storage vendors to emphasize the importance of features that are somehow related to storage, but aren't actually storage.

    This brings us to the controller. Every traditional storage vendor has a monolithic controller. The controller is an expensive box that sits between the servers and the actual storage. The controller is where all these storage-related features are implemented. The game every storage vendor plays is to make you want what's in the controller, because whatever it is, only that vendor has it. The controller is what makes you buy one vendor's terabytes rather than another vendor's. The controller is where vendor differentiation is. Last but not least — the controller is where vendor profits are.

    What about those "must-have" features in the Controller?

    I would love it if someone de-constructed them all, publicly and effectively. But let me start from a simple observation. In my job, I get to closely follow the technology decisions and deployments of dozens of growing, leading-edge companies, and I get a quick look inside many more. What I find says volumes about the status of all those "value-adding" features of storage systems: nobody uses the fancy, "value-adding" features of storage systems — they just use storage! As in plain old storage, like reading and writing.

    The reason all these leading-edge people just use plain-old KISS storage is pretty simple: they focus on building technology that supports their business. The value is delivered by applications that use files and databases. Files and databases need storage. Give them storage and you're done!

    Just this morning I talked with some terrific folks who operate a leading edge internet advertising service. They already handle monster volumes out of multiple data centers. When orders are placed, the orders need to get out to all the ad servers. A perfect application for that popular feature of SAN controllers, volume mirroring, right? Wrong. There are at least a handful of reasons why this would be a terrible solution. But it doesn't matter, because they get the job done, effectively, quickly and well, with mysql's replication facility. Their application puts the ad opportunity into the master database, which replicates it to read-only slaves. Problem solved.
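    The pattern those folks described, writes to the master and reads from replicas, can be sketched in a few lines. This is a hypothetical illustration of read/write splitting on top of MySQL replication; the connection names and the crude SQL-prefix routing rule are my assumptions, not their actual setup.

```python
class ReplicatedDB:
    """Route writes to the master; spread reads round-robin over replicas.
    MySQL replication (configured separately) copies master writes to replicas."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master: str, replicas: list[str]):
        self.master = master
        self.replicas = replicas
        self._next = 0

    def route(self, sql: str) -> str:
        """Return the connection name that should execute this statement."""
        if sql.lstrip().upper().startswith(self.WRITE_VERBS):
            return self.master
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica

db = ReplicatedDB("master-db", ["ad-server-1", "ad-server-2"])
print(db.route("INSERT INTO ads VALUES (...)"))  # master-db
print(db.route("SELECT * FROM ads"))             # ad-server-1
```

    The application writes once, the database layer fans the data out, and no SAN controller is anywhere in sight.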

    I see this pattern everywhere: the problems that SAN vendors use to sell their controllers are better solved, more simply and with less expense, with applications and server software.

    Introducing the no-controller SAN: post-SAN Storage

    What does that mean? If you remove the controller in a SAN, what are you left with?

    With the old SAN vendors, you're left with a big, expensive pile of storage you can't use. In the post-SAN world, exemplified by the Xio ISE, you've got what amounts to storage blades. You can direct-connect a storage blade to a server or to a couple of servers, or you can network a set of blades together with a set of servers.

    Each ISE storage blade comes with a full complement of storage capacity and performance. If you need a truly giant pool of storage, you can combine any number of them into a single volume using server-based software. But more likely you'll want to share them among a pool of servers, which can easily be automated using RESTful calls from a UI or script.

    Then there's the issue of storage performance. Bigger disk capacities equal shrinking performance. That's why Fusion IO and similar companies are so hot. Note that Fusion IO doesn't have controllers or any of the fancy (= useless) features that come with them. People are snapping them up anyway. Maybe that post-SAN storage is worth looking into … you could save a bunch of money, keep things simple and, above all, deliver the performance your business demands. If you've got the kind of problem Fusion IO says they solve (a storage performance problem), you should do yourself a favor and discover how Hyper-ISE delivers the performance you need at a price you can afford. And Hyper-ISE gives you performance while keeping your data safe and providing full fail-over, unlike Fusion IO, which isn't really "storage" at all — it's just a board in a server, so when anything about that server goes wrong, your data goes wrong with it.

    Conclusion

    Most computer storage today is anything but simple, scalable or affordable. Intelligent storage buyers are increasingly buying the storage that they need and only the storage they need. They are saving money, time and trouble by refusing to buy expensive controllers that are laden with features they already have in the server, features they just don't need. These buyers have effective, simple, scalable storage that costs less, performs better and lasts longer than old-style SAN's. Welcome to the post-SAN world, where the sun shines, things are simple and life is so good, it makes you want to KISS someone in a new way.

  • Fusion IO and Xiotech Hybrid ISE

    Congratulations to Fusion IO for pulling off the most visible, most large-scale event in the storage industry in recent months: their highly successful IPO. And congratulations to Xiotech for pulling off the storage industry's most important event in recent months: the GA release of Hybrid ISE storage blades.

    Fusion IO

    Fusion IO went public and traded up in an over-subscribed offering. The excitement about the company and its prospects is justified: the storage industry has been basically ignoring its large and growing performance problem for years now. The industry has toyed with a variety of ineffective strategies involving the obvious alternative technology, SSD, but done nothing to move the performance needle.

    The brilliance of Fusion IO is to come out with a new category of product; they call it "memory storage." It's a board that you typically put into a server.

      Iodrive

    This is the brilliant part! The people who run applications have a huge performance problem, and the storage "experts" refuse to solve it for them, so Fusion IO gives them something in "their world" — a server board — that they can use, along with application changes, to solve their problem. It's a classic strategy: sell to the guy who actually has the problem.

    Here's the other thing I like about Fusion IO: their success clearly and unambiguously demonstrates that applications are experiencing storage performance problems, and that these problems are house-is-burning serious.

    Xiotech Hybrid ISE Storage Blade

    Xiotech's release for GA (general availability) of the Hybrid ISE last week is the most important recent event in the storage industry.

    The success of Fusion IO clearly demonstrates that increasing numbers of applications are experiencing house-is-burning performance issues. Most of the traditional SAN storage vendors have responded by simply putting SSD drives into their existing products. In all cases, the result has been modest upticks in performance coupled with drastic reductions in capacity and dramatic increases in cost. Not a good combination, and not popular with customers.

    The Fusion IO approach works well for the small number of companies who have just a couple of hugely important applications that are completely under their control, and who therefore don't mind violating the normal principles of storage management and re-writing their applications to take advantage of server-based storage. In fact, I've just described exactly the situations of Facebook and Apple, who between them account for more than two thirds of Fusion IO's recent revenues!

    What about the vast majority of companies who have pressing performance problems and don't want to, or simply can't, break the bank and madly rewrite their applications? What they need is a new kind of storage they can just plug in to make their problems disappear: storage that is affordable and many times faster than what they have. In other words, storage that is, well, storage, that looks and acts like normal storage in every way … except that it is hugely faster, like five to ten times faster.

    Hybrid ISE, that's your cue…

    Hybrid ISE

    The Xiotech Hybrid ISE. The right combination of HDD, SSD and RAM cache in a 3U rack-mount package to provide the excellent performance and capacity your applications need. Just by plugging it in. Check it out!

  • HDD Capacity Improves While Slowing Down. Help!

    Here's a typical hard disk drive (HDD):
    HDD

    This is where your data spends most of its time.

    The amount of data you can put on each HDD has gotten exponentially better over time. For example, this chart (all credits: Wikipedia):
    Hard drive capacity over time (chart: Wikipedia)

    shows hard drive capacity in GB over time.

    Back in the mid-1980's, HDD's held about 10MB. By the mid-1990's they were up to about 1GB, an increase of 100 times. Now they're up around 1TB, an additional increase of 1,000 times. Amazing, particularly when you consider that the average HDD was also shrinking in physical size over that time:
    5.25_inch_MFM_hard_disk_drive

    Right: an older 5 1/4" HDD with 110MB; left: a 2 1/2" HDD with 6.5GB.

    Suppose the size of your main customer database is 1TB. In the mid-1990's, it would have required about 1,000 HDD's to store the data, and today you can put it all on a single HDD: it's wonderful, yes? Well, maybe not. The problem is using your data.

    The problem is simple to understand. Here:
    HDD inside

    is the inside of a HDD. Your data is stored on platters. It's read and written by a head, which is at the end of the long arm that ends near the center of the platter. In order to read any piece of data, the arm has to position the head on the right track of the platter (i.e., move it towards the edge or towards the center of the platter), and then has to wait until the spinning platter brings the data to the head.

    Here's the killer: these times (called seek times) haven't improved much in the last 25 years! In the mid-1980's they were around 20ms; today, most HDD's are around 10-15ms, and the very fastest are around 3ms, around a 7X improvement at best.

    These trends (more data in less space, and unchanging seek time) are likely to continue.

    Improvements in HDD's since the 1980's:

    Speed: 7X
    Capacity: 100,000X

    The bad news isn't over yet. Remember how your customer data used to take 1,000 HDD's? That meant that you had 1,000 HDD's all available to read and write your data. Now you've got just one — and it's hardly any faster than HDD's were 25 years ago!

    What does this mean? If you want to not just keep your data, but also access and update it, you've got a big problem today, and it's getting worse.
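    A back-of-envelope calculation shows how big the problem is. The formula below (random IOPS ≈ 1 / (average seek time + half a rotation)) is the standard rough estimate for a spinning disk; the 10ms seek and 7,200 RPM figures are illustrative assumptions.

```python
def drive_iops(avg_seek_ms: float, rpm: int) -> float:
    """Rough random IOPS for one spinning disk:
    one I/O takes a seek plus, on average, half a platter rotation."""
    half_rotation_ms = (60_000 / rpm) / 2  # ms per rotation, halved
    return 1000 / (avg_seek_ms + half_rotation_ms)

iops = drive_iops(avg_seek_ms=10, rpm=7200)

# Mid-1990's: 1 TB spread over 1,000 x 1 GB drives, all working in parallel
print(round(1000 * iops))  # tens of thousands of random IOPS in aggregate
# Today: the same 1 TB on a single drive
print(round(iops))         # ~70 random IOPS, total
```

    Capacity per drive went up a thousandfold, but the random I/O rate for that terabyte of data collapsed by roughly the same factor.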

    How are people responding to this situation? Simple: they are turning to SSD's. SSD's solve the speed problem! But, as provided by most vendors, SSD's introduce a whole host of new problems. Today, storage buyers confront an extremely unpleasant choice: buy extremely expensive SSD's that lack fundamental storage features, or buy affordable HDD's that simply aren't fast enough. Yuck.

    There is an alternative. It's a good one. It combines everything you like about HDD's with the speed of SSD's in a novel hybrid combination. I can't stop thinking about it. Check out the Xiotech hybrid ISE.

  • Xiotech’s Hybrid ISE and Storage Performance

    Everything gets better with computers, every year. For the same price or less we can get a bigger screen, a faster processor, more memory, more storage, a faster connection, a lighter device, or a combination of the above. We are so used to this, we don't question it or think about it. It's just the way things are.

    HOWEVER, there is a BIG, FAT exception to this generally wonderful trend. The exception is something most consumers don't think about, but it's having an increasingly dramatic impact on the world of IT professionals. In a world of increasing expectations, the thing that is getting inexorably WORSE every year is: storage performance.

    Worsening Storage Performance: the Market Speaks

    It's pretty easy to tell that storage performance is going to the dogs by noticing the following:

    • The hottest subject in the storage world is solid state disk (SSD). What is SSD? In a nutshell, it is a new version of storage that is WAY more expensive than regular storage. Since when do people get excited about something that's more expensive than everything else? Simple: it's faster. A lot faster. If people didn't have a speed problem, they wouldn't pay a micro-second's attention to this expensive new form of storage.
    • One of the hottest companies in storage today is Fusion IO, which delivers SSD in a server card. Of course Fusion IO's products are incredibly expensive (you already knew that). You have to open up your server to plug it in. You have to change your application to take advantage of it. When the server breaks, the storage is inaccessible. But it's fast! Fusion IO's story could only sell to buyers who are desperate for performance. There must be a lot of them out there.
    • Data center managers are busily virtualizing their servers, and getting major cost and manageability benefits from running applications on a smaller number of physical servers. But they are avoiding virtualizing their most important, mission-critical applications. Why? Because running applications on a smaller number of servers concentrates their demands for storage, making a really bad problem even worse.
    • Traditionally, the unit of measure for storage is (of course) how much it stores, the number of GB or TB it holds. But buyers are increasingly buying more capacity than they want in order to get the performance that they need. There is talk of "short stroking" and "over-provisioning." Performance used to be a given in storage; now you have to really pay attention.

    From the way people are acting and the market is evolving, it is safe to conclude that, contrary to everything else in the world of computing, storage performance is swirling its way down the toilet.

    Worsening Storage Performance: the Fundamentals

    In a world of constant improvement, why is storage performance alone the stand-out? The reasons are simple:

    • While each individual disk holds more data than it did years ago, its ability to access the data has only improved a little. It's as though storage rooms were doubling in size every year, but the designers kept putting the same dinky one-person-at-a-time doors on them. Sure, any one room stores more and more stuff, but you can't put stuff in or take stuff out any faster than before, so what's the point? Every time the room gets bigger but the door stays the same, the storage performance problem gets worse.
    • While servers have advanced to nicely scalable blade formats, storage systems continue to be designed as huge monolithic, mainframe-like behemoths. It's easy to grow the capacity of mainframe-like storage systems, but what you'd really like is a blade format, so that when you added capacity, you added performance so you could actually use the added storage.

    Worsening Storage Performance: the Holy Grail

    What would solve the problem? How about:

    • A blade format for storage, so that when you added capacity, you added the performance required to actually use it.
    • A seamless hybrid of SSD and traditional storage, so that you could get the capacity and the performance you need at a price you can afford.
    • A real storage format and interface (unlike a server board) so you can just plug it in and go.
    • A real set of storage features (like replication and fail-over) so you're not taking a step backwards.
    • While we're dreaming here, why not throw in dramatically better reliability, density, power consumption and manageability than any product on the market?

    OK, storage gurus, those are your specs. That's what the market would really like to have.

    Worsening Storage Performance: The Holy Grail is the Xiotech Hybrid ISE!

    There's nothing more to say here. There is a problem. Everyone in the industry knows it. It shows in market trends. It makes sense in terms of fundamentals. What a great solution would be is obvious. The only remaining questions are: does the hybrid ISE exist? Does it meet the requirements listed above?

    Yes, the hybrid ISE meets the above description. The hybrid ISE is currently in limited engineering release, and is being shipped to the companies at the head of a rapidly growing list of early-adopter customers. It's built on the solid foundation of thousands of ISE blades already in the field. It contains dozens of innovations to make it all work the way customers want it to work: simple, fast, and affordable.

    • How simple? It plugs into the same plug storage normally plugs into. No changes to applications or anything else required.
    • How fast? Depending on the application, 4 to 10 times faster than normal spinning storage. That is a VERY large fraction of the speed of SSD-only solutions, plenty of speed for most needs.
    • How affordable? Roughly a 1/3 premium over pure spinning storage. VERY affordable.

    Yes, I'm biased, as I have disclosed before. But the "bias" comes from insider information, and I can tell you that this is a rocketship in lift-off mode.

  • Databases and Applications

    When databases were invented, they solved a huge problem that couldn't be solved any other way. Anyone who cares to look can see that the original problem that caused us computer programmers to invent databases has largely gone away. So why is it exactly that application programmers reflexively put their data in a database? In a surprisingly wide range of cases, it sure isn't because of necessity. Could it, perhaps, be nothing but habit and the little-discussed fact that change happens in software at roughly the same rate that change happens in glaciers?

    From the beginning of (computer) time, instructions have needed to be in memory to be executed, and data has needed to be in memory to be operated on by instructions. The memory in which instructions execute and fiddle with data has always been way faster and way more expensive than the large, slow but cheap places they are put when they're not in memory (call it storage, whether the storage is punch cards or tape or disk). It was this way at the beginning of time and it's true now.

    Think of memory as your work table. Eons ago, your work table was really tiny, like this:

    Tiny table
    You can hardly fit anything at all on it! So you'd better have a really big storage place to keep all your stuff, like a pantry:

    Pantry

    OK, that's cool. You've got all your stuff in storage, but you can only work on it when it's in memory (on the table). What do you need? You need to get the stuff you want to work on now from the pantry, and you need to put the stuff you're done with back in the pantry. In other words (if you're in the world of computers) … you need a database!

    The very most basic function of a database is pretty simple: its job is to shuffle your data between memory and disk. It's also nice if it keeps everything straight, avoids dropping bits on the floor, and cleans everything up when something goes wrong during the shuffling.

    That was then. But things have changed. Remember Moore's Law? The amount of memory available to us at surprisingly reasonable prices has grown hugely. Exponentially. Our work tables now look more like this:

    Giant table
    And our pantries? Well, they've grown a bit too:

    Giant warehouse
    So how much data do you have? Run the numbers. It goes without saying that it's going to fit in storage. But how about that work table? Here's the question you have to think about:

    Will all your data fit on the work table (i.e., in memory)?

    With 64GB and even 256GB of memory available at reasonable prices, the answer is often YES! It will!

    Hmmmm. What was it the database does? If all my data fits in memory, why was it I needed that database???
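    A quick back-of-envelope check makes the question concrete. The record count, record size and headroom fraction below are illustrative assumptions; run it with your own numbers.

```python
def fits_in_memory(num_records: int, bytes_per_record: int,
                   ram_gb: int, headroom: float = 0.5) -> bool:
    """True if the dataset fits within the fraction of RAM we allow it
    (headroom leaves room for the application, OS, indexes, etc.)."""
    dataset_bytes = num_records * bytes_per_record
    budget_bytes = ram_gb * 2**30 * headroom
    return dataset_bytes <= budget_bytes

# Assumed: 50 million customer records at ~1 KB each (~48 GiB)
# on a server with 256 GB of RAM, half of it budgeted for data.
print(fits_in_memory(50_000_000, 1024, ram_gb=256))  # True
```

    If that prints True for your workload, the work table holds everything, and the database's shuttling job disappears.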

    I know, I know. Databases can be nice for reporting and data analysis and "persistence" and a few other things. I'm not saying you never use them. But for your real application, the one that takes user requests and responds to them, if you don't need to have the database shuttling stuff between the work table and the pantry… Hmmmm.

  • The Xiotech ISE and Technology Fashion

    We all know what fashion is. Think of Vogue Magazine, or impossibly tall, thin women walking in that special way down the elevated runway, wearing something no normal person would be able to wear, or would want to wear if they could.

    SAIC_Fashion_Show_2008

    But fashion extends way beyond women's clothing. Let's start with men's clothing: how many guys wear suits and ties to the office today? Then cars — how many modern cars have those giant fins that were popular in the '60's? The kind of popular music you like dates you at least as much as wrinkles on your skin. The more you think about it, the more you realize how pervasive fashion is.

    Fins_close_up

    "Technology is a counter-example," perhaps you say. "It's bits and bytes and silicon, no fashion there!" Well, that's true, except that it's people who buy the technology, and people are fashion-driven creatures. Let's face it: the cool kids who once drove sporty cars now pull out their iPhones at the slightest excuse. Waiting on line to see the Beatles; waiting on line to get the latest iPhone — what's the difference? They're both fashion-driven fads.

    Iphone3g_line_2

    "I concede that consumer technology is fashion-driven," perhaps you admit. "But hard-core computing technology, where nerds are building things for nerds; how can that possibly be driven by fashion?" I fully concur that no nerd techie would ever admit that his choices, selections and designs are driven by fashion, not even to himself. But all too often, that's exactly what's happening. The techie nerd who comes up with a design approach for solving a problem almost always prides himself on originality and foresight, without any awareness of how fashion-determined his most important decisions are. These decisions are often not made consciously; they are assumptions. "It's not worth discussing, of course we'll take approach X," the techie would respond in the unlikely event that the assumption is questioned — by some "ignorant" (which is tech-talk for "unfashionable") person. Just to be clear: we're not talking about how nerds dress; we're talking about how nerds think.

    Gisele_nerd

    There are examples in every field of computing technology. The Java/J2EE fad during and after the internet bubble is an obvious example, and before it client/server computing was a huge fad.

    There is a clear example in storage technology today. The fashion is as clear and obvious as short skirts, and moreover is explicitly stated by its adherents: the fashion is that storage functionality should be provided as a body of software, independent of any hardware embodiment, and without regard to any particular storage hardware. Companies that previously sold storage hardware no longer have real hardware design functions — all they do is bundle their software with hardware provided by others and sell the combination. The most popular form of this approach is to buy drive bays from an OEM and connect them to controllers built from specially configured off-the-shelf processors; 3PAR and many others do this, for example. IBM's XIV implements a variation on this theme, using all-IBM commodity server hardware. While there are still loads of dollars being spent on old-style, hardware-centric storage systems (think EMC), engineers building new storage systems are uniformly following the software-centric fashion.

    In this sense, the Xiotech ISE is decidedly unfashionable. The ISE was invented at Seagate, in response to CEO Steve Luczo's desire to create a storage product of higher value than spinning magnetic disk drives. The idea was simple: build a fixed-format super-disk, with many Seagate drives, intelligence, etc. It would be bigger than a disk, but smaller than a SAN. It would emphasize basic storage functions (write, protect, read) and leave the "high level" storage functions to the SAN vendors.

    What is interesting is that Steve Sicola and a group of other storage industry veterans ended up working side-by-side with Seagate engineers, something they never would have done at a SAN vendor. Sicola and his team knew the evolving fashions in storage quite well: ignore the details of the drives, that's "just storage." Build fancy high-level functions.

    But since they were stuck with the drive engineers, they did something unusual: they actually listened to them! They learned about the amazing functions the engineers embedded in the drives that all the SAN vendors ignore. They learned how annoyed the Seagate engineers were at all the drives marked "bad" by SAN's, the vast majority of which are actually good; they learned about error codes and performance details that all the other storage engineers in the industry were studiously ignoring.

    Before long, they got absorbed in what you could really do once you really knew the hardware. And, being good nerds, they invented a bunch of stuff, like how to virtualize over a fixed number of heads so that top performance was maintained even when the disks are filled up. They also invented a bunch of stuff that provides major, persisting advantages as new drives with higher capacities come out.

    Since I know a fair amount about Xiotech's ISE, I want to go on and on about it. But I won't, because the point of this post is technology fashion. The purpose of bringing up the ISE is that it's a great illustration of the power of fads and fashion in technology. Any normal group of self-respecting storage nerds would have built a completely hardware-independent storage system. As such, it may have had nice features, but it would be pretty much like all the others in terms of its basic functions of reading and writing disks. But because these storage engineers were sequestered with hardware types and had a unique mission imposed from above, they did something very rare: they built a leading-edge storage system that is decidedly unfashionable. Because the engineers actually paid attention to the hardware, the ISE does things (performance, reliability, density, scalability, energy use, etc.) that no other storage system on the market today does, even though it uses the same disks available to others.

    Fashion is, of course, a relative term. Fashion is one thing at diplomatic receptions, and quite another hiking in the wilderness or in a war zone. What is appropriate for one doesn't work for the other. Shoes that are appropriate for a salon can cripple you in the woods.

    Well, it turns out that the modern storage fashion of ignoring the storage hardware may be acceptable in salon-type environments (where appearance and style are important but there's no heavy lifting to be done), but is as crippling as high heels in the I/O-intensive environments that are increasingly found in virtualized, cloud data centers. The ISE is like storage fashion for war zones of data, for data-intensive applications like virtualized servers, where the applications are concentrated in a small number of servers, all fighting to get their data. Most storage systems know how to hold their tea cups and conduct refined discussions and other things that matter when getting your data sometime today would be nice, thanks.

    A-Tea-Party

    But when you've got a crowd of rowdy, tense applications all of whom are demanding their data NOW, perhaps more of a war-time style is appropriate; that's what the "unfashionable" nerds at Xiotech created in the ISE.

    An-Angry-Crowd

  • Success with New Technology and Mouse Madness

    You’ve got a wonderful new technology. It works. Customers benefit from it. Game over, right?

    Wrong.

    Sadly (for you), the world does not revolve around you. The world is not on constant alert waiting for better mousetraps to appear somewhere so that the world can beat a path to your door.

    Let’s take the B-to-B case. If you’re a big technology buyer, you’ve got better things to do than constantly flirt with new vendors. To the contrary, you want to find ways to cut the number of vendors you work with. If your existing vendors aren’t screwing up, if their products are good enough and their price is good enough, chances are you’ll save time and stress and feed them more orders as you grow. You may take meetings from wanna-be’s, mostly for your general education; it’s a waste of the wanna-be’s time, but that’s not your problem.

    The reality is that most big technology buyers have a barn of designated winners, all of whom are “approved vendors.” The vendors have won the design competition, and are now baked into the buyers’ expansion plans.

    The only thing that is likely to change this situation is a combination of buyer pain and incumbent supplier inadequacy. While some technology buyers are motivated by opportunity, most consider a vendor change only when they feel pain. The pain is nearly always driven by a need to reduce costs. In order to reduce costs, the vendor needs to do X, and if it can do X, it needs to be able to do it for a particular price.

    Since this sounds abstract, let me illustrate it with a real example from Xiotech, my favorite storage vendor.

    Xiotech has invented a new mousetrap, the storage blade, which delivers storage in better ways than existing storage products, sometimes dramatically better. Is it really, objectively better? Yes. Can buyers get more done and spend less money? Yes. Does that matter to most buyers? Do storage buyers beat a path to Xiotech’s door? By now, the VP of Sales at Xiotech has accepted the fact that they do not. Having hoped for the easy life of taking orders, he is resigned to having to go out and sell stuff. The world feels cruel and unjust, but that’s how it is.

    Xiotech has a better storage mousetrap, and the world has reacted the way it always reacts to better mousetraps. So what does Xiotech do?

    It’s pretty simple, actually. If you had a mousetrap that was really and truly better than the other guys’, wouldn’t it make sense for you to find places that were totally, horribly overrun by mice? Not just places that have mice – places where the mice are in charge; places where the mice start trying to invent “peopletraps” because they think they’re infested. Places, in other words, where the inadequacies of the best existing mousetrap technology have been put to the test and come up short – miles short.

    No one expects the incumbent mousetrap vendors to walk away from the business that has served them so well for so many years. They are bold and shameless, and will come up with all sorts of arguments. They will argue that if you have a mouse problem, you obviously haven’t bought nearly enough of their wonderful traps; the solution is to buy more! If that doesn’t work, they will wave their arms furiously about the new trap that is about to come out, avoiding any mention of the increase in maintenance charges for existing traps. Meanwhile, you walk around and see the mice dancing on the existing vendor’s obviously ineffective traps.

    Pd-mice-sample
    If you’ve really got a better mousetrap and are looking for motivated buyers, the place with the super-sized, invasion-from-Mars MOUSE problem is your candidate. Most people buy the same mousetraps they’ve always bought. They may be lousy traps, but they just don’t care. It doesn’t matter enough. But the guy with the MOUSE problem, HE CARES. He pays attention to mousetrap technology, because he knows he’s got a whole pile of mice that need catching REAL BAD.

    The cruel fact of life for inventors of better mousetraps is that, if you can’t find anybody with SERIOUS mouse invasions, you are hosed. Your superior trap will do nothing but waste the time and money of everyone involved. But even if you can find places where the mice are dancing in the halls with impunity, your mousetrap had better be SERIOUSLY better than the incumbent. If the incumbent knocks off a mouse or two but leaves dozens aspiring for a spot on Dancing with the Stars, while yours knocks off two or three or four but still leaves most of the mice dreaming about skimpy costumes and getting “ten’s” from the judges, you may as well hang it up now. To make a long story short, you had better:

    • Find a truly worthy mouse problem.
    • Solve it. Really solve it. No kidding, nail it!
    • Solve it for a reasonable price. If you’re too expensive, you may sell to a few desperate buyers, but you’ll never become the industry-standard mousetrap.

    Back to Xiotech storage. Xiotech is focusing on buyers who have the kind of problems that Xiotech storage, with its high performance and linear scalability at a reasonable price, is uniquely positioned to solve. Just as mousetrap guys look for places with too many mice, Xiotech looks for places that can’t get at their storage. Just as the incumbent mousetrap guys boldly say “just buy more of my [crummy] mousetraps,” the incumbent storage vendors say “just buy more of my [slow] storage [that gets even slower as you fill it up].” In spite of such incumbent resistance, smart buyers with mouse madness buy better mousetraps, and smart buyers with storage bottlenecks buy better storage.

    Success with new technology is usually only achieved when:

    • there are pockets of buyers who have SERIOUS pain;
    • the pain is worth serious money;
    • you can find the people with the pain;
    • you can address their pain;
    • your technology can really make the pain go away;
    • ideally without charging more money.

    Failing this, I guess you could try waiting and hoping that the world will beat a path to your door…


  • Moore’s Law, Less’s Law and Storage

    Everyone who has anything to do with technology knows about Moore's Law. If you don't know it in detail or by name, you know it because you have a set of expectations about technology. You expect that whatever is available this year at a given price, next year you'll be able to get more of it and/or it will be cheaper and/or it will be smaller. This is Moore's Law in effect: computer-based stuff gets physically smaller, cheaper, faster; it can hold more and do more at the same price.

    Moore's Law applies in spades to CPU's, memory, displays and even networking. All these things get amazingly better seemingly just by the passage of time. It even applies to disk storage: for example, the 1MB, 12 inch removable disk drives of my early programming days have been supplanted by tiny 300GB drives hidden somewhere inside my laptop.

    While everyone more-or-less knows about Moore's Law, not so many people know about "Less's Law." Maybe it's because Gordon Moore was famous in his own right as a co-founder of Intel, while Seymour Less is famous only for confounding people who think that Moore's Law results in nothing but more and more good things happening. Seymour was fond of saying things like "the more Moore's Law expands disk capacity, the less Less's Law says you can access that capacity." In a time of belt-tightening, managers everywhere are saying "do more with less." Seymour Less is the guy who originally pointed out that, when it comes to disk, you have to find a way to "do less with more."

    I've talked about the impact of Less's Law before, called it something boring like the "performance gap" in storage, and pointed out how it impacts the move to server virtualization. But I've been realizing recently that the implications of Less's Law go way beyond computer storage. The combined impact of Moore's Law (making most computer things better, faster, cheaper) and Less's Law (making storage less accessible) has a profound impact on software architectures. I still find ten-year-old software architectures being touted as "advanced," when they're anything but. On the other side, I see programming groups who are under pressure to deliver good stuff quickly adapting to the new world, and feeling their way to styles of building software that more or less reflect the combined impacts of Moore's and Less's Laws.
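    The divergence between the two laws is easy to put numbers on. Here is a rough sketch, assuming per-drive capacity compounds at something like 45% a year while random IOPS per drive stays flat; both rates are round, illustrative assumptions, not industry data:

    ```python
    # Illustrative sketch of "Less's Law": per-drive capacity compounds
    # while per-drive random IOPS stays roughly flat, so the ability to
    # access each gigabyte erodes year over year.
    capacity_gb = 10.0   # a drive from a dozen years ago (assumed)
    iops = 100.0         # random IOPS per drive, roughly constant (assumed)
    growth = 1.45        # ~45%/year capacity growth (assumed)

    for year in (0, 4, 8, 12):
        cap = capacity_gb * growth ** year
        print(f"year {year:2d}: {cap:8.0f} GB/drive, "
              f"{iops / cap:6.2f} IOPS per GB")
    ```

    Run it and the same drive family goes from 10 IOPS per GB to a small fraction of one: more capacity, less access — Moore giveth and Less taketh away.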

    This is a big subject. I hope to explore it in future posts.

  • Server Virtualization Problems and Xiotech ISE Storage Blades

    Server Virtualization (Hyper-V, VM-Ware, etc.) is making people aware of the growing crisis in SAN/storage performance. There is a solution: Xiotech ISE Storage Blades. Xiotech Storage Blades are the least expensive, least disruptive and most effective solution to the performance problems that virtualizing servers almost always seems to cause.

    Is there a problem?

    Anyone who has tried seriously deploying virtualization in a data center knows there is. A recent post by the ESG’s Mark Bowker makes the issue very clear.

    IT sells the business on the value of server virtualization and calculates the ROI on the back of a napkin during a lunch meeting. They get the green light. … Confidence is high and they start to target the next tier of applications, such as Microsoft Exchange, and suddenly realize that the 7200RPM SATA drives they purchased to support their entire virtualization deployment may not cut the mustard.

    The fear is that they drop this new Microsoft Exchange VM in place and start having major performance issues… [with] their existing virtualization investment that has the compute horsepower and storage capacity available, but not the storage performance. …

    As a result of all this and other similar scenarios, server virtualization deployments are stalling.

    The entire post is worth reading, but I’d like to point out the core of the issue: the typical virtualization environment “has the compute horsepower and storage capacity available, but not the storage performance.”

    Why is there a problem?

    The core reason there’s a problem is that as disks get more and more capacity, they don’t get any faster. Imagine a terabyte of data in the old world, on 10 disks. If you have 5 programs asking for parts of that data, chances are pretty good it’s going to be on a disk that isn’t busy right now, so the performance will be great. In the wonderful new world, that same terabyte of data fits on just one disk. So if you have the same 5 programs asking for data, there is a 100% probability that the one disk that has all the data is already going to be busy with someone else’s request. Here is a more detailed discussion of the performance gap in storage if you’re interested.

    So quite apart from virtualization trouble, storage is getting slower and slower.

    Why does virtualization make it worse?

    Server virtualization is an excellent thing. It helps you make more efficient use of your hardware. It does this by distributing a set of programs that need computing resource over a set of servers. This is just like running several programs on one machine at the same time, except that now we’re distributing a set of programs over a set of machines, and the programs can even require different operating systems (like Windows or Linux) and the virtualization still works. So instead of having 40 programs running on 40 machines, virtualization might let you run them on just 10 machines. A huge savings!

    The trouble comes when those programs start asking for data – pesky programs, always wanting data! Now, instead of requests coming to the storage from 40 machines, we have the same number of requests coming from just 10 machines – a 4 to 1 concentration of requests. The storage doesn’t “know” about the 40 programs. It just sees the demand for its services going through the roof. It’s like people trying to get into a ball park for a ball game. If you suddenly block off 30 of the 40 entrances and make everyone come in through the remaining 10, the lines are going to be long, the ticket takers frazzled, and everyone is going to be mad. Not unlike what happens when you virtualize servers in the average SAN environment!

    We have a problem because programs running on fewer servers (because of virtualization) are trying to get to their data from fewer disks (because of increased capacity per disk).

    Xiotech ISE Storage Blades to the Rescue

    What made anyone think that sleek, efficient server blades would work well with the average storage mainframe in the first place? Inertia, I guess. If you’ve got linearly scalable server blades, wouldn’t you want … linearly scalable storage blades (bricks) to go with them?

    Let’s talk performance for a minute. How about:

    10,000 Exchange users per 3U ISE

    And then add a second for 20,000 users, a third for 30,000 users, and so on. Here is a post with details on how others attempt to meet the need, a video about the benchmark, etc.

    There is certainly a problem. The amount of money going into expensive SSD’s tells us there’s a problem. Stalled virtualization projects tell us there’s a problem. Xiotech ISE Storage Blades with awesome performance that doesn’t degrade as the device fills up are the solution. There is even software that makes setup painless in a VM environment!

  • Mainframes and Storage blades: Xiotech’s ISE

    In computing, the transition from mainframes to blades is well established. What about storage? There is exactly one choice: the Xiotech ISE.

    We throw words around like crazy in computing. Once a word starts to become associated with something good, every person in marketing in the universe latches on to that word, and finds a way to associate it with their product. Most of the analysts aren’t much help here.

    The opposite applies, too. Once a word is no longer in fashion, no vendor admits to selling it, and no buyer admits to owning one. A case in point is “mainframe.” Some time ago, you had to have a mainframe to be in the big leagues of computing; otherwise, you were in the minor leagues, dealing with unimportant problems. Now, even though mainframes are alive and well and broadly used, it’s hard to get anyone to admit it. “Everyone” knows that mainframes are old-fashioned.

    In spite of their un-cool-i-tude, let’s talk mainframes for a quick minute. What’s a mainframe? In computing, it’s a big single thing that gets all of many big jobs done at the same time. It is your central computing resource. It’s valuable. It takes lots of time and attention to manage, but rewards the attention by being the work-horse of your data center. As your workload grows, your mainframe can start to get overloaded; no problem. Capacity measurement and planning is a key skill with mainframes, and even better, mainframes are built to be expanded. That’s why they’re called “main frame.” It isn’t the CPU – it’s the frame (the main one!) that structures your computing engines. Sometimes your mainframe needs more compute-power; no problem, you can add it in. Sometimes your mainframe needs more I/O channels or local memory; no problem.

    What’s cool today, if not mainframes? We all know the answer: it’s racks and racks of servers or blades. Each one is powerful but inexpensive. You increase capacity by buying more of them. That’s why virtualization is such a powerful trend in the data center: VM-Ware (and similar products) helps you use all those servers more efficiently. Everyone has lots and lots of servers; therefore, the need to use them as effectively as possible is ubiquitous; therefore, Hyper-V and its brethren are hot.

    What’s going on in the world of storage? Do we have storage mainframes? Of course not! Heaven forfend! “Mainframes” are the bad old, un-cool thing, so there’s no way my storage is a mainframe – it’s a SAN, a storage area network! It’s a network, see, it’s cool!

    Now let’s cut through the verbiage, and apply the criteria of mainframe to storage. A mainframe (see above) is:

    • A big single thing that gets all of many big storage jobs done at the same time; check.
    • Your central storage resource; check.
    • It’s valuable; check.
    • It takes lots of time and attention to manage; check.
    • Capacity measurement and planning is a critical function; check.
    • You expand capacity by augmenting it, adding things into it; check.

    There’s a simple rule here. Suppose you buy a “small, simple” mainframe. When you’ve added huge amounts of capacity to it, how many do you have? If the answer is “one,” you’ve got a mainframe. If you’ve got a small, simple server/blade collection, you’ve already got a bunch of them. When you’ve added loads of capacity, you’ve got loads more of them. You’ve never got just one. You start with some, you grow to lots, and expand to lots and lots.

    Applying our rule to storage, it is undeniable that while vendors will avoid the “m-word” like crazy, what they’ve all got is storage mainframes. They are single things, just like a railroad train is a single thing regardless of the number of engines at the front or freight cars at the back. Even if they choose to call it “cloud storage,” the brutal fact is that it is still a monolithic, single, unbroken entity – a mainframe, in short!

    There is exactly one vendor in the market who has created for storage what blades are for computing, and that is Xiotech, with its ISE product. You do not expand an ISE; you buy another one. Each ISE is a separate, free-standing, inexpensive but powerful resource, directly connected to a switch – just like a server.

    The Xiotech ISE is the anti-mainframe, and is the natural choice for server-farm data center storage.

  • Xiotech and the Strange World of Storage Administration

    Storage administration is a world of its own, controlled by storage administrators and run by its own set of rules. It's got to stop!

    Not entirely, of course. Storage is a specialty, and it definitely rewards having someone really know all about it.

    Think about it from an application programmer's point of view. What if files had the same rules as storage? Instead of just adding a "create file" statement to your program, you'd have to request one from the "file administrator," who may give you one after talking with you. You would agree on the name. You would talk about how big the file would get, and how often you intended to access it.

    If everything went well, you would each go off to your respective domains, enter exactly the same information into your respective systems, and the file would be available. It would almost certainly be created before you needed it (better that than after you needed it!), and be set up to give you ample space.

    What would be the net result?

    • You and the "file administrator" would spend more time setting up files.
    • There would be more opportunities for mistakes and mis-communications.
    • The files would be set up for longer than needed, and possibly larger than needed.
    • They wouldn't be deleted automatically by your program — the file administrator would need to be notified.

    In other words, the situation would be worse in just about every possible way. More time, more chances of error, less automation, more resources used longer than necessary.

    Fortunately, this was a hypothetical for files. We have language-specific versions of "create file" and "delete file," so that applications can control their own lives, and programmers can program and be done with it.
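    For contrast, here is what "create file" and "delete file" actually look like from an application, using nothing but the standard library; no administrator, no ticket, no meeting:

    ```python
    import os
    import tempfile

    # Applications create and destroy files themselves, in one line each.
    path = os.path.join(tempfile.gettempdir(), "scratch_demo.dat")

    with open(path, "wb") as f:       # "create file": done
        f.write(b"temporary working data")

    assert os.path.exists(path)       # the resource exists exactly when needed

    os.remove(path)                   # "delete file": done, space reclaimed
    assert not os.path.exists(path)
    ```

    The resource exists for precisely as long as the program needs it, and not a second longer — which is the whole point of the comparison.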

    Unfortunately, this is daily reality for storage! Bummer! Why shouldn't applications be able to control storage the way they do files? This may not matter much for persistent storage that doesn't change much, but it matters hugely for storage that is used on a temporary basis. Most places that I've looked at don't think about it any more. Why should they? They just over-provision like crazy and be done with it. 

    This situation changes the second you've got a simple set of API's you can call from scripts or applications that do for storage what we already know how to do for files. This is a capability all storage vendors should provide. Can you guess which one of my favorite storage vendors has a real story on this subject? Good, check it out…

    Actually, just going to their website isn't enough (for now), since the RESTful API for storage administration is still in advanced beta. But it's real, it's cool, I've seen it, and they're even starting to admit its existence, per Brian Reagan:

    CTO Steve Sicola detailed the Q1 and Q2 roadmap, including CorteX (coming in Q1) – Xiotech’s RESTful API that will allow developers simple yet powerful access to ISE.

  • Computer Storage’s Performance Gap

    There is a growing "performance gap" in computer storage due to basic physics, and exacerbated by server virtualization. People who run data centers see the problem, and so do their customers. It gets worse every year.

    Most storage vendors have no real solution to this problem. Xiotech, with its ISE product, is a notable exception. For this reason (among several others), Xiotech is one of my favorite companies.

    The Performance Gap

    In a nutshell, the "performance gap" is that as disk-based storage gets less expensive, the rate at which any one customer or user can get at their data gets worse and worse. Capacity goes up and performance goes down, resulting in a performance problem. Seems strange, doesn't it? Doesn't everything about computers get faster, smaller and cheaper? What's this about something getting worse all the time? Well, it's true.

    As everyone knows, computers get faster and less expensive. As they get faster, they read and write data more quickly. Storage that is built out of the same electronic "stuff" as the computer (RAM, DRAM, the "main memory" of the computer) gets faster and less expensive at pretty much the same rate as the processors. No problem there. But what about the disks (hard disk drives, HDD's)? Do they get faster and less expensive too?

    Well, that's the problem. They do get less expensive — that's part of why we have iPod's and digital cameras now. But they do not get much faster. Mostly what happens is they hold more data and get physically smaller.

    A dozen years ago, you could have bought an HDD that held 10GB. Today, you can buy physically smaller HDD's that hold over 1TB. For less money!

    Smaller, more capacity, less expensive. What's wrong with that? Imagine that you have 1TB of data. A dozen years ago, you would have stored it (ignoring details like overhead and extra space) on 100 HDD's, each holding 10GB. Today you would only need one HDD — a 100:1 advantage! That's the good news.

    The bad news is that a dozen years ago, you would have had 100 HDD's, each with a head and data channel to read and write your data, while today you would have a single read/write head to access the same amount of data. This is 100 times worse than it was a dozen years ago! It's like Yankee Stadium with the same number of fans, only locking all the entry gates except one; do you think there would be a line at that single door?

    If banks operated with ATM's the way drive vendors do with HDD's, here's what it would be like. The bankers figure they're going to put a certain amount of cash in ATM's for the citizens of, say, New York City. Each year, new ATM's become available that can magically store a lot more cash in less space at lower cost. Of course they go to town, replacing each couple of old ATM's with one double-capacity ATM. They're feeling good about themselves — they've made the same amount of cash available to their customers while lowering their costs.

    Suppose that ten years ago, there were 100 ATM's in NYC. With this incredible technical growth in ATM's, the bankers can make the same amount of cash available using just one ATM — isn't that great?!

    Of course, the problem is obvious. People don't care about how many ATM's it takes — they care whether there's an ATM near them when they need cash, and whether there's a line of people waiting for access to that ATM. Imagine the impact of reducing the number of ATM's by a factor of 100. The same amount of cash is in the remaining ATM's as before, so the capacity is not reduced, just the number of access points.

    That is the performance gap: the same amount of "stuff" is crammed into a tiny fraction of the number of "boxes," but the "doors" to the few remaining boxes are no larger.

    Fundamentals: Physics vs. Electronics

    Hard Disk Drives (HDD's) contain one or more little platters (kind of like small CD's) that spin in a sealed enclosure. As the platters pass under the read/write heads, data may be written or read. Here is a basic summary of HDD technology, with diagrams.

    Every year, vendors manage to make the little platters even smaller and make the read/write heads able to handle "bits" that are smaller and smaller. More data gets crammed into less space.

    The trouble is that the platter can't spin much faster than it already does. If the data you want is on the other side of the platter from the head, you still have to wait for the platter to spin around — and that takes the same time as it did when there was 100 times less data on that platter.

    In addition, platters (like CD's) store data everywhere, from the small inner tracks to the larger outer edge. With a single head to read or write, the head still has to be moved to the right place (this is called "seek time"), and that movement isn't much faster than it was years ago.

    Finally, when you've got the "ATM" problem or the "Yankee Stadium" problem, things get really bad: the chances that your data is on the same platter as someone else's data, wanted at the same time, have gotten 100 times worse.

    As the electronics of shrinking bits gets better, the performance gap gets worse: physics doesn't let us spin things or move things much faster, and the simple arithmetic of cramming more data through the same-size door blocks us.
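    You can put rough numbers on those two physical limits. Assuming typical illustrative figures (a 7200 RPM drive with about an 8 ms average seek, which are my assumptions, not figures from the post), the ceiling on random reads and writes per second falls out directly:

```python
# Rough model of why a single HDD's random-access rate barely improves:
# head seek time plus platter rotation dominate each random request.
# RPM and seek figures are typical illustrative values, not vendor specs.

RPM = 7200
avg_seek_ms = 8.0                        # assumed average seek time
avg_rotational_ms = (60_000 / RPM) / 2   # wait half a revolution on average

service_time_ms = avg_seek_ms + avg_rotational_ms
iops = 1000 / service_time_ms            # random I/O operations per second

print(f"Avg rotational latency: {avg_rotational_ms:.2f} ms")
print(f"Random IOPS ceiling:    {iops:.0f}")
```

    A drive like this can serve on the order of 80 random requests per second, whether it holds 10GB or 1TB. That constant is the "same-size door."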

    Server Virtualization

    Server virtualization is a major trend in data centers. It lets data centers reduce the number of physical servers they run by replacing dedicated machines with virtual machines. With the help of software, applications that used to require their own machines can share a smaller number of physical ones.

    This is generally a good idea and saves money. Except for this little problem about storage: with multiple applications running on each server, the number of read/write requests coming out of each server has just gotten larger. Which makes the storage performance even worse. Ughh.
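    A quick sketch of that effect, with hypothetical numbers of my own choosing (a 6:1 consolidation ratio and 150 random IOPS per application):

```python
# Consolidation concentrates I/O: before virtualization each app's random
# I/O leaves its own server; after, several apps' streams merge onto one
# physical box and one storage path. Numbers are hypothetical.

apps_per_server_before = 1
apps_per_server_after = 6     # assumed consolidation ratio
iops_per_app = 150            # assumed random IOPS per application

before = apps_per_server_before * iops_per_app
after = apps_per_server_after * iops_per_app

print(f"Random IOPS leaving each server before: {before}")
print(f"Random IOPS leaving each server after:  {after}")
```

    The total I/O demand hasn't changed, but it now arrives at the storage in fewer, denser streams.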

    Conclusion

    Advances in storage technology are paradoxically making storage performance worse. Server virtualization, while generally a good thing, exacerbates the problem. How do you solve the performance gap? There are several major ways.

    • Buy enough HDD's to give you the performance you need. This may mean that you are way over-capacity, but who cares? Your users aren't trying to wring your neck because they can't get their data. And the extra money all that costs? Well, that's life.
    • Pay 10 to 20 times more per GB and buy solid-state storage (SSD's) — and change your applications to take advantage of it. Problem solved! The extra money and trouble? Well, that's life.
    • Buy ISE's from Xiotech. Buy only the capacity you need. Your users will love you and you won't have to touch your applications to make it work. The extra money and trouble? There is no extra money or trouble. That's life — the good life, that is.
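    To see why the first option means buying way over-capacity, compare drives-for-capacity against drives-for-performance. The workload and per-drive figures below are hypothetical, chosen only to illustrate the shape of the trade-off:

```python
# Why "buy enough HDD's for performance" leads to over-capacity.
# Workload and drive figures are hypothetical, for illustration only.

workload_iops = 2000    # random IOPS the applications need
workload_tb = 1.0       # capacity they actually need

hdd_iops = 80           # rough random-IOPS ceiling of one HDD
hdd_tb = 1.0            # capacity of one modern HDD

drives_for_capacity = workload_tb / hdd_tb    # 1 drive
drives_for_iops = workload_iops / hdd_iops    # 25 drives

print(f"Drives needed for capacity:    {drives_for_capacity:.0f}")
print(f"Drives needed for performance: {drives_for_iops:.0f}")
print(f"Excess capacity bought: {drives_for_iops * hdd_tb - workload_tb:.0f} TB")
```

    In this sketch you buy 25 drives to satisfy the I/O rate and end up owning 24TB you never asked for.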

    Am I proud to be associated with Xiotech? You betcha.
