Introducing CoprHD. EMC Changes the Game for Software-defined Storage Automation and Management

EMC is certainly no stranger to open source. EMC and Pivotal are both founding members of the CloudFoundry Foundation. And EMC recently announced a $10 million investment and its first CloudFoundry dojo, based in Cambridge, MA, that will attract developers and facilitate the creation of applications on CloudFoundry.  In November, EMC announced the EMC OpenStack Reference Architecture Partner Program and partnerships with Canonical Ubuntu, Mirantis and Red Hat.  EMC also recently launched EMC {code} – the Community Onramp for Developer Enablement, which provides both EMC and community contributions of open source code, drivers, tools, samples, and more. EMC supports and contributes to open source in a number of ways, yet EMC is still considered a proprietary vendor. Well, if none of the above proves EMC’s open source bona fides, perhaps this will: On May 5th, EMC is moving EMC ViPR Controller development into the open source community.

This is big news. For the first time, EMC is taking a commercial product and releasing it to community-driven development.  The open source project, named CoprHD, makes the code for ViPR Controller – all the storage automation and control functionality – available in the open source community. Customers, partners, developers and other storage vendors can download, expand and contribute to CoprHD. EMC will continue to sell EMC ViPR Controller as a commercial offering enhanced with service, support, training, and more to help organizations quickly adopt software-defined storage.

It’s been an amazing journey. Two years ago, EMC announced and subsequently launched EMC ViPR Software Defined Storage. Two years later, the ViPR Controller code, now open source project CoprHD, will be open and available for download on GitHub. This signifies a fundamental change to EMC’s development model. All development for ViPR Controller and CoprHD will be done in the open source community, with EMC and others contributing. CoprHD is licensed under the Mozilla Public License 2.0 (MPL2.0), which encourages community sharing and requires anyone who modifies and distributes the source code to share those modifications with the community. EMC is also establishing free and frictionless access to CoprHD to facilitate community-driven collaboration that will accelerate and expand functionality and support for third-party storage.

Why is EMC taking this step? EMC fundamentally believes that software-defined storage is a strategy, not a product. The goal of software-defined storage is to give customers choice of storage services and hardware platforms, make it all simple and less costly to manage, and eliminate proprietary lock-in. Making the ViPR Controller source code available as open source project CoprHD will accelerate development and increase support for non-EMC storage arrays and data protection technologies. It also strengthens CoprHD as a single, vendor-neutral API control point for software-defined storage automation.

This open source model of open, collaborative development is crucial to the future success of software-defined storage and storage automation and management. CoprHD and ViPR Controller will give customers choice, flexibility, and transparency. Purpose-built storage platforms from EMC and others will always remain data center necessities. But customers increasingly value more plug and play architectures – driven by software-defined solutions and standardized infrastructure – and will often sacrifice some level of efficiency to obtain best-of-breed features, more flexibility and lower switching costs. In the modern data center, successful storage vendors will compete on the merits of their solutions and deliver compelling customer experiences. As CoprHD and ViPR Controller extend support to more and more storage platforms, EMC welcomes this new competitive playing field. EMC is ready to lead in this new software-defined world.

Are you a developer who has contributed to a product in the open source community before? Are you planning on contributing to CoprHD? Are you a storage administrator or architect looking to evaluate and deploy CoprHD? If so, tell us about your experience! We invite you to join us on this new journey and share your discoveries. Let’s see where it takes us.

Icy Hot: Cold Storage is a Hot Market and Object Storage is Heating Up

How do we measure the mission criticality of storage systems? What comes to mind when you hear or read the words, “mission critical”? Certainly, you’d think of reliability, resiliency, data protection, etc. But I’m willing to bet that you also, almost reflexively, think of performance – measured in millions of IOPS, transactions per second, or sub-millisecond latencies. To many, mission critical means fast. Think all-flash arrays and high-end block storage. This is what the industry refers to as “Hot” storage.

“Cold” storage, on the other hand, gets no love.  When you think cold storage, you think of old data you don’t want but can’t get rid of. You think of tapes in caves or a $0.01 per GB/month cloud storage service. Think low cost, commodity and object storage. Cold storage has an image problem, thanks in no small part to Amazon Web Services introducing Glacier in 2011 as a cold archiving service. You don’t often hear the terms “mission critical” and “cold storage” in the same sentence (see what I did there?). You think cold storage isn’t important. And you’d be wrong.

You’d be wrong because the world of storage doesn’t bifurcate so neatly into just two storage categories. Cold storage, which is frequently delivered by an object storage platform, can actually be different temperatures – cool, chilled, cold, colder than cold, deep freeze, etc. Confused? IDC explains:

[Figure: System Type. Source: IDC Worldwide Cold Storage Ecosystem Taxonomy, 2014 #246732]

It all depends on the use case and how active the data is. Extreme or deep freeze archive is when the data is seldom, if ever, accessed. Amazon Glacier is an example. Access times can range from hours to more than a week depending on the service – and you pay for the retrieval. Deep archive makes up the bulk of the cold storage market. The data is also infrequently accessed but it remains online and accessible. IDC cites Facebook Open Vault as an example. Active archive is best for applications that may not modify data frequently, if at all, but can read data more frequently as in Write Once, Read Many (WORM). An example use case is email or file archiving; IDC cites EMC Centera as an example. EMC Atmos and EMC Isilon are also good examples.

Object storage, generally speaking, falls under the category of cold storage and is used for any temperature. But it should not be pigeonholed as an inactive, unimportant storage tier. Object storage is a critical storage tier in its own right and directly influences the judicious use of more expensive hot storage. With the explosion in the growth of unstructured content driven by Cloud, mobile and big data applications, cold secondary storage is the new primary storage. To the salesperson or insurance adjuster in a remote location on a mobile device, the object storage system that houses the data they need is certainly critical to their mission.

The importance of cold storage is best explained in the context of use cases. The EMC ECS appliance is a scale-out object storage platform that integrates commodity off-the-shelf (COTS) components with a patent-pending unstructured storage engine. The ECS appliance is an enterprise-class alternative to open source object software and DIY COTS. ECS offers all the benefits of low-cost commodity hardware but saves the operational and support headache of racking and stacking gear and building a system that can scale to petabytes or exabytes and hundreds or thousands of apps. Organizations evaluating the ECS appliance are generally pursuing a scale-out cloud storage platform for one or more of the following three use cases:

Global Content Repository

This is often an organization’s first strategic bet on object and cloud storage. Object storage, due to its efficiency and linear scalability, makes an ideal low-cost utility storage tier when paired with COTS components. The ECS appliance delivers the cost profile of commodity storage and features an unstructured storage engine that maintains global access to content at a lower storage overhead than open source or competing object platforms. This lowers cost and makes the organization’s hot storage more efficient and cost-effective by moving colder data to the object archive – without diminishing data access. But it’s more than that. A crucial aspect of a global content repository is that it acts as an active archive; the content is stored efficiently but is also always accessible – often globally. And it’s accessible via standard object storage APIs. Consequently, the global content repository also supports additional uses such as next-generation file services like content publishing and sharing and enterprise file sync and share. And there is an ecosystem of ISV partners that build cloud gateways/connectors for the ECS appliance that extend the use case further.

Geo-scale Big Data Analytics

Geo-scale Big Data Analytics is how EMC refers to the additional use of a Global Content Repository for Big Data Analytics. The ECS appliance features an HDFS data service that allows an organization to extend its existing analytics capabilities to its global content repository. As an example, one ECS customer uses their existing Hadoop implementation to perform metadata querying of a very large archive. The ECS appliance treats HDFS as an API head on the object storage engine. A drop-in client in the compute nodes of an existing Hadoop implementation lets organizations point their MapReduce tasks to their global archive – without having to move or transform the data. The ECS appliance can also be the data lake storage foundation for the EMC Federation Big Data solution. This can extend analytics scenarios to include Pig, Hive, etc. In addition, since ECS is a complete cloud storage platform with multi-tenancy, metering and self-service access, organizations can deliver active archive analytics or their data lake foundation as a multi-tenant cloud service.

The ECS appliance overcomes some of the limitations of traditional HDFS. ECS handles the ingestion and efficient storage of a high volume of small files, high availability/disaster recovery is built in, and distributed erasure coding provides lower storage overhead than the 3 copies of data required by traditional HDFS.
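
To make the storage overhead comparison concrete, here is a minimal arithmetic sketch. The three-copy figure comes from the text above; the 12+4 erasure coding layout is a hypothetical example chosen only for illustration, since this post does not state the scheme ECS actually uses.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data with full-copy replication."""
    return Fraction(copies, 1)


def erasure_coding_overhead(data_fragments: int, coding_fragments: int) -> Fraction:
    """Raw bytes stored per byte of user data with a k+m erasure code."""
    return Fraction(data_fragments + coding_fragments, data_fragments)


# Three full copies, as required by traditional HDFS.
hdfs = replication_overhead(3)

# ASSUMPTION: a 12+4 layout, used here purely as an example.
ec = erasure_coding_overhead(12, 4)

print(f"3x replication:      {float(hdfs):.2f}x raw capacity")
print(f"12+4 erasure coding: {float(ec):.2f}x raw capacity")
```

Under these assumed parameters, one petabyte of user data would need three petabytes of raw capacity with replication and roughly 1.33 petabytes with erasure coding.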

Modern Applications

Mainstream enterprises are discovering what Web-centric organizations have known for years. Object storage is the platform of choice to host modern, REST-based cloud, mobile and Big Data applications. In addition to being a very efficient platform, the semantics of object make it the best fit for Web, mobile and cloud applications.

I recommend viewing the webcast, “How REST & Object Storage Make Next Generation Application Development Simple” to get an in-depth look at object architecture and writing apps to REST based APIs. However, there are two features unique to ECS that facilitate the development and deployment of modern applications:

  • Broad API support. ECS supports Amazon S3, OpenStack Swift or EMC Atmos object storage APIs. If developing apps for Hadoop, ECS provides HDFS access.
  • Active-active, read/write architecture. ECS features a global index that enables applications to write to and read from any site in the infrastructure. ECS offers stronger consistency semantics than typically found in eventually consistent object storage. ECS ensures it retrieves the most recent copy of a file. This helps developers who previously had to contend with the possibility of a stale read, or had to write conflict resolution code into their applications.
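
The difference between a stale read and an index-directed read can be illustrated with a toy model. This is a sketch only: the `Site` and `GlobalIndex` classes and the object key are hypothetical names invented for this example, and nothing here reflects how ECS is actually implemented.

```python
class Site:
    """One location holding local copies of objects."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # key -> (version, data)


class GlobalIndex:
    """Toy index recording which site holds the newest version of each key."""

    def __init__(self):
        self.latest = {}  # key -> (version, Site)

    def write(self, site, key, data):
        # Bump the version and record the writing site as the owner.
        version = self.latest.get(key, (0, None))[0] + 1
        site.objects[key] = (version, data)
        self.latest[key] = (version, site)
        return version

    def read(self, key):
        # Always fetch from the site that holds the newest version,
        # regardless of which site received the request.
        _version, owner = self.latest[key]
        return owner.objects[key][1]


def local_read(site, key):
    """Eventually consistent style read: return whatever this site has."""
    return site.objects[key][1]


boston, london = Site("boston"), Site("london")
index = GlobalIndex()

index.write(boston, "report.pdf", b"v1")
london.objects["report.pdf"] = boston.objects["report.pdf"]  # replica caught up
index.write(boston, "report.pdf", b"v2")  # london's replica now lags

stale = local_read(london, "report.pdf")  # old copy from the lagging replica
fresh = index.read("report.pdf")          # newest copy via the index
print(stale, fresh)
```

In the eventually consistent case the application has to detect and handle the old copy itself; with an index consulted on every read, that burden moves into the storage layer.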

Noam Chomsky once said, “I like the cold weather. It means you get work done.” You can say the same for cold storage; it also means you get work done.  It’s become a workhorse storage platform. It doesn’t get the sexy headlines in trade rags. But I hope after reading this and understanding the actual use cases for ECS appliance and object storage, you have a better appreciation and some love for cold storage. There are lots of solutions for storing old data that just can’t be thrown away and most compete purely on price. But, if your applications and data fall into one or more of these use cases, then the ECS appliance should be at the top of your list.

Past is Prologue: What 1990s Technology Tells us about Today’s Cloud Competitive Landscape

Last week, Amazon launched Zocalo, an enterprise file sync and share service. The press chatter has been all about how Amazon is now competing with Box and Dropbox. Dropbox also happens to be an Amazon Web Services S3 customer. This is not the first time Amazon has released a new service that competes against its own customers. Nasuni, Engine Yard, Heroku, OpenShift and MongoDB are just a few technology vendors with whom Amazon can partner/host and compete. This is not that unusual in the era of more open architectures and “co-opetition.” However, with this recent announcement, I couldn’t help but think back to when proprietary hooks and vendor lock-in were common and more explicit. What have we learned and what does it portend for the future of cloud computing and services?

I love just about anything that has to do with American history – even recent history. I’m a sucker for a good retrospective. So I was psyched last week when the National Geographic Channel debuted “The 90s, The Last Great Decade?”, the follow-up to their successful retrospective on the 1980s, “The 80’s: The Decade that made Us”. I guess all of us like nostalgia to some extent. We enjoy looking back at the horrible fashion choices we made; the hairstyles, the music, the movies, and everything pop culture. But one of the things I like best is seeing how far technology has evolved. And more importantly, what we learned from how things used to be. Past is prologue and the evolution of technology decades ago can be a harbinger of things to come.

It’s easy to forget that the PC boom and Internet boom both happened in the span of a single decade, the 90s. And one of the most compelling stories retold in National Geographic Channel’s series is the “Browser Wars” that foreshadowed the end of the PC era and beginning of the Internet era. The short story is that Microsoft introduced the Internet Explorer (IE) browser in response to Netscape Navigator’s commanding share of the Web browser market and then used the monopoly power of its 90% share of the operating system market to make IE the default browser for PC manufacturers and Internet service providers, effectively killing Netscape (though Netscape’s slow pace of innovation certainly didn’t help). Due to Microsoft’s monopoly power over the platform, Netscape, and every other independent software vendor (ISV) for that matter, had no choice but to build their application for the Windows platform and live with the risk that their success could lead Microsoft to introduce a competitive offering and use its OS monopoly power to make it the default choice.

I guess the retelling of the Browser Wars story and the AWS/Zocalo announcement in the same week was fitting. Of course, neither AWS nor anyone else can command 90% of the cloud services market but that doesn’t mean that technology providers and vendors can’t pursue other ways of building “walled gardens” and monopoly-like power. They just have to go about it differently. But much like Microsoft 20 years ago, AWS has a growing ecosystem of ISVs that get to market via AWS and fill in gaps in AWS offerings. The gaps that AWS identifies as profitable are targets for AWS services that will compete with the ecosystem. ISVs know this, yet still gravitate to AWS due to its simplicity, APIs and low cost/GB. And even with the strong gravitational pull of its ecosystem, Amazon is nowhere near a monopoly. The answer is not a lawsuit; it’s more competition. Competition would foster innovation and give ISVs more choice in how or where to host their applications.

So why is there not enough competition? Many cloud service providers (CSPs) and ISVs have been scared away from competing with public Infrastructure-as-a-Service (IaaS) providers or hosting their own infrastructure because they fear a never-ending race to the bottom. They don’t believe they can offer or afford a storage infrastructure that features the hyperscale economics and capabilities of public IaaS. And, for the most part, they’ve been right. Until today.

With the introduction of the EMC ECS Appliance, powered by ViPR, EMC brings hyperscale capabilities and economics to everyone. The ECS Appliance is a commodity-based storage appliance differentiated by the ViPR software. For the first time, customers can purchase a complete cloud storage platform in a box with self-service access and universal protocol support. The EMC ECS Appliance features include:

• It’s built for hyperscale – ECS Appliance leverages commodity components, architected in such a way that it features no single points of failure and delivers an unmatched combination of storage efficiency and data access.
• Lower cost than public cloud – EMC TCO models show the ECS Appliance is 9-28% cheaper than public IaaS at scale.
• Universal protocol support – The ECS Appliance’s support for Object, Block and HDFS eliminates storage silos and inefficient Extract, Transform and Load (ETL) processes. This comprehensive support for different protocols and data types means a CSP can support a broad range of applications and services on the same platform.

For the first time, enterprises, CSPs and ISVs can build a modern hyperscale storage infrastructure that leverages commodity platforms. Any data center can feature the economics of commodity platforms and the simplicity of cloud. They can offer the same experience; host a broad range of applications, offer developers simple access to industry standard APIs such as Amazon S3, OpenStack Swift, EMC Atmos, HDFS, etc., and be price competitive. I suppose in another decade or so, I can look forward to the next retrospective on the 10s. Maybe a 70-something year-old Rob Lowe will tell the story of how the economics of cloud storage changed in 2014 and ushered in a new era of innovation and economic growth. And how stupid everyone looked in skinny jeans.

In it for the Long Run: EMC’s Object Storage Leadership

EMC’s Hopkinton, MA headquarters is right down the street from the Boston Marathon starting line. With the 2014 race fast approaching, I often think of inspirational quotes to get me out the door into the darkness, single-digit temperatures, and icy streets and sidewalks. One of my favorites is from international track star and running legend Steve Prefontaine, who said, “To give anything less than your best is to sacrifice the gift”. It’s more than just saying “don’t rest on your laurels”; it makes the point that our success, talent or list of accomplishments should be respected. And we respect them by continuing to work hard and giving our best. Failing to do so is to “sacrifice the gift”. Of course, we can apply this sentiment to more than running. As individuals we should always seek to maximize our talents. Companies should too. Good companies have their share of successes. Great companies, however, recognize what made them successful and put just as much effort, if not more, into the next innovation. EMC’s leadership in object storage is a great example.

Gartner Research recently rated EMC Atmos for overall product viability for object storage platform in their report, “Critical Capabilities for Object Storage.” [1] EMC Atmos achieved the highest possible rating given based on capabilities as well as by use case. EMC is proud of this accomplishment and believes it reflects years of hard work and innovation. EMC pioneered object storage with the introduction of EMC Centera in 2002. Not content to be a leader in on-premises compliance archiving, EMC took object storage to the next level with EMC Atmos. Today, Atmos boasts over 1.5 exabytes sold and over 300 customers, including more than 60 global service providers and some of the largest Web-based companies in the world. We believe that EMC is a leader in object storage platforms but, once again, EMC is pushing the pace to not just maintain, but extend its leadership in object storage with EMC ViPR and the ViPR object data service.

This is not the time to relax. Another unique trait of Steve Prefontaine was his approach to competition. He was a frontrunner. He liked to get out front, push the pace and make everyone run his race. He knew that no one could beat him at his race. Similarly, EMC is pushing the pace to extend its leadership in object storage with the ViPR object data service. The ViPR object data service provides the ability to store, access and manipulate unstructured data as objects on ViPR-managed storage. The ViPR object data service builds on EMC’s success and adds a host of new capabilities that support not only petabyte scale but exabyte scale:

  • Object as a universal capability – As software-only services, ViPR data services can be layered over both traditional and new storage and enable hybrid data types. With ViPR, organizations can access and manipulate unstructured data as objects on file arrays such as EMC VNX, EMC Isilon and NetApp and, in the coming months, Centera arrays and commodity hardware.
  • Highly efficient geo-storage – The ViPR object data service stores objects in append-only containers or “chunks”. This enables support for multiple concurrent writes to an object and ensures very efficient geo-storage.
  • Geo-scale index and metadata – ViPR object can quickly locate objects in a massive scale, geo-distributed object store. A key bottleneck of existing object storage platforms is the indexing and metadata databases. They are typically stored in separate databases and do not scale well. This isn’t poor design; it’s simply a matter of unforeseen data growth. We’re entering an era of multi-petabyte and even exabyte scale! ViPR handles metadata and the index differently. Rather than maintaining a separate index in a database, which can’t scale well, the index and metadata are located on the actual arrays in containers. By storing the index and metadata on the actual arrays, the ViPR object data service can locate objects in a massive scale, geo-distributed object store much faster.
  • Support for low latency transactions – The ViPR object data service can execute a large number of user transactions concurrently with very little latency. The ViPR object data service supports box-carting to handle workloads with high transaction rates. When an application is writing a lot of small files with high I/O, ViPR can take multiple requests together and write them as one. This improves performance by reducing the round trips to and from the underlying storage. Competitors may market similar capabilities but require a separate platform such as Flash. ViPR executes box-carting in software so there is no additional infrastructure to buy.
  • Universal API support – The ViPR object data service supports Amazon S3, OpenStack Swift and EMC Atmos object storage APIs and will soon add support for Centera CAS API. Choose your API, write your app, and run it on any ViPR-managed storage.
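
The box-carting idea described in the list above, taking several small write requests together and writing them as one, can be sketched roughly as follows. The `BoxCarWriter` class, its `max_batch` parameter and the in-memory backend are hypothetical names made up for this illustration; this is not ViPR code, and a real implementation would also need to flush on a timer and acknowledge callers only once the batch is durable.

```python
class BoxCarWriter:
    """Buffers small writes and sends them to the backend as one request."""

    def __init__(self, backend_write, max_batch=4):
        self.backend_write = backend_write  # callable taking a list of writes
        self.max_batch = max_batch          # ASSUMPTION: illustrative batch size
        self.pending = []

    def write(self, key, data):
        # Queue the write; flush once the batch is full.
        self.pending.append((key, data))
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        # Send everything queued so far as a single backend round trip.
        if self.pending:
            self.backend_write(list(self.pending))
            self.pending.clear()


round_trips = []  # each entry is one batch sent to the backend

writer = BoxCarWriter(round_trips.append, max_batch=4)
for i in range(10):
    writer.write(f"obj-{i}", b"x")
writer.flush()  # push out the final partial batch

print(len(round_trips))  # 3 round trips instead of 10
```

Ten small writes reach the backend in three round trips (batches of 4, 4 and 2), which is the effect the bullet on low latency transactions describes.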

Among many new features, ViPR will also soon add geo-replication, geo-distribution, compliance, and support for commodity hardware. Project Nile will leverage ViPR and ViPR data services to deliver hyper-scale cloud storage and one-click access to object, file and block storage on an EMC provided commodity platform or as a software-only solution. All this innovation is great news for Atmos and Centera customers. Customers that have made and continue to make investments in Centera and Atmos will be able to extend and leverage those investments and get access to new capabilities on other platforms with ViPR.

EMC is thrilled to be recognized as a leader in object storage technology and is dedicated to extending that leadership position. We will continue to give our customers the choice and flexibility to mix software and array-level features in a way that meets any application or Big Data workload. As our customers evolve from the second platform of IT to the third, EMC will continue its object storage innovation to accelerate their journey.
