Posts Tagged ‘Software-Defined Storage’

Galaxy: A Workflow Management System for Modern Life Sciences Research

Nathan Bott

Healthcare Solutions Architect at EMC

Am I a life scientist or an IT data manager? That’s the question many researchers are asking themselves in today’s data-driven life sciences organizations.

Whether it is a bench scientist analyzing a genomic sequence or an M.D. exploring biomarkers and a patient’s genomic variants to develop a personalized treatment, researchers are spending a great amount of time searching for, accessing, manipulating, analyzing, and visualizing data.

Organizations supporting such research efforts are trying to make it easier to perform these tasks without the user needing extensive IT expertise and skills. This mission is not easy.

Focus on the data

Modern life sciences data analysis requirements are vastly different from those of just a few years ago.

In the past, once data was created, it was stored, analyzed soon after, and then archived to tape or another long-term medium. Today, not only is more data being generated, but the need to re-analyze that data also means it must be retained where it can be easily accessed for longer periods.

Additionally, today’s research is much more collaborative and multi-disciplinary. As a result, organizations must provide an easy way for researchers to access data, ensure that results are reproducible, and provide transparency to ensure best practices are used and that procedures adhere to regulatory mandates.

More analytics and more collaboration are areas where The Galaxy Project (also known simply as Galaxy) can help. Galaxy is a scientific workflow, data integration, and data and analysis persistence and publishing platform designed to make computational biology accessible to research scientists who do not have computer programming experience.

Galaxy is most often used as a general-purpose bioinformatics workflow management system that automatically tracks and manages data while capturing the context and intent of computational methods.
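Galaxy can also be driven programmatically through its REST API. The snippet below is a minimal sketch using the open-source BioBlend Python client; the server URL, API key, history name, and file name are placeholders, not a supported recipe.

    from bioblend.galaxy import GalaxyInstance

    # Connect to a Galaxy server (placeholder URL and API key).
    gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")

    # Create a history for the run, upload a dataset into it, and list available workflows.
    history = gi.histories.create_history(name="demo-analysis")
    gi.tools.upload_file("reads.fastq", history["id"])
    for wf in gi.workflows.get_workflows():
        print(wf["id"], wf["name"])

Histories, datasets, and workflows created this way remain visible in the Galaxy web interface, so analyses stay shareable and reproducible.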

Organizations have several ways to make use of Galaxy. They include:

Free public instance: The Galaxy Main instance is available as a free public service at UseGalaxy.org. This is the Galaxy Project’s primary production Galaxy instance and is useful for sharing or publishing data and methods with colleagues for routine analysis or with the larger scientific community for publications.

Anyone can use the public servers, with or without an account. (With an account, data quotas are increased and full functionality across sessions opens up, such as naming, saving, sharing, and publishing Galaxy-defined objects).

Publicly available instances: Many other Galaxy servers besides Main have been made publicly available by the Galaxy community. Specifically, a number of institutions have installed Galaxy and have made those installations either accessible to individual researchers or open to certain organizations or communities.

For example, the Centre de Bioinformatique de Bordeaux offers a general purpose Galaxy instance that includes EMBOSS (a software analysis package for molecular biology) and fibronectin (diversity analysis of synthetic libraries of a Fibronectin domain). Biomina offers a general purpose Galaxy instance that includes most standard tools for DNA/RNA sequencing, plus extra tools for panel resequencing, variant annotation, and some tools for Illumina SNP array analysis.

A list of the publicly available installations of Galaxy can be found here.

Do-it-yourself: Organizations also have the choice of deploying their own Galaxy installations. There are two options: an organization can install a local instance of Galaxy (more information on setting up a local instance of Galaxy can be found here), or Galaxy can be deployed to the cloud. The Galaxy Project supports CloudMan, a software package that provides a common interface to different cloud infrastructures.

How it works

Architecturally, Galaxy is a modular, Python-based web application with a data abstraction layer that integrates with various storage platforms. This allows researchers to access data on a variety of storage back-ends, such as standard direct-attached storage, S3 object-based cloud storage, storage management systems like iRODS (the Integrated Rule-Oriented Data System), or a distributed file system.

For example, a Galaxy implementation might use object-based storage such as that provided by Dell EMC Elastic Cloud Storage (ECS). ECS is a software-defined, cloud-scale, object storage platform that combines the cost advantages of commodity infrastructure with the reliability, availability, and serviceability of traditional storage arrays.

With ECS, any organization can deliver scalable and simple public cloud services with the reliability and control of a private-cloud infrastructure.

ECS provides comprehensive protocol support, including S3 and Swift, for unstructured workloads on a single, cloud-scale storage platform. This allows the user of a Galaxy implementation to easily access data stored on such cloud storage platforms.
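As a rough sketch of what that access can look like from the researcher's side, the following Python snippet talks to an S3-compatible endpoint with the standard boto3 client; the endpoint URL, bucket name, credentials, and file names are placeholders, not real ECS account details.

    import boto3

    # Placeholder endpoint and credentials for an S3-compatible object store.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://ecs.example.org:9021",
        aws_access_key_id="ECS_ACCESS_KEY",
        aws_secret_access_key="ECS_SECRET_KEY",
    )

    # Create a bucket, upload a dataset, and list what is stored.
    s3.create_bucket(Bucket="galaxy-datasets")
    s3.upload_file("sample_reads.fastq", "galaxy-datasets", "runs/sample_reads.fastq")
    for obj in s3.list_objects_v2(Bucket="galaxy-datasets").get("Contents", []):
        print(obj["Key"], obj["Size"])

Because ECS exposes the standard S3 protocol, existing S3 tooling like this typically works without modification; only the endpoint and credentials change.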

With ECS, organizations can easily manage a globally distributed storage infrastructure under a single global namespace with anywhere access to content. ECS features a flexible software-defined architecture that is layered to promote limitless scalability. Each layer is completely abstracted and independently scalable with high availability and no single points of failure.

Get first access to our Life Sciences Solutions

You can test drive Dell EMC ECS and get access to our APIs by registering for an account at https://portal.ecstestdrive.com/.

Or you can download the Dell EMC ECS Community Edition here and try it for FREE in your own environment, with no time limit for non-production use.

Goodbye Data Migration: ScaleIO Software-Defined Block Storage

Jason Brown

Consultant Product Marketing Manager at Dell EMC

Data migration is a big IT headache – but could software-defined storage be the remedy? Let's see how Dell EMC ScaleIO software-defined block storage makes the data migration blues a thing of the past.


Traditionally, data migration during a hardware ‘tech refresh’ is a costly and painful process that also puts your data at risk – but it’s unfortunately been a requirement with traditional SAN arrays on a 3-5 year repeating cycle.

There are three major categories of pain to consider with a traditional data migration:

  • Human impact
  • Technology challenges
  • Financial costs

The pain of data migration to your people

First, consider the human impact of data migration. There's the stress caused to the storage admin or team – and to the owners of the applications and data hosted on the storage. A data migration is very disruptive and can typically be a year-long process from start to finish.

For any datacenter-class array, planning a data migration is a months-long exercise that involves going to every single affected application owner, scheduling a window to migrate their application, and planning for the potential downtime. The storage admin then needs to build a calendar with the schedule for each individual app migration. This negotiation can be a painful and politically charged process – and causes a lot of stress on the admin and the organization.

Then, the execution of the migration takes place, usually over many months, with the migration team using nights and weekends to perform the migration during lowest impact hours. During the actual execution of the migration, the storage admin will face even more pain, such as:

  • 24/7 monitoring and daily reporting
  • Angst and stress of the migration process
  • Constant demands from application teams during the process
  • Time away from other job activities (and potentially family) due to the migration.

The technology challenges of data migration

Another key category of pain is the technology aspect of data migration. How will the migration be executed? Using a specialist migration tool or application? Conducted by professional services or by the storage team? What processes will be used to decrease risk and minimize downtime?

There are also big risks associated with data migration. What if you don't complete your data migration within the allotted time? What if the migration fails due to data corruption during the move? There's no 100% guarantee of success. Organizations that are particularly risk-averse will move all production off that system first – but this kind of pre-work adds further cost.

The economic and financial costs of data migration

A study by Wikibon in 2014 estimated the cost of a storage migration project to be approximately 17% of the total cost of ownership for a storage array, including initial cost and maintenance. The typical cost of a data migration could therefore amount to hundreds of thousands of dollars for a datacenter-class storage array.
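As a back-of-the-envelope illustration of that rule of thumb (the figures below are hypothetical, not taken from the Wikibon study):

    # Hypothetical numbers to illustrate the ~17% rule of thumb, not study data.
    array_tco = 1_500_000        # purchase plus maintenance over the array's life, in dollars
    migration_share = 0.17       # estimated share of TCO consumed by migration
    print(f"Estimated migration cost: ${array_tco * migration_share:,.0f}")  # ~$255,000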

The expense of moving to a new storage system includes the so-called 'hard costs':

  • Purchase and installation cost of the new storage array itself.
  • Cost of specialized migration software and services if using them.
  • Overtime costs for admins executing and monitoring the migration.

Additionally, there are the 'soft costs' involved:

  • Economic impact of lost revenue from application downtime.
  • Loss of typically at least a year of serviceable time on the storage array due to the year-long migration process.
  • Any costs associated with risk / downtime / migration issues.

A future without data migration?

All these kinds of painful issues are top-of-mind for admins when it comes to data migration. Getting rid of migrations is about getting rid of all these hard and soft costs. It also releases value by giving your IT people a big chunk of their lives back, and allowing them to focus on value-added projects – instead of another year of moving data around.

One of the attractive promises made for new software-defined storage (SDS) is its elimination of the dreaded process of data migration. So can SDS really remove data migration from the storage management equation?

Eliminating data migrations – with Dell EMC ScaleIO

With Dell EMC ScaleIO software-defined block storage you really can transform your approach. Much as VMware virtualized the compute operations of the datacenter 15 years ago, ScaleIO today abstracts and pools local storage resources in each server and automates the provisioning of capacity and performance back to applications on those servers.

Your shared pool of ScaleIO storage is built from distributed software components running on industry-standard x86 server hardware. This software-defined approach helps make data movement between storage nodes painless and automatic.

To extend the virtualization analogy, think about how an admin does a 'slow roll' when upgrading a compute cluster. There is no wholesale VM migration or replacement of the cluster – it's an ongoing process, performed a server at a time as needed, with no downtime or impact to the hosted applications. There are also specific tools in place (from the hypervisor) to execute and automate this as a seamless, transparent background process.

Guess what? ScaleIO achieves a very similar transformation for storage!

 

A seamless, perpetual process

Of course, data movement still happens within ScaleIO – but we like to think it’s nothing like ‘data migration’ as you’ve previously known it. And much of the time, you won’t even notice it’s happening.

The processes of balancing in a new node and evicting an old node are all seamless to your applications. When a new server node is added to the cluster, an automatic data rebalance occurs, spreading some data from each existing server node across to it. Similarly, when the storage admin decides to ultimately decommission a server node, it is removed by issuing a command that automatically redistributes its data across the rest of the servers in the cluster. This is all completely transparent to your applications.
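As a purely conceptual sketch – not ScaleIO's actual algorithm or tooling – the toy simulation below shows the idea: data chunks are spread evenly across whatever nodes are currently in the pool, so adding or removing a node simply changes where some chunks live.

    def rebalance(chunks, nodes):
        """Toy model: spread chunk IDs evenly across the current set of nodes."""
        placement = {node: [] for node in nodes}
        for i, chunk in enumerate(chunks):
            placement[nodes[i % len(nodes)]].append(chunk)
        return placement

    chunks = [f"chunk-{i:03d}" for i in range(12)]
    nodes = ["node-a", "node-b", "node-c"]
    print(rebalance(chunks, nodes))   # data spread across three nodes

    nodes.append("node-d")            # add a node: some chunks shift onto it
    print(rebalance(chunks, nodes))

    nodes.remove("node-a")            # decommission a node: its chunks are redistributed
    print(rebalance(chunks, nodes))

In ScaleIO itself this redistribution happens automatically in the background and, as described above, is completely transparent to your applications.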

With ScaleIO, maintaining and upgrading storage becomes a painless process – rather than the old-style painful epic project. This is essentially the model of storage management pioneered by web-scale companies like Facebook, Google, and Amazon. They had to overcome the challenges of huge data environments running mission-critical apps that cannot afford any downtime and where wholesale migrations are not feasible. They did so by moving from arrays to home-grown software-defined infrastructure running on top of many x86 servers. ScaleIO is an enterprise-grade, commercially supported, standards-based solution that follows similar operational principles.

It’s automatic, it’s programmatic – it’s ScaleIO greased lightning!

Organizations that are making the best use of software-defined technologies like ScaleIO tend to be highly automated in their IT. They are already using software automation tools for the deployment of servers, OSs, and applications. With ScaleIO, storage provisioning and deployment are automated in the same framework – programmatically.

You can simply decide when it’s time to expand your storage and do it in granular steps, one x86 server at a time. Similarly, when a server is old and no longer desired, you can decide to evict it from the cluster, one server at a time, on-demand. ScaleIO enables the concept of a perpetual rolling upgrade, so you maintain optimum efficiency and performance across all your storage – without the headaches of traditional data migration.
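To make 'programmatically' concrete, here is a minimal sketch of what REST-driven volume provisioning can look like from an automation framework. The gateway URL, endpoint path, and payload fields are hypothetical placeholders – consult the ScaleIO REST API documentation for the real interface.

    import requests

    # Hypothetical gateway address, credentials, and endpoint path.
    GATEWAY = "https://storage-gateway.example.local"
    session = requests.Session()
    session.auth = ("admin", "CHANGE_ME")

    def provision_volume(name, size_gb, pool_id):
        """Ask the (hypothetical) provisioning endpoint for a new block volume."""
        payload = {"name": name, "sizeGb": size_gb, "storagePoolId": pool_id}
        response = session.post(f"{GATEWAY}/api/volumes", json=payload, timeout=30)
        response.raise_for_status()
        return response.json()

    volume = provision_volume("app01-data", size_gb=512, pool_id="pool-1")
    print("Provisioned volume:", volume)

A call like this can then be wrapped in whatever configuration-management or orchestration tooling the organization already uses for servers and applications.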

With ScaleIO, it doesn’t really matter what specific kind of server is attached to the storage cluster – whether it’s VMware, Hyper-V, bare-metal Linux, or whatever. There’s no need to worry about operating any particular environment. ScaleIO takes a platform-agnostic, heterogeneous approach. It enables the same processes, regardless of your OS or applications.

Say goodbye to data migration challenges with ScaleIO

We think you might enjoy a world without data migration headaches. So why not explore the advantages of Dell EMC ScaleIO software-defined block storage for yourself?

Dell EMC ScaleIO software is available to download and try for free.

Learn more about Dell EMC ScaleIO advantages for your datacenter.

Dell & EMC Showcase Their Synergies with the All-Flash ScaleIO Ready Node

Jason Brown

Consultant Product Marketing Manager at Dell EMC

In case you missed it, on September 15th we announced the Dell EMC ScaleIO Ready Node. This announcement highlights months of collaboration between Dell and EMC to combine the best of both worlds – Dell PowerEdge servers and Dell EMC ScaleIO. The ScaleIO Ready Node brings All-Flash capabilities to software-defined storage, enabling customers to transform their data centers and making the path to the modern data center easier with Dell EMC.

There are tons of specs and details about the ScaleIO Ready Node which I won’t rehash here. You can check out the data sheet, spec sheet, and FAQ to get all the details you need. What I’d like to highlight are two key points regarding this announcement:

  1. Bringing best-of-breed EMC software-defined storage and Dell server hardware together
  2. Optimizing All-Flash with SDS

The first point is really important. There’s a reason why Dell spent 500 gazillion dollars on EMC… oh, it was only $67 billion you say? Peanuts then! But seriously, there are obviously a lot of synergies and opportunities between Dell and EMC, and the ScaleIO Ready Node is one of the first examples. Dell builds best-of-breed servers with its PowerEdge line. EMC is a leader in software-defined storage (SDS), with ScaleIO as its block storage offering. And guess what? ScaleIO runs on x86-based industry standard servers! Bringing PowerEdge servers and ScaleIO together was a no-brainer. It’s like peanut butter and jelly. Or mac and cheese. Or even “peas and carrots”*.

But it's not as simple or straightforward as you might think. A ton of thought and work went into the planning and R&D processes associated with the ScaleIO Ready Node. Yeah, we've loaded ScaleIO onto a variety of Dell PowerEdge servers. But it doesn't stop there. We've introduced a system that is pre-configured, pre-validated, and optimized for running ScaleIO. Plus, it comes with "one throat to choke" for procurement and support: Dell EMC.

I can't emphasize enough how important that is. When I talk to customers, they get SDS, they understand there can be significant 3-5-year TCO savings, and they absolutely love the performance, scalability, and flexibility of ScaleIO. But when the rubber meets the road, the majority of customers are not going to buy ScaleIO software and then procure <insert your favorite brand> servers from another vendor (but if they do, I hear Dell has good stuff). So we've simplified the process and enabled faster time-to-market by using Dell EMC's supply chain and services so customers can hit the ground running – while preserving the flexibility that is a huge differentiator for ScaleIO. See what two ScaleIO veterans have to say about this.

The second point is the formal introduction of ScaleIO into the All-Flash (AF) arena. Yeah I know, every product out there has AF capabilities, and yeah, flash is becoming commoditized, and yeah, you could run AF ScaleIO clusters before the ScaleIO Ready Node. Regardless, AF is the way of the future and one of the foundations of the modern data center. So, we’re combining two key foundations to transform your data center – All-Flash and Software-Defined Storage – into a single platform to make it much easier for customers to start their journey to the modern data center.

What's important about the AF ScaleIO Ready Node is how we optimize flash with SDS. ScaleIO's architecture is unique and is the key to unlocking the power of AF. All of the SSD drives within the All-Flash ScaleIO Ready Node work in parallel, eliminating bottlenecks. Each node participates in serving IOPS and there's no cache to get in the way – for reads or writes. The ability to take full advantage of the aggregate performance of all SSDs makes it possible for performance to scale linearly as you add more servers.
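As a quick, purely hypothetical illustration of that linear scaling (these are not benchmark figures):

    # Hypothetical per-node figure to illustrate linear scale-out, not a benchmark.
    iops_per_node = 100_000
    for node_count in (4, 8, 16):
        print(f"{node_count} nodes -> ~{node_count * iops_per_node:,} aggregate IOPS")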

Customers have the ability to migrate Tier 1 and Tier 2 application workloads with high-performance requirements to All-Flash ScaleIO Ready Nodes without missing a beat! Check out Dell EMC’s All-Flash page for more details, and if you want to see some guy talking about the All-Flash ScaleIO Ready Node, click here.

We're extremely excited about the release of the ScaleIO Ready Node. It's awesome that it's one of the first products released by Dell Technologies to highlight the synergies between Dell and EMC. With this collaboration, we're able to bring peace of mind to customers and provide unique product capabilities now and in the future. Please visit the Dell EMC ScaleIO Ready Node page to learn more! #GoBigWinBig

*If you’ve never seen Forrest Gump, go watch it. Now!

Software-defined scale-out NAS extends your Data Lake from core to edge: IsilonSD Edge NAS software

Sri Seshadri

Product Marketing at EMC Isilon

When people consider enterprise data growth, they often focus on the ‘core’ IT within the corporate headquarters and datacenter. But what’s happening further away from the core – at your remote offices and branch offices?

We all know that the amount of enterprise data requiring storage is doubling every 2–3 years (according to analyst IDC’s ‘Digital Universe’ study). Managing these ever-growing quantities of (mostly unstructured) data is a constant challenge for most enterprises.

At the enterprise core, EMC Isilon is already addressing that challenge. The Isilon data lake offering helps you consolidate your data, eliminate storage silos, simplify management, increase data protection, and gain value from your data assets. Isilon’s built-in multi-protocol capabilities support a wide range of traditional and next-gen applications – including data analytics that can be used to gain better insights to accelerate your business.

But data is also growing at enterprise edge locations. A recent ESG study (“Remote Office/Branch Office Technology Trends”, May 2015) showed that 68% of organizations now have an average of more than 10 TB of data stored at each branch office – while only 23% reported this amount of edge-stored data in 2011.


Free and Frictionless Downloads of EMC Software-defined Solutions Now Available

Sam Grocott

Senior Vice President, Marketing & Product Management at EMC ETD

You’ve been hearing a lot from EMC about our software-defined storage (SDS) offerings and how you can use them to build a more adaptive, agile storage infrastructure. But enough talk — it’s time you tried these products in your own environment and experienced those benefits firsthand. To that end, we’re offering Free and Frictionless downloads of these SDS products: ECS, ScaleIO and IsilonSD Edge.

Free and Frictionless means just what it says: downloads at no cost, easy installation on your preferred hardware, testing on your own terms and timeline – and then, once you're ready to go into production, a simple, direct route to purchase. It's our way of enabling you to see for yourself exactly how EMC software-defined storage can benefit you before you invest any budget in it.

These aren’t streamlined trial-only versions, either. They’re full-featured, enterprise-grade offerings that cover the following SDS products:

  • ECS: EMC's software-defined, elastic cloud storage platform for web, mobile, and cloud applications
  • ScaleIO: a software-only, server-based SAN with scale-out performance
  • IsilonSD Edge: SDS solutions for enterprise edge locations, including remote and branch offices

Just click on these ECS, ScaleIO and IsilonSD Edge links and start using the software for free right away. Installation is automated and takes about 15 minutes, and every product comes with automated configuration management, too. Try it, use it, share it within your organization, innovate with it — whatever you want to do is going to be easy and effortless, so go for it.

Only when you're ready to go into production do you pay for anything. And that's easy, too: just choose whether you want to continue with the software-only product or buy it as part of an EMC storage solution, and then visit the EMC eStore. You have ample opportunity to experiment with our products, and you won't have to spend any dollars until you're dead sure. At that point, you also get the combined weight and power of EMC's entire support infrastructure behind you.

But that’s later. Right now, go get your ECS, ScaleIO and IsilonSD Edge versions of the EMC software-defined storage product you want to try. This is where the rubber meets the road. And you don’t even have to talk to a tire salesman.

 

 

EMC’s Commitment to Everything Software-Defined

Varun Chhabra

Director of Product Marketing, Advanced Software Division at EMC

At EMC, our commitment to creating new solutions for software-defined storage is part of our much larger commitment to supporting the entire software-defined data center infrastructure, in which software, completely abstracted from hardware, enables more adaptive, agile operations. Within the software-defined data center, EMC's evolving suite of software-defined storage solutions plays an important role in addressing the explosive data growth – both in the volume and variety of data – that poses such a tremendous challenge today.

We've designed these solutions with features like elastic scale-out to incrementally add storage capacity, open APIs for programmatic flexibility, and support for analytics-in-place workloads. With software abstracted from hardware, customers can deploy these and other storage capabilities on the hardware of their choice rather than being locked into a narrow proprietary hardware platform – which means vendor flexibility, lower acquisition costs, and more efficient storage provisioning for lower TCO over the long term.

In recent years, EMC has been leading the way in introducing new software-defined storage platforms as well as working to transition our existing industry-leading storage solutions into the software-defined model. We entered the software-defined storage market in 2013 with ViPR Controller, which automates storage provisioning to reduce manual tasks and improve operational efficiency by up to 63%. It delivers storage-as-a-service to consumers, minimizing dependencies on the IT team. Since then, we've doubled down on our commitment to providing customers with a comprehensive software-defined storage portfolio.

We've launched ScaleIO, a server-based storage area network (SAN) that can linearly scale performance to thousands of nodes in a single federated cluster, with a wide variety of deployment options – available as software on commodity hardware, as an appliance (VxRack™ Node), and as VxRack converged infrastructure from VCE (VxRack Flex System). On the cloud/object storage front, we've launched Elastic Cloud Storage, or ECS, a software-defined cloud storage platform built specifically for web, mobile, and cloud applications and designed to run as a software-only solution on existing or commodity hardware. ECS scales effortlessly and provides benefits such as the superior economics and global access associated with the public cloud, while minimizing data residency and compliance risks. Both ScaleIO and ECS are available for consumption as appliances or as software-only solutions.

Moreover, our software-defined products integrate tightly with other EMC products. For example, our customers can use ScaleIO in conjunction with EMC XtremCache for flash cache auto-tiering to further accelerate application performance. And those who seek advanced protection and recovery for their data can pair ScaleIO with EMC RecoverPoint for replication and disaster recovery in ScaleIO environments.

We also made our EMC Isilon storage family, which has long provided industry-leading scale-out storage for unstructured data, available as a software-only solution. Available now, the software-defined EMC Isilon (IsilonSD Edge) provides the same ability to manage large and rapidly growing amounts of data in a highly scalable and easy-to-manage way, but with the added benefit of hardware flexibility. Customers can deploy IsilonSD Edge on commodity hardware to easily manage enterprise edge locations, including remote and branch offices, replicate the edge data to the core data center, and seamlessly tier to private or public clouds.

As our customers move into the new world of software-defined IT, EMC provides a solid base on which to build the scalable, flexible infrastructures that will transform your data centers to meet the future head-on. Our growing portfolio of software-defined storage solutions is a fundamental component of that base, providing a range of scale-out solutions to meet rapidly growing and changing data demands.

To keep up with more EMC SDS information and trends, visit: www.emc.com/sds

 
