TGen Cures Storage Needs with Dell EMC to Advance Precision Medicine

Sasha Paegle

Sr. Business Development Manager, Life Sciences

As the gap between theoretical treatment and clinical application in precision medicine continues to shrink, we're inching closer to a world in which doctors routinely use an individual's genome to prescribe a specific care strategy.

Organizations such as the Translational Genomics Research Institute (TGen), a leading biomedical research institute, are at the forefront of enabling a new generation of life-saving treatments. With innovations from TGen, breakthroughs in genetic sequencing are unraveling the mysteries of complex diseases like cancer.

To help achieve its goal of using -omics to prevent, diagnose and treat disease, the Phoenix-based non-profit research institute selected Dell EMC to enhance its IT systems and infrastructure to manage its petabyte-scale sequencing cluster.

Data Tsunami 

The time and cost of sequencing a single human genome have dropped dramatically since the Human Genome Project, which spanned 13 years and cost $1 billion. Today, sequencing can be completed in roughly one day for approximately $1,000. Furthermore, technological advances in sequencing and in IT have enabled TGen to increase the number of patients sequenced annually from the hundreds to the thousands. To handle the storage output from current sequencing technologies and emerging single molecule real-time (SMRT) sequencing, TGen required an infrastructure with the storage capacity and performance to support the big data repositories produced by genetic sequencing—even as they grow exponentially.

“When you get more sequencers that go faster and run cheaper, and the more people are being sequenced, you’re going to need more resources in order to process this tsunami of data,” said James Lowey, TGen’s CIO.

TGen stores vast amounts of data generated by precision medicine, such as genetic data and data from wearables including glucose monitors and pain management devices, as well as clinical records and population health statistics. Scientists must then correlate and analyze this information to develop a complete picture of an individual's illness and potential treatment. This involves TGen's sequencing cluster churning through one million CPU hours per month and calls for a storage solution that also maintains high availability, which is critical in an around-the-clock research environment.

Benefits for Researchers

In the coming years, researchers can expect the number of genetic sequences to keep increasing, with SMRT sequencing paving the way for even larger data volumes.

Lowey notes, “As genetic data continues to grow exponentially, it’s even more important to have an extremely reliable infrastructure to manage that data and make it accessible to the scientists 24/7.”

Having a robust storage infrastructure in place allows researchers to devote their time and attention fully to the core business of science without worrying about whether there is enough disk space or processing capacity. It also helps scientists get more precise treatments to patients faster, enabling breakthroughs that lead to life-saving and life-changing medical treatments – the ultimate goal of TGen and like-minded research institutes.

Looking Ahead

With the likelihood of sequencing clusters growing to exabyte-scale, TGen and its peers must continue to seek out an enterprise approach that emphasizes reliability and scalability and ensures high availability of critical data for 24/7 operations.

Lowey summarizes the future of precision medicine and IT by saying, “The possibilities are endless, but the real trick is to build all of that backend infrastructure to support it.”

To learn more about Dell EMC’s work with TGen, check out our video below.

 

Get first access to our Life Sciences Solutions

The Next Element for IT Service Providers in the Digital Age

Diana Gao

Senior Product Marketing Manager at EMC² ECS

Digital technology has disrupted large swaths of the economy and is generating huge amounts of data, with the average backup now hovering around a petabyte. Not all organizations can cope with this data deluge, and many look to service providers for storage and protection. Many service providers offer tape-based backup and archiving services, but despite their best efforts to innovate, data volumes always seem to grow faster, pushing the boundaries of tape capacity.

Today, companies of all sizes still use tape to store business information, but now more for cold storage than for data that needs to be accessed frequently. While tape is a low-cost, reliable option for data that is rarely accessed, maintaining multiple software versions and legacy infrastructure puts a burden on already taxed resources. These challenges come at a cost: software licenses, maintenance, and technical resources that could be spent on more important initiatives to drive business innovation. As a service provider, you need a secure and compliant data storage option that enables you to sell more value-added services.

As reported by TechTarget, a Storage magazine Purchasing Intentions survey showed that the trend away from tape continues – 76% of IT professionals see their use of tape as a backup format either declining or staying the same.

Some service providers are considering offering cloud-based backup-as-a-service, provided it raises no security concerns for their customers. Others are looking for a solution that combines the benefits of faster data access with the cost advantages of tape.

More than a few service providers have discovered an ideal solution that delivers all of these benefits: the Elastic Cloud Storage (ECS) object storage platform. As a highly scalable, multi-tenant, multi-protocol object storage system, ECS helps service providers better meet their service-level agreement (SLA) commitments to customers by offering highly resilient, reliable and low-cost storage services with enterprise-class security.

Iron Mountain® Incorporated (NYSE: IRM), a leading provider of storage and information management services, is one of them. In addition to its traditional tape-based storage-as-a-service, it partnered with Dell EMC to provide a cost-effective, scalable and modern Cloud Archive as part of its services portfolio. Designed to scale as the volume of data grows, with ECS as the backend storage platform, the Cloud Archive solution is ideal for organizations needing offsite, pay-as-you-use archival storage with near-infinite scalability.

“Our customers trust that we know where the data is by having those cloud-based solutions in our datacenters. It gives them a peace of mind where they know where their data is at rest,” said Eileen Sweeney, SVP Data Management at Iron Mountain.

Watch the video below to hear more about how Iron Mountain uses ECS to modernize its storage management services for 95% of Fortune 1000 companies. 

You’ll find the full rundown of the Iron Mountain Cloud Archive solution with ECS here.

Planning on getting away to Barcelona for Mobile World Congress (MWC) 2017? Stop by the VMware stand in Hall 3 (Stand 3K10) to meet with Dell EMC experts!

Unwrapping Machine Learning

Ashvin Naik

Cloud Infrastructure Marketing at Dell EMC

A recent IDC report, the Worldwide Cognitive Systems and Artificial Intelligence Spending Guide, throws out some fantastic numbers in terms of opportunity and growth: a CAGR of more than 50%, with vertical industries pouring billions of dollars into cognitive systems. One of the key components of cognitive systems is machine learning.

According to Wikipedia, machine learning is a subfield of computer science that gives computers the ability to learn without being explicitly programmed. These two pieces of information alone were enough to get me interested in the field.

After hours of daily searching and digging through inane babble and noise across the internet, an understanding of how machines can learn evaded me for weeks – until I hit the jackpot. A source that shall not be named pointed me to a “secure by obscurity” share that held exactly the insights on machine learning I was after. It was simple, elegant and made complete sense to me.

Machine learning is not all noise; it works on a very simple principle. Imagine there is a pattern in the world that can be used to forecast or predict the behavior of some entity. There is no mathematical notation available to describe the pattern, but if you have data that traces it, you can use machine learning to model it. This may sound like a whole lot of mumbo jumbo, so allow me to break it down in simple terms.

Machine learning can be used to understand patterns so you can forecast or predict anything provided

  • You are certain there is a pattern
  • You do not have a mathematical model to describe the pattern
  • You have the data to try to figure out the pattern.

Voilà – this already makes so much sense. If you have data and know there is a pattern but don't know what it is, you can use machine learning to find it. The applications are endless, from natural language processing and speech-to-text to predictive analytics. The most important is forecasting – something we do not give enough credit these days. And the most critical ingredient of machine learning is data: you have to have the data. If you do not have data, you cannot find the pattern.
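To make those three conditions concrete, here is a minimal, purely illustrative sketch in Python (the temperature/reading relationship and every number in it are invented for the example): we suspect a pattern, we have no formula for it, but we do have data, so we let a model learn the pattern and then forecast an unseen value.

```python
# Minimal illustration of the three conditions above (all data here is invented):
# we believe a pattern exists, we have no formula for it, but we do have data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
temperature = rng.uniform(10, 35, size=200).reshape(-1, 1)           # observed input
readings = 2.7 * temperature.ravel() + 5 + rng.normal(0, 1.5, 200)   # observed behavior

model = LinearRegression().fit(temperature, readings)    # let the machine learn the pattern
forecast = model.predict(np.array([[28.0]]))              # forecast behavior for an unseen input
print(f"Predicted reading at 28 degrees: {forecast[0]:.1f}")
```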

As a cloud storage professional, this is a huge insight. You need data – pristine, raw data coming from the systems that generate it, straight from the horse's mouth. I know exactly where my products fit in: we can ingest, store, protect and expose the data for any purpose, in its native format, complete with metadata, all through one system.
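As a rough sketch of what that ingest step can look like in practice, here is a hypothetical example against an S3-compatible object endpoint (the endpoint URL, bucket, object key and credentials are placeholders, not real values): the raw file is stored untouched, with descriptive metadata carried alongside it.

```python
# Hypothetical sketch of landing raw data in an S3-compatible object store.
# Endpoint, bucket, object key and credentials below are placeholders only.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.com",   # assumed S3-compatible endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

with open("can-bus.log", "rb") as payload:
    s3.put_object(
        Bucket="raw-telemetry",
        Key="vehicle-123/2017-02-20/can-bus.log",      # raw file kept in its native format
        Body=payload,
        Metadata={"source": "vehicle-123", "region": "eu-west", "schema": "can-v2"},
    )
```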

We have customers in the automobile industry using our multi-protocol cloud storage across 2,300 locations in Europe to capture data from cars on the road. They use proprietary machine learning systems to look for patterns in how their customers – the car owners – use their products in the real world, and to predict the parameters for designing better, more reliable and more efficient cars. We have customers in life sciences saving lives by studying patterns of efficacy and effective therapies for terminal diseases. Our customers in retail are using machine learning to detect fraud and protect their own customers. The list goes on and on.

I personally do not know the details of how they make it happen, but this is the world of the third platform. There are so many possibilities and opportunities ahead if only we have the data. Talk to us and we can help you capture, store and secure your data so you can transform humanity for the better.

 

Learn more about how Dell EMC Elastic Cloud Storage can fit into your Machine Learning Infrastructure

 

 

Goodbye Data Migration: ScaleIO Software-Defined Block Storage

Jason Brown

Consultant Product Marketing Manager at Dell EMC
Data migration is a big IT headache – but could software-defined storage be the remedy? Let’s see how Dell EMC ScaleIO software-defined block storage makes data migration blues a thing of the past.


Traditionally, data migration during a hardware ‘tech refresh’ is a costly and painful process that also puts your data at risk – but it’s unfortunately been a requirement with traditional SAN arrays on a 3-5 year repeating cycle.

There are three major categories of pain that need to be considered in regard to a traditional data migration:

  • Human impact
  • Technology challenges
  • Financial costs

The pain of data migration to your people

First consider the human impact of data migration. There is the stress caused to the storage admin or team – and to the owners of the applications and data hosted on the storage. A data migration is very disruptive and can typically be a year-long process from start to finish.

For any datacenter-class array, planning a data migration is a months-long exercise that involves going to every affected application owner, scheduling a window to migrate their application, and planning for potential downtime. The storage admin then needs to build a calendar with the schedule for each individual app migration. This negotiation can be painful and politically charged – and it puts a lot of stress on the admin and the organization.

Then, the execution of the migration takes place, usually over many months, with the migration team using nights and weekends to perform the migration during lowest impact hours. During the actual execution of the migration, the storage admin will face even more pain, such as:

  • 24/7 monitoring and daily reporting
  • Angst and stress of the migration process
  • Constant demands from application teams during the process
  • Time away from other job activities (and potentially family) due to the migration.

The technology challenges of data migration

Another key category of pain is the technology aspect of data migration. How will the migration be executed? Using a specialist migration tool or application? Conducted by professional services or by the storage team? What processes will be used to decrease risk and minimize downtime?

There are also big risks associated with data migration. What if you don't complete your data migration within the allotted time? What if the migration fails due to data corruption during the move? There is no 100% guarantee of success. Organizations that are particularly risk-averse will move all production off the system first – but this kind of pre-work adds even more cost.

The economic and financial costs of data migration

A study by Wikibon in 2014 estimated the cost of a storage migration project to be approximately 17% of the total cost of ownership for a storage array, including initial cost and maintenance. The typical cost of a data migration could therefore amount to hundreds of thousands of dollars for a datacenter-class storage array.
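As a quick worked example of what that 17% figure implies (the TCO number below is an assumption for illustration, not a figure from the study):

```python
# Hypothetical worked example of the Wikibon estimate above; the TCO figure
# is an illustrative assumption, not a number from the study.
tco = 2_000_000          # assumed five-year TCO of a datacenter-class array, in USD
migration_share = 0.17   # Wikibon: migration is ~17% of total cost of ownership
print(f"Estimated migration cost: ${tco * migration_share:,.0f}")   # -> $340,000
```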

The expense of moving to a new storage system will include not only the so-called ‘hard costs’:

  • Purchase and installation cost of the new storage array itself.
  • Cost of specialized migration software and services if using them.
  • Overtime costs for admins executing and monitoring the migration.

Additionally there are the ‘soft costs’ involved:

  • Economic impact of lost revenue from application downtime.
  • Loss of at least a year of serviceable time on the storage array due to the year-long migration process.
  • Any costs associated with risk / downtime / migration issues.

A future without data migration?

All these kinds of painful issues are top-of-mind for admins when it comes to data migration. Getting rid of migrations is about getting rid of all these hard and soft costs. It also releases value by giving your IT people a big chunk of their lives back, and allowing them to focus on value-added projects – instead of another year of moving data around.

One of the attractive promises made for new software-defined storage (SDS) is its elimination of the dreaded process of data migration. So can SDS really remove data migration from the storage management equation?

Eliminating data migrations – with Dell EMC ScaleIO

With Dell EMC ScaleIO software-defined block storage you really can transform your approach. In a similar way that VMware virtualized the compute operations of the datacenter 15 years ago, ScaleIO today abstracts and pools local storage resources in each server, and automates provisioning of capacity and performance back to applications on those servers.

Your shared pool of ScaleIO storage is physically disaggregated into distributed software components running on industry-standard x86 server hardware. This software-defined approach helps make data movement within the storage pool painless and automatic.

To extend the analogy with virtualization, think about how an admin does a ‘slow roll’ when upgrading a compute cluster. There is no wholesale VM migration or replacement of the cluster – it’s an ongoing perpetual process, performed a server at a time as needed, and requiring no downtime or impact to the hosted applications. There are also specific tools in place (from the hypervisor) to execute and automate this as a seamless and transparent background process.

Guess what? ScaleIO achieves a very similar transformation for storage!

 

A seamless, perpetual process

Of course, data movement still happens within ScaleIO – but we like to think it’s nothing like ‘data migration’ as you’ve previously known it. And much of the time, you won’t even notice it’s happening.

The processes of balancing in a new node and evicting an old node are all seamless to your applications. When a new server node is added to the cluster, an automatic data rebalance occurs, spreading some data from each existing server node across to it. Similarly, when the storage admin decides to ultimately decommission a server node, it is removed by issuing a command that automatically redistributes its data across the rest of the servers in the cluster. This is all completely transparent to your applications.
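A toy sketch can make the idea concrete. The snippet below is purely conceptual and is not ScaleIO's actual placement algorithm; it simply shows how a cluster-wide redistribution keeps every node carrying an even share of the data as nodes are added or evicted.

```python
# Purely conceptual sketch (not ScaleIO's actual algorithm): chunks are spread
# across the cluster, and adding or evicting a node redistributes them so that
# every remaining node still carries an even share.
from collections import Counter

def distribute(num_chunks, nodes):
    """Spread chunks across nodes round-robin so load stays balanced."""
    return Counter(nodes[i % len(nodes)] for i in range(num_chunks))

CHUNKS = 12_000
print("3 nodes:      ", dict(distribute(CHUNKS, ["node-1", "node-2", "node-3"])))
print("add node-4:   ", dict(distribute(CHUNKS, ["node-1", "node-2", "node-3", "node-4"])))
print("evict node-1: ", dict(distribute(CHUNKS, ["node-2", "node-3", "node-4"])))
```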

With ScaleIO, maintaining and upgrading storage becomes a painless process – rather than the old-style painful epic project. This is basically the model of storage management pioneered by web-scale companies like Facebook, Google, and Amazon. They had to overcome the challenges of their huge data environments, running mission critical apps that cannot afford any downtime, and where wholesale migrations are not feasible. They achieved this by moving from arrays to home-grown software-defined infrastructure running on top of many x86 servers. ScaleIO is an enterprise-grade, commercially supported, and standards-based solution that follows similar operational principles.

It’s automatic, it’s programmatic – it’s ScaleIO greased lightning!

Organizations that are making the best use of software-defined technologies like ScaleIO tend to be highly automated in their IT. They are already using software automation tools for deployment of servers, OSs and applications. With ScaleIO, storage provisioning and deployment is automated in the same framework – programmatically.

You can simply decide when it’s time to expand your storage and do it in granular steps, one x86 server at a time. Similarly, when a server is old and no longer desired, you can decide to evict it from the cluster, one server at a time, on-demand. ScaleIO enables the concept of a perpetual rolling upgrade, so you maintain optimum efficiency and performance across all your storage – without the headaches of traditional data migration.

With ScaleIO, it doesn’t really matter what specific kind of server is attached to the storage cluster – whether it’s VMware, Hyper-V, bare-metal Linux, or whatever. There’s no need to worry about operating any particular environment. ScaleIO takes a platform-agnostic, heterogeneous approach. It enables the same processes, regardless of your OS or applications.

Say goodbye to data migration challenges with ScaleIO

We think you might enjoy a world without data migration headaches. So why not explore the advantages of Dell EMC ScaleIO software-defined block storage for yourself?

Dell EMC ScaleIO software is available to download and try for free.

Learn more about Dell EMC ScaleIO advantages for your datacenter.

Digital Transformation with Radical Simplicity

Corey O'Connor

Senior Product Marketing Manager at Dell EMC² ETD

Welcome to another edition of the Emerging Technologies ECS blog series, where we discuss topics related to cloud storage and ECS (Elastic Cloud Storage), Dell EMC’s cloud-scale storage platform.

The Inflection Point

It's no surprise that unstructured data continues to grow exponentially year over year and shows no signs of slowing down anytime soon. Some organizations are left managing this data with traditional storage infrastructure, which is not only very expensive but also does not scale at the rate the data is growing. IT budgets remain flat or grow at an ungenerous rate of about 5% annually, while capital expenses tend to double almost every year for most organizations. The other pressing issue is the requirement to somehow maintain the same (if not better) level of service with fewer resources as data growth continues to strain storage infrastructure. This trend is not sustainable, and organizations that do not transform their business will struggle, without question. We know what you're thinking – wouldn't it be great if the world's largest provider of data storage systems created a cost-effective, cloud-scale solution to this enterprise-level challenge?

Dell EMC’s Elastic Cloud Storage (ECS)

Elastic Cloud Storage (ECS) is Dell EMC’s 3rd generation object-based storage system that provides the ability to:

  • Consolidate primary storage resources and elevate ROI
  • Modernize traditional and legacy applications for better storage utilization
  • Accelerate cloud native applications to deliver new business value

ECS delivers a multipurpose platform that satisfies a variety of use cases and plugs in perfectly to almost any existing Dell EMC investment. ECS single-handedly simplifies management, increases agility and, most importantly, lowers costs. At scale, ECS is undoubtedly one of the most cost-effective solutions on the market today. In fact, analyst firm Enterprise Strategy Group (ESG) recently conducted a survey showing that ECS provides a 60% or greater cost advantage compared to other leading public cloud providers.

ECS extends the cloud to primary storage and allows you to free up your infrastructure through Dell EMC cloud-enabled solutions (e.g. CloudPools, CloudBoost, CloudArray). Customers can seamlessly tier colder, inactive workloads from existing primary storage investments (e.g. Isilon, VMAX Series, VPLEX, VNX Series, Vx Series, Data Domain, Data Protection Suite) to ECS. This resource consolidation eliminates the need to purchase additional, more expensive platforms and makes better use of the infrastructure already in your storage environment.

An object-based platform like ECS can drastically increase responsiveness and better secure data compared with a traditional NAS system. Data is protected using erasure coding, and the chunks of data are then geo-distributed across all nodes in the system, providing instant read/write access from any location. Strong consistency semantics ensure that only the most recent copy of data is accessed, simplifying application development. A geo-caching capability further enhances responsiveness by intelligently recognizing access patterns, which minimizes WAN traffic and reduces latency.
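To give a feel for the erasure-coding idea, here is a deliberately simplified, single-parity sketch (ECS's actual coding scheme is more sophisticated than this): a lost data chunk is rebuilt from the surviving chunks plus a parity chunk, rather than from a full second copy of the data.

```python
# Deliberately simplified single-parity illustration of erasure coding.
# The real scheme is more sophisticated; the principle is the same:
# a lost chunk is rebuilt from the surviving chunks plus parity.
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

data_chunks = [b"chunkAAA", b"chunkBBB", b"chunkCCC"]
parity = reduce(xor, data_chunks)               # parity computed before distribution

survivors = [data_chunks[0], data_chunks[2]]    # suppose chunk 1 is lost
rebuilt = reduce(xor, survivors + [parity])     # XOR of the rest recovers it
assert rebuilt == data_chunks[1]
print("recovered:", rebuilt)
```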

ECS provides applications with simple, easy access to data through a single global namespace. Developers no longer have to deal with complex NFS file systems – they can focus on app development rather than the operational and implementation details behind it. By modernizing traditional applications onto an object store, users get fast and easy provisioning, direct access to content over the web via HTTP, global accessibility through a single namespace, and the best possible utilization of storage resources in the datacenter.

Cloud-native applications take full advantage of a cloud system framework. ECS's architecture is completely software-defined, with full abstraction both northbound and southbound, allowing compute and storage resources to scale independently of each other. Everything within ECS is containerized, and there are no hardware dependencies and no need to re-code, re-tool or reconfigure applications, since ECS provides multi-protocol support. This lets developers innovate and deliver their applications to market much faster.

Bridging the Gap

Enterprises and cloud service providers alike can leverage ECS as a way to fund their 'digital transformation' as traditional, line-of-business applications go into decline and cloud-native apps surge over the next decade. ECS bridges the gap between Platform 2 (traditional) and Platform 3 (next-gen) applications on a single storage system. Not only can ECS easily handle the extraordinary growth in unstructured data, but as a multi-purpose platform it can serve the many different workloads you manage today and ready your organization for whatever the future throws at you.

Why Healthcare IT Should Abandon Data Storage Islands and Take the Plunge into Data Lakes

One of the most significant technology-related challenges in the modern era is managing data growth. As healthcare organizations leverage new data-generating technology, and as medical record retention requirements evolve, the exponential rise in data (already growing at 48 percent each year according to the Dell EMC Digital Universe Study) could span decades.
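To see what 48 percent annual growth means in practice, here is a quick back-of-the-envelope projection (the 1 PB starting point is an assumption for illustration only): data roughly doubles every 21 months and grows about fifty-fold over a decade.

```python
# Back-of-the-envelope projection of 48% annual data growth.
# The 1 PB starting point is an illustrative assumption only.
capacity_pb = 1.0
for year in range(1, 11):
    capacity_pb *= 1.48
    print(f"year {year:2d}: {capacity_pb:6.1f} PB")
# At 48% per year, data roughly doubles every ~21 months and grows ~50x in ten years.
```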

Let’s start by first examining the factors contributing to the healthcare data deluge:

  • Longer legal retention times for medical records – in some cases up to the lifetime of the patient.
  • Digitization of healthcare and new digitized diagnostics workflows such as digital pathology, clinical next-generation sequencing, digital breast tomosynthesis, surgical documentation and sleep study videos.
  • With more digital images to store and manage, there is also an increased need for bigger picture archive and communication system (PACS) or vendor-neutral archive (VNA) deployments.
  • Finally, more people are having these digitized medical tests (especially given the large aging population), resulting in a higher number of yearly studies with larger data sizes.

Healthcare organizations also face frequent and complex storage migrations, rising operational costs, storage inefficiencies, limited scalability, increasing management complexity and storage tiering issues caused by storage silo sprawl.

Another challenge is the growing demand to understand and utilize unstructured clinical data. Mining this data requires a storage infrastructure that supports in-place analytics – essential for better patient insights and for the evolution of healthcare toward precision medicine.

Isolated Islands Aren’t Always Idyllic When It Comes to Data

The way that healthcare IT has approached data storage infrastructure historically hasn’t been ideal to begin with, and it certainly doesn’t set up healthcare organizations for success in the future.

Traditionally, when adding new digital diagnostic tools, healthcare organizations provided a dedicated storage infrastructure for each application or diagnostic discipline. For example, to deal with the growing storage requirements of digitized X-rays, an organization might create a new storage system solely for the radiology department. As a result, isolated storage silos, or data islands, must be individually managed, making processes and infrastructure complicated and expensive to operate and scale.

Isolated silos further undermine IT goals by increasing the cost of data management and compounding the complexity of analytics, which may require large amounts of data to be copied into yet another dedicated storage infrastructure that can't be shared with other workflows. Even the process of creating these silos is involved and expensive, because tech refreshes require migrating medical data to new storage. Each migration, typically performed every three to five years, is labor-intensive and complicated. Frequent migrations not only strain resources, but take IT staff away from projects aimed at modernizing the organization, improving patient care and increasing revenue.

Further, silos make it difficult for healthcare providers to search data and analyze information, preventing them from gaining the insights they need for better patient care. Healthcare providers are also looking to tap potentially important medical data from Internet-connected medical devices or personal technologies such as wireless activity trackers. If healthcare organizations are to remain successful in a highly regulated and increasingly competitive, consolidated and patient-centered market, they need a simplified, scalable data management strategy.

Simplify and Consolidate Healthcare Data Management with Data Lakes

The key to modern healthcare data management is to employ a strategy that simplifies storage infrastructure and storage management and supports multiple current and future workflows simultaneously. A Dell EMC healthcare data lake, for example, leverages scale-out storage to house data for clinical and non-clinical workloads across departmental boundaries. Such healthcare data lakes reduce the number of storage silos a hospital uses and eliminate the need for data migrations. This type of storage scales on the fly without downtime, addressing IT scalability and performance issues and providing native file and next-generation access methods.

Healthcare data lake storage can also:

  • Eliminate storage inefficiencies and reduce costs by automatically moving data that can be archived to denser, more cost-effective storage tiers.
  • Allow healthcare IT to expand into private, hybrid or public clouds, enabling IT to leverage cloud economies by creating storage pools for object storage.
  • Offer long-term data retention without the security risks of the public cloud or giving up data sovereignty; the same cloud expansion can be used for next-generation use cases such as healthcare IoT.
  • Enable precision medicine and better patient insights by fostering advanced analytics across all unstructured data, such as digitized pathology, radiology, cardiology and genomics data.
  • Reduce data management costs and complexities through automation, and scale capacity and performance on demand without downtime.
  • Eliminate storage migration projects.

 

The greatest technical challenge facing today's healthcare organizations is effectively leveraging and managing data. However, by employing a healthcare data management strategy that replaces siloed storage with a Dell EMC healthcare data lake, they will be better prepared to meet the requirements of today's and tomorrow's next-generation infrastructure and to usher in advanced analytics and new storage access methods.

 

Get your fill of news, resources and videos on the Dell EMC Emerging Technologies Healthcare Resource Page

 

 
