Buckets, Apps & Digital Exhaust…All in a Day’s Work For a Dell EMC Splunk Ninja

Cory Minton

Principal Systems Engineer at EMC

Grab your hoodies, your witty black t-shirts, and maybe your capes…it’s time for another exciting Splunk .conf, the annual Splunk User Conference taking place this week at the Walt Disney Swan and Dolphin Resort. All of us at Dell EMC are excited to be sponsoring .conf for the third year in a row, and this year our presence will be bigger and better than ever. Dell EMC is hosting two technical sessions, more than 20 Dell EMC Splunk Ninjas will be running around learning, and we’ll have a large booth in the partner pavilion demonstrating our technology solutions. For all the details, check out our .conf16 site.

This year marks the beginning of a great relationship between two awesome tech businesses: Dell EMC and Splunk.  We joined forces through a formal strategic alliance that started in February.  This alliance enables Dell EMC and its partners to sell Splunk’s industry-leading platform, and it gives Dell EMC unique access to Splunk technical resources for solution design, testing, and validation.  Most importantly, it creates a framework for these two technology powerhouses to collaborate more effectively for customer success.

Why Dell EMC for Splunk?


When we talk about customer success, we mean it in two distinct ways: deploying Splunk on Dell EMC platforms, and using Splunk to derive value from Dell EMC infrastructure.

First, we believe success means deploying Splunk on a flexible infrastructure that not only helps Splunk run fast and efficiently, but also scales easily as the use of Splunk evolves in a customer organization.  We believe that converged and hyper-converged technologies, powered by Dell EMC’s robust portfolio of storage technologies, deliver on this vision and provide additional enterprise capabilities:

  • Cost-effective & optimized storage – Dell EMC delivers optimized and efficient storage by aligning the right storage to the long retention and varying performance requirements of Splunk’s hot, warm, and cold data.
  • Flexible, scale-out capacity consumption model – Scale compute and capacity independently, or as a single converged platform, as your data grows.
  • Powerful enterprise data services – including secure encryption, compression and deduplication of indexes, and fast, efficient zero-overhead copies for protection.
  • Bottomless cold bucket – Scale-out storage platforms, whether on premises or in the cloud, obviate the need for a frozen bucket by providing a PB-scale cold bucket, simplifying data management and keeping data always searchable.


 

Splunk and Dell EMC engineering teams have engaged in a strategic collaboration to ensure that all Dell EMC platforms have been validated by Splunk to “meet or exceed Splunk’s published reference server hardware” guidelines.  The Splunk team takes this validation process very seriously, so customers considering infrastructure for a Splunk deployment can rest assured that extensive testing has been done.  Whether you are looking at hyper-converged solutions like VxRail or VxRack, converged solutions like Vblock systems, or Dell EMC storage such as ScaleIO, XtremIO, VNX, Unity, Isilon, or ECS, you can be confident that both Splunk and Dell EMC have done the work to make sure it runs well.

Secondly, we believe Splunk is an incredibly powerful platform for capturing and deriving value from machine data.  As it turns out, Dell EMC products spin off a massive amount of “digital exhaust” that can be captured easily and used to drive operational intelligence in IT.  Dell EMC has invested heavily over the last few years in building apps for our platforms and making them available on Splunkbase for free.  We’ve built apps for XtremIO, Isilon, and VNX, with more in the works.  These apps make it simple to ingest data from Dell EMC platforms, and they include useful, pre-built reports and dashboards that make monitoring these assets easy.  And it doesn’t stop there: once the data from your Dell EMC platforms is in Splunk, the searches powering our reports, or the indexes themselves, can be used in investigations across the entire IT service stack.  One of my favorite things to hear from our customers is the exciting ways they use the apps beyond simple reporting, and I hope to hear many more stories this year at .conf2016.
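To give a flavor of that reuse, here is a minimal sketch using the Splunk SDK for Python (splunk-sdk). The host, index, and sourcetype names are hypothetical stand-ins for whatever your Dell EMC app is configured to populate, not the apps' actual field names.

```python
# A minimal sketch with the Splunk SDK for Python (splunk-sdk).
# Index, sourcetype, and field names below are hypothetical placeholders.
import splunklib.client as client
import splunklib.results as results

service = client.connect(
    host="splunk.example.com",   # assumption: your search head
    port=8089,
    username="admin",
    password="changeme",
)

# Reuse the kind of search that powers an app dashboard in an ad hoc investigation.
job = service.jobs.create(
    'search index=dellemc_storage sourcetype=xtremio:perf '
    '| timechart avg(latency_ms) by array_name',
    exec_mode="blocking",
)

for row in results.ResultsReader(job.results()):
    print(row)
```

The same search could be joined with application or OS indexes to correlate storage latency with service-level symptoms across the IT stack.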

Dell EMC Splunk Ninjas And Our Top Ten List

 

The Dell EMC Splunk Ninja team is at the show in their Dell EMC blue Ninja Hunt shirts. The Ninjas are a group of more than 40 systems engineers from across Dell EMC who have been trained the same way Splunk trains its own systems engineers.  They hold certifications ranging from SE1 all the way to SE3, with skills not only in using Splunk, but in administering and architecting it at scale.  This global team is not only available to talk to you at .conf, but also available in the field for direct conversations when you head back to the office.

 

 

Our wise, passionate, and zany Ninja team recently pulled together a list of Top Ten Best Practices for Splunk on Dell EMC, amassed from years of lab testing and real-world customer experience. You may say ‘Duh’ to some, but others may surprise you.

Happy Splunking!

Metalnx: Making iRODS Easy

Stephen Worth

Stephen Worth is a director of Global Innovation Operations at Dell EMC. He manages development and university research projects in Brazil, serves as a technical liaison helping to improve innovation across our global engineering labs, and works in digital asset management leveraging user-defined metadata. Steve is based out of Dell EMC’s RTP Software Development Center, which focuses on data protection, core storage products, and cloud storage virtualization. Steve started with Data General in 1985; Data General was acquired by EMC in 1999, and EMC by Dell Technologies in 2016. He has led many product development efforts involving operating systems, diagnostics, UI, database, and applications porting. His background includes vendor and program management, performance engineering, engineering services, manufacturing, and test engineering. Steve, an alumnus of North Carolina State University, received a B.S. degree in Chemistry in 1981 and an M.S. degree in Computer Studies in 1985. He served as an adjunct faculty member of the Computer Science department from 1987 to 1999. He is an emeritus member of the Computer Science Department’s Strategic Advisory Board and currently chairs the Technical Advisory Board for the James B. Hunt Jr. Library on Centennial Campus.



Advances in sequencing, spectroscopy, and microscopy are driving life sciences organizations to produce vast amounts of data. Most organizations are dedicating significant resources to the storage and management of that data. However, until recently, their primary efforts have focused on how to host the data for high performance, rapid analysis, and moving it to more economical disks for longer-term storage.

The nature of life sciences work demands better data organization. The data produced by today’s next-generation lab equipment is rich in information, making it of interest to different research groups and individuals at varying points in time. Examples include:

  • Raw experimental and analyzed data may be needed as new drug candidates move through research and development, clinical trials, FDA approval, and production
  • A team interested in new indications for an existing chemical compound would want to leverage work already done by others in the organization on the compound in the past
  • In the realm of personalized medicine, clinicians may need to evaluate not only a person’s health history, but correlate that information with genome sequences and phenotype data throughout the individual’s life.

The great challenge is how to make data more generally available and useful throughout an organization. Researchers need to know what data exists and have a way to access it. For this to happen, data must be properly categorized, searchable, and easy to find.

To get help in this area, many research organizations and government agencies worldwide are using the Integrated Rule-Oriented Data System (iRODS), which is open source data management software developed by the iRODS Consortium. iRODS enables data discovery using a data/metadata catalog that can retain machine and user-defined metadata describing every file, collection, and object in a data grid.

Additionally, iRODS automates data workflows with a rule engine that permits any action to be initiated by any trigger on any server or client in the grid. iRODS enables secure collaboration, so users only need to log in to their home grid to access data hosted on a remote, federated grid.

Leveraging iRODS becomes simpler, and its benefits greater, when it is used with Metalnx, an administrative and metadata management user interface (UI) for iRODS. Metalnx was developed by Dell EMC through its efforts as a corporate member of the iRODS Consortium. The intuitive Metalnx UI helps both the IT administrators charged with managing metadata and the end users and researchers who need to find and access relevant data based on metadata descriptions.

Making use of metadata via Metalnx’s easy-to-use UI for iRODS can help you (see the sketch after this list):

  • Maximize storage assets
  • Find what’s valuable, no matter where the data is located
  • Automate movement and processing of data
  • Securely share data with collaborators

Real world example: Putting the issues into perspective

A simple example illustrates why iRODS and Metalnx are needed. Plant & Food Research, a New Zealand-based science company providing research and development that adds value to fruit, vegetable, crop and food products, makes great use of next-generation sequencing and genotyping. The work generates a lot of mixed data types.

“In the past, we were good at storing data, but not good at categorizing the data or using metadata,” said Ben Warren, bioinformatician at Plant & Food Research. “We tried to get ahead of this by looking at what other institutions were doing.”

iRODS seemed a good fit. It was the only decent open source solution available. However, there were some limitations. “We were okay with the rule engine, but not the interface,” said Warren.

A system administrator working with EMC on hardware for the organization’s compute cluster had heard of Metalnx and mentioned this to Warren. “We were impressed off the bat with its ease of use,” said Warren. “Not only would it be useful for bioinformaticians, coders, and statisticians, but also for the scientists.”

The reason: Metalnx makes it easier to categorize the organization’s data, to control the metadata used to categorize the data, and to use the metadata to find and access any data.

Benefits abound

At Plant & Food Research, metadata is an essential element of a scientist’s workflow. The metadata makes it easier to find data at any stage of a research project. When a project is conceived, scientists start by determining all the metadata required for the project using Metalnx and cataloging the data using iRODS. With this approach, everything associated with a project, including the samples used, sample descriptions, experimental design, NGS data, and other information, is searchable.
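Continuing the hypothetical sketch from earlier, a later project could locate that data by querying the iRODS catalog on those same attributes, regardless of which collection the files live in.

```python
# Find every data object tagged with a given organism.
# Connection details and attribute names are hypothetical, as above.
from irods.session import iRODSSession
from irods.models import Collection, DataObject, DataObjectMeta
from irods.column import Criterion

with iRODSSession(host="irods.example.org", port=1247,
                  user="researcher", password="secret",
                  zone="labZone") as session:
    query = (session.query(Collection.name, DataObject.name)
             .filter(Criterion("=", DataObjectMeta.name, "organism"))
             .filter(Criterion("=", DataObjectMeta.value, "Malus domestica")))

    for row in query:
        print(f"{row[Collection.name]}/{row[DataObject.name]}")
```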

One immediate benefit is that someone undertaking a new project can quickly determine whether similar work has already been done. This is increasingly important in life science organizations as research becomes more multidisciplinary.

Furthermore, the more an organization knows about its data, the more valuable the data becomes. Researchers can connect with other work done across the organization. Being able to find the right raw data of a past effort means an experiment does not have to be redone. This saves time and resources.

Warren notes that there are other organizational benefits to using iRODS and Metalnx. When it comes to collaborating with others, the data is simply easier to share. Scientists can put the data in any format, and it is easier to publish.

Learn more

Metalnx is available as an open source tool. It can be found at Dell EMC Code (www.codedellemc.com) or on GitHub at www.github.com/Metalnx. EMC has also made binary versions available on Bintray at www.bintray.com/metalnx, and a Docker image is posted on Docker Hub at https://hub.docker.com/r/metalnx/metalnx-web/.

A broader discussion of the use of Metalnx and iRODS in the life sciences can be found in an on-demand video of a recent web seminar “Expanding the Face of Meta Data in Next Generation Sequencing.” The video can be viewed on the EMC Emerging Tech Solutions site.

 

 

As New Business Models Emerge, Enterprises Increasingly Seek to Leave the World of Siloed Data

Keith Manthey

CTO of Analytics at EMC Emerging Technologies Division

As Bob Dylan famously wrote back in 1964, the times, they are a changin’. And while Dylan probably wasn’t speaking about the Fortune 500’s shifting business models and their impact on enterprise storage infrastructure (as far as we know), his words hold true in this context.

Many of the world’s largest companies are attempting to reinvent themselves by abandoning their product- or manufacturing-focused business models in favor of a more service-oriented approach. Look at industrial giants such as GE, Caterpillar, or Procter & Gamble, and consider how they leverage existing data about products (in the case of GE, say it’s a power plant) and apply it to a service model (say for utilities, in this example).

The evolution of a product-focused model into a service-oriented one can offer more value (and revenue) over time, but it also requires a more sophisticated analytic model and a holistic approach to data, a marked difference from the siloed way data has traditionally been managed.

Transformation

Financial services is another example of an industry undergoing a transformation from a data storage perspective. Here you have a complex business with lots of traditionally siloed data, split between commercial, consumer, and credit groups. But increasingly, banks and credit unions want a more holistic view of their business in order to better understand how various divisions or teams could work together in new ways. Enabling consumer credit and residential mortgage units to securely share data could allow them to build better risk score models across loans, for example, ultimately allowing a financial institution to provide better customer service and expand its product mix.

Early days of Hadoop: compromise was the norm

As with any revolution, it’s the small steps that matter most at first. Enterprises have traditionally started small when it comes to holistically governing their data and managing workflows with Hadoop. In the earlier days of Hadoop, say five to seven years ago, enterprises accepted potential compromises around data availability and efficiency, as well as around how workflows could be governed and managed. Operational issues could arise, making it difficult to keep things running one to three years down the road. Security and availability were often best effort; there wasn’t an expectation of five-nines reliability.

Data was secured by isolating it on an island of its own. The idea was to scale up as necessary and build a cluster for each additional department or use case. Individual groups or departments ran what they needed, and there wasn’t much integration with existing analytics environments.

With Hadoop’s broader acceptance, new business models can emerge

However, last year, with Hadoop’s 10-year anniversary, we started to see broader acceptance of the platform, and as a result it is becoming both easier and more practical to consolidate data company-wide. What’s changed is the realization that Hadoop has proven itself and is not a science experiment. The number of Hadoop environments has grown, and users are realizing there is real power in combining data from different parts of the business and real business value in keeping historical data.

At best, the model of building different islands and running them independently is impractical; at worst it is potentially paralyzing for businesses. Consolidating data and workflows allows enterprises to focus on and implement better security, availability and reliability company-wide. In turn, they are also transforming their business models and expanding into new markets and offerings that weren’t possible even five years ago.

Analyst firm IDC evaluates EMC Isilon: Lab-validation of scale-out NAS file storage for your enterprise Data Lake

Suresh Sathyamurthy

Sr. Director, Product Marketing & Communications at EMC

A Data Lake should now be a part of every big data workflow in your enterprise organization. By consolidating file storage for multiple workloads onto a single shared platform based on scale-out NAS, you can reduce costs and complexity in your IT environment, and make your big data efficient, agile and scalable.

That’s the expert opinion in analyst firm IDC’s recent Lab Validation Brief: “EMC Isilon Scale-Out Data Lake Foundation: Essential Capabilities for Building Big Data Infrastructure”, March 2016.  As the lab validation report concludes: “IDC believes that EMC Isilon is indeed an easy-to-operate, highly scalable and efficient Enterprise Data Lake Platform.”

The Data Lake Maximizes Information Value

The Data Lake model of storage represents a paradigm shift from the traditional linear enterprise data flow model. As data and the insights gleaned from it increase in value, enterprise-wide consolidated storage is transformed into a hub around which the ingestion and consumption systems work. This enables enterprises to bring analytics to data in-place – and avoid expensive costs of multiple storage systems, and time for repeated ingestion and analysis.

But pouring all your data into a single shared Data Lake would put serious strain on traditional storage systems – even without the added challenges of data growth. That’s where the virtually limitless scalability of EMC Isilon scale-out NAS file storage makes all the difference…

The EMC Data Lake Difference

The EMC Isilon Scale-out Data Lake is an Enterprise Data Lake Platform (EDLP) based on Isilon scale-out NAS file storage and the OneFS distributed file system.

As well as meeting the growing storage needs of your modern datacenter with massive capacity, it enables big data accessibility using traditional and next-generation access methods – helping you manage data growth and gain business value through analytics. You can also enjoy seamless replication of data from the enterprise edge to your core datacenter, and tier inactive data to a public or private cloud.

We recently reached out to analyst firm IDC to lab-test our Isilon Data Lake solutions – here’s what they found in 4 key areas…

  1. Multi-Protocol Data Ingest Capabilities and Performance

Isilon is an ideal platform for enterprise-wide data storage, and provides a powerful centralized storage repository for analytics. With the multi-protocol capabilities of OneFS, you can ingest data via NFS, SMB and HDFS. This makes the Isilon Data Lake an ideal and user-friendly platform for big data workflows, where you need to ingest data quickly and reliably via protocols most suited to the workloads generating the information. Using native protocols enables in-place analytics, without the need for data migration, helping your business gain more rapid data insights.
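To make the multi-protocol idea concrete, here is a hedged illustration: a file landed over an NFS mount of the cluster is read back over HDFS (via WebHDFS, using the `hdfs` Python package) without any copy or ingest step. The mount point, SmartConnect hostname, port, and paths are assumptions for your environment, not a reference configuration.

```python
# Illustrative only: write via an NFS mount, read the same data via WebHDFS.
# Hostname, port, mount point, and paths are assumptions.
from hdfs import InsecureClient   # pip install hdfs (WebHDFS client)

NFS_MOUNT = "/mnt/isilon/landing"

# 1. An instrument or ETL job drops a file over NFS/SMB.
with open(f"{NFS_MOUNT}/run42/results.csv", "w") as f:
    f.write("sample,score\nA1,0.97\n")

# 2. A Hadoop-side process reads the same bytes over HDFS, in place.
client = InsecureClient("http://isilon-smartconnect.example.com:8082", user="analyst")
with client.read("/landing/run42/results.csv") as reader:
    print(reader.read().decode())
```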


IDC validated that the Isilon Data Lake offers excellent read and write performance for Hadoop clusters accessing HDFS via OneFS, compared with direct-attached storage (DAS). In the lab tests, Isilon performed:

  • nearly 3x faster for data writes
  • over 1.5x faster for reads and read/writes.

As IDC says in its validation: “An Enterprise Data Lake platform should provide vastly improved Hadoop workload performance over a standard DAS configuration.”

  2. High Availability and Resilience

Policy-based high availability capabilities are needed for enterprise adoption of Data Lakes. The Isilon Data Lake is able to cope with multiple simultaneous component failures without interruption of service. If a drive or other component fails, it only has to recover the specific affected data (rather than recovering the entire volume).

IDC validated that a disk failure on a single Isilon node has no noticeable performance impact on the cluster. Replacing a failed drive is a seamless process and requires little administrative effort. (This is in contrast to traditional DAS, where the process of replacing a drive can be rather involved and time consuming.)

Isilon can even cope easily with node-level failures. IDC validated that a single-node failure has no noticeable performance impact on the Isilon cluster. Furthermore, the operation of removing a node from the cluster, or adding a node to the cluster, is a seamless process.

  3. Multi-tenant Data Security and Compliance

Strong multi-tenant data security and compliance features are essential for an enterprise-grade Data Lake. Access zones are a crucial part of the multi-tenancy capabilities of Isilon OneFS. In tests, IDC found that Isilon provides no-crossover isolation between Hadoop instances for multi-tenancy.

Another core component of secure multi-tenancy is the ability to provide a secure authentication and authorization mechanism for local and directory-based users and groups. IDC validated that the Isilon Data Lake provides multiple federated authentication and authorization schemes. User-level permissions are preserved across protocols, including NFS, SMB and HDFS.

Federated security is an essential attribute of an Enterprise Data Lake Platform, with the ability to maintain confidentiality and integrity of data irrespective of the protocols used. For this reason, another key security feature of the OneFS platform is SmartLock – specifically designed for deploying secure and compliant (SEC Rule 17a-4) Enterprise Data Lake Platforms.

In tests, IDC found that Isilon enables a federated security fabric for the Data Lake, with enterprise-grade governance, regulatory and compliance (GRC) features.

  4. Simplified Operations and Automated Storage Tiering

The Storage Pools feature of Isilon OneFS allows administrators to apply common file policies across the cluster locally – and extend them to the cloud.

Storage Pools consists of three components:

  • SmartPools: Data tiering within the cluster – essential for moving data between performance-optimized and capacity-optimized cluster nodes.
  • CloudPools: Data tiering between the cluster and the cloud – essential for implementing a hybrid cloud, and placing archive data on a low-cost cloud tier.
  • File Pool Policies: Policy engine for data management locally and externally – essential for automating data movement within the cluster and the cloud.

As IDC confirmed in testing, Isilon’s federated data tiering enables IT administrators to optimize their infrastructure by automating data placement onto the right storage tiers.
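To make the policy idea concrete, here is a purely illustrative Python sketch of rule-based tiering: ordered rules are evaluated against file attributes and map matches to a target tier. This is not OneFS's policy engine or its CLI syntax, just the concept behind file pool policies.

```python
# Purely illustrative sketch of policy-based tiering; not OneFS code.
import os
import time

RULES = [
    # (predicate over path and stat result, target tier)
    (lambda p, st: time.time() - st.st_atime > 90 * 86400, "cloud-archive"),    # cold data -> cloud tier
    (lambda p, st: p.endswith((".tmp", ".log")), "capacity-nodes"),             # low-value data
    (lambda p, st: True, "performance-nodes"),                                  # default
]

def target_tier(path: str) -> str:
    st = os.stat(path)
    for predicate, tier in RULES:
        if predicate(path, st):
            return tier
    return "performance-nodes"
```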

The expert verdict on the Isilon Data Lake

IDC concludes that: “EMC Isilon possesses the necessary attributes such as multi-protocol access, availability and security to provide the foundations to build an enterprise-grade Big Data Lake for most big data Hadoop workloads.”

Read the full IDC Lab Validation Brief for yourself: “EMC Isilon Scale-Out Data Lake Foundation: Essential Capabilities for Building Big Data Infrastructure”, March 2016.

Learn more about building your Data Lake with EMC Isilon.

The Democratization of Data Science with the Arrival of Apache Spark

Keith Manthey

CTO of Analytics at EMC Emerging Technologies Division

As an emerging field, data science has seen rapid growth over the span of just a few short years. With Harvard Business Review referring to the data scientist role as the “sexiest job of the 21st century” in 2012 and job postings for the role growing 57 percent in the first quarter of 2015, enterprises are increasingly seeking out talent to help bolster their organizations’ understanding of their most valuable assets: their data.

The growing demand for data scientists reflects a larger business trend – a shifting emphasis from the zeros and ones to the people who help manage the mounds of data on a daily basis. Enterprises are sitting on a wealth of information but are struggling to derive actionable insights from it, in part due to its sheer volume but also because they don’t have the right talent on board to help.

The problem enterprises now face isn’t capturing data – but finding and retaining top talent to help make sense of it in meaningful ways. Luckily, there’s a new technology on the horizon that can help democratize data science and increase accessibility to the insights it unearths.

Data Science Scarcity & Competition

The talent pool for data scientists is notoriously scarce. According to McKinsey & Company, by 2018 the United States alone may face a 50 to 60 percent gap between supply and demand for “deep analytic talent, i.e., people with advanced training in statistics or machine learning.” Data scientists possess an essential blend of business acumen, statistical knowledge, and technological prowess, rendering them as difficult to train as they are invaluable to the modern enterprise.

Moreover, banks and insurance companies face an added struggle in hiring top analytics talent, with the allure of Silicon Valley beckoning top performers away from organizations perceived as less inclined to innovate. This perception issue hinders banks’ and insurance companies’ ability to remain competitive in hiring and retaining data scientists.

As automation and machine learning grow increasingly sophisticated, however, there’s an opportunity for banks and insurance companies to harness the power of data science, without hiring formally trained data scientists. One such technology that embodies these innovations in automation is Apache Spark, which is poised to shift the paradigm of data science, allowing more and more enterprises to tap into insights culled from their own data.

Spark Disrupts & Democratizes Data Science

Data science requires three pillars of knowledge: statistical analysis, business intelligence, and technological expertise. Spark does the technological heavy lifting by understanding and processing data at a scale most people aren’t comfortable working with. It handles the distribution and categorization of the data, removing the burden from individuals and automating the process. By allowing enterprises to load data into clusters and query it on an ongoing basis, the platform is particularly adept at machine learning and automation, a crucial component in any system intended to analyze mass quantities of data.
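As a rough illustration, here is a minimal PySpark sketch: load data into the cluster once, query it with SQL, and fit one of MLlib's pre-built models. The file path, column names, and label are placeholders, not a specific customer workload.

```python
# Minimal PySpark sketch; path and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("claims-risk").getOrCreate()

df = spark.read.csv("hdfs:///data/claims.csv", header=True, inferSchema=True)
df.createOrReplaceTempView("claims")

# Ad hoc SQL over the same in-cluster data.
spark.sql("SELECT region, avg(amount) FROM claims GROUP BY region").show()

# A pre-built algorithm, no hand-rolled statistics required.
features = VectorAssembler(
    inputCols=["amount", "customer_age", "prior_claims"], outputCol="features"
).transform(df)
model = LogisticRegression(labelCol="is_fraud", featuresCol="features").fit(features)
print(model.summary.areaUnderROC)
```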

Spark was created in the labs of UC Berkeley and has quickly taken the analytics world by storm, with two main business propositions: the freedom to model data without hiring data scientists, and the power to leverage analytics models that are already built and ready-for-use in Spark today. The combination of these two attributes allows enterprises to gain speed on analytics endeavors with a modern, open-source technology.

The arrival of Spark signifies a world of possibility for companies that are hungry for the business value data science can provide but are finding it difficult to hire and keep deep analytic talent on board. The applications of Spark are seemingly endless, from cybersecurity and fraud detection to genomics modeling and actuarial analytics.

What Spark Means for Enterprises

Not only will Spark enable businesses to hire non-traditional data scientists, such as actuaries, to effectively perform the role, but it will also open a world of possibilities in terms of actual business strategy.

Banks, for example, have been clamoring for Spark from the get-go, in part because of Spark’s promise to help banks bring credit card authorizations back in-house. For over two decades, credit card authorizations have been outsourced, since it was more efficient and far less dicey to centralize the authorization process.

The incentive to bring this business back in-house is huge, however, with estimated cost savings of tens to hundreds of millions annually. With Spark, the authorization process could be automated in-house – a huge financial boon to banks. The adoption of Spark allows enterprises to effectively leverage data science and evolve their business strategies accordingly.
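One hedged sketch of what such an in-house flow might look like: scoring incoming authorization events with a previously trained model via Spark Structured Streaming. The broker address, topic, schema, and model path are all placeholders, and the Kafka connector package is assumed to be on the classpath; this is an illustration of the pattern, not a reference implementation.

```python
# Hypothetical in-house authorization scoring with Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType
from pyspark.ml import PipelineModel

spark = SparkSession.builder.appName("auth-scoring").getOrCreate()

schema = StructType([
    StructField("card_id", StringType()),
    StructField("merchant", StringType()),
    StructField("amount", DoubleType()),
])

auths = (spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
         .option("subscribe", "authorizations")               # placeholder topic
         .load()
         .select(from_json(col("value").cast("string"), schema).alias("a"))
         .select("a.*"))

model = PipelineModel.load("hdfs:///models/auth_risk")   # trained offline, hypothetical path

scored = model.transform(auths)
query = (scored.select("card_id", "amount", "prediction")
         .writeStream.format("console").start())
query.awaitTermination()
```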

The Adoption of Spark & Hadoop

Moreover, Spark works seamlessly with the Hadoop distributions sitting on EMC’s storage platforms. As I noted in my last post, Hadoop adoption among enterprises has been incredible, and Hadoop is quickly becoming the de facto standard for storing and processing terabytes or even petabytes of data.

By leveraging Spark and existing Hadoop platforms in tandem, enterprises are well-prepared to solve the ever-increasing data and analytics challenges ahead.

Open Source and the Modern Data Center: How {code} by Dell EMC enhances ScaleIO software-defined block storage

Joshua Bernstein

VP of Technology, Emerging Technology Team at Dell EMC

Why has open source become such a big deal, even in the enterprise data center? If you answered “to save money”, you wouldn’t be in the minority. But, despite what many may assume, it’s not principally about cost savings – although that may be one benefit. The attraction of open source is in its name – that is, its ‘open’ nature, both in terms of access to the code and to the developers who maintain and enhance it.

To boil it down, open source enables you to run data centers through software, with better and easier integration opportunities between diverse systems than has ever been possible before with proprietary offerings.

Open Source Advantages for the Modern Data Center

  • Access to open source code & project developers
  • Freedom of choice & flexibility for users
  • Easier integrations between diverse systems

Emerging open source infrastructure software thrives on freedom, flexibility, innovation and integration. Integration is particularly important because it enables discrete components to seamlessly work together as a system. This software thrives through community involvement, and the ability to integrate with both modern and existing processes and infrastructure, which leads to quicker adoption.

Enterprises are looking to data center IT transformations to help them meet the ever-growing and fluid expectations of their customers. Key to this is establishing a modern data center strategy, specifically one that is optimized for resource consumption. By embracing systems that are operated as software, organizations are more readily able to adapt to changing demands and opportunities.

{code} is Dell EMC’s open source initiative to deepen ties with the developer and open source communities. Through {code}, Dell EMC is enabling these communities to seamlessly fuse proprietary software with open source technologies.

Leveraging Container-Focused Solutions

Containerization is having its big moment in the world of enterprise IT – specifically with open source infrastructure and application platforms, such as Docker, Mesos, Cloud Foundry and Kubernetes. Container-based infrastructure represents a major evolution in the way applications are deployed and managed. Not since the appearance of the virtual machine has a technology been so transformative. Containers give IT more choice of infrastructure and give teams greater control over application dependencies, which enables them to adopt more agile operational methods.

However, a big challenge for fully adopting container technology is that it’s not a one-to-one comparison to virtual machines. With containers, how can users run persistent applications inside of these lightweight, ephemeral constructs? We believe this is a key challenge that prevents the wider adoption of container-based infrastructure.

REX-Ray: Meeting the Persistent Storage Challenge

To solve this challenge, the {code} team has spent more than a year developing REX-Ray to deliver persistent storage capabilities to container runtimes. It provides a simple, focused architecture for enabling advanced storage functionality across common storage, virtualization and cloud platforms. As an open source project, REX-Ray continues to gain new features and functionality, aimed at setting the bar for providing persistence capabilities to containers.

Storage is a critical element of any IT environment. By focusing on storage within the context of open source and software, we’re able to offer users more functionality, choice and value from their deployments. One solution that works really well with REX-Ray is Dell EMC’s ScaleIO software-defined block storage.

REX-Ray and ScaleIO: Simpler Block Storage for Containerization

REX-Ray acts as the ‘glue’ between the container platform and ScaleIO – a software-defined storage solution that provides block level storage services on commodity hardware. This solution enables IT to move beyond purely stateless applications for containers, to confidently deploying critical stateful applications in containers as well.

ScaleIO is the gold standard for software-defined block storage platforms. It gives organizations the flexibility and freedom to provide storage through commodity servers in a range of deployment models – including hyper-converged architectures without a performance overhead. Through the seamless integration between REX-Ray and ScaleIO, the complete life cycle of storage is managed and consumed by container solutions such as Docker, Mesos, Cloud Foundry and Kubernetes.
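As a small, hedged sketch of what this looks like from the Docker side, the example below uses the Docker SDK for Python to create a volume backed by the rexray/scaleio volume plugin and mount it into a stateful container. It assumes the plugin is already installed and pointed at a ScaleIO cluster; the volume name, size option, image, and mount path are placeholders.

```python
# Sketch using the Docker SDK for Python; assumes the rexray/scaleio
# plugin is installed and configured. Names and options are placeholders.
import docker

client = docker.from_env()

# REX-Ray provisions a ScaleIO volume on demand and tracks its lifecycle.
client.volumes.create(
    name="pgdata",
    driver="rexray/scaleio",
    driver_opts={"size": "16"},  # assumed driver-specific size option, in GB
)

# A stateful service mounts the volume; if the container is rescheduled
# to another host, REX-Ray reattaches the same block device there.
client.containers.run(
    "postgres:9.6",
    detach=True,
    name="orders-db",
    volumes={"pgdata": {"bind": "/var/lib/postgresql/data", "mode": "rw"}},
)
```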

Through {code}, Dell EMC has demonstrated its commitment to support the open source community. By ensuring that its software-defined storage solutions such as ScaleIO work seamlessly within a modern data center (which already integrates wide-ranging technologies such as virtualization, containerization, automation and cloud) and DevOps environment, we are making software-based storage technologies relevant in the open source community. Advanced integration, developer enablement and dynamic engagement all made possible by {code} are making ScaleIO an increasingly valued and attractive block storage option for the open source community.

Learn more about {code} by Dell EMC.

Join the {code} Community.

Want to get your hands a little dirty with the technology?

Download and test ScaleIO inside a VM environment.

Request a vLab demo:

Docker, Mesos, and ScaleIO for your persistent applications.

 
