
Converged Infrastructure + Isilon: Better Together

David Noy

VP Product Management, Emerging Technologies Division at EMC

You can’t beat Isilon for simplicity, scalability, performance and savings. We’re talking world-class scale-out NAS that stores, manages, protects and analyzes your unstructured data with a powerful platform that stays simple, no matter how large your data environment. And Dell EMC already has the #1 converged infrastructure with blocks and racks. So bringing these two superstars together into one converged system is truly a case of one plus one equals three.

This convergence—pairing Vblock/VxBlock/VxRack systems and the Technology Extension for Isilon—creates an unmatched combination that flexibly supports a wide range of workloads with ultra-high performance, multi-protocol NAS storage. And the benefits really add up, too.

As impressive as these numbers are, it all boils down to value and versatility. These converged solutions give you more value for your investment because, quite simply, they store more data for less. And their versatility allows you to optimally run both traditional and nontraditional workloads – including video surveillance, SAP/Oracle/Microsoft applications, mixed workloads that generate structured and unstructured data, Electronic Medical Records and Medical Imaging, and more – on infrastructure built and supported as one product.

With a Dell EMC Converged System, you’ll see better, faster business outcomes through simpler IT across a wide range of application workloads. For more information on modernizing your data center with the industry’s broadest converged portfolio, visit emc.com/ci or call your Dell EMC representative today.

 

Learn more about Converged Infrastructure and Isilon. Also, check out the full infographic.

Your Data Lake Is More Powerful and Easier to Operate with New Dell EMC Isilon Products

Karthik Ramamurthy

Director Product Management
Isilon Storage Division at Dell EMC

Earlier this year Dell EMC released a suite of Isilon products designed to enable your company’s data lake journey. Together, IsilonSD Edge, Isilon OneFS 8.0, and Isilon CloudPools have transformed the way your organization stores and uses data by harnessing the power of the data lake. Today we are pleased to announce that all three of these products have been updated and further enhanced to make your data lake even more powerful and easier to operate from edge to core to cloud.

Starting with the release of OneFS 8.0.1

OneFS 8.0.1 builds on the powerful platform provided by OneFS 8.0, released in February 2016. The intent of this newest release is to provide features important to unique customer datacenter workflows and to enhance the usability and manageability of OneFS clusters. In addition, OneFS 8.0.1 is the first release that takes full advantage of the non-disruptive upgrade and rollback framework introduced in OneFS 8.0.

Let’s review some of the most compelling features of this software release.

Improved Management, Monitoring, Security and Performance for Hadoop on Isilon

One focus area of this new release, expanding on the Data Lake, was increasing the scope and usefulness of our integration with leading Hadoop management tools. OneFS 8.0.1 delivers support for and integration with Apache Ambari 2.4 and Ranger. A single management point now allows Ambari operators to seamlessly manage and monitor Hadoop clusters with OneFS as the HDFS storage layer. Ranger is an important security management tool for Hadoop. These Ambari and Ranger integration features benefit all customers using Hortonworks and ODPi-compliant Hadoop distributions with OneFS.

Additionally, OneFS 8.0.1 adds new features including Kerberos encryption to secure and encrypt data between HDFS clients and OneFS, and DataNode load balancing, which avoids overloading nodes and increases cluster resilience. OneFS 8.0.1 also supports the following HDFS distributions: Hortonworks HDP 2.5, Cloudera CDH 5.8.0, and IBM Open Platform (IOP) 4.1.

Introducing Scale-Out NAS with SEC Compliance and Asynchronous Replication for Disaster Recovery

With OneFS 8.0.1, Isilon becomes the first and only scale-out NAS vendor to offer SEC Rule 17a-4 compliance via SmartLock Compliance Mode combined with asynchronous replication to secondary or standby clusters via SyncIQ. This powerful combination means companies that must comply with SEC Rule 17a-4 are no longer forced to choose between compliance and disaster recovery – with OneFS 8.0.1 they have both!

Storage Efficiency Designed for Healthcare Diagnostic Imaging Needs

For many years, diagnostic imaging data from PACS (Picture Archiving and Communication System) applications was stored in large “container” files for maximum storage efficiency. In recent years, the way referring physicians access individual diagnostic images has changed and, as a result, the methods used to store diagnostic imaging files had to change as well. OneFS 8.0.1 has a new storage efficiency feature specifically designed for the healthcare PACS archive market to provide significantly improved storage efficiency for diagnostic imaging files. Isilon customers can expect to see storage efficiency similar to OneFS’s large-file storage efficiency for diagnostic imaging files when using this feature. If you leverage Isilon to store your PACS application data, you will want to talk with your sales representative to learn more about this new feature.

Upgrade with Confidence

OneFS 8.0, released in February 2016, provided the framework for non-disruptive upgrades for all supported upgrades going forward, and added release rollback. OneFS 8.0.1 is the first OneFS release that you will be able to test and validate and, if needed, roll back to the previously installed 8.0.x release. This means that you can non-disruptively upgrade to 8.0.1 without impacting users or applications! You will be able to upgrade sets of nodes or the entire cluster for your testing and validation and then, once complete, decide to commit the upgrade or roll back to the prior release. Once you commit to OneFS 8.0.1, future upgrades will be even easier and more transparent, with the ability to view an estimate of how long an upgrade will take to complete and visibility into the upgrade process. The WebUI has also been enhanced to make upgrade management even easier than before.

Manage Performance Resources like Never Before

Even more exciting is the new Performance Resource Management framework introduced in OneFS 8.0.1. This framework is the start of a revolutionary scale-out NAS performance management system. In OneFS 8.0.1 you will be able to obtain and view statistics on performance resources (CPU, operations, data read, data written, etc.) for OneFS jobs and services. This will allow you to quickly identify whether a particular job or service may be the cause of performance issues. These statistics are available via the CLI and the Platform API, and can be visualized with InsightIQ 4.1. In future releases these capabilities will be expanded to clients, IP addresses, users, protocols and more!
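As a rough sketch of what programmatic access might look like, the Python snippet below pulls a few current performance statistics over the Platform API using the requests library. The cluster address, credentials, endpoint version and statistic keys are assumptions for illustration only – check the Platform API reference for your OneFS release for the exact resources and key names.

    import requests

    # Hypothetical cluster address and credentials – replace with your own.
    CLUSTER = "https://isilon-cluster.example.com:8080"
    AUTH = ("admin", "password")

    # The endpoint path and key names below are assumptions based on the
    # Platform API's statistics conventions, not a definitive reference.
    keys = ["node.cpu.user.avg", "node.disk.bytes.in.rate", "node.disk.bytes.out.rate"]
    resp = requests.get(
        f"{CLUSTER}/platform/1/statistics/current",
        params={"keys": ",".join(keys)},
        auth=AUTH,
        verify=False,  # lab clusters commonly use self-signed certificates
    )
    resp.raise_for_status()

    for stat in resp.json().get("stats", []):
        print(stat.get("key"), stat.get("devid"), stat.get("value"))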

These are just some of the new features OneFS 8.0.1 has to offer. OneFS 8.0.1 also improves on our support for Mac OS clients, SMB, audit, NDMP and data migrations, to name a few other areas. The white paper, Technical Overview of New and Improved Features of EMC Isilon OneFS 8.0, provides additional details on these and other new and improved features in OneFS 8.0.1.

IsilonSD Edge Management Server Version 1.0.1

This July EMC released a new version of IsilonSD Edge Management Server. Version 1.0.1 provides support for VMware ESX 6.0 in addition to previously supported ESX versions. This management server also enables monitoring of the IsilonSD Edge Clusters via EMC’s Secure Remote Support (ESRS) server and tools.

Isilon CloudPools Just Got Easier to Manage

OneFS 8.0.1 provides improved flexibility for CloudPools deployments in the enterprise with the introduction of proxy support. This allows administrators to specify one or more proxy servers between the Isilon cluster and your cloud provider of choice.

The Data Lake Journey is Just Beginning!

OneFS 8.0.1 is an important step on the data lake journey; however, you can rest assured we are not stopping here! Look forward to amazing new hardware and software features in coming releases as we build on the Performance Resource Management framework, provide more workload-specific enhancements to address our customers’ needs, and deliver new levels of supportability, serviceability, scale and performance. Don’t wait – upgrade today. Click here to download OneFS 8.0.1.

Survey findings show close alignment with Dell EMC strategy

Charles Sevior

Chief Technology Officer at EMC Emerging Technologies Division

Media Workflow Trends Survey – Industry Transformation Is Underway

Earlier in 2016, Dell EMC commissioned Gatepoint Research to conduct an extensive survey with media industry executives. The survey, entitled Media Workflow Trends, yielded some interesting results that point to a good understanding of the pace of change, and the need to stay agile for competitive advantage.

The results of that survey are summarised in a new infographic, which, apart from being much more interesting than a series of pie charts, brings to the surface the key themes that align with the technology development strategy of Dell EMC.

Content Storage Demands Are Exploding

I have worked in the media industry for decades, and so this is hardly a surprising finding. Early in my career, it was commonplace to find production offices full of shelves and compactus storage units. These were crammed with videotapes. Then there were boxes stacked everywhere – also full of tapes with titles scrawled on the back. There were colour-coded stickers – “Master”, “Protection Master”, “Edit Copy”, “HOLD”… There was a warehouse full of tapes of various types, even old films. One thing you learned is that nothing was ever thrown away (but plenty of things went missing).

Fast-forward to 2016, and most media companies involved in production and distribution of content have shifted to file-based Media Asset Management systems – or at least a media content archive repository. This has helped to contain the data sprawl into a central location, but it has done nothing to reduce the total storage capacity requirement. Think about the increasing resolution of content, the increasing number of channels, multiple versions for different delivery platforms and, of course, the increasing “shoot to use” ratio. Sports events have an increasing number of cameras with retained ISO recordings for highlights and post-match inquiries, and Reality TV formats are based on multi-cam techniques to get every reaction from different angles. Whilst these programs are in production, storage capacity demands can skyrocket.

Only 3% of our survey respondents replied that their storage needs are flat or negative – and 50% responded that the demand for storage capacity is growing rapidly and is a major concern.

Multiplatform Content Delivery

Pretty much every major media company is either doing this already, or has a plan to extend their audience reach beyond simple linear broadcast channels in the next few years.  But what is interesting is the increasingly careful way in which media companies are deploying their solutions.

Recognising that the simple approach of outsourcing multiplatform content delivery to a third-party OVP (Online Video Platform) is not very revenue accretive, media companies are now starting to embrace DIY in order to pull back some profit margin in what is otherwise a very difficult-to-monetise delivery strategy. As we learn more from some of the leaders in this industry – such as MLBAM – we can see the benefits of taking control and managing as much of the content delivery process as possible, end to end, just as we always did with linear content delivery over terrestrial RF transmitters, satellite transponders and cable TV networks.

One of the key tips is being ready to scale.  As streaming demand spikes and grows with popular content, how can every incremental viewer bring incremental profit – not just rising CDN costs?  Taking a tip from Netflix, you can build a distributed origin and control the CDN deeper into the delivery network.  Dell EMC has repeatedly partnered with some of the leading solution vendors in this space, who make it easier to deploy a well-managed and profitable multiplatform content delivery system.

IP-Based Workflows are here

Most industry commentators seem to get pretty excited about “the death of SDI” and how soon IP networking can completely replace the dedicated video & audio circuits of the past. But really, that is just a side show in which we will soon lose interest. There is no “right or wrong” way to build a media facility. The engineers and technical architects will select the appropriate technology on a case-by-case basis as they always have, based on reliability, quality, cost, ease of management, etc. And over time, there will simply be more connections made using IP network technology and fewer using dedicated single-purpose technology.

But what is the end-game?  I see it as moving our media equipment technology stacks (also known as the “rack room” or “central technical facility”) away from dedicated single-purpose vendor solutions built and managed carefully by Broadcast Engineers into a flexible virtualised technology stack that looks identical to a cloud-scale data centre – built and managed by IT and Media Technologists.  It will be open architecture, built on software-defined principles and capable of easy repurposing as the application technology needs of the business shift more frequently than they did in the past.

It is important to select your partners carefully as you make this transition into IP and software-defined infrastructure. Dell EMC has deliberately remained vendor neutral and standards-based. We have aligned with SMPTE and AIMS, two organisations we believe have the broad interests of the industry (both end-users and vendors) at heart and whose work will result in practical, cost-effective and widely adopted solutions.

As a pioneer and leader in scale-out storage, virtualisation and converged infrastructure, Dell EMC is in a great position to help you avoid costly mistakes during your transition to IP-based workflows.


Click to see the full M&E trends infographic

Ultra-HD Is Coming

Well, it’s already here. Of course, most people shopping for a new flat screen TV today will see that their options include 4K resolution, increasingly affordable when compared to the default HD TV resolution. Some in the industry will say that 4K is unnecessary and is being pushed by the consumer electronics manufacturers – but when has that not been the story in the past? There is no doubt that consumers appreciate improved quality of content, and story-tellers love the creative opportunities afforded by the latest technology. When we can finally deliver ALL of the aspects of Ultra-HD – such as HDR (high dynamic range), HFR (high frame rates) and multi-channel surround sound – we will be one step closer to that reality.

At the SMPTE Future of Cinema Keynote during NAB 2016, pioneering movie director Ang Lee said:

“Technology must work for us to help tell the human story. Whether it is from 2K to 4K, or 24 to 60fps, it improves the sensory experience and as a viewer, you become more relaxed and less judgmental. We will always be chasing god’s work – which is the natural vision and sensory experience. We are getting closer and learning more about how we communicate with each other.”

In the world of content creation and media distribution, we will increasingly adopt 4K cameras, render graphics and animations at increased resolution and ensure the product we make has an increased shelf life.  This is natural, even if it is happening before we have an ability to deliver this content to our viewers.  And while it is difficult to “rip and replace” cable, satellite and terrestrial networks that are still only shifting from SD to HD with new 4K solutions, OTT content delivery using internet broadband and mobile networks will probably be the way most consumers first access Ultra-HD.

Dell EMC Isilon is a scale-out storage solution that grows in capacity and bandwidth as more nodes combine into a single-volume multi-tier cluster.  We already have numerous customers using Isilon for 4K editing and broadcast today.  As we constantly innovate and bring new technology to market, we continue to deliver to our customers the benefits of Moore’s Law.  The real key to Isilon technology is the way that we deliver platform innovation in an incremental and backward-compatible way – supporting the ability to scale and grow non-disruptively.

Beyond LTO Archiving

I mentioned earlier in this blog how my early career was defined by shelves and boxes of tapes – videotapes everywhere.  I spent time in my day handling tape, winding tape into cartridges, even editing audio and videotape using a razor blade!  The most important machine in the building (a commercial TV station) was the cart machine.  That was because it held all of the commercial 30 second spots, and if those did not play, the TV station did not make money and we would not get paid.

Finally we replaced cart machines and replay videotape machines with hard disk servers that were highly reliable, fast to respond to late changes and very flexible.  So I wonder when we will say it is time to replace the data tape archive library with a cloud store?  Certainly we are all familiar with and probably daily users of one of the biggest media archives in the world (I refer to Google’s YouTube).  Wouldn’t it be great if your company had its own YouTube?  A content repository that was always online, instantly searchable, growing with fresh material and just as easy to use?

So then we get down to cost. It turns out that, even though public cloud stores seem cheap, the cost of actually using one for long-term retention is a lot higher than that of existing data tape technology – especially as the LTO industry brings innovation beyond LTO-6 into the latest LTO-7 data tape format with 6TB native capacity.

But the migration process to move all of your media from one tape standard to the next is painful and time-consuming – introducing cost and wear & tear, and impacting end-user search & retrieval times from the library.

From our survey respondents, the top features for consideration of a storage solution are performance, scalable capacity and efficient use of resources (floor space, power, personnel). So if we took those criteria into account, cloud storage should win hands-down – if only the price were right.

Well, now it finally is. Dell EMC has been developing an innovative product called ECS (Elastic Cloud Storage), which meets all of the requirements of a modern archive – scalable, multi-site geo-replication, open architecture, software-defined. And now it is available in a range of hardware platforms that offer high packing density with large-capacity, very efficient hard drives – today 8TB drives are supported, and that native capacity will clearly grow.

Increasingly customers are asking us whether this technology is price competitive with LTO libraries, and whether it is reliable and ready for mission-critical high-value archives.  The answer to both of these questions is yes, and the benefits of moving to your own cloud store are significant (whether you choose to deploy it within your own premises or have it hosted for you).

Cloud Solutions are gathering converts

When you boil it all down, our industry is in transformation from a legacy & bespoke architecture to that of a cloud. The great thing about a cloud is that it is flexible and can easily change shape, scale and take on new processes and workloads. And it doesn’t have to be the public cloud. It can be “your cloud”. Or it can be a mix of both – which really gives you the best of both worlds: public cloud for burst, private cloud for base load and deterministic performance.

Building clouds and bringing technology innovation to industry is what Dell EMC is really good at.  Speak with us to learn more about how to embark on this journey and the choices available to you.

SUMMARY

So we find that across the media industry the evolution is underway.  This is a multi-faceted transformation.  We are not just switching from “SD to HD”, we are actually evolving at the business, operations, culture and technology level.

Dell EMC is positioned as an open-architecture, vendor-neutral infrastructure provider offering best-in-class storage, servers, networking, workstations, virtualisation and cloud management solutions. Engage with us to secure your infrastructure foundation, to be future-ready, and to simplify your technology environment so that you can focus on what really matters to your business – what makes your offering attractive to viewers (on any platform).

 


Buckets, Apps & Digital Exhaust…All in a Day’s Work For a Dell EMC Splunk Ninja

Cory Minton

Principal Systems Engineer at EMC

Grab your hoodies, your witty black t-shirts, and maybe your capes…it’s time for another exciting Splunk .conf, the annual Splunk User Conference taking place this week at the Walt Disney Swan and Dolphin Resort. All of us at EMC are excited to be sponsoring .conf for the third year in a row, and this year our presence will be bigger and better than ever before. Dell EMC is hosting two technical sessions this year, we’ll have more than 20 Dell EMC Splunk Ninjas running around learning, and we’ll have a large booth in the partner pavilion demonstrating our technology solutions. For all the details, check out our .conf16 site.

This year marks the beginning of a great relationship between two awesome tech businesses: Dell EMC and Splunk. We joined forces through a formal strategic alliance that started in February. This alliance enables Dell EMC and its partners to sell Splunk’s industry-leading platform. And it allows Dell EMC unique access to Splunk technical resources for solution design, testing, and validation. Most importantly, it creates a framework for these two technology powerhouses to collaborate more effectively for customer success.

Why Dell EMC for Splunk?


When we talk about customer success, we mean it in two distinct ways: deploying Splunk on Dell EMC platforms, and using Splunk to derive value from Dell EMC infrastructure.

First, we believe success is deploying Splunk on a flexible infrastructure that not only helps Splunk run fast and efficiently, but also one that can scale easily as the usage of Splunk evolves in a customer organization. We believe that converged and hyper-converged technologies powered by Dell EMC’s robust portfolio of storage technologies deliver on this vision and provide additional enterprise capabilities:

  • Cost-Effective & Optimized Storage – Dell EMC delivers optimized and efficient storage by aligning the right storage to Splunk’s hot, warm, and cold data tiers, with their long retention and varying performance requirements.
  • Flexible & Scale-Out Capacity Consumption Model – Scale out infrastructure to meet capacity and compute requirements independently, or as a single, converged platform, as your data grows.
  • Data Reduction & Other Powerful Enterprise Capabilities – including secure encryption, compression & deduplication of indexes, and fast, efficient zero-overhead copies for protection.
  • Bottomless Cold Bucket – Scale-out storage platforms, whether on-premises or in the cloud, obviate the need for a frozen bucket by providing a PB-scale cold bucket solution, simplifying data management and keeping data always searchable (see the configuration sketch after this list).
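As a rough illustration of that bottomless cold bucket idea, the Python sketch below uses Splunk’s Python SDK (splunklib) to create an index whose cold path lives on a scale-out NAS mount. The host names, credentials, paths, index name and size limit are assumptions for illustration, not a tested reference configuration.

    import splunklib.client as client

    # Hypothetical connection details – adjust for your environment.
    service = client.connect(
        host="splunk-indexer.example.com",
        port=8089,
        username="admin",
        password="changeme",
    )

    # Keep hot/warm buckets on fast local storage and send cold buckets to a
    # scale-out NAS export (e.g. an Isilon NFS mount at /mnt/isilon).
    # Paths and size limits are illustrative only.
    service.indexes.create(
        "machine_data",
        homePath="$SPLUNK_DB/machine_data/db",
        coldPath="/mnt/isilon/splunk/machine_data/colddb",
        thawedPath="$SPLUNK_DB/machine_data/thaweddb",
        maxTotalDataSizeMB=500000,
    )

    print([idx.name for idx in service.indexes])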


 

Splunk and Dell EMC engineering teams have engaged in a strategic collaboration to ensure that all Dell EMC platforms have been validated by Splunk to “meet or exceed Splunk’s published reference server hardware” guidelines. The Splunk team takes this validation process very seriously, and customers can rest assured that if they are considering infrastructure for their Splunk deployment, we have done extensive testing. Whether you are looking at hyper-converged solutions like VxRail or VxRack, converged solutions like Vblock systems, or just storage from EMC like ScaleIO, XtremIO, VNX, Unity, Isilon, or ECS, you can be confident that the work has been done by both Splunk and Dell EMC to make sure it runs well.

Secondly, we believe Splunk is an incredibly powerful platform for capturing and deriving value from machine data. As it turns out, Dell EMC products spin off a massive amount of “digital exhaust” that can be captured easily and used to drive operational intelligence in IT. Dell EMC has made massive investments over the last few years to build apps for our platforms and make them available in Splunkbase for free. We’ve built apps for XtremIO, Isilon and VNX, with many more in the works. These apps make it simple to ingest data from Dell EMC platforms, and we offer useful, pre-built reports and dashboards to make monitoring these assets simple. And it doesn’t stop there…once the data is extracted from your Dell EMC platforms, the underlying searches powering our reports, or just the indexes themselves, can be used in investigations across the entire IT service stack. One of my favorite things to hear from our customers is the exciting ways they use the apps beyond just simple reporting, and I hope to hear many more stories this year at .conf2016.

Dell EMC Splunk Ninjas And Our Top Ten List

 

The Dell EMC Splunk Ninja team is at the show in their Dell EMC blue Ninja Hunt shirts. The team is a group of more than 40 systems engineers from across Dell EMC who have been trained the same way Splunk trains its own systems engineers. The Ninjas hold certifications ranging from SE1 all the way to SE3, and we’ve got skills across not only using Splunk, but administering and architecting it at scale. This is a global team not only available to talk to you at .conf, but also available in the field to have direct conversations with you when you head back to the office.

 

 

Our wise, passionate and zany Ninja team recently pulled together a list of Top Ten Best Practices for Splunk on Dell EMC. This list has been amassed based on years of lab testing and real world customer experience. You may say ‘Duh’ to some, but others may surprise you.

Happy Splunking!

Analyst firm IDC evaluates EMC Isilon: Lab-validation of scale-out NAS file storage for your enterprise Data Lake

Suresh Sathyamurthy

Sr. Director, Product Marketing & Communications at EMC

A Data Lake should now be a part of every big data workflow in your enterprise organization. By consolidating file storage for multiple workloads onto a single shared platform based on scale-out NAS, you can reduce costs and complexity in your IT environment, and make your big data efficient, agile and scalable.

That’s the expert opinion in analyst firm IDC’s recent Lab Validation Brief: “EMC Isilon Scale-Out Data Lake Foundation: Essential Capabilities for Building Big Data Infrastructure”, March 2016. As the lab validation report concludes: “IDC believes that EMC Isilon is indeed an easy-to-operate, highly scalable and efficient Enterprise Data Lake Platform.”

The Data Lake Maximizes Information Value

The Data Lake model of storage represents a paradigm shift from the traditional linear enterprise data flow model. As data, and the insights gleaned from it, increase in value, enterprise-wide consolidated storage is transformed into a hub around which the ingestion and consumption systems work. This enables enterprises to bring analytics to data in place – avoiding the expense of multiple storage systems and the time required for repeated ingestion and analysis.

But pouring all your data into a single shared Data Lake would put serious strain on traditional storage systems – even without the added challenges of data growth. That’s where the virtually limitless scalability of EMC Isilon scale-out NAS file storage makes all the difference…

The EMC Data Lake Difference

The EMC Isilon Scale-out Data Lake is an Enterprise Data Lake Platform (EDLP) based on Isilon scale-out NAS file storage and the OneFS distributed file system.

As well as meeting the growing storage needs of your modern datacenter with massive capacity, it enables big data accessibility using traditional and next-generation access methods – helping you manage data growth and gain business value through analytics. You can also enjoy seamless replication of data from the enterprise edge to your core datacenter, and tier inactive data to a public or private cloud.

We recently reached out to analyst firm IDC to lab-test our Isilon Data Lake solutions – here’s what they found in 4 key areas…

  1. Multi-Protocol Data Ingest Capabilities and Performance

Isilon is an ideal platform for enterprise-wide data storage, and provides a powerful centralized storage repository for analytics. With the multi-protocol capabilities of OneFS, you can ingest data via NFS, SMB and HDFS. This makes the Isilon Data Lake an ideal and user-friendly platform for big data workflows, where you need to ingest data quickly and reliably via protocols most suited to the workloads generating the information. Using native protocols enables in-place analytics, without the need for data migration, helping your business gain more rapid data insights.
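To make that multi-protocol, in-place idea concrete, here is a minimal sketch: a file is written over an NFS mount of the cluster and then read back in place over HDFS using pyarrow. The mount point, SmartConnect host name, paths and port are assumptions for illustration, and the exact NFS-to-HDFS path mapping depends on how the access zone is configured.

    import pyarrow.fs as pafs

    # 1) Ingest over NFS: the cluster export is assumed to be mounted at /mnt/isilon.
    with open("/mnt/isilon/ingest/events.csv", "w") as f:
        f.write("timestamp,device,value\n2016-09-01T00:00:00Z,cam-01,42\n")

    # 2) Analyze in place over HDFS: OneFS exposes the same files through its
    #    HDFS interface, so no copy or migration step is needed. The host name
    #    is a hypothetical SmartConnect zone; 8020 is the usual HDFS RPC port.
    hdfs = pafs.HadoopFileSystem(host="isilon-smartconnect.example.com", port=8020)
    with hdfs.open_input_stream("/ingest/events.csv") as f:
        print(f.read().decode())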


IDC validated that the Isilon Data Lake offers excellent read and write performance for Hadoop clusters accessing HDFS via OneFS, compared with access via direct-attached storage (DAS). In the lab tests, Isilon performed:

  • nearly 3x faster for data writes
  • over 1.5x faster for reads and read/writes.

As IDC says in its validation: “An Enterprise Data Lake platform should provide vastly improved Hadoop workload performance over a standard DAS configuration.”

  2. High Availability and Resilience

Policy-based high availability capabilities are needed for enterprise adoption of Data Lakes. The Isilon Data Lake is able to cope with multiple simultaneous component failures without interruption of service. If a drive or other component fails, it only has to recover the specific affected data (rather than recovering the entire volume).

IDC validated that a disk failure on a single Isilon node has no noticeable performance impact on the cluster. Replacing a failed drive is a seamless process and requires little administrative effort. (This is in contrast to traditional DAS, where the process of replacing a drive can be rather involved and time consuming.)

Isilon can even cope easily with node-level failures. IDC validated that a single-node failure has no noticeable performance impact on the Isilon cluster. Furthermore, the operation of removing a node from the cluster, or adding a node to the cluster, is a seamless process.

  3. Multi-tenant Data Security and Compliance

Strong multi-tenant data security and compliance features are essential for an enterprise-grade Data Lake. Access zones are a crucial part of the multi-tenancy capabilities of Isilon OneFS. In tests, IDC found that Isilon provides no-crossover isolation between Hadoop instances for multi-tenancy.

Another core component of secure multi-tenancy is the ability to provide a secure authentication and authorization mechanism for local and directory-based users and groups. IDC validated that the Isilon Data Lake provides multiple federated authentication and authorization schemes. User-level permissions are preserved across protocols, including NFS, SMB and HDFS.

Federated security is an essential attribute of an Enterprise Data Lake Platform, with the ability to maintain confidentiality and integrity of data irrespective of the protocols used. For this reason, another key security feature of the OneFS platform is SmartLock – specifically designed for deploying secure and compliant (SEC Rule 17a-4) Enterprise Data Lake Platforms.

In tests, IDC found that Isilon enables a federated security fabric for the Data Lake, with enterprise-grade governance, regulatory and compliance (GRC) features.

  4. Simplified Operations and Automated Storage Tiering

The Storage Pools feature of Isilon OneFS allows administrators to apply common file policies across the cluster locally – and extend them to the cloud.

Storage Pools consists of three components:

  • SmartPools: Data tiering within the cluster – essential for moving data between performance-optimized and capacity-optimized cluster nodes.
  • CloudPools: Data tiering between the cluster and the cloud – essential for implementing a hybrid cloud, and placing archive data on a low-cost cloud tier.
  • File Pool Policies: Policy engine for data management locally and externally – essential for automating data movement within the cluster and the cloud.

As IDC confirmed in testing, Isilon’s federated data tiering enables IT administrators to optimize their infrastructure by automating data placement onto the right storage tiers.
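For administrators who want to script around this automation, the hedged Python sketch below shows what inspecting file pool policies through the OneFS Platform API could look like. The endpoint path, response fields, cluster address and credentials are assumptions for illustration, so consult the API reference for your OneFS version before relying on them.

    import requests

    # Hypothetical cluster address and credentials – replace with your own.
    session = requests.Session()
    session.auth = ("admin", "password")
    session.verify = False  # lab clusters commonly use self-signed certificates

    # The endpoint path and response fields are assumptions based on the
    # Platform API's naming conventions; check your OneFS API reference.
    url = "https://isilon-cluster.example.com:8080/platform/1/filepool/policies"
    policies = session.get(url).json().get("policies", [])

    for policy in policies:
        # Each policy pairs a file-matching rule with actions such as moving
        # matching files to a SmartPools tier or a CloudPools target.
        print(policy.get("name"), "-", policy.get("description"))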

The expert verdict on the Isilon Data Lake

IDC concludes: “EMC Isilon possesses the necessary attributes such as multi-protocol access, availability and security to provide the foundations to build an enterprise-grade Big Data Lake for most big data Hadoop workloads.”

Read the full IDC Lab Validation Brief for yourself: “EMC Isilon Scale-Out Data Lake Foundation: Essential Capabilities for Building Big Data Infrastructure”, March 2016.

Learn more about building your Data Lake with EMC Isilon.

The Democratization of Data Science with the Arrival of Apache Spark

Keith Manthey

CTO of Analytics at EMC Emerging Technologies Division

As an emerging field, data science has seen rapid growth over the span of just a few short years. With Harvard Business Review referring to the data scientist role as the “sexiest job of the 21st century” in 2012 and job postings for the role growing 57 percent in the first quarter of 2015, enterprises are increasingly seeking out talent to help bolster their organizations’ understanding of their most valuable assets: their data.

The growing demand for data scientists reflects a larger business trend – a shifting emphasis from the zeros and ones to the people who help manage the mounds of data on a daily basis. Enterprises are sitting on a wealth of information but are struggling to derive actionable insights from it, in part due to its sheer volume but also because they don’t have the right talent on board to help.

The problem enterprises now face isn’t capturing data – but finding and retaining top talent to help make sense of it in meaningful ways. Luckily, there’s a new technology on the horizon that can help democratize data science and increase accessibility to the insights it unearths.

Data Science Scarcity & Competition

The talent pool for data scientists is notoriously shallow. According to McKinsey & Company, by 2018, the United States alone may face a 50 to 60 percent gap between supply and demand for “deep analytic talent, i.e., people with advanced training in statistics or machine learning.” Data scientists possess an essential blend of business acumen, statistical knowledge and technological prowess, rendering them as difficult to train as they are invaluable to the modern enterprise.

Moreover, banks and insurance companies face an added struggle in hiring top analytics talent, with the allure of Silicon Valley beckoning top performers away from organizations perceived as less inclined to innovate. This perception issue hinders banks’ and insurance companies’ ability to remain competitive in hiring and retaining data scientists.

As automation and machine learning grow increasingly sophisticated, however, there’s an opportunity for banks and insurance companies to harness the power of data science, without hiring formally trained data scientists. One such technology that embodies these innovations in automation is Apache Spark, which is poised to shift the paradigm of data science, allowing more and more enterprises to tap into insights culled from their own data.

Spark Disrupts & Democratizes Data Science

Data science requires three pillars of knowledge: statistical analysis, business intelligence and technological expertise. Spark does the technological heavy lifting by understanding and processing data at a scale with which most people aren’t comfortable. It handles the distribution and categorization of the data, removing the burden from individuals and automating the process. By allowing enterprises to load data into clusters and query it on an ongoing basis, the platform is particularly adept at machine learning and automation – a crucial component in any system intended to analyze mass quantities of data.
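As a hedged illustration of how little scaffolding this requires, the PySpark sketch below loads a CSV into the cluster, assembles features and fits one of Spark’s built-in models. The file path, column names and the choice of logistic regression are assumptions for illustration rather than a recommended workflow.

    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression

    # Spark takes care of distributing the data and the computation.
    spark = SparkSession.builder.appName("scoring-sketch").getOrCreate()

    # Hypothetical input: historical transactions with a binary "label" column.
    df = spark.read.csv("hdfs:///data/transactions.csv", header=True, inferSchema=True)

    # Assemble numeric columns into the single feature vector MLlib expects.
    assembler = VectorAssembler(
        inputCols=["amount", "merchant_risk", "velocity_24h"], outputCol="features"
    )
    train = assembler.transform(df).select("features", "label")

    # Fit one of Spark's ready-made models – no hand-rolled distributed code needed.
    model = LogisticRegression(labelCol="label", featuresCol="features").fit(train)
    print(model.summary.areaUnderROC)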

Spark was created in the labs of UC Berkeley and has quickly taken the analytics world by storm, with two main business propositions: the freedom to model data without hiring data scientists, and the power to leverage analytics models that are already built and ready for use in Spark today. The combination of these two attributes allows enterprises to gain speed on analytics endeavors with a modern, open-source technology.

The arrival of Spark signifies a world of possibility for companies that are hungry for the business value data science can provide but are finding it difficult to hire and keep deep analytic talent on board. The applications of Spark are seemingly endless, from cybersecurity and fraud detection to genomics modeling and actuarial analytics.

What Spark Means for Enterprises

Not only will Spark enable businesses to hire non-traditional data scientists, such as actuaries, to effectively perform the role, but it will also open a world of possibilities in terms of actual business strategy.

Banks, for example, have been clamoring for Spark from the get-go, in part because of Spark’s promise to help banks bring credit card authorizations back in-house. For over two decades, credit card authorizations have been outsourced, since it was more efficient and far less dicey to centralize the authorization process.

The incentive to bring this business back in-house is huge, however, with estimated cost savings of tens to hundreds of millions annually. With Spark, the authorization process could be automated in-house – a huge financial boon to banks. The adoption of Spark allows enterprises to effectively leverage data science and evolve their business strategies accordingly.

The Adoption of Spark & Hadoop

Moreover, Spark works seamlessly with the Hadoop distributions sitting on EMC’s storage platforms. As I noted in my last post, Hadoop adoption among enterprises has been incredible, and Hadoop is quickly becoming the de facto standard for storing and processing terabytes or even petabytes of data.

By leveraging Spark and existing Hadoop platforms in tandem, enterprises are well-prepared to solve the ever-increasing data and analytics challenges ahead.
