Your Data Lake Is More Powerful and Easier to Operate with New Dell EMC Isilon Products

Karthik Ramamurthy

Director Product Management
Isilon Storage Division at Dell EMC

Earlier this year Dell EMC released a suite of Isilon products designed to enable your company’s data lake journey. Together IsilonSD Edge, Isilon OneFS 8.0, and Isilon CloudPools transformed the way your organization stores and uses data by harnessing the power of the data lake. Today we are pleased to announce all three of these products have been updated and further enhanced to make your data lake even more powerful and easier to operate from edge to core to cloud.

Starting with the release of OneFS 8.0.1

OneFS 8.0.1 builds on the powerful platform provided by OneFS 8.0, released in February 2016. The intent of this newest release is to provide features important to unique customer datacenter workflows and to enhance the usability and manageability of OneFS clusters. In addition, OneFS 8.0.1 is the first release that takes full advantage of the non-disruptive upgrade and rollback framework introduced in OneFS 8.0.

Let’s review some of the most compelling features of this software release.

Improved Management, Monitoring, Security and Performance for Hadoop on Isilon

Expanding on the Data Lake, one focus area of this new release was increasing the scope and usefulness of our integration with leading Hadoop management tools. OneFS 8.0.1 delivers support for and integration with Apache Ambari 2.4 and Ranger. A single management point now allows Ambari operators to seamlessly manage and monitor Hadoop clusters with OneFS as the HDFS storage layer. Ranger is an important security management tool for Hadoop. These Ambari and Ranger integration features benefit all customers using Hortonworks and ODPi-compliant Hadoop distributions with OneFS.

Additionally, OneFS 8.0.1 adds new features including Kerberos encryption to secure and encrypt data between HDFS clients and OneFS. Datanode load balancing avoids overloading nodes and increases cluster resilience. OneFS 8.0.1 also supports the following Hadoop distributions: Hortonworks HDP 2.5, Cloudera CDH 5.8.0, and IBM Open Platform (IOP) 4.1.

Introducing Scale-Out NAS with SEC Compliance and Asynchronous Replication for Disaster Recovery

With OneFS 8.0.1, Isilon becomes the first and only Scale-Out NAS vendor that offers SEC 17a-4 compliance via SmartLock Compliance Mode combined with asynchronous replication to secondary or standby clusters via SyncIQ. This powerful combination means companies that must comply with SEC 17a-4 are no longer forced to choose between compliance and data recovery. With OneFS 8.0.1 they have both!

Storage Efficiency Designed for Healthcare Diagnostic Imaging Needs

For many years, PACS (Picture Archiving and Communication System) applications stored diagnostic imaging data in large “container” files for maximum storage efficiency. In recent years, the way referring physicians access individual diagnostic images has changed and, as a result, the methods used to store diagnostic imaging files had to change as well. OneFS 8.0.1 has a new storage efficiency feature specifically designed for the Healthcare PACS archive market to provide significantly improved storage efficiency for diagnostic imaging files. Isilon customers can expect to see storage efficiency for diagnostic imaging files similar to OneFS’s large file storage efficiency when using this feature. If you leverage Isilon to store your PACS application data, you will want to talk with your sales representative to learn more about this new feature.

Upgrade with Confidence

OneFS 8.0, released in February 2016, provided the framework for non-disruptive upgrades for all supported upgrades going forward, along with release rollback. OneFS 8.0.1 is the first OneFS release that you will be able to test and validate and, if needed, roll back to the previously installed 8.0.x release. This means that you can non-disruptively upgrade to 8.0.1 without impacting users or applications! You will be able to upgrade sets of nodes or the entire cluster for your testing and validation and then, once complete, decide whether to commit the upgrade or roll back to the prior release. Once committed to OneFS 8.0.1, future upgrades will be even easier and more transparent, with the ability to view an estimate of how long an upgrade will take to complete and greater visibility into the upgrade process. The WebUI was enhanced to make upgrade management even easier than before.

Manage Performance Resources like Never Before

Even more exciting is the new Performance Resource Management framework introduced in OneFS 8.0.1. This framework is the start of a revolutionary scale-out NAS performance management system. In OneFS 8.0.1 you will be able to obtain and view statistics on the performance resources (CPU, operations, data read, data written, etc.) for OneFS jobs and services. This will allow you to identify quickly whether a particular job or service may be the cause of performance issues. These statistics are available via the CLI and Platform API, and can be visualized with InsightIQ 4.1. In future releases these capabilities will be expanded to clients, IP addresses, users, protocols and more!

These are just some of the new features OneFS 8.0.1 has to offer. OneFS 8.0.1 also improves our support for Mac OS clients, SMB, audit, NDMP and data migrations, to name a few other areas. The white paper, Technical Overview of New and Improved Features of EMC Isilon OneFS 8.0, provides additional details on these and other new and improved features in OneFS 8.0.1.

IsilonSD Edge Management Server version 1.0.1

This July EMC released a new version of IsilonSD Edge Management Server. Version 1.0.1 provides support for VMware ESX 6.0 in addition to previously supported ESX versions. This management server also enables monitoring of the IsilonSD Edge Clusters via EMC’s Secure Remote Support (ESRS) server and tools.

Isilon CloudPools Just Got Easier to Manage

OneFS 8.0.1 provides improved flexibility for CloudPools deployments in the enterprise with the introduction of proxy support. This allows administrators to specify one or more proxy servers between the Isilon cluster and your cloud provider of choice.

The Data Lake Journey is Just Beginning!

OneFS 8.0.1 is an important step on the data lake journey; however, you can rest assured we are not stopping here! Look forward to amazing new hardware and software features in coming releases as we build on the Performance Resource Management Framework, provide more workload-specific enhancements to address our customers’ needs and deliver new levels of supportability, serviceability, scale and performance. Don’t wait, upgrade. Click here to download OneFS 8.0.1.

Announcing Isilon OneFS 8.0.1

David Noy

VP Product Management, Emerging Technologies Division at EMC

It’s really been an exhilarating last couple of months leading up to the recent historic merger between Dell and EMC! We just completed our first Dell EMC World and announced Isilon All-Flash last week. While all that was in progress, the Isilon team was heads-down, focused on the next update to the industry-leading Scale-Out NAS OneFS operating system.

Today, we’re announcing the new OneFS 8.0.1 release with a strong focus on strengthening the Data Lake with features supporting the horizontal and vertical markets we serve. For the horizontal markets, we’ve added new and improved capabilities around Hadoop big data analytics, Isilon CloudPools, and IsilonSD Edge. For the vertical industries we support, we’ve focused on addressing the needs of the Healthcare and Financial markets.

Customers continue to gain more value from their data with analytics. Hadoop-based solutions have always been a pillar for Isilon customers because of native support for the HDFS protocol in the OneFS operating system. In OneFS 8.0.1, we’ve added support for Apache Ambari to proactively monitor key performance metrics and alerts, which gives enterprise customers a single point of management for the entire Hadoop cluster. In addition, from a security perspective, not only have we integrated with Apache Ranger to deliver seamless authorization and access control, but we’ve also added support for end-to-end encryption of data in flight between Isilon nodes and the HDFS client.

Many Isilon enterprise customers continue to use OneFS because of its simplicity and ease of management at scale. We’ve added many new features for enterprises, like CloudPools proxy support, to increase security, reduce risk, and simplify management. For IsilonSD Edge software-defined storage, we’ve added support for VMware ESX 6.0 and have seamlessly integrated with EMC Secure Remote Support (ESRS) for remote monitoring, issue resolution and troubleshooting.

Other enterprise capabilities include seamless non-disruptive upgrades from OneFS 8.0, upgrade rollback support, a 5X improvement in audit performance and a completely rewritten framework for performance resource management, reporting and data insights.

Isilon deployments continue to add value for customers across verticals like Media & Entertainment, Healthcare, Life Sciences, EDA and others. In this release we have strengthened our solutions for the Healthcare and Finance verticals. For Healthcare PACS workloads, we’ve added capabilities in OneFS 8.0.1 that increase efficiency, optimize storage and significantly improve storage utilization for PACS archive workloads. For the Financial industry, we’ve added seamless integration of compliance data with business continuity features by integrating SmartLock compliance mode with SyncIQ replication for push-button failover and failback.

OneFS 8.0.1, the first major upgrade to the OneFS 8.0 code base, contains a number of features that many enterprises were waiting for. If you have been holding off on upgrading to the OneFS 8.0 code base because you generally wait for a subsequent “dot release”, today is the day: your wait is over!

Software defined scale-out NAS extends your Data Lake from core to edge: IsilonSD Edge NAS software

Sri Seshadri

Product Marketing at EMC Isilon

When people consider enterprise data growth, they often focus on the ‘core’ IT within the corporate headquarters and datacenter. But what’s happening further away from the core – at your remote offices and branch offices?

We all know that the amount of enterprise data requiring storage is doubling every 2–3 years (according to analyst IDC’s ‘Digital Universe’ study). Managing these ever-growing quantities of (mostly unstructured) data is a constant challenge for most enterprises.

At the enterprise core, EMC Isilon is already addressing that challenge. The Isilon data lake offering helps you consolidate your data, eliminate storage silos, simplify management, increase data protection, and gain value from your data assets. Isilon’s built-in multi-protocol capabilities support a wide range of traditional and next-gen applications – including data analytics that can be used to gain better insights to accelerate your business.

But data is also growing at enterprise edge locations. A recent ESG study (“Remote Office/Branch Office Technology Trends”, May 2015) showed that 68% of organizations now have an average of more than 10 TB of data stored at each branch office – while only 23% reported this amount of edge-stored data in 2011.


Soon you won’t say Travel Safe, instead you’ll say Travel Smart!

Keith Manthey

CTO of Analytics at EMC Emerging Technologies Division

As a frequent traveler myself, I can appreciate this situation. A lone traveler is enjoying a quiet evening in their hotel. As they unwind from the day, they peruse the local paper. They are shocked to learn that their attempt at returning home the next day will be dashed by transit strikes. All modes of public transportation will be shut down, making for an ill-timed exit from their current travel stop. There are certainly other ways for the traveler to reach the airport, but the 5x surge pricing on their popular ride-sharing application makes it an expensive trip. There is also an expectation that the ride-sharing drivers might face violence from striking transit workers. This all could have been avoided if their company subscribed to a travel alert service for pending situations. Situational awareness tools that can monitor travel threats and pair them with traveler itineraries are an evolving field. Such tools give the weary traveler advance warning to seek personal safety and adjust their travel plans accordingly. In the case of our weary traveler, an advance warning would allow them to change their travel plans in time to avoid this sticky situation.


What do Analytics and the Suez Canal have in common?

Suresh Sathyamurthy

Sr. Director, Product Marketing & Communications at EMC

1859: Egyptian workers under French engineers begin construction of the Suez Canal. A canal across the Isthmus of Suez would cut the ocean distance from Europe to Asia by up to 6,000 miles, and it could be built at sea level, without any locks. Circumventing the additional travel would reduce risk, the overhead of additional supplies, and the number of sailors required. Completed ten years later, the canal had an immediate effect on world trade. This wonder rapidly shrank the globe, not only geographically but also in the time it traditionally took to gain business and economic benefits.

The economic benefits of the Suez Canal and In-Place Hadoop analytics

My metaphor here is that, like sailing the old 12,000+ mile route around the coast of Africa, the traditional method of storing and moving data for analysis is a long and arduous journey that affects your business and economic benefits. Just as the pre-Suez Canal journey from Europe to Asia required significantly more time, larger ships, more crew and more provisions, the traditional route to analytics requires more time (copying and moving data), bigger ships (3x storage capacity), more crew (IT resources) and more provisions (overhead). Now, imagine taking the EMC data lake route that reduces overhead, takes much less time, and offers increased flexibility. The EMC Isilon data lake with its native Hadoop Distributed File System (HDFS) support is the modern route to actionable results. It effectively brings Hadoop to where your data exists today, as opposed to having to ship and replicate your data to a separate Hadoop stack for analysis.

The Open Data Platform Initiative (ODPi), IBM and EMC Isilon

The Isilon data lake’s shared storage architecture natively supports HDFS and the ODPi common platform. IBM, EMC, Pivotal and Hortonworks established the ODPi to create a standardized, common platform for Hadoop analytics that enables organizations to realize business results more quickly. This brings us to the EMC and IBM analytics collaboration. Because IBM BigInsights is part of the ODPi, there is now another choice for in-place analytics with the EMC data lake. It quickly became evident to both EMC and IBM that there was strong customer demand for IBM BigInsights and EMC Isilon to align on a data lake approach to analytics. The EMC and IBM collaboration enables analytics on your data right where it is, within the EMC Isilon data lake, while IBM BigInsights provides the separate compute resources that analyze the data. Now you’re on the expedited route to business analytics with EMC Isilon and IBM.

Whether you are looking to gain a 360-degree view of your customers, attempting to prevent fraud in the financial markets, or making smarter infrastructure investments, the increased efficiencies of the partnership allow you to be nimble in understanding and reacting to what your data is telling you.

About 15,000 ships make the 11-hour journey through the canal each year. It’s estimated that the canal bears roughly 8 percent of the world’s shipping, and it is recognized as one of the most important waterways in the world. Forrester Research¹ predicts that big data analytics will be the number 2 priority of corporations, and states that Hadoop has already disrupted the economics of data. Just as the Suez Canal offers key business benefits for trade between Europe and Asia, so does in-place analytics. Here’s how:

  • No moving and copying of data
  • No 3X replication of data
  • Increased storage utilization efficiency (to an average of 80%)
  • Enterprise data resiliency and availability
  • Enterprise grade security features
  • Quicker time to business insight
  • Smarter infrastructure investments
  • Reduction of CAPEX and OPEX
  • Increased choice and flexibility
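To put rough numbers on the replication and utilization points in the list above, here is a minimal arithmetic sketch. It assumes that "3X replication" means three full copies of every block (so one third of raw capacity holds unique data) and that the 80% figure is the ratio of usable to raw capacity. The 100 TB dataset size and the function name are hypothetical, chosen only for illustration.

```python
def raw_capacity_needed(usable_tb: float, efficiency: float) -> float:
    """Raw capacity (TB) required to hold usable_tb at a given storage efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return usable_tb / efficiency


# Assumption: 3x replication keeps three full copies of the data,
# so only one third of the raw capacity holds unique data.
REPLICATION_EFFICIENCY = 1 / 3

# Average storage utilization quoted in the list above.
DATA_LAKE_EFFICIENCY = 0.80

dataset_tb = 100.0  # hypothetical dataset size, for illustration only

replicated = raw_capacity_needed(dataset_tb, REPLICATION_EFFICIENCY)
in_place = raw_capacity_needed(dataset_tb, DATA_LAKE_EFFICIENCY)

print(f"Raw capacity with 3x replication: {replicated:.0f} TB")
print(f"Raw capacity at 80% utilization:  {in_place:.0f} TB")
print(f"Difference:                       {replicated - in_place:.0f} TB")
```

Under these assumptions, the same dataset needs well under half the raw capacity when it is analyzed in place rather than copied into a triple-replicated Hadoop cluster. Actual figures depend on the protection level and file sizes in a given deployment.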

In summary, to return to the metaphor, the modern route to analytics shortens the time to benefit and can be achieved with smaller ships, less crew, and fewer provisions.

Where can I get more details?

The EMC Hadoop Starter Kit for IBM BigInsights is available and has instructions on how to build and deploy IBM BigInsights Open Platform with EMC Isilon. You can also learn more about the Hadoop enabled EMC Data Lake here.

1 Source: Forrester Predictions 2015: Hadoop Will Become a Cornerstone of Your Business Technology Agenda

Big Data’s Revolutionary Roadmap

Ashvin Naik

Cloud Infrastructure Marketing at Dell EMC

Big data has the potential to transform humankind – from helping us cure diseases to simplifying and streamlining our lives. Now, let us not get into the debatable aspects of our increasingly digital universe but focus on the art of the possible.

After having digitized our entire lives, which is well documented in the digital universe study, we have moved on to machines.

With the help of devices, wearables, sensors and systems, we’ve begun collecting, connecting and disseminating information, thoughts, and experiences in ways unimaginable just a few years back.

Most organizations are looking for ways to ride the waves of big data and transform their domain – be it customer experiences, processes, systems or plain and simple time management. All big data projects need a roadmap, a recipe that can iteratively move you toward your goal. After wading through a wide array of literature, I was able to simplify the process into three steps: an exploration phase followed by optimization that leads to true transformation.

These iterative steps can help guide your projects, irrespective of domain, in an easy-to-execute format.

All big data projects start with a question; data scientists will not touch anything without a good question, and the first step is often to explore. The question can be as generic as:

  • “Is our customer experience with our company good enough to keep them coming?”
  • “How can we transform the customer experience so that customers become staunch evangelists for our brand?”
    or a much more specific question like
  • “How much more would a buyer spend if we kept him or her on the site for two more minutes?”



