
Unwrapping Machine Learning

Ashvin Naik

Cloud Infrastructure Marketing at Dell EMC

In a recent IDC spending guide titled Worldwide Cognitive Systems and Artificial Intelligence Spending Guide, some remarkable numbers were reported in terms of opportunity and growth: a 50+ percent CAGR, with verticals pouring billions of dollars into cognitive systems. One of the key components of cognitive systems is Machine Learning.

According to Wikipedia, Machine Learning is a subfield of computer science that gives computers the ability to learn without being explicitly programmed. Just these two pieces of information were enough to get me interested in the field.

After hours of daily searching and digging through inane babble and noise across the internet, an understanding of how machines can learn evaded me for weeks, until I hit the jackpot. A source that shall not be named pointed me to a “security by obscurity” share that held exact and valuable insights on machine learning. It was simple, elegant, and it completely made sense to me.

Machine Learning was not all noise; it works on a very simple principle. Imagine there is a pattern in the world that can be used to forecast or predict the behavior of some entity. There is no mathematical notation available to describe the pattern, but if you have the data needed to plot it, you can use Machine Learning to model it. This may sound like a whole lot of mumbo jumbo, so allow me to break it down in simple terms.

Machine learning can be used to understand patterns so you can forecast or predict an outcome, provided that:

  • You are certain there is a pattern
  • You do not have a mathematical model to describe the pattern
  • You have the data to try to figure out the pattern.

Voilà, this makes so much sense already. If you have data and know there is a pattern but don’t know what it is, you can use machine learning to find it. The applications for this are endless, from natural language processing and speech-to-text to predictive analytics. The most important is forecasting, something we do not give enough credit these days. The most critical component of Machine Learning is data: you must have the data. If you do not have data, you cannot find the pattern.
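To make that concrete, here is a minimal sketch of the idea (my own illustration, not from the original post or any Dell EMC product): the “pattern” lives only inside a made-up data-generating function, and a scikit-learn model learns to predict it from the samples alone, with no equation ever handed to it. The function, data and model choice are all assumptions made for the example.

```python
# A minimal sketch: learn a pattern from data without writing down its formula.
# The "pattern" is hidden inside make_samples(); the model only sees (x, y) pairs.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

def make_samples(n=1000):
    """Stand-in for real measurements; in practice this is your stored data."""
    x = rng.uniform(0, 10, size=(n, 1))
    y = np.sin(x[:, 0]) * x[:, 0] + rng.normal(0, 0.3, size=n)  # unknown pattern + noise
    return x, y

x_train, y_train = make_samples()

# The model infers the pattern from the data alone -- no explicit equation supplied.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(x_train, y_train)

# Forecast / predict for inputs the model has never seen.
x_new = np.array([[2.5], [7.0]])
print(model.predict(x_new))
```

The only ingredients are the three conditions above: a pattern we believe exists, no formula for it, and enough data to let the model work it out.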

As a cloud storage professional, I find this a huge insight. You must have the data: pristine, raw data coming from the systems that generate it, straight from the horse’s mouth. I know exactly where my products fit in. We are able to ingest, store, protect and expose data for any purpose, in its native format and complete with its metadata, all through one system.

We have customers in the automobile industry leveraging our multi-protocol cloud storage across 2,300 locations in Europe to capture data from cars on the road. They use proprietary Machine Learning systems to look for patterns in how their customers, the car owners, use their products in the real world, and to predict the parameters for designing better, more reliable and more efficient cars. We have customers in the life-sciences business saving lives by looking at patterns of efficacy and effective therapies for terminal diseases. Our customers in retail are using Machine Learning to detect fraud and protect their customers. The list goes on and on.

I personally do not know the details of how they make it happen, but this is the world of the third platform. There are so many possibilities and opportunities ahead, provided we have the data. Talk to us and we can help you capture, store and secure your data so you can transform humanity for the better.

 

Learn more about how Dell EMC Elastic Cloud Storage can fit into your Machine Learning Infrastructure

 

 

Converged Infrastructure + Isilon: Better Together

David Noy

VP Product Management, Emerging Technologies Division at EMC

You can’t beat Isilon for simplicity, scalability, performance and savings. We’re talking world-class scale-out NAS that stores, manages, protects and analyzes your unstructured data with a powerful platform that stays simple, no matter how large your data environment. And Dell EMC already has the #1 converged infrastructure with blocks and racks. So bringing these two superstars together into one converged system is truly a case of one plus one equals three.

This convergence, pairing Vblock, VxBlock and VxRack systems with the Technology Extension for Isilon, creates an unmatched combination that flexibly supports a wide range of workloads with ultra-high-performance, multi-protocol NAS storage. And the benefits really add up, too.

As impressive as the numbers are, it all boils down to value and versatility. These converged solutions give you more value for your investment because, quite simply, they store more data for less. And their versatility allows you to optimally run both traditional and nontraditional workloads, including video surveillance, SAP/Oracle/Microsoft applications, mixed workloads that generate structured and unstructured data, Electronic Medical Records, Medical Imaging and more, all on infrastructure built and supported as one product.

With a Dell EMC Converged System, you’ll see better, faster business outcomes through simpler IT across a wide range of application workloads. For more information on modernizing your data center with the industry’s broadest converged portfolio, visit emc.com/ci or call your Dell EMC representative today.

 

Learn more about Converged Infrastructure and Isilon. Also, check out the full infographic.

When It Comes To Data, Isolation Is The Enemy Of Insights

Brandon Whitelaw

Senior Director of Global Sales Strategy for Emerging Technologies Division at Dell EMC


Within IT, across data storage, servers and virtualization, there have always been ebbs and flows of consolidation and deconsolidation. We went from terminals to PCs, and now we’re going back to virtual desktops; it flows back and forth between centralized and decentralized. It’s also common to see IT trends repeat themselves.

In the mid-to-late 90s, the major trend was to consolidate structured data sources onto a single platform: to go from direct-attached storage with dedicated servers per application to a consolidated central storage tier, called a storage area network (SAN). SANs allowed organizations to go from a shared-nothing (SN) architecture to a shared-everything (SE) architecture, where you have a single point of control, allowing users to share available resources and not have data trapped or siloed within independent direct-attached storage systems.

The benefit of consolidation is an ongoing IT trend that continues to repeat itself, whether in storage, servers or networking. What’s interesting is that once you consolidate all the data sources, IT is finally able to look at doing more with them. Consolidation onto a SAN enables cross-analysis of data sources that were previously isolated from each other, something that was simply infeasible before. With these sources in one place, systems such as the enterprise data warehouse could emerge: the concept of ingesting and transforming all the data onto a common schema to allow for reporting and analysis. Companies embracing this process drove growth in IT consumption because of the value gained from that data. It also led to new insights, with the result that most of the world’s finance, strategy, accounting, operations and sales groups now rely on the data they get from these enterprise data warehouses.

Next, companies started giving employees PCs, and what do you do on PCs? Create files. Naturally, the next step is to ask, “How do I share these files?” and “How do I collaborate on these files?” The end result is home directories and file shares. From an infrastructure perspective, there needed to be a shared, common platform for this data to come together. Regular PCs can’t talk to a SAN without direct block-level access over Fibre Channel or a direct connection to a server in the data center, so unless you want everyone to physically sit in the data center, you run Ethernet.

Businesses ended up building Windows file servers to act as the middleman, brokering data between users on Ethernet and the backend SAN. This method worked until the Windows file servers steadily grew into the dozens. Yet again, IT teams were left with complexity, inefficiency and the original problem of several isolated silos of data and multiple points of management.

So what’s the solution? Take the middleman out. Take the file system that was sitting on top of the file servers, move it directly onto the storage system, and let Ethernet go straight to it. Thus network-attached storage (NAS) was born.

However, continuing the cycle, what started as a single NAS eventually became dozens for organizations. Each NAS device contained specific applications with different performance characteristics and protocol access. Also, each system could only store so much data before it didn’t have enough performance to keep up, so systems would continue expanding and replicating to accommodate.

This escalates until an administrator is startled to realize that 80 percent of the data the company creates is unstructured. The biggest challenge with unstructured data is that it’s not confined to the four walls of a data center. Once again, we find ourselves with silos that aren’t being shared (notice the trend repeating itself?). Ultimately, this creates the need for a scale-out architecture with multiprotocol data access that can combine and consolidate unstructured data sources to optimize collaboration.

Doubling every two years, unstructured data makes up the vast majority of all data being created. Traditionally, the approach to gaining insights from this data has involved building yet another silo, which prevents you from having a single source of truth with all your data in one place. Due to the associated cost and complexity, not all of the data goes into a data lake, for instance, but only the sub-samples of the data relevant to an individual query. One way to end this cycle is to invest in a storage system that not only has the protocol access and tiering capabilities to consolidate all your unstructured data sources, but can also serve as your analytics platform. That way your primary storage, with the single source of truth and ease of management that come with it, lends itself to the next phase: unlocking its insights.

Storing data is typically viewed as a red-ink line item, but keeping more of it can actually work in your favor, not because regulations or policies dictate it, but because a deeper, wider set of data can provide better answers. Often, you may not know what questions to ask until you’re able to see data sets together. Consider the painting technique pointillism. If you look too closely, it’s just a bunch of dots of paint. However, if you stand back, a landscape emerges, ladies with umbrellas materialize and suddenly you realize you’re staring at Georges Seurat’s famous painting, A Sunday Afternoon on the Island of La Grande Jatte. As with pointillism, in data analytics you never think of connecting the dots if you don’t even realize they’re next to one another.

Breakfast with ECS: Most Wanted Cloud Storage Feature Series

Diana Gao

Senior Product Marketing Manager at EMC² ECS

Welcome to another edition of Breakfast with ECS, a series where we take a look at issues related to cloud storage and ECS (Elastic Cloud Storage), EMC’s cloud-scale object storage platform.

Social media, mobile, cloud and big data are here to stay. Each has brought new opportunities to create more intimate, immediate relationships with customers, deliver better experiences, enhance business value, and gain competitive advantage. The principal asset that makes all this possible is the insight that comes from data – data that is growing exponentially and without structure.

Burdened by traditional storage systems, IT organizations have struggled to keep up with this explosion of data. This has led to adoption of public-cloud storage platforms like AWS S3, driven by compelling economic advantages over traditional SAN and NAS storage systems. However, public cloud storage platforms involve navigating some fundamental tradeoffs in the areas of data residency, compliance with local laws and regulations, as well as unforeseen costs.

With EMC’s Elastic Cloud Storage (ECS), you don’t have to make these tradeoffs. It is one storage system that balances capacity (a flat budget for growing data), capability (the greater demands imposed by applications) and the perceived economic advantages of the public cloud.
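Because ECS speaks an S3-compatible object API, applications written against public cloud object storage can generally be pointed at an ECS endpoint instead. Here is a minimal sketch of what that looks like with the boto3 library; the endpoint URL, credentials, bucket and object names are placeholders invented for this example, not values from the original post.

```python
# A minimal sketch: talking to an S3-compatible endpoint (such as ECS)
# with the same boto3 code you would use against AWS S3.
# Endpoint, credentials and bucket below are placeholders, not real values.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://ecs.example.com:9021",   # hypothetical ECS S3 endpoint
    aws_access_key_id="ECS_OBJECT_USER",           # placeholder credentials
    aws_secret_access_key="ECS_SECRET_KEY",
)

# Create a bucket and store an object, exactly as you would against AWS S3.
s3.create_bucket(Bucket="sensor-archive")
s3.put_object(
    Bucket="sensor-archive",
    Key="cars/2017-01-01.json",
    Body=b'{"vin": "TEST123", "speed_kph": 88}',
)

# Read the object back.
obj = s3.get_object(Bucket="sensor-archive", Key="cars/2017-01-01.json")
print(obj["Body"].read())
```

The point of the sketch is simply that the same application code can target on-premises object storage or the public cloud, which is what makes the capacity-versus-capability tradeoff above a choice rather than a constraint.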

Watch the video below and learn more about why Elastic Cloud Storage (ECS) is one storage solution that can cater to all of your needs.

With ECS, you’ll enjoy these key benefits:

  • Multi-purpose platform
  • Smart storage
  • Enterprise class
  • Superior economics

What features make ECS such an awesome platform? Stay tuned for more blogs.

Learn more about ECS and try ECS for free.
