Build vs. Buy for OpenStack Private Clouds

Doug Bowers

VP of Engineering, Infrastructure Solutions Group at Dell EMC


Over the past several months there have been some excellent posts on this blog highlighting Dell EMC build vs. buy options as they relate to OpenStack. Dell EMC offers a range of OpenStack solutions, starting with enabling technologies for customers who want a do-it-yourself (DIY) cloud and ending with turnkey solutions like VxRack System with Neutrino.

The goal for VxRack Neutrino is to bring the benefits of turnkey deployment and integrated lifecycle management to an open source software stack –  a stack that has its roots firmly planted in the DIY world.

OpenStack started life as a DIY alternative to public cloud offerings.  Its popularity has extended to customers that want the benefits of an open source platform without having to hire the expertise to assemble and operate the platform themselves (i.e. non-DIY) – hence VxRack Neutrino.  So what have we learned from customers using or considering VxRack Neutrino?

  • Customers want products that make it easier to deploy open source software stacks – products that pre-integrate disparate software components and ensure they will work on a stable hardware platform.  This need is not limited to initial installation and deployment; it extends to day 2 and beyond, where customers must successfully monitor and manage the system and have a clear way to upgrade the various software components that must stay in sync (life cycle management).
  • VxRack Neutrino is a turnkey solution – which means the customer gives up a degree of flexibility to gain operational efficiency.  While in many cases this is a tradeoff customers are willing to make, early feedback indicates customers want more hardware flexibility than a turnkey solution like VxRack Neutrino offers.
  • Customers also indicate that support and training on the OpenStack distribution itself are critical. Customers have expressed interest in getting these services from Dell EMC partner companies (e.g. Red Hat).

So what does all this mean?  Dell EMC has made the strategic decision to meet this customer demand for OpenStack private clouds with our Reference Architecture and Validated System portfolio, and to end-of-life VxRack Neutrino.

Dell EMC has the following solutions for customers looking to build OpenStack private clouds:

  • Red Hat OpenStack Solution – A validated solution using Dell servers and switches delivered via our strategic partnership with Red Hat and jointly engineered by Dell EMC and Red Hat
  • ScaleIO OpenStack Reference Architectures – A validated building block of ScaleIO software-defined block storage and Dell servers. As a heterogeneous software-defined storage offering, ScaleIO supports Red Hat, Mirantis and Canonical OpenStack environments.

These options provide outstanding hardware flexibility.  They also leverage partner relationships (e.g. Red Hat) to provide customers with the OpenStack support and training experience they are seeking, while using a combination of up-front engineering and validation along with services to provide a turnkey experience.

Dell EMC remains strongly committed to supporting the OpenStack ecosystem as demonstrated by the breadth of our offerings.   Some areas of particular focus:

  • OpenStack community engagement: This includes community participation and contributions to enhance OpenStack, development and support of plug-ins for all of our products, and development of reference architectures with multiple partners.
  • OpenStack committers: A steadily increasing number of commits and committers release over release, and broad support for integrating Dell EMC storage products into an OpenStack-based cloud.

In summary, we remain committed to listening to our customers and offering choice across a broad range of OpenStack deployment options – from best-in-class components for those looking to “build” to validated solutions and reference architectures for those looking for more.

Data Security: Are You Taking It For Granted?

Keith Manthey

CTO of Analytics at EMC Emerging Technologies Division


Despite the fact that the Wells Fargo fake account scandal first broke in September, the banking giant still finds itself the topic of national news headlines and facing public scrutiny months later. While it’s easy to assign blame, whether to the now-retired CEO, the company’s unrealistic sales goals and so forth, let’s take a moment to discuss a potential solution for Wells Fargo and its enterprise peers. I’m talking about data security and governance.

There’s no question that the data security and governance space is still evolving and maturing. Currently, the weakest link in the Hadoop ecosystem is masking of data. As it stands at most enterprises using Hadoop, access to the Hadoop space translates to uncensored access to information that can be highly sensitive. Fortunately, there are some initiatives to change that. Hortonworks recently shipped an updated Apache Ranger with HDP 2.5, which starts to add configurable data masking. Shockingly enough, I can count on one hand the number of clients that understand they need this feature. In some cases, CIO- and CTO-level executives aren’t even aware of just how critical configurable row and column masking capabilities are to the security of their data.
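
To make what configurable masking means in practice a little more concrete, here is a minimal Python sketch of the idea – a hypothetical policy that pairs sensitive columns with masking functions. It is an illustration only, not the Apache Ranger API or any Dell EMC product.

```python
# Hypothetical sketch of column-level masking (not the Apache Ranger API).
# Each sensitive column is paired with a masking function; columns that are
# not listed pass through untouched.

def mask_ssn(value: str) -> str:
    """Show only the last four digits of a Social Security number."""
    return "***-**-" + value[-4:]

def mask_email(value: str) -> str:
    """Hide the local part of an email address."""
    return "****@" + value.split("@", 1)[1]

MASKING_POLICY = {
    "ssn": mask_ssn,
    "email": mask_email,
}

def apply_masking(record: dict) -> dict:
    """Return a copy of the record with sensitive columns masked."""
    return {
        column: MASKING_POLICY.get(column, lambda v: v)(value)
        for column, value in record.items()
    }

row = {"name": "J. Smith", "ssn": "123-45-6789", "email": "jsmith@example.com"}
print(apply_masking(row))
# {'name': 'J. Smith', 'ssn': '***-**-6789', 'email': '****@example.com'}
```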

Another aspect I find to be shocking is the lack of controls around data governance in many enterprises. Without data restrictions, it’s all too easy to envision Wells Fargo’s situation – which resulted in 5,300 employees being fired – repeating itself at other financial institutions. It’s also important to point out that entering unmasked, sensitive and confidential healthcare and financial data into a Hadoop system is not only an unwise and negligent practice; it’s a direct violation of mandated security and compliance regulations.

Identifying the Problem and Best Practices

From enterprise systems administrators to C-suite executives, both groups are guilty of taking data security for granted and of assuming that masking and encryption capabilities come by default with having a database. These executives are failing to do their research, dig into the weeds and ask the more complex questions, often because their professional background focused on analytics or IT rather than governance. Unless an executive’s background includes building data systems or setting up controls and governance around these types of systems, he or she may not know the right questions to ask.

Another common mistake is not strictly controlling access to sensitive data, putting it at risk of theft and loss. Should customer service representatives be able to pull every file in the system? Probably not. Even IT administrators’ access should be restricted to the specific actions and commands required to perform their jobs. Encryption provides some file-level protection from unauthorized users, but authorized users who have permission to unlock an encrypted file can often look at fields that aren’t required for their job.
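
A rough sketch of what least-privilege access to fields can look like is below; the roles and field names are invented for the example and not drawn from any particular product.

```python
# Hypothetical sketch of least-privilege field access: each role sees only
# the fields it needs, even if it is authorized to open the file at all.
# Roles and field names are illustrative.

ALLOWED_FIELDS = {
    "customer_service": {"customer_id", "name", "open_tickets"},
    "fraud_analyst":    {"customer_id", "name", "ssn", "account_balance"},
    "it_admin":         {"customer_id"},  # admins manage systems, not customer data
}

def redact_for_role(record: dict, role: str) -> dict:
    """Drop every field the role is not explicitly allowed to read."""
    allowed = ALLOWED_FIELDS.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}

record = {"customer_id": 42, "name": "A. Patel", "ssn": "987-65-4321",
          "account_balance": 1200.50, "open_tickets": 2}
print(redact_for_role(record, "customer_service"))
# {'customer_id': 42, 'name': 'A. Patel', 'open_tickets': 2}
```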

As more enterprises adopt Hadoop and other similar systems, they should consider the following:

Do your due diligence. When meeting with customers, I can tell they’ve done their homework if they ask questions about more than the “buzz words” around Hadoop. These questions alone indicate they’re not simply regurgitating a sales pitch and have researched how to protect their environment. Be discerning and don’t assume the solution you’re purchasing off the shelf contains everything you need. Accepting what the salesperson has to say at face value, without probing further, is reckless and could land an organization in a very damaging and costly security scandal.

Accept there are gaps. Frequently, we engage with clients who are confident they have the most robust security and data governance available.
However, when we start to poke and prod a bit more to understand what other controls they have in place, the astonishing answer is often zero. Lest we forget, “core” Hadoop only obtained built-in security in 2015, and without third-party add-ons the governance around the software framework is still in its infancy in many ways. Without something as rudimentary in traditional IT security as a firewall in place, it’s difficult for enterprises to claim they are secure.

Have an independent plan. Before purchasing Hadoop or a similar platform, map out your exact business requirements, consider what controls your business needs and determine whether or not the product meets each of them. Research regulatory compliance standards to select the most secure configuration of your Hadoop environment and the tools you will need to supplement it.

To conclude, here is a seven-question checklist enterprises should be able to answer about their Hadoop ecosystem:

  • Do you know what’s in your Hadoop?
  • Is it meeting your business goals?
  • Do you really have the controls in place that you need to enable your business?
  • Do you have the governance?
  • Where are your gaps and how are you protecting them?
  • What are your augmented controls and supplemental procedures?
  • Have you reviewed the information the salesperson shared and mapped it to your actual business requirements to decide what you need?

Solving the Video Vortex at the Secured Cities Conference

Gary Buonacorsi

CTO of State and Local Government at Dell EMC


I’m in Houston today at the Secured Cities conference, the leading government security and public safety event, to participate on the “Video Vortex Drives Public Safety to the Datacenter” panel. I’ll be joined by Kenneth Baker, director of Infrastructure Support at the Metropolitan Transit Authority of Harris County (METRO), who recently helped implement a citywide video surveillance system for the bus and trolley service. I’m looking forward to hearing more about METRO’s specific architecture, the pain points and challenges the department faced and what problems it hopes to solve with the new system.

For those of you unable to join us in the “Space City” of Houston, here’s a glimpse of what I’ll be covering in the session:


What is driving the increase in data for state and local government? 

One key factor is the emergence of new surveillance technology, such as drones, body cameras, license plate trackers and audio/video reconnaissance. In particular, drone usage in the public safety arena has seen significant growth for providing situational awareness in tactical events such as bank robberies or hostage situations. In addition to tactical operations, drones are also being used around the country for policing activities. Pilot programs are popping up in cities like Modesto, California, where law enforcement is using drones to assist with search warrants and surveying crime scenes. The sky’s the limit for drone usage in law enforcement, as evidenced by Amazon patenting a voice-activated shoulder-mounted drone earlier this month that officers can use to help assess dangerous situations.

Secondly, resolution requirements are increasing. Grainy pictures are ineffectual when it comes to facial recognition, analytics and post-evaluation, forcing the transition from standard definition to 4K. As new tools and analytics are deployed, resolution requirements climb even higher.

Perhaps the most common reason for the increase in data for public safety organizations is the growing number of cameras and longer video retention times. With the rise of citywide surveillance, cities such as London and New York City are moving towards having cameras on practically every street corner. Discovery activities in legal proceedings are extending the retention period and the chain-of-evidence storage requirements.
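
To get a feel for how quickly camera counts, resolution and retention multiply, here is a back-of-the-envelope Python sketch. The bitrates are assumed averages for illustration only; real figures depend on codec, frame rate and scene complexity.

```python
# Rough storage estimate for a camera fleet. Bitrates are assumptions,
# not figures from any specific deployment.

SECONDS_PER_DAY = 86_400
BITRATES_MBPS = {"SD": 1.0, "1080p": 4.0, "4K": 15.0}  # assumed average bitrates

def storage_tb(cameras: int, resolution: str, retention_days: int) -> float:
    """Total storage in terabytes over the retention window."""
    bits = BITRATES_MBPS[resolution] * 1e6 * SECONDS_PER_DAY * retention_days * cameras
    return bits / 8 / 1e12  # bits -> bytes -> terabytes

# 500 cameras, 30-day retention: the jump from SD to 4K is dramatic.
print(f"SD : {storage_tb(500, 'SD', 30):8.1f} TB")   # ~  162 TB
print(f"4K : {storage_tb(500, '4K', 30):8.1f} TB")   # ~ 2430 TB
```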


Given this exponential data growth, how is it impacting organizations and what do they need to focus on?

IT departments at these organizations should look for architectures that are open source, scalable and enterprise-ready to integrate with the system they currently have, in addition to any changes they may make in the future. Simply put, department heads should avoid spot solutions and instead adopt an integrated, strategic approach to help plan for the years ahead. I would counsel them to look for a solution that allows them to start small but grow big, and easily add more cameras and scale without disrupting the current environment.

The next major area to consider is life cycle management. Previously, video footage was kept for a week before it was written over or deleted. Now, long-term archiving is critical, with the potential for courts to mandate that digital assets such as video evidence in a capital case be maintained indefinitely.
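
As a minimal sketch of what that retention logic looks like in code – with illustrative retention periods and a simple legal-hold flag, since actual policies are set by statute, agency rules and the courts:

```python
# Illustrative retention check for video assets. Periods are examples only.
from datetime import date, timedelta
from typing import Optional

DEFAULT_RETENTION = timedelta(days=7)         # routine footage
EVIDENCE_RETENTION = timedelta(days=365 * 3)  # footage tagged as evidence

def can_delete(recorded_on: date, is_evidence: bool, legal_hold: bool,
               today: Optional[date] = None) -> bool:
    """A clip may be deleted only after its retention window, and never under hold."""
    if legal_hold:
        return False  # court-mandated holds override every retention period
    today = today or date.today()
    retention = EVIDENCE_RETENTION if is_evidence else DEFAULT_RETENTION
    return today - recorded_on > retention

print(can_delete(date(2016, 1, 1), is_evidence=True, legal_hold=True))    # False
print(can_delete(date(2016, 1, 1), is_evidence=False, legal_hold=False))  # True
```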

Organizations must embrace the shift to an enterprise model. For police departments, having body cameras isn’t enough. They must consider how to integrate them into dashboard cameras, 911 call centers, etc., taking each of these point solutions to form an enterprise approach.


Which platform will support retention policies and what are the three different storage architectures? How can organizations escape the video vortex?
Early video surveillance solutions presented a host of challenges, including restricting departments to certain file and storage protocols and communication channels. Combine those factors with non-IP-based cameras, and modernizing existing systems became extremely difficult. The first step for organizations to solve the video vortex is to select an open platform that not only allows them to migrate and move data from system to system, but also enables them to shift providers easily. Open platforms also present more options in terms of analytics and security, enabling departments to apply more traditional security tools on top of their data storage and data transportation needs.

Compute and data storage are the key elements in eliminating the video vortex. Storage is the foundation layer of a sound architecture and must address the needs of an organization, including scalability, an enterprise approach and an open platform to avoid lock-in. Three storage architectures exist today: distributed, centralized and cloud. Police forces that are relatively small typically still rely on a distributed architecture, capturing the data from their cars and body cameras and physically transporting it back from a mobile storage device to a centralized repository where it can then be analyzed and managed. Distributed architectures can be folded into centralized architectures, allowing them to be part of the enterprise approach with a centralized location like police headquarters, schools, airports or the METRO. A centralized architecture makes it possible to gather all of these remote data feeds from their video surveillance solutions and bring them back to a centralized repository. In a case like this, the architecture must be efficient, storing only essential data to minimize utilization rates and costs. It must also be capable of supporting thousands of surveillance devices in order to scale to the multiple distributed architectures that are coming back to one location.

The third architecture to consider is cloud. Cloud presents a useful solution in that it is elastic, scalable, expands very easily and can ramp up very quickly. However, cloud storage can be very costly in light of potential retention policy changes, data sets and cloud size – all of a sudden, the portability of those cloud data sets becomes much more complex. From an architecture perspective, organizations must consider how to bridge that gap and determine the amount of data that can be returned to a more cost-effective on-premises solution without compromising the capabilities that cloud offers.

Finally, distributed, centralized and cloud platforms all underlie the data lake architecture, which is really the foundation for evidence management and helps solve the video vortex public safety organizations are facing.

Dell EMC DSSD D5 And Toshiba Accelerate AI With Deep Learning Test Bed

Jason Tolu

Senior Product Marketing Manager | DSSD, Emerging Technologies Division at Dell EMC


DSSD D5 Rack-Scale Flash Storage Provides High Performance Storage To New Toshiba Deep Learning Test Bed For Facilities Management

With increasing computing power and the Internet of Things supplying ever-increasing sources, types and amounts of data, the potential for new, innovative applications and products is limitless. One way organizations are taking advantage of this is through machine learning or deep learning, where computers learn from data with the use of analytical models. Increasingly complex algorithms are being applied to massive quantities of data to develop machine learning and AI-driven applications such as self-driving cars or smart buildings.

Toshiba Corporation is at the forefront of the deep learning movement. The Toshiba Smart Community Center in Kawasaki, Japan, which opened in 2013, makes use of a wide variety of IoT sensor devices and is a key element in Toshiba’s vision to bring new innovations to market for smarter facility management. To make this vision possible, Toshiba and Dell Technologies have joined forces to develop a deep learning test bed to improve the management of IoT edge devices that provide data to enterprise networks. The jointly developed solution has become the first deep learning platform to be approved by the Industrial Internet Consortium (IIC).

The test bed will be used in Toshiba’s Smart Community Center in Kawasaki, Japan and will utilize big data from a variety of sensors, including building management, air conditioning and building security, to provide more efficient machine control, reduce maintenance costs and improve the management for building facilities.


DSSD D5 Provides The Storage Performance For Machine Learning and AI

Toshiba will be providing the deep learning technology for analyzing and evaluating big data for the deep learning test bed. Dell EMC DSSD D5 will provide the high-speed storage – the low latency, IOPS, bandwidth and capacity required for rapid ingest and complex analytics on large data sets in near real time.

In developing the solution, Toshiba utilized Hadoop to achieve record-breaking performance. Toshiba’s choice of DSSD D5 as the storage layer in this Deep Learning solution validates DSSD D5’s standing as the high performance storage of choice for next generation applications that are striving to take advantage of growing data and computational power.

Limitless Possibilities For Smart Facilities Management

With the integration of Toshiba deep learning and high-performance storage from Dell EMC DSSD D5, Toshiba and Dell Technologies are accelerating the application of artificial intelligence to benefit multiple industries. And, with the IIC’s approval, the jointly developed solution is a major step in the advancement of IoT for industrial usage. The verification of the test bed at the Smart Community Center is expected to be concluded by September 2017. Once the verification is complete, Toshiba intends to roll out the solution to hospitals, hotels, shopping malls, factories and airports.

If you would like to find out more about the Toshiba and Dell Technologies Deep Learning Solution:

India’s Largest Search Engine Dials into Object Storage

Corey O'Connor

Senior Product Marketing Manager at Dell EMC² ETD

Welcome to another edition of the Emerging Technologies ECS blog series, where we take a look at issues related to cloud storage and ECS (Elastic Cloud Storage), Dell EMC’s cloud-scale storage platform. 

Navigating the World Wide Web

The World Wide Web was invented by an independent contractor at a nuclear research facility in Switzerland back in the late ’80s (who knew!). In its early stages, the web was extremely clumsy and had to be completely indexed by hand. It didn’t take long for the computer geeks of the world to create a very rudimentary search engine tool consisting of a searchable database of files that captured public directory listings – the big problem was that the data these tools could ingest was limited, and searching through it was a very manual and tedious task. After a few years of development, “all text” search engines were established (which is what we still use today), giving users the ability to search for any word within the contents of any web page.
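
For readers curious what an “all text” index looks like under the hood, here is a toy inverted index in Python – a deliberately simplified sketch, not how any production engine is built.

```python
# Toy inverted index: the core idea behind "all text" search.
# Real engines add crawling, ranking and stemming; this only shows how
# words map back to the pages that contain them.
from collections import defaultdict

pages = {
    "page1.html": "object storage scales for unstructured data",
    "page2.html": "search engines index the text of every page",
    "page3.html": "every page about storage can be found by search",
}

index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

def search(query: str) -> set:
    """Return the pages that contain every word in the query."""
    hits = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*hits) if hits else set()

print(search("storage search"))  # {'page3.html'}
```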

Up to this point, search engine tools were developed mostly by university researchers and small startups, and although they showed lots of promise, their creators had a difficult time monetizing them. Then one day a spinoff from a startup shop came up with the brilliant idea to sell search terms – a ‘pay-for-placement’ service for businesses – which made search engines one of the most lucrative tech businesses almost overnight.

Just Dial Limited

Like Google in the United States and Baidu in China, Just Dial Limited is the premier search engine provider in India. Just Dial also provides services to the US, UK, UAE and Canada, and satisfies over 1.5 billion daily customer requests that come in from around the world.

The challenge: Just Dial had a strict retention policy of five years for their customers’ data, most of it static and infrequently accessed. Their traditional SAN infrastructure was neither a cost-effective nor a scalable solution, and like many other organizations, they had concerns about putting sensitive customer data into the public cloud. There was also a constant demand for storage from their application developers and storage admins, as capacity always seemed to be running thin.

The solution: Just Dial was in the market for an in-house, native object, cloud-based solution that provided universal access and multi-site support, and that integrated easily with their cloud services. They chose Dell EMC’s Elastic Cloud Storage (ECS) and saw an 80% reduction in their overall storage management costs. Just Dial was able to easily provision unlimited capacity to their end-users, move all static archival data to ECS by policy, and experience true cloud-scale economics across their data centers. Watch the video for the full story.
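
Because ECS exposes an S3-compatible API, moving static data into it can be scripted with standard tooling. The sketch below uses boto3 against a placeholder endpoint, bucket and credentials; it illustrates the idea rather than Just Dial’s actual deployment, where policy-driven tiering is handled by the platform itself.

```python
# Hedged sketch: copying a static file into an ECS bucket over the
# S3-compatible API with boto3. Endpoint, bucket and credentials are
# placeholders, not real values.
import boto3

ecs = boto3.client(
    "s3",
    endpoint_url="https://ecs.example.internal:9021",  # hypothetical ECS endpoint
    aws_access_key_id="ECS_ACCESS_KEY",                # placeholder credentials
    aws_secret_access_key="ECS_SECRET_KEY",
)

# Land infrequently accessed data in an archive bucket.
ecs.upload_file(
    Filename="listings-2012.csv",
    Bucket="archive-bucket",            # hypothetical bucket
    Key="static/2012/listings-2012.csv",
)
size = ecs.head_object(Bucket="archive-bucket",
                       Key="static/2012/listings-2012.csv")["ContentLength"]
print(f"archived {size} bytes")
```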


Want to start your Digital Transformation with ECS? Find out how by visiting us at www.dellemc.com/ecs or try the latest version of ECS for FREE for non-production use by visiting www.dellemc.com/getecs.

When It Comes To Data, Isolation Is The Enemy Of Insights

Brandon Whitelaw

Senior Director of Global Sales Strategy for Emerging Technologies Division at Dell EMC


Within IT, data storage, servers and virtualization, there have always been ebbs and flows of consolidation and deconsolidation. You had the transition from terminals to PCs and now we’re going back to virtual desktops – it flows back and forth from centralized to decentralized. It’s also common to see IT trends repeat themselves.

In the mid-to-late ’90s, the major trend was to consolidate structured data sources onto a single platform: to go from direct-attached storage with dedicated servers per application to a consolidated central storage tier, called a storage area network (SAN). SANs allowed organizations to go from a shared-nothing architecture to a shared-everything architecture, where you have a single point of control, allowing users to share available resources rather than having data trapped or siloed within independent direct-attached storage systems.

The benefit of consolidation has been an ongoing IT trend that continues to repeat itself on a regular basis, whether it’s storage, servers or networking. What’s interesting is that once you consolidate all the data sources, IT is finally able to look at doing more with them. Consolidation onto a SAN enables cross-analysis of data sources that were previously isolated from each other – something that was practically infeasible before. With these sources in one place, systems such as the enterprise data warehouse emerge: the concept of ingesting and transforming all the data into a common schema to allow for reporting and analysis. Companies that embraced this process saw IT consumption grow because of the value gained from that data. It also led to new insights, resulting in most of the world’s finance, strategy, accounting, operations and sales groups all relying on the data they get from these enterprise data warehouses.
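
A tiny illustration of that “consolidate, then cross-analyze” payoff: two sources that once lived on separate systems, joined on a shared key. The tables and columns are made up for the example.

```python
# Once siloed sources share a schema and a key, cross-analysis is a simple join.
import pandas as pd

orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "order_total": [120.0, 75.5, 30.0, 210.0],
})
support_tickets = pd.DataFrame({
    "customer_id": [1, 2, 3, 3],
    "tickets_open": [0, 2, 1, 1],
})

# Do customers with open tickets spend less? Impossible to ask while the
# sources were isolated; trivial once they are consolidated.
combined = (
    orders.groupby("customer_id", as_index=False)["order_total"].sum()
    .merge(support_tickets.groupby("customer_id", as_index=False)["tickets_open"].max(),
           on="customer_id")
)
print(combined)
```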

Next, companies started giving employees PCs, and what do you do on PCs? Create files. Naturally, the next step is to ask, “How do I share these files?” and “How do I collaborate on these files?” The end result is home directories and file shares. From an infrastructure perspective, there needed to be a shared common platform for this data to come together. Regular PCs can’t talk to a SAN without direct block-level access, a Fibre Channel connection, or being connected to a server in the data center, so unless you want everyone to physically sit in the data center, you run Ethernet.

Businesses ended up building Windows file servers to be the middleman brokering the data between the users on Ethernet and the backend SAN. This method worked until the Windows file servers steadily grew to dozens. Yet again, IT teams were left with complexity, inefficiency and the original problem of several isolated silos of data and multiple different points of management.

So what’s the solution? Let’s take the middleman out of this. Let’s take the file system that was sitting on top of the file servers and move it directly onto the storage system and allow Ethernet to go directly to it. Thus the network-attached storage (NAS) was born.

However, continuing the cycle, what started as a single NAS eventually became dozens for organizations. Each NAS device contained specific applications with different performance characteristics and protocol access. Also, each system could only store so much data before it didn’t have enough performance to keep up, so systems would continue expanding and replicating to accommodate.

This escalates until an administrator is startled to realize that 80 percent of the data his or her company creates is unstructured. The biggest challenge of unstructured data is that it’s not confined to the four walls of a data center. Once again, we find ourselves with silos that aren’t being shared (notice the trend repeating itself?). Ultimately, this creates the need for a scale-out architecture with multiprotocol data access that can combine and consolidate unstructured data sources to optimize collaboration.

Doubling every two years, unstructured data makes up the vast majority of all data being created. Traditionally, the approach to gaining insights from this data has involved building yet another silo, which prevents having a single source of truth and having your data in one place. Due to the associated cost and complexity, not all of the data goes into a data lake, for instance – only the sub-samples relevant to an individual query. One way to end this particular cycle is to invest in a storage system that not only has the protocol access and tiering capabilities to consolidate all your unstructured data sources, but can also serve as your analytics platform. Your primary storage, the single source of truth that comes with it and its ease of management then lend themselves to the next phase: unlocking its insights.

Storing data is typically viewed as a red-ink line item, but it can actually be to your benefit. Not because of regulation or policies dictating it, but because a deeper, wider set of data can provide better answers. Often, you may not know what questions to ask until you’re able to see data sets together. Consider the painting technique of pointillism. If you look too closely, it’s just a bunch of dots of paint. However, if you stand back, a landscape emerges, ladies with umbrellas materialize and suddenly you realize you’re staring at Georges Seurat’s famous painting, A Sunday Afternoon on the Island of La Grande Jatte. Similar to pointillism, with data analytics, you never think of connecting the dots if you don’t even realize they’re next to one another.
