The Transformation of Big Data
With digitization emerging as the latest trend in technology and data being generated by every interaction, businesses are flooded with Big Data. In fact, Big Data has transformed the way organizations work. A new pattern is being created where business and IT leaders collaborate to realize value from all the data collected.
Big Data provides insights that enable employees to make better decisions, in turn improving customer engagement, business operations, fraud and threat prevention, and the capture of new revenue sources. In all, Big Data can create a significant competitive advantage for businesses.
However, with its sheer volume, velocity and variety of sources, the rise of Big Data has brought into focus the challenge of making sense of all this information with existing solutions. Maintaining efficient operations with optimum processing power while extracting meaningful information from Big Data is even more taxing.
Enterprise data coming from day-to-day business transactions brings another challenge. A gap exists between enterprise data that resides on expensive hardware and Big Data that resides on less expensive, distributed commodity hardware. This difference in landscapes makes it difficult for companies to merge both sets of data when generating reports. It does not help that complex analytical queries on Big Data stored in a distributed environment tend to perform poorly.
Bridging the Gap with SAP Vora
What is SAP Vora?
The name Vora is derived from the word “voracious”, reflecting its ability to consume large amounts of data.
SAP HANA Vora is an in-memory query engine that provides powerful contextual analytics across all the data stored in enterprise systems, Hadoop, and various distributed data sources. SAP Vora is designed to add insights across volumes of contextual and operational data taken from data warehouses, enterprise applications, data lakes and Internet of Things sensors. SAP HANA Vora builds on the Hadoop and Apache Spark frameworks.
Technical Capabilities of SAP Vora
SAP Vora’s in-memory query engine plugs into the Apache Hadoop framework and provides enriched interactive analysis. SAP Vora also uses Spark’s SQL library along with the HANA computing engine.
Technically, SAP Vora is a combination of Hadoop/YARN (resource allocation), Spark (in-memory query engine), and HANA push-down query delegation capabilities. Vora handles OLAP analysis and hierarchical queries as well. It uses a set of in-memory, high-performance, distributed computing engines.
SAP HANA Vora comes with a simple web-driven interface to create, design, and blend data models from enterprise-grade data and Big Data stored in Hadoop frameworks. Developers and data scientists can create hierarchical, graph, and time series data models.
To aid data discovery, SAP Vora also provides a quick drag-and-drop function to facilitate complex computations across relational, time series, and graph data. Additionally, SAP Vora is equipped with built-in business functions such as currency conversions, data hierarchies, and other functions that simplify the development of information models and support quick, streamlined business decisions.
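To make the idea of a hierarchical data model more concrete, here is a minimal sketch of a parent-child hierarchy with a roll-up query. It does not use SAP Vora or its SQL extensions; it uses Python's built-in sqlite3 module and a standard recursive query as a stand-in, and the table names, region names, amounts, and the rollup function are all hypothetical.

```python
import sqlite3

# Hypothetical region hierarchy: every row points at its parent node.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    INSERT INTO region VALUES
        (1, 'World',    NULL),
        (2, 'Americas', 1),
        (3, 'Europe',   1),
        (4, 'Texas',    2),
        (5, 'Germany',  3);

    -- Sales are recorded only at the leaf level.
    CREATE TABLE sales (region_id INTEGER, amount REAL);
    INSERT INTO sales VALUES (4, 100.0), (4, 50.0), (5, 70.0);
""")


def rollup(conn, root_name):
    """Sum the sales of a node and all of its descendants."""
    row = conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM region WHERE name = ?
            UNION ALL
            SELECT r.id FROM region AS r JOIN subtree AS s ON r.parent_id = s.id
        )
        SELECT COALESCE(SUM(amount), 0.0)
        FROM sales
        WHERE region_id IN (SELECT id FROM subtree)
        """,
        (root_name,),
    ).fetchone()
    return row[0]


# Drill down from the top of the hierarchy to a single country.
for node in ("World", "Americas", "Germany"):
    print(node, rollup(conn, node))
```

The same pattern (one table of nodes, one table of facts, a query that walks the tree) is what drill-down analysis over a hierarchy comes down to, whichever engine runs it.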
If the size of a data set exceeds the memory capacity of the engine, SAP Vora is equipped with a disk-to-memory accelerator to handle the data that does not fit in-memory.
The single SQL entry point means developers do not have to invest time in learning a new setup and can quickly adapt to the SAP Vora engine to create models and spot data patterns in time.
For security purposes, the product supports Kerberos-enabled Hadoop distributions and file system permissions to protect the data as soon as it is loaded.
Finally, the insights can be plugged into analytic applications to create self-service BI, visualizations, and predictive analysis reports.
Scope of SAP Vora
Apache Hadoop and Apache Spark act as the base for SAP HANA Vora. Hadoop’s framework allows distributed processing of large data sets across clusters of computers, yet uses a simple programming model. Hadoop can scale from a single server up to thousands of machines, each offering local computation and storage.
Apache Spark is a fast engine for large-scale data processing. Spark can work with a variety of distributed storage systems, such as the Hadoop Distributed File System (HDFS), Cassandra, Amazon S3, MapR File System, and OpenStack Swift.
So, what does this combination offer to companies through SAP Vora?
- In-Memory query engine
Runs an in-memory query engine atop the Apache Spark execution framework. Queries are compiled across nodes to speed up processing, which accelerates and simplifies OLAP analysis.
- Data hierarchies
Drill-down analysis and OLAP are performed using Spark SQL semantics, enhanced with data hierarchies.
- Enhanced Mashup API
Users can create enriched data sets by taking projections from enterprise sources. This is made possible by the enhanced mashup API, which is based on the Spark SQL Data Source APIs.
- Enhanced Spark and SAP HANA controller
Lightning fast data movements between Apache Spark and SAP HANA improve performance.
- Open development interface
Leverage open source development interfaces which support similar programming languages and all of the Hadoop distributions.
The SAP HANA Vora engine can either run stand-alone or be coupled with the SAP HANA platform to further extend enterprise-grade analytics over Hadoop clusters.
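The mashup idea described above, one SQL statement that spans an enterprise source and a Big Data source, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of SAP Vora, Spark, or HANA, two separate in-memory SQLite databases stand in for the enterprise system and the data lake, and every table, column, and value is hypothetical.

```python
import sqlite3

# The main database stands in for the enterprise system ...
conn = sqlite3.connect(":memory:")
# ... and a second, separately attached database stands in for the data lake.
conn.execute("ATTACH DATABASE ':memory:' AS lake")

conn.executescript("""
    CREATE TABLE main.customers (customer_id INTEGER PRIMARY KEY, segment TEXT);
    INSERT INTO main.customers VALUES (1, 'retail'), (2, 'industrial');

    CREATE TABLE lake.meter_readings (customer_id INTEGER, kwh REAL);
    INSERT INTO lake.meter_readings VALUES (1, 1.5), (1, 2.5), (2, 40.0);
""")


def usage_by_segment(conn):
    """Join master data from one source with readings from the other."""
    return conn.execute(
        """
        SELECT c.segment, SUM(m.kwh) AS total_kwh
        FROM main.customers AS c
        JOIN lake.meter_readings AS m ON m.customer_id = c.customer_id
        GROUP BY c.segment
        ORDER BY c.segment
        """
    ).fetchall()


for segment, total_kwh in usage_by_segment(conn):
    print(segment, total_kwh)
```

The point of the sketch is the shape of the query: the analyst writes one join, and the engine is responsible for reaching both stores.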
Advantages of SAP Vora
When Big Data has to be combined with enterprise data, SAP Vora helps businesses analyze all of it on a distributed framework, so that insights or data can be delivered quickly to the applications that serve business needs.
How will implementing SAP Vora help businesses achieve an edge?
- Precise Decisions
Combine external data with enterprise data, for faster and more precise decision making – even amid constantly changing circumstances.
- Data access democratized
Online Analytical Processing (OLAP) modeling over Hadoop data is made accessible for everyday users. This allows data scientists and analysts to have interactive access to Hadoop data and enterprise data – letting them create mashups in a jiffy for analysis.
- Big Data ownership simplified
A single solution processes and accesses both enterprise and Hadoop data, further simplifying the management of big data across varied landscapes.
A Real Life Scenario
CenterPoint Energy put SAP Vora to the test in a real-world environment. CenterPoint Energy is a utility company that delivers power to more than 2.3 million consumers. It collects electronic meter data every 15 minutes for energy usage reporting, which leads to substantial data storage costs.
SAP and CenterPoint Energy successfully built a test environment in 6 weeks that could process 5 billion records with Hadoop, SAP HANA and SAP HANA Vora. Following this successful test, CenterPoint Energy standardized on the SAP HANA platform and SAP HANA Vora.
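To get a feel for the scale of these numbers, the figures quoted above can be turned into a rough back-of-the-envelope calculation. The sketch below assumes exactly one meter per consumer and one record per 15-minute reading; neither assumption comes from CenterPoint Energy, so the result is an order-of-magnitude estimate, not a reported figure.

```python
# Figures quoted in the scenario above.
CONSUMERS = 2_300_000          # "more than 2.3 million", so a lower bound
INTERVAL_MINUTES = 15          # one reading every 15 minutes
TEST_RECORDS = 5_000_000_000   # records processed in the test environment

# Assumption: one meter per consumer, one record per reading.
readings_per_meter_per_day = 24 * 60 // INTERVAL_MINUTES
readings_per_day = CONSUMERS * readings_per_meter_per_day

# How many days of readings 5 billion records would represent.
days_covered = TEST_RECORDS / readings_per_day

print(f"{readings_per_meter_per_day} readings per meter per day")
print(f"{readings_per_day:,} readings per day")
print(f"{days_covered:.1f} days of data in the test volume")
```

Under these assumptions the meters produce roughly 220 million readings a day, so 5 billion records correspond to about three weeks of data, which helps explain why storage cost is a concern.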
As organizations move towards becoming smart digital enterprises, businesses can no longer afford to leave 60-70% of their data unused, especially where deeper, actionable insight into consumer behavior can give them a much-needed competitive edge.
For companies already running SAP, SAP Vora can be a great addition: it brings their transactional, data lake, and other data sources together so they can build mashup queries for deep-dive, interactive analysis.