
Hadoop, Hana Enhance SAP's Potential for Managing Big Data and IoT


SAP engineers and software professionals at Sapphire Now discussed the limitations and potential of SAP S/4HANA as a robust platform for managing enterprise strategies for the Internet of Things and Big Data.

IoT: The Future is in Differentiating Dumb and Smart Things

From the beginning, SAP has been clear that its path into the future runs alongside the path cut by its popular HANA in-memory technology. SAP executives at this month's Sapphire Now 2015 were equally clear about their plans for SAP S/4HANA to become a major player in the enterprise Internet of Things. This is exactly where things get complicated.

Almost everyone agrees that the Internet of Things (IoT) is likely to send a tidal wave of data into any enterprise. Billions of controllers and sensors, all needing to connect to some other machine for instruction and analysis, are expected to generate trillions of data events and transactions. Asked how an organization could reconcile such huge data volumes with built-for-speed in-memory computing, SAP managers and engineers shared their views in a series of conversations.

At the outset, SAP (Systems, Applications and Products) engineers admitted that solutions to existing IoT issues, and to the related Big Data issues, remain a work in progress. Products are in place today and customers are using them, but the engineers are aware of the need to offer more. To understand the systems currently in place, start with the decisions that are made close to the sensors and related edge systems.

An example of SAP's IoT for controlling intelligent vehicles

According to SAP, many control decisions do not need to be made by systems at the core of an enterprise; they can be handled at the edge by control and concentration systems. For instance, SQL Anywhere is a lightweight database manager that can run on a single-board computer. It gathers data from a wide array of sensors and makes moment-to-moment changes to control those systems.
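To make the idea of deciding at the edge concrete, here is a minimal sketch in Python. It is not SAP code: the standard-library sqlite3 module stands in for a lightweight edge database such as SQL Anywhere, and the setpoint band, table name, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical setpoint band for a temperature sensor (an assumption).
LOW, HIGH = 65.0, 80.0


def control_action(value, low=LOW, high=HIGH):
    """Decide locally, with no round trip to the enterprise core."""
    if value > high:
        return "decrease"
    if value < low:
        return "increase"
    return "hold"


def handle_reading(db, sensor_id, value):
    """Act on one reading and keep a local record of it for later upload."""
    action = control_action(value)
    db.execute(
        "INSERT INTO readings (sensor_id, value, action) VALUES (?, ?, ?)",
        (sensor_id, value, action),
    )
    return action


# sqlite3 is only a stand-in for the edge database in this sketch.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (sensor_id TEXT, value REAL, action TEXT)")

actions = [handle_reading(db, "temp-1", v) for v in (71.5, 82.0, 64.0)]
print(actions)
```

The point of the sketch is that the control decision is taken next to the sensor, while the stored rows remain available to send on to the core later.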

Both the full sensor data and the change data (data on major changes or error conditions) are uploaded to the enterprise database using Mobilize, a product that queues the data and uploads it in bursts as connectivity and system resources become available. It is at this point that the Internet of Things starts to look like a burdensome, time-critical big data problem. If you choose to process at the core rather than at the edge, your organization faces the obstacle of returning decisions to the edge in near real time. In addition, once the controller and sensor data is uploaded, it triggers the process of data tiering.
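The queue-and-burst pattern described here can be sketched in a few lines of Python. This illustrates the general store-and-forward technique only, not the actual SAP product; the class name, batch size, and `send` callback are assumptions made for the example.

```python
from collections import deque


class BurstUploader:
    """Hypothetical store-and-forward queue: hold records, send in batches."""

    def __init__(self, send, batch_size=3):
        self._queue = deque()
        self._send = send            # callable that receives one batch (a list)
        self._batch_size = batch_size

    def enqueue(self, record):
        self._queue.append(record)

    def pending(self):
        return len(self._queue)

    def flush(self, connected):
        """Upload queued records in batches while the link is up.

        Returns the number of records sent in this call.
        """
        sent = 0
        while connected and self._queue:
            size = min(self._batch_size, len(self._queue))
            batch = [self._queue.popleft() for _ in range(size)]
            self._send(batch)
            sent += len(batch)
        return sent


uploaded = []                        # stands in for the enterprise database
uploader = BurstUploader(send=uploaded.append, batch_size=3)
for seq in range(5):
    uploader.enqueue({"seq": seq})

offline_sent = uploader.flush(connected=False)   # no link: nothing leaves
online_sent = uploader.flush(connected=True)     # link up: queue drains
print(offline_sent, online_sent, [len(b) for b in uploaded])
```

A real implementation would also need to persist the queue and retry failed sends; those concerns are left out to keep the sketch short.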

In conversations with SAP engineers, I came to understand that they attach critical importance to deciding whether data is "hot," "warm," "cold," or somewhere between these increments. (Honestly, though, why a company would choose to rely on "tepid" data is beyond me.) While some data can be categorized by its origin or content, other data is sorted on the basis of analysis that occurs in the enterprise core. According to SAP engineers, this sort of analysis happens most rapidly in a HANA database.
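As a rough illustration of tiering by content and age, consider the Python sketch below. The rules are invented for the example (error events are always hot, and age thresholds of one day and thirty days separate hot, warm, and cold); SAP's actual criteria are not described in these conversations.

```python
DAY = 86400  # seconds in a day


def classify_tier(record, now):
    """Assign a temperature tier using hypothetical content and age rules."""
    # Content-based rule (assumption): error events are always hot.
    if record.get("is_error"):
        return "hot"
    # Age-based rules (assumption): thresholds of 1 day and 30 days.
    age_days = (now - record["timestamp"]) / DAY
    if age_days <= 1:
        return "hot"
    if age_days <= 30:
        return "warm"
    return "cold"


now = 100 * DAY  # fixed reference time so the example is reproducible
records = [
    {"id": "a", "timestamp": now - 90 * DAY, "is_error": True},
    {"id": "b", "timestamp": now - DAY // 2},
    {"id": "c", "timestamp": now - 10 * DAY},
    {"id": "d", "timestamp": now - 90 * DAY},
]
tiers = {r["id"]: classify_tier(r, now) for r in records}
print(tiers)
```

Record "a" shows why content matters as well as age: it is as old as record "d", but it stays hot because it reports an error.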

A HANA database that references data held in other databases (including MongoDB, Hadoop, Oracle, or any other commercial database) can use SAP's Smart Data Access, a data virtualization feature that lets HANA control structures look into "virtual data tables" spanning multiple databases. Smart Data Access has been available since SAP HANA SP 6 and has worked through its early limitations reasonably well. Where Smart Data Access falls short is in the minute-by-minute processing and control operations that the Internet of Things requires. (Think of controlling a vehicle or adjusting a sensitive industrial chemical process from a remote site.)

SAP is now looking to "streaming," a concept that uses HANA's features for analysis without assuming that all data is "hot" until proven otherwise. Essentially, streaming uses a dedicated HANA instance as a high-speed Hogwarts Sorting Hat that determines whether incoming data should be stored for use in another HANA database or sent to a cooler location for future analysis. According to SAP engineers, this step is critical for allowing big data analysis to keep moving at the speed of HANA; it is also the step that is hardest to pin down.

In this context, "streaming" means depositing incoming data in an SAP HANA database just long enough for the sorting algorithms to do their job. Once the system has assigned the data a priority level, the data is sent on its way, leaving room for more data to come in. As with all schemes of this kind, much of the system's performance depends on caching, buffering, and similar mechanisms.
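The hold-briefly, score, and route cycle can be sketched as a small bounded buffer in Python. This is an illustration of the idea only, not HANA's streaming implementation: the class, the scoring function, the threshold, and the two destinations ("hot" and "cool") are assumptions.

```python
from collections import deque


class StreamSorter:
    """Hypothetical sorter: hold events briefly, score them, route them on."""

    def __init__(self, capacity, score, hot_threshold):
        self._buffer = deque()
        self._capacity = capacity
        self._score = score                  # callable: event -> number
        self._hot_threshold = hot_threshold
        self.routed = {"hot": [], "cool": []}

    def ingest(self, event):
        # Make room first: route the oldest event when the buffer is full.
        if len(self._buffer) >= self._capacity:
            self._route_oldest()
        self._buffer.append(event)

    def drain(self):
        """Route everything still waiting in the buffer."""
        while self._buffer:
            self._route_oldest()

    def _route_oldest(self):
        event = self._buffer.popleft()
        is_hot = self._score(event) >= self._hot_threshold
        self.routed["hot" if is_hot else "cool"].append(event)


# Assumed scoring rule: a large change in a sensor value makes an event hot.
sorter = StreamSorter(capacity=2, score=lambda e: abs(e["delta"]),
                      hot_threshold=5.0)
for delta in (0.2, 7.5, 1.0, 9.1):
    sorter.ingest({"delta": delta})
sorter.drain()

hot = [e["delta"] for e in sorter.routed["hot"]]
cool = [e["delta"] for e in sorter.routed["cool"]]
print(hot, cool)
```

The buffer capacity is the knob that corresponds to the caching and buffering concerns mentioned above: too small and events are routed before enough context is available, too large and the sorter itself becomes a bottleneck.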

It is also within these pieces of software and services that some rough edges remain. The engineers are confident that over the next few months they will be able to demonstrate the full capability and speed of all the translations and connections that need to take place.

Virtual tables unite the data stored in various database structures

The components of SAP's big data universe do not begin and end with Hadoop and HANA. When discussing the mechanisms for connecting HANA instances to Hadoop, SAP engineers repeatedly mentioned Cloudera as one of their favoured products. Cloudera, a popular Hadoop-based scalable data management system, includes central pieces of Hadoop along with other software that keeps data flowing freely. It acts as a data source and a reliable management structure for tying everything together, and it offers connectors to several third-party analytics packages and databases that a company might want to integrate into its complete big data solution. Also in the mix is SAP Business Intelligence (BI), which is likely to be an important part of the system. The BI suite includes many analytics components that help make sense of the waves of big data rolling in.

Given SAP's IoT customer base, the "things" in question are more likely to sit on industrial shop floors than on a customer's wrist, and the controlled systems are more likely to manipulate industrial processes than home air conditioning. Regardless, data volumes are expected to be huge, and the time frames in which systems must respond to problems are likely to be small. With HANA, SAP has taken a series of significant early steps toward enabling partner and customer plans. If the company succeeds in polishing and fitting the last of its pieces as promised, SAP S/4HANA will occupy a formidable position in the overall IoT market.
