Till on SAP Hana
Nine years ago I started my private validation project of the then highly hyped SAP Netweaver Stack. It became a rough trip with several highlights - NWDI is much better than its reputation - and some disappointments - WebDynpro/Java never left beta. The results were a book and greying hair.
This time - in may 2013 - I decided to take a softer approach to the next SAP hype: SAP Hana. I will spend 1 hour per week examining Hana with no predefined result. To force myself to follow this rule I will write a blog about my insights.
Day 1SAP offers a free 30 day trial version of SAP Hana. I started with this link and got one. It is hosted by CloudShare. It takes a couple of minutes to initialize, then you get access to a Linux Hana server and a preinstalled client Windows PC with Hana Studio. The server has 24 GB memory, 250 GB disk space and 6 CPUs. The web interface to both machines is flimsy due to network latency and also since it emulates a shell / OS screen in a browser. Anyhow this is not the core product but just the access enabling software. I got stuck twice and received valuable hints from SAP's support mail hotline firstname.lastname@example.org. These people are really supportive. Thanks, Rocky!
Day 2I found this document on the web. It explains briefly what Hana is all about. SAP Hana is an in-memory database. It is column based and allows for parallel execution of sub-queries. Fortunately it supports standard SQL as well as its own scripting language called SQL Script. Both are accessible via JDBC. These features combined promise crazy query performance. SAP will not only deliver Hana as a raw software building block but will make it accessible for BI systems and core SAP ERP systems as well.
Day 3Following the help texts in Hana Studio I managed to establish my first connection from (virtual) client PC to Hana server. In the "Navigator" view I could browse the database catalogue and create my own schema and table. This feels very much like a standard SQL client. In expectation of its sheer execution speed I am already impressed with the whole concept. SAP seems to have found its way into the cloud era and offers an alternative to Oracle databases. This makes much sense in a world with increasing demand for big data analysis.
I picture SAP selling cloud access to millions of Hana instances at cloud prices and think wow, what a highly competitive product! No one is going to stick to his fat old self-owned and self-administered database server any more.
Day 4Now I wanted to see the query monster in action. So I took some posts from web discussion boards that I had crawled earlier and tried to load them into my Hana instance. I chose to use the Hana Studio SQL command line. I reached it from the "SAP Hana Development" view's "Navigator" tab, right-clicked on the system name and chose "SQL Editor" from the context menu to get there. Then I pasted SQL insert statements into the editor and executed them.
It turned out to be not so fast on inserts. But the selects were fast. Hana studio shows the execution duration in microseconds, not milliseconds. Unfortunately the license conditions of the Hana developer package do not allow me to publish benchmarks.
Day 5Since the welcome message on the remote client PC encouraged me to change the initial passwords I did it on the server's UNIX command line. Afterwards CloudShare's web frontend showed a message that the server was misconfigured. It turned out that there are several different levels of credentials.
A the Linux Hana server credentials
I used E to reconnect Hana Studio to the server and did not have to supply A. This means the CloudShare framework has a special relationship to the Hana server that allows to bypass password entry. Probably based on certificate exchange.
Day 6The 30 days of trial are over. Now I must pay for my Hana instance. There are three different providers that offer SAP Hana hosting: CloudShare, Amazon and some asian company. CloudShare has the cheapest monthly rate. Amazon charges 51 cents per hour but it does not become clear what an hour means. Is is the time I actually use the Hana server or is it the time during which the server is available for me independent of me accessing it?
Day 7Finally I rented the Cloudshare version of Hana. It costs 137 dollars per month plus tax. This amount goes completely to Cloudshare. SAP charges nothing since this is a developer server as part of a promotion program.
I did not rent the remote desktop offered by Cloudshare since I am not sure if it is covered by the above rate. Therefore I'll have to connect directly from my laptop.
Day 8I bought a book called SAP HANA: An Introduction. Flipping through the pages I found an interesting paragraph on pricing on page 114. It says
The average SAP HANA license sale is $400.000... As of this writing SAP is not discounting the SAP HANA licenses.
This means the business case for Hana shrinks to bundling with core ERP sales. No broad market penetration with a cheap cloud based approach as I guessed earlier.
Another funny paragraph on page 115 is titled
"Is SAP HANA Just Another SAP Fad That Will Not Have a Long Lifespan - Like mySAP?"
Keep in mind this is just a quote from a book published by SAP Press! The conclusion in the following lines is: No, SAP is betting its future on Hana.
Day 9I downloaded the Hana Studio from this link and installed it directly on my PC. The connection to the Hana server worked soon using the connection data provided by Cloudshare.
Day 10I was lucky to spend some hours on a train in the last week. So I could countinue reading SAP HANA - an introduction. One chapter deals with data modeling in Hana Studio.
If you use Hana for reporting purposes then data is modeled in Hana in three different view types
A attribute views
To understand what an attribute view is I must explain how the word 'attribute' is used in this context. Business Intelligence terminology distinguishes between facts and attributes. Facts are measured values such as revenue or costs. Attributes partition the realm of your measurement e.g. in respect of location, time or organisational structures. So an attribute view (A) is a partition of your observation realm defined by one or few database fields.
Analytic views (B) join one fact table with one or several attribute views.
Calculation views (C) usually combine several attribute and analytic views and perform calculations on top of them. I think of revenue purged from seasonal and inflational effects.
Anyways this type of modelling seems to be relevant only if you plan to use Hana for reporting purposes. But I can picture more exciting uses.
Day 11After all the configuration issues of the last sessions I could finally get back to my initial goal - data insertion. Whenever I explore a new database layout or database product I try to get a million records to play around with. As a rule of thumb a million records is enough to get an impression of the runtime behavior but little enough to be handy.
I tried again to insert records from the SQL Editor as explained earlier. This does not work for larger amounts of data since the editor content is very limited.
Day 12Since some Hana hosters offer hourly rates instead of monthly rates I was wondering what that means. Do they only charge for the hours during which you actually consume CPU time - meaning few? Or do they charge for the hours during which you have access rights to the server - meaning many? I do not know the answer yet but I found out that Hana Studio cannot connect to the server before I log into my ClouseShare account. Thus the Hana server must be located on a virtual machine that is activated only on demand.
Anyways this distracts me from my initial aim to find out how to bulk load data into Hana. There are several web posts that describe how to do that. The main approaches seem to be these
A Load data from a client side CSV file using Hana Studio navigating File / Import / Sap Hana Content / Data from local file
A is quite straight forward but I expect it to be slower than B as it will probably involve parsing on the client side and record by record transmission to the server.
I have high expecations in aproach B. It reminds me of Oracle's SQL Loader mechanism that is very effective.
Approach C sounds as If I'd need additional software to try it.
Day 13I learned that it is important to first start the Hana Server from the Cloushare page before starting the Hana Studio. If you do it vice versa the Hana Studio makes endless turns trying to connect to the server. Even if you restart the studio you have to close the old SQL Editor which has earlier become stale and must open a new one.
My next insight was that table and Schema names are halfway case sensitive. I create a table called "POST" from the studio and another one called "Post". The SQL Editor is not case sensitive and let me access only one of both. So I deleted them both and created the upper case table again. This is the output
Then I started importing records using method A from day 14. The import stopped. In the "Job Log" tab on the right side I saw a message that the date format that I used is not correct.
It tried a few differnt date formats including YYYY-MM-DD or DD/MM/YYYY but no success.
My attempts got stuck because of connection problems. It took some time till I figure out that the Hana server had passivated itself. On the Cloudshare page you get the possibility to prolong the server uptime. From now on I'll extend the uptime to 180 minutes right at the start of the session.
Day 14Still not content with the import functionality I tried to figure out how to do mass import from the raw Hana command line on the server side. There seem to be two building blocks to go this way.
The first is the access to the Hana database command line HDBSQL. This video shows how to start.
The second one is the import statement described in the SAP Hana SQL Reference Guide starting on page 120.
No results yet.
Day 15Today I managed to configure my Putty SSH client to connect to the HANA server via SSH. Strangely this is done using the root user. Next I changed user to the HANA admin OS user hdbadm in my case.
Then I was able to log into the database command line tool
This way I could execute these commands
select table_name, schema_name from m_tables where schema_name like 'MY%'
select table_name, schema_name from m_tables where schema_name like 'MY%'
So this is a SQL command line tool that supports other commands, too.
I tried a million ways to spell an import statement like this one but I did no succeed yet. Still I am not giving up.
Day 16Today I tried the IMPORT FROM statement that is similar to the IMPORT statement but uses a control file. The control file test.ctl looks like this
The data file test.csv looks like this
Both are located on the Hana server in the folder /usr/sap/HDB/HDB00/import/. I logged into the Hana server command line and executed this statement
And BAM! the two records were inserted into the existing table myschema.post. Do not let the statment "0 rows affected..." discourage you. Make sure to spell the table and schema names in the control file in upper case and have a close look at the error file specified in the control file and success will almost be unevitable.
Day 17I spent the session copying a large data file onto my Hana server. I first copied it onto a web server I own and then loaded it via wget.
Day 18While trying to convert a million text based records to a suitable import format I found out that this format works, too.
Thus all I need to to in my case is remove ticks from my data file.
As a result 3800 records (out of 64.000) were inserted rather fast. In the error file I found messages such as this one
Day 19I examined one of the lines that were too long. It had 1009 characters in the CSV field instead of 1000. I am quite sure that the additional length appeared during export. The easiest way to fix this would be to adjust the field length in the database, not the content. I tried to do this in the Navigator tab of the Hana Studio. You can double-click on the table name or execute "Open Definition" from the context menue. Then right-click on a row and chose "Edit". There is no option to modify a column but you can "remove column" and "add column". So I added a new column with length 1020.
The result is that the field length is changed but the data in that field is lost. Probably there is a command to modify the field length
without data loss but it is currently not supported by the UI.
Next I deleted - not dropped - my table's content from the HDB command line. This happened very fast. Then I ran my insert statement again. The 64.000 records of my first data bulk were inserted in very little time.
I have dealt with databases for many years but I have never seen such an insertion speed. I am not sure if a hard disk can technically cope with this speed. The cruicible question is: what happens if the server is switched off immediately after the insert statement has returned? Will the data persist? And if not, when can I be sure the data is persisted?
I wrote a mail to devcenter to find out about the durability issue.
Then I prepared 1 million records and loaded them into the database. The insertion was still fast for this amount of data but did non scale linearely. One reason might be that the amout of data (roughly 1 GB) exceeded the available SD space. The other possible reason is that I found about 120.000 entries in the import's error log. Usually error handling is much slower than regular functionality.
I ran a first query on the new data. It is a full text search on the complete table.
The duration absolutely excited me! I received 1800 results plus this message. I assumed this is no error but rather an end of stream message.
Then I ran a Unix grep command on the import file to compare the execution speed.
A Data is mainly stored in columns instead of rows.
Each of these items alone does not seem really stunning. But combined they make a lot of sense. Think of (A). If you store data in columns you'll need to access much less data for the everage query that is based on conditions on a few fields. You'll only have to access the data regions where the respective attributes are stored and don't have to care about the irrelevant other fields. Let's assume that this way one can reduce the amount of data to be accessed on an average query by one order of magnitude.
Compression of stored data (B) might save another order of magnitude. Among others Hana uses a dictionary based copression algorithm that replaces long recurring string fields with numeric equivalents.
Now add some more RAM (C) to your hardware and eventually your whole database or at least the hot spots will fit into RAM. A hundredfold decrease in storage demand results in a 100.000 fold increase in speed. (1:100.000 is the factor between RAM latency and hard disk latency). The book claims that Hana's bottleneck is not the hard disk but the RAM - as compared to the even faster L1-3 caches.
Finally the items C and D both reflect a shift in hardware improvements. Today processors do not get faster as quickly as in the past.
Instead RAM becomes cheaper fast and processors allow for higher parallelism. So Hana provides a specialy query language called
SQL Script that enhances SQL by functional parallelization elements.
I reached a description how to program Hana from the ABAP stack. You'll need an ABAP 7.4 system. To run the combination of ABAP and Hana you should code from the Eclipse environment, not the SAP GUI. For those unfamiliar with Eclipse: it is a plugin framework for development tools which was originally developed by IBM but soon after made open source. SAP considers Eclipse a strategic platform that will allow them to bundle and release development environments for different SAP technologies. In our case we'll need a combination of "ABAP Development Tools for SAP Netweaver" and "SAP Hana Studio".
Unfortunately I do not have access to a ABAP 7.4 server yet but this will be one of the hottest topics on my wish list for future sessions.
The most relevant change for ABAP database programmers is an approach called code pushdown. In the past you could treat the database as a black box that was hidden from you by the Open SQL database abstraction layer. In most cases you didn't have to worry whether your data was located in the database or if it was being processed in the ABAP engine. It was usual to first load several rows into an ABAP working area and then manually execute filters or modifications on these data.
If you plan to fully exploit Hana's abilities you must become location aware for both data and code execution. And you will lose the portability
of Open SQL ABAP code. Fast Hana code is as importable as native SQL code. Much of the functionality that used to be performed in
ABAP on the application server will have to be moved to SQL Script to minimize the number of remote database calls and take full advantage of
in memory data access. Since schematic diagrams tend to show app server on top of their database this modification resembles of
pushing the code downward into the database.
It creates a full text index based on the content of the text column. According to the documentation i_text is the name of a hidden column that represents the index. Make sure to use this function-style syntax in your queries
This statement returned crazy fast. It searched 1 million lines of 1K of text considerably faster than Google returns your search results.
The result differed from the result of the corresponding query from day 20. This one returned 500 lines. Probably the above syntax means that a search for a full word is performed and not for any appearance of the string. The function 'contains' allows to use a third parameter that describes the type of search. I'll examine that in another ession.
Inserts into the indexed table will probably be slower than before creating the index.
Since the duration for the creation of the whole index was much smaller than the insertion duration
the speed loss on inserts won't be too big.
o application code runs within the database
This architecture sounds very clean and fast. Still I do not understand what the intended use case looks like. It is not very likely that any large company might migrate its existing app server landscape to such a special purpose niche product. Furthermore companies do have serious security concerns if externally reachable web applications run on the same server instance as the ERP system. This was already the case with WebDynpro/ABAP or fifteen years earlier with Oracle's 8i database app server.
So I guess that Hana XS applications will best be used for super fast in-house reporting. Or as separately served tiles within
a portal that runs on one of the big name app servers.
As discussed earlier SQL Script is Hana's counterpart of Oracle's PL/SQL plus a parallel execution booster. The syntax of a simple procedure must look like this. But caution, this is not proven yet!
This procedure executes the count(*) statement that you can see between BEGIN and END. The output parameter "count" is declared in the first line. I chose a scalar type for the first try. This way I did not have to define a table type. You'll find the full SQL Script language definition here.
To create a SQL Script procedure you can use the Project Explorer in SAP Hana Studio. I tried this click path:
File / New / Project / Sap Hana Development / XS Project
If you create a file with the extension "procedure" the IDE fills it with a predefined procedure template. You have to add the parameters and the procedure body.
Then I tried to active the procedure. This is a compilation and publication step that is similar to activation in the ABAP workbench. It is supposed to work by clicking "Team" / "Activate" in the procedure's context menu. Unforunately I received an error
The activation is obviously tied to repository which I missed to configure. I'll try to clarify this till next week. No skipping of subjects this time I promise.
right next to your hdbstudio installation. It contains a file called "regi.exe".
After installation of the Hana client I opened the Hana studio, right-clicked on the project I created on day 25. There I could execute "Team" / "Share Project". I first received an error message but it contained a link to a form where I could specify the path to "regi.exe" and define a workspace location.
Next I right-clicked on the folder from day 25 that contains my SQL Script file. The option "Team" / "Activate" was accessible now together with several other option such as "Commit". I pressed "Activate". A message told me that I have insufficient privileges.
Anyways I think I got a step further and I'm quite confident that I'll manage to run my SQL Script in the next session.
To fix the authorization problem from day 26 I had to run this statement in the SQL editor of Hana Studio.
The schema name "MYSCHEMA" results from the schema from which I started the SQL editor.
Then I got into the good old development loop of code modification and error message. I'm starting to feel at home in Hana Studio! I'd run Team / Check and draw my conclusions from the message in the "Problems" tab. My take home messages are these two:
You cannot return scalar values from a SQL Script procedure therefore you must define a table type.
SQL Script is case sensitive in the sense that it expects the field to be spelled the same way as in the table definition. Plain Hana SQL is only halfways case sensitive, see above.
Below is the checked version of the procedure. It serves a different purpose than before just to get the my first SQL Script running.
The procedure definition is in the tab "Script".
The table type definition is in the tab "Local Table Types"
With this code I coud also execute Team / Activiate. I also tried to call it from the SQL editor but I did not find the right syntax. It must follow this pattern
but I always got
I used the navigator tab in Hana Studio to click through all schemas. In schema _SYS_BIC I found a procedure called "MyXSProject.procedures/first_five" in the "Procedures" folder. So I tried this call which came quite close to the correct one.
I received a message that made clear I had to deal somehow with the return parameter.
You can probably write another SQL Script program that defines a table variable to hold the call's return value. But if you just want the out table to be displayed in Hana studio you can call the procedure with a question mark.
The result table is displayed as if it was the result of a query.
I'll go on examining SQL Script in future sessions in depth. Naturally performance will be my main focus then.
Basically there are two by two different ways how to access Hana from ABAP: as primary or as secondary database and via Open SQL or natively.
A primary database is the one that is accessed by default whenever you run an Open SQL statement such as
If you are using a Netweaver 7.4 server that uses Hana as sole database the above statement already accesses Hana. If you are using a 7.4 server that uses a database from any other vendor then you may choose to use an additional Hana database for performance critical operations, typically from reporting purposes. The secondary database can be accessed with such a statement
The difference is the CONNECTION clause that specifies the database to be accessed. It can be defined in transaction ST04 / "Database Connections" and may but does not have to be of type Hana.
Independently of your decision to use Hana as primary or secondary database you can either access it from Open SQL as shown before or natively. In a nutshell native database access trades portability of your database code for the ability to take advantage of database vendor specific features. You'll definitely want to do the latter if you are using Hana for an extra portion of speed! Native database access is a prerequisite for the code pushdown technique mentioned in earlier posts.
Native database access can be done with the EXECUTE SQL command. Alternatively it can be performed with a newer API called ADBC. ADBC (ABAP data base connecivity) is an ABAP version of Java's first database API which was called JDBC. ADBC code looks like this
Throughout the next sessions I will try to use Hana as a secondary database from a SAP server that will probably be installed on my laptop.
Neither one is exactly what I was looking for. Both approaches are based on a 90 day license. The first one is said to be slow on startup (30 to 40 minutes).
For the second one I need to get a Linux PC.
Here are some links that have proven to be helpful.
Step by step How-to secure version.
Step by step How-to less secure version.
How to create a billing alarm. Hint: if you try this before the AWS instance was created for the first time no billing metrics are available.
This link describes the permissions you need to set on AWS side to be able to access it from the SAP Cloud Library. It could save you a lot of time.
How to use the ABAP + Hana stack.
The description ABAP + Hana stack I mentioned in the last session lacks a bit of clarity at the beginning when it referrs to the login URL
Terminating means that all entries made to the Hana system are lost. Start up of a new Hana (which is called "creation") takes 40 minutes again. Then you must enter the license key, etc. from scratch. But I believe a terminated Hana creates no costs at all.
On the other hand a passivated Hana starts up (which is called "activiation") in about 5 minutes.
All changes you made in an earlier session are still there. But I guess storage will cost a bit. Anyways my current
overall costs are about 11 dollars. So this is affordable.
I explained the basic idea of ADBC in session 29. Today I adapted an ADBC code snippet from the book "ABAP Development for SAP HANA". It worked immediatley on my Hana ABAP stack.
This is the complete code.
This program is supposed to read all entries from the address table ADRC. I was surprised that the result was not limited to the single entry I could see from transaction SE16. In fact the output comprised three entries!
When I changed the report so that the field "client" was shown, too, everything became clear. ADBC does not care about the client restriction that applies to Open SQL database access and the SE16. The two additional results came from a different client than the one I used to log in.
I do not yet fully understand every single line of the report. I'll deepen my understanding in the next sessions.
The principle is simple. You replace the parameter's value in the SQL statement with a question mark (?). Then you can set the parameter's value from ABAP with the SET_PARAM method.
Here are the code fragments that you need to blend with the report from day 38 in order to set the client parameter programmatically.
I'll start with a data insertion programm that can populate the tables for a fictitious training application. There are four master data tables for tutors, courses, locations and attendees. Then there is one table that contains transactional data for attendances.
These tables cover most data constellations of interest for a data warehouse. There are time and location fields, prices and plain text fields.
Today I defined only the tables:
I did define them in the ABAP Workbench. Later on the challenge will be to access them from the Hana Studio.
Both development environments access the same Hana but I do not have an idea yet how the latter can address the data from the first.
Anyways it is quite usual that business data originates from the ERP system and is later transformed in the data warehouse.
I also investigated how to access them from the Hana Studio. With the help of some blog posts I found out that you can see them in the Project Explorer. For a known combination of SAP System ID (SID), SAP client ID, SAP user name and language you will find a Hana Studio project following the naming convention
In my case there was a project called A4H_001_developer_en . I opened it but did not find my tables such as ZTUTOR. I already assumed it to be hard to find since it is located in $TMP.
Then I tried the "Search" option from the top tool bar. With "Search"/"Search"/"ABAP Object Search" I could enter my table name ZTUTOR. It took a while till the result was shown in the lower right corner of Hana Studio in a window called "Search". My table was found and when I double-clicked on it an ABAP-Workbench-like table description came up in the main tab.
This feels real strange since it is a mixture of well known Eclipse behavior and well known ABAP Workbench behavior.
The context menues I clicked through earlier also showed entries like "New... ABAP Class". It will take some time till
I understand this whole thing but it is quite intrigueing.
I expected to get stuck somewhere soon. The "SAP Hana Modeler" perspective looks quite cumbersome and the propagation delay of the hosted desktop makes the setup feel flimsy. But I was wrong with my pessimism.
I could easily create a new package in the content tab. This is used to separate the development of different teams or vendors. Then I addad a new attribute view called COURSE_PLAIN. On view creation one must choose between "session client" and "cross client" type. This is similar to client restricted and cross client tables in the ABAP workbench.
The attribute view consists of a "data foundation" part and a "semantics" part. You specify the data foundation first. This is just the list of tables and fields you want to base the view on. I was happy to find the table ZCOURSE by name with no necessity to type in a package or project name. For my first attribute view a single table is enough. Then I double clicked on the fields I wanted to be visible in the attribute view. If you mark no fields at all you'll get an error later on when activating.
You can switch to the "semantics" part by clicking on the respective box in the window on the left. Here you define the primary keys.
Until now I just created a respository object. In order to use it you have to transform it into a runtime object by activating it from the object's context menu (right click + "activate"). If the object cannot be activated due to errors you see them in the "Job Log" window on the right side which happened when I missed to mark any visible fields in the data foundation.
Then there's a button labeled "Data Preview" on top of the main window. It allowed me to see the
course data I had inserted from ABAP on day 43.
Strangely the deletion of even 100.000 records takes no measurable time. I implemented this with "delete from...", no table dropping.
Then I defined a similar multi table view called FULL_BLOWN_COURSE. It covers the tables ZCOURSE, ZLOCATION and ZTUTOR. The fields are: firstname, lastname, rate_hour, title, datestart, price, name, town and courseid.
After activation both views were visible in the "data preview editor" and also in the "SQL editor" which I did not mention before. The SQL editor is intuitively comprehensible. It shows just the first 200 rows of the view. The output of the data preview editor is a bit harder to grasp. It shows the charted distribution of a single attribute. For instance the attendees' first names are distributed equally over the four possible values that my random data generator returns.
When I defined the joins between the table triples I had the choice to use the join types referential join, inner join, left/right outer join, text join. While I knew inner and outer join from relational databases I was not sure about referential and text joins.
With the help of some online research I found out that referential joins are roughly the same as inner joins with the prerequisite that the key field in the master table be unique. Under these circumstances referential joins are faster than inner joins. Text joins are there to allow internationalization of text fields. They link an attribute to the language specific text representation.
I just wanted to finish the session when I discovered that I could see all ABAP objects I had created from the
ABAP workbench earlier also from the Hana studio - also the programs and classes.
As mentioned much earlier you can use Eclipse to write ABAP
code but I did not have the opportunity to try this before. I added a field called "price" to the ZCOURSE table.
evaluation01 is the name of the package I created when I defined the attribute views.
Next I began hastily to build my first analytic view. Analytic views are the next layer of Hana's data warehouse view stack, see day 10. But soon I discovered that I had misconceived the concept.
Analytic views are not fully built on top of attribute views. Instead the data foundation of an analytic view
consists of a facts table(!) plus one or more attribute views. The attribute views are there to provide the
dimension data. Thus I'll have to design my view from scratch.
First I defined two attribute views called PLAIN_TUTOR and PLAIN_LOCATION. They contain the fields tutorid+rate_hour and locationid+town+country, respectively. Primary keys are tutorid and locationid.
Then I started defining an analytic view called REVENUE. Analytic views consist of three parts:
A data foundation
The "data foundation" is assembled of the facts tables ZCOURSE and ZATTENDANCE joined on courseid (there can be more than one). Then I marked the fields datestart, attendanceid and price as visible. Datestart will be used for the time dimensions.
In the "logical joins" part I added the attribute views PLAIN_TUTOR and PLAIN_LOCATION with joins on locationid and tutorid. The latter will provide the geographical dimensions.
In the "semantics" part I could mark attributes as "measures". The only measure for the revenue is the price field.
Then I hit the "Data Preview" button and drews quick graphs for price/country und price/datestart as y/x axis. The graphs looked nice but I discovered that it used "price (count)" in the vertical direction. What I intended was a sum of prices not the number of recoreds. When I switched to "price (sum)" I received the error
On first sight this is surprising since the price field is explicitely labeled "numeric".
The first hits on google did not seem to be helpful, either. We'll see.
Anyways the diagrams in the data preview for avg(price)/country look good.
But this is only the data preview! When do we get to the real data display functionality?
Well, we don't. It took me some time to figure out that Hana is only the data source and aggregation engine. If you want beautiful charts you need to apply SAP Business Objects Business Intelligence Platform or SAP BW.
Another interesting detail is that you cannot query analytic views without using aggregates. If you run this query
then you get
If you use an aggregation function like SUM, MIN, MAX or AVG then it works fine.
First of all Hana is a crazy fast database. Check.
After this year I'll go on working with Hana. But I'll post only sporadically whenever there is anything remarkable to write about.
Fortunately the developer license can be prolonged. This link describes how.
where func stands for a SQLScript function that performs and returns some kind of non trivial calculation on the input field. A straightfoward ABAP counterpart would have involved a large number of database accesses. And even a more sophisticated implementation would require at least to transfer the whole table into the ABAP runtime where the function's logic would be applied.
My first idea was to look for built-in SQLScript functions. I thought there might be some type of date conversion but I found nothing helpful. Then I stumbled upon a blog post from Rich Heilman on Table User Defined Functions. This looked reasonably useful. So there is a way to define functions in SQLScript. The big difference of functions as compared to procedures is the call style. Procedures require an own CALL statement while functions can be used as operands or parameters in another statement. The downside of the functions Rich Heilman describes in that blog is that they can only return table type results.
Then I found another blog, also from Rich Heilman: Scalar User Defined Functions. This is what I was really looking for and what makes SQLScript by far more useable.
The next problem was to persuade my Hana Studio to define a UDF. Starting with the "Systems" tab of the SAP Hana Development perspective I tried File / new / SAP Hana / Database Development / Scalar Functions. This click path led me to a text editor prefilled with a UDF template (FUNCTION ...) but with no visible means to activate the code. So I just added "CREATE" to the template and pasted everything into Hana Studio's SQL command line tool.
This is my function that converts dollars to euros.
It can be used this way for queries
During development I had to throw in one of these statements from time to time.
Today I've learned that there is a way to bypass that bulky sixties COBOL inspired coding style that SQLScript procedures promote. But since we are living in the age of functional programming there is still much room for improvement here. I hope that Rich will one day be able to blog about SQLScript closures.