Thursday, September 12, 2013

Immersing myself in data and databases

With all the buzz around Big Data, is it a coincidence or a deliberate move on my part to learn more about databases?

This week I finally completed the 7-week online course M102: MongoDB for DBAs, a great course offered by MongoDB Inc. (formerly known as 10gen).

Not only did I get to learn about MongoDB, it also helped me brush up on basic DBA skills as well as infrastructure architecture for data storage.

As part of my job, I got to attend the "DB2 10 for Linux, Unix and Windows Fundamentals" course offered by IBM here in Hong Kong. I wasn't too excited on the first day, which covered typical DBA administrative tasks, but the second day was more interesting. From the course, I learned more about the theory behind column-based databases and some basics of how DB2 implements them under the fancy name BLU Acceleration.

Another interesting topic is the Temporal Table feature in DB2. After some Googling, I found that temporal tables have been standardized in ISO/IEC 9075:2011, also known as SQL:2011! It is already yesterday's news. Even open source databases such as PostgreSQL can support temporal tables, via an extension. I still recall the days of doing it manually in the application layer, then later using Hibernate Envers or EclipseLink History Policy and thinking I was cool. Now this stuff is natively supported in standard SQL.
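For anyone who never had to do it by hand, here is a minimal sketch of what tracking history in the application layer roughly looks like. The class and method names (ManualHistory, update, asOf) are made up for illustration, and this is not how DB2 or SQL:2011 implements temporal tables. It only shows the idea of closing the current version of a row and appending a new one.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class ManualHistory {
    // One version of a value, valid from validFrom (inclusive) to validTo (exclusive).
    // A null validTo marks the current version.
    static final class Version {
        final String value;
        final Instant validFrom;
        Instant validTo;

        Version(String value, Instant validFrom) {
            this.value = value;
            this.validFrom = validFrom;
        }
    }

    private final List<Version> versions = new ArrayList<>();

    // Close the current version (if any) and append the new one.
    // Assumes updates arrive in chronological order.
    public void update(String newValue, Instant at) {
        if (!versions.isEmpty()) {
            versions.get(versions.size() - 1).validTo = at;
        }
        versions.add(new Version(newValue, at));
    }

    // Return the value that was current at the given instant, or null if none.
    public String asOf(Instant at) {
        for (Version v : versions) {
            boolean started = !at.isBefore(v.validFrom);
            boolean notEnded = v.validTo == null || at.isBefore(v.validTo);
            if (started && notEnded) {
                return v.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ManualHistory name = new ManualHistory();
        name.update("Old Cafe", Instant.parse("2013-01-01T00:00:00Z"));
        name.update("New Cafe", Instant.parse("2013-06-01T00:00:00Z"));
        System.out.println(name.asOf(Instant.parse("2013-03-01T00:00:00Z"))); // Old Cafe
        System.out.println(name.asOf(Instant.parse("2013-09-01T00:00:00Z"))); // New Cafe
    }
}
```

With a temporal table the database maintains the validity period columns itself, so none of this bookkeeping has to live in application code.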

Sunday, July 21, 2013

Got my M101J: MongoDB for Java Developers Certificate



M101J: MongoDB for Java Developers is a 7-week course. The course was well taught. I am now officially NoSQL certified!

Sunday, April 14, 2013

Exploring Cloud Foundry with Spring Stack + Vaadin

Update 2013/07/21:
Since Cloud Foundry discontinued its free tier on 2013/07/01, the sample application went offline. As of today, I have successfully migrated the application to Red Hat's OpenShift (offline since Sept 2017, after Red Hat withdrew its free-tier offering).

I have been reading about Cloud Foundry for a while and finally got around to trying it out.


Cloud Foundry provides an interesting cloud play that isn't quite the same as a pure PaaS play such as Google App Engine or Force.com, while providing far more tools and support than an IaaS such as Amazon EC2.

The Test Application

To test out Cloud Foundry, I decided to build an application that has a web UI component and also a periodically scheduled background job component.

About a year ago, I read about the Hong Kong Government's Data.One project, which is similar to other initiatives around the globe where governments make public data more accessible to the people. Although the Hong Kong Government is quite a bit behind in this area, I am very happy to see that actual progress is being made and more datasets are being added.

With this resource, I decided to build an application that pulls the Restaurant License data set from the Food and Environmental Hygiene Department every day, imports it into a database and tracks the changes.
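The change-tracking idea can be sketched as a comparison of two daily snapshots. This is only an illustration under assumptions: the class SnapshotDiff, the license numbers and the restaurant names are made up, each snapshot is reduced to a map from license number to restaurant name, and the real application relies on EclipseLink for history rather than on code like this.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SnapshotDiff {
    // License numbers present in the current snapshot but not in the previous one.
    public static Set<String> added(Map<String, String> previous, Map<String, String> current) {
        Set<String> result = new TreeSet<>(current.keySet());
        result.removeAll(previous.keySet());
        return result;
    }

    // License numbers present in the previous snapshot but missing from the current one.
    public static Set<String> removed(Map<String, String> previous, Map<String, String> current) {
        return added(current, previous);
    }

    // License numbers present in both snapshots whose restaurant name differs.
    public static Set<String> changed(Map<String, String> previous, Map<String, String> current) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            String oldName = previous.get(entry.getKey());
            if (oldName != null && !oldName.equals(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the real data set.
        Map<String, String> yesterday = new TreeMap<>();
        yesterday.put("L001", "Old Cafe");
        yesterday.put("L002", "Noodle Bar");

        Map<String, String> today = new TreeMap<>();
        today.put("L001", "New Cafe");
        today.put("L003", "Dim Sum House");

        System.out.println("added=" + added(yesterday, today));     // added=[L003]
        System.out.println("removed=" + removed(yesterday, today)); // removed=[L002]
        System.out.println("changed=" + changed(yesterday, today)); // changed=[L001]
    }
}
```

Running a comparison like this once per import is enough to answer questions such as which restaurants appeared since the previous day.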

Basic Components and Technology Overview

The application was deployed on Cloud Foundry - http://hkrestaurants.cloudfoundry.com (Unfortunately Cloud Foundry ended its beta/free tier, and as a result the application is no longer available).

The UI is quite primitive right now, but it is set up to import data daily, and the changes are tracked using the EclipseLink History Policy feature. After collecting some more data, it might be interesting to see the latest restaurants in Hong Kong each day, and to create some interesting visualizations based on the historic data.

Impression of Cloud Foundry

Cloud Foundry is still in beta, so some problems were expected. When I first tried it out, I did encounter some problems trying to build a standalone app to do a periodic import from an external data source. I did manage to get past the problem by re-installing Ruby and VMC, but what exactly happened is still a mystery.

As far as I know, we can only deploy standalone applications using the VMC tool. The VMC tool is built on top of Ruby. I am no expert in Ruby, but I believe Windows isn't a first-class platform for Ruby. For example, the very useful RVM, a tool to install and manage multiple versions of Ruby, isn't available on Windows. I know there is pik, but there are some oddities with it and it feels unpolished, even for a command-line tool. Besides, it appears abandoned: no significant update to the project for 2 years, 45 outstanding issues, and some of those issues date back 3 years.

Having said the above about the VMC tool, deploying a web application is a breeze thanks to its amazing integration with SpringSource Tool Suite (STS)! Redeploying the web application is as simple as a click!


In conclusion, I think the technology behind Cloud Foundry is interesting. Because its building blocks are open source and widely adopted, such as PostgreSQL or MySQL versus GAE's proprietary datastore, it provides a very enticing no-lock-in ecosystem which enterprise customers may find very attractive. The absence of lock-in is fundamental to the promise of Cloud Foundry, and it comes from the open source technology stack supported by its cloud and the open source tool set designed to interact with it. A few companies have sprung up based on Cloud Foundry, such as AppFog, Tier 3 and AppCloud by Uhuru Software, Inc. As they are all based on Cloud Foundry, migration from one provider to another should be relatively straightforward.

I am familiar with Google App Engine and Amazon EC2. I am not that interested in Force.com due to its proprietary language. Here are some interesting clouds I would like to try and learn more about:

  1. OpenShift
  2. Azure
  3. OpenStack
  4. CloudBees
For OpenShift and OpenStack, I am not even sure whether each is a PaaS or an IaaS play, so there is a lot to learn. Do you know of another PaaS or IaaS provider that I should take a look at?