Monday, October 19, 2009

My experience with Google AppEngine - Java (Part 2)

This is a continuation of My experience with Google AppEngine - Java (Part 1).

After uploading to Google AppEngine the same DTA file that worked locally, I received a big fat Internal Server Error message. I diligently checked the logs available through the GAE admin console and found a suspicious error. I no longer have the exception information (GAE rolled over the logs), but it was quite obvious that the problem was in the file upload code. The example in the Apache Commons FileUpload User Guide does not work out of the box on GAE.

The answer was not too hard to find after some googling. GAE/J does not support writing to the file system. The proper example is even available in the GAE/J FAQ: How do I handle multipart form data? or How do I handle file uploads to my app?
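To illustrate the "no file system" constraint, here is a minimal sketch of handling an upload entirely in memory. It is not the FAQ's code. The class name InMemoryUpload, the readFully helper and the maxBytes cap are all made up for illustration. In a real servlet, the InputStream would come from the Commons FileUpload streaming API (for example FileItemStream.openStream()) rather than from a temporary file on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: buffers an uploaded stream in memory instead of on disk.
class InMemoryUpload {

    // Reads the whole stream into a byte array, refusing anything larger than maxBytes.
    static byte[] readFully(InputStream in, int maxBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                // Guard against uploads that would exhaust memory.
                if (out.size() + n > maxBytes) {
                    throw new IllegalArgumentException("upload exceeds " + maxBytes + " bytes");
                }
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

The resulting byte array can be wrapped in a ByteArrayInputStream and handed to a parser, so nothing ever touches the disk. The size cap matters because everything now lives in the request's memory.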

After updating my code according to the example in the FAQ, I no longer got an Internal Server Error, but my page would not refresh with the new information. It worked fine locally, but would not work on GAE/J!

This is one of the tougher kinds of problems: since the symptom is not specific, it is difficult to even know what to google. There was no error message to search for. One of the best approaches I have found for this kind of issue is to think like the system. It is similar to the good old saying about putting yourself in someone else's shoes. However, the someone else this time is Google App Engine.

Going through the thought experiment combined with code inspection was a powerful way to troubleshoot this kind of problem. It does require a good understanding of the different layers of abstraction in a distributed computing system. Good thing the early days of tinkering with PC parts + reading random computer articles/books + education paid off. It all came down to this small segment of code.

Object myObject = request.getSession().getAttribute("mySessionObject");
// do something with myObject
return;

All I had to do was:
Object myObject = request.getSession().getAttribute("mySessionObject");
// do something with myObject
// put the object back so the session sees the change
request.getSession().setAttribute("mySessionObject", myObject);
return;

Lo and behold, the code is now working properly.
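My best understanding of why the extra setAttribute call matters: when session data is kept outside the JVM in serialized form, changing an object you got from the session only changes a local copy until you set it back. The sketch below simulates that with a made-up SerializingSession class that serializes on set and deserializes on get. It is an assumption-laden stand-in, not the GAE/J session implementation, and all class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a session whose attributes live outside the JVM.
class SerializingSession {
    private final Map<String, byte[]> store = new HashMap<>();

    // Stores a serialized snapshot of the value, as a remote store would.
    void setAttribute(String name, Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            store.put(name, bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns a fresh copy rebuilt from the stored bytes, or null if absent.
    Object getAttribute(String name) {
        byte[] data = store.get(name);
        if (data == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class SessionWriteBackDemo {
    // Mutates a session object, optionally writes it back, and reports
    // the list size that a later request would observe.
    @SuppressWarnings("unchecked")
    static int sizeAfterUpdate(boolean writeBack) {
        SerializingSession session = new SerializingSession();
        session.setAttribute("mySessionObject", new ArrayList<String>());

        // "Request 1": fetch, mutate, and optionally set back.
        ArrayList<String> myObject = (ArrayList<String>) session.getAttribute("mySessionObject");
        myObject.add("new data point");
        if (writeBack) {
            session.setAttribute("mySessionObject", myObject);
        }

        // "Request 2": read what the store actually holds.
        return ((ArrayList<String>) session.getAttribute("mySessionObject")).size();
    }

    public static void main(String[] args) {
        System.out.println("without write-back: " + sizeAfterUpdate(false)); // prints 0
        System.out.println("with write-back: " + sizeAfterUpdate(true));     // prints 1
    }
}
```

On a single local JVM the session usually hands back the very same object reference, which would explain why the missing write-back never showed up in local testing.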

After going through the issue above, I am convinced that writing stateful code on top of a technology (HTTP) that is inherently stateless is a difficult task. With the computing ecosystem getting increasingly complex, layered with abstraction on top of abstraction, it is getting harder and harder to find people who understand the whole stack of technologies.

With the advances in browsers and in libraries that normalize the differences between browsers, I think it is a good time to re-examine moving the state back into the browser (except for security-related state). No more 50+ MB session objects.

The plotter code stores the raw data of what to plot in the J2EE session, and I would like to move it out and maintain that information on the client side. In fact, I would like to re-architect the page such that no function will require a page load.

With that goal in mind, the only real reason the page requires a reload is the file upload form post. After some googling around, I was excited to find that DWR3 is moving to support file upload.

After many tries, I was still unable to get DWR3 file upload to work on GAE. Even just including the DWR3 libraries in the GAE project caused the project to fail to start up properly in the local sandbox environment. It is actually a good thing that this failed locally, instead of my finding out after hours of writing code that it would not work on GAE.

The error I was getting was:

javax.servlet.ServletException: org.directwebremoting.extend.ContainerConfigurationException: java.security.AccessControlException: access denied (java.lang.RuntimePermission modifyThreadGroup)
at org.directwebremoting.servlet.DwrServlet.init(DwrServlet.java:77)

To give DWR3 credit, it is still in a pre-release stage, and the code base I worked with was RC1 and RC2. Although DWR3 was explicitly listed as "Compatible" on the "Will it play on App Engine" list from Google, it looks like I am not the only one who is out of luck.

Issue#376 has been opened against DWR to hopefully get it to support GAE and other platforms.

There are many postings online from different people on different mailing lists with the same problem.

The most useful post I found online was this. Basically, the problem has to do with DWR3 spawning threads during startup. These threads are responsible for some housekeeping within the DWR3 library, but spawning threads is a big no-no in GAE/J. There are workarounds on the mailing list that override the housekeeping code, but is it really worth investing all this energy in it?

Maybe DWR3 is moving towards supporting the GAE/J platform; however, at the time of my investigation, only DWR3 RC1 and RC2 were available.

In ContainerUtil version 1.35, tagged for RC2, the problematic code that uses the DefaultScriptSessionManager in the setupDefaults() method during initialization is still there (DefaultScriptSessionManager is the one spawning threads).

container.addParameter(ScriptSessionManager.class.getName(), DefaultScriptSessionManager.class.getName());

There are later versions of the file, and the setupDefaults method in ContainerUtil has changed, but I don't think investing the energy at this stage just to get DWR3 to start up in GAE/J is a wise decision. There may be a lot more compatibility issues between DWR3 and GAE/J down the road.

So with all this rambling, what is the moral of the story?

GAE/J is a great platform which promises to reduce a lot of infrastructure and system administration cost. However, there's a catch. The power of Java is its ubiquity. There are Java libraries for many, many purposes, from game engines to integration with mainframes. But GAE/J does not fully support the J2SE/J2EE standards, and a lot of libraries which you or your enterprise rely on may not work on the GAE/J platform.

Secondly, writing truly distributed server-side code on J2EE is not easy. I suspect a lot of the code out there running on the J2EE platform works only because it runs in a single JVM, or relies on load-balancer session stickiness to make non-distributable code work in a "cluster". You will lose that "luxury" when moving to the GAE/J platform. The above example of remembering to set your session attribute back into the session is only the tip of the iceberg of the challenges of distributed code on J2EE. There are many applications out there with enormous session objects which will grind to a halt if the session object needs to be replicated across nodes, as is the case on GAE/J.

Saturday, October 3, 2009

My experience with Google AppEngine - Java (Part 1)

Recently in the Information Technology space, cloud computing has been the latest buzzword. One of the players in town is Google. Their offering, Google App Engine (GAE), provides Platform as a Service (PaaS). Python and Java are the two platforms provided by GAE, aka GAE/P and GAE/J.

Recently, I had a chance to work on a mini web application project. The application is simple: a user can upload a file captured by one of these nifty devices, iButtons, and the information is charted out similar to the OneWireViewer desktop application.

A few years back when Google first came out with GAE/P, I toyed with it a bit, writing a simple (and ugly) bulletin board application. It was a good introductory experience, and the PaaS idea looked promising. Unfortunately, my *real* work project went into crunch mode and I could no longer devote time to studying Python and the Django framework on GAE/P.

Then Google came up with the Java edition of GAE, which reinvigorated my interest in PaaS. With my background in Java, GAE/J was a natural choice for building my mini web charting application.

Although GAE/J was still in "Preview-Edition", it already came with a pretty decent Eclipse plugin. I still remember the early days with GAE/P where I had to memorize command lines in order to build, compile and run the local GAE/P environment to test. The Eclipse plugin freed up some of my brain cells to focus on the actual application development.

Armed with a sample data file, the first thing to do was to build a parser to read in the file and transform it into a POJO data model which can be manipulated easily. Simple enough, no issues there.

Next up was creating the upload functionality. Apache Commons FileUpload came in handy. I followed the instructions available on their website, and so far so good. The functionality worked seamlessly on the local GAE environment (built on Jetty). After integrating the upload functionality and the parser, I had half the application completed in a few hours.

Lastly, the charting part. With my newfound love for all things Google, I gave the Google Chart API a spin. The concept of Google Chart is simple and elegant. It is a RESTful API: you supply all the input in the src attribute of an <img> tag, and voila, you get a chart image back. Elegant, yes, but not without its flaws. Just by looking at the API, the inherent limitation of the design becomes apparent. Since the <img> tag issues an HTTP GET request, the payload of the API is limited by the maximum length of a URL!

http://chart.apis.google.com/chart?cht=p3&chd=t:60,40&chs=250x100&chl=Hello|World
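To make the URL length concern concrete, here is a rough sketch that builds a chart URL like the one above from an array of values and checks it against a length budget. The class name ChartUrlSketch, its methods and the 2048-character budget are my own assumptions for illustration. The real limit depends on the browser and the server, and only the cht, chd, chs and chl parameters are taken from the example URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of any Google library.
class ChartUrlSketch {

    // Assumed budget only; actual URL limits vary by browser and server.
    static final int ASSUMED_MAX_URL_LENGTH = 2048;

    // Builds a 3D pie chart URL in the same shape as the example above.
    static String pieChartUrl(int[] values, String[] labels, int width, int height) {
        StringBuilder data = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                data.append(',');
            }
            data.append(values[i]);
        }
        StringBuilder labelParam = new StringBuilder();
        for (int i = 0; i < labels.length; i++) {
            if (i > 0) {
                labelParam.append('|');
            }
            labelParam.append(encode(labels[i]));
        }
        return "http://chart.apis.google.com/chart?cht=p3&chd=t:" + data
                + "&chs=" + width + "x" + height
                + "&chl=" + labelParam;
    }

    // True when the URL stays within the assumed length budget.
    static boolean fitsInUrl(String url) {
        return url.length() <= ASSUMED_MAX_URL_LENGTH;
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String url = pieChartUrl(new int[] {60, 40}, new String[] {"Hello", "World"}, 250, 100);
        System.out.println(url + " fits: " + fitsInUrl(url));
    }
}
```

Two data points fit comfortably, but a logging device can easily produce thousands of readings, and at a few characters per value the data parameter alone would blow past any budget of this size.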

You know how people say "Love is Blind"? Yes, it is true, even in the nerdy world of software development. Damn Google with its slick API and sexy-looking charts. It is so hard to resist! With my newfound love for all things Google, I ignored my intuition and soldiered on.

After another hour or two, I had a working local prototype. Time to celebrate? ... Not yet. Let's upload the application and test it on the Cloud!

Being all things new and shiny, of course it did NOT work on the actual GAE/J cloud. Upon uploading the test file, I got a nice big:

Internal Server Error

To be continued ...