Monday, October 19, 2009

My experience with Google AppEngine - Java (Part 2)

In continuation of My experience with Google AppEngine - Java (Part 1).

After uploading the same DTA file that worked locally to Google AppEngine, I received a big fat Internal Server Error message. I diligently checked the logs available through the GAE admin console and found a suspicious error. I no longer have the exception information (GAE rolled over the logs), but it was quite obvious that the problem was in the file upload code. The example in the Apache Commons FileUpload User Guide does not work out of the box on GAE.

The answer was not too hard to find after some googling. GAE/J does not support writing to the file system. A working example is available right in the GAE/J FAQ: "How do I handle multipart form data? or How do I handle file uploads to my app?"
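To make the file system restriction concrete, here is a minimal sketch of the idea behind the fix: consume the upload as a stream and keep the bytes in memory rather than spooling them to a temporary file. It assumes the uploaded part is available as an InputStream (for example from FileItemStream.openStream() in the Commons FileUpload streaming API). The class and method names below are made up for illustration and are not taken from the FAQ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the class and method names are illustrative only.
class InMemoryUpload {

    // Drains an upload stream into a byte array without touching the file system.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream of an uploaded file part.
        InputStream fakeUpload =
                new ByteArrayInputStream("fake DTA bytes".getBytes(StandardCharsets.UTF_8));
        byte[] data = readFully(fakeUpload);
        System.out.println("Read " + data.length + " bytes without a temp file");
    }
}
```

Holding the whole upload in memory is only reasonable for small files. A large upload would need to be processed chunk by chunk as it is read.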

After updating my code according to the example in the FAQ, I no longer got an Internal Server Error, but my page would not refresh with the new information. It worked fine locally, but would not work on GAE/J!

This was one of the tougher problems. Because the symptom was not specific, it was difficult to even know what to google for; there was no error message to search on. One of the best approaches I have found for this kind of issue is to think like the system. It is similar to the good old saying about "putting yourself in someone else's shoes", except that the someone else this time is Google App Engine.

Going through the thought experiment combined with code inspection was a powerful way to troubleshoot this kind of problem. It does require a good understanding of the different layers of abstraction in a distributed computing system. Good thing the early days of tinkering with PC parts, reading random computer articles and books, and education did pay off. It all came down to this small segment of code.

Object myObject = request.getSession().getAttribute("mySessionObject");
// do something with myObject
return;

All I had to do was:
Object myObject = request.getSession().getAttribute("mySessionObject");
// do something with myObject
// put the object back so the container knows the session has changed
request.getSession().setAttribute("mySessionObject", myObject);
return;

Lo and behold, the code was now working properly.
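One plausible way to picture why the extra setAttribute call matters: if the container keeps session attributes in serialized form (so they can be shared between nodes), then whatever getAttribute hands back is effectively a copy, and changes to it are lost unless it is written back. The sketch below is a deliberately simplified model of that behaviour. SerializingSession is a made-up class, not the GAE/J session implementation, and the real container may only persist at the end of a request rather than on every call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a session that stores attributes as serialized bytes.
// It is NOT the GAE/J implementation, only a model of copy-on-read/write behaviour.
class SerializingSession {
    private final Map<String, byte[]> store = new HashMap<String, byte[]>();

    void setAttribute(String name, Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            store.put(name, bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException("attribute could not be serialized", e);
        }
    }

    Object getAttribute(String name) {
        byte[] bytes = store.get(name);
        if (bytes == null) {
            return null;
        }
        try {
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes));
            return in.readObject(); // every read yields a fresh copy
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("attribute could not be restored", e);
        }
    }
}

class SessionDemo {
    @SuppressWarnings("unchecked")
    static ArrayList<String> points(SerializingSession session) {
        return (ArrayList<String>) session.getAttribute("mySessionObject");
    }

    public static void main(String[] args) {
        SerializingSession session = new SerializingSession();
        session.setAttribute("mySessionObject", new ArrayList<String>());

        // Mutate without writing back: only a local copy is changed.
        points(session).add("lost");
        System.out.println(points(session).size()); // prints 0

        // Mutate and write back: the change is stored.
        ArrayList<String> copy = points(session);
        copy.add("kept");
        session.setAttribute("mySessionObject", copy);
        System.out.println(points(session).size()); // prints 1
    }
}
```

In a single local JVM the session usually hands back the very same object each time, which would explain why the missing write-back went unnoticed until the code ran on GAE/J.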

After going through the issue above, I am convinced that writing stateful code over a technology (HTTP) that is inherently stateless is a difficult task. With the computer ecosystem getting increasingly complex, layered with abstraction on top of abstraction, it is getting harder and harder to find people who understand the whole stack of technologies.

With the advancement in browsers and in libraries that normalize the differences between browsers, I think it is a good time to re-examine moving state back into the browser (except for security-related state). No more 50+ MB session objects.

The plotter code stores the raw data of what to plot in the J2EE session, and I would like to move it out and maintain that information on the client side. In fact, I would like to re-architect the page such that no function will require a page load.

With that goal in mind, the only real reason the page requires a reload is the file upload form post. After some googling around, I was excited to find that DWR3 is moving to support file upload.

After many tries, I was still unable to get DWR3 file upload to work on GAE. Merely including the DWR3 libraries in the GAE project caused the project to fail to start up properly in the local sandbox environment. It is actually a good thing that this failed locally, instead of my finding out after hours of writing code that it would not work on GAE.

The error I was getting was:

javax.servlet.ServletException: org.directwebremoting.extend.ContainerConfigurationException: java.security.AccessControlException: access denied (java.lang.RuntimePermission modifyThreadGroup)
at org.directwebremoting.servlet.DwrServlet.init(DwrServlet.java:77)

To give DWR3 credit, it is still in a pre-release stage, and the code base I worked with was RC1 and RC2. Although DWR3 was explicitly listed as "Compatible" on the "Will it play on App Engine" list from Google, it looks like I am not the only one who is out of luck.

Issue #376 has been opened against DWR, hopefully to get it to support GAE and other platforms.

There are many postings online from different people on different mailing lists with the same problem.

The most useful post online I found was this. Basically, the problem has to do with DWR3 spawning threads during startup. These threads are responsible for some housekeeping within the DWR3 library, but spawning threads is a big no-no in GAE/J. There are workarounds in the mailing list that override the housekeeping code, but is it really worth investing all this energy in it?
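For a sense of what housekeeping without a background thread can look like in general, here is a small sketch that expires stale entries opportunistically during normal calls. This is not the workaround from the mailing list and it is not DWR code. LazyExpiryRegistry and everything in it are hypothetical names used only to illustrate the idea.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry, not part of DWR: it expires stale entries during
// normal calls instead of relying on a background housekeeping thread.
class LazyExpiryRegistry {
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<String, Long>();
    private final long timeoutMillis;

    LazyExpiryRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Records activity for an id, then sweeps out entries that have timed out.
    void touch(String id, long nowMillis) {
        lastSeen.put(id, nowMillis);
        sweep(nowMillis);
    }

    // An id is active if it has been seen within the timeout window.
    boolean isActive(String id, long nowMillis) {
        Long seen = lastSeen.get(id);
        return seen != null && nowMillis - seen <= timeoutMillis;
    }

    int size() {
        return lastSeen.size();
    }

    // Removes every entry whose last activity is older than the timeout.
    private void sweep(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > timeoutMillis) {
                it.remove();
            }
        }
    }
}
```

The trade-off is that cleanup only happens when a request arrives, so an idle application keeps stale entries around until the next call.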

Maybe DWR3 is moving towards supporting the GAE/J platform; however, at the time of my investigation, only DWR3 RC1 and RC2 were available.

Inspecting ContainerUtil version 1.35, tagged for RC2, the problematic code that uses DefaultScriptSessionManager in the setupDefaults() method during initialization is still there (DefaultScriptSessionManager is the one spawning threads).

container.addParameter(ScriptSessionManager.class.getName(), DefaultScriptSessionManager.class.getName());

There are later versions of the file, and the setupDefaults() method in ContainerUtil has changed, but I don't think investing the energy at this stage just to get DWR3 to start up in GAE/J is a wise decision. There may be many more compatibility issues between DWR3 and GAE/J down the road.

So with all this rambling, what is the moral of the story?

GAE/J is a great platform that promises to reduce a lot of infrastructure and system administration cost. However, there's a catch. The power of Java is its ubiquity. There are Java libraries for many, many purposes, from game engines to integration with mainframes. But GAE/J does not fully support the J2SE/J2EE standards, and a lot of the libraries that you or your enterprise rely on may not work on the GAE/J platform.

Secondly, writing truly distributed server-side code on J2EE is not easy. I suspect that a lot of code running on the J2EE platform works only because it runs in a single JVM, or relies on load-balancer session stickiness to make the non-distributable code work in a "cluster". You will lose that "luxury" when moving to the GAE/J platform. The above example of remembering to set your session attribute back into the session is only the tip of the iceberg of the challenges of distributed code on J2EE. There are many applications out there with enormous session objects that will grind to a halt if the session object needs to be replicated across nodes, as is the case on GAE/J.
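One rough way to gauge how heavy a session object would be to replicate is to measure its serialized size. The sketch below assumes plain Java serialization is what gets shipped between nodes, which may not match what a given container does. SessionSizeProbe is a made-up helper name, and the list of doubles merely stands in for raw plot data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper for estimating the replication cost of a session attribute.
class SessionSizeProbe {

    // Returns the number of bytes produced by standard Java serialization.
    static int serializedSize(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            return bytes.size();
        } catch (IOException e) {
            throw new IllegalStateException("value could not be serialized", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for raw plot data kept in the session.
        ArrayList<Double> rawData = new ArrayList<Double>();
        for (int i = 0; i < 100000; i++) {
            rawData.add(i * 0.5);
        }
        System.out.println("Serialized size in bytes: " + serializedSize(rawData));
    }
}
```

Running a probe like this against the real session attributes gives a quick sense of whether they are small enough to be copied around on every request.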