Optimizing for GAE: Objectify and Guice

When I started designing my webgallery application, I chose standard frameworks to keep the application (potentially) portable with minimal refactoring effort. I was not focused on Google App Engine; in fact, I ignored the typical characteristics of its architecture. That is a valid approach, and its advantage is that you can write almost normal Java applications that run on GAE (although some standard classes cannot be used). A big disadvantage, however, is that the application’s cold start time was uncomfortably long, because a lot of libraries have to be loaded before the application starts.

Therefore, I decided to remove all references to the Spring framework and the Datanucleus JDO implementation. The migration started on the persistence layer. Luckily I had added a DAO layer to my design, so all I needed to do was change the annotations in the model classes and write new DAO implementations (the interfaces remained the same). With Spring I had used the “JdoSupport” classes, which provide convenience methods for transactions and the like. As my application doesn’t use their full scope, it was easy to implement the part I actually needed in my own framework.
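To illustrate why the DAO layer made this swap cheap, here is a minimal sketch. All names in it (Album, AlbumDao, InMemoryAlbumDao, AlbumService) are hypothetical and not taken from the webgallery code, and the map-based implementation only stands in for a real datastore-backed one. The point is that the service depends on the interface alone, so replacing the persistence technology means writing a new implementation and nothing else.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model class; a real one would carry the persistence annotations.
class Album {
    final Long id;
    final String name;

    Album(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The interface the rest of the application depends on. It stays the same
// when the persistence technology changes.
interface AlbumDao {
    void save(Album album);

    Album find(Long id);
}

// Stand-in implementation backed by a map. A JDO-based or Objectify-based
// implementation would take its place without touching AlbumService.
class InMemoryAlbumDao implements AlbumDao {
    private final Map<Long, Album> albums = new HashMap<>();

    @Override
    public void save(Album album) {
        albums.put(album.id, album);
    }

    @Override
    public Album find(Long id) {
        return albums.get(id);
    }
}

// Business logic that only knows the interface.
class AlbumService {
    private final AlbumDao dao;

    AlbumService(AlbumDao dao) {
        this.dao = dao;
    }

    // Returns the album name, or null if no album with that id exists.
    String albumName(Long id) {
        Album album = dao.find(id);
        return album == null ? null : album.name;
    }
}
```

A test can hand the service any AlbumDao it likes, which is also what makes the migration easy to verify.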

Then I removed the references to the Spring framework. My initial thought was to get rid of DI altogether until I found a better solution, but that turned out to be impractical, and testing started to get annoying. Therefore I decided to use Guice. The migration was pretty simple. I chose member injection, so only @Inject annotations on the members are required. As Guice doesn’t use XML configuration, a Module that contains the configuration is required. That sounds a bit strange at first glance, but it turned out to be very comfortable and actually very type safe (as it is configuration in a Java class).
The next part was the integration with Stripes. I looked for available implementations, but quickly found out that integrating Guice into my application was just a matter of a very small class: all I needed was dependency injection in my ActionBeans. So I decided to write that integration myself:


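import com.google.inject.Guice;
import com.google.inject.Injector;
import net.sourceforge.stripes.action.Resolution;
import net.sourceforge.stripes.controller.ExecutionContext;
import net.sourceforge.stripes.controller.Interceptor;
import net.sourceforge.stripes.controller.Intercepts;
import net.sourceforge.stripes.controller.LifecycleStage;

// Runs around the lifecycle stage in which Stripes resolves the ActionBean.
@Intercepts( { LifecycleStage.ActionBeanResolution } )
public class GuiceInterceptor implements Interceptor {
    private final Injector injector;

    public GuiceInterceptor() {
        // WebgalleryModule is the application's own Guice module.
        injector = Guice.createInjector(new WebgalleryModule());
    }

    @Override
    public Resolution intercept(ExecutionContext context) throws Exception {
        // Let Stripes resolve the ActionBean first, then inject its members.
        Resolution resolution = context.proceed();
        injector.injectMembers(context.getActionBean());
        return resolution;
    }
}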

The point is to write a Stripes interceptor that intercepts the ActionBean resolution lifecycle stage, hence the annotation on the class. The interceptor creates the Guice injector in its constructor and references my module directly (that’s the benefit of a custom implementation: I don’t have to care about much configuration). The main part is the intercept() method, which first lets Stripes resolve the ActionBean and then lets the injector inject its members.
The GuiceInterceptor just has to be placed in the extension package of my application to be recognized by Stripes (since version 1.5). That package was already configured in the web.xml deployment descriptor:


    <init-param>
      <param-name>Extension.Packages</param-name>
      <param-value>org.cloudme.webgallery.stripes.extensions</param-value>
    </init-param>

Overall, the application now runs faster and uses far fewer external libraries than before. I recommend that everyone who develops for Google App Engine try to reduce the application size as much as possible by removing unnecessarily complex dependencies. With the right application design, switching between technologies can be very easy. My initial concern was that, by using very GAE-specific technologies (such as Objectify), I would get stuck on that platform and never be able to escape. That concern turned out to be irrelevant, as migrating to a different technology can be very simple. Just ensure you have sufficient test cases.

Cache strategies for Google App Engine

Google App Engine (GAE) provides a distributed in-memory cache called Memcache. Because the quotas are quite rigid, it may be necessary to use the cache extensively.

In my example, a web-based photo gallery, a lot of image scaling is performed. For example, when an album is loaded, all images are shown as thumbnails. These thumbnails are generated with the Images service, and sooner rather than later an OverQuotaException is thrown. Although image transformation can be resumed a few minutes after an OverQuotaException, it is still annoying for the user. My application catches the exception and shows a “sorry, over quota” default image.

To reduce the number of image transformations, all transformed data is put into the cache. For this I am using a very simple (custom) API consisting of a CacheProducer, which creates the data, and a CacheService, which checks whether the data is already in the cache or needs to be created.


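import java.io.Serializable;

// Creates the data when it is not in the cache yet.
public interface CacheProducer<T> {
    T produce();
}

// Returns the cached data for the given key parameters, or produces and caches it.
public interface CacheService {
    <T> T cache(CacheProducer<T> producer, Serializable... params);
    void invalidate();
}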

The Serializable... params parameter represents the components of the cache key.

Now let’s have a look at how to use this API. First we look at the PhotoDataService (which is used to get the actual image data from the datastore):


public class PhotoDataService {
...
    public byte[] getPhotoData(final Long photoId, final ImageFormat format, final ContentType type) {
        return cacheService.cache(new CacheProducer<byte[]>() {
            public byte[] produce() {
                PhotoData photoData = photoDataRepository.find(photoId);
                byte[] input = photoData.getDataAsArray();
                return imageService.process(input, format, type);
            }
        }, photoId, format, type);
    }
...
}

For this example I removed all code that is not related to caching. What you don’t see here: the CacheService first checks whether the data has already been cached. If not, it calls the produce() method to produce the data, puts the result into the cache and returns it. With this approach, I don’t need to care about the actual cache mechanisms, and I reduce clutter in the actual business logic. The cache API is reusable and can be used for any other data and key parameters.
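The check-then-produce behaviour described above could look roughly like the following sketch. It is not my actual implementation: to keep it self-contained, a local ConcurrentHashMap stands in for Memcache, and the class name InMemoryCacheService as well as the way the key is built from the params (a list of the parameters) are assumptions made for illustration. The two interfaces are repeated so the listing compiles on its own.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Same shape as the CacheProducer interface shown earlier.
interface CacheProducer<T> {
    T produce();
}

// Same shape as the CacheService interface shown earlier.
interface CacheService {
    <T> T cache(CacheProducer<T> producer, Serializable... params);

    void invalidate();
}

// Hypothetical stand-in: a local map plays the role of Memcache.
class InMemoryCacheService implements CacheService {
    private final Map<List<Serializable>, Object> store = new ConcurrentHashMap<>();

    @Override
    @SuppressWarnings("unchecked")
    public <T> T cache(CacheProducer<T> producer, Serializable... params) {
        // Assumption: the key is the list of parameters, compared element by element.
        List<Serializable> key = Arrays.asList(params.clone());
        Object cached = store.get(key);
        if (cached != null) {
            // Cache hit: the producer is not called at all.
            return (T) cached;
        }
        // Cache miss: produce the data, remember it, and return it.
        T produced = producer.produce();
        if (produced != null) {
            store.put(key, produced);
        }
        return produced;
    }

    @Override
    public void invalidate() {
        // Simplest possible strategy: drop everything.
        store.clear();
    }
}
```

Calling cache() twice with the same parameters runs the producer only once; a different parameter combination, or a call to invalidate(), makes the producer run again.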

However, the first run still produces a lot of requests to the Images service, so OverQuotaExceptions can still be thrown. Next I’ll explain how to use a background task to prepopulate the cache in order to reduce the exceptions even further.

First run Maven GAE Plugin

Until now I used the Eclipse plugin to develop and test Google App Engine applications. Today I tried the Maven GAE Plugin, in order to become more flexible and independent from Eclipse. My first observations were:

  • As I started with the Maven GAE Archetype, the package declaration of the generated code did not match the actual file location. That had to be corrected first.
  • I changed the compiler settings from 1.6 (the archetype’s default) to 1.5 and had to remove some @Override annotations, because Java 5 does not allow @Override on methods that implement an interface method.
  • The gae.home property is undefined (which makes sense), but I had to define it to run the development server.

Overall, I’m quite pleased with the first run. Although Maven adds a lot of overhead, and the actual build process took a bit too long initially to download all dependencies, the overall impression is very good.

Run GAE Development Environment in Eclipse on the Mac

I already mentioned how to set up Eclipse to get code completion for the Google App Engine development environment. Running the application from inside Eclipse has some advantages, too: all errors are logged to the console, and you can jump directly to the error location in the code.

The run configuration needs to use the following main module:

/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/dev_appserver.py

My Development Environment

As CloudMe will not just focus on Java development, but will also utilize GAE (Google App Engine), I had to modify my development environment to support Python. I decided to use Eclipse, as I’m already familiar with it from Java development (I tried NetBeans, which is also really good and needs less memory, but being more familiar with Eclipse made the difference). For this project I did a clean install:

For GAE, some documentation is required (considering that I’m new to Python):