Optimizing for GAE: Objectify and Guice

When I started designing my webgallery application, I chose standard frameworks to keep the application (potentially) portable with minimal refactoring effort. I was not focused on Google App Engine; in fact, I ignored the typical characteristics of its architecture. That is a perfectly valid approach, and its advantage is that you can write almost normal Java applications that run on GAE (except that some standard classes cannot be used). A big disadvantage, however, was that the application’s cold start time was uncomfortably long, because a lot of libraries had to be loaded before the application could start.

Therefore, I decided to remove the Spring framework and the DataNucleus JDO implementation. The migration started on the persistence layer, where Objectify replaced JDO. Luckily I had added a DAO layer to my design, so all I needed to do was change the annotations in the model classes and write new DAO implementations (the interfaces remained the same). With Spring I had used the “JdoSupport” classes, which provide convenience methods for transactions etc. As I didn’t use their full scope in my application, it was easy to implement the part that I actually needed in my own framework.
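
To illustrate why a DAO layer keeps such a migration cheap, here is a minimal sketch. All names (Photo, PhotoDao, InMemoryPhotoDao, PhotoService) are hypothetical and not taken from the webgallery source, and the in-memory class merely stands in for a JDO- or Objectify-backed implementation:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model class; in a real application it would carry persistence annotations.
class Photo {
    final Long id;
    final String name;

    Photo(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The DAO interface is the only persistence type the rest of the application depends on.
interface PhotoDao {
    void save(Photo photo);
    Photo find(Long id);
    List<Photo> findAll();
}

// Stand-in implementation (assumption); a JDO or Objectify version would implement the same interface.
class InMemoryPhotoDao implements PhotoDao {
    private final Map<Long, Photo> store = new LinkedHashMap<Long, Photo>();

    public void save(Photo photo) { store.put(photo.id, photo); }
    public Photo find(Long id) { return store.get(id); }
    public List<Photo> findAll() { return new ArrayList<Photo>(store.values()); }
}

// Callers only see PhotoDao, so swapping the implementation does not touch them.
class PhotoService {
    private final PhotoDao dao;

    PhotoService(PhotoDao dao) { this.dao = dao; }

    String describe(Long id) {
        Photo photo = dao.find(id);
        return photo == null ? "not found" : photo.name;
    }
}
```

Because PhotoService is written against the interface, replacing the persistence technology only means providing another PhotoDao implementation.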

Then I removed all references to the Spring framework. My initial thought was to get rid of DI altogether until I found a better solution, but that turned out to be impractical, and testing started to get annoying. Therefore I decided to use Guice. The migration was pretty simple. I chose member injection, so only @Inject annotations on the member fields are required. As Guice doesn’t use XML configuration, a Module is required that contains the configuration. That sounds a bit strange at first glance, but it turned out to be very comfortable and actually very type safe (as it is configuration in a Java class).
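
To make the idea of member injection more concrete, here is a rough, stdlib-only sketch of what it does conceptually. It is not how Guice is implemented and does not use the Guice API; the annotation @InjectMember and the classes MiniInjector, AlbumDao, SimpleAlbumDao and AlbumAction are hypothetical names made up for this illustration:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical marker annotation; Guice uses its own @Inject instead.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InjectMember {}

interface AlbumDao { String name(); }

class SimpleAlbumDao implements AlbumDao {
    public String name() { return "simple"; }
}

// Plays the role of an ActionBean whose member gets injected.
class AlbumAction {
    @InjectMember AlbumDao albumDao;
}

// Type-to-instance bindings configured in plain Java, comparable in spirit to a Module.
class MiniInjector {
    private final Map<Class<?>, Object> bindings = new HashMap<Class<?>, Object>();

    // The generic signature lets the compiler reject an instance of the wrong type.
    <T> MiniInjector bind(Class<T> type, T instance) {
        bindings.put(type, instance);
        return this;
    }

    // Sets every annotated field of the target to the instance bound to the field's type.
    void injectMembers(Object target) {
        for (Field field : target.getClass().getDeclaredFields()) {
            if (!field.isAnnotationPresent(InjectMember.class)) {
                continue;
            }
            Object value = bindings.get(field.getType());
            if (value == null) {
                throw new IllegalStateException("No binding for " + field.getType().getName());
            }
            try {
                field.setAccessible(true);
                field.set(target, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

The bind() calls correspond roughly to the configuration that lives in a Module: a wrong binding is a compile error rather than a typo in an XML file.
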
The next part was the integration into Stripes: I looked for available implementations but quickly found out that integrating Guice into my application was just a matter of a very small class. All I needed was dependency injection in my ActionBeans, so I decided to write that integration myself:


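import com.google.inject.Guice;
import com.google.inject.Injector;
import net.sourceforge.stripes.action.Resolution;
import net.sourceforge.stripes.controller.ExecutionContext;
import net.sourceforge.stripes.controller.Interceptor;
import net.sourceforge.stripes.controller.Intercepts;
import net.sourceforge.stripes.controller.LifecycleStage;

// Runs around ActionBean resolution so that every resolved bean gets its members injected.
@Intercepts( { LifecycleStage.ActionBeanResolution } )
public class GuiceInterceptor implements Interceptor {
    private final Injector injector;

    public GuiceInterceptor() {
        // One injector for the whole application, configured by the application's module.
        injector = Guice.createInjector(new WebgalleryModule());
    }

    @Override
    public Resolution intercept(ExecutionContext context) throws Exception {
        // Let Stripes resolve the ActionBean first, then inject its @Inject members.
        Resolution resolution = context.proceed();
        injector.injectMembers(context.getActionBean());
        return resolution;
    }
}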

The point is to write a Stripes interceptor that intercepts the ActionBean resolution lifecycle stage, hence the annotation on the class. The interceptor creates the Guice injector in its constructor and references my module directly (that’s the benefit of a custom implementation: I don’t have to care about much configuration). The main part is the intercept() method, which lets the injector inject the members into the ActionBean.
The GuiceInterceptor just has to be placed in the extension package of my application to be recognized by Stripes (since version 1.5). That package was already configured in the web.xml deployment descriptor:


    <init-param>
      <param-name>Extension.Packages</param-name>
      <param-value>org.cloudme.webgallery.stripes.extensions</param-value>
    </init-param>

Overall, the application now runs faster and uses far fewer external libraries than before. I recommend that everyone who develops for Google App Engine try to reduce the application size as much as possible by removing unnecessarily complex dependencies. With the right application design, switching between technologies can be very easy. My initial concern was that by using very GAE-specific technologies (such as Objectify) I would get stuck on that platform and never escape. That concern turned out to be irrelevant, as migrating to a different technology can be very simple. Just ensure you have sufficient test cases.

Creating iPhone webapps with Vaadin TouchKit

In this post I’d like to write down my experiences (so far) with creating iPhone webapps with Vaadin TouchKit. For those of you who are not familiar with TouchKit:

TouchKit is a tool kit that lets you develop applications that look and feel like native iPhone applications using only Vaadin.

In plain English, that means creating iPhone webapps in pure Java with an elegant, GWT-based web framework; no HTML, JavaScript or other technology is required.

Vaadin provides an Eclipse plugin, which is the recommended way of developing a Vaadin application, but that would only be half the fun. I want to go two steps further:

  • Use Maven as build / project management tool, while still using Eclipse as IDE.
  • Use Google App Engine as hosting environment.

I have already written about mavenizing a Google App Engine project.

Create a Vaadin Maven project

  1. Vaadin already has a good Maven plugin. TouchKit requires the Vaadin widgetset to be compiled. Therefore we use the Maven archetype to create a project that already includes the GWT plugin:
    
    mvn archetype:generate \
    -DarchetypeGroupId=com.vaadin \
    -DarchetypeArtifactId=vaadin-archetype-sample \
    -DarchetypeVersion=LATEST \
    -DgroupId=your.company \
    -DartifactId=project-name \
    -Dversion=1.0.0 \
    -Dpackaging=war
    
  2. Optional: run the application with mvn jetty:run. You can access the application at localhost:8080/project-name
  3. Optional: Create the Eclipse project files to import the project in Eclipse with mvn eclipse:eclipse. Of course this is not required, and instead of Eclipse you can use your IDE of choice.
  4. The created pom.xml file doesn’t reference the latest versions of Vaadin and GWT, unfortunately. For TouchKit at least Vaadin 6.3 is required. Therefore change Vaadin to version 6.3.0 and GWT to version 2.0.3:
    
    <dependency>
      <groupId>com.vaadin</groupId>
      <artifactId>vaadin</artifactId>
      <version>6.3.0</version>
    </dependency>
    <!-- This is also used by gwt-maven-plugin to deduce GWT version number. -->
    <dependency>
      <groupId>com.google.gwt</groupId>
      <artifactId>gwt-user</artifactId>
      <version>2.0.3</version>
      <scope>provided</scope>
    </dependency>
    

Add the TouchKit widgetset

  1. Add the TouchKit dependency to the classpath:
    
    <dependency>
      <groupId>org.vaadin</groupId>
      <artifactId>vaadin-touchkit</artifactId>
      <version>0.5</version>
    </dependency>
    
  2. Unfortunately, TouchKit is not available in a public Maven repository. You have two choices now: either add the TouchKit jar to your local repository using mvn install:install-file or you put it into your own remote repository. The advantage of the second option is obvious: you don’t need to install the file on all your development machines locally. Add your remote repository to pom.xml:
    
    <repository>
      <id>cloudme</id>
      <url>http://cloudme.googlecode.com/svn/maven</url>
    </repository>
    
  3. Update web.xml file and change the servlet class to org.vaadin.touchkit.mobileapplication.MobileApplicationServlet, as described here.
  4. Unfortunately, there is a bug in the current gwt-maven-plugin, and therefore a workaround is required to use the TouchKit widgetset: create your own widgetset which inherits TouchKit. It is important to set the entry point, otherwise the Maven plugin will not compile it correctly:
    
    <module>
      <entry-point class="com.vaadin.terminal.gwt.client.DefaultWidgetSet" />
      <inherits name="com.vaadin.terminal.gwt.DefaultWidgetSet" />
      <inherits name="org.vaadin.touchkit.widgetset.TouchKitWidgetset" />
    </module>
    
  5. Change the widgetset in web.xml:
    
    <init-param>
      <param-name>widgetset</param-name>
      <param-value>com.example.gwt.MyWidgetset</param-value>
    </init-param>
    

Now create a simple application using TouchKit widgets as described here, update the application parameter in web.xml and perform a clean build: mvn gwt:clean jetty:run

Enable Google App Engine

Please note that the following steps describe only the most basic steps required to run the TouchKit application on Google App Engine.

  1. Add the GAE version to the properties section of the pom.xml:
    
    <gae.version>1.3.2</gae.version>
    
  2. Add required plugins:
    
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-war-plugin</artifactId>
      <version>2.1-beta-1</version>
      <configuration>
        <webResources>
          <resource>
            <directory>src/main/webapp</directory>
            <filtering>true</filtering>
            <includes>
              <include>**/appengine-web.xml</include>
            </includes>
          </resource>
        </webResources>
      </configuration>
    </plugin>
    <!--
      The actual maven-gae-plugin. Type "mvn gae:run" to run project,
      "mvn gae:deploy" to upload to GAE.
    -->
    <plugin>
      <groupId>net.kindleit</groupId>
      <artifactId>maven-gae-plugin</artifactId>
      <version>0.5.7</version>
    </plugin>
    <!--
      Upload application to the appspot automatically, during
      release:perform
    -->
    <plugin>
      <artifactId>maven-release-plugin</artifactId>
      <configuration>
        <goals>gae:deploy</goals>
      </configuration>
    </plugin>
    
  3. And, of course, the plugin repository:
    
    <pluginRepository>
      <id>maven-gae-plugin-repo</id>
      <name>maven-gae-plugin repository</name>
      <url>http://maven-gae-plugin.googlecode.com/svn/repository</url>
    </pluginRepository>
    
  4. Create an appengine-web.xml file in WEB-INF:
    
    <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
      <application>project-name</application>
      <version>1</version>
    </appengine-web-app>
    

Now you can run the application with mvn gae:run

Mavenizing my project

I started working on a Google App Engine project the usual way: using the Eclipse plugin. Unfortunately, the update to the latest Eclipse plugin broke it and I haven’t found a fix yet. So I decided to try the maven-gae-plugin once again; with Maven everything builds better anyway, right? So far I had only made some minor tests with the plugin.

Setup

For mavenizing the project I went the safe route: I created a new project with the maven archetype plugin:


mvn archetype:create\
 -DarchetypeGroupId=net.kindleit\
 -DarchetypeArtifactId=gae-archetype-gwt\
 -DarchetypeVersion=0.5.0\
 -DgroupId=your.groupId\
 -DartifactId=your-artifactId\
 -DremoteRepositories=http://maven-gae-plugin.googlecode.com/svn/repository

The project itself was only temporary; I was only interested in the pom.xml file. I updated that file (e.g. the plugin version in the file was not the latest), removed the GWT sections as I don’t use GWT in my project, changed some minor settings and added all required dependencies. Finally I copied the pom.xml into my project directory.

Subversion trouble

Then I made a lot of modifications within Eclipse (moved sources to the standard Maven locations, removed JAR files etc.) and was about to commit everything, but unfortunately Subversion detected a bunch of tree conflicts. Bummer. As resolving these conflicts seemed hard, I decided to check out the project into another directory and change the structure from scratch, this time not using Eclipse but the Subversion command line tools. That worked perfectly.

Running

Running mvn gae:run started the development server. Startup was really smooth. By default, however, the server listens on localhost only, and in my project I need a specific IP address because I need to access the server from the iPhone, too. Therefore I had to set the gae.address property. Of course it can be defined in pom.xml, but then it is the same for all development machines, which I don’t want. Therefore it must be defined in the user’s ~/.m2/settings.xml:


  <profiles>
    <profile>
      <id>gae</id>
      <properties>
        <gae.address>192.168.178.24</gae.address>
      </properties>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>gae</activeProfile>
  </activeProfiles>

Now the server runs on the right IP address.

Debugging

Having switched to Maven, I did not want to lose the ability to debug my application. How can this be done? Easy. The plugin provides the mvn gae:debug goal. When running this goal, the development server starts in debug mode.

In Eclipse, a new debug configuration has to be created:

  1. In the Console run mvn gae:debug; Maven will compile, execute tests and start the development server in debug mode.
  2. Wait until the server waits for the remote debugger; you will see the following output: “Listening for transport dt_socket at address: 8000”. The “address” is in this case the port, which needs to be set in the Eclipse debug configuration.
  3. In Eclipse go to Run > Debug Configurations …
  4. Create a new “Remote Java Application” configuration
  5. Give it a good name, choose Connection Type “Standard (Socket Attach)” and set the Connection Properties (in my case Host: 192.168.178.24, Port: 8000)
  6. Click on “Debug”. You will now see the development server continue in the console.

Reloading webpages

Now everything works as well as with the Eclipse plugin, right? Not quite. There is one thing that doesn’t work: dynamic reload of webpages, such as JSP or CSS files. Jetty allows dynamic reload, but when using Maven, Jetty does not use the src/main/webapp folder as working directory; instead it uses its own directory somewhere in target/…

To avoid long edit / build / deploy / run cycles, the recommended way at the moment is to run mvn gae:run in one console window and mvn cli:execute in another window. The command line interface allows you to quickly execute Maven goals. Run compile war to update the webpages in the development server’s working directory.

While this is not quite as simple as with Eclipse, it is a workaround that speeds up the development process significantly.

Cache strategies for Google App Engine

Google App Engine (GAE) provides a distributed in-memory cache called Memcache. Due to the quite rigid quotas, it might be necessary to use the cache extensively.

In my example, a web based photo gallery, a lot of image scaling is performed. For example, when an album is loaded, all images are shown as thumbnails. These thumbnails are generated with the Images service, and sooner rather than later an OverQuotaException is thrown. Although image transformation can resume a few minutes after an OverQuotaException, it is still annoying for the user. My application catches the exception and shows a “sorry, over quota” default image.

In order to reduce the number of image transformations, all transformed data is put into the cache. For this I am using a very simple (custom) API consisting of a CacheProducer, which creates the data and a CacheService, which checks if the data is already in cache or needs to be created.


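import java.io.Serializable;

// Creates the data on a cache miss.
public interface CacheProducer<T> {
    T produce();
}

// Returns the cached value for the key built from params, or produces and caches it.
public interface CacheService {
    <T> T cache(CacheProducer<T> producer, Serializable... params);
    void invalidate();
}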

The Serializable... params parameter represents the components of the cache key.

Now let’s have a look at how to use this API. First we look at the PhotoDataService (which is used to get the actual image data from the datastore):


public class PhotoDataService {
...
    public byte[] getPhotoData(final Long photoId, final ImageFormat format, final ContentType type) {
        return cacheService.cache(new CacheProducer<byte[]>() {
            public byte[] produce() {
                PhotoData photoData = photoDataRepository.find(photoId);
                byte[] input = photoData.getDataAsArray();
                return imageService.process(input, format, type);
            }
        }, photoId, format, type);
    }
...
}

For this example I removed all code that is not related to caching. What you don’t see here: the CacheService first checks whether the data has already been cached. If not, the produce() method is called to produce the data, and the CacheService puts it into the cache and returns it. With this approach, I don’t need to care about the actual cache mechanisms, and I reduce clutter in the actual business logic. The cache API is reusable and can be used for any other data and key parameters.
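
The lookup-then-produce behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the class InMemoryCacheService is hypothetical, a local map stands in for Memcache, and the actual CacheService implementation of the application is not shown in this post. The two interfaces are repeated (without the public modifier) so that the sketch is self-contained:

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface CacheProducer<T> {
    T produce();
}

interface CacheService {
    <T> T cache(CacheProducer<T> producer, Serializable... params);
    void invalidate();
}

// Assumption: a process-local map replaces Memcache for the purpose of this sketch.
class InMemoryCacheService implements CacheService {
    private final Map<List<Serializable>, Object> cache =
            new ConcurrentHashMap<List<Serializable>, Object>();

    @SuppressWarnings("unchecked")
    public <T> T cache(CacheProducer<T> producer, Serializable... params) {
        // The key is the ordered list of parameters; lists compare element by element.
        List<Serializable> key = Arrays.asList(params.clone());
        Object value = cache.get(key);
        if (value == null) {
            // Cache miss: let the producer create the data, then remember it.
            value = producer.produce();
            if (value != null) {
                cache.put(key, value);
            }
        }
        return (T) value;
    }

    public void invalidate() {
        // Drops all entries, so the next call for any key produces the data again.
        cache.clear();
    }
}
```

With such an implementation, two calls with the same key parameters invoke produce() only once; a Memcache-backed version would follow the same flow but store the value in the distributed cache instead of a map.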

However, the first run still produces a lot of requests to the Images service and here OverQuotaExceptions are still thrown. Next I’ll explain how to use a background task to prepopulate the cache in order to reduce the exceptions even further.

First run Maven GAE Plugin

Until now I used the Eclipse plugin to develop and test Google App Engine applications. Today I tried the Maven GAE Plugin, in order to become more flexible and independent from Eclipse. My first observations were:

  • As I started with the Maven GAE Archetype, the package declaration of the generated code did not match the actual file location. That had to be corrected first.
  • I changed the compiler settings from 1.6 (default settings of the archetype) to 1.5 and had to remove some @Override annotations.
  • The gae.home property is undefined (which makes sense), but I had to define it to run the development server.

Overall, I’m quite pleased with the first run. Although Maven adds a lot of overhead, and the actual build process took a bit too long initially to download all dependencies, the overall impression is very good.

First Google App Engine application

As a coding exercise, I started writing a small application for the Google App Engine environment. It is a very simple web photo gallery, managing photos in albums. The main purpose of the application was to get my hands on some new frameworks, and also learn about deployment on the Google App Engine Environment.

You can look at the application here: http://photos.moritzpetersen.de and browse the source code here.

The application’s architecture is a simple 3-tier architecture:

  • The repository tier is responsible for persistence. I chose JDO as ORM technology.
  • The service tier is responsible for handling business logic. I tried to encapsulate as much logic as possible here. There are some exceptions, but I wouldn’t consider them business logic: for some features I implemented JSP EL functions (e.g. showing the right copyright date in the footer of each page).
  • The web tier manages the web requests and the interaction with the forms. As web framework I chose Stripes, which was a good choice. From the Stripes framework, I used templating, validation and error handling, and file upload.

All layers are stitched together using Spring 3.0. Here I mostly used annotation-based configuration of the container. I also used the JdoTemplate to easily implement the JDO repositories. Using Spring was a lot of fun, as it allowed me to implement some parts of the application very generically at first: the service tier was completely generic for the first CRUD use cases, and I could easily extend the generic services later. Spring simply stitched together the right classes (emitting a lot of warnings, as I use(d) autowiring by type).
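
What “generic first, extended later” can look like is sketched below in plain Java. This is an illustration only: it uses neither Spring nor the JdoTemplate, and all names (Repository, MapRepository, GenericService, Album, AlbumService) are hypothetical rather than taken from the application:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Generic persistence contract shared by all entity types.
interface Repository<T> {
    void save(Long id, T entity);
    T find(Long id);
    Collection<T> findAll();
    void delete(Long id);
}

// Assumption: a map-backed repository stands in for a JDO repository.
class MapRepository<T> implements Repository<T> {
    private final Map<Long, T> store = new LinkedHashMap<Long, T>();

    public void save(Long id, T entity) { store.put(id, entity); }
    public T find(Long id) { return store.get(id); }
    public Collection<T> findAll() { return new ArrayList<T>(store.values()); }
    public void delete(Long id) { store.remove(id); }
}

// Covers the plain CRUD use cases for any entity type.
class GenericService<T> {
    protected final Repository<T> repository;

    GenericService(Repository<T> repository) { this.repository = repository; }

    void save(Long id, T entity) { repository.save(id, entity); }
    T find(Long id) { return repository.find(id); }
    Collection<T> findAll() { return repository.findAll(); }
    void delete(Long id) { repository.delete(id); }
}

class Album {
    final String title;

    Album(String title) { this.title = title; }
}

// Later extension: entity-specific logic on top of the inherited CRUD methods.
class AlbumService extends GenericService<Album> {
    AlbumService(Repository<Album> repository) { super(repository); }

    int countWithTitlePrefix(String prefix) {
        int count = 0;
        for (Album album : repository.findAll()) {
            if (album.title.startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }
}
```

In the sketch the repository is passed to the service by hand; in the application described here, wiring the right classes together was Spring’s job.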

For security I used the GAE specific application security configured in the deployment descriptor. That caused the first problem when deploying the application to the App Engine: I had to use my Gmail user, not my Google Apps user for administration of the application – and I’m still not sure what the actual problem was (but maybe I should just RTFM).

The next problem was also caused by my laziness: I didn’t create any index definitions, and promptly the application threw a lot of errors because of missing indexes. Again: I had better RTFM.

But after a while, GAE automatically generated indexes and the application ran without throwing exceptions. But I immediately noticed – confirmed by a quick look at the administration dashboard: the application is too slow. It consumes a lot of CPU time for scaling images (e.g. creating thumbnails). As a quick next step I implemented a cache layer. Architecturally, the cache layer itself is a service used by the PhotoService. I am not quite confident about this implementation, but it does its job well. My original idea was to implement the cache layer as a layer between repository tier and service tier, or implement it directly into the repository tier. But during development I figured out that the cache is quite closely related to the business logic (scaling of images) and not to the persistence.

All in all, I am quite satisfied with web application development on the GAE platform. The development environment is very easy to use and deployment is almost trivial. It is very easy to bring an application into production. On the other hand, my initial impression of the performance is quite mixed: loading pages takes very long, and measuring the performance indicates that the GAE platform itself is not very responsive.