March 25, 2015

Multiple runtimes in a single Google App Engine project with Modules

There is no single tool to rule them all. Each Google App Engine runtime has its strengths, but sometimes you can't afford to reinvent the wheel and reimplement something that already exists as a ready-to-use library, just because of the language you chose for the project in the first place. With the App Engine Modules feature, you can mix and match different programming languages that share the same data in a single Google Cloud Platform project.

In this post, we discuss what the Modules feature is and how to leverage its power with a sample application that mixes Go and Python.

For the impatient, here is the source code: https://github.com/ronoaldo/appengine-multiple-runtimes-sample

What are Modules?

Google Cloud Platform has a top-level organizational unit called a Project. Each project comes pre-configured with an App Engine application. By creating a project, you get access to a Cloud Datastore database, Memcache, and Task Queues, all ready to use.

In the early days of App Engine, your app could have multiple versions, but there was no option to configure backend machines. Once the now deprecated Backends configuration was released, there was no way to deploy multiple versions of them.

With the Modules feature, you can split your application and give each module its own configuration for performance, scaling, and versioning, all under the same Project. This lets you mix different runtimes while still sharing state: all modules have access to the same resources, such as Datastore and Memcache. For instance, you can create a backend in Java that generates reports with App Engine Map/Reduce, while you use Go in the auto-scaled frontend, also called the default module.

That's just what I want! How do I use this stuff?

You can have any setup you want depending on your codebase. The Modules API just requires you to specify the default module, plus any number of non-default modules, when running and deploying.

You specify the module name using the module: directive in app.yaml, or the <module> XML node in the appengine-web.xml configuration file. A file that does not contain any module directive is treated as the default module; you can also explicitly set module: default, which I recommend, to avoid mistakes. To create other modules, all you need to do is configure their names in separate configuration files. For Java, you can also use the standard project layout of an EAR (Java Enterprise Application) and set the configuration options under each application folder (one per module).

My suggested project layout has a separate folder for each module. For instance, our sample application has this layout:

project/
  go-frontend/
    app.yaml
    main.go
  py-backend/
    app.yaml
    main.py

In the go-frontend folder, we put our App Engine app in Go. Then, in the py-backend folder, we put our backend logic in Python. To make all this work, you have to:

  1. Make sure that go-frontend/app.yaml has the module: default directive.
  2. Make sure that py-backend/app.yaml has the module: py-backend directive (or any other name you want, except the word default).
  3. Make sure that both app.yaml files have the same application: element value.
  4. Use the gcloud command to install the App Engine SDKs.
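
A quick way to catch mistakes in the first three items before running anything is a small pre-flight check. What follows is only a sketch: the application ID my-app used in the usage note is a placeholder, top_level_keys and check_modules are hypothetical helper names of mine, and the parser reads only unindented key: value lines instead of real YAML.

```python
import re

# Matches an unindented "key: value" line, e.g. "module: default".
_TOP_LEVEL = re.compile(r"^([A-Za-z_][\w-]*):\s*(.*?)\s*$")


def top_level_keys(app_yaml_text):
    """Return the top-level scalar keys of an app.yaml as a dict.

    Simplification: this is not a YAML parser. It only reads unindented
    "key: value" lines, which is enough for application: and module:.
    """
    keys = {}
    for line in app_yaml_text.splitlines():
        line = line.split(" #", 1)[0]  # drop trailing comments
        match = _TOP_LEVEL.match(line)
        if match and match.group(2):
            keys[match.group(1)] = match.group(2)
    return keys


def check_modules(configs):
    """Check a {folder name: app.yaml text} mapping against the checklist.

    Returns a list of problems; an empty list means the layout looks fine.
    """
    problems = []
    parsed = {folder: top_level_keys(text) for folder, text in configs.items()}

    # A file without a module: directive counts as the default module.
    names = {f: keys.get("module", "default") for f, keys in parsed.items()}
    defaults = sorted(f for f, name in names.items() if name == "default")
    if len(defaults) != 1:
        problems.append("expected exactly one default module, found: %s" % defaults)

    if len(set(names.values())) != len(names):
        problems.append("module names must be unique: %s" % sorted(names.values()))

    # Every app.yaml must carry the same application: value.
    app_ids = {keys.get("application") for keys in parsed.values()}
    if len(app_ids) != 1 or None in app_ids:
        problems.append("all app.yaml files must share one application: value")
    return problems
```

You would feed it the contents of go-frontend/app.yaml and py-backend/app.yaml, both declaring for example application: my-app, and expect an empty list back.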

With that setup, you can then run the app locally using the command:

$ gcloud preview app run go-frontend py-backend

The gcloud preview app subcommand will then parse the configuration files in those folders and launch the modules locally. The default module runs on port 8080 and py-backend on port 8081, so you can address each one by its port number.

To deploy, you can use:

$ gcloud preview app deploy go-frontend py-backend

With this setup, you have some options:
  1. Each module can live in a separate source code repository, making it easy to manage their lifecycles independently of each other.
  2. You can deploy a single module by passing only its folder on the command line.

Some tips:
  1. Remember that all modules share the same state but do not share code, so be careful when changing the datastore schema. For instance, if your Go struct saves more properties than your Python db.Model declares, you may lose data when the Python side loads and rewrites the entity! In that case, either use something like a raw db.Expando (Python) or a datastore.PropertyList (Go) to manipulate the data in the non-default module, or always keep your models in sync between the codebases.
  2. When deploying, you may need to convert your app performance settings to modules. That means you can no longer change the instance class in the Application Settings; instead, you need to specify it in the app.yaml / appengine-web.xml files. (See the modules documentation on how to migrate.)
  3. This scenario is particularly useful to run App Engine Pipelines or App Engine Map/Reduce jobs using the Python runtime backend while keeping the front-end in Go.
  4. Some settings are global to your application, such as the task queue, cron, and dispatch definitions, so keep the queue.yaml, cron.yaml, and dispatch.yaml files only in the default module. If you have three queues defined in the go-frontend folder and only one in the py-backend folder, then deploying only py-backend leaves just one queue on the server, and this can break things, particularly for the cron job definitions.
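
To make the first tip concrete, here is a toy simulation in plain Python, with dictionaries standing in for entities. It does not use the Datastore API at all: load_fixed, load_open, and save are hypothetical helpers, and the entity contents are made up. They only mimic the behavior described above, where a fixed-schema model keeps just the properties it declares, an Expando-like load keeps everything, and a save overwrites the whole entity.

```python
def load_fixed(stored, declared):
    """Mimic a fixed-schema model: keep only the declared properties."""
    return {name: stored[name] for name in declared if name in stored}


def load_open(stored):
    """Mimic an Expando / PropertyList style load: keep every property."""
    return dict(stored)


def save(entity):
    """Mimic a put: the stored entity is replaced by what the caller holds."""
    return dict(entity)


# Entity as written by the Go frontend, with three properties.
stored = {"title": "Q1 report", "owner": "ana", "tags": ["sales"]}

# The Python model declares only two of them.
declared = ["title", "owner"]

# Fixed schema: "tags" is dropped on load, so the save erases it.
fixed = load_fixed(stored, declared)
fixed["title"] = "Q1 report (final)"
after_fixed = save(fixed)

# Open schema: "tags" rides along untouched and survives the save.
open_entity = load_open(stored)
open_entity["title"] = "Q1 report (final)"
after_open = save(open_entity)
```

The only difference between the two paths is what the loader keeps, which is why an open-ended model, or models kept strictly in sync, avoids the problem.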

If I have custom domains, how can I make this work?

You can use wildcard addresses to access the backends. Let's say you have a wildcard mapping for your custom domain like this:

*.mydomain.com CNAME ghs.googlehosted.com

App Engine will then route requests as follows:

www.mydomain.com is handled by the default version of go-frontend
- since no module is named www, and no version subdomain was specified, this falls into the default module, default version

default.mydomain.com is handled by the default version of go-frontend
- matches the default module/default version

py-backend.mydomain.com is handled by the default version of py-backend
- matches the py-backend module/default version

beta.py-backend.mydomain.com is handled by version beta of py-backend, if it exists
- matches the py-backend module/beta version

Any string that does not match a module is handled by the default module, and any string that does not match a version is handled by the default version of the module. The same applies to the *.your-app-id.appspot.com domain that your app gets for free.
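
The fallback rules above can be written down as a small function. This is a simplified model of the cases listed in this post, not App Engine's actual router: the resolve name, the modules mapping, and the version names are all made up for illustration, and None stands for "the default version of that module".

```python
def resolve(hostname, domain, modules):
    """Return (module, version) for a hostname under a wildcard domain.

    modules maps each module name to the set of its deployed versions.
    A version of None means the module's default version.
    """
    if hostname == domain:
        return "default", None

    suffix = "." + domain
    if not hostname.endswith(suffix):
        raise ValueError("%s is not under %s" % (hostname, domain))

    # "beta.py-backend.mydomain.com" -> ["beta", "py-backend"]
    labels = hostname[:-len(suffix)].split(".")

    # The label next to the domain names the module, if it matches one.
    module_label = labels[-1]
    if module_label not in modules:
        return "default", None

    # The label before it names a version, if that version exists.
    version = None
    if len(labels) >= 2 and labels[-2] in modules[module_label]:
        version = labels[-2]
    return module_label, version


# Hypothetical deployment: py-backend has an extra "beta" version.
MODULES = {"default": {"1"}, "py-backend": {"1", "beta"}}
```

For example, resolve("beta.py-backend.mydomain.com", "mydomain.com", MODULES) gives ("py-backend", "beta"), while resolve("www.mydomain.com", "mydomain.com", MODULES) falls back to ("default", None).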

You can also configure rules that change this behavior using the dispatch.yaml (dispatch.xml) file. This file lets you specify wildcard URL patterns and the module that serves each one. For instance, you can map */report/* so that requests to any hostname with a path starting with /report/ are served by your Python backend.

Note that if you dispatch */report/* to the Python backend, its handlers receive requests with the full path, including the /report/ prefix, and not a path relative to the root. Therefore, you need to map your WSGI app using that path as a prefix. See more about routing requests to modules in the modules documentation. Check out the dispatch.yaml in the sample code as well.
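
To illustrate the prefix point, here is a minimal sketch of a backend handler written as a plain WSGI callable using only the standard library. I am assuming a dispatch rule like */report/* pointing at py-backend; the response text and the call_app helper are hypothetical, and a real app would more likely use a framework's router configured with the same prefix.

```python
from wsgiref.util import setup_testing_defaults

# The dispatched path arrives unchanged, so the app must expect the prefix.
PREFIX = "/report/"


def application(environ, start_response):
    """Serve paths under /report/ and answer 404 for anything else."""
    path = environ.get("PATH_INFO", "")
    if not path.startswith(PREFIX):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]

    # Everything after the prefix identifies the report.
    report_name = path[len(PREFIX):] or "index"
    body = ("report: %s\n" % report_name).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call_app(path):
    """Invoke the app in-process and return (status, body) for a GET."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

Calling call_app("/report/daily") returns a 200 response, while call_app("/daily") returns a 404, which is what you would see if the handlers were mapped at the root path by mistake.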

Conclusion

With Modules, your App Engine applications can be built on a more robust infrastructure that mixes different programming languages and runtimes. If you work with microservices on App Engine, this is a great fit: you can make better use of your app's resources by picking the right language and tools for each job.

--
Update 25/03/2015 - Fixed grammar errors, and added clarifications to first paragraphs.