September 4, 2013

Mapping multiple domains to Google AppEngine with nginx reverse-proxy on GCE

If you work with Google AppEngine and want to map a custom domain to your application, you must also configure Google Apps for that domain. If you have an app with a white label option, your customers also have to configure Google Apps in order to use your app under their domain. Not only that, they must also manage the Google Apps setup, outside your application.

There are two issues on the public issue tracker that you may want to star and watch. One is about creating an API to manage the domains: https://code.google.com/p/googleappengine/issues/detail?id=8528 and the other is about the problem itself: https://code.google.com/p/googleappengine/issues/detail?id=2587. I recommend starring both, so we tell the Google AppEngine team that this is important!

But if you simply can't wait for them to be fixed, you can fix it yourself by building your own reverse-proxy solution and making it scale with your app. With a simple nginx proxy setup running on a GCE instance, a round-robin DNS mapping and a small set of changes to your source code, you can accomplish that with very little effort. As a bonus, those proxies can be useful for other things too, like another level of caching that can save you some instance hits.

Initial setup

First of all, I'm assuming that you already have an AppEngine application with billing enabled, and that you have created a Google Cloud Console project for your application. If you don't have this already, sign up for AppEngine, set up billing on your app and set up billing on the Google Cloud Console for your application. This will allow you to use all the resources required.

If your AppEngine application doesn't have a Cloud Console project yet, go to your AppEngine Dashboard -> Application Settings -> Cloud Integration and click the "Add Project" button. After that procedure takes place, you will see a box like this one:

After that, log in to the Google Cloud Console at https://cloud.google.com/console, select the project with your application, go to Compute Engine -> VM Instances and click on the "New Instance" button. I recommend using the Debian Wheezy 7.0 image as a starting point, as it is the current long-term stable release of Debian, and because Debian is the new default image to use in GCE.

In the new instance screen, give it a name and a description. I also recommend labeling it with tags, which helps you manage more than one instance at once. I've added the tags nginx, proxy and http. Later, we will use some of these tags to configure routes and firewalls.



It is also recommended that you choose "New static ip address..." in the Networking section, which gives the proxy a fixed address and makes it easier to configure.

To finish your instance setup, click the "Create" button and wait until your instance is ready.

Installing and configuring nginx

Once your instance boots, you are ready to configure the software that will allow it to proxy requests to your AppEngine application. I chose nginx because it is a lightweight and highly scalable HTTP server. You can also do this with Apache or even Squid if you wish, but nginx scales more easily and its configuration is very simple.

Log in to your new instance via SSH, and issue the following commands:

sudo bash
apt-get update && apt-get upgrade
apt-get install nginx unattended-upgrades

The first command starts a new bash session as root, as we will need to install and configure everything as the superuser. The second line updates the package lists and installs the pending upgrades, including security fixes. The last line installs our HTTP server, nginx, with its default configuration, plus the unattended-upgrades package, which automatically installs security patches.

Next, let's configure nginx to proxy requests from www.example.com to our application. AppEngine has some wildcard domains in place for every app. Any requests reaching *.default.your-app-id.appspot.com will reach the default version of your application. With that in mind, we can easily map www.example.com on nginx to proxy www.example.com.default.your-app-id.appspot.com.
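To make the naming rule concrete, here is a minimal Java sketch of the mapping described above. The class UpstreamHost and its method are hypothetical names used only for illustration, and your-app-id is a placeholder for your own application id.

```java
import java.util.Locale;

// Hypothetical helper, not part of the original setup: it mirrors the
// "<custom domain>.default.<app id>.appspot.com" naming rule described above.
final class UpstreamHost {

  private UpstreamHost() {}

  // appId is a placeholder for your own AppEngine application id.
  static String forCustomDomain(String customDomain, String appId) {
    if (customDomain == null || customDomain.trim().isEmpty()) {
      throw new IllegalArgumentException("customDomain must not be empty");
    }
    // Host names are case-insensitive, so normalize before concatenating.
    String host = customDomain.trim().toLowerCase(Locale.ROOT);
    return host + ".default." + appId + ".appspot.com";
  }

  public static void main(String[] args) {
    // Prints: www.example.com.default.your-app-id.appspot.com
    System.out.println(forCustomDomain("www.example.com", "your-app-id"));
  }
}
```

The proxy itself does not need this code; it only shows which appspot.com name a given customer domain ends up on.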

To put this into our newly installed nginx, let's create a file named /etc/nginx/conf.d/appspot.conf

Listing 1: /etc/nginx/conf.d/appspot.conf

server {
  listen 80;
  location / {
    resolver 8.8.8.8;
    proxy_set_header X-Forwarded-Host $host;
    proxy_set_header X-Forwarded-Server $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Real-IP $remote_addr;
    # $request_uri already starts with "/", so no extra slash is needed here.
    proxy_pass http://$host.default.appid.appspot.com$request_uri;
    proxy_redirect http://$host.default.appid.appspot.com/ /;
  }
}

Let's check this configuration:

  • First, we create an nginx "server" listening on port 80.
  • In that server, we set the resolver to the Google DNS at 8.8.8.8, because nginx has to resolve the upstream name at request time when it is built from variables.
  • Then, we configure the proxy to add some headers. I'm setting all of them to be safe, but they may not all be needed.
  • Finally, the proxy_pass directive routes each request to your app, and when your app returns a redirect, proxy_redirect rewrites the Location header so the internal appspot.com name stays hidden.
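The redirect rewriting in the last step can be modeled in a few lines of Java. This is only a sketch of the prefix replacement that proxy_redirect performs on the Location header, not nginx's actual implementation; RedirectRewriter is a hypothetical name.

```java
// Hypothetical model of the prefix replacement done by proxy_redirect.
final class RedirectRewriter {

  private RedirectRewriter() {}

  // If the Location value starts with the upstream prefix, swap that prefix
  // for the replacement; otherwise return the value untouched.
  static String rewriteLocation(String location, String upstreamPrefix, String replacement) {
    if (location != null && location.startsWith(upstreamPrefix)) {
      return replacement + location.substring(upstreamPrefix.length());
    }
    return location;
  }

  public static void main(String[] args) {
    String upstream = "http://www.example.com.default.appid.appspot.com/";
    // Prints: /login
    System.out.println(rewriteLocation(upstream + "login", upstream, "/"));
  }
}
```

A redirect that points to some other site does not match the prefix, so it passes through unchanged.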

Configure instance firewalls

To make your proxy available to the public, let's set up a firewall rule in the instance network that allows incoming traffic from any host to port 80.

In the Compute Engine Console, click on the VM Instances section, and then click on your recently launched instance. On the instance details page, you'll see the Network section, with a link to the network your instance is attached to. Click on that link and, in the network configuration, under Firewall, create a new rule that allows incoming TCP traffic on port 80 from any source. I used the tag http as the target of the rule, so the setting applies to all instances with that tag. Very handy!

Tip: for better control over your instances, you can create a new network under the Network section of the GCE console and then select that network when creating each instance, so all your proxies share the same setup.

Gotchas

As you may expect, there is always a trade-off! This setup may cause some side effects:
  1. UserService.createLoginUrl() always redirects the user to the proxied address under appspot.com.
  2. To your app, the remote address will always be the proxy IP and the request hostname will be under appid.appspot.com.
You can solve #1 by replacing the login management with a tool like openid4java, or something similar that achieves the same goal. #2 is discussed below, and also requires some code changes.
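For the remote address part of #2, a small helper can read the headers set in Listing 1. The sketch below is an assumption-laden example: ClientAddress is a hypothetical name, and it assumes that nothing between nginx and your app rewrites X-Real-IP or X-Forwarded-For. Only trust these headers for requests that really came through your own proxy, since a client that reaches appspot.com directly can send any value it likes.

```java
// Hypothetical helper: picks the client address from the proxy headers.
final class ClientAddress {

  private ClientAddress() {}

  // xRealIp and xForwardedFor are the raw header values (may be null);
  // remoteAddr is what the servlet container reports, i.e. the proxy IP.
  static String resolve(String xRealIp, String xForwardedFor, String remoteAddr) {
    // X-Real-IP is set by nginx from $remote_addr, so prefer it.
    if (xRealIp != null && !xRealIp.trim().isEmpty()) {
      return xRealIp.trim();
    }
    if (xForwardedFor != null) {
      // $proxy_add_x_forwarded_for appends the address nginx saw to whatever
      // the client sent, so the last non-empty entry is the one nginx added.
      String[] hops = xForwardedFor.split(",");
      for (int i = hops.length - 1; i >= 0; i--) {
        String hop = hops[i].trim();
        if (!hop.isEmpty()) {
          return hop;
        }
      }
    }
    // No proxy headers: fall back to the address seen by the container.
    return remoteAddr;
  }
}
```

In a servlet you would call it as ClientAddress.resolve(request.getHeader("X-Real-IP"), request.getHeader("X-Forwarded-For"), request.getRemoteAddr()).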

Update your code when you need the real hostname

You may already have code that maps the request hostname to your customer data and resources, for example to customize a logo or a stylesheet. In that case, you have to tweak your code to read the request headers we set in nginx instead of relying only on the values of the current request.

If your web framework allows you to add middleware classes, like Django, or a servlet Filter, like Java EE, then you can use this to intercept requests, check for the proxy headers, and wrap the request methods that return the hostname or request URL so they use the values from the proxy. See below for a sample Java Filter that will do the trick (Listing 2):

Listing 2: ProxyFilter.java

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.*;

public class ProxyFilter implements Filter {

  public static final String HEADER = "x-forwarded-server";

  // Static nested class: the wrapper does not need a reference to the filter.
  public static class ProxyRequestWrapper extends HttpServletRequestWrapper {

    public ProxyRequestWrapper(HttpServletRequest request) {
      super(request);
    }

    @Override
    public String getServerName() {
      return getHeader(HEADER);
    }

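    @Override
    public StringBuffer getRequestURL() {
      // Rebuild the URL around the public hostname sent by the proxy.
      StringBuffer url = new StringBuffer(getScheme());
      url.append("://").append(getServerName());
      int port = getServerPort();
      // Leave out the default ports so the URL matches what the browser requested.
      boolean defaultPort = ("http".equals(getScheme()) && port == 80)
          || ("https".equals(getScheme()) && port == 443);
      if (!defaultPort) {
        url.append(":").append(port);
      }
      url.append(getRequestURI());
      return url;
    }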
  }

  @Override
  public void doFilter(ServletRequest proxyRequest,
             ServletResponse proxyResponse,
             FilterChain chain) throws IOException, ServletException {
    HttpServletRequest request = (HttpServletRequest) proxyRequest;
    
    String header = request.getHeader(HEADER);
    if (header != null && !header.trim().isEmpty()) {
      request = new ProxyRequestWrapper(request);
    }

    chain.doFilter(request, proxyResponse);
  }

  @Override
  public void init(FilterConfig config) {}

  @Override
  public void destroy() {}
}

You can make similar changes using a Django middleware. Please refer to the Django documentation for details.

Scaling your reverse-proxies 

To scale your proxies, you can add more instances, each with its own reserved IP, and use round-robin DNS as a simple way to achieve load balancing.

You just need to add multiple A records for a subdomain like lb.myapp.com, one for each proxy IP. Then, when mapping the example.com domain to your app, your customers can simply create a CNAME record pointing to lb.myapp.com. The requests will be distributed over your proxies using the round-robin algorithm. Please refer to your DNS hosting documentation for details, and to check whether your provider supports it.
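If you have never seen round robin at work, the idea fits in a few lines of Java. This is a toy model of the rotation, not how any particular DNS server implements it; RoundRobinPool is a hypothetical name and the addresses are documentation placeholders. In practice, resolver caching means the real distribution is only approximately even.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of round-robin rotation over the A records of lb.myapp.com.
final class RoundRobinPool {

  private final List<String> addresses;
  private final AtomicInteger next = new AtomicInteger();

  RoundRobinPool(List<String> addresses) {
    if (addresses == null || addresses.isEmpty()) {
      throw new IllegalArgumentException("at least one address is required");
    }
    // Defensive copy so later changes to the caller's list do not affect us.
    this.addresses = new ArrayList<>(addresses);
  }

  // Returns the addresses in order, wrapping around at the end of the list.
  String nextAddress() {
    // floorMod keeps the index non-negative even if the counter overflows.
    int index = Math.floorMod(next.getAndIncrement(), addresses.size());
    return addresses.get(index);
  }
}
```

With two proxies in the pool, successive calls to nextAddress() alternate between them, which is roughly what your customers' visitors experience as they resolve lb.myapp.com.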

Tip: Since this nginx setup is very simple, you can write a small startup script, host it on Google Cloud Storage, and use it to start new proxies more easily: just call the gcutil command with the startup script URL as a parameter! After your new instance boots, just add its reserved IP as a new A record for lb.myapp.com. This is not automatic scaling, but it makes it very easy to start and stop proxies, and also to remove nodes from your system.

Conclusion

This procedure can also be used on other cloud providers, like Amazon AWS or Digital Ocean, or any other provider with Debian Wheezy support. These configuration steps may also work on other distributions or nginx versions, although I haven't tested that.

I hope you enjoy these tips, and if you run into problems, drop a comment below and we can help you out!

Happy Hacking!