March 31, 2015

Building a custom Linux kernel the Debian way

Building a Linux kernel is a nice way to learn some new skills, run a fresh version of your operating system's core software, and apply any optimizations you see fit for your machine. However, building a kernel is not a trivial task: between downloading, configuring, building, and installing, it is easy to get lost if you don't follow a few guidelines.

In this blog post, we are going to understand some basic steps on how to build a custom kernel from sources, using some tools available in the Debian GNU/Linux operating system. If you use any other Debian derivative, like Ubuntu or Mint, this tutorial should work as well.

Why build a custom Kernel?

First and most important: because you can! Secondly: because it is fun! And, of course, because you may need a newer kernel to support your hardware, or may need to rebuild the kernel to enable specific features.

It is also a good experience that lets you hack on one of the biggest open source projects, browse all the available configuration options, and disable those you may never use, producing a slim and optimized kernel for your machine.

Understanding the process

In Debian and derivatives, all packages are installed by the dpkg tool, or via the apt command line and its friends. These tools ensure that every file installed by a package is tracked by the system, allowing you to properly upgrade or remove it later. If you install something in Debian without using dpkg/apt, you risk compromising your system and any future upgrades.

Building a Debian package is a complex task by itself: you need to follow a set of rules to do it right. But for some commonly used software, Debian also provides tools that automate the process of generating an installable .deb file out of a source tarball. This is true for the JDK, as well as for the kernel.

To build a kernel from sources and generate a Debian package using the tools available in the repository, you need to install the kernel-package package, which provides the make-kpkg command:

# apt-get install kernel-package build-essential

The process to build the kernel is:
  1. Download the sources from http://www.kernel.org/
  2. Extract the source tarball: 
    • tar xf linux-3.19.3.tar.xz
  3. Configure the kernel options, enabling/disabling features:
    • cd linux-3.19.3 && make menuconfig
  4. Compile all sources to generate the kernel binary and modules:
    • make-kpkg --revision=~custom1 -j 4 binary_image
This process is simple, except for the configuration step: if you don't know how to configure the kernel, you can easily get lost in the make menuconfig interface. Since kernel configuration is hard to properly understand and master, let's use another make target to do that in a more automated way. Instead of make menuconfig, use make olddefconfig: it takes your current configuration as a base (copy the running kernel's config from /boot/config-$(uname -r) to .config first) and updates it to match the new kernel, using the default values for any new options.
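If you prefer to see the whole flow at once, here is a minimal sketch of the sequence, assuming the 3.19.3 tarball and a config file for the running kernel available under /boot:

$ tar xf linux-3.19.3.tar.xz
$ cd linux-3.19.3
$ cp /boot/config-$(uname -r) .config    # start from the running kernel's configuration
$ make olddefconfig                      # accept defaults for options new to this release
$ make-kpkg --revision=~custom1 -j 4 binary_image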

Helper script

This process can be automated even further using a script I wrote called kernelbuild. The script lives with the other scripts I host in the tools repository on my Bitbucket account. To build the newest stable kernel version, all you need to do is:

$ kernelbuild --fast --safe-config

That's it. The script downloads the latest stable kernel from www.kernel.org, unpacks the source in the current directory, automatically configures it based on your currently running kernel (using the defaults for newer build options), and then compiles it using the make-kpkg tool. This generates all the .deb files for you: the kernel image, a linux-headers package, and a package with debug symbols. Super easy!

How the script works

The script uses wget to download the kernel tarball for you. It parses the kernel.org HTML output to find the latest stable version and downloads that version's tarball. You can skip the autodetection and download a specific version with the -k switch.
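For instance, to fetch and build the 3.19.3 release explicitly (assuming -k takes the version number as its argument; check the script's help if in doubt):

$ kernelbuild --fast --safe-config -k 3.19.3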

Once downloaded, the script unpacks the source tarball. After that step, it runs the make olddefconfig kernel configuration target. Although simple, this step can be awkward if the kernel you are running is very different from the one you are building: you may be prompted to set some options by hand, and if unsure, just press ENTER to accept the default value.

The next step is to build the kernel. The script passes a few parameters to make-kpkg, such as a custom version tag (~custom1). This version number uses a tilde because, in Debian version ordering, the tilde sorts before everything else, so your kernel package will be upgraded by the official Debian kernel release once it is available in the repositories. The script also uses some other build options, like the -j flag, which enables parallel builds using all available CPU cores.
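You can check Debian's tilde ordering yourself with dpkg, confirming that the official 3.19.3 release sorts higher than our custom package version:

$ dpkg --compare-versions "3.19.3~custom1" lt "3.19.3" && echo "official release wins"
official release wins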

Installing your new Kernel

To install your Debian kernel packages, all you need is the dpkg tool. Use ls to see the .deb files built, and install the image and headers packages. If you built kernel version 3.19.3, install it with:

# dpkg -i linux-image-3.19.3_3.19.3~custom1_amd64.deb
# dpkg -i linux-headers-3.19.3_3.19.3~custom1_amd64.deb

This installs the packages built by the script, and you can then reboot your machine to use the new kernel. The make-kpkg tool builds packages that contain all the triggers needed to update/generate the initrd file (necessary to boot) and to add the appropriate GRUB menu entries.
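After the reboot, you can confirm that the new kernel is the one running:

$ uname -r
3.19.3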

Conclusion

Building the Linux kernel is a very interesting task that can help you better understand how an operating system works and learn more about the options available to optimize and customize your machine's software.

In this post, we saw some tips to build a kernel suitable for use on a Debian-based distribution, and how you can make use of a helper script to generate a Debian package that integrates nicely with the operating system.

Disclaimer: I have tested the scripts and commands provided in this post on multiple machines without issues. The source code is provided as is, without any warranties, and I am not responsible for any damage that you cause to your software or hardware when using it. Use at your own risk.

Happy hacking!

March 25, 2015

Multiple runtimes in a single Google App Engine project with Modules

There is no single tool to rule them all. Each Google App Engine runtime has its strengths, but sometimes you can't afford to reinvent the wheel and reimplement something that is ready to use in a library, just because of the language you chose for the project in the first place. With the App Engine Modules feature, you can mix and match different programming languages, sharing the same data, in a single Google Cloud Platform project.

In this post, we discuss what the Modules feature is and how to leverage its power with a sample application that mixes Go and Python.

For the impatient, here is the source code: https://github.com/ronoaldo/appengine-multiple-runtimes-sample.

What are Modules?

Google Cloud Platform has a top-level organizational unit called a Project. Each project comes preconfigured with an App Engine application. By creating a project, you get access to a Cloud Datastore database, Memcache, and Task Queues, all ready to use.

In the early days of App Engine, your app could have multiple versions, but there was no option to configure backend machines. And when the now-deprecated Backends feature was released, there was still no way to deploy multiple versions of them.

With the Modules feature, you can split your application into pieces, each with its own performance, scaling, and version settings under the same Project: one set of configurations per module. This lets you mix different runtimes while still sharing state in the same project, since all modules have access to the same resources, such as the Datastore and Memcache. You can, for instance, write a backend in Java that generates reports with App Engine Map/Reduce, while the frontend, also called the default module, is written in Go and auto-scales.

That's just what I want! How do I use this stuff?

You can have any setup you want, depending on your codebase. The Modules API just requires you to specify the default module, plus any number of non-default modules, when running and deploying.

You specify the module name using the module: directive in app.yaml, or the <module> XML node in the appengine-web.xml configuration file. A file that does not contain any module directive describes the default module; you can also explicitly set module: default, which I recommend to avoid mistakes. To create other modules, all you need to do is configure their names in separate configuration files. For Java, you can also use the standard project layout of an EAR (Java Enterprise Application) and set the configuration options under each application folder (one per module).

My suggested project layout has one folder for each module. For instance, in our sample application, we are going to use this layout:

project/
  go-frontend/
    app.yaml
    main.go
  py-backend/
    app.yaml
    main.py

In the go-frontend folder, we put our App Engine app written in Go. In the py-backend folder, we put our backend logic written in Python. To make all this work, you have to:

  1. Make sure that go-frontend/app.yaml has the module: default directive.
  2. Make sure that py-backend/app.yaml has the module: py-backend (or any other name you want, except the word default)
  3. Make sure that both app.yaml files have the same application: element value (see the sketches below).
  4. Use the gcloud command to install the App Engine SDKs.
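For reference, here is a sketch of what the two app.yaml files might look like; the application ID, versions, and handler mappings are illustrative placeholders, not taken from the sample project:

# go-frontend/app.yaml
application: your-app-id
module: default
version: 1
runtime: go
api_version: go1

handlers:
- url: /.*
  script: _go_app

# py-backend/app.yaml
application: your-app-id
module: py-backend
version: 1
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /.*
  script: main.app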
With that setup, you can then run the app locally using the command:

$ gcloud preview app run go-frontend py-backend

The gcloud preview app subcommand will then parse all the files you have and launch the modules locally. The default module runs on port 8080, and py-backend on port 8081, so you can address each of them by port number.
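For example, assuming each module serves an HTTP handler at the root path, you can hit them individually:

$ curl http://localhost:8080/   # default module (Go frontend)
$ curl http://localhost:8081/   # py-backend module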

To deploy, you can use:

$ gcloud preview app deploy go-frontend py-backend

With this setup, you have some options:
  1. Each module can live in a separate source code repository, making it easy to manage their lifecycles independently of each other.
  2. You can deploy a single module by passing only that module's folder on the command line, as shown below.
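For example, to redeploy only the Python backend, run:

$ gcloud preview app deploy py-backend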
Some tips:
  1. Remember that all modules share the same state, but they do not share code. That said, be careful when changing the datastore: if you have a struct in Go that saves more properties than your Python db.Model has, you may lose data! In this particular case, it is better to use something like a raw db.Expando (Python) or a datastore.PropertyList (Go) to manipulate the data from the non-default module, or to always keep your models in sync between the codebases.
  2. When deploying, you may need to convert your app's performance settings to modules. That means you can no longer change the instance class under Application Settings; instead, you need to specify it in the app.yaml / appengine-web.xml files. (See the modules documentation on how to migrate.)
  3. This scenario is particularly useful to run App Engine Pipelines or App Engine Map/Reduce jobs using the Python runtime backend while keeping the front-end in Go.
  4. Some settings are global to your application, like the task queue, cron, and dispatch entry definitions. So keep these files (queue.yaml, cron.yaml, and dispatch.yaml) only on the default module. If you have three queues defined in the go-frontend folder and only one in py-backend, deploying only py-backend leaves just that one queue on the server, and this can break things, particularly cron job definitions.

If I have custom domains, how can I make this work?

You can use address wildcards to reach the backends. Let's say you have a wildcard mapping for your custom domain like this:

*.mydomain.com CNAME ghs.googlehosted.com

App Engine will then route requests as follows:

www.mydomain.com is handled by the default version of go-frontend
- since no module is named www, and no version subdomain was specified, this falls into the default module, default version

default.mydomain.com is handled by the default version of go-frontend
- matches the default module/default version

py-backend.mydomain.com is handled by the default version of py-backend
- matches the py-backend module/default version

beta.py-backend.mydomain.com is handled by the beta version of py-backend, if that version exists
- matches the py-backend module/beta version

Any subdomain that does not match a module name is handled by the default module, and any subdomain that does not match a version is handled by the default version of that module. The same rules apply to the *.your-app-id.appspot.com domain that you get for free with your app.

You can also configure rules that change this behavior using the dispatch.yaml (dispatch.xml) file. This file allows you to specify wildcard URLs and say which module they will be served by. For instance, you can map */report/* so that requests to any hostname with a path starting with /report/ are served by your Python backend.
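Such a rule takes only a few lines in dispatch.yaml; here is a sketch using the module name from our sample layout:

dispatch:
- url: "*/report/*"
  module: py-backend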

Note that if you dispatch /report/* to the Python backend, its handlers will receive requests with that path prefix (/report/), not at the root path. Therefore, you need to mount your WSGI app using that path as a prefix. See more about routing requests to modules in their docs, and check out the sample code's dispatch.yaml as well.

Conclusion

With Modules, your App Engine applications can be composed into a more robust infrastructure, allowing you to mix different programming languages and runtimes. If you work with microservices on App Engine, this is a great fit to make better use of your app's resources, using the right language and tools for each job.

--
Update 25/03/2015 - Fixed grammar errors, and added clarifications to first paragraphs.