Using Cloud Native Buildpacks to Improve the Function Image Building Capability of Function Mesh

November 01, 2022

The concept of Buildpacks was first conceived by Heroku in 2011. PaaS platforms like Heroku needed to support applications in multiple languages, which were often built with very similar logic. In January 2018, Pivotal and Heroku co-launched the Cloud Native Buildpacks (CNB) project, which joined the CNCF in October of the same year.

In this blog, I will give an overview of the CNB project and its core components, and then use an example to demonstrate how to use it to build images for Function Mesh.

What does CNB mean for developers and operators?

The container runtime ecosystem is no longer a Docker monopoly. The Open Container Initiative (OCI) has set the standard for the industry, meaning that any container runtime that implements the OCI standard can run a given OCI image properly.

Buildpacks is an image builder that produces OCI-compliant images. It satisfies the needs of both developers and operators and resolves the tension between the two groups.

The CNB project shields developers from the details of the application building and deployment process. They don’t need to write code for the runtime environment or worry about details such as which operating system the image uses, how scripts differ across operating systems, or how to optimize image size. When using CNB, developers only need to select the appropriate builder image and provide their source code directory to build the application image.

The Ops team can assemble an application image builder from several Buildpacks (the minimal build unit in CNB) in a Lego-like manner to meet various needs. Because a CNB image keeps a stable contract between the base runtime environment and the application artifacts (i.e. ABI compatibility), operators can replace the base runtime environment in the application image with a single command when a CVE is found in it. They don’t need to rebuild the image or adapt the application to the new base runtime environment.

Why does Function Mesh need CNB?

Function Mesh is a serverless framework purpose-built for stream processing applications. It brings powerful event-streaming capabilities to your applications by orchestrating multiple Pulsar Functions and Pulsar IO connectors for complex stream processing jobs.

A serverless framework like Function Mesh needs to provide a way for users to submit their functions. There are currently two common ways to do this.

  1. Upload the function to the package management service of the Pulsar cluster
  2. Customize the function Docker image

Both approaches involve plenty of repetitive manual operations, including compiling, packaging, and uploading the function code to package management systems, and writing Dockerfiles.

CNB is well suited for scenarios where the build process is constant, and it has proven to work in serverless frameworks such as Google Cloud Functions and OpenFunction. Thus, we have reason to believe that CNB will help improve the image building experience of Function Mesh.

CNB components

Cloud Native Buildpacks consists of the following main components.

  • Buildpack: The minimal build unit.
  • Stack: Provides the base runtime environment for the build phase and the application runtime phase.
  • Lifecycle: A lifecycle management interface abstracted from CNB to guide the entire build process.
  • Builder: A builder that integrates a Stack and several Buildpacks with a specific build purpose.
  • Platform: The executor of the interfaces in the lifecycle to meet the user's build requirements.

First, let’s look at these components in detail and how they can work together. Later, I will use an example to demonstrate how to create them.

Stack

A Stack is composed of two OCI images: the build image and the run image. For example, both can use Ubuntu as the base, with the software required by each phase (building the application and running it) installed on top.

Buildpack

To build a Java application, the build logic typically consists of the following steps.

  1. Check if there is a Java code file in the target directory (i.e., files with the .java suffix).
  2. Check if there is a pom.xml file in the target directory.
  3. Make sure the necessary compilation tools such as Maven are in the PATH.
  4. Run mvn clean install -B -DskipTests to compile and package the application.
  5. Set the entry point for the image to start the application.

A good principle for making a simple Buildpack is to determine the contents of each Buildpack based on the build steps, so now we need to make 5 Buildpacks.
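The five steps above can be sketched as one small shell function per step. This is only an illustration under stated assumptions: the function names are hypothetical and not part of any CNB API, and step 5 is reduced to locating the jar that an entry point would launch.

```shell
#!/usr/bin/env bash
# Illustrative only: each function mirrors one of the five build steps.
# All function names here are hypothetical.

# Step 1: succeed if at least one .java file exists under the directory.
has_java_sources() {
  [ -n "$(find "$1" -type f -name '*.java' 2>/dev/null | head -n 1)" ]
}

# Step 2: succeed if pom.xml sits at the top of the directory.
has_pom() {
  [ -f "$1/pom.xml" ]
}

# Step 3: succeed if the mvn executable can be found on PATH.
has_maven() {
  command -v mvn >/dev/null 2>&1
}

# Step 4: compile and package with the Maven command from the list.
run_maven() {
  (cd "$1" && mvn clean install -B -DskipTests)
}

# Step 5 (simplified): print the first jar under target/, which an
# image entry point could then reference.
find_artifact() {
  find "$1/target" -maxdepth 1 -type f -name '*.jar' 2>/dev/null | head -n 1
}

# Run the steps in order and stop at the first one that fails.
build_java_app() {
  local dir="$1"
  has_java_sources "$dir" || { echo "no .java files in $dir" >&2; return 1; }
  has_pom "$dir"          || { echo "no pom.xml in $dir" >&2; return 1; }
  has_maven               || { echo "mvn not found on PATH" >&2; return 1; }
  run_maven "$dir"        || return 1
  find_artifact "$dir"
}
```

Keeping each step in its own function mirrors the idea of one Buildpack per step: every unit has a single check or action and can be reused or reordered independently.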

Lifecycle

Lifecycle is the most important component of CNB. It is essentially an abstraction and orchestration of the build steps from the source code to the image, and its main phases are listed as follows.

  • Detect: Checks which Buildpack is to be executed
  • Build: Executes the build logic in the Buildpack
  • Analyze: Handles the cached content of the build process
  • Export: Exports the OCI image
  • Rebase: Replaces the base runtime environment of the application image

Builder

A Builder entity is an OCI image. By aggregating a Stack, several Buildpacks, and a Lifecycle (which does not need to be prepared by the user), and specifying the execution order of these Buildpacks, a builder with a specific build purpose is produced.

Platform

After the Builder is ready, a Platform applies it to the given source code: it drives the Lifecycle, executes the Buildpacks in the given order, and finally builds the source code into an image and exports it.

Common Platforms include Tekton and CNB's pack CLI.
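To make the interplay between a Platform, the Lifecycle, and Buildpacks more concrete, here is a toy model of two phases only (detect, then build) over a single group of Buildpacks. It is a deliberate simplification: run_group is a hypothetical helper, and the real lifecycle additionally handles optional Buildpacks, build plans, layers, caching, and export.

```shell
#!/usr/bin/env bash
# Toy model of a Platform driving the detect and build phases.
# run_group is a hypothetical helper, not part of pack or the lifecycle.
# Usage: run_group <app_dir> <absolute_buildpack_dir>...
run_group() {
  local app_dir="$1"; shift
  local bp

  # Detect: every buildpack in the group must accept the source (exit 0).
  for bp in "$@"; do
    if ! (cd "$app_dir" && "$bp/bin/detect"); then
      echo "detect failed: $bp" >&2
      return 100
    fi
  done

  # Build: run each buildpack's build script in the order given,
  # with the application directory as the working directory.
  for bp in "$@"; do
    (cd "$app_dir" && "$bp/bin/build") || return 1
  done
}
```

In this model, no build script runs unless all detect scripts pass, which is why the order of Buildpacks in a Builder matters.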

Building a Java function image with Function Mesh Buildpacks

Prerequisites

Directory structure

.
|-- builders
|   `-- java-builder
|       `-- builder.toml
|-- buildpacks
|   `-- java-maven
|       |-- bin
|       |   |-- build
|       |   `-- detect
|       `-- buildpack.toml
`-- stack
   |-- stack.build.Dockerfile
   `-- stack.java-runner.run.Dockerfile

Stack

As I mentioned above, the Stack provides the basic build and run environments for an application (in this case, a Java function). It is composed of a build image, which provides the build environment, and a run image, which serves as the base of the application images.

Create the build image

The build image provides the OS environment for the application during the building phase. Note that the Stack ID is io.functionmesh.stack in this example.

stack.build.Dockerfile

FROM ubuntu:20.04

ARG pulsar_uid=10000
ARG pulsar_gid=10001
ARG stack_id="io.functionmesh.stack"

RUN apt-get update && \
    apt-get install -y xz-utils ca-certificates git wget jq gcc && \
    rm -rf /var/lib/apt/lists/* && \
    wget -O /usr/local/bin/yj https://github.com/bruceadams/yj/releases/download/v1.2.2/yj.linux.x86_64 && \
    chmod +x /usr/local/bin/yj

LABEL io.buildpacks.stack.id=${stack_id}

RUN groupadd pulsar --gid ${pulsar_gid} && \
    useradd --uid ${pulsar_uid} --gid ${pulsar_gid} -m -s /bin/bash pulsar

ENV CNB_USER_ID=${pulsar_uid}
ENV CNB_GROUP_ID=${pulsar_gid}
ENV CNB_STACK_ID=${stack_id}

USER ${CNB_USER_ID}:${CNB_GROUP_ID}

Use the following command to create it.

docker build -t fm-stack-build:v1 -f ./stack.build.Dockerfile .

Create the run image

The run image provides the OS environment and Pulsar Function runtime for the application during the running phase.

stack.java-runner.run.Dockerfile

Note that this example uses streamnative/pulsar-functions-java-runner:2.9.2.23 as the base image. You can also change the version of the base image as needed.

FROM streamnative/pulsar-functions-java-runner:2.9.2.23

ARG pulsar_uid=10000
ARG pulsar_gid=10001
ARG stack_id="io.functionmesh.stack"
LABEL io.buildpacks.stack.id=${stack_id}

ENV CNB_USER_ID=${pulsar_uid}
ENV CNB_GROUP_ID=${pulsar_gid}
ENV CNB_STACK_ID=${stack_id}

Use the following command to create it.

docker build -t fm-stack-java-runner-run:v1 -f ./stack.java-runner.run.Dockerfile .
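Both Dockerfiles set the same io.buildpacks.stack.id label, so a quick check that the two built images agree can catch a typo before the Builder is created. The sketch below reads the label with docker inspect; stack_id_of and same_stack are hypothetical helper names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing the stack ID label of two local images.

# Print the io.buildpacks.stack.id label of a local image.
stack_id_of() {
  docker inspect --format '{{ index .Config.Labels "io.buildpacks.stack.id" }}' "$1"
}

# Succeed only when both images carry the same, non-empty stack ID.
same_stack() {
  local build_id run_id
  build_id="$(stack_id_of "$1")" || return 1
  run_id="$(stack_id_of "$2")" || return 1
  [ -n "$build_id" ] && [ "$build_id" = "$run_id" ]
}

# Example call (requires the two images built above):
#   same_stack fm-stack-build:v1 fm-stack-java-runner-run:v1 && echo "stack IDs match"
```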

Buildpacks

In this example, we need a Buildpack to check whether the Java files (with the suffix “.java”) and the required items (e.g. “pom.xml”) exist. If they do exist, we can build the target artifact (usually a “.jar” file) with Maven and move it to /pulsar.

Use the following command to create the Buildpack. Note that the Buildpack ID is functionmesh/java-maven in this example.

pack buildpack new functionmesh/java-maven \
  --api 0.7 \
  --path java-maven \
  --version 0.0.1 \
  --stacks io.functionmesh.stack

You should see that a directory named java-maven has been created.

`-- java-maven
  |-- bin
  |   |-- build
  |   `-- detect
  `-- buildpack.toml

buildpack.toml

buildpack.toml is the configuration file for the Buildpack, which contains the buildpack id, the stack id, and other information.

api = "0.7"

[buildpack]
id = "functionmesh/java-maven"
version = "0.0.1"

[[stacks]]
id = "io.functionmesh.stack"

bin/detect & bin/build

Create two scripts, bin/detect and bin/build. You can find them on this page.

The bin/detect script checks whether the Buildpack can be applied to the source code. In this example, bin/detect checks whether the source directory includes .java files. If so, the script exits with status 0 (pass) and the Buildpack is applied to this source.
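The following is a minimal sketch of the check such a detect script could perform. It is not the project's actual script; the function name detect_java_sources is hypothetical. It follows the Buildpack API convention in which exit code 0 means the Buildpack passes detection and 100 means it does not apply.

```shell
#!/usr/bin/env bash
# Sketch of a detect check; not the actual Function Mesh script.
# Buildpack API convention: 0 = pass, 100 = fail (buildpack does not apply).
detect_java_sources() {
  local app_dir="${1:-.}"
  if [ -n "$(find "$app_dir" -type f -name '*.java' 2>/dev/null | head -n 1)" ]; then
    return 0
  fi
  return 100
}

# The lifecycle starts bin/detect in the source directory, so the last
# line of a real script could be:
#   detect_java_sources . ; exit $?
```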

The bin/build script compiles the source code. It is used to:

  • Download the Maven and JDK tools
  • Build the package
  • Clear the source code
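As a rough sketch of the last two bullets, the function below runs Maven, keeps the resulting jar in the application directory, and removes the sources. Several things here are assumptions rather than the actual script: the name package_function, the MVN override variable, and the choice of which files count as source. It also assumes Maven and a JDK are already installed, whereas the real script downloads them first.

```shell
#!/usr/bin/env bash
# Sketch only; package_function and the MVN variable are hypothetical.
# Usage: package_function <app_dir>
package_function() {
  local app_dir="$1" jar

  # Build the package (MVN can point at another Maven executable).
  (cd "$app_dir" && "${MVN:-mvn}" clean install -B -DskipTests) || return 1

  # Locate the jar that Maven produced.
  jar="$(find "$app_dir/target" -maxdepth 1 -type f -name '*.jar' 2>/dev/null | head -n 1)"
  [ -n "$jar" ] || { echo "no jar produced under $app_dir/target" >&2; return 1; }

  # Keep only the artifact; clear the source code and build output.
  mv "$jar" "$app_dir/" || return 1
  rm -rf "$app_dir/src" "$app_dir/target" "$app_dir/pom.xml"
}
```

Removing the sources after the build keeps the final image limited to the runnable artifact.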

Builder

A Builder is an image that contains all the necessary components to execute a build.

builder.toml

# Buildpacks to include in builder
[[buildpacks]]
uri = "../../buildpacks/java-maven"

# Order used for detection
[[order]]
  # The java-maven buildpack is the only member of this order group
  [[order.group]]
  id = "functionmesh/java-maven"
  version = "0.0.1"

# Stack that will be used by the builder
[stack]
id = "io.functionmesh.stack"
# This image is used at runtime
run-image = "fm-stack-java-runner-run:v1"
# This image is used at build-time
build-image = "fm-stack-build:v1"

Use the following command to create it.

pack builder create fm-java-maven-builder:v1 \
  --config ./builder.toml \
  --pull-policy if-not-present

Build a Java function image and create a Function

So far, we have created the following images:

  • A Stack build image: fm-stack-build:v1
  • A Stack run image: fm-stack-java-runner-run:v1
  • A Builder image: fm-java-maven-builder:v1

Now let's write a Java function file.

Package directory structure

.
|-- pom.xml
`-- src/
  `-- main/
      `-- java/
          `-- io.streamnative.example/
              `-- ExclamationFunction.java

The ExclamationFunction.java file:

package io.streamnative.example;

import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;
import org.slf4j.Logger;

public class ExclamationFunction implements Function<String, String> {
    @Override
    public String process(String input, Context context) {
        // Use the logger supplied by the function context.
        Logger log = context.getLogger();
        log.debug("My exclamation function");
        // Append an exclamation mark to every incoming message.
        return String.format("%s!", input);
    }
}

Build the function image in the current directory by running the following command.

pack build java-exclamation-function:v1 \
  --builder fm-java-maven-builder:v1 \
  --workspace /pulsar \
  --pull-policy if-not-present

Expected output:

$ pack build java-exclamation-function:v1 \
  --builder fm-java-maven-builder:v1 \
  --workspace /pulsar \
  --pull-policy if-not-present
===> ANALYZING
[analyzer] Previous image with name "java-exclamation-function:v1" not found
===> DETECTING
[detector] functionmesh/java-maven 0.0.1
===> RESTORING
===> BUILDING
[builder] ---> Installing Maven
[builder] ---> Running Maven
[builder] [INFO] BUILD SUCCESS
……
Successfully built image java-exclamation-function:v1

After pushing the image java-exclamation-function:v1 to your image registry, you can use the image to create a Function object.

For more information, see the demo video.

For examples of other runtimes, refer to Package Python Functions and Package Go Functions.

Another benefit of CNB shows up when the runtime-runner image needs a patch update, for example to fix a critical CVE, and its version changes (say, to streamnative/pulsar-functions-java-runner:2.9.2.23-patch). You only need to prepare a new run image, fm-stack-java-runner-run:v1-patch, as follows.

FROM streamnative/pulsar-functions-java-runner:2.9.2.23-patch

ARG pulsar_uid=10000
ARG pulsar_gid=10001
ARG stack_id="io.functionmesh.stack"
LABEL io.buildpacks.stack.id=${stack_id}

ENV CNB_USER_ID=${pulsar_uid}
ENV CNB_GROUP_ID=${pulsar_gid}
ENV CNB_STACK_ID=${stack_id}

Then, use the CNB rebase interface to replace the run image in the function image java-exclamation-function:v1 by running the following command.

pack rebase java-exclamation-function:v1 --run-image fm-stack-java-runner-run:v1-patch --pull-policy if-not-present

This way, you don't even need to change the function configuration. You only need to restart the function's workload so that it runs on the patched function-runner.

Future work

The CNB project already improves the user experience of serverless technology in general and of Function Mesh in particular. But there is still a lot of work to be done on how to seamlessly integrate CNB into a specific framework.

In the future development of Function Mesh, we plan to integrate CNB in a way that does not add complexity to the project itself, such as providing dedicated CLI tools combined with configurable builders.

More on Apache Pulsar

Pulsar has become one of the most active Apache projects over the past few years, with a vibrant community driving innovation and improvements to the project. Check out the following resources to learn more about Pulsar.

  • Start your on-demand Pulsar training today with StreamNative Academy.
  • Spin up a Pulsar cluster in minutes with StreamNative Cloud. StreamNative Cloud provides a simple, fast, and cost-effective way to run Pulsar in the public cloud.
  • Register now for free for Pulsar Summit Asia 2022! Held on November 19th and 20th, this two-day virtual event will feature 36 sessions by developers, engineers, architects, and technologists from ByteDance, Huawei, Tencent, Nippon Telegraph and Telephone Corporation (NTT) Software Innovation Center, Yum China, Netease, vivo, WeChat, Nutanix, StreamNative, and many more.