
Converting Java to OpenCL

How Aparapi converts bytecode to OpenCL

Introduction

This page acts as a quick summary for the more detailed PDF that was originally written by AMD.

One of Aparapi's unique features is its ability to convert Java bytecode to OpenCL automatically.

On this page we describe the process used to perform this conversion. If you are unfamiliar with bytecode, consider visiting the WhatIsBytecode page first.

The command


javac Source.java

compiles the Java source file Source.java to Source.class.

The classfile format is well documented here, so we will not go into much detail. It is enough to know that Aparapi must parse the classfile of each Kernel to extract the bytecode for Kernel.run() and for any method reachable from Kernel.run().
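To give a feel for what parsing a classfile involves, here is a minimal sketch that reads only the fixed header: the 0xCAFEBABE magic number followed by the minor and major version. The names ClassFileHeader and readVersion are hypothetical and chosen for illustration; this is not Aparapi's own parser, which must go on to read the constant pool, fields and methods.

```java
import java.nio.ByteBuffer;

public class ClassFileHeader {
    /**
     * Reads the first 8 bytes of a classfile image and returns {minor, major}.
     * Classfiles are big-endian, which is also ByteBuffer's default byte order.
     */
    static int[] readVersion(byte[] classBytes) {
        if (classBytes == null || classBytes.length < 8) {
            throw new IllegalArgumentException("too short to be a classfile");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();               // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;    // u2 minor_version
        int major = buf.getShort() & 0xFFFF;    // u2 major_version
        return new int[] { minor, major };
    }
}
```

The constant pool count and the constant pool entries follow immediately after these eight bytes.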

Let's start with a simple Kernel.


import com.aparapi.Kernel;

public class Squarer extends Kernel{
   int[] in;
   int[] out;
   @Override public void run(){
      int gid = getGlobalId(0);
      out[gid] = in[gid] * in[gid];
   }
}

We will compile this


javac -g -cp path/to/aparapi/aparapi.jar Squarer.java

and then we can look at the bytecode using javap


javap -c -verbose -classpath path/to/aparapi/aparapi.jar;. Squarer

Compiled from “Squarer.java”


public class Squarer extends com.aparapi.Kernel
  SourceFile: "Squarer.java"
  minor version: 0
  major version: 50
  Constant pool:
const #1 = Method       #6.#17; //  com/amd/aparapi/Kernel."<init>":()V
const #2 = Method       #5.#18; //  Squarer.getGlobalId:(I)I
const #3 = Field        #5.#19; //  Squarer.out:[I
const #4 = Field        #5.#20; //  Squarer.in:[I
const #5 = class        #21;    //  Squarer
const #6 = class        #22;    //  com/amd/aparapi/Kernel
const #7 = Asciz        in;
const #8 = Asciz        [I;
const #9 = Asciz        out;
const #10 = Asciz       <init>;
const #11 = Asciz       ()V;
const #12 = Asciz       Code;
const #13 = Asciz       LineNumberTable;
const #14 = Asciz       run;
const #15 = Asciz       SourceFile;
const #16 = Asciz       Squarer.java;
const #17 = NameAndType #10:#11;//  "<init>":()V
const #18 = NameAndType #23:#24;//  getGlobalId:(I)I
const #19 = NameAndType #9:#8;//  out:[I
const #20 = NameAndType #7:#8;//  in:[I
const #21 = Asciz       Squarer;
const #22 = Asciz       com/amd/aparapi/Kernel;
const #23 = Asciz       getGlobalId;
const #24 = Asciz       (I)I;

{
int[] in;

int[] out;

public Squarer();
  Code:
   Stack=1, Locals=1, Args_size=1
   0:   aload_0
   1:   invokespecial   #1; //Method com/amd/aparapi/Kernel."<init>":()V
   4:   return


public void run();
  Code:
   Stack=5, Locals=2, Args_size=1
   0:   aload_0
   1:   iconst_0
   2:   invokevirtual   #2; //Method getGlobalId:(I)I
   5:   istore_1
   6:   aload_0
   7:   getfield        #3; //Field out:[I
   10:  iload_1
   11:  aload_0
   12:  getfield        #4; //Field in:[I
   15:  iload_1
   16:  iaload
   17:  aload_0
   18:  getfield        #4; //Field in:[I
   21:  iload_1
   22:  iaload
   23:  imul
   24:  iastore
   25:  return
}

Here we see the constant pool of the class and the disassembled bytecode of the default constructor Squarer() and of the Squarer.run() method.

The constant pool is a table of constant values that can be accessed from the bytecode of any method within this class. Some of the constants are String literals defined within the source (or literals used to name classes, fields, methods, variables or signatures); other slots represent Classes, Methods, Fields or Type signatures. These latter constant pool entries cross-reference other constant pool entries to describe higher-level artifacts.

For example constant pool entry #1 is


const #1 = Method       #6.#17; //  com/amd/aparapi/Kernel."<init>":()V

So entry #1 defines a method. The class containing the method is defined in constant pool entry #6, so let's look at that entry.


const #1 = Method       #6.#17; //  com/amd/aparapi/Kernel."<init>":()V

const #6 = class        #22;    //  com/amd/aparapi/Kernel

At constant pool entry #6 we find a class definition, which refers to entry #22


const #1 = Method       #6.#17; //  com/amd/aparapi/Kernel."<init>":()V

const #6 = class        #22;    //  com/amd/aparapi/Kernel

const #22 = Asciz       com/amd/aparapi/Kernel;

which just contains the String (Ascii) name of the class.

Looking back at entry #1 again, we note that the Method also references entry #17 which contains a NameAndType entry for determining the method name and the signature.


const #1 = Method       #6.#17; //  com/amd/aparapi/Kernel."<init>":()V

const #6 = class        #22;    //  com/amd/aparapi/Kernel


const #17 = NameAndType #10:#11;//  "<init>":()V

const #22 = Asciz       com/amd/aparapi/Kernel;

Entry #17’s “NameAndType” references #10 for the method name.


const #1 = Method       #6.#17; //  com/amd/aparapi/Kernel."<init>":()V

const #6 = class        #22;    //  com/amd/aparapi/Kernel

const #10 = Asciz       <init>;

const #17 = NameAndType #10:#11;//  "<init>":()V

const #22 = Asciz       com/amd/aparapi/Kernel;

And then references #11 to get the signature.


const #1 = Method       #6.#17; //  com/amd/aparapi/Kernel."<init>":()V

const #6 = class        #22;    //  com/amd/aparapi/Kernel

const #10 = Asciz       <init>;

const #11 = Asciz       ()V;

const #17 = NameAndType #10:#11;//  "<init>":()V

const #22 = Asciz       com/amd/aparapi/Kernel;

So from constant pool #1 we ended up using slots 1,6,10,11,17 and 22 to fully resolve the method.

This looks like a lot of work. However, breaking method and field references up like this allows the various slots to be reused by other field and method descriptions.
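The chain of lookups above can be sketched in a few lines. This is a deliberately simplified model, not the real classfile layout or Aparapi's data structures: the Entry class, its tag strings and the resolveMethod helper are all hypothetical, and only the six slots used in the walkthrough are populated.

```java
import java.util.HashMap;
import java.util.Map;

public class ConstantPoolSketch {
    /** Hypothetical entry: a tag plus either a string or up to two slot indices. */
    static final class Entry {
        final String tag;
        final String text;   // used by Asciz entries only
        final int a;         // first referenced slot (0 if unused)
        final int b;         // second referenced slot (0 if unused)

        Entry(String tag, String text, int a, int b) {
            this.tag = tag;
            this.text = text;
            this.a = a;
            this.b = b;
        }
    }

    /** Builds the subset of the Squarer constant pool used in the walkthrough. */
    static Map<Integer, Entry> squarerPool() {
        Map<Integer, Entry> pool = new HashMap<>();
        pool.put(1, new Entry("Method", null, 6, 17));
        pool.put(6, new Entry("class", null, 22, 0));
        pool.put(10, new Entry("Asciz", "<init>", 0, 0));
        pool.put(11, new Entry("Asciz", "()V", 0, 0));
        pool.put(17, new Entry("NameAndType", null, 10, 11));
        pool.put(22, new Entry("Asciz", "com/amd/aparapi/Kernel", 0, 0));
        return pool;
    }

    /** Follows Method -> class -> Asciz and Method -> NameAndType -> Asciz, Asciz. */
    static String resolveMethod(Map<Integer, Entry> pool, int index) {
        Entry method = pool.get(index);
        Entry owner = pool.get(method.a);
        Entry nameAndType = pool.get(method.b);
        String className = pool.get(owner.a).text;
        String name = pool.get(nameAndType.a).text;
        String signature = pool.get(nameAndType.b).text;
        return className + "." + name + ":" + signature;
    }
}
```

Calling resolveMethod(squarerPool(), 1) touches slots 1, 6, 22, 17, 10 and 11, which is the same set of slots listed above.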

So when we see disassembled bytecode that references a constant pool slot, the actual slot number (2 in the example below) appears after the instruction name, here invokevirtual.


2:   invokevirtual   #2; Method getGlobalId:(I)I

Bytecode is basically able to access three things

  1. Constant pool entries
  2. Variable slots
  3. Stack operands

Instructions are able to pop operands from the stack, push operands to the stack, load values from variable slots (to the stack), store values (from the stack) to variable slots, load values from accessed fields (to the stack) and call methods (popping args from the stack).

Some instructions can only handle specific types (int, float, double, and object instances - arrays are special forms of objects) and usually the first character of the instruction helps determine which type the instruction acts upon. So imul would be a multiply instruction that operates on integers, fmul would multiply two floats, dmul for doubles. Instructions that begin with ‘a’ operate on object instances.
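As a small illustration of these typed instruction families, the sketch below defines one method per operand type. The class name TypedMultiply and its method names are hypothetical. Compiling it and running javap -c on the result should show imul, fmul and dmul in the three multiply methods, and aload_0 followed by iaload in the array method.

```java
public class TypedMultiply {
    // int operands: javac emits imul, then ireturn
    static int mulInt(int a, int b) {
        return a * b;
    }

    // float operands: javac emits fmul, then freturn
    static float mulFloat(float a, float b) {
        return a * b;
    }

    // double operands: javac emits dmul, then dreturn
    static double mulDouble(double a, double b) {
        return a * b;
    }

    // The array reference is loaded with aload_0 ('a' for object reference),
    // while the int element is fetched with iaload.
    static int first(int[] values) {
        return values[0];
    }
}
```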

So let's look at the first instruction.


0:   aload_0

This instruction loads an object (a is the first character) from variable slot 0 (we’ll come back to the variable slots in a moment) and pushes it on the stack.

Variables are held in ‘slots’ that are reserved at compile time.

Consider this static method.


static int squareMe(int value){
  value *= value;
  return(value);
}

This method requires one variable slot. At any one time only one variable is live; it just happens to be an argument to the method.

The following method also requires only one slot.


static int squareMe(){
  int value=4;
  value *= value;
  return(value);
}

Here we need two slots, one for arg and one for value.


static int squareMe(int arg){
  int value=arg*arg;
  return(value);
}

Surprisingly, the following also requires only two slots.

static int squareMe(int arg){
  {
    int temp = arg*arg;
  }
  int value=arg*arg;
  return(value);
}

Note that in the above example the temp variable goes out of scope before the local variable value is declared, so temp and value can share a slot and only two slots are required.

If we have an instance method we always require one extra slot (always slot 0) for the this reference.

So


int squareMe(int arg){
  int value=arg*arg;
  return(value);
}

requires three slots (this, arg and value).

Anyway, back to our bytecode.


0:   aload_0

This loads the object instance in slot 0 (this) and pushes it on the stack.

Next we have


1:   iconst_0

which pushes the int constant 0 on the stack. So the stack now contains {this,0}.

Next we have


2:   invokevirtual   #2; //Method getGlobalId:(I)I

This is the bytecode for calling a method. The instruction references the constant pool and pulls the method description from constant pool entry #2, which happens to describe a method called getGlobalId() that takes an int and returns an int.

So the VM will pop the top value (the int constant 0) as the method argument, then pop an object reference (this), call this.getGlobalId(0), and push the result (an int) back on the stack.

So our stack, which contained {this,0}, now contains the result of this.getGlobalId(0); let's assume it is {0}. We describe this invoke instruction as consuming two operands from the stack and producing one.
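The stack effect of these instructions, plus the istore_1 that follows them in run(), can be mimicked with an ordinary Java stack. This is only a hand-written sketch: StackEffectSketch, runPrefix and the stand-in getGlobalId (which always returns 0 here) are assumptions made for illustration, and the string "this" stands in for the real object reference.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackEffectSketch {
    // Stand-in for Kernel.getGlobalId(int); the real value depends on the work item.
    static int getGlobalId(int dimension) {
        return 0;
    }

    /** Replays bytecode offsets 0 to 5 of Squarer.run() and returns the slots. */
    static Object[] runPrefix() {
        Object[] slots = { "this", null };        // slot 0 = this, slot 1 = gid (unset)
        Deque<Object> stack = new ArrayDeque<>(); // operand stack, initially empty

        stack.push(slots[0]);                     // 0: aload_0       stack={this}
        stack.push(0);                            // 1: iconst_0      stack={this, 0}

        int arg = (Integer) stack.pop();          // 2: invokevirtual pops the int argument,
        Object receiver = stack.pop();            //    then the object reference,
        stack.push(getGlobalId(arg));             //    and pushes the int result

        slots[1] = stack.pop();                   // 5: istore_1      stack={}

        if (!stack.isEmpty()) {
            throw new IllegalStateException("operand stack should be empty here");
        }
        return slots;
    }
}
```

The invoke step pops two operands and pushes one, which matches the "consumes two, produces one" description above.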

Before we start executing, our stack is empty {}, and the slots are initialized with ‘this’ (for an instance method) and any arguments passed to the method.


                                                            0   1
                                                   slots=[this, ?  ]    stack={}

                                                            0   1
0:   aload_0                                        slots=[this, ?  ]    stack={this}
                                                            0   1
1:   iconst_0                                       slots=[this, ?  ]    stack={this, 0}
                                                            0   1
2:   invokevirtual   #2; Method getGlobalId:(I)I    slots=[this, ?  ]    stack={result of this.getGlobalId(0), let's say 0}

5:   istore_1                                       slots=[this, 0  ]    stack={}

6:   aload_0                                        slots=[this, 0  ]    stack={this}

7:   getfield        #3; //Field out:[I