The Java Virtual Machine Explained

Whether you are a professional Java developer, a university student studying Java, or simply an IT enthusiast, you have probably heard of the JVM at some point. The JVM is the specification that makes it possible for Java applications to follow the famous “Write Once, Run Anywhere” approach. This article looks at what the JVM is, the components it is made of, and what happens inside it when a program runs.

What Is The Java Virtual Machine?

Java is an object-oriented programming language built around the concept of “Write Once, Run Anywhere”: code, once written, can run on any platform. To make this possible, an intermediate representation called bytecode is used.

The Java Virtual Machine is a specification that describes a runtime environment in which bytecode can be executed. First, the Java compiler compiles the source code into bytecode, producing a class file. The JVM then translates this class file into instructions for the underlying platform. Because the class file is platform independent, it can be executed by a JVM on any platform or operating system.

Like other process-level virtual machines, the JVM is responsible for creating a dedicated space on the host machine in which Java programs execute, irrespective of the platform or operating system of the machine.
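As a small illustration of the compile-once, run-anywhere flow described above, here is a minimal sketch. The class name PlatformInfo and its describe method are made up for this example; the system properties it reads are standard ones.

```java
// PlatformInfo.java
// Compile once:  javac PlatformInfo.java   (produces PlatformInfo.class, the bytecode)
// Run anywhere:  java PlatformInfo         (the local JVM executes that same bytecode)
class PlatformInfo {

    // Builds a one-line description of the JVM and host platform running this bytecode.
    static String describe() {
        return System.getProperty("java.vm.name")
                + " on " + System.getProperty("os.name")
                + " (" + System.getProperty("os.arch") + ")";
    }

    public static void main(String[] args) {
        // The class file is identical everywhere; only this output differs per machine.
        System.out.println("Running on: " + describe());
    }
}
```

The same PlatformInfo.class file can be copied to a machine with a different operating system and run there without recompiling. Running javap -c PlatformInfo prints the bytecode instructions stored in the class file.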

The JVM Architecture

The diagram below illustrates the components of the JVM. It consists of three main subsystems: the class loader, the runtime data area, and the execution engine. In addition, the JVM includes the native method interface and the native method libraries.

The Class Loader

The class loader is the JVM subsystem that loads class files. The process is carried out in three phases: loading, linking and initialization. When a Java source file (.java) is compiled, it is converted to bytecode (.class). When a class is first used or referenced, the class loader loads it into main memory. Usually the class containing the main method is loaded first.

Loading

This phase involves finding the bytecode of a class or interface with a particular name and creating the class or interface from it in memory. This is done by three built-in class loaders, which are related by parent delegation rather than by inheritance:

  • Bootstrap Class Loader: This is the root class loader. It is responsible for loading the standard Java packages such as java.lang and java.util from the bootstrap class path (the rt.jar file in Java 8 and earlier; from Java 9 onward, the core modules of the runtime image).
  • Extension Class Loader: A child of the bootstrap class loader. It is responsible for loading the extensions of the Java standard libraries. From Java 9 onward it is replaced by the platform class loader.
  • Application Class Loader: This is the last class loader and a child of the extension class loader. It is responsible for loading the classes found on the classpath.
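The loader hierarchy can be inspected from a running program. The following is a minimal sketch; LoaderChain and chainOf are names invented for this example, and the exact loader class names printed depend on the Java version in use.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {

    // Walks from the given loader up through its parents and records each one.
    // A null loader stands for the bootstrap class loader, which has no
    // Java object representing it.
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // Core classes such as String come from the bootstrap loader, reported as null.
        System.out.println("String loader: " + String.class.getClassLoader());
        // Application classes come from the application (system) class loader.
        System.out.println("Chain: " + chainOf(ClassLoader.getSystemClassLoader()));
    }
}
```

The chain typically shows the application loader first, then the extension (or platform) loader, and finally the bootstrap loader.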

Linking

Once a class has been loaded into memory, the JVM starts the linking process, which combines the class or interface with its dependencies so that it can be executed. Linking consists of the following steps:

  • Verification: In this phase, the JVM checks the structure and format of the .class file against a set of predefined rules. If verification fails, it throws a java.lang.VerifyError.
  • Preparation: This phase allocates memory for the static fields of the class or interface and initializes those fields with their default values (for example, 0 for an int and null for a reference).
  • Resolution: This phase replaces the symbolic references in the class with direct references to locations in the method area.

Initialization

This is the final phase of class loading. Initialization executes the initialization logic of the class or interface: static variables are assigned their declared values and static blocks are executed, in the order in which they appear in the source. Instance constructors are not part of this phase; they run later, each time an object is created.
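The difference between preparation (default values) and initialization (declared values and static blocks) can be observed with a small experiment. This is only a sketch; InitOrderDemo, Config and the field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    // Shared log so the order of events can be inspected afterwards.
    static final List<String> LOG = new ArrayList<>();

    static class Config {
        // Runs first: 'timeout' has not been assigned yet, so peek() sees the
        // default value 0 that the preparation phase gave it.
        static int seenEarly = peek();
        static int timeout = 30;

        static {
            LOG.add("static block sees timeout=" + timeout);
            timeout = 60;
        }

        Config() {
            LOG.add("constructor");
        }

        static int peek() {
            return timeout;
        }
    }

    static List<String> run() {
        LOG.clear();
        LOG.add("before first use");
        int value = Config.timeout;   // the first use of Config triggers its initialization
        LOG.add("read timeout=" + value);
        new Config();                 // the constructor runs per object, after initialization
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
        System.out.println("seenEarly=" + Config.seenEarly);
    }
}
```

Because a class is initialized only once, the static block entry appears in the log only the first time run is called in a given JVM.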

Runtime Data Area

The runtime data area is divided into five major components: the method area, the heap area, the stack area, the PC registers and the native method stacks. The usage of each component is described below.

Method Area

This area stores all class-level data, such as static variables and the runtime constant pool. There is only one method area per JVM, and if it does not have enough space to satisfy an allocation, the JVM throws an OutOfMemoryError.

Heap Area

This area stores all objects and their instance variables. Memory for arrays is also allocated from the heap. As with the method area, there is only one heap area per JVM, and it is shared by all threads.
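To make the distinction between class-level data and per-object data concrete, here is a minimal sketch. MemoryAreasDemo and Visitor are hypothetical names; Runtime.maxMemory() is the standard call for querying the heap limit.

```java
class MemoryAreasDemo {

    static class Visitor {
        static int created = 0;   // class-level data: one copy shared by every instance
        final String name;        // instance data: stored with each object on the heap

        Visitor(String name) {
            this.name = name;
            created++;
        }
    }

    // Returns the maximum heap size the JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        Visitor a = new Visitor("a");
        Visitor b = new Visitor("b");
        int[] numbers = new int[1_000];   // arrays are heap objects too
        System.out.println(a.name + " and " + b.name + " share one counter: " + Visitor.created);
        System.out.println("array length: " + numbers.length);
        System.out.println("max heap (MB): " + maxHeapMb());
    }
}
```

The heap limit can be changed when starting the JVM, for example with the -Xmx option.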

Stack Area

Unlike the two memory areas above, a separate runtime stack is created for each thread, so its contents are not shared and it is thread safe. Local variables, method calls and partial results are stored in the stack area. If a thread requires a larger stack than is available, the JVM throws a StackOverflowError.

For every method call, an entry called a stack frame is pushed onto the stack. Once the method call completes, its stack frame is destroyed.
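The effect of running out of stack space can be reproduced with unbounded recursion. The sketch below uses hypothetical names (StackDepthDemo, recurse, measureDepth); the depth reached varies with the JVM, its settings and the size of each frame.

```java
class StackDepthDemo {
    static int depth = 0;

    // Each call pushes a new stack frame. With no base case,
    // the stack of the thread eventually runs out of space.
    static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack is exhausted and reports how deep it got.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames are popped as the error propagates back up to this point.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measureDepth());
    }
}
```

The stack size per thread can be adjusted with the -Xss option, which changes the depth this program reaches.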

The stack frame is divided into three subcomponents:

  • Local Variable Array: It is an array that stores all local variables related to the method and their corresponding values.
  • Operand Stack: It is a LIFO stack that acts as a runtime workspace for any intermediate operations.
  • Frame Data: All symbols related to the method are stored here, along with the catch block information used if an exception occurs.
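The interplay between the local variable array and the operand stack can be sketched with a toy model. This is not how the JVM is implemented; FrameSketch and its methods are invented for illustration, and they only borrow the names of the real bytecode instructions iload, iadd, imul and ireturn.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class FrameSketch {
    // Toy model of a stack frame: a local variable array plus a LIFO operand stack.
    final int[] locals;
    final Deque<Integer> operands = new ArrayDeque<>();

    FrameSketch(int... locals) {
        this.locals = locals;
    }

    // Copies a local variable onto the operand stack.
    void iload(int index) {
        operands.push(locals[index]);
    }

    // Pops two values and pushes their sum.
    void iadd() {
        operands.push(operands.pop() + operands.pop());
    }

    // Pops two values and pushes their product.
    void imul() {
        operands.push(operands.pop() * operands.pop());
    }

    // Pops the result that would be handed back to the caller.
    int ireturn() {
        return operands.pop();
    }

    // Evaluates (a + b) * c the way a stack machine would.
    static int evaluate(int a, int b, int c) {
        FrameSketch frame = new FrameSketch(a, b, c);
        frame.iload(0);
        frame.iload(1);
        frame.iadd();
        frame.iload(2);
        frame.imul();
        return frame.ireturn();
    }

    public static void main(String[] args) {
        System.out.println(evaluate(2, 3, 4)); // prints 20
    }
}
```

Intermediate results such as a + b never get a variable of their own; they live only on the operand stack until the next instruction consumes them.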

Program Counter Registers

As with the stack area, a PC register is created for each thread. It holds the address of the JVM instruction that is currently executing. Once that instruction has finished, the register is updated to point to the next instruction.

Native Method Stacks

Native method stacks hold information about native methods, which are written in languages other than Java, such as C and C++. As with the two areas above, a native method stack is created for each thread.

Execution Engine

Once the bytecode has been loaded and its details are available in the runtime data area, the next step is to run the program. The execution engine handles this by reading the bytecode, translating it into machine code and executing it. The translation and execution are performed by either the interpreter or the JIT compiler.

Interpreter

The interpreter reads and executes bytecode one instruction at a time, which makes it slow compared with the JIT compiler. Another disadvantage is that when a method is called repeatedly, it has to be interpreted again on every call.

JIT Compiler

The execution engine initially uses the interpreter to execute bytecode, but when it detects code that is executed repeatedly, it hands that code to the JIT (just-in-time) compiler. The JIT compiler compiles the bytecode of these frequently executed sections into machine code and reuses that machine code directly for subsequent calls, which improves the performance of the program.
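A rough way to see this behaviour is to time the same method early and late in a run. The sketch below is an informal experiment rather than a proper benchmark; JitWarmup, sumTo and timeCalls are hypothetical names, and the timings vary from machine to machine and run to run.

```java
class JitWarmup {
    // Holds the last computed result so the work cannot be optimized away.
    static long lastResult;

    // A small, frequently called method: a typical candidate for JIT compilation.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Calls sumTo repeatedly and returns the elapsed time in nanoseconds.
    static long timeCalls(int calls) {
        long start = System.nanoTime();
        long sink = 0;
        for (int i = 0; i < calls; i++) {
            sink += sumTo(1_000);
        }
        lastResult = sink;
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Early rounds tend to run mostly interpreted; later rounds are often
        // faster once the hot method has been compiled to machine code.
        for (int round = 1; round <= 5; round++) {
            System.out.println("round " + round + ": " + timeCalls(100_000) + " ns");
        }
    }
}
```

On the HotSpot JVM, running with -XX:+PrintCompilation logs methods as they are compiled, and -Xint disables the JIT compiler entirely, which makes the difference easier to observe.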

Garbage Collector

Garbage collection is the process of reclaiming unused memory by identifying and removing unreferenced objects from the heap. It makes memory management more efficient because it frees space for new objects. The garbage collector is the JVM component responsible for carrying out this process.

Garbage collection runs automatically, whenever the JVM decides it is needed. It can also be requested explicitly by calling System.gc(), but the JVM is free to ignore the request, so a collection is not guaranteed.
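Reachability can be observed with a weak reference, which does not keep its target alive. This is a sketch under the assumption that the JVM honours the collection request, which it is not obliged to do; GcDemo and its method names are invented for the example.

```java
import java.lang.ref.WeakReference;

class GcDemo {

    // An object that is still strongly referenced must survive a collection.
    static boolean survivesWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                    // only a request; the JVM may ignore it
        return weak.get() == strong;    // 'strong' is still in use here, so the object is reachable
    }

    // An object with no strong references is eligible for collection.
    // The result is often true after a collection, but it is not guaranteed.
    static boolean collectedAfterRelease() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("referenced object survived: " + survivesWhileReferenced());
        System.out.println("unreferenced object collected: " + collectedAfterRelease());
    }
}
```

Only the first result is dependable: the collector never removes an object that the program can still reach.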

Java Native Interface

In some cases it is necessary to use native code (C, C++, etc.), for example to interact with hardware in ways that Java alone cannot. Java handles the execution of native code through the Java Native Interface (JNI). The JNI acts as a bridge to the native method libraries and makes them available to the execution engine when native code is run.

Native Method Libraries

These are collections of libraries, usually in the form of .dll or .so files, written in other programming languages such as assembly, C and C++.
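On the Java side, a native method is simply declared with the native keyword and no body. The sketch below uses a hypothetical method, readSensor, and deliberately loads no library, so it shows what happens when the JVM cannot find a matching native implementation.

```java
class NativeDemo {

    // Hypothetical native method: its implementation would live in a C or C++
    // library, loaded with a call such as System.loadLibrary("sensor").
    // The library name "sensor" is an assumption made for this example.
    static native int readSensor();

    // Calls the native method. No library has been loaded here, so the JVM
    // cannot link the call and throws an UnsatisfiedLinkError.
    static String tryCall() {
        try {
            return "value: " + readSensor();
        } catch (UnsatisfiedLinkError e) {
            return "not linked";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCall());
    }
}
```

With a real library in place, the native side would provide a function whose name follows the JNI naming convention, and the same Java call would return its result.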

Conclusion

Often when we write and execute Java programs, we do not think about how the code is executed or how the JVM handles that execution. We only dig into the internal mechanisms when something goes wrong within the program. This article gave an overview of the JVM's components and how its internal mechanics work. Whether you are looking at JVM internals to fix a problem in your program, preparing for an interview, or just after a quick overview, I hope this article has given you a clear understanding of the JVM.
