Friday, November 26, 2010

Project Lombok - Trick Explained

In my previous blog post, I introduced Project Lombok, a library that can inject code into a class at compile time. When you see it in action, it almost seems magical. I will attempt to explain the trick behind the magic.

Java Compilation

To understand how Project Lombok works, one must first understand how Java compilation works. OpenJDK provides an excellent overview of the compilation process. To paraphrase, compilation has 3 stages:
  1. Parse and Enter
  2. Annotation Processing
  3. Analyse and Generate

In the Parse and Enter phase, the compiler parses source files into an Abstract Syntax Tree (AST). Think of the AST as the DOM-equivalent for Java code. Parsing will only throw errors if the syntax is invalid. Compilation errors such as invalid class or method usage are checked in phase 3.

In the Annotation Processing phase, custom annotation processors are invoked. This is considered a pre-compilation phase. Annotation processors can do things like validate classes or generate new resources, including source files. Annotation processors can generate errors that will cause the compilation process to fail. If new source files are generated as a result of annotation processing, then compilation loops back to the Parse and Enter phase and the process is repeated until no new source files are generated.

In the last phase, Analyse and Generate, the compiler generates class files (byte code) from the Abstract Syntax Trees generated in phase 1. As part of this process, the AST is analyzed for broken references (e.g. class not found, method not found), valid flow is checked (e.g. no unreachable statements), type erasure is performed, syntactic sugar is desugared (e.g. enhanced for loops become iterator loops) and finally, if everything is successful, class files are written out.

Project Lombok and Compilation

Project Lombok hooks itself into the compilation process as an annotation processor. But Lombok is not your normal annotation processor.
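For contrast, here is a minimal sketch of a conventional processor that stays within the published javax.annotation.processing API. The class name DeprecationReporter and the choice of @Deprecated as the annotation to report on are assumptions made for illustration. The point is what the API offers: read-only Element views and a Messager for diagnostics, with no call for editing an existing class.

```java
import java.util.Collections;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// Hypothetical example: reports every element annotated with @Deprecated.
// It is package-private only to keep the sketch in one snippet; a processor
// that javac discovers on its own would normally be declared public.
class DeprecationReporter extends AbstractProcessor {

  @Override
  public Set<String> getSupportedAnnotationTypes() {
    return Collections.singleton("java.lang.Deprecated");
  }

  @Override
  public SourceVersion getSupportedSourceVersion() {
    return SourceVersion.latestSupported();
  }

  @Override
  public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
    for (TypeElement annotation : annotations) {
      for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
        // Element is a read-only view of the program: it exposes names,
        // modifiers and members, but nothing for adding a method.
        processingEnv.getMessager().printMessage(
            Diagnostic.Kind.NOTE,
            "Found @Deprecated on " + element.getSimpleName(),
            element);
      }
    }
    return false; // leave the annotation unclaimed for other processors
  }
}
```

A processor like this can make the build fail by reporting Diagnostic.Kind.ERROR instead of NOTE, which is the validation use case described above.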
Normally, annotation processors only generate new source files, whereas Lombok modifies existing classes. The trick is that Lombok modifies the AST. It turns out that changes made to the AST in the Annotation Processing phase will be visible to the Analyse and Generate phase. Thus, changing the AST will change the generated class file. For example, if a method node is added to the AST, then the class file will contain that new method. By modifying the AST, Lombok can do things like generate new methods (getter, setter, equals, etc.) or inject code into an existing method (e.g. cleaning up resources).

Trick or Hack?

Some people call Lombok's trick a hack, and I'd agree. But don't pass judgement yet. Like any hack, you should examine the risk/reward and alternatives before determining if you are comfortable with it.

The "hack" in Lombok is that, strictly speaking, the annotation processing spec doesn't allow you to modify existing classes. The annotation processing API doesn't provide a mechanism for changing the AST of a class. The clever people at Project Lombok got around this through some unpublished APIs of javac. Since Eclipse uses an internal compiler, Lombok also needs access to internal APIs of the Eclipse compiler. If Java officially supported compile-time AST transformations then Lombok wouldn't need to rely on backdoor APIs.

This makes Project Lombok vulnerable to future changes in the JDK. There is no guarantee the private APIs won't change in a later JDK and break Project Lombok. If that happens, then you're left hoping that the guys at Lombok will be responsive about patching their library to work with the new JDK. Same thing goes for the new Eclipse compilers. Given how often we get a new version of Java, this may not be that big of an issue.

Alternatives in Java

There are other alternatives for modifying the behavior of classes. One approach is to use byte-code manipulation at runtime via a library like CGLib or ASM.
This is how Hibernate is able to do things like lazily initialize a persistent Collection the first time it is accessed. In general, this can be used to enhance the behavior of existing methods. This trick could possibly be used to implement the @Cleanup behavior in Lombok, so that a resource is closed when it goes out of scope. Runtime byte-code manipulation is no help, however, for generating getters and setters which you intend to reference in source code.

Another approach is to use byte-code manipulation on the class files. For example, Kohsuke Kawaguchi of Hudson fame created a library called Bridge Method Injector that helps preserve binary compatibility when changing a method's return type in a way that is source compatible but not binary compatible. Kohsuke implements this by using ASM to modify the byte-code in a class file after compilation. This trick could be used to mimic the behavior of the @Getter/@Setter/@ToString/@EqualsAndHashCode annotations of Lombok with one caveat: generated methods would only be visible to classes external to your library but not to classes within your library. In other words, projects that depended on classes in your library as a jar would see your getters and setters, but classes within your library would not see these getters and setters at compile time.

The trick that makes Lombok special is that the code it generates is woven in before the Analyse and Generate phase of compilation. This allows classes compiled in the same run to see the generated methods. It appears another library called Juast may be using a similar trick (modifying the AST) to do things like operator overloading. For some developers, the immediate benefits of Lombok's approach may outweigh the potential risks.

Alternatives outside Java

If you're willing to switch to Scala, Lombok becomes a moot point. Scala has case classes that eliminate the getter/setter/toString/hashCode/equals boilerplate.
Scala also has Automatic Resource Management that covers Lombok's @Cleanup behavior. Another option is Groovy if you don't care about static typing. Groovy has similar support for Scala-like case classes. Groovy also officially supports compile-time AST transformations.

Final Thoughts

Project Lombok can do tricks that are impossible via other dynamic code generation methods in Java, but you should be aware that it uses some back-door APIs to accomplish them.
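To illustrate one of those other methods, the JDK's own java.lang.reflect.Proxy is a much simpler relative of what CGLib and ASM do, and it is enough to sketch runtime lazy initialization. This is an illustration only, not how Hibernate is implemented, and the names LazyList and loader are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical helper: wraps a loader in a List that is filled on first use.
final class LazyList {

  private LazyList() {}

  @SuppressWarnings("unchecked")
  static <T> List<T> of(final Callable<List<T>> loader) {
    InvocationHandler handler = new InvocationHandler() {
      private List<T> target; // stays null until the first method call

      @Override
      public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (target == null) {
          target = loader.call(); // the expensive load happens here, once
        }
        try {
          return method.invoke(target, args);
        } catch (InvocationTargetException e) {
          throw e.getCause(); // surface the real exception to the caller
        }
      }
    };
    return (List<T>) Proxy.newProxyInstance(
        LazyList.class.getClassLoader(), new Class<?>[] {List.class}, handler);
  }
}
```

The proxy changes what the existing List methods do at runtime, but it cannot add a new method that other source code could compile against. That is the gap Lombok's compile-time approach fills.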

Thursday, November 11, 2010

Project Lombok: Annotation-driven development - Part 1

As the Java language has reached an evolutionary plateau, annotations are emerging as a way to extend the language. Two interesting frameworks that I've been looking at are Project Lombok and the Checker Framework. In this first entry of a series I'm calling "Annotation-driven development", I'll be looking at Project Lombok.

Project Lombok

Project Lombok aims to eliminate boilerplate code. For example, most POJO classes are littered with trivial getters/setters like the following:
public class Person {

  private final String firstName; // read-only
  
  private String lastName;

  public Person(String firstName, String lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  public String getFirstName() { 
    return firstName;
  }

  public String getLastName() { 
    return lastName;
  }
  public void setLastName(String value) {
    this.lastName = value;
  }  

}
With Lombok, you can use annotations to create the equivalent set of methods:
@AllArgsConstructor
public class Person {
  @Getter private final String firstName; // read-only, so no @Setter
  @Getter @Setter private String lastName;
}
Or even simpler:
@Data
public class Person {
  private final String firstName;
  private String lastName;
}
Actually, the final example generates more than just getters/setters and a constructor. You also get toString, equals and hashCode methods, which are often "forgotten" because they are a pain to write correctly (unless you are using Pojomatic). Note also that the constructor generated by @Data takes only the final field firstName, not both fields. Lombok isn't just about POJO boilerplate. From the Project Lombok features page, here are all the annotations available currently:
  • @Getter / @Setter Never write public int getFoo() {return foo;} again.
  • @ToString No need to start a debugger to see your fields: Just let lombok generate a toString for you!
  • @EqualsAndHashCode Equality made easy: Generates hashCode and equals implementations from the fields of your object.
  • @NoArgsConstructor, @RequiredArgsConstructor and @AllArgsConstructor Constructors made to order: Generates constructors that take no arguments, one argument per final / non-null field, or one argument for every field.
  • @Data All together now: A shortcut for @ToString, @EqualsAndHashCode, @Getter on all fields, and @Setter on all non-final fields, and @RequiredArgsConstructor!
  • @Cleanup Automatic resource management: Call your close() methods safely with no hassle.
  • @Synchronized synchronized done right: Don't expose your locks.
  • @SneakyThrows To boldly throw checked exceptions where no one has thrown them before!
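To give a feel for the last few annotations, here is a hand-written sketch of roughly what @Synchronized and @Cleanup are meant to save you from typing. The class and method names are hypothetical, and the code Lombok actually emits may differ in its details.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class showing the boilerplate written out by hand.
class HandWrittenEquivalents {

  // A private lock object: outside code cannot synchronize on it,
  // which it could if the method simply locked on `this`.
  private final Object lock = new Object();
  private int counter;

  // Roughly the effect of annotating increment() with @Synchronized.
  int increment() {
    synchronized (lock) {
      return ++counter;
    }
  }

  // Roughly the effect of putting @Cleanup on the local variable `in`:
  // close() runs even if read() throws.
  static int firstByte(byte[] data) throws IOException {
    InputStream in = new ByteArrayInputStream(data);
    try {
      return in.read();
    } finally {
      in.close();
    }
  }
}
```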

Setting up Lombok

To use Lombok, you'll need lombok.jar. This can be obtained from the Project Lombok website. Maven users can simply add the lombok dependency and repository. For example:
<dependencies>
  <dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>0.9.3</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>projectlombok.org</id>
    <url>http://projectlombok.org/mavenrepo</url>
  </repository>
</repositories>
This will allow you to use Lombok with javac. If you're among the 99.9% of developers who use an IDE, you'll want Lombok IDE support unless you enjoy seeing code that won't compile. You're in luck if you're using Eclipse or NetBeans, as those are the only IDEs supported (sorry IntelliJ users). For Eclipse 3+ and NetBeans 6.8+, simply run java -jar lombok.jar and an install wizard will guide you through adding Lombok support to your chosen IDE(s). The wizard will modify your IDE start script to include lombok.jar as a Java agent. NetBeans 6.9 users also have the option of using Lombok as an inline annotation processor.

Using Lombok

Once you've got your environment set up to use Lombok, you simply add a Lombok annotation to your class, and your class is magically enhanced. Lombok generates code during the compilation phase. Within the IDE, you'll now have access to methods that aren't in your source file. The methods exist in the generated class so they show up in your class outline and are available for code completion. If you ask your IDE to open/jump to the method, it will open the source file but obviously you won't see the code for the method. If you want to view the generated code, you can use JAD to decompile the class or you can use the delombok tool to generate a delomboked source file from your original source file. You can run delombok manually from the command-line or automatically via the Maven Lombok plugin. Despite being a Maven user, I found it easier to use delombok from the command-line because the Maven plugin requires moving all Lomboked files to a non-standard directory. Delombok serves a few useful purposes. You may be curious to see what code Lombok is generating. Or, if you later decide you want to remove your dependency on Lombok, you can delombok all your source and replace the original source with the delomboked source. Running delombok on the previously mentioned @Data example generated the following source code:
public class Person {
        private final String firstName;
        private String lastName;

        @java.beans.ConstructorProperties({"firstName"})
        @java.lang.SuppressWarnings("all")
        public Person(final String firstName) {
                this.firstName = firstName;
        }

        @java.lang.SuppressWarnings("all")
        public String getFirstName() {
                return this.firstName;
        }

        @java.lang.SuppressWarnings("all")
        public String getLastName() {
                return this.lastName;
        }

        @java.lang.SuppressWarnings("all")
        public void setLastName(final String lastName) {
                this.lastName = lastName;
        }

        @java.lang.Override
        @java.lang.SuppressWarnings("all")
        public boolean equals(final java.lang.Object o) {
                if (o == this) return true;
                if (o == null) return false;
                if (o.getClass() != this.getClass()) return false;
                final Person other = (Person)o;
                if (this.getFirstName() == null ? other.getFirstName() != null : !this.getFirstName().equals(other.getFirstName())) return false;
                if (this.getLastName() == null ? other.getLastName() != null : !this.getLastName().equals(other.getLastName())) return false;
                return true;
        }

        @java.lang.Override
        @java.lang.SuppressWarnings("all")
        public int hashCode() {
                final int PRIME = 31;
                int result = 1;
                result = result * PRIME + (this.getFirstName() == null ? 0 : this.getFirstName().hashCode());
                result = result * PRIME + (this.getLastName() == null ? 0 : this.getLastName().hashCode());
                return result;
        }

        @java.lang.Override
        @java.lang.SuppressWarnings("all")
        public java.lang.String toString() {
                return "Person(firstName=" + this.getFirstName() + ", lastName=" + this.getLastName() + ")";
        }
}
Like any source code generator, delombok produces code that looks, well, generated. It seems that Lombok lazily adds @java.lang.SuppressWarnings("all") to all methods. The toString/hashCode/equals methods are definitely uglier than what you'd get if you were using Pojomatic.
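For comparison, a hand-written version of the same three methods can be considerably shorter. The sketch below assumes Java 7 or later for java.util.Objects, which was not yet available when this post was written. The class name CompactPerson is hypothetical, chosen only to avoid clashing with the Person class above.

```java
import java.util.Objects;

// Hypothetical hand-written counterpart to the delomboked Person class.
final class CompactPerson {
  private final String firstName;
  private String lastName;

  CompactPerson(String firstName, String lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  @Override
  public boolean equals(Object o) {
    if (o == this) return true;
    if (!(o instanceof CompactPerson)) return false; // also rejects null
    CompactPerson other = (CompactPerson) o;
    // Objects.equals is null-safe, replacing the ternaries in the generated code.
    return Objects.equals(firstName, other.firstName)
        && Objects.equals(lastName, other.lastName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(firstName, lastName);
  }

  @Override
  public String toString() {
    return "CompactPerson(firstName=" + firstName + ", lastName=" + lastName + ")";
  }
}
```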

Extending Lombok

After playing with Lombok, you will likely think of other boilerplate code you'd like to eliminate using Lombok. The Builder pattern came to my mind and apparently to others as well. There are very few resources that explain how to extend Lombok. You can obviously download the source. Besides that, the best resource I could find was a blog by Nicolas Frankel that describes the basic steps as well as example source. Nicolas states that writing custom Lombok plugins is not for the faint-hearted and I'd agree. You'll quickly discover that you need to know something about annotation processors as well as some rather low-level javac APIs. Honestly, when you look under the covers of Lombok, things get a bit scary. Some have called Lombok a hack because it relies on internal javac APIs.
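To show the kind of boilerplate the Builder idea mentioned above would remove, here is a hand-written builder for a two-field class. This is plain Java, not a Lombok extension, and every name in it (PersonWithBuilder, Builder, builder, build) is hypothetical.

```java
// Hypothetical immutable class with a hand-written builder.
final class PersonWithBuilder {
  private final String firstName;
  private final String lastName;

  private PersonWithBuilder(Builder builder) {
    this.firstName = builder.firstName;
    this.lastName = builder.lastName;
  }

  static Builder builder() {
    return new Builder();
  }

  String getFirstName() {
    return firstName;
  }

  String getLastName() {
    return lastName;
  }

  // Every field is repeated here, along with one fluent setter per field.
  static final class Builder {
    private String firstName;
    private String lastName;

    Builder firstName(String value) {
      this.firstName = value;
      return this;
    }

    Builder lastName(String value) {
      this.lastName = value;
      return this;
    }

    PersonWithBuilder build() {
      return new PersonWithBuilder(this);
    }
  }
}
```

A caller would write PersonWithBuilder.builder().firstName("Ada").lastName("Lovelace").build(). Each new field means touching the class, the builder and the constructor, which is why this pattern is a natural candidate for generation.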

Final Thoughts

Lombok is a very interesting use of annotations to extend the Java language. For those who really hate writing getters/setters/etc, Lombok is worth checking out. Although I don't enjoy writing getters/setters/etc, I don't spend a lot of my time writing those types of methods and the IDE can generate much of the boilerplate. It should be noted that Scala has eliminated many of the pain points that Lombok aims to remedy. Right now I'm just using Lombok for prototyping. It definitely speeds up the process of creating POJOs and lets me focus on the more interesting aspects of the prototype. I am not using Lombok for production code. Although I use Eclipse, other developers at Overstock use IntelliJ and the lack of support for IntelliJ is a show-stopper. Even if all IDEs were supported, I'm still not comfortable unleashing Lombok until I've spent some more time with it. I don't think I've found all its warts yet. I'm also concerned about later releases of the JDK (or Eclipse) breaking Lombok compatibility. It wouldn't be the end of the world because I can always delombok my source, but that could be a painful process for large projects. That said, it's definitely worth looking at Lombok. Even if it never makes it into my Java toolkit, it's a fascinating library to examine and my knowledge of Java has increased because of it.