Saturday 5 February 2011

Groovy, MetaClasses and MOPs

In a recent post I said that I'm learning a bit of Groovy. Over the last few days I've continued to play around with the language, and I have to say I'm pretty delighted by this beautiful piece of wisdom.

Though I've never done any sort of compiler or interpreter programming, you could say that I'm a language freak, in the sense that I have great interest in any kind of advanced, new or unusual features and new or different paradigms that a language or framework provides. Expression Trees, lambdas, runtime code generation, Prototype-based programming, runtime weaving, MetaClasses... well, the list could go on for a while.

So, while diving into Groovy and going through the User Guide, I found plenty of interesting stuff. "Closures" (I quote them because, as I already explained here, they are not always real closures) are implemented in a very similar way to C# delegates, that is, compiler magic creates classes that inherit from Closure/MulticastDelegate. With this in mind, I soon realized that most of the beauties in the "Compile-time Metaprogramming - AST Transformations" section are just compiler magic. The compile-time metaprogramming capabilities provided by AST Transformations remind me of Boo's syntactic macros. So, though very useful and elegant, they don't seem "revolutionary" to me.

However, when getting to the Dynamic Groovy - ExpandoMetaClass section, things turned really exciting.
We have two pretty important concepts here:

  • Expando. I think I first heard this concept applied to JavaScript objects. An expando (or maybe open object) is an object that can be expanded by adding new methods and properties to it. Of course, expando objects are the basis for Prototype-based languages like JavaScript. Python objects are also expandable objects (well, in both languages normal objects are mainly dictionaries of properties and functions), and .Net has a cool ExpandoObject.


  • Metaclass. I first heard of MetaClasses years ago, in my Pythonist times. Unfortunately it's been a long while since my last adventures in Python (though it's full of nice features, I think the non-C syntax is a major drawback, at least for me), but I remember MetaClasses were an obscure concept that stayed on the Guru (or wannabe Guru) side. My understanding of Python MetaClasses was that they are basically classes that allow you to create new classes. When a new class gets created (that is, when the Python interpreter gets to a class statement), its MetaClass is used to interfere in the creation process, adding or modifying behaviour, and thus determining the behaviour of instances of that class.
    It fits quite well with the definition in Wikipedia:
    In object-oriented programming, a metaclass is a class whose instances are classes. Just as an ordinary class defines the behavior of certain objects, a metaclass defines the behavior of certain classes and their instances. Not all object-oriented programming languages support metaclasses. Among those that do, the extent to which metaclasses can override any given aspect of class behavior varies. Each language has its own metaobject protocol, a set of rules that govern how objects, classes, and metaclasses interact.
    By the way, it seems that in recent years MetaClasses have been used in some real-life scenarios, their usage in Django being broadly praised.
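
Coming back to the expando idea from the first bullet: Groovy itself ships a small groovy.util.Expando class. It isn't discussed in this post, so take the following only as a side illustration of what an "open object" looks like in Groovy; the property and method names are made up for the example.

```groovy
// A plain Expando: properties and closures can be attached at any time
def person = new Expando(name: "xana")

// Add a new property after construction
person.age = 30

// Add a "method": a closure stored as a property; the Expando becomes its delegate when called
person.sayHi = { -> delegate.name + " says Hi" }

def greeting = person.sayHi()
println greeting
println person.age
```

Unlike the MetaClass tricks discussed in the rest of the post, this only works for objects that were created as Expando instances in the first place.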



Groovy MetaClasses work at the class and instance level, because both instance objects and Class objects have a metaClass property. One note here:
In Groovy classes are first-class objects, that's why they have properties like metaClass. When we write Person.metaClass, it's not trying to access a static metaClass property in the Person class; it's doing sort of a Person.getClass() and accessing the metaClass property of the returned Class object. OK, Java Class objects do not have a metaClass property, but in Groovy they do (I'd like to have some additional information, because it's a bit confusing to me); check the Groovy JDK (don't confuse it with the Groovy API).

The main application of MetaClasses in Groovy is the ExpandoMetaClass (EMC). We can add methods (Closures) and properties to the ExpandoMetaClass of a Class object or an instance object and all instances of that class or that particular instance will get expanded with that method/property:

Person.metaClass.sayHi = {-> return delegate.name + " says Hi"};

p1 = new Person("xana");
p1.metaClass.sayBye = {-> return delegate.name + " says Bye"};
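
The two lines above assume a Person class that isn't shown. A minimal self-contained sketch could look like this; the Person class and the nickname property are my own assumptions, added only to make the snippet runnable.

```groovy
// Hypothetical Person class, just enough to run the snippet above
class Person {
    String name
    Person(String name) { this.name = name }
}

// Class level: a new method and a new property for Person instances created from now on
Person.metaClass.sayHi = { -> delegate.name + " says Hi" }
Person.metaClass.nickname = "unknown"

def p1 = new Person("xana")
def hi = p1.sayHi()
def nick = p1.nickname

// Instance level: only p1 gets sayBye
p1.metaClass.sayBye = { -> delegate.name + " says Bye" }
def bye = p1.sayBye()

println hi
println nick
println bye
```

Inside the closures, delegate is the object the method was called on, which is why delegate.name works even though the closure was written outside the class.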


Simple, right? So we have the same power that we have in JavaScript and Python. As I said above, in these languages objects are basically dictionaries, and method and property access boils down to a dictionary lookup, but in Groovy things can't be that simple. When compiled to JVM bytecode, each Groovy class gets compiled to a Java class (besides that, some extra classes are generated for Closures, categories, the main program...). Java classes and objects are closed: they can't be expanded with new properties or methods, and method invocation and property access are fixed at compile time (well, of course we have a small degree of dynamism in method invocation through method overriding and vTables, but vTables can't be modified at runtime, so there's no way to add, delete or update methods). So, how does this work? How does a method invocation end up reaching a method added to an EMC?

Well, all the magic comes from the fact that when we type a simple method call like myInstance.myMethod(); the Groovy compiler does a nice job under the covers, transforming the call into a call to ScriptBytecodeAdapter.invokeMethod, which in turn triggers a chain of method calls that get the MetaClasses involved:

ScriptBytecodeAdapter.invokeMethod(...) (static method)
InvokerHelper.invokeMethod(...) (static method)
Invoker.invokeMethod(...) (instance method called on InvokerHelper's single instance)
Invoker calls invokeMethod(...) on the MetaClass of our class (with exceptions, see below). It finds this MetaClass by looking it up in the MetaClassRegistry. The Invoker holds a single instance of this registry.
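
To get a feeling for this chain, we can call some of its links by hand. The sketch below is only an illustration under my own assumptions: the Person class is hypothetical, and the exact internal chain has changed between Groovy versions, but InvokerHelper.invokeMethod and MetaClass.invokeMethod are public entry points that should give the same result as the normal call.

```groovy
import org.codehaus.groovy.runtime.InvokerHelper

// Hypothetical class used only for this illustration
class Person {
    String name
    String greet(String other) { name + " greets " + other }
}

def p = new Person(name: "xana")
Object[] args = ["iyan"] as Object[]

// 1. The normal call, which the compiler turns into a dynamic invocation
def direct = p.greet("iyan")

// 2. Entering the chain by hand through the static helper
def viaHelper = InvokerHelper.invokeMethod(p, "greet", args)

// 3. Going straight to the MetaClass that finally resolves the method
def viaMetaClass = InvokerHelper.getMetaClass(p).invokeMethod(p, "greet", args)

println direct
println viaHelper
println viaMetaClass
```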


All this is well depicted here.
From the above comes one new mystery: the MetaClassRegistry and how it relates to my per-instance MetaClass and per-class MetaClass. Well, I think the reference to the MetaClassRegistry is misleading; with Groovy objects it's the metaClass property of the involved object that is used to find the MetaClass.
Well, I've gone through much googling, this nice pdf (but bear in mind that it's an old document and new versions of Groovy have brought some changes) and finally some debugging and source code exploration (eclipse rocks here!), and these are my findings:


All user defined Groovy classes implement the GroovyObject interface, which means that they have a metaClass property.
As I previously said, Classes also get a metaClass property.
At first, this Class.metaClass points to an instance of MetaClassImpl. (Person.metaClass)
When an instance (p1) of a Class (Person) is created, its metaClass property (p1.metaClass) will point to the same object pointed by the metaClass property of its Class (Person.metaClass).
When a method is added to a MetaClass (either the MetaClass of an instance or the MetaClass of a Class) (p1.metaClass.sayBye = {...}; or Person.metaClass.sayHi = {...};) the affected object no longer points to the old MetaClassImpl object, but to a new ExpandoMetaClass object.
If it's the Class (Person) that got the new EMC, all new instances of that Class (p2, p3) will point to that EMC, and will be affected by additions made to that EMC from the Class. However, the instances that already existed before the EMC got assigned to the Class will still point to the old MetaClassImpl object, and are not affected by the changes made to the EMC.
If it's an instance that got an EMC assigned (per-instance metaClass: p1.metaClass.sayHi = {...};), it only affects that instance, not the Class metaClass or any other new instances.
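
The findings above can be checked with a small script. This is a sketch under my own assumptions: the Person class and the responds helper are hypothetical. The case of an instance created before the class-level change is only printed, not asserted, since I'm not sure that detail holds in every Groovy version.

```groovy
// Hypothetical Person class for the experiment
class Person {
    String name
    Person(String name) { this.name = name }
}

// Hypothetical helper: true if the no-arg method can be called on the object
def responds = { obj, String method ->
    try {
        obj."$method"()
        return true
    } catch (MissingMethodException e) {
        return false
    }
}

def early = new Person("early")                        // created before any change
Person.metaClass.sayHi = { -> delegate.name + " says Hi" }
def late = new Person("late")                          // created after the class-level change
def lateSeesHi = responds(late, "sayHi")

late.metaClass.sayBye = { -> delegate.name + " says Bye" }   // per-instance change
def other = new Person("other")
def lateSeesBye = responds(late, "sayBye")
def otherSeesBye = responds(other, "sayBye")
def otherSeesHi = responds(other, "sayHi")

println "late sees sayHi: " + lateSeesHi
println "late sees sayBye: " + lateSeesBye
println "other sees sayBye: " + otherSeesBye
println "other sees sayHi: " + otherSeesHi
println "early sees sayHi: " + responds(early, "sayHi")
```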

You can check this code, which adds some other marginal cases and should confirm my conclusions.


All this usage of the metaClass property and the ScriptBytecodeAdapter is ingenious but should not impress any Python or javascript lover. What is pretty astonishing is that it works for normal Java classes!!!

String.metaClass.wrapInBrackets({->"{" + delegate + "}"});
println "xana".wrapInBrackets();


Normal Java classes (String for example) are closed and out of your control; you can't make them implement the GroovyObject interface, so what?
This is where the MetaClassRegistry comes into play. As a normal class defined in Java does not have a metaClass property, when the method invocation gets to the point where it needs the MetaClass to obtain the method, it looks it up in the MetaClassRegistry. I guess this is a Dictionary where Class objects are keys and MetaClass objects are values. In old versions MetaClasses for Java classes only worked at the class level, but in modern Groovy they also work for instances, so I guess instances are also added to that Dictionary.
Another question here is how access to the (non-existing) metaClass property works; I guess methodMissing will have something to do with it.
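
A quick way to see the registry at work is to ask it directly for the MetaClass of a Java class. The sketch below uses the property-assignment form for adding the method and GroovySystem.metaClassRegistry to reach the registry; how the registry stores things internally is not something I verify here.

```groovy
// Add a method to a plain Java class through its MetaClass
String.metaClass.wrapInBrackets = { -> "{" + delegate + "}" }

// The normal dynamic call
def wrapped = "xana".wrapInBrackets()

// Ask the registry for the MetaClass it holds for String and invoke through it by hand
def registered = GroovySystem.metaClassRegistry.getMetaClass(String)
def viaRegistry = registered.invokeMethod("xana", "wrapInBrackets", [] as Object[])

println wrapped
println viaRegistry
println registered.getClass().name
```

The last line prints the concrete MetaClass implementation the registry currently holds for String, which is handy when trying to follow the switch from MetaClassImpl to ExpandoMetaClass described above.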


Interception
This is the feature that really blew me away. Different languages provide different mechanisms for intercepting method calls and property access (I'm talking about the typical AOP thing: intercept a method call, write a log and continue the call). In Java and C# interception is confined to compile-time weaving or to creating Dynamic Proxies at runtime. In Python we can use decorators or MetaClasses to "decorate" functions at creation time. Also, in Python and in JavaScript we can go one step further and, given an existing object, patch it by traversing its list of methods and making each method reference point to a new function that contains the extra code and then invokes the original method. It's powerful, but it has a "hack" feeling, like something the language was not intended for.
That's where Groovy excels, thanks to method invocations going through the invokeMethod method.
A class can implement the GroovyInterceptable interface, and then all its method calls go through the implemented invokeMethod. This is a compile-time decision, so nothing too special.

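class MyClass implements GroovyInterceptable
{
    def hello(){ 'Hi' }

    // Every method call on an instance is routed here, even for methods that don't exist
    def invokeMethod(String name, Object args)
    {
        def result;
        // Look up the real method (null if there is none) and try to invoke it
        def metaMethod = this.metaClass.getMetaMethod(name,args);
        try
        {
            result = "success, result: " + metaMethod.invoke(this,args);
        }
        catch(ex)
        {
            // Reached when the method is missing (metaMethod is null) or the call itself throws
            result = "invocation failed";
        }
        return "intercepted $name(${args.join(', ')}) " + result;
    }
}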

more code here

The cool part is that we can make our interception decisions at runtime, adding an invokeMethod to the MetaClass of a class or the MetaClass of an instance. As we saw in the section above, method calls end up going to the invokeMethod of the corresponding MetaClass. Also bear in mind that what we saw above about methods added to a Class MetaClass not being available in previously created instances also applies here.
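
As a rough sketch of this runtime flavour (the Greeter class and the calls list are hypothetical names of mine), an invokeMethod closure can be assigned to the class MetaClass, following the same idiom the Groovy documentation uses for ExpandoMetaClass:

```groovy
// Hypothetical class with no interception code of its own
class Greeter {
    String hello() { "Hi" }
}

def calls = []   // records every intercepted method name

// Decide at runtime that all calls on Greeter instances get intercepted
Greeter.metaClass.invokeMethod = { String name, args ->
    calls << name
    def metaMethod = Greeter.metaClass.getMetaMethod(name, args)
    // Forward to the real method when there is one, otherwise return a marker string
    metaMethod ? metaMethod.invoke(delegate, args) : "no such method: " + name
}

def g = new Greeter()
def existing = g.hello()
def missing = g.goodbye()

println existing
println missing
println calls
```

Note that the class itself never declared GroovyInterceptable; the decision to intercept was taken entirely from the outside.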

I've got a sample here.

I think something similar in terms of adding interception at runtime can be achieved in Python by means of __getattribute__. This sample defines it inside the class declaration, but I think it should also work when added later dynamically.
One more feature that fairly amazed me is that in those rare cases where we're typing parameters, Groovy supports Multiple Dispatch!!!
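
A tiny sketch of that multiple dispatch, with made-up class names: the overload is picked from the runtime type of the argument, not from the declared type of the variable.

```groovy
// Hypothetical hierarchy and overloaded methods
class Animal {}
class Dog extends Animal {}

class Vet {
    String treat(Animal a) { "generic treatment" }
    String treat(Dog d) { "dog treatment" }
}

// Declared as Animal, but the object is a Dog at runtime
Animal pet = new Dog()
def chosen = new Vet().treat(pet)

println chosen
```

In Java the same code would pick treat(Animal), because overload resolution there happens at compile time using the declared type.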

Categories
Another cute feature that caught my attention is Categories. They're similar to C#'s Extension methods, but more powerful, because their effect is limited to a block (use...) and they don't have the compile-time type limitation of the C# compiler. I mean, if we type a variable "p" as Parent, but at runtime it points to a Child object, and the Category was created to work with Child, it will work fine with our "p" variable (something that the C# compiler would not allow).
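
The Parent/Child case can be sketched like this; all class names here are made up for the illustration. The category method is declared for Child, the variable is typed as Parent, and the extra method only exists inside the use block.

```groovy
// Hypothetical hierarchy
class Parent {}
class Child extends Parent { String name = "kid" }

// A category: static methods whose first parameter is the type being extended
class ChildCategory {
    static String shout(Child self) { self.name.toUpperCase() + "!" }
}

Parent p = new Child()   // declared as Parent, Child at runtime

// Inside the block the category method is available on p
def inside = use(ChildCategory) { p.shout() }

// Outside the block it is gone again
def outside
try {
    p.shout()
    outside = "worked"
} catch (MissingMethodException e) {
    outside = "missing"
}

println inside
println outside
```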

MOP
Metaclasses are also related to the broader concept of MetaObject Protocol.

The metaobject protocol approach ... is based on the idea that one can and should open languages up, allowing users to adjust the design and implementation to suit their particular needs. In other words, users are encouraged to participate in the language design process.

Well, the text above sounds impressive and I guess it's more ambitious than what we find in Groovy (for example, a MOP could allow us to modify how inheritance or exception handling work, or how object comparison is done). Anyway, Groovy's MOP seems pretty powerful to me, because it allows us to do all the nice things explained in the previous paragraphs.
Groovy's MOP mainly draws upon the existence of hooks/extension points:
taken from here

Groovy's MOP system includes some extension/hooks points that you can use to change how a class or an object behaves, mainly being:
getProperty/setProperty: control property access
invokeMethod: controls method invocation, you can use it to tweak parameters of existing methods or intercept not-yet existing ones
methodMissing: the preferred way to intercept non-existing methods
propertyMissing: also the preferred way to intercept non-existing properties
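
The two "missing" hooks from the list above can be tried with a minimal sketch; the Repo class and the names used in the calls are hypothetical.

```groovy
// Hypothetical class that answers calls and properties it never declared
class Repo {
    // Called only when no real method matches
    def methodMissing(String name, args) {
        "handled " + name + "(" + args.join(", ") + ")"
    }

    // Called only when no real property matches
    def propertyMissing(String name) {
        "no property named " + name
    }
}

def r = new Repo()
def methodResult = r.findByName("xana")
def propertyResult = r.color

println methodResult
println propertyResult
```

Unlike invokeMethod on a GroovyInterceptable class, these hooks stay out of the way for methods and properties that really exist.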


I've found a book about the DLR that seems to explain the MOP in our beloved "dynamic" C# 4. Implementing IDynamicMetaObjectProvider to return a DynamicMetaObject... seems like pretty interesting stuff.

Contrary to Java, Groovy is evolving very fast, and its designers have really great plans for the next versions. There are 2 that especially caught my eye:

  • New Meta-Object Protocol

  • ability to pass expression trees / AST nodes as parameters (see C# 4's own expression tree)



Final musings
After this long post praising Groovy it's easy to figure out that I'd love to see an implementation of Groovy for .Net (IronGroovy?) but the question is, would it be worth the effort?
Why my doubts? Well, dynamic languages like Boo, the Iron ones (IronPython, IronRuby) and many incomplete implementations of other languages don't seem to have caught on in the .Net ecosystem with the same strength as on the Java Platform, and I think there's a good reason for this. Java (the language) seems like a stagnant language lacking many of the expressive features that many coders have discovered thanks to JavaScript, Python, Ruby... so many people programming for the Java Platform are more than receptive to alternatives. I think there's little doubt that C# is rather more expressive than Java. (It's not just my taste: Jim Hugunin, that Java, .Net and Python genius, also thinks so.)
I will suffer some pain when I have to write code in Java now that I've learned to love the elegance of C#.

So what I think happens is that many .Net programmers do not feel the urge to embrace a new language because C# serves them well (and given its fast-paced evolution in these years, many goodies can be expected in the next versions) :-)

For those concerned about the performance of dynamic languages, and leaving benchmarks aside, I'll say: Who cares about performance when you can write such stylish code :-D
