Friday 10 May 2024

Kotlin iteration, toList vs asIterable

In this previous post I talked about Iterables and Sequences in Kotlin, about the eager vs deferred and lazy nature of their extension functions (this StackOverflow discussion is pretty good). As I read somewhere "Collections (that are classes that implement Iterable) contain items while Sequences produce items". What I did not mention in that post is that the fact that Sequences are not Iterables feels quite strange to me (and I see no reason for the Sequence[T] interface not to implement the Interface[T] interface, it would not interfere with using the right extension function). In Python and JavaScript (where we don't have an Iterable class or interface) an iterable is a sort of "protocol". Anything that can be iterated (we can get an iterator from it) is iterable. In Kotlin we have an Iterable interface, but there are things (Sequences) that can be iterated (both the Iterable and Sequence interfaces have an iterator() method) without implementing that Interface. I guess what is confusing for me is saying "X is iterable" cause it seems very similar to saying "X is (implements) an Iterable". Saying "X can be iterated" seems more appropriate, a Sequence can be iterated but it's not an Iterable.

Well, let's put semantic discussions aside and let's see something more concrete. We can obtain a Sequence from an Iterable by means of the asSequence() extension function or the Sequence() function. We'll do that if we want to leverage the lazy extension functions for mapping, filtering, etc... In the other direction we can "materialize" a sequence into a Collection using the toList() extension function. That way we force the sequence to "produce" all its items and we store them in a collection. Additionally we also have the Sequence[T].asIterable() extension function. When/why would we use asIterable() rather than toList()?

Let's say we have a sequence on which we will be applying some filtering/mapping, and we want those functions to be applied in one go, eagerly, as soon as one item would be needed in an ensuing iteration. BUT we don't want to do it now. We can use asIterable() to get an object that implements the Iterable interface but contains no elements (so we are deferring the execution). Then, when we first invoke the map/filter extension function (on Iterable) the whole original sequence will be iterated and processed in one go (egarly). If we use .toList() the whole iteration of the sequence occurs immediatelly, before calling map/filter to process the items. I'm using the deferred/immediate execution and lazy/eager evaluation terms as explained in this excellent post.



fun createCountriesSequence() = sequence {
    println("Before")
    yield("France")
    println("next city")
    yield("Portugal")
    println("next city")
    yield("Russia")
}

    var countriesSequence  = createCountriesSequence()
    // converting the sequence to list immediatelly iterates the sequence
    var countries = countriesSequence.toList()
    println("after converting to list")

    // Before
    // next city
    // next city
    // after converting to list

    println()

    // what's the use of calling .asIterable() rather than directly .toList()?
    // with .asIterable we obtain a deferred collection (nothing runs until we first need it)
    // but as the extension functions in Iterable are eager, first time we apply a map or whatever the whole iteration is performed 
 
    countriesSequence = createCountriesSequence()
    //converting to iterable does not perform any iteration yet
    var countriesIterable = countriesSequence.asIterable()
    println("after creating iterable")

    println("for loop iteration:") //no iteration had happenned until now
    for (country in countriesIterable) {
        println(country)
    }

    // after creating iterable
    // for loop iteration:
    // Before
    // France
    // next city
    // Portugal
    // next city
    // Russia
    
    println()

    countriesSequence  = createCountriesSequence()
    countriesIterable = countriesSequence.asIterable()
    println(countriesIterable::class)
    println("after creating iterable") //no iteration has happenned yet (deferred execution)

    //applying an Iterable extension function performs the whole iteration (eager evaluation)
    val upperCountries = countriesIterable.map({ 
        println("mapping: $it")
        it.uppercase() 
    })
    println("after mapping")
    println(upperCountries::class.simpleName)


    // after creating iterable
    // Before
    // mapping: France
    // next city
    // mapping: Portugal
    // next city
    // mapping: Russia
    // after mapping
    // ArrayList


One more note. The iterator() method of an Iterable or Sequence has to be marked as an operator, why? This is the explanation I found in the documentation and in some discussion:

The for-loop requires the iterator(), next() and hasNext() methods to be marked with operator. This is another use of the operator keyword besides overloading operators.

The key point is that there is a special syntax that will make use of these functions without a syntactically visible call. Without marking them as operator, the special syntax would not be available.

Remember than in Kotlin (as in Python) we can not define custom operators, we can just overload the predefined ones.

Saturday 4 May 2024

Kotlin Companion Objects

Some weeks ago I wrote a post reviewing how static members work in different languages. I was prompted to perform such review to better understand the motivations for Kotlin designers to come up with that odd feature called companion objects (I think the only other well-known language with such concept is Scala).

As you can read here Kotlin features object expressions (let's think of it as an advanced form of object literals) and object declarations that initially I thought were only used for declaring singletons. Well, object declarations are also used for declaring companion objects. I won't repeat here the basis explained there in the official documentation.

So, what's the point with this companion objects thing? Honestly, for the moment I don't see them as providing any revolutionary advantage over static members in Python or JavaScript (as per my previous post static members in these 2 languages seem a bit richer than those in Java and C#), but it's an interesting approach. I should also note that this nice article explains that thinking of companion objects as just a different way to implement static members is missing part of the picture and that they should be seen more like "a powerful upgrade to static nested classes". As I very rarely use inner/nested classes I have little to say about this.

At one point in this discussion someone gives a good summary of how to think about companion objects:

Think of each member of the ‘real class’, being instanced with a reference to the same singleton of a special ‘secondary class’ designed to hold data shared by all members of the ‘main class’. You are actually calling a method the secondary class with calling a companion object method, and not actually having a static method at all.

I remember having read somewhere that from a theoretical point companion objects are superior to static members because we use objects to gather together all the members belonging to the class rather than to an instance of the class, instead of throwing them "in the middle of the class" as we do with static members. That's right, and thanks to using a single object for all those elements (methods and properties) belonging to the class, that object can inherit from another object, which seems to me (and others) like the main distinctive feature of companions.

On the other side, there are a couple of things that can be seen as limitations. First, companion objects are not inherited. If I define a companion object in class A, and have a class B that inherits from A, I can not access to the companion through B (while in Python and JavaScript static members are inherited). Second, I can not access the companion through an instance of the class, only through the class (this is possible in Python, but not in JavaScript). Notice that inside a method of the companion "this" (the receiver) points to the companion itself, so we can use "this" explicitly or omit it (implicit this). We can also use just the class name. I'll add here some code showing this:


package companionObjects


open class Country(var name: String) {
    var xPos = 0
    var yPos = 0
    companion object {
        val planetName = "Earth"
        fun formatPlanet(): String {
            // KO. We have no access to the "name" property in the Country instance 
            //return "${name} - ${planetName}"
            
            // OK, "this" refers to the companion
            //return "${this.planetName}"

            //so we can just use the implicit "this" (that refers to the companion itself)    
            return "${planetName}"

            // OK, access through the class rather than through "this"
            //return "${Country.planetName}"
            
        }
    }

    fun move(x: Int, y: Int) {
        xPos += x
        yPos += y
    }
}

class ExtendedCountry(name: String): Country(name) {}


// kotlinc Main.kt -d myApp.jar
// kotlin -cp myApp.jar companionObjects.MainKt

fun main(){
    val country1 = Country("France")
    
    // KO: Unresolved reference (so I can not access the companion through an instance, only through the class)
    //println(country1.formatPlanet())
    
    println(Country.formatPlanet())
    //Earth
    
    // KO: Unresolved reference (I can not access the companion from the derived class)
    //println(ExtendedCountry.formatPlanet())
}

So all in all we can say that a companion object is like having a single static member in our class, with that member pointing to an object. The members in that object are accessible directly through the containing class (no need to write MyClass.companion.member, just MyClass.member), so it's a form of delegation.

Wednesday 24 April 2024

Python "Power Instances"

We saw in my recent post about making an instance callable (__call__()) that the implicit search for __dunder__ methods will start searching for them in the class of the instance, not in the instance itself. So this happens also with __getattribute__, __setattribute__, we can change the look-up mechanism for an object by defining a __getattribute__, __setattribute__ in its class, but defining it directly in the instance will have no effect. Something similar happens also for descriptors, they are executed (__get__(), __set__()...) when found in a class, if found in an instance they are returned as such. I showed in that post a technique for overcoming this "limitation". We create a new class, add the dunder method(s) to it, and change the class of our instance to that new class (by reassigning the __class__ attribute). We could generalize this technique, having classes that create "power-instances", instances that we can easily not only do callable, but also define a __getattribute__, add descriptors, and why not, change its inheritance mechanism. All this for an specific instance, not for the whole class (which would affect all its instances).

So I've come up with the idea of having a decorator that decorates classes, turning them into classes which instances are "power-instances" on which we can alter the __getattribute__ behaviour, do __callable__ etc. The decorator returns a new class (PowerInstance class) inheriting from the original one and with additional methods (to do it callable, for adding instance methods, for adding properties, for interception, for adding a base class...). When this class is used for creating an instance it will return an instance not of this class, but of a new class (yes, we create a new class on each instantiation), that inherits from this PowerInstance class. The different additional methods that we defined in PowerInstance will perform the actions (add the specific __call__, __getattribute__(), modify inheritance...) on this new class. This is possible because we can return an instance of a new class rather than of the expected one by defining a new __new__() method in the class. OK, let's see the implementation:


from types import MethodType

# decorator that enhances a class so that we can add functionality just to the instance, not to the whole class
# it creates a new class PowerInstance that inherits from the original class
def power_instance(cls):
    class PowerInstance(cls):
        # each time we create an instance of this class, we create an instance of new Child class 
        # (in this Child class is where we add the extra things that we can't directly add to the instance)
        def __new__(_cls, *args, **kwargs):
            # it's interesting to note what someone says in OS that returning instances of different classes 
            # in a sense breaks object orientation, cause you would expect getting instances of a same class
            class PerInstanceClass(_cls):
                pass
            return super().__new__(PerInstanceClass)
            # remember this is how a normal __new__() looks like:
                # def __new__(cls, *args, **kwargs):
                #     return super().__new__(cls)
                  
        def _add_to_instance(self, item, item_name):
            # type(self) returns me the specific class (PerInstanceClass) created for this particular instance           
            setattr(type(self), item_name, item)

        def add_instance_method(self, fn, fn_name: str):
            self._add_to_instance(fn, fn_name)

        def add_instance_property(self, prop, prop_name: str):
            self._add_to_instance(prop, prop_name)

        def do_callable(self, call_fn):
            type(self).__call__ = call_fn

        def intercept_getattribute(self, call_fn):
            type(self).__getattribute__ = call_fn

        def do_instance_inherit_from(self, cls):
            # create a new class that inherits from my current class and from the provided one
            class PerInstanceNewChild(type(self), cls):
                pass
            #NewChild.__name__ = type(self).__name__
            self.__class__ = PerInstanceNewChild
    
    # there's no functools.wraps for a class, but we can do this, so the new class has more meaningful attributes
    for attr in '__doc__', '__name__', '__qualname__', '__module__':
        setattr(PowerInstance, attr, getattr(cls, attr))
    return PowerInstance
    

And now one usage example. Notice how we enrich the p1 instance and this affects only that specific instance, if I create a new instance of that class, it's "clean" of those features.



@power_instance
class Person:
    def __init__(self, name: str):
        self.name = name

    def say_hi(self, who: str) -> str:
        return f"{self.name} says Bonjour to {who}"

print(f"Person.__name__: {Person.__name__}")
p1 = Person("Iyan")
print(p1.say_hi("Antoine"))

def say_bye(self, who: str):
    return f"{self.name} says Bye to {who}"
p1.add_instance_method(say_bye, "say_bye")
print(p1.say_bye("Marc"))

p1.add_instance_property(property(lambda self: self.name.upper()), "upper_name")
print(p1.upper_name)

p1.do_callable(lambda self: f"[[{self.name.upper()}]]")
print(p1())

# Person.__name__: Person
# Iyan says Bonjour to Antoine
# Iyan says Bye to Marc
# IYAN
# [[IYAN]]

print("------------------")

# verify that say_bye is only accessible by p1, not by p2
p2 = Person("Francois")
print(p2.say_hi("Antoine"))

try:
    print(p2.say_bye("Marc"))
except Exception as ex:
    print(f"Exception! {ex}")

try:
    print(p2())
except Exception as ex:
    print(f"Exception! {ex}")

# Francois says Bonjour to Antoine
# Exception! 'PerInstanceClass' object has no attribute 'say_bye'
# Exception! 'PerInstanceClass' object is not callable


Let's do our instance extend another class:



class Animal:
    def growl(self, who: str) -> str:
        return f"{self.name} is growling to {who}"
        
p1.do_instance_inherit_from(Animal)

print(p1.growl("aa"))
print(p1.say_hi("bb"))
print(p1.say_bye("cc"))
print(p1())
print(f"mro: {type(p1).__mro__}")
print(f"bases: {type(p1).__bases__}")

# Iyan is growling to aa
# Iyan says Bonjour to bb
# Iyan says Bye to cc
# [[IYAN]]
# mro: (.PowerInstance.do_instance_inherit_from..PerInstanceNewChild'>, .PowerInstance.__new__..PerInstanceClass'>, , , , )
# bases: (.PowerInstance.__new__..PerInstanceClass'>, )


And now some interception via __getattribute__:



# intercept attribute access in the instance
print("- Interception:")

# interceptor that does NOT use "self" in the interception code
def interceptor(instance, attr_name):
    attr = object.__getattribute__(instance, attr_name)
    if not callable(attr):
        return attr
    
    def wrapper(*args, **kwargs):
        print(f"before invoking {attr_name}")
        # attr is already a bound method (if that's the case)
        res = attr(*args, **kwargs)
        print(f"after invoking {attr_name}")
        return res
    return wrapper

p1.intercept_getattribute(interceptor)
print(p1.say_hi("Antoine"))

p3 = Person("Francois")
# interceptor that does use "self" in the interception code
def interceptor2(instance, attr_name):
    attr = object.__getattribute__(instance, attr_name)
    if not callable(attr):
        return attr
    
    def wrapper(self, *args, **kwargs):
        print(f"before invoking {attr_name} in instance: {type(self)}")
        # attr is already a bound method (if that's the case)
        res = attr(*args, **kwargs)
        print(f"after invoking {attr_name} in instance: {type(self)}")
        return res
    
    return MethodType(wrapper, instance)        

p3.intercept_getattribute(interceptor2)
print(p3.say_hi("Antoine"))

# before invoking say_hi
# after invoking say_hi
# Iyan says Bonjour to Antoine

# before invoking say_hi in instance: .PowerInstance.__new__..PerInstanceClass'>
# after invoking say_hi in instance: .PowerInstance.__new__..PerInstanceClass'>
# Francois says Bonjour to Antoine


This thing of being able to have "constructors" that return something different from the expected "new instance of the class" is not something unique to Python. In JavaScript a function used as constructor (invoked via new) can return a new object, different from the one that new has created and passed over to it as "this". In Python construction-initialization is done in 2 steps, __new__ and __init__ methods, that are invoked by the __call__ method of the metaclass, so we normally end up in: type.__call__. In this discussion you can see a rough implementation of such type.__call__


# A regular instance method of type; we use cls instead of self to
# emphasize that an instance of a metaclass is a class
def __call__(cls, *args, **kwargs):
    rv = cls.__new__(cls, *args, **kwargs)  # Because __new__ is static!
    if isinstance(rv, cls):
        rv.__init__(*args, **kwargs)
    return rv

# Note that __init__ is only invoked if __new__ actually returns an instance of the class calling __new__ in the first place.


It's interesting to note that someone says in stackoverflow that returning from a constructor an instance of class different to that of the constructor, particularly a different class each time, in a sense breaks object orientation, cause normally you expect all instances returned by a constructor to be of the same class (this expectation does not exist if what you use is a factory function).

Thursday 18 April 2024

Static Members Comparison

Companion objects is a rather surprising Kotlin feature. Being a replacement (allegedly an improvement) for the static members that we find in most other languages (Java, C#, Python), in order to grasp its advantages I've needed first to review how static members work in these other languages. That's what this post is about.

In Java static members (fields or methods) can be accessed from the class and also from instances of the class (which is not recommended because of what I'm going to explain). Static members are inherited, but static methods are not virtual, their resolution is done at compile-time, based on the compile-time type, which is something pretty important to take into account if we are going to invoke them through an instance rather than through the class (it seems to be a source of confusion, and one of the reasons why Kotlin designers decided not to add "static" to the language). If we define a same static method in a Parent class and its Child class, and we invoke it through a Parent variable pointing to a Child instance, as the resolution is done at compile time (there's no polymorphism for static members) the method being invoked will be the one in Parent rather than in Child. You can read more here.

Things are a bit different in C#. Probably aware of that problem in Java, C# designers decided to make static members only accessible from the class, not from instances. static members are inherited (you can use class Child to access a static member defined in class Parent) and you can redefine a static method (hide the inherited one with a new one) in a Child class using the new modifier.

Recent versions of JavaScript have seen the addition of static members to classes (of course remember that classes in JavaScript are just syntactic sugar, the language continues to be prototype based). They work in the same way as in C#. They can be accessed only through the class, not through instances. You have access to them using a Child class (they are inherited) and you can also redefine them in a Child class.


class Person {
    static planet = "Earth"
    
    constructor(name) {
        this.name = name;
    }

    static shout() {
        return `${this.planet} inhabitant AAAAAAAAAAAA`;
    }

}

class ExtendedPerson extends Person {

}

console.log(Person.shout())

try {
    console.log(new Person("Francois").shout());
}
catch (ex) {
    console.log(ex);
}

// inheritance of static fields/methods works OK
console.log(ExtendedPerson.shout());

//it works because of this:
console.log(Object.getPrototypeOf(ExtendedPerson) === Person);
//true

I assume static members are implemented by just setting properties in the object for that class (Person is indeed a function object), I mean: Person.shout = function(){};. Inheritance works because as you can see in the last line [[Prototype]] of a Child "class" points to the Parent.

An interesting thing is that from a static method you can (and should) access other static methods of the same class using "this". This makes pretty good sense, "this" is dynamic, it's the "receiver" and in a static method such receiver is the class itself. Using "this" rather than the class name allows a form of polymorphism, let's see:


class Person {
    static shout() {
        return "I'm shouting";
    }

    static kick() {
        return "I'm kicking";
    }

    static makeTrouble() {
        return `${this.shout()}, ${Person.kick()}`;
    }

}

class StrongPerson extends Person {
    static shout() {
        return "I'm shouting Loud";
    }
    static kick() {
        return "I'm kicking Hard";
    }    
}

console.log(Person.makeTrouble());
console.log("--------------");
console.log(StrongPerson.makeTrouble());

// I'm shouting, I'm kicking
// --------------
// I'm shouting Loud, I'm kicking


Notice how thanks to using this we end up invoking the Child.shout() method, while for kick() we are stuck in the Parent.kick()

Static/class members in Python have some particularities. In Python any attribute declared in a standard class belongs to the class. This means that for static data attributes we don't have to use any extra keyword, we just add them at the class level (rather than in the __init__() method). For static/class methods we have to use the @classmethod decorator (if it's going to call other class methods) of the @staticmethod decorator if not. When we invoke a method in an object Python uses the attribute lookup algorithm to get the function that then will be invoked. As explained here Functions are indeed data-descriptors that have a __get__ method, so when we retrieve this function via the attribute lookup the __get__ method of the descriptor is executed, creating a bound method object, bound to the instance or to the class (if the function has been decorated with classmethod) or a staticmethod object, that is not bound, if the function has been decorated with staticmethod. Based on this we have that class/static methods can be invoked both via the class or also via an instance, that they are inherited, and that the polymorphism we saw in JavaScript works also nicely in Python. Let's see some code:


class Person:
    planet = "Earth"
    
    def __init__(self, name: str):
        self.name = name

    def say_hi(self):
        return f"Bonjour, je m'appelle {self.name}"
    
    @staticmethod
    def shout():
        return "I'm shouting"

    @staticmethod   
    def kick():
        return "I'm kicking"

    @classmethod
    def makeTrouble(cls):
        return f"{cls.shout()}, {cls.kick()}"


class StrongPerson(Person):
    @staticmethod
    def shout():
        return "I'm shouting Loud"

    @staticmethod   
    def kick():
        return "I'm kicking hard"


print(Person.makeTrouble())
p1 = Person("Iyan")
print(p1.makeTrouble())

print("--------------")

# inheritance works fine, with polymorphism, both invoked through the class or through an instance
print(StrongPerson.makeTrouble())
p2 = StrongPerson("Iyan")
print(p2.makeTrouble())

# I'm shouting, I'm kicking
# I'm shouting, I'm kicking
# --------------
# I'm shouting Loud, I'm kicking hard
# I'm shouting Loud, I'm kicking hard


print(Person.planet) # Earth
print(p1.planet) # Earth

Person.planet = "New Earth"
print(Person.planet) # New Earth
print(p1.planet) # New Earth

# this assignment will set the attibute in the instance, not in the class
p1.planet = "Earth 22"
print(Person.planet) # New Earth
print(p1.planet) # Earth 22

Notice how we can read a static attribute (planet) both via the class or via an instance, but if we modify it via an instance the attribute will added to the instance rather than updated in the class.

One extra note. We know that when using dataclasses we declare the instance members at the class level (then the dataclass decorator will take care of the logic for setting them in the instance in each instantiation), so for declaring static/class attributes in our dataclasses we have to use ClassVar type annotation: cvar: ClassVar[float] = 0.5

Sunday 24 March 2024

Python Object Literals

Kotlin object expressions are pretty nice. They can be used as the object literals that we like so much in JavaScript, but furthermore you can extend other classes or implement interfaces. In Kotlin for the JVM these object expressions are instances of (anonymous) classes that the compiler creates for us.

Python does not provide syntax for object literals, but indeed it's rather easy to get something similar to what we have in Kotlin. For easily creating a "bag of attributes" we can leverage the SimpleNamespace class. To add methods to our object we have to be careful, cause if we just set an attribute to a function, when we later invoke it it will be called without the "receiver-self". We have to simulate the "bound methods" magic applied to functions declared inside a class (that are indeed descriptors that return bound-methods). We just have to use types.MethodType to bind the function to the "receiver". Of course, we also want to allow our "literal objects" to extend other classes. This turns out to be pretty easy too, given that in Python we can change the class of an existing object via the __class__ attribute (I tend to complain about Python syntax, but its dynamic features are so cool!) a feature that we combine with the sometimes disdained multiple inheritance of classes. So we'll create a new class (with types.new_class) that extends the old and the new class, and assign this class to our object.
So, less talk and more code! I ended up with a class extending SimpleNamespace with bind and extend methods. Both methods modify the object in place and return it to allow chaining.


class ObjectLiteral(SimpleNamespace):
    def bind(self, name: str, fn: Callable) -> "ObjectLiteral":
        setattr(self, name, MethodType(fn, self))
        return self
    
    def extend(self, cls, *args, **kwargs) -> "ObjectLiteral":
        parent_cls = types.new_class("Parent", (self.__class__, cls))
        self.__class__ = parent_cls
        cls.__init__(self, *args, **kwargs)
        return self

We'll use it like this:



class Formatter:
    def __init__(self, w1: str):
        self.w1 = w1
    
    def format(self, txt) -> str:
        return f"{self.w1}{txt}{self.w1}"
    

p1 = (ObjectLiteral(
        name="Xuan",
        age="50",
    )
    .extend(Formatter, "|")
    .bind("format2", lambda x, wr: f"{wr}{x.name}-{x.age}{wr}")
    .bind("say_hi", lambda x: f"Bonjour, je m'appelle {x.name} et j'ai {x.age} ans")
)

print(p1.format("Hey"))
print(p1.format2("|"))
print(p1.say_hi())
print(f"mro: {p1.__class__.__mro__}")
# let's add new attributes
p1.extra = "aaaa"
print(p1.extra)

# let's extend another class
class Calculator:
    def __init__(self, v1: int):
        self.v1 = v1
    
    def calculate(self, v2: int) -> str:
        return self.v1 * v2
    
p1.extend(Calculator, 2)
print(f"mro: {p1.__class__.__mro__}")
print(p1.format("Hey"))
print(p1.format2("|"))
print(p1.say_hi())
print(p1.calculate(4))

print(f"instance Formatter: {isinstance(p1, Formatter)}")
print(f"instance Formatter: {isinstance(p1, Calculator)}")


# |Xuan-50|
# Bonjour, je m'appelle Xuan et j'ai 50 ans
# |Hey|
# |Xuan-50|
# Bonjour, je m'appelle Xuan et j'ai 50 ans
# mro: (, , , , )
# aaaa
# mro: (, , , , , , )
# |Hey|
# |Xuan-50|
# Bonjour, je m'appelle Xuan et j'ai 50 ans
# 8
# instance Formatter: True
# instance Formatter: True

Notice how after the initial creation of our object we've continued to expand it with additional attributes, binding new methods and extending other classes.