
2024.07.08

Proxy Objects

The Python internals that make proxy objects possible

Introduction

If you’re a Python developer, a pattern that you’ve probably come across, whether or not you’re aware of it, is the “proxy object”: an object which (doing what it says on the tin) “proxies” calls to another object. That is to say, calling any method or accessing any attribute on FooProxy passes the call or attribute access through to the underlying Foo object in some way that makes sense for its usage: FooProxy.doStuff() does more or less the same thing as Foo.doStuff().

It’s a powerful construct that enables things like database connection pooling & connection management, filesystem abstraction (writing to a local file, S3 or another destination can become transparent to the developer) and many others. You may wish to (as SQLAlchemy does) proxy the list object. This proxy can be treated by the programmer like a list, but when acted on it transparently changes database tables. The programmer doesn’t need to add that logic explicitly, and the object can be passed to a library function that expects a list, because duck typing (“if it quacks like a duck …”) rather than static type checking is a feature of the Python language.

It’s all well and good to accept that fact, but as with so many other things, we can gain a deeper understanding of how to use proxy objects by asking what exactly they are and how they work.

Some Python Review

Object

In Python, everything derives from object. That is: everything is (eventually) an inherited subclass of object. Complex classes are object, integers are object, True is object, functions are object, modules are object, types are object, None is object, everything is object.

>>> import os
>>> isinstance({}, object)
True
>>> isinstance(1, object)
True
>>> isinstance(True, object)
True
>>> isinstance(isinstance, object)
True
>>> isinstance(os, object)
True
>>> isinstance(type, object)
True
>>> isinstance(None, object)
True

This ends up being more than an implementation detail. We can get pretty far by treating it as one, and we can build entire decades-long careers working in Python without ever needing to think about this fact too hard, but when we do start to examine this property some interesting details emerge.

One question that arises from this examination: if everything is object, what then is the fundamental difference between an integer and a module? 1 + 1 is a sensible expression, and Python will return a sensible answer. urllib + datetime is nonsense, and Python will raise a TypeError if it’s encountered. Both, though, are expressions of the form object + object. Both the concept of int and the concept of module are abstractions provided to us, the programmers, by the developers of the Python language. The only real difference between an int object and a module object is which extra methods and attributes each implements.

Special Methods

Many classes implement methods that are meant to be called directly, such as datetime.now() or str.strip(), and these are the kinds of methods we’re most familiar with. The existence of certain other method definitions is how Python knows to treat certain types of objects differently. Our ability to write code like 1 + 1 comes from a number of special methods (known in Python parlance as “dunder” methods, because their names are surrounded by two underscore characters on each side).

Python lacks the concept of a private method that many other object-oriented languages strictly enforce, but as a matter of convention a leading underscore character indicates “you probably shouldn’t use this” and dunder methods are a convention for “really, though, this is a fundamental class property, it’s not meant for general usage.” Python does not, however, enforce this and it remains a decision by the programmer to ignore at their own peril or amusement.

>>> import datetime
>>> type(datetime)
<class 'module'>
>>>
>>> dir(datetime)
['MAXYEAR', 'MINYEAR', 'UTC', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'date', 'datetime', 'datetime_CAPI', 'time', 'timedelta', 'timezone', 'tzinfo']
>>> type(1)
<class 'int'>
>>>
>>> dir(1)
['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__getstate__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'as_integer_ratio', 'bit_count', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'is_integer', 'numerator', 'real', 'to_bytes']

Here, for instance, modules are objects that have (among other things) __loader__ and __spec__, which Python’s import machinery uses to load a module into the global namespace, and numbers are objects that implement all your standard numeric operations (__add__ etc.).

This leaves open the question: “Why does 1 + 1 work but urllib + datetime does not? How does Python know that I can use + on numbers but not on modules?” And the answer, simply, is: it doesn’t!

Operators

The Python interpreter provides us with a number of shorthands for our convenience which in actuality just call methods of their objects. The + character is interpreted by Python as “call the __add__ method of the left-hand object, using the right-hand object as its argument.” myvar + 1 means roughly the same thing as myvar.__add__(1), and the latter is essentially how Python operates on the two objects.
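To see why the module case fails, here is a small sketch that checks which types define __add__ and then tries the addition anyway. The helper name supports_add is made up for this illustration.

```python
import datetime
import urllib


def supports_add(obj):
    """Return True if obj's type defines __add__ (hypothetical helper)."""
    return hasattr(type(obj), "__add__")


int_can_add = supports_add(1)            # int defines __add__
module_can_add = supports_add(datetime)  # the module type does not

try:
    urllib + datetime
except TypeError as exc:
    # Python only discovers the problem when it tries the operation
    add_error = exc
```

Nothing marks modules as “not addable” ahead of time. Python simply looks for the method, doesn’t find it, and raises TypeError.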

We can even implement these methods ourselves with the confidence that Python will treat it the way we expect:

>>> class Foo:
...     def __init__(self, value):
...         self.value = value
...     def __add__(self, other):
...         return self.value + other
...
>>> foo = Foo(1)
>>> foo + 1
2
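One caveat worth sketching: with only __add__ defined, foo + 1 works but 1 + foo does not, because the left-hand int doesn’t know what to do with a Foo. When the left operand gives up, Python tries the right operand’s __radd__ (you can spot it in the dir(1) listing above). The version of Foo below is an illustrative extension of the example, not anything from a library.

```python
class Foo:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Handles foo + other
        return self.value + other

    def __radd__(self, other):
        # Handles other + foo, tried after int.__add__ returns NotImplemented
        return other + self.value


foo = Foo(1)
left = foo + 1   # dispatches to Foo.__add__
right = 1 + foo  # int.__add__ gives up, so Python tries Foo.__radd__
```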

Class inheritance means we can override some methods to give us an object that can be interchangeable but with different behavior:

>>> class AppendInt(int):
...     def __add__(self, other):
...         return int(str(self) + str(other))
...
>>> foo = AppendInt(1)
>>> foo + 2
12

Attributes

One important method inherited all the way down from object itself is __getattribute__ (along with its optional companion, __getattr__). This is the method that Python uses to implement the . operator, the operator used to access the attributes & methods of an object.

Every time you call someobject.somemethod(), Python is effectively calling someobject.__getattribute__('somemethod') behind the scenes to get the function to call.
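As a quick sanity check of that claim, the sketch below spells the same lookup three ways: with the dot, with the built-in getattr(), and by calling __getattribute__ directly on the type. The variable names are just for illustration.

```python
text = "proxy"

# Three spellings of the same attribute lookup
dotted = text.upper
builtin = getattr(text, "upper")
explicit = type(text).__getattribute__(text, "upper")

# Each one is the bound method, so each can be called the same way
results = [dotted(), builtin(), explicit()]
```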

Because __getattribute__ is just another method that every class inherits from its base, object, it can be overridden like every other method in the inheritance hierarchy.

A tangential note: overriding this method makes it extremely easy to leave a class in a state where attributes cannot be accessed at all, or where accessing any attribute leads to infinite recursion, because every attribute access inside the override (even self.something) goes back through __getattribute__ itself. The related __getattr__ method is much safer to override, since Python only calls it when the normal lookup fails.
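A minimal sketch of that pitfall, using two made-up classes: Recursive overrides __getattribute__ naively and loops forever, while Safe does the same job in __getattr__ and works.

```python
class Recursive:
    def __init__(self):
        self.data = {"x": 1}

    def __getattribute__(self, name):
        # BUG: looking up self.data calls __getattribute__ again, forever
        return self.data[name]


class Safe:
    def __init__(self):
        self.data = {"x": 1}

    def __getattr__(self, name):
        # Only called when normal lookup fails, so self.data resolves normally
        try:
            return self.data[name]
        except KeyError:
            raise AttributeError(name) from None


try:
    Recursive().x
except RecursionError:
    recursed = True
else:
    recursed = False

safe_x = Safe().x
```

If you really do need __getattribute__, the usual escape hatch is to fetch attributes inside it with object.__getattribute__(self, name) rather than with the dot.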

Overriding __getattribute__ or __getattr__ means that a class can implement methods & attributes that it doesn’t even know about. A class can return default methods, so that no matter which attribute you access something will be returned; it can log accesses to unknown attributes (for debugging or telemetry); or it can translate snake_case to camelCase if that is the programming style you choose to enforce.

Proxy Objects

A proxy object is an object which, for various reasons, encapsulates an object & provides a transparent interface to the thing it’s meant to proxy.

Proxy objects can be implemented in several ways, but they typically follow the pattern of keeping a reference to an instance of their underlying class, overriding whichever methods are needed for the side effect, and finally overriding __getattr__ with a method that forwards to the referent.
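That pattern can be sketched in a few lines. LoggingProxy is a made-up class for illustration, not taken from any library: it keeps a reference to a list, overrides append for a side effect (recording the call), and forwards everything else.

```python
class LoggingProxy:
    """Forwards attribute access to a wrapped object, logging appends."""

    def __init__(self, target):
        self._target = target  # reference to the underlying instance
        self.log = []

    def append(self, item):
        # Overridden for its side effect, then delegated to the referent
        self.log.append(("append", item))
        return self._target.append(item)

    def __getattr__(self, name):
        # Anything not defined on the proxy falls through to the referent
        return getattr(self._target, name)


items = [3, 1, 2]
proxy = LoggingProxy(items)
proxy.append(0)         # handled by the proxy and recorded in proxy.log
proxy.sort()            # not defined on the proxy, forwarded to the list
count = proxy.count(1)  # also forwarded
```

One limitation of this sketch: operators and built-ins such as len(proxy) look up special methods on the type and skip __getattr__, so a proxy that has to pass for a real list also needs to define methods like __len__ and __getitem__ explicitly.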

An example we use at Crowdalert is Flask-SQLAlchemy. SQLAlchemy is extremely powerful and does many things, but its concepts of session & connection handling don’t fit 100% with Flask’s concepts of request & application contexts. Flask-SQLAlchemy abstracts those details away so the API developer doesn’t need to handle session logic, and it accomplishes this, as expected, by overriding __getattr__ with a function that returns the attribute of its underlying connection (ref).

Proxy objects are a foundational pattern in Python that makes our lives easier, and that drive to make our lives less exhausting through technology is at the heart of Crowdalert’s mission. To stretch a metaphor, Crowdalert is something of a conversation proxy object. Instead of the security team manually asking involved parties for information for every event, Crowdalert will gather that information for you by asking the right human at the right time. If that’s the kind of proxy that would make your life easier, sign up below.
By
John Sonnenschein

Last Updated 2024.07.08