Learn all about weak references in Python: reference counting, garbage collection, and practical uses of the weakref module
Chances are that you have never touched, and maybe haven't even heard of, Python's weakref module. While it might not be commonly used in your code, it's fundamental to the inner workings of many libraries, frameworks and even Python itself. So, in this article we'll explore what it is, how it's helpful, and how you could incorporate it into your code as well.
To understand the weakref module and weak references, we first need a little intro to garbage collection in Python.
Python uses reference counting as a mechanism for garbage collection. In simple terms, Python keeps a reference count for each object we create; the count is incremented whenever the object is referenced in code, and decremented when the object is de-referenced (e.g. a variable is set to None). If the reference count ever drops to zero, the memory for the object is deallocated (garbage-collected).
Let's look at some code to understand it a bit more:
import sys

class SomeObject:
    def __del__(self):
        print(f"(Deleting {self=})")

obj = SomeObject()
print(sys.getrefcount(obj))  # 2
obj2 = obj
print(sys.getrefcount(obj))  # 3
obj = None
obj2 = None
# (Deleting self=<__main__.SomeObject object at 0x7d303fee7e80>)
Here we define a class that only implements a __del__ method, which is called when the object is garbage-collected (GC'ed). We do this so that we can see when the garbage collection happens.
After creating an instance of this class, we use sys.getrefcount to get the current number of references to this object. We'd expect to get 1 here, but the count returned by getrefcount is generally one higher than you might expect. That's because when we call getrefcount, the reference is copied by value into the function's argument, temporarily bumping up the object's reference count.
Next, if we declare obj2 = obj and call getrefcount again, we get 3 because the object is now referenced by both obj and obj2. Conversely, if we assign None to these variables, the reference count will decrease to zero, and we'll get the message from the __del__ method telling us that the object got garbage-collected.
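As a small sketch of the same idea, the snippet below uses a hypothetical Payload class (not part of the original example) to show that any strong reference is counted, including one held by a container. It compares only differences between counts, since the absolute numbers reported by sys.getrefcount can vary between CPython versions.

```python
import sys

class Payload:
    """Hypothetical stand-in class used only for this illustration."""

item = Payload()
baseline = sys.getrefcount(item)

holder = [item]                      # the list now holds a strong reference
with_list = sys.getrefcount(item)

holder.clear()                       # emptying the list releases that reference
after_clear = sys.getrefcount(item)

print(with_list - baseline)    # 1
print(after_clear - baseline)  # 0
```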
So how do weak references fit into this? If the only remaining references to an object are weak references, then the Python interpreter is free to garbage-collect this object. In other words, a weak reference to an object is not enough to keep the object alive:
import weakref

obj = SomeObject()
reference = weakref.ref(obj)
print(reference)  # <weakref at 0x734b0a514590; to 'SomeObject' at 0x734b0a4e7700>
print(reference())  # <__main__.SomeObject object at 0x707038c0b700>
print(obj.__weakref__)  # <weakref at 0x734b0a514590; to 'SomeObject' at 0x734b0a4e7700>
print(sys.getrefcount(obj))  # 2
obj = None
# (Deleting self=<__main__.SomeObject object at 0x70744d42b700>)
print(reference)  # <weakref at 0x7988e2d70590; dead>
print(reference())  # None
Here we again declare a variable obj of our class, but this time instead of creating a second strong reference to this object, we create a weak reference in the reference variable.
If we then check the reference count, we can see that it didn't increase, and if we set the obj variable to None, we can see that the object immediately gets garbage-collected even though the weak reference still exists.
Finally, if we try to access the weak reference to the already garbage-collected object, we get a "dead" reference and None respectively.
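weakref.ref also accepts an optional callback that runs when the referent is about to be finalized. The sketch below illustrates this with a hypothetical Resource class and an on_collect function, both made up for this example. The gc.collect() call is not required on CPython and is only there for implementations without immediate reference counting.

```python
import gc
import weakref

class Resource:
    """Hypothetical class; instances of ordinary classes can be weakly referenced."""

events = []

def on_collect(ref):
    # The callback receives the weak reference object itself;
    # by this point the referent is no longer available.
    events.append(ref() is None)

res = Resource()
ref = weakref.ref(res, on_collect)
print(ref() is res)  # True

res = None    # drop the only strong reference
gc.collect()  # harmless on CPython, helps on other implementations

print(ref())   # None
print(events)  # [True]
```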
Also notice that when we used the weak reference to access the object, we had to call it as a function (reference()) to retrieve the object. Therefore, it's often more convenient to use a proxy instead, especially if you need to access object attributes:
obj = SomeObject()
reference = weakref.proxy(obj)
print(reference)  # <__main__.SomeObject object at 0x78a420e6b700>
obj.attr = 1
print(reference.attr)  # 1
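One thing to keep in mind with proxies: once the referent is gone, using the proxy raises ReferenceError instead of returning None. A minimal sketch, assuming a hypothetical Config class that is not part of the original example:

```python
import gc
import weakref

class Config:
    """Hypothetical class used only for this illustration."""
    def __init__(self):
        self.attr = 1

cfg = Config()
proxy = weakref.proxy(cfg)
print(proxy.attr)  # 1

cfg = None    # drop the only strong reference
gc.collect()  # not needed on CPython, harmless elsewhere

try:
    proxy.attr
    outcome = "still alive"
except ReferenceError:
    outcome = "referent is gone"

print(outcome)  # referent is gone
```

So a proxy trades the explicit None check of weakref.ref for an exception that has to be handled wherever the proxy might outlive its referent.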
Now that we know how weak references work, let's look at some examples of how they could be useful.
A common use-case for weak references is tree-like data structures:
class Node:
    def __init__(self, value):
        self.value = value
        self._parent = None
        self.children = []

    def __repr__(self):
        return "Node({!r:})".format(self.value)

    @property
    def parent(self):
        return self._parent if self._parent is None else self._parent()

    @parent.setter
    def parent(self, node):
        self._parent = weakref.ref(node)

    def add_child(self, child):
        self.children.append(child)
        child.parent = self

root = Node("parent")
n = Node("child")
root.add_child(n)
print(n.parent)  # Node('parent')
del root
print(n.parent)  # None
Here we implement a tree using a Node class where child nodes have a weak reference to their parent. In this relation, the child Node can live without the parent Node, which allows the parent to be silently removed/garbage-collected.
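To see why the weak back-reference matters, the sketch below compares two simplified, hypothetical node classes (StrongNode and WeakNode, which are not from the original example). With a strong parent link, parent and child form a reference cycle that reference counting alone cannot free; with a weak link there is no cycle. This assumes CPython's reference counting, and the cycle collector is switched off temporarily to make the difference visible.

```python
import gc
import weakref

class StrongNode:
    def __init__(self):
        self.parent = None
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        child.parent = self  # strong back-reference: parent <-> child cycle

class WeakNode:
    def __init__(self):
        self._parent = None
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        child._parent = weakref.ref(self)  # weak back-reference: no cycle

def freed_without_cycle_gc(node_cls):
    """Return True if the parent is freed by reference counting alone."""
    gc.disable()  # keep the cycle collector out of the picture
    try:
        root, child = node_cls(), node_cls()
        root.add_child(child)
        probe = weakref.ref(root)
        del root, child
        return probe() is None
    finally:
        gc.enable()
        gc.collect()  # clean up any cycle left behind

print(freed_without_cycle_gc(StrongNode))  # False
print(freed_without_cycle_gc(WeakNode))    # True
```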
Alternatively, we can flip this around:
class Node:
    def __init__(self, value):
        self.value = value
        self._children = weakref.WeakValueDictionary()

    # Needed so that the printed output below shows Node('...')
    def __repr__(self):
        return "Node({!r:})".format(self.value)

    @property
    def children(self):
        return list(self._children.items())

    def add_child(self, key, child):
        self._children[key] = child

root = Node("parent")
n1 = Node("child one")
n2 = Node("child two")
root.add_child("n1", n1)
root.add_child("n2", n2)
print(root.children)  # [('n1', Node('child one')), ('n2', Node('child two'))]
del n1
print(root.children)  # [('n2', Node('child two'))]
Here instead, the parent keeps a dictionary of weak references to its children. This uses WeakValueDictionary: whenever a value referenced from the dictionary loses its last strong reference elsewhere in the program, its entry is automatically removed from the dictionary too, so we don't have to manage the lifecycle of dictionary items.
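The automatic removal described above can be observed in isolation. The following is a minimal sketch with a hypothetical Asset class and registry dictionary, both invented for this illustration:

```python
import gc
import weakref

class Asset:
    """Hypothetical class used only for this illustration."""
    def __init__(self, label):
        self.label = label

registry = weakref.WeakValueDictionary()
logo = Asset("logo")
icon = Asset("icon")
registry["logo"] = logo
registry["icon"] = icon
print(sorted(registry))  # ['icon', 'logo']

del icon      # the last strong reference to this value is gone
gc.collect()  # not needed on CPython, harmless elsewhere

print(sorted(registry))      # ['logo']
print(registry.get("icon"))  # None

# A value fetched from the dictionary is an ordinary strong reference.
print(registry["logo"] is logo)  # True
```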
Another use of weakref is in the Observer design pattern:
class Observable:
    def __init__(self):
        self._observers = weakref.WeakSet()

    def register_observer(self, obs):
        self._observers.add(obs)

    def notify_observers(self, *args, **kwargs):
        for obs in self._observers:
            obs.notify(self, *args, **kwargs)

class Observer:
    def __init__(self, observable):
        observable.register_observer(self)

    def notify(self, observable, *args, **kwargs):
        print("Got", args, kwargs, "From", observable)

subject = Observable()
observer = Observer(subject)
subject.notify_observers("test", kw="python")
# Got ('test',) {'kw': 'python'} From <__main__.Observable object at 0x757957b892d0>
The Observable class keeps weak references to its observers, because it doesn't care if they get removed. As with the previous examples, this avoids having to manage the lifecycle of dependent objects. As you probably noticed, in this example we used WeakSet, which is another class from the weakref module. It behaves much like WeakValueDictionary, but it is a set: its elements are dropped automatically once nothing else references them.
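To make the "doesn't care if they get removed" part concrete, here is a minimal sketch in which one observer is deleted before a notification is sent. The Recorder class and the shared log list are hypothetical helpers for this illustration, and the Observable is a simplified variant of the one above.

```python
import gc
import weakref

class Observable:
    def __init__(self):
        self._observers = weakref.WeakSet()

    def register_observer(self, obs):
        self._observers.add(obs)

    def notify_observers(self, message):
        # Iterate over a snapshot so the loop holds strong references.
        for obs in list(self._observers):
            obs.notify(message)

class Recorder:
    """Hypothetical observer that stores what it receives in a shared list."""
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def notify(self, message):
        self.log.append((self.name, message))

log = []
subject = Observable()
first = Recorder("first", log)
second = Recorder("second", log)
subject.register_observer(first)
subject.register_observer(second)

del second    # nothing else references this observer
gc.collect()  # not needed on CPython, harmless elsewhere

subject.notify_observers("ping")
print(log)  # [('first', 'ping')]
```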
The final example for this section is borrowed from the weakref docs:
import tempfile, shutil
from pathlib import Path

class TempDir:
    def __init__(self):
        self.name = tempfile.mkdtemp()
        self._finalizer = weakref.finalize(self, shutil.rmtree, self.name)

    def __repr__(self):
        return "TempDir({!r:})".format(self.name)

    def remove(self):
        self._finalizer()

    @property
    def removed(self):
        return not self._finalizer.alive

tmp = TempDir()
print(tmp)  # TempDir('/tmp/tmp8o0aecl3')
print(tmp.removed)  # False
print(Path(tmp.name).is_dir())  # True
This showcases one more feature of the weakref module, which is weakref.finalize. As the name suggests, it allows executing a finalizer function/callback when the dependent object is garbage-collected. In this case we implement a TempDir class which can be used to create a temporary directory. In the ideal case we'd always remember to clean up the TempDir when we don't need it anymore, but if we forget, we have the finalizer that will automatically run rmtree on the directory when the TempDir object is GC'ed, which includes when the program exits completely.
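A finalizer runs at most once, whether it is called explicitly or triggered by garbage collection. The sketch below shows this with a hypothetical Handle class and cleanup function, both made up for this illustration, so that nothing touches the file system.

```python
import weakref

class Handle:
    """Hypothetical owner object used only for this illustration."""

calls = []

def cleanup(label):
    calls.append(label)
    return "cleaned " + label

handle = Handle()
finalizer = weakref.finalize(handle, cleanup, "handle-1")

print(finalizer.alive)  # True
print(finalizer())      # cleaned handle-1
print(finalizer.alive)  # False
print(finalizer())      # None
del handle              # the finalizer is already dead, so nothing runs again
print(calls)            # ['handle-1']
```

Note that the callback and its arguments should not hold a reference to the object itself (here, cleanup only receives a string). Otherwise the object would be kept alive by its own finalizer and never collected.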
The previous section has shown a couple of practical usages for weakref, but let's also take a look at real-world examples, one of them being creating a cached instance:
import logging

a = logging.getLogger("first")
b = logging.getLogger("second")
print(a is b)  # False

c = logging.getLogger("first")
print(a is c)  # True
The above is basic usage of Python's builtin logging module. We can see that it associates only a single logger instance with a given name, meaning that when we retrieve the same logger multiple times, it always returns the same cached logger instance.
If we wanted to implement this, it could look something like this:
class Logger:
    def __init__(self, name):
        self.name = name

_logger_cache = weakref.WeakValueDictionary()

def get_logger(name):
    if name not in _logger_cache:
        l = Logger(name)
        _logger_cache[name] = l
    else:
        l = _logger_cache[name]
    return l

a = get_logger("first")
b = get_logger("second")
print(a is b)  # False
c = get_logger("first")
print(a is c)  # True
And finally, Python itself uses weak references, e.g. in the implementation of OrderedDict:
from _weakref import proxy as _proxy

class OrderedDict(dict):
    def __new__(cls, /, *args, **kwds):
        self = dict.__new__(cls)
        self.__hardroot = _Link()
        self.__root = root = _proxy(self.__hardroot)
        root.prev = root.next = root
        self.__map = {}
        return self
The above is a snippet from CPython's collections module. Here, weakref.proxy is used to prevent circular references (see the doc-strings for more details).
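The same idea can be sketched on a much smaller scale. The following is not how OrderedDict is implemented; it is a simplified, hypothetical doubly linked list (the Link class and link function are made up for this illustration) in which forward links are strong and backward links are weak, so neighbouring nodes never form a reference cycle.

```python
import gc
import weakref

class Link:
    """Hypothetical list node: strong `next`, weak `prev`."""
    def __init__(self, value):
        self.value = value
        self.next = None
        self._prev = None

    @property
    def prev(self):
        return None if self._prev is None else self._prev()

def link(first, second):
    first.next = second                # forward link keeps `second` alive
    second._prev = weakref.ref(first)  # backward link does not keep `first` alive

head, middle, tail = Link("a"), Link("b"), Link("c")
link(head, middle)
link(middle, tail)

print(tail.prev.value)       # b
print(tail.prev.prev.value)  # a

probe = weakref.ref(head)
del head      # only the weak back-link from `middle` still points at it
gc.collect()  # not needed on CPython, harmless elsewhere

print(probe() is None)  # True
print(middle.prev)      # None
```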
weakref is a fairly obscure, but at times very useful tool that you should keep in your toolbox. It can be very helpful when implementing caches or data structures that have reference loops in them, such as doubly linked lists.
With that said, one should be aware of the limits of weakref support: everything said here and in the docs is CPython specific, and different Python implementations can have different weakref behavior. Also, many of the builtin types don't support weak references, such as list, tuple or int.
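Which objects can be weakly referenced is easy to probe. The sketch below uses a hypothetical supports_weakref helper and a few made-up classes; the results shown are for CPython. It also illustrates two related details: a subclass of list does support weak references, and a class that defines __slots__ only supports them if '__weakref__' is one of the slots.

```python
import weakref

def supports_weakref(obj):
    """Return True if a weak reference to `obj` can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True

class MyList(list):
    """Subclassing list adds weak reference support."""

class Slotted:
    __slots__ = ("x",)  # no '__weakref__' slot, so no weak references

class SlottedWeak:
    __slots__ = ("x", "__weakref__")

print(supports_weakref([1, 2]))         # False
print(supports_weakref(42))             # False
print(supports_weakref(MyList()))       # True
print(supports_weakref(Slotted()))      # False
print(supports_weakref(SlottedWeak()))  # True
```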
This article was originally posted at martinheinz.dev