The last couple of days I have been playing around with the Kaggle Halite environment. Quickly, I ran into issues with their requirement that any agent has to be a single python file. Initially, my code was short and easy to keep self-contained. However, after adding more and more functionality, it soon became unwieldly and unintelligible. At this point, I decided to look for alternatives. In this blog post, I describe how to hack Python's import system to embed full packages into a single python module.
In the end, the functionality was surprisingly simple to get working. There are two parts required to build a custom import logic:
For simplicity, I just implemented both interfaces in the same class:
class Bootstraper:
def find_spec(self, name, path, target=None):
from importlib.machinery import ModuleSpec
mangled_name = self.mangle("module", name)
if not mangled_name in globals():
return None
is_pkg = self.mangle("pkg", name)
return ModuleSpec(
name,
self,
is_package=globals().get(is_pkg, False),
)
def create_module(self, spec):
return None
def exec_module(self, module):
exec(
globals()[self.mangle("module", module.__name__)],
module.__dict__,
)
def mangle(self, type, name):
return type + "__" + name.replace(".", "__")
import sys
sys.meta_path.insert(0, Bootstraper())
And that all there is to it. With this bootstrapping code, module definitions
can be embedded as strings into global variables. Each variable has to have name
of the form module__{a}__{b}__{c}
, where each __
represents a .
inside the
qualified module name. To mark packages, a boolean variable with the prefix
pkg
can be used.
Using this bootstrapping logic, a simple package can be defined and used in a single file as follows (download example):
# NOTE: this snippet needs to be prefixed with above's bootstrap code
pkg__foo = True
module__foo = '''
'''
module__foo__bar = '''
def bar_func():
print("bar")
'''
module__foo__baz = '''
def baz_func():
print("baz")
'''
from foo.bar import bar_func
bar_func()
from foo.baz import baz_func
baz_func()
While it works, it is admittedly quite easy to break. For example, having a
module or package named foo
in your Python path, will break this setup in hard
to debug ways. So will I use this code? Probably not in any "real" project, but
in the context of the Kaggle challenge, it gets the job done and is easy to
adapt if new issues pop up. In case, you're looking for more battle tested
implementations with similar approaches, checkout the following packages on
PyPI:
pinliner
: is using an approach similar to the one outlined in this post, but uses a dictionary for storagestickytape
: is re-creating the package in a temporary directory inserted into the Python path