Leo's Technical Blog

The Jython Import Logic

Leo Soto


python, jython, java

Posted by Leo Soto on .

Motivation

I think the coolest feature of Jython is its seamless integration with Java. Let's say you have the following Java class:

  
package com.leosoto;  
public class HelloWorld {  
    public void hello() {
        System.out.println("Hello World");
    }
    public void hello(String name) {
        System.out.printf("Hello %s!", name);
    }
}

If the class is on the classpath when you start Jython, using it from Python code is straightforward:

  
>>> from com.leosoto import HelloWorld
>>> h = HelloWorld()
>>> h.hello()
Hello World  
>>> h.hello("joe")
Hello joe!  

Now, did you know that if the class is not on the classpath, you can package it in a jar and the following also works:

>>> import sys 
>>> sys.path.append('/path/to/helloworld.jar')
>>> from com.leosoto import HelloWorld

Until yesterday, I didn't know!

Since part of my GSoC project is to come up with a way to package Django projects in a single distributable WAR file, I've spent a whole day reading and playing with the Jython import logic, and here is what I found.

Not much different than Python, right?

First of all, Jython is an implementation of the Python language, so its import mechanism strictly follows what is known as PEP 302 (import hooks). I don't want to repeat what is documented there, but a quick explanation is in order:

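  • First, try custom importers registered on sys.meta_path. If one of them is capable of importing the requested module, we are done.
  • Then, for each entry on sys.path, look for an importer: each hook on sys.path_hooks is called with the entry until one of them accepts it, and the result is cached in sys.path_importer_cache. If that importer is capable of importing the requested module, we are done.
  • If no hook accepts an entry, the interpreter's built-in import logic handles it.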
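
To make that lookup order concrete, here is a rough toy model of it. It's only a sketch and never touches the real import system: DictFinder, zip_hook and the module names are made up for illustration, and the real machinery does more (packages, loading, error handling).

```python
# Toy model of the PEP 302 lookup order; it does not touch the real import system.

class DictFinder(object):
    """Hypothetical finder that 'knows' a fixed set of module names."""

    def __init__(self, label, names):
        self.label = label
        self.names = set(names)

    def find_module(self, fullname, path=None):
        # PEP 302: return a loader (here, the finder itself) or None.
        if fullname in self.names:
            return self
        return None


def importer_for(entry, path_hooks, cache):
    """Return the importer for one path entry, consulting the cache first."""
    if entry not in cache:
        cache[entry] = None  # None means: no hook claimed this entry
        for hook in path_hooks:
            try:
                cache[entry] = hook(entry)
                break
            except ImportError:
                continue  # this hook does not handle the entry
    return cache[entry]


def resolve(fullname, meta_path, path, path_hooks, cache):
    """Return the label of whoever would load `fullname`, or None."""
    # Step 1: importers on meta_path get the first chance.
    for finder in meta_path:
        loader = finder.find_module(fullname)
        if loader is not None:
            return loader.label
    # Step 2: walk the path entries, asking the hook-created importers.
    for entry in path:
        importer = importer_for(entry, path_hooks, cache)
        if importer is None:
            continue  # a real interpreter would use its built-in logic here
        loader = importer.find_module(fullname)
        if loader is not None:
            return loader.label
    return None


def zip_hook(entry):
    """Hypothetical hook that only claims entries ending in '.zip'."""
    if not entry.endswith('.zip'):
        raise ImportError(entry)
    return DictFinder('zip:' + entry, ['spam'])


meta = [DictFinder('meta', ['special'])]
cache = {}
found_special = resolve('special', meta, ['lib.zip'], [zip_hook], cache)
found_spam = resolve('spam', meta, ['/plain/dir', 'lib.zip'], [zip_hook], cache)
found_nothing = resolve('eggs', meta, ['lib.zip'], [zip_hook], cache)
```

In this model 'special' is answered by the meta_path finder before any path entry is looked at, while 'spam' is only found once the walk reaches the entry that zip_hook accepts.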

How are Java classes loaded, then?

With a built-in import hook, naturally ;-)

If you start CPython and look at sys.path_hooks you get:

  
>>> import sys
>>> sys.path_hooks
[<type 'zipimport.zipimporter'>]

On Jython, the result is slightly different:

  
>>> import sys
>>> sys.path_hooks
[<type 'JavaImporter'>, <type 'zipimport.zipimporter'>]

The JavaImporter only recognizes the '__classpath__' entry on sys.path, so it fires only after the path components before '__classpath__' have been searched. This gives us some control over which namespaces end up containing Python modules and which contain Java packages/classes when a conflict occurs (such as the very real issue of having both a 'test' Python module and a 'test' Java package). Naturally, the '__classpath__' entry is added automagically to sys.path on Jython startup.
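
A path hook that cares about a single magic entry is easy to picture. The following is a minimal sketch, not Jython's actual code: MagicEntryImporter, claimed_by and the example path are hypothetical, and the only thing borrowed from PEP 302 is the convention that a hook raises ImportError for entries it doesn't handle.

```python
class MagicEntryImporter(object):
    """Hypothetical stand-in for a hook that claims one magic path entry."""

    MAGIC_ENTRY = '__classpath__'

    def __init__(self, path_entry):
        # A PEP 302 path hook signals "not mine" by raising ImportError.
        if path_entry != self.MAGIC_ENTRY:
            raise ImportError('unhandled path entry: %r' % (path_entry,))


def claimed_by(entry, hooks):
    """Return the name of the first hook that accepts `entry`, or None."""
    for hook in hooks:
        try:
            hook(entry)
        except ImportError:
            continue
        return hook.__name__
    return None


# Assumed layout: one directory before the magic entry, one after it.
example_path = ['/site/python', '__classpath__', '/other/python']
claims = [(entry, claimed_by(entry, [MagicEntryImporter]))
          for entry in example_path]
```

Since the entries are visited in order, anything importable from '/site/python' in this layout would be found before the magic entry is even considered.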

But...

  
>>> sys.path_hooks = []
>>> sys.path_importer_cache = {}
>>> del sys.modules['java']
>>> import java
>>> dir(java)
['__name__', 'applet', 'awt', 'beans' ... ]

This should have failed after removing the JavaImporter hook and all the caches involved. Well, there is some more magic going on here...

Jython, JavaImporter and Java Packages

When an import is going to fail (that is, after searching all sys.path entries and getting no results), Jython tries to load a Java package or Java class. But wasn't that the task of the JavaImporter?

Well, sort of. Half of that job is the responsibility of the JavaImporter. The other half is managed by the SysPackageManager, which keeps an in-memory tree of discovered Java packages.

When the Jython interpreter starts, the SysPackageManager looks at all jars and directories on the classpath and builds the tree of Java packages. You can also explicitly add a Java package to the SysPackageManager by calling sys.add_package("package.which.was.not.autodiscovered"). This is useful in environments where Jython is not allowed to look at the system classpath, or doesn't get the right information (as may be the case when running inside a JavaEE container).

Back to the JavaImporter: its job is just to look into the packages loaded by the SysPackageManager and check whether the requested name is present there.
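
The split between "something that keeps a tree of packages" and "an importer that only asks the tree" can be sketched in a few lines. This is a toy illustration under my own assumptions, not the real SysPackageManager or JavaImporter: PackageTree and TreeBackedImporter are hypothetical names, and the tree is just nested dicts.

```python
class PackageTree(object):
    """Hypothetical in-memory tree of dotted Java package names."""

    def __init__(self):
        self.root = {}

    def add_package(self, dotted_name):
        # Walk or create one nested dict per package component.
        node = self.root
        for part in dotted_name.split('.'):
            node = node.setdefault(part, {})

    def lookup(self, dotted_name):
        """Return True if the whole dotted name is present in the tree."""
        node = self.root
        for part in dotted_name.split('.'):
            if part not in node:
                return False
            node = node[part]
        return True


class TreeBackedImporter(object):
    """Hypothetical importer that only consults an existing PackageTree."""

    def __init__(self, tree):
        self.tree = tree

    def find_module(self, fullname, path=None):
        # No scanning happens here: the tree is the single source of truth.
        return self if self.tree.lookup(fullname) else None


tree = PackageTree()
tree.add_package('com.leosoto')  # as if discovered on the classpath at startup
importer = TreeBackedImporter(tree)
known = importer.find_module('com.leosoto') is not None
parent_known = importer.find_module('com') is not None
unknown = importer.find_module('org.example') is not None
```

In this sketch, registering 'com.leosoto' also makes 'com' visible, while a package nobody registered stays invisible to the importer until something adds it to the tree.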

And here is the magic

Another way to get packages loaded into the SysPackageManager is to add a zip or jar to sys.path. The next time the import logic runs, it automatically adds the contents of the new jar (or zip) to the tree of known Java packages.

This is a little weird, because if you have the following in your sys.path:

['__classpath__', '/foo', 'foo.jar']

Then, if Java packages in foo.jar conflict with Python modules from /foo, the Java packages will prevail, because the '__classpath__' entry comes before '/foo' and the JavaImporter gets to do its magic first.

The other bit of magic is what we have already seen: Jython makes a last attempt to load a Java package or, to be more precise, to add a package to the SysPackageManager if the imported name is known to the JVM as a class or package name. If this operation succeeds, the module is imported directly by this addition to Jython's built-in import logic (there is no going back to the JavaImporter at this point).
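
Putting both bits of magic together, the behaviour described in this section can be approximated with a small simulation. It is only my simplified reading of the process, not Jython's code: the resolve function, its parameters and the sample data are all hypothetical, and it ignores details such as Python modules stored inside jars.

```python
def resolve(name, path, python_dirs, jar_contents, known_packages, jvm_names):
    """Describe where `name` would come from in this simplified model.

    python_dirs:    {directory: set of Python module names} (assumed data)
    jar_contents:   {jar path entry: set of Java package names} (assumed data)
    known_packages: set of Java packages already in the package tree
    jvm_names:      names the JVM can resolve even if not yet registered
    """
    # Jars and zips found on the path feed the tree of known Java packages.
    for entry in path:
        known_packages.update(jar_contents.get(entry, ()))

    # Regular walk: the magic entry answers for every known Java package.
    for entry in path:
        if entry == '__classpath__':
            if name in known_packages:
                return 'java package (via __classpath__)'
        elif name in python_dirs.get(entry, ()):
            return 'python module (from %s)' % entry

    # Last attempt: register and import anything the JVM itself knows about.
    if name in jvm_names:
        known_packages.add(name)
        return 'java package (last-attempt fallback)'
    return None


path = ['__classpath__', '/foo', 'foo.jar']
python_dirs = {'/foo': set(['shared', 'pyonly'])}
jar_contents = {'foo.jar': set(['shared'])}
jvm = set(['java'])

conflict = resolve('shared', path, python_dirs, jar_contents, set(), jvm)
python_only = resolve('pyonly', path, python_dirs, jar_contents, set(), jvm)
fallback = resolve('java', path, python_dirs, jar_contents, set(), jvm)
missing = resolve('nope', path, python_dirs, jar_contents, set(), jvm)
```

With this sample data, the conflicting name 'shared' resolves to the Java package even though foo.jar is the last entry on the path, and 'java' is only picked up by the last-attempt step.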

Some observations

Here ends the objective part of this post. What follows are my observations on the whole process:

  • I don't quite understand why Jython tries to load Java classes or packages at the end of the import logic, after trying the standard procedure. It seems like such a fallback would make calls to sys.add_package unnecessary, but then, why does add_package exist? And, in any case, I think that the JavaImporter should do this.
  • The confusing situation of jar files (and Java classes) in sys.path is, well... confusing. The good news is that namespace conflicts aren't that common in practice, so just remembering that all Java "modules" come from the magic '__classpath__' element is enough.
  • It would be nice if the Jython standard loader were installed on the meta_path. Then the JavaImporter could be added there too, just after the default Python code loader. This way we would have a clearer precedence rule (Python modules first, Java packages/classes later) instead of the current one: first Python modules before '__classpath__', followed by Java "modules", followed by Python modules after '__classpath__', followed by Java "modules" which weren't yet registered on the SysPackageManager.

That's all

OK, that was a long post. Now that I've dumped all that info here, I can go back to coding and try to make distributable WAR files for Django projects, containing the complete Jython, modjy and Django runtime.