Fun With Perl, Dynamic Linking, and C++ Exceptions

I'm hacking on some tools that use the Search::Xapian module to build up search indexes. It's an excellent Perl interface to Xapian, but unfortunately it seemed to be too slow for our purposes. Tracing our code showed that much of the slowness was in passing data back and forth between Perl and the C++ library for every call.

I decided to write my own XS module to speed things up. Instead of using Search::Xapian, I'd bundle everything up into a Perl data structure, pass it down to libxapian through my own module once, and do all the indexing work in C++. This worked great -- until I started trying to do some exception handling.

The problem appeared when I tried to take C++ exceptions and pass them back to Perl, using something like:

    try {
        this->db->open_database(....);
    } catch( const Xapian::Error & e ) {
        croak("Exception: %s: %s", e.get_type(), e.get_msg().c_str());
    }

This is similar to how Search::Xapian implements it (though they provide proper Perl exception classes instead of stringifying). Unfortunately, my test code ended up aborting with the helpful error:

    terminate called after throwing an instance of 'Xapian::DatabaseOpeningError'

Sure... ok. That's a standard C++ "you forgot to catch the exception" error message. But I had just tried to catch that. Some googling turned up things like this post, but none of its suggested solutions worked. It did get us on the track of looking for linker issues, and lo and behold, about 10 minutes after I gave up on the problem for the day, my boss discovered one:

If you're using two Perl modules, both with native library components (their own .so files), that link dynamically against the same library (in this case, libxapian.so), C++ exception handling in the native (XS) portion of the Perl modules will mysteriously fail when the exception is thrown from code in a different .so file.

This is because the typeinfo symbols that the C++ runtime uses to match a thrown exception against a catch clause are weak symbols. Without some linker-fu, each .so ends up with its own copy of them, the copies don't match, and the C++ runtime can't identify the exception properly. To further confuse matters, the exceptions work just fine when thrown and caught inside the same .so, and they happily print out the correct classname when they are uncaught and handled by terminate().
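The mismatch can't be reproduced in a single file, since it needs two separately loaded .so files, but it can be imitated. The sketch below is only an analogy: two unrelated types that share a class name stand in for the two copies of the exception's type information. The namespaces `copy_a` and `copy_b` and the helpers `thrower()` and `classify()` are made up for illustration and are not part of Xapian.

```cpp
#include <string>

// Two unrelated types with the same class name. They stand in for the two
// copies of one class's type information that live in separate .so files.
namespace copy_a { struct DatabaseOpeningError {}; }
namespace copy_b { struct DatabaseOpeningError {}; }

// Plays the role of the library code that throws.
void thrower() { throw copy_a::DatabaseOpeningError(); }

// Plays the role of the XS code that tries to catch. The handler only
// matches when Expected is the very same type that was thrown.
template <typename Expected>
std::string classify() {
    try {
        thrower();
    } catch (const Expected&) {
        return "caught";
    } catch (...) {
        // The real XS code had no catch-all, so terminate() ran instead.
        return "not matched";
    }
    return "no exception";
}
```

Calling `classify<copy_a::DatabaseOpeningError>()` takes the first handler, while `classify<copy_b::DatabaseOpeningError>()` falls through to the catch-all even though both classes print the same short name.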

Now that we've identified the problem, what's the solution? Well, what you want is to pass the RTLD_GLOBAL flag to dlopen() so that the same symbols are available to all libraries. We're not calling dlopen() directly, though -- Perl is, via DynaLoader. Fortunately, a read of the DynaLoader manpage makes it pretty obvious how to fix it. Just add:

    sub dl_load_flags { 0x01 }

to the .pm file that invokes DynaLoader's bootstrap() method, and voila, everyone gets the same symbols for their exceptions.
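For the curious, here is a rough sketch of what that flag amounts to at the dlopen() level. It is my reading of the behaviour, not DynaLoader's actual source: the names `DL_LOAD_FLAG_GLOBAL`, `dlopen_mode_for()` and `load_shared_object()` are invented for this example, and the choice of RTLD_LAZY as the base mode is an assumption.

```cpp
#include <dlfcn.h>

// Bit 0x01 of dl_load_flags() asks for the library's symbols to be visible
// to libraries loaded later. The constant name is hypothetical.
const int DL_LOAD_FLAG_GLOBAL = 0x01;

// Translate a dl_load_flags() value into a dlopen() mode.
// Assumed logic: lazy binding by default, plus RTLD_GLOBAL when bit 0x01 is set.
int dlopen_mode_for(int dl_load_flags) {
    int mode = RTLD_LAZY;
    if (dl_load_flags & DL_LOAD_FLAG_GLOBAL) {
        mode |= RTLD_GLOBAL;
    }
    return mode;
}

// Load a shared object using the translated mode; returns nullptr on failure.
inline void* load_shared_object(const char* path, int dl_load_flags) {
    return dlopen(path, dlopen_mode_for(dl_load_flags));
}
```

The point to take away is that 0x01 is DynaLoader's own flag bit rather than the numeric value of RTLD_GLOBAL, which varies by platform, so the same one-liner in the .pm file should work wherever Perl loads modules through dlopen().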

Technically, the dl_load_flags function need only be provided by the first module loaded that provides those symbols, but if you don't (or can't) know which module will get loaded first, every module that could be using the same back-end library should provide it. Since it's likely that others might want to speed up some bits of their Xapian-using code by implementing additional XS modules, we've reported the issue upstream. The Xapian-specific issue can be seen as ticket 522 in the Xapian bug tracker.