Question about SSWriter::flush

Frang · November 1, 2022, 3:10pm

Continuing to work through things for JNI binding. I’ve now come to SSWriter in panda/src/downloader (socketStream.*).

Both ostream (as exported in dtool/src/parser-inc) and SSWriter declare a ‘flush()’ method, but with different return types: void for ostream, bool for SSWriter. C++ is very sketchy on such co-variants. Overloading on return type alone is not possible, and it is definitely not supported in most ABI function name mangling.

Java is even more picky about this, which comes up as an issue for SocketStream and OSocketStream.

Left to my own devices, I’d be tempted to rename the ‘flush’ in SSWriter to something like ‘flush_write’, then hunt down the usage sites.

However, I am open to other ideas.

–
-Cary

rdb · November 1, 2022, 3:29pm

Maybe we could have flush() just match the return type of the base class? All it returns is !is_closed() which can be checked separately anyway.
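Seen from the Java side, the suggestion could look roughly like the sketch below. All type names here (`OStreamLike`, `SSWriterLike`, `OSocketStreamLike`, `FlushDemo`) are hypothetical stand-ins, not the generated bindings. The point is only that once both parents declare `void flush()`, one class can implement both, and callers that used the old bool result check `isClosed()` instead.

```java
// Hypothetical stand-ins for the Java view of ostream and SSWriter.
interface OStreamLike {
    void flush();
}

interface SSWriterLike {
    // If this returned boolean, no class could implement both interfaces:
    // Java rejects same-signature methods with incompatible return types.
    void flush();

    boolean isClosed();
}

// Plays the role of OSocketStream: one flush() satisfies both parents.
class OSocketStreamLike implements OStreamLike, SSWriterLike {
    private boolean closed = false;
    private int flushCount = 0;

    @Override
    public void flush() {
        if (!closed) {
            flushCount++;
        }
    }

    @Override
    public boolean isClosed() {
        return closed;
    }

    void close() {
        closed = true;
    }

    int getFlushCount() {
        return flushCount;
    }
}

class FlushDemo {
    // Replaces the old call-site pattern "if (!writer.flush()) { ... }".
    static boolean flushAndCheck(SSWriterLike writer) {
        writer.flush();
        return !writer.isClosed();
    }
}
```

A call site that used to branch on the return value of `flush()` would call `flush()` and then test `isClosed()`, as `flushAndCheck` does.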

Out of curiosity, how are you handling multiple inheritance in general? I recall that Java doesn’t really allow inheriting from more than one class. I’m separately working on bindings for the nim language (which also doesn’t have multiple inheritance) so I’m curious if you’re doing things differently than I am.

Frang · November 1, 2022, 3:52pm

I was planning to do a detailed write-up of what I ended up doing once it actually functioned end-to-end (as I need to do that for work internally). But I’d be happy to give an ad-hoc version right now.

Java, as you note, does not support multiple inheritance in concrete (instantiatable) types. It does, however, allow it in a form for interfaces. I say ‘in a form’ because, unlike C++, Java will collapse multiple instances of a multiply inherited interface. But this model is close enough, I believe (and it has proven workable for all of the things I’ve looked at in PANDA thus far).
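A minimal sketch of that collapsing behaviour, using made-up interface names (`TypedThing`, `Readable`, `Writable`, `Buffer`) that are not part of PANDA:

```java
// Hypothetical interfaces forming a diamond; not generated output.
interface TypedThing {
    default String typeName() {
        return "TypedThing";
    }
}

interface Readable extends TypedThing {
    int read();
}

interface Writable extends TypedThing {
    void write(int value);
}

// With non-virtual inheritance, the C++ equivalent would contain two
// TypedThing subobjects. In Java the object has a single TypedThing view,
// no matter which path is used to reach it.
class Buffer implements Readable, Writable {
    private int value;

    @Override
    public int read() {
        return value;
    }

    @Override
    public void write(int value) {
        this.value = value;
    }
}
```

Casting a `Buffer` to `TypedThing` through `Readable` or through `Writable` yields the same reference, so there is no pointer adjustment to think about on the Java side.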

So what I’ve done is express the type taxonomy, and nesting, in Java interface types.

Next, a form is needed that can be instantiated. Similar to how things were done in the early days of interrogate integration, I declare concrete ‘proxy’ types that implement points in the interface taxonomy. These are ‘shadow’ types for the C++ and contain only a private (technically, protected, but that’s an implementation detail) holder for a ‘pointer’ to the real C++ entity. Java interface types cannot declare the JNI ‘native’ methods needed to bridge to the C++, so those live in the ‘proxy’ types. The ‘proxy’ types implement the abstract methods declared in the interface type(s), and vector to the appropriate JNI ‘native’ methods.
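The shape described above might look something like the following. This is only a sketch under assumptions: `NamedThing`, `NamedThingProxy`, and `FakeNative` are invented names, and the ‘native’ calls are replaced by plain static methods over a map so the example runs without a JNI library.

```java
// Public-facing interface: the only type user code is meant to see.
interface NamedThing {
    String getName();

    void setName(String name);
}

// Stand-in for the C++ side. Real bindings would reach it through JNI;
// a map keyed by a fake pointer value keeps this sketch runnable.
final class FakeNative {
    private static final java.util.Map<Long, String> NAMES = new java.util.HashMap<>();
    private static long nextPtr = 1;

    static long create(String name) {
        long ptr = nextPtr++;
        NAMES.put(ptr, name);
        return ptr;
    }

    static String getName(long ptr) {
        return NAMES.get(ptr);
    }

    static void setName(long ptr, String name) {
        NAMES.put(ptr, name);
    }
}

// 'Shadow' type: holds only the pointer and forwards every call.
class NamedThingProxy implements NamedThing {
    protected final long ptr;

    NamedThingProxy(long ptr) {
        this.ptr = ptr;
    }

    // In real JNI code these two would be 'private static native' methods.
    private static String nativeGetName(long ptr) {
        return FakeNative.getName(ptr);
    }

    private static void nativeSetName(long ptr, String name) {
        FakeNative.setName(ptr, name);
    }

    @Override
    public String getName() {
        return nativeGetName(ptr);
    }

    @Override
    public void setName(String name) {
        nativeSetName(ptr, name);
    }
}
```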

The interface types also declare static methods which model the constructors; these vector to static methods on the ‘proxy’ types that perform and wrap a proper construction. The proxy types also interact with end-of-life control (during Java GC) and arrange to call the correct end-of-life (delete, or unref_delete) in the C++.
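As a rough illustration of the constructor and end-of-life arrangement, here is a sketch with invented names (`Gadget`, `GadgetProxy`, `NativeLog`). It assumes an explicit `release()` method in place of a GC hook so the behaviour is deterministic, and it records what the C++ side would be asked to do instead of calling it.

```java
// Hypothetical user-facing interface with a static 'constructor'.
interface Gadget {
    String getName();

    // Models the C++ constructor; user code never names the proxy type.
    static Gadget create(String name) {
        return GadgetProxy.construct(name);
    }
}

// Stand-in for the C++ side: records what would be requested of it.
final class NativeLog {
    static final java.util.List<String> EVENTS = new java.util.ArrayList<>();
}

class GadgetProxy implements Gadget {
    private final String name;        // stands in for the C++ pointer
    private final boolean refCounted; // selects the end-of-life call
    private boolean released = false;

    private GadgetProxy(String name, boolean refCounted) {
        this.name = name;
        this.refCounted = refCounted;
    }

    static GadgetProxy construct(String name) {
        NativeLog.EVENTS.add("new " + name);
        return new GadgetProxy(name, true);
    }

    @Override
    public String getName() {
        return name;
    }

    // In the bindings this would be driven by Java GC; it is explicit here
    // only so the sketch can be run and checked. Safe to call twice.
    void release() {
        if (released) {
            return;
        }
        released = true;
        NativeLog.EVENTS.add((refCounted ? "unref_delete " : "delete ") + name);
    }
}
```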

There is one more implementation detail that is (unfortunately) important for the Java space. It is simplest for the C++ side of the JNI if a given method can know what type the object-like parameters are passed in as. Particularly in the case of base types. From the Java side, a given ‘proxy’ type contains a pointer for that type. Java knows that the proxy type implements the base interfaces, but not how to adjust this pointer, or how to re-wrap it in an appropriate ‘proxy’. Further, since we are only doing this cast for a function call, we would rather Java’s ref-counting not get overly involved in the lifetime of the pointer in that parameter instance.

So to ‘cheat’ this, I introduced an intermediate instantiatable form between the interface and the ‘proxy’. I call this a ‘param’ form. Ultimately all of the non-construct, non-destruct methods are implemented there. The ‘proxy’ form extends this with construct/destruct. When converting to a form for passing as a parameter, the proxy presents a re-wrapped param form. This fixes all of the otherwise painful lifetime stuff. Though this shim could be thinner for PANDA types that are also ref-counted, I wanted a single working model to get things working. Optimization can happen after.
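The layering of interface, ‘param’, and ‘proxy’ could be sketched as follows. Everything here is illustrative: `Shape`, `ShapeParam`, `ShapeProxy`, `asParam()`, and `FakeCppShapes` are made-up names, and the C++ heap is simulated with a map so the ownership rules can be observed.

```java
// Hypothetical interface; the only type user code would reference.
interface Shape {
    double area();
}

// Stand-in for C++ objects, keyed by a fake pointer value.
final class FakeCppShapes {
    private static final java.util.Map<Long, Double> SIDES = new java.util.HashMap<>();
    private static long nextPtr = 1;

    static long newSquare(double side) {
        long ptr = nextPtr++;
        SIDES.put(ptr, side);
        return ptr;
    }

    static double area(long ptr) {
        double side = SIDES.get(ptr);
        return side * side;
    }

    static void destroy(long ptr) {
        SIDES.remove(ptr);
    }

    static int liveCount() {
        return SIDES.size();
    }
}

// 'param' form: implements the ordinary methods and owns nothing, so
// dropping one never affects the lifetime of the C++ object.
class ShapeParam implements Shape {
    protected final long ptr;

    ShapeParam(long ptr) {
        this.ptr = ptr;
    }

    @Override
    public double area() {
        return FakeCppShapes.area(ptr);
    }
}

// 'proxy' form: adds construction and destruction on top of the param form.
class ShapeProxy extends ShapeParam {
    private boolean alive = true;

    private ShapeProxy(long ptr) {
        super(ptr);
    }

    static ShapeProxy construct(double side) {
        return new ShapeProxy(FakeCppShapes.newSquare(side));
    }

    // Re-wrap the pointer in a non-owning param for a single call.
    ShapeParam asParam() {
        return new ShapeParam(ptr);
    }

    void destroy() {
        if (alive) {
            alive = false;
            FakeCppShapes.destroy(ptr);
        }
    }
}
```

In this sketch only the proxy can end the C++ object's life; a param created for a call can be discarded freely.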

Neither the ‘proxy’ nor the ‘param’ forms should ever be referenced in user code. Only the interfaces. All the rest is implementation detail.

Frang · November 1, 2022, 3:52pm

Sure! I’m good with making that change.

Frang · November 1, 2022, 5:22pm

FWIW, this only required touching 3 call sites in socketStream.I, 1 in recorder/socketStreamRecorder.I, and 1 in distributed/cConnectionRepository.cxx.

So not a great deal of dependency on this other interface. Yay!

–
-Cary

rdb · November 1, 2022, 9:48pm

Oh, this is clever. I look forward to seeing what the completed bindings will look like.

Frang · November 28, 2022, 3:33pm

I apologize for the lack of update. I got pulled onto some other high priority projects at work for a few weeks. I’m picking it back up now.

To keep your curiosity happy, I’ll emit and post here the generated Java for one of the smaller libs so you can get a feel for how it is coming together.

Frang · December 13, 2022, 12:46am

Here is some of the sample Java I promised earlier. There are some artifacts here that will not be present in the final form. For example, the “force_delete” idiom is going to go away and “finalize()” will in all cases have access to the right thing to do. Also, in this sample TypedReferenceCount, TypedWritableReferenceCount, and Namable are all present only as artifacts of the .in fragment I used to make this sample, and as such should be ignored.

core_module_jni.java.zip (6.8 KB)

rdb · December 15, 2022, 10:59am

This looks great!

Are you planning to remap the method names such that they become camelCase? I believe that the coding convention in Java is fairly universally camelCase, right?

Is interrogate producing both a C++ JNI file and a corresponding Java file at the same time, or do you use a separate step to produce the .java file based on the interrogatedb database?

Frang · December 15, 2022, 4:40pm

I have not done function name remapping yet, though that is fairly trivial to do and I already have a place where it would go. I’ll add that to my hit list during the cleanup pass.
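For reference, the remapping itself can be as small as the sketch below. `NameRemap.toCamelCase` is a hypothetical helper written for this post, not something that exists in interrogate; it keeps leading underscores and drops the rest.

```java
// Hypothetical helper: converts snake_case method names to camelCase.
final class NameRemap {
    static String toCamelCase(String snake) {
        StringBuilder out = new StringBuilder(snake.length());
        boolean upperNext = false;
        for (int i = 0; i < snake.length(); i++) {
            char c = snake.charAt(i);
            if (c == '_' && out.length() > 0) {
                // Interior underscore: drop it, capitalize what follows.
                upperNext = true;
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                // Ordinary character, or a leading underscore kept as-is.
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this, `is_closed` becomes `isClosed` and `unref_delete` becomes `unrefDelete`, while a name with no underscores such as `flush` is unchanged.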

Right now I have ‘interrogate’ itself generate the C++ side of the JNI (along with a .in). I had to do something to prevent the Python (usually EXTENSION) APIs from leaking out into the JNI binding, so I needed to set an extra -D symbol. That is why the JNI pass writes its own .in: I didn’t want this to pollute the main .in db files.

Then those JNI .in files are run through interrogate_module with a flag that causes it to generate the Java side of the JNI.

The upside of this approach is that it is fairly simple and I didn’t need to cause either pass to have to learn how to output additional files. I also didn’t need to test specifically for various PyFoo types and exclude those interfaces.

The downside is that it means we run interrogate twice over the public headers, and interrogate_module twice per module.

I also have not yet tackled the CMake side of the build system. So far I have been working only with makepanda.py. I might need some help there as I just don’t know CMake that well.

To that point I’ve started thinking about staging separable pieces of this project to get upstreaming PRs out sooner. For example I can put out a PR which has only the “#ifdef HAVE_PYTHON” cleanup that was needed. Then others for the handful of covariant issues I ran into. Etc. The intent would be to make smaller pieces so as not to overwhelm the poor souls trying to look at it.

–
-Frang

rdb · December 16, 2022, 3:10pm

Hmm, yeah, this does sound quite awkward if someone wants to build with both Python and Java bindings. However, I do think that you should be able to figure out if something is decorated with __extend (which is the storage class that EXTENSION adds to a function definition) in the interrogate code and exclude it based on that (and then later figure out how we’re going to handle Java vs Python extension code, possibly with a specific PY_EXTENSION macro).

Sounds great!