Migration to Python 3

cslos77 · May 12, 2013, 8:43pm

Hi rdb, just checking in to see if there has been any change in the status of a Python 3 build for Panda. I’ll have a script ready to test whenever there is anything working. No rush on my end, but I thought I’d see how things are going.

rdb · May 13, 2013, 10:28pm

I’ve been able to work around the segfault. It only occurs when I import libpandaexpress before libpanda, and not the other way around, which baffles me, as libpandaexpress is a dependency of libpanda. Since I don’t have a debugging environment set up on the laptop that I do the Python 3 work on, I have no means to see what’s going on. But here’s some code that works for me:

import panda3d
mgr = panda3d.__manager__
mgr.libimport('libpanda')
mgr.libimport('libpandaexpress')
print("Imported")

from libpandaexpress import *
from libpanda import *

Presumably, simply swapping the libraries around in the definition for the ‘core’ module in panda3d.py will also work, but I won’t do that until I’ve found out why this happens. At some point in the future, I plan on doing a complete overhaul of the way we compile the Python bindings into the libraries, which may solve some of the problems we’re encountering, but it may not make much sense to wait for that unless I find time to get around to it soon.

I’m still getting another segfault, but it happens at shutdown, so I suppose it’s less important.

The next step for me would probably be to integrate the py2to3-based tool into makepanda. Can you send the script so that I can try it for myself?

cslos77 · May 17, 2013, 12:02am

Here’s the script, sorry for the length but I don’t have any host to use right now:

# ============
# panda_to_py3
# ============

__version__ = "0.3"

import sys
import os
import shutil
from datetime import datetime
from lib2to3.refactor import *
from lib2to3.main import *

# Options.
LOGGING = True
PRINTING = True
BACK_UP = True
if "--no-log" in sys.argv: LOGGING = False
if "--no-backup" in sys.argv: BACK_UP = False
if "--no-print" in sys.argv: PRINTING = False


class Custom_Fixes:

    def panda_fix_builtin_ref(line):
        """Fix refs to __builtin__ that py2to3 misses."""
        fixed_line = line.replace("__builtin__", "builtins")
        return fixed_line

    # Map str patterns to custom fix methods.
    fix_map = {"__builtin__":panda_fix_builtin_ref,}


class Panda3D_Refactoring_Tool(StdoutRefactoringTool):

    def __init__(self, src_dir):
        """Custom Panda3d py2to3 refactoring tool."""
        fixers = sorted(refactor.get_fixers_from_package("lib2to3.fixes"))
        StdoutRefactoringTool.__init__(self, fixers, [],
                                       [], True, None,
                                       input_base_dir=src_dir)
                                       
    def refactor_panda_file(self, src_file):
        """Perform py2to3 conversion on panda file "src_file"."""
        global file_count, line_count, lines_fixed
        _file_lines_fixed = lines_fixed
        _file_lines_added = lines_added
        
        # Extract lines from src_file for stats and testing.
        with open(src_file) as file:
            lines = file.readlines()
        line_count += len(lines)
        file_count += 1
        
        # Refactor (test for indents by line.)
        for line in lines:
            if line == "\n": continue
            if line.startswith(" ") or line.startswith("\t"):
                # Some files begin indented; use workaround.
                self.handle_parse_error(src_file, "in")
                self.refactor_file(src_file, write=True)
                self.handle_parse_error(src_file, "out")
            else:
                self.refactor_file(src_file, write=True)
            break

        # Print file refactor info.
        if PRINTING:
            file_name = os.path.split(src_file)[-1]
            f_lines_fixed = lines_fixed - _file_lines_fixed
            f_lines_added = lines_added - _file_lines_added
            if f_lines_fixed > 1: f_str = "lines fixed"
            else: f_str = "line fixed"
            if f_lines_added > 1: a_str = "lines added"
            else: a_str = "line added"
            if f_lines_added:
                f_line_str = "({} {}, {} {})".format(f_lines_fixed, f_str,
                                                     f_lines_added, a_str)
            else:
                f_line_str = "({} lines fixed)".format(f_lines_fixed)
            print("  {:<30}{}".format(file_name, f_line_str))
        
    def handle_parse_error(self, file_path, mode):
        """Allow py2to3 to handle files in direct/extensions folder that
        begin with indents by putting a temporary "class Temp:" statement
        at the top of the file to fool the parser. Remove it after parse."""
        with open(file_path, "r") as file:
            lines = file.readlines()
            if mode == "in":
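                # Temp class statement; the trailing newline keeps it on its
                # own line even if the file does not start with a blank line.
                lines.insert(0, "class Temp:\n")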
            elif mode == "out":
                lines.pop(0)  # Remove temp class statement in "out" mode.
                lines.insert(0, "\n")
        with open(file_path, "w") as file:
            file.writelines(lines)
                                       
    def print_output(self, old, new, filename, equal):
        """Override method in StdoutRefactoringTool so that we can
        set up custom log output as well as perform a second layer
        of custom fixes to cover things that py2to3 misses."""
        global lines_fixed, lines_added, custom_fixes
        
        if LOGGING:
            # Start new set of log lines for each file.
            dec_str = "-" * len(filename)
            log_lines.extend(["".join(["\n", dec_str]),
                              filename.replace(".\\", ""),
                              "".join([dec_str, "\n"])])
        # Handle custom fixes.
        new_lines = new.split("\n")
        fix_list = list(Custom_Fixes.fix_map.keys())
        _new_lines = []
        for line in new_lines:
            for fix in fix_list:
                if fix in line:
                    panda_fix = Custom_Fixes.fix_map[fix]
                    line = panda_fix(line)
                    custom_fixes += 1
                    break
            _new_lines.append(line)
        new = "\n".join(_new_lines)
            
        # Get "diff_lines" (from py2to3.main) and set line tracking vars.
        diff_lines = diff_texts(old, new, filename)
        line_no = 0
        fixed_line_no = 0
        fixed_lines_offset = 0
        
        # "diff_lines" lists subtractions and additions in series
        # ie: (-,-,-,-) then (+,+,+,+). Increment 'stack' for every "-"
        # line and decrement it for every "+" line.
        stack = 0
        _prev_stack = 0  # helps track line removals.
        
        # Process diff lines to find and apply required panda fixes
        # Generate a log if LOGGING is True.
        for line in diff_lines:
            if line.startswith("+++") or line.startswith("---"):
                continue
                
            # Use sentinel lines to reset line counting vars.
            if line.startswith("@@"):
                line_list = line.split(" ")
                line_no = int(line_list[1].split(",")[0].replace("-", ""))
                fixed_line_no = line_no + fixed_lines_offset
                continue
                
            # Log removals.
            if line.startswith("-") and str(line).strip() != "-":
                stack += 1
                lines_fixed += 1
                if LOGGING: # Update log_lines.
                    log_str = "{:>6}: {}".format(line_no, line)
                    log_lines.append(log_str)
                line_no += 1
                continue
                
            # Log additions.
            elif line.startswith("+") and str(line).strip() != "+":
                _lines_added = 0
                if stack > 0:
                    fixed_line = line
                else:
                    # A new line when stack is at zero means this is a line
                    # that was simply added by py2to3, usually an import.
                    _lines_added += 1
                    line_no += 1
                    fixed_line_no += 1
                    fixed_line = line
                    
                # Update stats and log.
                if stack: stack -= 1
                if LOGGING:
                    log_str = "{:>6}: {}".format(fixed_line_no, fixed_line)
                    if stack == 0: log_str = "{}\n".format(log_str)
                    log_lines.append(log_str)
                fixed_line_no += 1
                
                # Handle rare case where py2to3 removes a line .
                if _prev_stack:
                    if LOGGING:
                        log_lines[-1] = "{}\n".format(log_lines[-1])
                    
                fixed_lines_offset += _lines_added
                lines_added += _lines_added
                continue
            
            # Line stats.
            line_no += 1
            fixed_line_no += 1
            _prev_stack = stack
            
        # Rewrite files.
        self.apply_fixes_to_file(filename, new)
        
    def apply_fixes_to_file(self, file_path, new):
        """Perform actual update of python file."""
        # Write the refactored text back to the file.
        with open(file_path, "w") as file:
            file.write(new)
    
    def write_file(self, new_text, filename, old_text, encoding=None):
        """Overriden from StdoutRT to prevent overwrite of our changes."""
        self.wrote = True

# Utility objects.
class Timer:
    def __enter__(self):
        self.start_dt = datetime.now()
        return self
    def __exit__(self, *e_info):
        if e_info[0] is not None: print(e_info)
        end_dt = datetime.now()
        delta = end_dt - self.start_dt
        # Elapsed time in seconds, rounded to milliseconds.
        self.time = "{:.3f}".format(delta.total_seconds())

class Change_Logger:
    def __init__(self, dir):
        """Generate a "changes.txt" file for this dir if LOGGING == True."""
        self.dir = dir
    def __enter__(self):
        self._ref_dir_count = dir_count
        self._ref_file_count = file_count
        self._ref_line_count = line_count
        self._ref_lines_fixed = lines_fixed
        self._ref_lines_added = lines_added
        return self
    def __exit__(self, *e_info):
        if not LOGGING: return
        d_dir_count = dir_count - self._ref_dir_count
        d_file_count = file_count - self._ref_file_count
        d_line_count = line_count - self._ref_line_count
        d_lines_fixed = lines_fixed - self._ref_lines_fixed
        d_lines_added = lines_added - self._ref_lines_added
        with open(os.path.join(self.dir, "changes.txt"), "w") as log:
            date_str = datetime.now().strftime("%a %b %d, %Y - %H:%M")
            log.write("================\n")
            log.write("Panda to Python3 - {}\n".format(self.dir))
            log.write("================\n\n")
            log.write("date:         {}\n\n".format(date_str))
            log.write("dirs:         {}\n".format(d_dir_count))
            log.write("files:        {}\n".format(d_file_count))
            log.write("lines:        {}\n".format(d_line_count))
            log.write("lines fixed:  {}\n".format(d_lines_fixed))
            log.write("lines added:  {}\n\n\n".format(d_lines_added))
            log.write("'-' = old line\n'+' = py2to3 fix\n'*' = panda fix\n\n")
            for line in log_lines:
                line = "".join([line, "\n"])
                log.write(line)

# -----------------------------------
# Refactor panda.py files in SRC_DIRS
#------------------------------------

# Source directories for files to be refactored.
src_dir_list = ["direct", "pandac", "samples"]
SRC_DIRS = []
for s_dir in src_dir_list:
    src_dir = os.path.join(".", s_dir)
    SRC_DIRS.append(src_dir)
    if BACK_UP:
        # Back up src dirs.
        if PRINTING: print("Backing up: {}".format(s_dir))
        copy_dir = os.path.join(".", "_backup", s_dir)
        shutil.copytree(src_dir, copy_dir)   

# Refactoring algorithm.
dir_count, file_count, line_count = 0, 0, 0
lines_fixed, lines_added, custom_fixes = 0, 0, 0
with Timer() as timer:
    for src_dir in SRC_DIRS:
        for root, dirs, files in os.walk(src_dir):
            if PRINTING: print("".join(["\n", root.replace(".\\", "")]))
            with Change_Logger(root):
                prt = Panda3D_Refactoring_Tool(root)
                if LOGGING: log_lines = []
                for file in files:                    ## maybe allow __?
                    if file.endswith(".py") and not file.startswith("__"):  
                        src_file = os.path.join(root, file)
                        prt.refactor_panda_file(src_file)
                        pass
            dir_count += 1

# Finally refactor "panda3d.py" in main panda3d installation folder.
if PRINTING: print();print("<root dir>")
with Change_Logger("."):
    prt = Panda3D_Refactoring_Tool(".")
    if LOGGING: log_lines = []
    file_path = os.path.join(".", "panda3d.py")
    if BACK_UP:
        if PRINTING: print("Backing up: panda3d.py")
        copy_path = os.path.join(".", "_backup")
        shutil.copy2(file_path, copy_path)  # backup panda3d.py.
    prt.refactor_panda_file(file_path)

if PRINTING: 
    # Print totals.
    print();print()
    print("panda_to_py3: {} seconds\n".format(timer.time))
    print("dirs:         {}".format(dir_count))
    print("files:        {}".format(file_count))
    print("lines:        {}".format(line_count))
    print("lines fixed:  {}".format(lines_fixed))
    print("lines added:  {}".format(lines_added))
    print("custom fixes: {}".format(custom_fixes))

Just run it as a file in the main panda installation directory using Python 3. It prints the results for each file and creates a change log for each directory, so if there are any issues with the resulting code, the exact change that caused them can be quickly found. It also backs up all the files and puts them in a dir called “_backup” before changing them. These features are really only for the development phase and can be turned off with the options “--no-log”, “--no-backup” and “--no-print” for the release version.

Right now, I don’t expect that it would produce working results because there are numerous things that the py2to3 script misses. There’s a “Custom_Fixes” object for creating panda-specific fixes for these cases, but I wasn’t able to get very far in my testing before encountering the memory error, so I was only able to create a fix for one of these (py2to3 doesn’t catch references to the old __builtin__ module if they are in quotation marks). I would anticipate there being similar cases along the way.

I’ve set it up so it’s fairly straightforward to add more custom fixes so either you can do this yourself if you encounter further problems or I can do it when there’s a 32-bit build ready that works with python 3. Also, obviously re-write or remove any parts you want for the final release version. Hopefully this helps; let me know if you have any issues.
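The pattern-to-fix idea described above can be sketched on its own, outside the refactoring tool. This is only an illustration, not the script’s actual code: the names FIX_MAP and apply_custom_fixes are made up for the example, and the “cPickle” fix is a hypothetical second entry added to show how another fix would be registered.

```python
# Stand-alone sketch of a "pattern -> fix function" table.
# FIX_MAP and apply_custom_fixes are illustrative names, not part of the script.

def fix_builtin_ref(line):
    """Rename the Python 2 module name, including inside string literals."""
    return line.replace("__builtin__", "builtins")


def fix_cpickle_ref(line):
    """Hypothetical second fix: quoted references to cPickle."""
    return line.replace("cPickle", "pickle")


# Map a substring to the function that repairs lines containing it.
FIX_MAP = {
    "__builtin__": fix_builtin_ref,
    "cPickle": fix_cpickle_ref,
}


def apply_custom_fixes(text):
    """Return (fixed_text, number_of_lines_changed)."""
    fixed_lines = []
    changed = 0
    for line in text.split("\n"):
        for pattern, fix in FIX_MAP.items():
            if pattern in line:
                line = fix(line)
                changed += 1
                break  # at most one custom fix per line, as in the script
        fixed_lines.append(line)
    return "\n".join(fixed_lines), changed


source = 'import sys\nmod = __import__("__builtin__")\nname = "cPickle"\n'
fixed, count = apply_custom_fixes(source)
print(fixed)
print("custom fixes:", count)
```

Adding a further fix is then a matter of writing one more small function and one more dictionary entry. Plain substring replacement is blunt, so each pattern should be specific enough not to hit unrelated identifiers.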

Gaulois94 · July 12, 2013, 10:31pm

Hi. Is there any news about the migration to Python 3? Thanks.

Talkless · September 30, 2013, 4:12pm

I would like to bump this post too. Any progress?

rdb · September 30, 2013, 5:09pm

No, sorry, I’m swamped with tasks that are way more important.

cslos77 · October 1, 2013, 12:39am

I recently upgraded to a Win7/64-bit system so I’m going to start messing around with this again. I’ll post any developments.

Talkless · October 5, 2013, 4:18pm

Nice to hear, good luck!

rdb · December 17, 2013, 11:14pm

OK, I’ve just checked in a range of fixes regarding Python 3 support. Most importantly, I’ve fixed the segmentation fault on module load, which was caused by a stupid brainfart on my end. Sorry about that.

I’ve also altered makepanda to automatically invoke 2to3 for the “direct” tree when copying it over to the “built” directory. However, genPyCode doesn’t work, because the direct tree is broken. There’s too much cruft in there that still uses old APIs (like new.instancemethod and list.sort(cmp=x)) that can’t be ported so easily to Python 3 (and it isn’t done by 2to3). I’ve made a little bit of progress toward changing the source to use the newer APIs, but there’s still a lot of tedious work ahead.

I’d greatly appreciate any help on this, particularly with patching the original source to use newer APIs (rather than hacking custom fixers into 2to3) since a majority of them will still work in Python 2.
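As an illustration of the kind of source patch meant here, this is a minimal sketch of replacing list.sort(cmp=x) with an API that exists in both Python 2.7 and Python 3. The task list and the compare_by_priority function are invented for the example and do not come from the direct tree.

```python
from functools import cmp_to_key  # available since Python 2.7 and 3.2


def compare_by_priority(a, b):
    """Old-style comparison function: returns negative, zero or positive."""
    return (a[1] > b[1]) - (a[1] < b[1])


# Made-up data: (name, priority) pairs.
tasks = [("render", 50), ("input", 10), ("physics", 30)]

# Python 2 only:  tasks.sort(cmp=compare_by_priority)

# Option 1: keep the comparison function and wrap it.
by_cmp = sorted(tasks, key=cmp_to_key(compare_by_priority))

# Option 2: where the comparison is simple, use a plain key function.
by_key = sorted(tasks, key=lambda task: task[1])

print(by_cmp)
print(by_key)
```

Both forms give the same ordering here. The key-function form is usually the cleaner one when the comparison only looks at a single attribute.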

cslos77 · December 20, 2013, 2:28am

Hi, I can definitely still help with this. It’s been a while, but I think it was the segfault that prevented me from testing any further, so that fix alone should help. I remember that genPyCode was the first module that caused errors for me as well. I actually managed to get through a few module imports using hand substitutions, but I think the segfault eventually stopped me. I’m a bit confused though:

So you’re aiming for a source that’s compatible with both 2 and 3? I’m not sure if that’s possible without having those ugly hacks in certain places that test for version (i.e. incompatible import semantics). Sorry if I misunderstand what you mean here. Either way I’ll look back into what I was doing and see what I can do with that new build.
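For reference, the kind of version-testing hack meant here usually looks something like the sketch below. It is a generic pattern rather than anything taken from the Panda3D source; string_types and is_text are names chosen for the example.

```python
import sys

# Import the module under whichever name this interpreter provides.
try:
    import builtins  # Python 3 name
except ImportError:
    import __builtin__ as builtins  # Python 2 name

PY3 = sys.version_info[0] >= 3

# Names that only exist on one side need an explicit branch.
if PY3:
    string_types = (str,)
else:
    string_types = (basestring,)  # only evaluated on Python 2


def is_text(value):
    """True for any string type on either Python version."""
    return isinstance(value, string_types)


print(is_text("panda"), is_text(3))
```

A handful of these in one compatibility module is manageable, but scattering them through every file is what makes a single-source approach unattractive.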

rdb · December 21, 2013, 8:50am

No, that’s not what I’m talking about. I don’t intend to make the source compatible with both Python 2 and 3 (though “as close as we can comfortably get” sounds like a nice long-term goal). But there are two problems with using our source in Python 3 as it is:

(1) We use Python 2’s syntax and idioms, obviously.
(2) We use old APIs that were mostly already deprecated in Python 2.

Now (1) is really easy to solve, which is already more or less taken care of by the use of lib2to3. However, 2to3 doesn’t touch the more complicated things, such as our use of the “new” module or our use of a custom “cmp” function in sort(). In these cases there is usually no straightforward substitution. But we can manually replace these things with code that uses a newer API (one that wasn’t removed in Python 3) while still using Python 2 syntax and idioms.

So 2to3 will still be required, but 2to3 and a few module import changes are not enough by themselves.
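A minimal sketch of such a manual replacement for the “new” module, assuming the common case of binding a function to an instance: types.MethodType exists in both Python 2 and 3, so the patched line keeps working before and after 2to3 runs. The Actor class and describe function are invented for the example.

```python
import types


class Actor(object):
    def __init__(self, name):
        self.name = name


def describe(self):
    """Plain function that will be bound to one instance as a method."""
    return "Actor({})".format(self.name)


actor = Actor("panda")

# Python 2 only:  actor.describe = new.instancemethod(describe, actor, Actor)

# The two-argument form (function, instance) is accepted by both
# Python 2 and Python 3.
actor.describe = types.MethodType(describe, actor)

print(actor.describe())
```

Only this one instance gains the method; other Actor objects are unaffected, which matches what new.instancemethod did when given an instance.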

cslos77 · December 22, 2013, 10:59pm

Ok, thanks for clarifying. I’ll be able to work on this in the coming weeks; what’s the goal in terms of time? Is there any particular release you’re aiming to have this ready for?

rdb · December 22, 2013, 11:47pm

No, this is more like a pet project to me that I just spontaneously decide to have time for once every blue moon.

Thanks for the help!

rdb · December 23, 2013, 7:43pm

I just toyed a bit more, and it turned out to take surprisingly few changes to get genPyCode to work. In fact, I not only have the pandac.PandaModules imports working, even “from direct.directbase import DirectStart” now works.

I suppose the next step is to try and get some of the sample programs to work.

cslos77 · December 24, 2013, 1:56am

That’s good to hear; if you’ve got the core working then I guess the rest would just be the grunt work of going through all the other modules/samples and getting them working as well. Are the new py3 support changes in the latest buildbot version or would I have to build from the CVS version to start testing on this?

rdb · December 24, 2013, 10:13am

Well, all of the buildbot builds are compiled with Python 2, not 3. You’d have to grab the latest code from CVS in order to build with Python 3 support. You have to replace the Python version in thirdparty/win-python-x64 (or win-python for 32-bit systems) simply by removing that directory, running the appropriate Python 3 installer and pointing it at that directory (hit “install for this user only”).

You will need to see this in order to compile latest CVS:

Talkless · January 3, 2014, 3:14pm

It’s amazing to see progress on this

shimrod · January 10, 2014, 10:34am

Just for information:

http://www.robg3d.com/?p=1175

cslos77 · January 13, 2014, 3:37pm

Thanks for the link, definitely an interesting blog; I think a lot of that reasoning would also apply to panda3d, but there are a couple of differences. First of all, I’m guilty of mis-naming this thread, it really should have been something like “Branching to Python 3”, because I don’t think the goal here is to permanently migrate the entire code base to py3, but to give users the option of working with panda3d using py3 if they want. Also, for Eve the end user is a game player, so the language choice is not relevant to them, but for Panda3d obviously the end user is another developer so language choice has more bearing.

For myself, all my other projects are in py3 so it’s really just a matter of convenience; I have no problem working in py2, but the editor I use is in python and some features fail if I don’t use the version that matches the code I’m working on.

shimrod · January 14, 2014, 8:35am

I think my point was that nothing about Python 3 is urgent, and there are a lot of things still to develop for the current version that developers need.
From my point of view, I prefer enhancements to Panda over an upgrade to Python 3.