How do you recursively get all submodules in a python package?

Problem

I have a folder structure like this:

- modules
    - root
        - abc
            hello.py
            __init__.py
        - xyz
            hi.py
            __init__.py
        blah.py
        __init__.py
    foo.py
    bar.py
    __init__.py

Here is the same structure as a list of path strings:

"modules",
"modules/__init__.py",
"modules/foo.py",
"modules/bar.py",
"modules/root",
"modules/root/__init__.py",
"modules/root/blah.py",
"modules/root/abc",
"modules/root/abc/__init__.py",
"modules/root/abc/hello.py",
"modules/root/xyz",
"modules/root/xyz/__init__.py",
"modules/root/xyz/hi.py"

I am trying to print all the modules in Python's dotted import format. Example output would look like this:

modules.foo
modules.bar
modules.root.blah
modules.root.abc.hello
modules.root.xyz.hi

How can I do this easily in Python (if possible, without third-party libraries)?

What I tried
Sample Code
import pkgutil

import modules

absolute_modules = []


def find_modules(module_path):
    for package in pkgutil.walk_packages(module_path):
        print(package)
        if package.ispkg:
            find_modules([package.name])
        else:
            absolute_modules.append(package.name)


if __name__ == "__main__":
    find_modules(modules.__path__)
    for module in absolute_modules:
        print(module)

However, this code only prints 'foo' and 'bar', not 'root' and its subpackages. I'm also having trouble figuring out how to preserve the absolute import style: the current code only gets the bare package/module name, not the full dotted import path.

This uses setuptools.find_packages (for the packages) and pkgutil.iter_modules (for their submodules). Python 2 is supported as well. There is no need to write the recursion yourself; find_packages already walks the nested packages.

from os.path import join
from pkgutil import iter_modules
from setuptools import find_packages


def find_modules(path):
    modules = set()
    for pkg in find_packages(path):
        modules.add(pkg)
        pkgpath = join(path, *pkg.split('.'))
        # iter_modules yields plain 3-tuples on older versions and ModuleInfo
        # namedtuples on Python 3.6+, so tuple unpacking works on both.
        for _, name, ispkg in iter_modules([pkgpath]):
            if not ispkg:
                modules.add(pkg + '.' + name)
    return modules
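
If setuptools is not available, a similar listing can probably be built from pkgutil.iter_modules alone. The following is a minimal sketch, not part of the answer above: list_modules is a hypothetical helper name, and it assumes regular packages (directories that contain an __init__.py). It only inspects directories and imports nothing.

```python
import os
from pkgutil import iter_modules


def list_modules(path, prefix=""):
    """Return dotted names of all non-package modules found under path.

    list_modules is a hypothetical helper. Nothing is imported here:
    iter_modules only looks at the directory contents.
    """
    found = []
    for _, name, ispkg in iter_modules([path], prefix):
        if ispkg:
            # name already carries the prefix; strip it to get the folder name,
            # then descend with an extended dotted prefix.
            subdir = os.path.join(path, name[len(prefix):])
            found.extend(list_modules(subdir, name + "."))
        else:
            found.append(name)
    return found
```

Called as list_modules("modules", "modules.") from the directory that contains the modules folder, this should return the five dotted names from the question. Unlike the function above, it does not add the package names themselves.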

The code below prints each module's dotted path relative to the current working directory.

import os

cwd = os.getcwd()
for root, dirnames, filenames in os.walk(cwd):
    # Dotted package name, built from the path relative to the working directory.
    package = os.path.relpath(root, cwd).replace(os.sep, ".")
    # Skip the working directory itself and any folder that is not a package.
    if package == "." or not os.path.isfile(os.path.join(root, "__init__.py")):
        continue
    for filename in filenames:
        name, ext = os.path.splitext(filename)
        if ext == ".py" and name != "__init__":
            print(package + "." + name)

This code will display:

modules.foo
modules.bar
modules.root.blah
modules.root.abc.hello
modules.root.xyz.hi

That is the output if you run it from the directory that contains the modules folder.

I finally figured out how to do this cleanly and let pkgutil take care of the edge cases. This code is based on Python's help() function, which only displays top-level modules and packages.

import importlib
import pkgutil

import sys

import modules


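def find_abs_modules(module):
    path_list = []
    spec_list = []
    for importer, modname, ispkg in pkgutil.walk_packages(module.__path__):
        import_path = f"{module.__name__}.{modname}"
        if ispkg:
            # walk_packages is called without a prefix, so it will try to import
            # the subpackage under its bare name (e.g. "root") before recursing.
            # Load it under that name first so that import succeeds.
            # Note: _get_spec and _load are private, undocumented functions.
            spec = pkgutil._get_spec(importer, modname)
            importlib._bootstrap._load(spec)
            spec_list.append(spec)
        else:
            path_list.append(import_path)
    # Remove the temporary bare-name entries from sys.modules again.
    for spec in spec_list:
        del sys.modules[spec.name]
    return path_list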


if __name__ == "__main__":
    print(sys.modules)
    print(find_abs_modules(modules))
    print(sys.modules)

This works even for standard-library packages such as xml.
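
A variant that stays on the public API is sketched below. It relies on the documented prefix argument of pkgutil.walk_packages, which makes every yielded name fully qualified, so walk_packages can import each subpackage itself while recursing. The name list_submodules is a hypothetical helper, and the sketch assumes the package passed in is already imported and that importing its subpackages has no unwanted side effects.

```python
import pkgutil


def list_submodules(package):
    """Return dotted names of every non-package module below an imported package.

    list_submodules is a hypothetical helper name. walk_packages imports each
    subpackage it finds in order to read its __path__ and recurse into it.
    """
    prefix = package.__name__ + "."
    return [
        name
        for _, name, ispkg in pkgutil.walk_packages(package.__path__, prefix)
        if not ispkg
    ]
```

For the layout in the question, list_submodules(modules) should yield names such as modules.root.abc.hello. By default walk_packages silently skips subpackages that raise ImportError; pass an onerror callback to it if those failures should be reported.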

Comments
  • Why do you ask "without any third party libraries"? You are reinventing the wheel (pardon the pun), this is already implemented by pkg_resources (a part of the setuptools distribution).
  • Well, I want to learn how to do this so I can customize it
  • OK, but I'm still not seeing why that rules out third party libs.
  • Ummm, well the reason is because someone on IRC suggested using the gather library which introduces a @decorator into all the submodules that want to be collected. This is a terrible way to collect module names. As long as the module is actually in the stdlib, it should be fine. Should also be fine if the code is an actively maintained third party lib which in most cases it is not.
  • I don't have time to verify this works. So I've unmarked my answer as the correct version. However please note: len(find_abs_modules(xml)) == len(list(find_modules(xml.__path__[0]))) returns False and also shows _private modules.
  • Hmm ... it did not scan recursively in my use case, but it helped me come up with my own solution: I'm using pkg=importlib.import_module(import_path) and then recursively calling find_abs_modules(pkg).
  • @TheDiveO Can you tell me what the case was it didn't work for? I'm currently using this in a dev env and would like to patch any edge cases.
  • Inside the application (package) foobar I start scanning from package foobar.plugins (which was imported before scanning) and the scan function did not find the foobar.plugins.footest subpackage (which I did not import yet). footest has an __init__.py, but it was never found, even after importing it inside foobar.plugins.__init__. So I resorted to relying solely on importlib.import_module() because I need to import the plugins found anyway.
  • It's on Python 3.5.3 (Deb 9)
  • @TheDiveO Thanks for the details. I wrote this code for 3.6.5 upwards, so that may explain why it's not working for your case (calling private methods is a bad idea in general). I'll be updating this soon with public versions of the same functions so everyone can benefit. Thanks for your help!
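
The approach described in the comments above, importing each package with importlib.import_module() and then recursing into it, might look roughly like the sketch below. It is an illustration rather than the commenter's actual code: import_all is a hypothetical name, and it assumes every module under the package is safe to import, which fits the plugin use case where the modules have to be imported anyway.

```python
import importlib
import pkgutil


def import_all(package_name):
    """Import package_name and everything below it.

    Returns a dict mapping dotted names to module objects.
    import_all is a hypothetical helper name.
    """
    package = importlib.import_module(package_name)
    loaded = {package_name: package}
    prefix = package_name + "."
    for _, name, ispkg in pkgutil.iter_modules(package.__path__, prefix):
        if ispkg:
            # Recurse: the subpackage is imported by the nested call.
            loaded.update(import_all(name))
        else:
            loaded[name] = importlib.import_module(name)
    return loaded
```

Because each import goes through importlib.import_module(), an error in any module propagates to the caller instead of being skipped silently.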