Defining Hooks

coma has a template-based architecture. Hooks not only implement coma’s default behavior, but also make it easy to tweak, replace, or extend that behavior.

Hook Semantics

coma has very few baked-in assumptions. All of coma’s default behavior results from pre-defined hooks that have been chosen to fill its template slots. Nearly all of that behavior can be changed, even drastically, with user-defined hooks.

coma has 10 total hook slots in its template. To help users decide which hooks to define, each of these slots has semantics (the type of functionality that coma expects that particular hook slot will have). Most of these semantics are not hard requirements, and hook implementations are free to vary wildly from their semantics. That said, semantics provide a solid base from which to explore this space.

At the highest level, hooks belong to one of two semantic types: parser and invocation.

Parser Hooks

Parser hooks are the only type of hook that gets executed at command registration time (prior to command invocation). The semantics of the parser_hook are to add command line flags via calls to add_argument() on the underlying ArgumentParser bound to the command that will later be invoked.

Parser hooks for all declared commands are executed (at registration time). This is needed so that argparse has all the necessary information to invoke the correct command based on the provided command line arguments. This means that parser hooks with side effects will always execute those side effects, even if the command they are bound to isn’t the one that ultimately gets invoked.
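The registration-time behavior described above can be sketched with plain argparse. This is only an illustration, not coma's implementation: register_all, make_parser_hook, and the executed list are hypothetical names, and the hooks here receive the parser directly rather than a ParserData object.

```python
import argparse

executed = []  # records which (hypothetical) parser hooks have run


def make_parser_hook(command_name, flag):
    """Build a toy parser hook with a visible side effect."""
    def parser_hook(parser):
        executed.append(command_name)  # side effect happens at registration
        parser.add_argument(flag)
    return parser_hook


def register_all(hooks):
    """Run every command's parser hook, whichever command is later invoked."""
    root = argparse.ArgumentParser(prog="main.py")
    subparsers = root.add_subparsers(dest="command")
    for name, hook in hooks.items():
        hook(subparsers.add_parser(name))
    return root


root = register_all({
    "train": make_parser_hook("train", "--epochs"),
    "test": make_parser_hook("test", "--split"),
})
args = root.parse_args(["train", "--epochs", "3"])
print(executed)      # both hooks ran, although only "train" was invoked
print(args.command)
```

Only train is invoked here, yet the side effect of the test parser hook has already happened by the time the arguments are parsed.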

Invocation Hooks

Invocation hooks are stored during command registration, but only get executed if the command to which they are bound is invoked (i.e., is chosen by argparse based on the provided command line arguments). Invocation hooks, as their name suggests, are responsible for completing all necessary steps involved in successfully invoking the command to which they are bound.

Invocation hook semantics further belong to one of three sub-types:

Config:

Config hooks are meant to initialize or affect the config objects that are bound to a particular command.

Init:

Init hooks are meant to instantiate or affect the command object (either a function or a class).

Warning

Function-based command objects are internally wrapped in a programmatically-generated class, and it is this wrapper class that an init hook receives, not the raw function object. This unifies the interface, since (from the perspective of an init hook) all command objects are classes that ought to be instantiated.
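To make the idea of a programmatically-generated wrapper class concrete, here is a minimal sketch. It assumes the wrapper simply stores the call arguments and exposes a run() method; wrap_function is a hypothetical helper, and coma's actual wrapper may differ in its details.

```python
def wrap_function(fn):
    """Hypothetical helper: turn a function into a class with a run() method."""
    def __init__(self, *args, **kwargs):
        # Instantiation only stores the arguments; nothing is executed yet.
        self._args, self._kwargs = args, kwargs

    def run(self):
        # Running the instance calls the original function.
        return fn(*self._args, **self._kwargs)

    return type(fn.__name__, (), {"__init__": __init__, "run": run})


def greet(name="world"):
    return f"Hello, {name}!"


Command = wrap_function(greet)      # what an init hook would receive
instance = Command(name="coma")     # what an init hook would produce
print(instance.run())
```

From the perspective of an init hook, Command looks like any other class to be instantiated.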

Run:

Run hooks are meant to execute or surround the execution of the command object after it has been instantiated (presumably by an init hook).

Each of the three invocation hook sub-types (config, init, and run) is further split into three flavors:

Pre:

Pre hooks are executed immediately before the main hook of the same type as a way to add additional behavior.

Main:

Main hooks are generally meant to perform the bulk of the work for that semantic category (config, init, or run).

Post:

Post hooks are executed immediately after the main hook of the same type as a way to add additional behavior.

Altogether, there are 9 invocation hooks. The following keywords are used in @command and wake() to define command and shared hooks, respectively:

Type        Sub-Type  Flavor  Keyword
parser      N/A       N/A     parser_hook
invocation  config    pre     pre_config_hook
invocation  config    main    config_hook
invocation  config    post    post_config_hook
invocation  init      pre     pre_init_hook
invocation  init      main    init_hook
invocation  init      post    post_init_hook
invocation  run       pre     pre_run_hook
invocation  run       main    run_hook
invocation  run       post    post_run_hook

Hook Pipeline

As stated above, parser hooks are executed when a command is registered, whereas the invocation hooks are executed if, and only if, the command to which they are bound is invoked by argparse. The invocation hook pipeline consists of executing all the invocation hooks (in order) one immediately following the other, with no other code in between. In other words, the invocation hooks make up the entirety of the code responsible for completing all necessary steps involved in successfully invoking the command to which they are bound.
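As a rough sketch of the pipeline, assume the nine invocation hooks run in the order config, init, run, each as pre, main, post. HOOK_ORDER and invoke are hypothetical names used only for illustration, and hooks set to None are simply skipped.

```python
SUB_TYPES = ("config", "init", "run")
FLAVORS = ("pre_", "", "post_")  # the main flavor has no prefix

# pre_config_hook, config_hook, post_config_hook, pre_init_hook, ...
HOOK_ORDER = [f"{flavor}{sub}_hook" for sub in SUB_TYPES for flavor in FLAVORS]


def invoke(hooks, data):
    """Execute the provided hooks back to back, in pipeline order."""
    for keyword in HOOK_ORDER:
        hook = hooks.get(keyword)
        if hook is not None:
            hook(data)
    return data


trace = []
hooks = {
    keyword: (lambda data, keyword=keyword: data.append(keyword))
    for keyword in ("run_hook", "pre_init_hook", "config_hook")
}
invoke(hooks, trace)
print(trace)  # hooks ran in pipeline order, not in declaration order
```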

Hook Protocol

To enable interoperability between hooks (especially in the hook pipeline), all hooks must follow a specific protocol (i.e., function signature). All hooks, regardless of semantics, must take exactly one parameter. For parser hooks, this parameter is a ParserData object, whereas it is an InvocationData object for invocation hooks. Both of these inherit from HookData, and it is perfectly acceptable to subclass any of these to add additional attributes needed in custom hooks.

Hooks typically modify their input parameter in place and return None. However, a hook can instead return a new object (of the same type as its input parameter) derived from the input parameter, leaving the input untouched. Each subsequent hook in the pipeline receives the most recent non-None object returned by a preceding hook, or the original input if no preceding hook has returned one.
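The protocol can be sketched as follows. The HookData class below is a simplified, hypothetical stand-in (not coma's HookData), and run_hooks is an illustrative helper showing how a non-None return value replaces the object passed to later hooks.

```python
from dataclasses import dataclass, field, replace


@dataclass
class HookData:  # hypothetical stand-in for the real hook data classes
    name: str
    notes: list = field(default_factory=list)


def mutate_in_place(data):
    data.notes.append("mutated")  # implicitly returns None


def return_new(data):
    # Return a derived object instead of modifying the input.
    return replace(data, notes=data.notes + ["replaced"])


def run_hooks(hooks, data):
    """Pass along the latest non-None return value to each subsequent hook."""
    for hook in hooks:
        result = hook(data)
        if result is not None:
            data = result
    return data


original = HookData(name="cmd")
final = run_hooks([mutate_in_place, return_new, mutate_in_place], original)
print(original.notes)
print(final.notes)
```

The last hook modifies the object returned by return_new, so original is no longer affected after the second step.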

Default Hooks

Rather than being hardcoded, coma’s default behavior results almost entirely from specific pre-defined hooks being used as default values in the definition of wake(). These defaults propagate to all command declarations unless explicitly redefined. The upshot is that there is almost no part of coma’s default behavior that cannot be tweaked, replaced, or extended through hooks.

That being said, coma’s default hooks already provide extensive functionality. Of coma’s 10 total hooks, only 4 have pre-defined defaults: the parser_hook, the main config_hook, the main init_hook, and the main run_hook. All default hooks are generated from factory functions with default parameters.

Note

Factories enable behavioral tweaks as one-liners: a default hook can be redefined by calling its factory with a single changed parameter. For example, run_hook.default_factory() can be used to change the command execution method name from the default run() to something else. See here.

Browse the hooks’ package reference to explore factory options. Factory function names always end with *_factory. All the default factories are named default_factory and can be found in their respective hook-semantic module. For example, the default factory for run_hook is found in coma.hooks.run_hook.default_factory().
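The factory pattern itself is easy to sketch in isolation. In the example below, make_run_hook and its method_name parameter are hypothetical and are not the signature of run_hook.default_factory(); the point is only that a single changed argument yields a hook variant.

```python
from types import SimpleNamespace


def make_run_hook(method_name="run"):
    """Hypothetical factory: build a run hook that calls the named method."""
    def hook(data):
        getattr(data.command, method_name)()
    return hook


class Job:
    def __init__(self):
        self.calls = []

    def run(self):
        self.calls.append("run")

    def start(self):
        self.calls.append("start")


data = SimpleNamespace(command=Job())  # stand-in for the hook's input
make_run_hook()(data)                      # default: calls run()
make_run_hook(method_name="start")(data)   # one-line tweak: calls start()
print(data.command.calls)
```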

If you are finding that the factory functions are insufficient, consider making use of the many config-related utilities found here to help you in writing your own custom hooks.

In the explanations below, data refers to the input parameter of the hook (ParserData for parser hooks and InvocationData for invocation hooks).

Default Parser Hook:

The default parser_hook uses data.persistence_manager to add, for each serializable config, a parser path argument. This enables an explicit file path to the config file to be specified on the command line via a flag. By default, the flag is --{config_name}-path, where config_name is the name of the corresponding config parameter in the command signature.
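In plain argparse terms, the flag naming described above amounts to something like the sketch below. add_path_flags is a hypothetical helper, and the config names are made up; the real hook goes through data.persistence_manager rather than calling add_argument() directly like this.

```python
import argparse


def add_path_flags(parser, config_names):
    """Add one --{config_name}-path flag per (hypothetical) config name."""
    for name in config_names:
        parser.add_argument(f"--{name}-path", default=None)


parser = argparse.ArgumentParser()
add_path_flags(parser, ["model", "optimizer"])
args = parser.parse_args(["--model-path", "configs/model.yaml"])
print(args.model_path)      # explicit path given on the command line
print(args.optimizer_path)  # no path given, so the default applies
```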

Default Main Config Hook:

The default config_hook does all the heavy lifting for manifesting coma’s default behavior regarding configs. It makes the following assumptions:

  • Configs are declarative. They should always follow the declarative hierarchy.

  • Declared configs are required. This means that declared configs (both in the command’s signature and any supplemental configs) are loaded (based on the declarative hierarchy) by default.

  • Persistence of configs is typically desirable. This means that, by default, all serializable configs are serialized (to enable the middle step of the declarative hierarchy), but skipping serialization for a particular config is easy.

In short, for each config, this hook initializes the config based on the declarative hierarchy protocol:

  • At minimum, each config is initialized from its base declaration.

  • Serializable configs are then loaded from file (if one exists) or written to file (otherwise) unless serialization has been explicitly toggled off for that particular config. Serialization interacts with the default parser_hook since it queries the same data.persistence_manager to get the file path of each config based on its path declaration in the default parser_hook. See here for more details on config files.

  • For each config, an attempt is made to override its config attribute values with any command line arguments that fit omegaconf’s dot-list notation.
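The three steps above can be sketched with the standard library alone. This is a deliberately simplified stand-in: it uses JSON files and a naive key=value parser for top-level attributes, whereas coma relies on omegaconf and its dot-list notation. Training and load_config are hypothetical names.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Training:  # hypothetical base config declaration
    lr: float = 0.1
    epochs: int = 10


def load_config(cls, path, cli_args):
    config = asdict(cls())                        # 1. base declaration
    if path.exists():                             # 2. load from file ...
        config.update(json.loads(path.read_text()))
    else:                                         #    ... or write to file
        path.write_text(json.dumps(config))
    for arg in cli_args:                          # 3. command line overrides
        key, sep, value = arg.partition("=")
        if sep and key in config:
            config[key] = type(config[key])(value)  # naive cast to field type
    return cls(**config)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "training.json"
    first = load_config(Training, path, [])       # no file yet: writes defaults
    file_written = path.exists()
    path.write_text(json.dumps({"lr": 0.01, "epochs": 10}))  # user edits file
    second = load_config(Training, path, ["epochs=3"])

print(first)
print(second)
```

In the second call, lr comes from the file and epochs from the command line, which mirrors the order of precedence in the declarative hierarchy.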

Note

Each config variant in the declarative hierarchy is stored so that later hooks can access any variant (if needed). This is particularly helpful in cases where some configs need to be preloaded before others.

The config_hook’s default factory includes many flags for tweaking the default behavior. For example, you can skip the override or the serialization of some configs but not others. Or you can raise a FileNotFoundError if a particular config file cannot be found. Or even force the serialization of the override values rather than the base config declaration.

Default Main Init Hook:

The default init_hook instantiates the data.command class by calling its __init__() method with all declared parameters (config, inline, and regular) filled in through the call_on() method of data.parameters. The value of data.command (a class type) is then replaced in place with the instantiated object.

Warning

In user-defined hooks, be sure to never make decisions based on directly inspecting the data.command object. Not only are function-based commands implicitly wrapped in a class, but also the value of data.command changes from a class type to an instance of that class as part of this default init hook.

Instead, use data.name if you need to determine which command is being invoked, since the command name is guaranteed to be unique across all declared commands.
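The sketch below illustrates why inspecting data.command is fragile. The data object is a simplified, hypothetical stand-in built with SimpleNamespace, and the kwargs attribute replaces the real call_on() mechanism of data.parameters.

```python
import inspect
from types import SimpleNamespace


class Greet:
    def __init__(self, greeting="Hello"):
        self.greeting = greeting


def init_hook(data):
    # Replace the class with an instance of that class, in place.
    data.command = data.command(**data.kwargs)


data = SimpleNamespace(name="greet", command=Greet, kwargs={"greeting": "Hi"})

before = inspect.isclass(data.command)  # a class before the init hook
init_hook(data)
after = inspect.isclass(data.command)   # an instance after the init hook

print(before, after)
print(data.name)  # stable throughout the pipeline
```

A check such as data.command is Greet gives different answers before and after the init hook, whereas data.name stays the same.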

Default Main Run Hook:

The default run_hook calls a method of the data.command object with no parameters. The method is run() by default, though this can be changed. This assumes that the init_hook has already converted data.command from a class type into an instance.

Hooks as Sequences

Typically, a hook is a function with a signature based on the hook protocol. However, there are three additional (non-function) sentinel objects (SHARED, DEFAULT, and None) that have special meaning as command and/or shared hook values. A valid “plain” hook can be any single function adhering to the hook protocol or any single one of these three sentinels.

In addition, any (recursively) nested sequence of these singular/plain values is also a valid hook. Each item in these sequences is recursively inspected for the presence of any of the three sentinels, which are replaced at runtime with their semantically equivalent function. This is particularly useful for extending coma’s default behavior, rather than outright replacing it. To emphasize the recursive potential of nested hook sequences, consider this toy example:

from coma import command, wake, DEFAULT

@command(
    run_hook=(
        (
            None,
            lambda _: print("First"),
        ),
        lambda _: print("Second"),
        (
            (
                (
                    (
                        DEFAULT,
                        lambda _: print("Fourth"),
                    ),
                ),
            ),
        ),
        None,
        (),
        lambda _: print("Last"),
    ),
)
def nested():
    print("Third")

if __name__ == "__main__":
    wake()

Let’s see how coma resolves the nested sequences:

$ python main.py nested
First
Second
Third
Fourth
Last

Notice that DEFAULT gets replaced at runtime with the default run_hook which runs the command and prints Third at that position in the nested sequences.
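One way to picture the resolution is as a recursive flattening step followed by in-order execution. The sketch below is an assumption about the general approach, not coma's code: the sentinels are local stand-ins, flatten and compose are hypothetical helpers, and None is treated as contributing no function.

```python
DEFAULT = object()  # local stand-ins for the sentinels
SHARED = object()


def flatten(hook, default_hook, shared_hook):
    """Recursively resolve sentinels and nesting into a flat list of functions."""
    if hook is None:
        return []
    if hook is DEFAULT:
        return [default_hook]
    if hook is SHARED:
        return [shared_hook]
    if callable(hook):
        return [hook]
    flat = []
    for item in hook:  # a (possibly empty) nested sequence
        flat.extend(flatten(item, default_hook, shared_hook))
    return flat


def compose(hook, default_hook, shared_hook):
    """Wrap the flattened functions into one hook that follows the protocol."""
    hooks = flatten(hook, default_hook, shared_hook)

    def composed(data):
        for h in hooks:
            result = h(data)
            if result is not None:
                data = result
        return data

    return composed


out = []
default_run = lambda _: out.append("Third")  # plays the default run_hook
nested = (
    (None, lambda _: out.append("First")),
    lambda _: out.append("Second"),
    ((((DEFAULT, lambda _: out.append("Fourth")),),),),
    None,
    (),
    lambda _: out.append("Last"),
)
compose(nested, default_run, default_run)(None)
print(out)
```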

Beyond this toy example, sequences are helpful in practice for decomposing a complex hook function into a series of smaller ones. Often these component functions will be hook variants created using factories. Hook sequences essentially wrap each component function into a higher-order function that executes the components in order following the rules of the hook protocol.

As an extreme example, we could redefine the pre_config_hook of a command to stuff the entire default invocation pipeline into it while setting the standard hooks to None:

from coma import command, wake, config_hook, init_hook, run_hook

@command(
    pre_config_hook=(
        config_hook.default_factory(),
        init_hook.default_factory(),
        run_hook.default_factory(),
    ),
    config_hook=None,
    init_hook=None,
    run_hook=None,
)
def cmd():
    print("No problem!")

if __name__ == "__main__":
    wake()

This example also highlights the utility of pre and post hooks. They are really just conceptual convenience functions. All functionality could in principle be placed in a single hook sequence as shown here. The benefit of multiple hook types and sub-types with differing semantics is to help conceptually separate concerns. Consider that, in this example, we defined a pre_run_hook that exits the program before running the command. In principle, we could have implemented this same functionality by redefining the run_hook as (pre_run_hook, SHARED). However, because the new functionality is an early exit (before running the command), it feels conceptually cleaner to exit in a separate pre_run_hook, rather than in an initial component of the run_hook in the invocation pipeline. This distinction is purely conceptual. The resulting behavior is essentially equivalent.