Declaring Commands
Using coma primarily consists of registering commands using the
@command decorator. The basic functionality
of @command is explained in the introductory tutorial.
Here, the emphasis is on more advanced use cases.
Command Hooks
coma has a template-based design that leverages hooks to implement, tweak, replace,
or extend its behavior. See the dedicated hooks tutorial for details on
coma’s hook protocols. In this tutorial, we’ll assume prior knowledge of hooks,
while demonstrating how to use hooks in @command to extend coma’s default
behavior for a particular command declaration.
Specifically, any of coma’s 10 total hooks can be
redefined in @command using the corresponding keyword argument:
from coma import command

@command(
    parser_hook=...,
    pre_config_hook=...,
    config_hook=...,
    post_config_hook=...,
    pre_init_hook=...,
    init_hook=...,
    post_init_hook=...,
    pre_run_hook=...,
    run_hook=...,
    post_run_hook=...,
)
def my_cmd(...):
    ...
Typically, a hook is a function with a specific signature. However, there are three additional (non-function) sentinel objects that have special meaning as command hook values.
Any hook in @command that is not explicitly redefined defaults to the
SHARED sentinel. SHARED indicates that this hook, for this particular
command, ought to be replaced at runtime with the value of the corresponding
hook from wake().
Nearly all of coma’s default behavior results from pre-defined hooks declared
in wake(). SHARED copies that functionality to command declarations.
See here for the full details on shared hooks. The
relevant point here is that any hook declared in wake() is automatically
propagated to command declarations, not because that behavior is baked
into coma, but rather because each hook in @command defaults to SHARED.
By explicitly redefining a @command hook to some value other than SHARED,
we are able to affect just that particular command declaration. What if we redefine
one of the shared hooks in wake() to some user-defined hook, but we still want
to maintain the default hook behavior for a specific command? This can happen, for
example, if we want custom (non-default) behavior for most hooks, but we want to
retain the default behavior for a few hooks. This is where the next special sentinel
comes in: DEFAULT. DEFAULT gets replaced at runtime
with the corresponding default hook regardless of the value of wake()’s hook.
Finally, a particular hook can be disabled altogether by setting its value to None.
Although None is a built-in Python object, here it is being used as a sentinel to
mean “skip this hook” (though, in practice, we replace it with the no-op
identity() function rather than truly skipping it).
The dedicated hooks tutorial also emphasizes that hooks can be single/plain values,
or they can be nested sequences of such values. These
nested sequences (if any) are recursively inspected for the presence of any of these
three sentinels (SHARED, DEFAULT, and None). These are replaced at
runtime with their semantic equivalent function. This is particularly useful to
add behavior to the shared hook, rather than outright replacing it. For example:
from coma import command, SHARED

@command(
    parser_hook=(SHARED, additional_hook),
    ...,
)
def my_cmd(...):
    ...
means the parser_hook for this command declaration will first call the shared
parser_hook defined in wake() and then call additional_hook. The order
here matters. Having SHARED after additional_hook calls them in the
reverse order.
Summary:
- By default, an undefined @command-level hook falls back to the corresponding SHARED hook defined in wake(). In general, we think in terms of the wake()-level hook as propagating to each command declaration by default (unless an explicit @command-level definition is given).
- By default, the hooks defined in wake() are precisely those that give coma its default behavior as explored throughout these tutorials. That is how each command declaration comes to inherit this same default behavior. It is not baked into @command.
- If a wake()-level hook is redefined, the default coma behavior can be recovered in a particular command declaration by defining its @command-level hook as DEFAULT.
- Setting a hook to None disables (skips) that particular hook. This is the idiomatic way to prevent a wake()-level hook from propagating to a particular command.
- Hook definitions can be plain/simple objects, or sequences thereof. In particular, setting a @command-level hook to (SHARED, additional_hook) is the idiomatic way to add additional behavior to a particular command beyond what is specified in the shared hook. Note that the order here matters: (SHARED, additional_hook) != (additional_hook, SHARED).
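To make the sentinel semantics concrete, here is a minimal standalone sketch of how a nested hook specification could be flattened into an ordered list of callables. It does not use coma and is not coma's implementation: the SHARED and DEFAULT objects, identity(), and resolve() below are hypothetical stand-ins defined only for this illustration.

```python
from typing import Any, Callable

# Hypothetical stand-ins for the sentinels; these are NOT coma's objects.
SHARED = object()
DEFAULT = object()


def identity(data: Any) -> Any:
    """No-op hook substituted for None."""
    return data


def resolve(hook: Any, shared: Callable, default: Callable) -> list:
    """Flatten a (possibly nested) hook spec into an ordered list of callables."""
    if hook is SHARED:
        return [shared]
    if hook is DEFAULT:
        return [default]
    if hook is None:
        return [identity]
    if isinstance(hook, (list, tuple)):
        resolved = []
        for item in hook:
            # Recurse so that sentinels nested at any depth are replaced.
            resolved.extend(resolve(item, shared, default))
        return resolved
    return [hook]  # A plain callable is used as-is.


calls = []
shared_hook = lambda data: calls.append("shared")
default_hook = lambda data: calls.append("default")
additional_hook = lambda data: calls.append("additional")

for fn in resolve((SHARED, (None, additional_hook)), shared_hook, default_hook):
    fn(None)
print(calls)  # ['shared', 'additional']
```

Swapping the order to (additional_hook, SHARED) would record "additional" first, which is the ordering point made in the summary.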
Let’s see how a few hooks can easily extend a command’s functionality beyond coma’s
defaults. In this example, we define a parser_hook that adds a new --dry-run
flag to the command line, as well as a pre_run_hook that exits the program early
(before the command is actually executed) if that flag is given on the command line:
from coma import InvocationData, add_argument_factory, command, wake, SHARED

parser_hook = add_argument_factory("--dry-run", action="store_true")

def pre_run_hook(data: InvocationData):
    if data.known_args.dry_run:
        print(f"Early exit for command: {data.name}")
        quit()

@command(
    parser_hook=(SHARED, parser_hook),
    pre_run_hook=(SHARED, pre_run_hook),
)
def greet():
    print("Hello World!")

if __name__ == "__main__":
    wake()
Let’s see this new functionality in action:
$ python main.py greet
Hello World!
$ python main.py greet --dry-run
Early exit for command: greet
Note
coma provides factory functions for some of the more common hooks. In this
example, we used add_argument_factory(), which simply
creates a parser_hook that in turn relays the provided parameters to the
add_argument()
method of the underlying ArgumentParser
bound to this command.
Most hooks have factories to enable behavioral tweaks as one-liners as seen
here. Browse the hooks’ package reference
for details. Factory function names always end with *_factory.
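For readers less familiar with the factory pattern, the following sketch shows the general idea using only the standard library. The name make_add_argument_hook is hypothetical, and the hook here receives the parser directly, which is a simplification: it is not coma's hook signature or implementation.

```python
import argparse


def make_add_argument_hook(*args, **kwargs):
    """Hypothetical factory: returns a hook that relays its arguments to
    ArgumentParser.add_argument(). This is not coma's implementation."""

    def hook(parser: argparse.ArgumentParser) -> None:
        parser.add_argument(*args, **kwargs)

    return hook


parser = argparse.ArgumentParser(prog="main.py")
dry_run_hook = make_add_argument_hook("--dry-run", action="store_true")
dry_run_hook(parser)

# parse_known_args() separates recognized flags from everything else.
known_args, unknown_args = parser.parse_known_args(["--dry-run", "x=1"])
print(known_args.dry_run)  # True
print(unknown_args)        # ['x=1']
```

The factory captures the arguments once, so the resulting hook can be applied as a one-liner wherever a parser is available.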
Command Signature Inspection
How does @command inspect the command signature to determine which command
parameters are configs and which are regular parameters?
@command accepts an optional SignatureInspectorProtocol
to which the signature inspection is delegated. When no explicit signature inspector
is given, the default is a SignatureInspector with default
parameters. Here, we’ll explore the parameter space of SignatureInspector.
This forms the basis of coma’s default behavior, but is not baked into
@command. In fact, tweaking the default (particularly with inline configs)
is quite common, as we will see.
SignatureInspector is just a lightweight wrapper around
ParamData.from_signature(),
which does all the heavy lifting. We’ll explore from_signature()’s parameter
options in the upcoming example. But first,
let’s get a basic sense of how the command signature is inspected.
Configs vs Regular Parameters
The distinction between configs and
other_parameters
(which we will interchangeably call regular parameters) in a command’s signature is
determined by inspecting its type annotation (if any), its default value
(if any), its kind,
and whether the parameter is marked as inline.
Configs take priority over regular parameters. If a parameter can be considered
a config (as per the criteria below), it is treated as one. All parameters that
cannot be interpreted as configs are assumed to be regular parameters unless
marked as inline.
Criteria for Interpreting a Parameter as a Config:
1. The parameter has a type annotation that exactly matches one of list, dict, or any dataclass type. We refer to these as config annotations.
2. The parameter does not have a default value. Since configs enjoy a dedicated declarative initialization protocol, default parameter values are not needed.

   Note
   This means that a convenient way to ensure that a config-annotated parameter is interpreted as a regular parameter is to give it a default. For example, list_cfg: list is interpreted as a config whereas non_cfg_list: list = None is interpreted as a regular parameter.

3. The parameter is not marked inline. Even if the parameter otherwise conforms to criteria (1) and (2), being marked inline is disqualifying.
Special case: Because variadic positional (*args) and variadic keyword (**kwargs) parameters cannot be assigned defaults in Python, and because they can never be marked as inline, criteria (2) and (3) cannot be used for them. Instead, use the special flags SignatureInspector.args_as_config and SignatureInspector.kwargs_as_config, which are passed directly to ParamData.from_signature() to toggle whether variadic parameters are interpreted as configs or regular parameters. By default, they are interpreted as configs.
See the example below to get a better sense of how this gets applied.
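As a rough illustration of the config criteria above, the sketch below labels each parameter of a function using only the standard inspect module. The classify() helper and its return format are hypothetical and are not how coma implements ParamData.from_signature(). In particular, it does not validate inline parameters.

```python
import inspect
from dataclasses import dataclass, is_dataclass


def is_config_annotation(annotation) -> bool:
    """True for exactly `list`, `dict`, or a dataclass type."""
    return annotation in (list, dict) or (
        isinstance(annotation, type) and is_dataclass(annotation)
    )


def classify(fn, inline=(), args_as_config=True, kwargs_as_config=True) -> dict:
    """Hypothetical helper that labels each parameter of `fn`."""
    labels = {}
    for name, param in inspect.signature(fn).parameters.items():
        if param.kind is param.VAR_POSITIONAL:
            labels[name] = "config" if args_as_config else "regular"
        elif param.kind is param.VAR_KEYWORD:
            labels[name] = "config" if kwargs_as_config else "regular"
        elif name in inline:
            labels[name] = "inline"  # Criterion (3): inline is disqualifying.
        elif is_config_annotation(param.annotation) and param.default is param.empty:
            labels[name] = "config"  # Criteria (1) and (2) both hold.
        else:
            labels[name] = "regular"
    return labels


@dataclass
class Config:
    y: float = 3.14


def cmd(cfg: Config, list_cfg: list, non_cfg_list: list = None, *args, **kwargs):
    ...


labels = classify(cmd, args_as_config=False)
print(labels)
```

Here cfg and list_cfg are labeled as configs, non_cfg_list becomes a regular parameter because of its default, and the variadic parameters follow their respective flags.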
Inline Configs
An inline parameter is a one-off config field. Specifically, all parameters marked
as SignatureInspector.inline are
aggregated into a special inline_config, which is
backed by a programmatically-created dataclass. This provides all the rigorous
runtime type validation of a standard dataclass-backed omegaconf config without
requiring a user-defined dataclass to be created just for these one-off fields.
Moreover, inline configs are non-serializable,
whereas a user-defined dataclass aggregating the same fields would be serializable
by default.
On mutable inline default values:
An inline parameter requires a default value (see criterion (2) below). Because
it is un-Pythonic to declare a mutable default value in a function definition,
it can be tricky to set a good default value for inline parameters. For
example, Python recommends a default value of inline_list: list | None = None
rather than the mutable inline_list: list = [] to avoid accidentally sharing
the mutable inline_list between function calls.
To circumvent this, each item in the SignatureInspector.inline container
can consist of either just the name of the parameter to mark as inline,
or a 2-tuple where the first value is the parameter’s name and the second
value is a default_factory conforming to the requirements of the same
argument to dataclasses.field().
See the example below for details.
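The role of default_factory can be seen with the standard library alone. The sketch below builds a dataclass programmatically with dataclasses.make_dataclass(). Whether coma builds its inline config this exact way is an assumption, and the name InlineConfig is purely illustrative.

```python
from dataclasses import field, make_dataclass

# Each entry is (name, type, field spec). `my_list` gets a default_factory so
# every instance receives its own fresh list; `out_file` uses a plain default
# because strings are immutable.
InlineConfig = make_dataclass(
    "InlineConfig",
    [
        ("out_file", str, field(default="out.txt")),
        ("my_list", list, field(default_factory=list)),
    ],
)

first, second = InlineConfig(), InlineConfig()
first.my_list.append("bar")
print(first)   # InlineConfig(out_file='out.txt', my_list=['bar'])
print(second)  # InlineConfig(out_file='out.txt', my_list=[])
```

Because the factory is called once per instance, mutating one instance's list leaves the other untouched, which is the sharing problem a mutable default in a function signature would cause.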
Criteria for Interpreting a Parameter as Inline:
1. The parameter has a type annotation. A missing annotation is disqualifying.
2. The parameter has a default value. A missing default value is disqualifying. The default value can be specified directly in the command’s signature, or it can be provided as a default_factory to SignatureInspector.inline. It is an error to specify both a default value and a default factory.
3. The default value (or the return value of the default factory) is a valid instance of the annotation type. If not, omegaconf will raise a ValidationError.
4. The parameter’s name is found in SignatureInspector.inline. If this is true, but one of the above criteria is violated, an error is raised. If this is false, the parameter is considered not marked as inline and is instead treated as a regular parameter.
5. The parameter’s kind is not variadic positional or variadic keyword. These two special cases can be configs or regular parameters, but never inline. This is done to avoid duplicate parameter values when executing the command at runtime.
See the example (next) to get a better sense of how this gets applied.
Example
In this example, even though Data is a dataclass, the data parameter is not
considered a config because of its non-config annotation (Optional[Data]) and its
None default value (either of which is disqualifying on its own).
On the other hand, both out_file and my_list can be overridden on the command
line because of their inline declaration. Notice further that because my_list is
a mutable type, we specify a default_factory as part of the inline declaration,
rather than providing a mutable default directly in the command signature. That is
not necessary for out_file because strings are immutable in Python.
List-like command line arguments are appended to my_list because it is
marked inline. However, list-like arguments are not given to *args because
args_as_config is False. On the other hand, because kwargs_as_config is
True (implicitly, by default), any dict-like command line arguments are given to
**kwargs.
from coma import SignatureInspector, command, wake
from dataclasses import dataclass
from typing import Optional

@dataclass
class Data:
    x: int = 42

@dataclass
class Config:
    y: float = 3.14

@command(
    signature_inspector=SignatureInspector(
        args_as_config=False, inline=["out_file", ("my_list", list)]
    ),
)
def cmd(
    cfg: Config,
    my_list: list,
    data: Optional[Data] = None,
    out_file: str = "out.txt",
    *args,
    **kwargs,
):
    print("cfg is:", cfg)
    print("my_list is:", my_list)
    print("data is:", data or Data())
    print("out_file is:", out_file)
    print("*args is:", args)
    print("**kwargs is:", kwargs)

if __name__ == "__main__":
    wake()
Invoking the command with some carefully chosen overrides to highlight these differences results in the following:
$ python main.py cmd x=1 y=2 z inline::out_file=foo.txt inline::my_list='[bar]'
cfg is: Config(y=2.0)
my_list is: ['bar']
data is: Data(x=42)
out_file is: foo.txt
*args is: ()
**kwargs is: {'x': 1, 'y': 2}
$ ls
main.py
cfg.yaml
Notice that:
- The list-like argument 'z' is not in *args because *args is not a config (otherwise, it would have been in *args). It is also not in my_list because my_list is an inline config and so adding to my_list requires an explicit omegaconf dot-list notation to be used (inline::my_list='[bar]' in this example). See here for further explanation.
- **kwargs includes both dict-like arguments (x and y).
- out_file is overridden. Just like my_list, we prefixed out_file with the inline config name ("inline"). See the next point for an explanation.
- out_file and my_list are prefixed with the inline config name ("inline") to prevent **kwargs from also containing these fields. See here for further explanation. The upshot relevant to this discussion is that including either field in **kwargs would result in a runtime error from named parameters appearing multiple times in the command’s parameter list (which is a TypeError in Python).
- Because cfg is a config, its y attribute was overridden. Notice that both cfg and **kwargs accepted y. This sharing of overrides is the default behavior in coma. To disable it, see Override in conjunction with the default config_hook.
- Because data is not a config, its x attribute is not overridden. In fact, because the default value of data is not replaced in any user-defined hook, its value when invoking the command will invariably be None in this example. Use ParamData.replace() in a hook to change this. See here and here for examples.
- Because inline configs and variadic configs are non-serializable, the only config file that gets created from invoking the command is cfg.yaml. Nothing gets written for my_list, out_file, or **kwargs.
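The TypeError mentioned above is plain Python behavior and can be reproduced without coma. In this small sketch, the two dictionaries are hypothetical stand-ins for the same field reaching the command both as a named parameter and through **kwargs.

```python
def cmd(out_file: str = "out.txt", **kwargs):
    return out_file, kwargs


# Hypothetical override values, as if the same field reached both destinations.
named = {"out_file": "foo.txt"}
extra = {"out_file": "foo.txt"}

message = None
try:
    cmd(**named, **extra)
except TypeError as error:
    # Python rejects a keyword that is supplied more than once in a call.
    message = str(error)
    print(message)
```

Prefixing the override with the inline config name avoids this collision by routing the value to a single destination.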
See here for more details on command line overrides.
Supplemental Configs
Supplemental configs are additional config parameters that are required by the command
declaration but do not appear in the command’s signature. These can be helpful for
providing additional configurable information to the hooks beyond what
the command object itself requires. See here for an example.
Any objects passed as supplemental_configs to @command are treated as
configs and converted into Configs without additional
SignatureInspector checks except for ensuring that no supplemental config names
clash with any parameter names in the command signature (or with the special
ParamData.inline_identifier
for inline config fields).
In the example below, suppose we desperately want a supplemental config called
"inline". Since this clashes with the default name of the inline_identifier,
we rename the inline_identifier to "param" while providing a supplemental
config named "inline". Although this supplemental config won’t be available as
part of the command invocation, it is available in all the hooks via
get_config().
from coma import SignatureInspector, command, wake

@command(
    pre_init_hook=lambda data: print(
        "supplemental:", data.parameters.get_config("inline").get_latest()
    ),
    signature_inspector=SignatureInspector(
        inline_identifier="param", inline=["only"]
    ),
    inline=dict,
)
def cmd(only: str = ""):
    print(f"cfg: {only=}")

if __name__ == "__main__":
    wake()
Invoking the command with some carefully chosen overrides to highlight these differences results in the following:
$ python main.py cmd inline::only=supplemental param::only=cfg
supplemental: {'only': 'supplemental'}
cfg: only='cfg'
Config Serialization and Persistence Management
Note
We refer to both config serialization and config persistence management. While
these terms are closely related and mostly interchangeable, the subtle distinction
is that serialization refers to whether a config file is written and what
the contents of that file are, whereas persistence management refers to where
the config file exists (if any) in the file system (both the path and the base file
name) and how coma is made aware of this path (via argparse flags).
@command accepts an optional PersistenceManager that
manages the file paths of serializable configs as well as the argparse flags for
setting these file paths as part of the default hooks.
When no explicit persistence manager is given, the default is a PersistenceManager
that favors .yaml file extensions. This is why config files in most tutorials
and examples in these docs are YAML files. It is not baked into @command.
Note
coma supports both YAML and JSON config file formats, but defaults to YAML.
To set JSON as the default, see here.
A persistence manager allows you to register()
an explicit file path and explicit argparse flag arguments for a specific config.
If no explicit registration is given, sensible defaults
are used. For full details, see here.
Warning
Registering a particular
config with a persistence manager does not guarantee/force that the config
will be serialized, but rather only explicitly determines which parameters get
passed to add_argument()
(overriding the sensible defaults that are otherwise provided).
Non-Serializable Configs
coma considers variadic positional (*args) and keyword (**kwargs) configs,
as well as all inline configs, to be non-serializable. These configs will never be
serialized by coma’s default hooks regardless of
whether that config gets register()ed with a persistence manager.
Note
To force a non-serializable config to be serialized, write a custom hook that
directly calls write() on that config object. See
write_factory() for additional details.
Parameters to argparse
By default, coma uses ArgumentParser.add_subparsers().add_parser()
to create a new ArgumentParser
with default parameters for each declared command. However, you can provide
keyword arguments to override the default parameter values to the internal
add_parser() call through the parser_kwargs parameter to @command.
For example, suppose you want to add command aliases.
This can be achieved through the aliases keyword:
from coma import command, wake

if __name__ == "__main__":
    command(
        name="greet",
        cmd=lambda: print("Hello World!"),
        parser_kwargs=dict(aliases=["gr"]),
    )
    wake()
With this alias, greet can now be invoked with just gr:
$ python main.py gr
Hello World!
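For reference, this is roughly what the aliases keyword does at the argparse level, shown with the standard library only. The parser layout below (the dest name and the run default) is an illustrative assumption rather than a description of how coma wires its subparsers.

```python
import argparse

parser = argparse.ArgumentParser(prog="main.py")
subparsers = parser.add_subparsers(dest="command")

# Keyword arguments such as `aliases` are accepted by add_parser().
greet = subparsers.add_parser("greet", aliases=["gr"])
greet.set_defaults(run=lambda: "Hello World!")

args = parser.parse_args(["gr"])
print(args.command)  # gr
print(args.run())    # Hello World!
```

Both greet and gr select the same subparser, so any other keyword accepted by add_parser() (for example help or description) could be passed through parser_kwargs in the same way.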