Command Line Config Overrides
For each config in a command declaration, whether standard,
inline, or supplemental,
an attempt is made to override its config attribute values with any command line
arguments that fit. We employ a variant of omegaconf’s
dot-list notation
syntax augmented with config prefixes.
Basic Overview
For the most part, we defer to the omegaconf (>=2.0.0)
merge()
function for overriding config attributes. omegaconf uses plain Python objects to
back its configs. The supported types are list, dict, or any dataclass
type. dataclass-backed configs create so-called
structured
configs. omegaconf rigorously type validates these configs at runtime based
on the underlying dataclass declaration. Besides this
runtime validation aspect, structured configs behave
identically to dict-like configs in terms of dot-list notation syntax.
For dict-like command line overrides, omegaconf uses = as its key-value
separator in dot-list notation. This is built into omegaconf and we can’t modify
it. omegaconf splits key-value strings on the first = (if any). Subsequent
occurrences of = are considered part of the value. If an override string does
not contain any =, it is considered a list-like override instead.
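The splitting rule can be illustrated with a small sketch in plain Python. The helper name classify_override is hypothetical: it only mimics the behavior described above and is not how omegaconf or coma implement it.

```python
def classify_override(arg: str):
    """Classify a raw command line argument (hypothetical helper).

    Mimics the rule described above: split on the first "=" only.
    """
    key, sep, value = arg.partition("=")
    if not sep:
        # No "=" at all: treated as a list-like override.
        return ("list", arg)
    # Any later "=" characters stay inside the value.
    return ("dict", key, value)


print(classify_override("b"))            # ('list', 'b')
print(classify_override("foo.bar=baz"))  # ('dict', 'foo.bar', 'baz')
print(classify_override("expr=a=b"))     # ('dict', 'expr', 'a=b')
```

Because only the first = separates key from value, a value such as a=b needs no special quoting.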
Let’s examine each backing type in turn.
list-like Configs
list-like command line overrides are appended to all list-like configs.
Consider this simple example:
from coma import command, wake

@command
def cmd(cfg: list):
    print(cfg)

if __name__ == "__main__":
    wake()
Let’s create a cfg.yaml file to see list-like overrides in action:
$ echo '[a]' > cfg.yaml
The declarative hierarchy for configs means
that cfg will first be loaded from cfg.yaml to have value ['a']. Then,
of the following command line arguments, those that are list-like will be
appended to cfg, resulting in:
$ python main.py cmd dict=like b c
['a', 'b', 'c']
Notice that dict=like does not get appended to cfg because it is a
dict-like command line override instead of a list-like override.
*args as Configs
Variadic positional parameters (*args) are treated as
non-serializable list-like configs by default
(though this can be toggled off). They behave exactly like
any other list-like config in terms of command line dot-list syntax. Then, once
the command is invoked, they behave exactly like Python variadic positional parameters:
from coma import command, wake

@command
def cmd(list_cfg: list, *args):
    print(list_cfg)
    print(args)

if __name__ == "__main__":
    wake()
$ python main.py cmd a b c
['a', 'b', 'c']
('a', 'b', 'c')
Notice that both list-like configs here accepted all list-like overrides.
To choose which config receives which argument, prefix
them:
$ python main.py cmd list_cfg::a args::b args::c
['a']
('b', 'c')
dict-like Configs
dict-like configs can have arbitrarily nested structure, which is referenced
via omegaconf’s dot-list notation. dict-like configs accept all
dict-like command line overrides (consisting of key-value pairs), where the
key is a config attribute path in dot-list notation and the value is arbitrary.
Changing the config’s structure is always allowed. If the key path already exists
in the config’s structure, the new value replaces the existing one. If the key
path represents a new attribute, that new path is merged into the existing
config structure and given the new value. For example:
from coma import command, wake

@command
def cmd(cfg: dict):
    print(cfg)

if __name__ == "__main__":
    wake()
Let’s create a cfg.yaml file to see dict-like overrides in action:
$ printf "foo:\n bar: baz" > cfg.yaml
The declarative hierarchy for configs means
that cfg will first be loaded from cfg.yaml to have value
{'foo': {'bar': 'baz'}}. Then, of the following command line arguments, those
that are dict-like will replace or be merged into cfg, resulting in:
$ python main.py cmd fizz=buzz list like
{'foo': {'bar': 'baz'}, 'fizz': 'buzz'}
$ python main.py cmd foo.bar=replace list like
{'foo': {'bar': 'replace'}}
$ python main.py cmd foo.new=merge list like
{'foo': {'bar': 'baz', 'new': 'merge'}}
Notice that list and like never interact with cfg because they are
list-like command line overrides instead of dict-like overrides.
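The replace-or-merge behavior for a single dot-list override can be sketched in plain Python. The helper apply_dict_override is hypothetical and is not omegaconf's implementation; in particular, it keeps values as plain strings, whereas omegaconf also parses them.

```python
import copy


def apply_dict_override(cfg: dict, override: str) -> dict:
    """Return a copy of cfg with one 'dotted.key=value' override applied.

    Hypothetical helper: values are kept as plain strings here.
    """
    path, _, value = override.partition("=")
    result = copy.deepcopy(cfg)
    node = result
    *parents, leaf = path.split(".")
    for name in parents:
        # Create intermediate dictionaries for paths that do not exist yet.
        node = node.setdefault(name, {})
    node[leaf] = value  # Replaces an existing value or adds a new one.
    return result


base = {"foo": {"bar": "baz"}}
print(apply_dict_override(base, "fizz=buzz"))
print(apply_dict_override(base, "foo.bar=replace"))
print(apply_dict_override(base, "foo.new=merge"))
```

The three calls mirror the three command lines above: a new top-level key, a replaced nested value, and a new nested key merged next to an existing one.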
**kwargs as Configs
Variadic keyword parameters (**kwargs) are treated as
non-serializable dict-like configs by default
(though this can be toggled off). They behave exactly like
any other dict-like config in terms of command line dot-list syntax. Then, once
the command is invoked, they behave exactly like Python variadic keyword parameters:
from coma import command, wake

@command
def cmd(dict_cfg: dict, **kwargs):
    print(dict_cfg)
    print(kwargs)

if __name__ == "__main__":
    wake()
$ python main.py cmd a=b c=d
{'a': 'b', 'c': 'd'}
{'a': 'b', 'c': 'd'}
Notice that both dict-like configs here accepted all dict-like overrides.
To choose which config receives which argument, prefix
them:
$ python main.py cmd dict_cfg::a=b kwargs::c=d
{'a': 'b'}
{'c': 'd'}
Variadic keyword parameters have an additional constraint required by Python’s syntax:
No key in **kwargs can match the name of a command parameter. To illustrate
the difference, let’s first see how dict_cfg can easily accept a self-referential
key called "dict_cfg":
$ python main.py cmd dict_cfg::dict_cfg=OK
{'dict_cfg': 'OK'}
{}
But **kwargs cannot contain a key called "dict_cfg" because dict_cfg
is already the name of a parameter to the cmd function:
$ python main.py cmd kwargs::dict_cfg=OK
Traceback (most recent call last):
...
ValueError: Named parameter is defined more than once: dict_cfg
Note
Raising this ValueError is the default behavior and is the safest option. If
your use case requires an alternative behavior (for example, forcibly overriding
the value of dict_cfg with the contents of kwargs.dict_cfg), other
override policies exist. These can be
set by redefining the default
init_hook. Be cautious.
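The underlying Python constraint can be reproduced without coma. The sketch below uses an undecorated stand-in for cmd; in plain Python the clash surfaces as a TypeError, while the ValueError shown above is coma's own report of the same situation.

```python
def cmd(dict_cfg: dict, **kwargs):
    # Undecorated stand-in for the command above.
    return dict_cfg, kwargs


# A key inside a plain dict may freely reuse the parameter's name.
print(cmd({"dict_cfg": "OK"}))  # ({'dict_cfg': 'OK'}, {})

# Unpacking the same key as a keyword argument collides with dict_cfg.
try:
    cmd({}, **{"dict_cfg": "clash"})
    clash_error = None
except TypeError as exc:
    clash_error = exc
    print(type(exc).__name__, exc)
```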
Structured Configs
Structured configs behave exactly as dict-like configs, except in one key aspect:
Attempting to alter their structure (e.g., by adding a new attribute) or attempting
to assign an invalid value to an existing attribute (e.g., type-mismatched) raises
an omegaconf ValidationError. Instead of crashing the program, coma
simply ignores non-matching command line overrides for structured configs. For
example, if our config only has an x attribute:
from coma import command, wake
from dataclasses import dataclass

@dataclass
class Config:
    x: int = 0

@command
def cmd(cfg: Config):
    print(cfg.x)

if __name__ == "__main__":
    wake()
then, having x as a command line argument does override that attribute, whereas
any other command line argument, such as y, is ignored:
$ python main.py cmd x=1 y=2
1
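The idea of silently skipping non-matching overrides can be sketched with the standard library alone. The helper apply_matching is hypothetical and is not coma's implementation. It assumes that each field annotation is a callable type such as int, and that type-mismatched values are skipped in the same way as unknown attributes.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class Config:
    x: int = 0


def apply_matching(cfg, overrides: dict):
    """Apply only overrides that match a field name and convert cleanly.

    Hypothetical helper; assumes each field's annotation is a callable type.
    """
    types = {f.name: f.type for f in fields(cfg)}
    accepted = {}
    for key, raw in overrides.items():
        if key not in types:
            continue  # Unknown attribute: ignored rather than raising.
        try:
            accepted[key] = types[key](raw)
        except (TypeError, ValueError):
            continue  # Assumption: type-mismatched values are also ignored.
    return replace(cfg, **accepted)


print(apply_matching(Config(), {"x": "1", "y": "2"}))  # Config(x=1)
print(apply_matching(Config(), {"x": "not-an-int"}))   # Config(x=0)
```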
Prefixing Overrides
Command line overrides can be shared between configs, which can be helpful in
certain instances. In the example below, we have two configs, both of which define
the same x attribute:
from coma import command, wake
from dataclasses import dataclass

@dataclass
class Config1:
    x: int = 1

@dataclass
class Config2:
    x: int = 1

@command
def multiply(first: Config1, second: Config2):
    print(first.x * second.x)

if __name__ == "__main__":
    wake()
By default, coma enables x as a command line argument to override both
configs at once:
$ python main.py multiply x=3
9
This causes multiply to essentially act as square. To prevent this, we can
target a specific config by prefixing the override’s standard omegaconf dot-list
notation with the config’s parameter name using the prefix delimiter (::):
$ python main.py multiply first::x=3 second::x=4
12
By default, coma also supports prefix abbreviations. A prefix can be abbreviated
as long as the abbreviation is unambiguous (i.e., matches exactly one config name).
This enables convenient shorthands for command line overrides:
$ python main.py multiply f::x=3 s::x=4
12
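Prefix resolution with abbreviations can be sketched as follows. The helper resolve_targets is hypothetical and not coma's implementation; raising a ValueError for an ambiguous or unknown prefix is an assumption made for this sketch.

```python
def resolve_targets(arg: str, config_names: list, delimiter: str = "::"):
    """Return (target config names, override without its prefix).

    Hypothetical helper illustrating prefix abbreviation matching.
    """
    # Only look for the delimiter in the key part, before the first "=".
    head = arg.split("=", 1)[0]
    if delimiter not in head:
        # No prefix: the override is offered to every config.
        return list(config_names), arg
    prefix, _, rest = arg.partition(delimiter)
    matches = [name for name in config_names if name.startswith(prefix)]
    if len(matches) != 1:
        # Assumption: ambiguous or unknown prefixes are rejected.
        raise ValueError(f"Prefix {prefix!r} matches {matches}")
    return matches, rest


names = ["first", "second"]
print(resolve_targets("x=3", names))          # (['first', 'second'], 'x=3')
print(resolve_targets("f::x=3", names))       # (['first'], 'x=3')
print(resolve_targets("second::x=4", names))  # (['second'], 'x=4')
```

With config names such as first and final, the abbreviation f would match both and therefore be rejected in this sketch, whereas fir would still be unambiguous.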
Overriding Inline Configs
All inline configs are aggregated into a special
non-serializable inline_config
that is backed by a programmatically-created dataclass. This provides all the
rigorous runtime type validation of a standard structured
config without requiring a user-defined dataclass to be created just for these
one-off fields. By default, this implicit config uses "inline" as its
name. To illustrate,
consider that the following two commands exhibit equivalent behavior:
from coma import SignatureInspector, command, wake, config_hook
from dataclasses import dataclass

@command(signature_inspector=SignatureInspector(inline=["x", "y"]))
def proper_inline(x: int = 0, y: str = "foo"):
    print(x, y)

@dataclass
class MockInline:
    x: int = 0
    y: str = "foo"

@command(
    config_hook=config_hook.default_factory(skip_write=["inline"]),
    signature_inspector=SignatureInspector(inline_identifier="unused"),
)
def mock_inline(inline: MockInline):
    print(inline.x, inline.y)

if __name__ == "__main__":
    wake()
mock_inline calls its config parameter inline, which clashes with the
reserved name for the default inline config. So, we rename its identifier to
"unused", since we won’t be using it. The mocked inline config is a
regular config and so would get serialized by default. We disable that, rendering
it non-serializable, by adding it to skip_write
in a redefined config_hook.
mock_inline now behaves identically to proper_inline:
$ python main.py proper_inline x=42 y=bar
42 bar
$ python main.py mock_inline x=42 y=bar
42 bar
Overriding Nested Objects
Config attributes in coma can be deeply nested objects. Since coma delegates
to omegaconf for command line config overrides, the behavior of these overrides
follows that of omegaconf (>=2.0.0). In particular, command line arguments:
- replace list attributes with the command line values
- replace existing keys of dict attributes with the command line values
- merge new dict attribute key-value pairs into the existing dictionary
Note
See here
for an answer directly from omegaconf’s developer on why list attributes
can only replace and not merge.
Consider the following example, where l has type list with default value
[1, 2] and d has type dict with default value {'a' : {'b': 3}}.
from coma import command, wake
from dataclasses import dataclass, field
from omegaconf import OmegaConf

@dataclass
class Config:
    l: list = field(default_factory=lambda: [1, 2])
    d: dict = field(default_factory=lambda: {'a': {'b': 3}})

@command
def nested(cfg: Config):
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    wake()
Without command line overrides, the default values are maintained, as expected:
$ python main.py nested
l:
- 1
- 2
d:
  a:
    b: 3
Specifying l as a command line argument entirely replaces that config attribute:
$ python main.py nested l='[3, 4, 5]'
l:
- 3
- 4
- 5
d:
  a:
    b: 3
To delete existing list entries, omit them from the command line, while continuing to include existing list entries that ought to be kept:
$ python main.py nested l='[2]' # Deletes 1 from the list.
l:
- 2
d:
  a:
    b: 3
$ python main.py nested l='[]' # Deletes [1, 2] from the list.
l: []
d:
  a:
    b: 3
For d, specifying existing keys replaces the value, whereas new keys are merged.
Typically, omegaconf’s standard dot-list notation is used, but a dictionary syntax
is also supported:
Merge the new key-value pair {'c': 4} using dot-list notation:
$ python main.py nested d.c=4
l:
- 1
- 2
d:
  a:
    b: 3
  c: 4
Merge the new key-value pair {'c': 4} using dictionary syntax:
$ python main.py nested d='{c: 4}'
l:
- 1
- 2
d:
  a:
    b: 3
  c: 4
Replace an existing key-value pair with {'a': {'b': 4}} using dot-list notation:
$ python main.py nested d.a.b=4
l:
- 1
- 2
d:
  a:
    b: 4
Replace an existing key-value pair with {'a': {'b': 4}} using dictionary syntax:
$ python main.py nested d='{a: {b: 4}}'
l:
- 1
- 2
d:
  a:
    b: 4
Although the dictionary syntax may seem more verbose than the dot-list notation at first, it can be helpful for overriding and/or merging multiple key-value pairs at once in a single argument (especially as the size of the override grows). The dot-list notation does not directly support this and needs one argument per key-value pair:
$ python main.py nested d='{a: {b: 4}, c: 5}'
l:
- 1
- 2
d:
  a:
    b: 4
  c: 5
$ python main.py nested d.a.b=4 d.c=5
l:
- 1
- 2
d:
  a:
    b: 4
  c: 5
Note
Unlike with lists, deletion of dictionary entries is not supported by omegaconf.
In the following example, omegaconf simply merges the empty command line
dictionary with the default dictionary, resulting in a new dictionary that is
equivalent to the default one:
$ python main.py nested d='{}'
l:
- 1
- 2
d:
  a:
    b: 3
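The replace-versus-merge rules of this section can be mimicked in plain Python. The helper merge_override is hypothetical and only approximates the behavior described here; it is not omegaconf's merge() implementation.

```python
import copy


def merge_override(base, override):
    """Recursively merge override into a copy of base (hypothetical helper).

    dict values are merged key by key; everything else, including lists,
    is replaced wholesale.
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = copy.deepcopy(base)
        for key, value in override.items():
            merged[key] = merge_override(base.get(key), value)
        return merged
    return copy.deepcopy(override)


defaults = {"l": [1, 2], "d": {"a": {"b": 3}}}
print(merge_override(defaults, {"l": [2]}))       # list is replaced
print(merge_override(defaults, {"d": {"c": 4}}))  # new key is merged
print(merge_override(defaults, {"d": {}}))        # empty dict changes nothing
```

The last call shows why an empty dictionary cannot delete entries: merging zero keys into the existing dictionary leaves it as it was.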
Capturing Superfluous Overrides
For rapid prototyping, it is often beneficial to capture superfluous command line
overrides. These can then be transferred to a proper config object once the codebase
is solidifying. Variadic keyword config parameters
(**kwargs) are ideal for this:
from coma import command, wake
from omegaconf import OmegaConf

@command
def greet(**kwargs):
    print("Hello World!")
    print("extra command line arguments:")
    print(OmegaConf.to_yaml(kwargs))

if __name__ == "__main__":
    wake()
This works because **kwargs is, by default, a non-serializable
dict-like config that accepts any
dict-like command line arguments:
$ python main.py greet
Hello World!
extra command line arguments:
{}
$ python main.py greet a='{b: {c: 1}, d: 2}' foo=3 bar.baz=4
Hello World!
extra command line arguments:
a:
  b:
    c: 1
  d: 2
foo: 3
bar:
  baz: 4